Flow
====

IVRE flow is a beta feature meant to analyze network flows between
hosts. It can be seen as:

-  a recon tool for the case of an unknown network (hence its apparition
   in IVRE/DRUNK)
-  a cartography tool to get a better understanding of a supposedly
   known network (but there is no such thing as a "known network")
-  a monitoring tool to spot unwanted flows in your network

Usage
-----

Data insertion
..............

There are two tools for data insertion, the first is based on Zeek
(previously known as Bro):

.. code:: bash

       $ zeek -r capture_file.pcap
       $ ivre zeek2db ./*.log
       $ ivre flowcli

The second can take either argus logs or netflow logs:

.. code:: bash

       $ argus -m -r capture_file.pcap -w flows.argus
       $ ivre flow2db flows.argus

Or:

.. code:: bash

       $ ivre flow2db flows.nfdump

Or:

.. code:: bash

       $ ivre flow2db -t iptables iptables-from-syslog.log

Any of these tools can be called with '--init' to reinitialize the DB.

Data exploration
................

The main exploration tool are the CLI (``ivre flowcli``) and the Web UI
(``<ivre-web-root>/flow.html``).

CLI
~~~

You can access the CLI through ``ivre flowcli``. Features include:

-  Searching for flows and nodes with filters (see the Flow Filters
   section of this document)
-  Producing top values for given criteria
-  Plotting flows amounts over hours of days, on average.

See ``ivre flowcli -h`` for usage details.

Web UI
~~~~~~

Overview
^^^^^^^^

The central view is a graph representing the network:

-  nodes represent hosts; white ones represent hosts that have incoming
   network flows, grey ones those who do not have any
-  edges represent network flows; same [proto, dport] couple will have
   the same color

Flows are aggregated by destination port (or code, for icmp), two
different connection from the same source to the same destination on the
same destination port (so called ``dport``) but with different source
ports will be aggregated on the same edge.

On the bottom of the graph, there is a timeline, representing the amount
of different flows during some time ranges. This timeline can be played
by going to the **Display** pane.

On the left, there is a control pane with 3 tabs:

-  **Explore:** Allows to explore and reduce the dataset to display with
   node-based or edge-based queries. See the next section for more
   details. It also allows to navigate through the data (limit/skip) and
   change the query mode. At the top of this pane, there is a count of
   the flows, servers and clients matching the current query. Note that
   servers can also be counted as clients if they have outgoing flows.
-  **Display:** Allows to change the way data is displayed (size of
   nodes and edges, timeline precision).
-  **Details:** Details on the currently selected item.

Interaction
^^^^^^^^^^^

Hover nodes and edges to display their basic properties in the
**Details** tab. Click on an edge or a node to query the database for
more information, including any associated metadata (for example DNS
queries happening on a network flow).

There are two ways of filtering the data:

-  Right click on a node or edge and ``Filter by``/``Filter out`` by
   attribute
-  Write filters yourself

See the Flow Filter section of this document for more information on the
filter syntax.

The **Display** pane allows to change the size of nodes and edges based
on some criteria:

-  On nodes, available keywords are ``$in`` and ``$out``, to make the
   size proportional to the number of incoming or outgoing flows of a
   node.
-  On edges, a property can be specified (for example ``scbytes``, the
   number of bytes from the server to the client).

Do not forget to increase the ``Size scale`` to make the result more
visible.

The **Display** pane also allows to change the amount of time slots to
represent on the timeline (capped by the actual time precision set in
``ivre.conf``). The timeline can also be played on the graph by clicking
the 'Play timeline' button.

Flow Filters
~~~~~~~~~~~~

To write filters, the syntax is as follows:

::

   [!][ANY|ALL|ONE|LEN ][src.|dst.][meta.]<attribute> [<operator> <value>]
   [OR <other filter>]

The ``[src.|dst.]`` part is only available for node filters.

The special keywords ``ANY``, ``ALL``, ``ONE`` and ``LEN`` are for
working with array attributes:

-  ALL: matches if all the elements of the array fulfil the predicate
-  ANY: the same if any of the elements match
-  ONE: the same if exactly one of the elements match
-  LEN: the predicate will use the len of the array

Some examples:

-  Node filter ``dst.addr = 192.168.1.1`` will match all the flows whose
   destination is a host with address ``192.168.1.1``.
-  Node filter ``addr =~ 192\.168\.1\..*`` will match all the flows that
   come from or go to a host whose address matches the
   ``192\.168\.1\..*`` regex (sorry, CIDR masks are on their way to be
   implemented).
-  Edge filter ``dport > 10000`` will match all the flows with a
   ``dport`` (destination port) above 10000. ``!dport <= 10000`` will
   match the same flows plus the ones that do not have any destination
   port.
-  Edge filter ``meta.query =~ .*google.*`` will match all the flows
   that have an associated metadata which have a ``query`` attribute
   that match the ``.*google.*`` regex.
-  Edge filter ``ANY sports < 1024`` will match flows with at least one
   source port < 1024.
-  Edge filter ``LEN sports = 1`` will match flows with only one known
   source port.
-  Filter ``ANY meta.answers =~ .*example.com`` will match any metadata
   that contain an array attribute ``answers`` where at least one entry
   matches ``'.*example.com'``.

Available operators are:

-  ``=`` or ``:`` (equality)
-  ``!=``
-  ``<``, ``<=``, ``>``, ``>=``
-  ``=~``
