The FFPF framework can be used in userspace, the kernel, the IXP1200 network processor, or a combination of the above. As network processors are not yet widely used, and (pure) userspace FFPF does not offer many speed advantages, the kernel version is currently the most popular. For this reason, we use FFPF-kernel to explain the architecture, and describe the userspace and network processor versions later. The main components are illustrated in Figure (1.a).
A key concept in FFPF is the notion of a flow, which differs from what is traditionally meant by a flow (e.g., a `TCP flow'). It may be thought of as a generalized socket: a flow is `created' and `closed' by an application and delivers a stream of packets, where the packets match arbitrary user criteria (e.g., ``all UDP and TCP packets sent to port 554'', or ``all UDP packets containing the CodeRed worm plus all TCP SYN packets''). The flow may also provide other application-specific information (e.g., traffic statistics).
A flow is captured by a flow grabber. For now, consider a flow
grabber to be a filter that passes just the information (packets,
statistics) in which the user is interested. Packets arrive in the
system via one or more packet sources. Examples of packet sources
include: (a) a network driver that interacts with a dumb NIC, (b) a
smart NIC that interacts with FFPF directly, or (c) a higher-layer
abstraction in the operating system that hides device-specific
issues. A flow grabber receives the packets and, if they correspond to
its flow, stores them in a circular packet buffer known as PBuf. In
addition, it places a pointer to this packet in a second circular
buffer, known as the index buffer, or IBuf. Applications use the
pointers in IBuf to find packets in PBuf.
The reason for using two buffers for capturing a flow is that while
IBuf is specific to a flow, PBuf is shared. If the application opens
two flows, there will be just one PBuf and two IBufs. If the flows are
`overlapping' (i.e., some packets in one flow are also in the other),
only one copy of each packet will be in PBuf. However, if a packet is
in both flows, a pointer to it is placed in both IBufs. In other
words, we do not copy packets to individual flows. Moreover, the
buffers are memory mapped, so we do not copy between kernel and
userspace either. We show later how PBuf can also be shared by
multiple applications (as sketched in Figure (1.b)). Using memory
mapping to avoid copying is a well-known technique, also used in
monitoring solutions like DAG and SCAMPI [10,30]. Edwards et al. also
give userspace applications direct control over packet buffers, but
provide an explicit API to access the buffers rather than memory
mapping [15].
Thus far, we have assumed that a flow grabber is equivalent to a
filter. In reality, a flow grabber can be a complex graph of
interconnected filters, where a filter is defined as an element that
takes a stream of packets as input and returns a (possibly
empty) subset of this stream as output. In addition, a filter may provide
arbitrary information about the traffic, e.g., statistics, intrusion
alerts, etc. For this purpose, every filter has an associated memory
buffer, known as MBuf (also memory mapped), which is used to produce
results for applications or to keep persistent state. It can also be
used by the application to pass configuration parameters to the
filter. For instance, in the case of a `blacklist filter', the
application may store the blacklisted addresses in MBuf. Note that the
ability to perform more complex processing than just filtering helps
to reduce context switches: applications that are interested only in
periodic statistics, and not in the packets themselves, need not be
scheduled for packet processing.
In later sections, we show that FFPF is language neutral, so that, for instance, BSD packet filters can be combined with filters written in other languages. In fact, the filters in a flow grabber are simple instantiations of filter classes, one of which may be the class of BPF filters. In addition to existing languages like BPF, we support two new languages (see Section 3.3) that are explicitly designed to exploit all features offered by FFPF. Among other things, they provide extensibility of the FFPF framework by their ability to call `external functions' (provided these functions were previously registered with FFPF). External functions commonly contain highly optimised native or even hardware implementations of operations that are too expensive to execute in a `safe' language (e.g., pattern matching, generating MD5 message digests).
We have covered most aspects of FFPF that are relevant if a single
monitoring application is active. It is now time to consider what
happens if multiple applications are present. For this purpose, we
introduce a new concept, called the flow group. A flow group is
a set of applications with the same access rights to packets,
i.e., if one application is allowed to read a packet, all others in
the same group may also access it. Flow groups are again used to
minimise packet copying. Applications in the same group share a
common PBuf, which contains all packets in which one or more
applications in the group have expressed interest. This is illustrated
in Figure (1.b). If more than one group expresses interest
in the packet, it is copied once per group, unlike existing approaches
(such as BPF/LSF) which copy the packet to each application
separately. This makes FFPF cheaper than other solutions when
supporting multiple applications. In the current implementation, the
flow group is determined by group id. In the future, we plan to
provide applications with more explicit control over flow groups.
We see that FFPF demultiplexes packets to their respective flows early, i.e., well before they are processed by the kernel protocol stack. This is a tried technique that is also used in projects like LRP [14]. Unlike LRP, however, we do not place the packets themselves on application-specific queues, but only the corresponding pointers. Thus, it is possible to avoid copying both for demultiplexing purposes and for crossing the protection domain boundaries.