COBEA is a general event architecture for building distributed active systems. Its main goal is scalability, achieved by reducing the volume of notifications through efficient filtering at the event source. Our initial measurements of the prototype implementation show that filtering at the event source is crucial if the system is to scale as the volume of events increases. Domain-specific issues, such as the real-time issues described in [7], are not specifically addressed by COBEA, although it could be tuned for particular domains if required.
These tests were run on two Digital Alpha AXP 3000 workstations connected by a 155 Mbit/s ATM network. One (used for the event source and/or the event sink) is an AXP 3000/900 with a 275 MHz CPU; the other (mainly for the event sink) is an AXP 3000/300 with a 150 MHz CPU; both have dual-issue processors running the OSF V3.2D-1 operating system. The event system and applications were built with g++ 2.7.2 with -O2 optimisation. The load on both machines was light for most of the testing, although the Alpha AXP 3000/900 normally had about 50 users and around 400 processes. We ran many sinks on the Alpha AXP 3000/300 to avoid causing significant delay on the shared Lab machine (the Alpha AXP 3000/900).
Firstly, we conducted latency tests to determine the latency of filtering at the event source. Secondly, we conducted volume tests to determine the impact of increasing event volume on the event server.
Table 1: Event Filtering Latency for One or More Consumers
Table 1 shows the results of the latency tests. When events occur, the server tries to match them against the registered event templates. The latency of event template matching increased logarithmically, as shown in columns 1 and 2. For instance, column 1 shows that as the number of registered events increased from 10 to 100, the latency was 4.9 µs, 5.9 µs and 6.7 µs for 10, 50 and 100 registered events respectively, averaged over 1000 runs. The latency for one server and multiple consumers increased linearly with the number of consumers (as shown in row 1); this is due to preparing events to send to each consumer after template matching. However, as shown in Figure 5, the overall cost of filtering is small: only the matched events are copied. The event dispatching delay in the case of one source and one consumer was 39 µs on average when 100 events were registered. The overall delay for event creation, template matching and event dispatching (i.e. moving the events to the ``sendqueue'') was 271 µs. The total latency between the occurrence of an event and its delivery to the consumer is estimated to be less than 1 ms.
Figure 5: The Overall Cost of Filtering and Preparing to Send Events at Source
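To make the matching step concrete, a minimal sketch in C++ (the language of our prototype) is given below. The types Event, EventTemplate and Consumer and the dispatch function are purely illustrative and are not COBEA's actual interface; the sketch also uses a simple linear scan over templates, whereas the logarithmic growth measured above suggests the prototype indexes its templates. Its point is only that an event is copied onto a consumer's send queue when, and only when, a registered template matches.

    // Illustrative sketch only: filtering at the event source.
    // Event, EventTemplate, Consumer and dispatch are hypothetical names.
    #include <deque>
    #include <string>
    #include <vector>

    struct Event {
        std::string type;   // event type name
        int         value;  // example parameter carried by the event
    };

    struct EventTemplate {
        std::string type;   // type the consumer registered for
        int lo, hi;         // parameter range of interest (parameterised filter)
        bool matches(const Event& e) const {
            return e.type == type && e.value >= lo && e.value <= hi;
        }
    };

    struct Consumer {
        std::vector<EventTemplate> registered;  // templates registered by this consumer
        std::deque<Event>          sendQueue;   // matched events awaiting dispatch
    };

    // On each occurrence, match the event against every consumer's templates.
    // Only matched events are copied; unmatched events cost just the comparison.
    void dispatch(const Event& e, std::vector<Consumer>& consumers) {
        for (Consumer& c : consumers) {
            for (const EventTemplate& t : c.registered) {
                if (t.matches(e)) {
                    c.sendQueue.push_back(e);  // copy only when matched
                    break;                     // one copy per consumer is enough
                }
            }
        }
    }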
The advantage of filtering at the event source is clearly shown in Table 2. We tested the effect of increasing event volume in two modes: raw, in which no attempt was made to recover missing events; and normal, in which event sequence numbers were checked and attempts were made to recover missing events. When 10 events were registered by the consumer at the event source, the consumer detected no missing events at event volumes up to 2000 Hz. In contrast, when 100 events were registered, events started to be missed when the volume reached 20 Hz. If events were not recovered after loss, all registered events were correctly delivered to the consumer at event rates as high as 8000 Hz or beyond. Our test events were randomly generated integers between 1 and 1000. Parameterised filtering means that the consumer can register, say, the values 1 to 10, or whatever range is required. This is a real advantage over type-based filtering, in which the consumer has no choice but to be overwhelmed by events.
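To illustrate why parameterised filtering matters for volume, the following toy program (our own model rather than COBEA code) generates the same kind of test events, random integers in 1 to 1000, and counts how many a type-based subscription would deliver compared with a parameterised subscription on the values 1 to 10.

    // Toy comparison of type-based vs parameterised filtering (illustrative only).
    #include <cstdio>
    #include <cstdlib>

    int main() {
        std::srand(42);
        const int total = 100000;              // events generated at the source
        int typeMatched = 0, paramMatched = 0;

        for (int i = 0; i < total; ++i) {
            int value = std::rand() % 1000 + 1;  // random integer in 1..1000
            ++typeMatched;                       // type-based: every event of the type is delivered
            if (value >= 1 && value <= 10)
                ++paramMatched;                  // parameterised: only the registered range is delivered
        }

        std::printf("type-based filter delivers    %d events\n", typeMatched);
        std::printf("parameterised filter delivers %d events (about 1%%)\n", paramMatched);
        return 0;
    }

Under these assumptions the parameterised consumer receives roughly one percent of the traffic that a type-based consumer would have to absorb.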
Table 2: Event Volume from a Single Source to One Consumer
It is interesting to note that the integrated fault tolerance (i.e. sequence number checking and event recovery) was not as costly as we first thought, especially when the event volume and the number of registered events were low: for example, at volumes below 2000 Hz when 10 events were registered. However, as the number of registered events increased, the system became much more sensitive to event volume. Our experiments show that when 100 events were registered, about 98% of events were received by the sink in raw mode, while only 55% were received in normal mode. As the event volume becomes extremely high, this cost becomes significant, since it contributes to the load on both the source and the sink. If many sinks try to recover missing events from a single event source, the source may eventually be unable to cope. COBEA provides a solution to this by allowing event sources to be replicated and sinks to be partitioned, so that each source is responsible for only a small number of sinks.
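The gap-detection part of the normal mode can be sketched as follows; the class and member names are hypothetical, and real recovery (re-requesting the missing events from the source, handling duplicates and reordering) is omitted. The sketch only shows how checking per-source sequence numbers lets a sink notice which events it has missed.

    // Sketch of sequence-number checking at a sink (hypothetical names;
    // the actual COBEA recovery protocol is not shown).
    #include <cstdint>
    #include <vector>

    struct SeqEvent {
        std::uint32_t seq;   // sequence number stamped by the event source
        int           value;
    };

    class Sink {
        std::uint32_t expected_ = 0;   // next sequence number we expect
    public:
        // Returns the sequence numbers that should be re-requested from the source.
        std::vector<std::uint32_t> receive(const SeqEvent& e) {
            std::vector<std::uint32_t> missing;
            for (std::uint32_t s = expected_; s < e.seq; ++s)
                missing.push_back(s);   // gap detected: these events were lost
            expected_ = e.seq + 1;
            return missing;
        }
    };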
The current implementation could be improved for better scalability by multicasting events, which would remove the need to copy events for each consumer. One potential obstacle to scalability is that fault tolerance is integrated irrespective of the underlying communication platform; this cost can be eliminated when the communication support is reliable.