
7   Early experiences





Table 1: Example signatures.
Signature   Description
Usage       Average daily usage
Frequency   Calling frequency
Activity    Days since last seen
Bizocity    "Business-likeness"




Table 2: Record structure for example signatures.
Signature   Signature type   Approximation type   Approximation method     Number of fields
Usage       int              (0-15)               variable-width buckets   4
Frequency   double           (0-255)              fixed-width buckets      2
Activity    int              (0-39)               clamping                 3
Bizocity    char             (0-15)               fixed-width buckets      2


Table 1 briefly describes four signatures: Usage, Frequency, Activity, and Bizocity. These signatures are computed daily from call records. In this section, we discuss how these signatures use Hancock's data and control-flow mechanisms.

The maps used by these signatures all have index types of line_t and value types that are records. Usage, Frequency, and Activity use constant defaults. Bizocity uses a default function that queries a secondary map, which indicates whether the phone is a known residence, a known business, or unknown.
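The difference between a constant default and a default function can be sketched in plain C. This is an illustration only, not Hancock syntax: the names `line_kind_t`, `lookup_kind`, `bizocity_default`, and `constant_default`, the toy secondary table, the fictitious phone numbers, and the default scores (15, 0, 7) are all assumptions made for the example.

```c
#include <stddef.h>

typedef unsigned long long line_t;   /* assumption: stand-in for Hancock's line_t */

/* What the (hypothetical) secondary map knows about a line. */
typedef enum { KIND_UNKNOWN, KIND_RESIDENCE, KIND_BUSINESS } line_kind_t;

typedef struct { line_t line; line_kind_t kind; } kind_entry_t;

/* Toy secondary map; a real one would be a persistent map keyed by line_t. */
const kind_entry_t known_kinds[] = {
    { 2125550123ULL, KIND_BUSINESS  },
    { 9735550100ULL, KIND_RESIDENCE },
};

/* Query the secondary map; lines that are not listed are unknown. */
line_kind_t lookup_kind(line_t line) {
    size_t n = sizeof known_kinds / sizeof known_kinds[0];
    for (size_t i = 0; i < n; i++)
        if (known_kinds[i].line == line)
            return known_kinds[i].kind;
    return KIND_UNKNOWN;
}

/* Default function: the starting score depends on the queried line.
   The three scores are made-up values for illustration. */
int bizocity_default(line_t line) {
    switch (lookup_kind(line)) {
    case KIND_BUSINESS:  return 15;
    case KIND_RESIDENCE: return 0;
    default:             return 7;
    }
}

/* Constant default: the same value for every line with no stored entry. */
int constant_default(void) { return 0; }
```

A constant default needs no information about the key, whereas the default function receives the line being looked up and can consult other data before producing a value.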

The records used in these signatures vary based on the application, but they all have a common form: the desired profile contains several fields that have the same underlying structure. To express this structure in Hancock, we use two records: one to describe the basic fields and a second to group these fields into a profile.
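As a rough picture of this two-record pattern, the following C sketch declares one record for a basic field and a second that groups four such fields, matching the field count given for Usage in Table 2. It is plain C rather than Hancock, and the type names, the single `value` member, and the zero default are assumptions; the paper does not list the actual field names.

```c
/* Basic field record (hypothetical layout): one exact signature value. */
typedef struct {
    int value;
} usage_field_t;

enum { USAGE_NFIELDS = 4 };   /* Table 2: the Usage profile holds four fields */

/* Profile record: groups several fields that share the same structure. */
typedef struct {
    usage_field_t fields[USAGE_NFIELDS];
} usage_profile_t;

/* A constant default profile, assuming zero is the chosen default. */
usage_profile_t usage_profile_default(void) {
    usage_profile_t p;
    for (int i = 0; i < USAGE_NFIELDS; i++)
        p.fields[i].value = 0;
    return p;
}
```

Keeping the basic field as its own record means that the approximation logic can be written once per field type and reused for every field in the profile.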

Table 2 describes the basic fields for the sample signatures and indicates how many such fields are contained in the profile record. In all these examples, the approximation type is a range that can be represented with a C char type. The signatures use different approximation techniques. Bucketing divides the range of signature values into disjoint buckets and associates a default value with each such bucket. With this technique, freezing converts a signature value into the bucket that contains it, whereas thawing returns the default value for a bucket. Bucketing can use either fixed-width or variable-width buckets. Clamping converts values above the range of signature values to the largest value in the range and values below the range to the lowest value in the range.
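The three approximation methods can be illustrated with a small C sketch. It is not taken from the Hancock sources: the function names, the exact value range of 0 to 255, the bucket boundaries, and the use of a bucket midpoint as the fixed-width default are all assumptions chosen to keep the example short. Only the approximation ranges (0-15 and 0-39) follow Table 2.

```c
#include <assert.h>

/* Fixed-width bucketing: assumed exact range 0..255 split into 16 buckets. */
enum { FW_NBUCKETS = 16, FW_WIDTH = 16 };

unsigned char fw_freeze(int v) {
    if (v < 0) v = 0;                          /* keep the index non-negative */
    int b = v / FW_WIDTH;
    return (unsigned char)(b >= FW_NBUCKETS ? FW_NBUCKETS - 1 : b);
}

int fw_thaw(unsigned char b) {
    assert(b < FW_NBUCKETS);
    return b * FW_WIDTH + FW_WIDTH / 2;        /* assumed default: bucket midpoint */
}

/* Variable-width bucketing: hypothetical boundaries and per-bucket defaults. */
enum { VW_NBUCKETS = 4 };
const int vw_upper[VW_NBUCKETS]   = { 10, 100, 1000, 10000 };  /* exclusive bounds */
const int vw_default[VW_NBUCKETS] = {  5,  50,  500,  5000 };

unsigned char vw_freeze(int v) {
    unsigned char b = 0;
    while (b < VW_NBUCKETS - 1 && v >= vw_upper[b])
        b++;                                   /* last bucket absorbs large values */
    return b;
}

int vw_thaw(unsigned char b) {
    assert(b < VW_NBUCKETS);
    return vw_default[b];
}

/* Clamping into 0..39, the approximation range listed for Activity. */
enum { CLAMP_LO = 0, CLAMP_HI = 39 };

unsigned char clamp_freeze(int v) {
    if (v < CLAMP_LO) return CLAMP_LO;
    if (v > CLAMP_HI) return CLAMP_HI;
    return (unsigned char)v;
}

int clamp_thaw(unsigned char a) { return a; }  /* in-range values survive exactly */
```

Under these assumptions, `fw_freeze(37)` yields bucket 2 and `fw_thaw(2)` yields 40, so bucketing loses precision inside a bucket, while clamping is exact inside the range and lossy only outside it.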

In all four signatures, the amount of Hancock code needed to describe the data is small. The largest, Bizocity, takes fewer than 30 lines.

In terms of control flow, the example signatures share the same high-level structure, each containing two phases: one to compute information for outgoing calls and another for incoming calls. The event structures for the signatures are different, however. Frequency tracks only the existence of a call for a given number, so its bottom-level event is the line event. Activity and Usage do work at both call and line events. Bizocity uses these events and does significant computation at the exchange level.
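The nesting of exchange, line, and call events can be pictured with the following C sketch of a single outgoing-calls phase. It is an illustration in plain C, not Hancock's iteration construct. The `call_t` layout, the rule that an exchange is a number with its last four digits dropped, and the event counters are assumptions, and the input is assumed to be sorted by originating number.

```c
#include <stddef.h>

typedef unsigned long long line_t;   /* assumption: stand-in for Hancock's line_t */

/* Hypothetical call record. */
typedef struct { line_t origin; line_t dialed; int seconds; } call_t;

/* Assumption: the exchange is the number without its last four digits. */
line_t exchange_of(line_t n) { return n / 10000; }

typedef struct { int exchanges; int lines; int calls; } event_counts_t;

/* One phase over calls sorted by origin: an exchange or line event fires
   whenever the corresponding key changes; a call event fires per record. */
event_counts_t outgoing_phase(const call_t *calls, size_t n) {
    event_counts_t c = { 0, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        int new_exchange = (i == 0) ||
            exchange_of(calls[i].origin) != exchange_of(calls[i - 1].origin);
        int new_line = (i == 0) || calls[i].origin != calls[i - 1].origin;

        if (new_exchange) c.exchanges++;  /* exchange event: Bizocity's heavy work */
        if (new_line)     c.lines++;      /* line event: all that Frequency needs */
        c.calls++;                        /* call event: also used by Usage, Activity */
    }
    return c;
}
```

The incoming-calls phase would have the same shape over the records sorted by dialed number. A signature that only needs to know whether a line appeared can ignore the call-level branch entirely.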

The Hancock code that implements these phases is small: the smallest, Frequency, takes 40 lines of code; the largest, Bizocity, takes 300 lines, more than 100 of which are for processing exchange events.

In all, these examples indicate that Hancock data descriptions are compact and that the event-processing code is modest in size. Ideally, we would compare the Hancock implementations with hand-written C implementations. Unfortunately, this comparison is very hard to do fairly. The only C implementation of Usage, Activity, and Bizocity available to us is a single program that computes all three signatures together and embeds the code that manages the on-disk representations of the signature files. This program is about 1500 lines of code.

