
5   Wiring Diagram

In the previous section, we explained that computing a signature may require multiple passes over the data. Hancock provides the sig_main construct to express the data flow between passes and to connect command-line input to the variables in the program. The arcs between the phase boxes in Figure 1 depict this construct. The following code implements sig_main for Usage:

void sig_main(       const callStream calls <c:>,  
              exists const uMap y_usage <u:>,
              new          uMap usage <U:>) { 
   usage :=: y_usage;
   out(calls, usage);
   in(calls, usage); 
}
The Usage signature takes three parameters. The first is a stream that contains the raw call data. The const keyword indicates that this data is read-only. The syntax <c:> after the variable name calls specifies that this parameter is supplied as a command-line option using the -c flag. The colon indicates that the flag takes an argument, in this case the name of the directory that holds the binary call files; the absence of a colon would indicate that the parameter is a boolean flag. The Hancock compiler generates the code that parses these command-line options.

The second parameter is a Usage map, the name of which is specified using the -u flag. The const qualifier indicates that the map is read-only, while the exists annotation indicates that the map must already exist on disk.

The final parameter names the Usage map that holds the result of this signature computation; the -U flag specifies the file name for this map. The new qualifier indicates that the map must not already exist on disk.

In general, the body of sig_main is a sequence of Hancock and C statements. In Usage, sig_main copies the data from y_usage into usage and then invokes Usage's outgoing and incoming phases with the raw call stream and the Usage map under construction as arguments.
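The data flow that this body expresses can be mimicked in plain C. The sketch below is only a toy model under stated assumptions: struct toy_map, struct toy_call, NUM_LINES, and the per-line call counters are hypothetical stand-ins, not Hancock's map or stream types and not the actual Usage computation. It shows the shape of the wiring: copy yesterday's map, then run the outgoing and incoming phases against the map under construction.

```c
#define NUM_LINES 4   /* hypothetical: number of phone lines tracked */

/* Toy stand-in for a Usage map: one pair of counters per line. */
struct toy_map {
    int out_calls[NUM_LINES];
    int in_calls[NUM_LINES];
};

/* Toy stand-in for one record in the call stream. */
struct toy_call {
    int origin;   /* line that placed the call    */
    int dialed;   /* line that received the call  */
};

/* Outgoing phase: update the entry of each call's originating line. */
void out_phase(const struct toy_call *calls, int n, struct toy_map *usage)
{
    for (int i = 0; i < n; i++)
        usage->out_calls[calls[i].origin]++;
}

/* Incoming phase: update the entry of each call's dialed line. */
void in_phase(const struct toy_call *calls, int n, struct toy_map *usage)
{
    for (int i = 0; i < n; i++)
        usage->in_calls[calls[i].dialed]++;
}

/* Mirrors the body of sig_main: the const qualifiers keep the call data
 * and yesterday's map read-only, and both phases see only `usage`. */
void sig_main_sketch(const struct toy_call *calls, int n,
                     const struct toy_map *y_usage, struct toy_map *usage)
{
    *usage = *y_usage;            /* plays the role of usage :=: y_usage */
    out_phase(calls, n, usage);
    in_phase(calls, n, usage);
}
```

Because neither phase receives y_usage, anything a phase reads comes from the copy that is being updated.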

5.1   Discussion

The wiring diagram clarifies the data flow between phases. For example, some signatures need to make off-direction references to signature data. This raises a question: do these references refer to data computed in a previous phase or to data computed the previous day? Looking at sig_main answers it. If the parameters to a phase do not include the original input map, then all references must be to the partially computed map. In Usage, both phases receive only usage, so any such reference reads the map under construction.

The automatic generation of argument parsing code is convenient and removes a source of tedium, but its real benefit is that it connects Hancock variables to their on-disk counterparts. It helps programmers protect valuable data through the const, new, and exists qualifiers. The runtime system catches attempts to write to constant data and generates error messages (see footnote 3). It detects when data annotated as new already exists or when data tagged with exists is not on disk, in each case reporting a run-time error. These data-protection features are important when it is time-consuming or even impossible to reconstruct an accidentally overwritten signature.
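The existence checks described above can be approximated in C with the POSIX access call. This is a sketch of the idea only, not Hancock's runtime; the enum disk_qualifier type and the check_on_disk function are hypothetical names introduced for illustration.

```c
#include <stdio.h>
#include <unistd.h>   /* POSIX access, F_OK */

/* Hypothetical encoding of the exists and new qualifiers. */
enum disk_qualifier { MUST_EXIST, MUST_BE_NEW };

/* Returns 0 if the file at `path` satisfies the qualifier, -1 otherwise. */
int check_on_disk(const char *path, enum disk_qualifier q)
{
    int present = (access(path, F_OK) == 0);

    if (q == MUST_EXIST && !present) {
        fprintf(stderr, "error: %s is marked exists but was not found\n", path);
        return -1;
    }
    if (q == MUST_BE_NEW && present) {
        fprintf(stderr, "error: %s is marked new but already exists\n", path);
        return -1;
    }
    return 0;
}
```

A check of this kind runs before any computation starts, so a mistyped -U argument fails immediately instead of overwriting an existing signature. A production version would probably create the new file with open and the O_CREAT | O_EXCL flags rather than test and create in two steps, since another process could create the file between the check and the write.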

