5 Wiring Diagram
In the previous section, we explained that computing a signature may
require multiple passes over the data. Hancock provides the sig_main
construct to express the data flow between passes and
to connect command-line input to the variables in the program. The arcs
between the phase boxes in Figure 1 depict this
construct. The following code implements sig_main for Usage:
void sig_main(const callStream calls <c:>,
              exists const uMap y_usage <u:>,
              new uMap usage <U:>) {
    usage :=: y_usage;
    out(calls, usage);
    in(calls, usage);
}
There are three parameters to the Usage signature. The first is a
stream that contains the raw call data. The const keyword
indicates that this data is read-only. The syntax <c:>
after the variable name calls specifies that this parameter will
be supplied as a command-line option using the -c flag. The
colon indicates that this flag takes an argument, in this case the
name of the directory that holds the binary call files. The absence
of a colon would indicate that the parameter is a boolean flag. The
Hancock compiler generates code to parse command-line options. The
second parameter is a Usage map, the name of which is specified using
the -u flag. The const qualifier indicates the map is
read-only, while the exists annotation indicates the map must
exist on disk. The final parameter names the Usage map used to hold
the result of this signature computation; the -U flag specifies
the file name for this map. The new qualifier indicates that
the map must not exist on disk.
In general, the body of sig_main is a sequence of Hancock and C
statements. In Usage, sig_main copies the data from
y_usage into usage and then invokes Usage's outgoing and
incoming phases with the raw call stream and the Usage map under
construction as arguments.
5.1 Discussion
The wiring diagram clarifies the dataflow between phases. For
example, some signatures need to make off-direction references to
signature data. A question that arises is: do these references
refer to data computed in a previous phase or to data computed the
previous day? This question can be answered by looking at
sig_main. If the parameters to the phase do not include the
original input map, then all references must be to the partially
computed map.
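The point can be illustrated with a plain C analog of the wiring for Usage. This is only a sketch: the types umap and call, the phase bodies, and the name sig_main_analog are hypothetical stand-ins, not the real Usage signature. What it shows is that a phase handed only the map under construction has no way to reach the previous day's map, so an off-direction read necessarily sees the partially computed data.

```c
#define NKEYS 4  /* hypothetical, tiny key space for illustration */

typedef struct { int total[NKEYS]; } umap;    /* stand-in for uMap */
typedef struct { int from, to, secs; } call;  /* stand-in for a call record */

/* Outgoing phase: updates the entry of each caller. */
static void out_phase(const call *calls, int n, umap *usage)
{
    for (int i = 0; i < n; i++)
        usage->total[calls[i].from] += calls[i].secs;
}

/* Incoming phase: updates the entry of each callee. The read of the
   caller's entry is an off-direction reference; because only `usage`
   is passed in, it sees the values already updated by out_phase. */
static void in_phase(const call *calls, int n, umap *usage)
{
    for (int i = 0; i < n; i++)
        if (usage->total[calls[i].from] > 0)
            usage->total[calls[i].to] += calls[i].secs;
}

/* Analog of sig_main: copy yesterday's map, then run both phases
   on the map under construction. y_usage is never passed to a phase. */
void sig_main_analog(const call *calls, int n,
                     const umap *y_usage, umap *usage)
{
    *usage = *y_usage;           /* corresponds to usage :=: y_usage */
    out_phase(calls, n, usage);  /* corresponds to out(calls, usage) */
    in_phase(calls, n, usage);   /* corresponds to in(calls, usage)  */
}
```

Had y_usage been passed to in_phase as an extra argument, the same read could have been directed at the previous day's data instead; the parameter list alone settles which it is.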
The automatic generation of argument parsing code is convenient and
removes a source of tedium, but its real benefit is that it connects
Hancock variables to their on-disk counterparts. It helps programmers
protect valuable data through the const, new and
exists qualifiers. The runtime system catches attempts to
write to constant data and generates error messages (footnote 3). It
detects when data annotated as new already exists or when
data tagged with exists is not on disk, in each case
reporting a run-time error. These data-protection features are
important when it is time-consuming or even impossible to reconstruct
an accidentally overwritten signature.
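The exists and new checks can be sketched in C as a test made before any data is touched. This is an illustration under stated assumptions, not the Hancock runtime's implementation: the enum disk_req and the function check_disk_req are hypothetical names, and the sketch simplifies by treating any failure of stat as "not on disk".

```c
#include <stdio.h>
#include <sys/stat.h>  /* POSIX stat */

/* Hypothetical encoding of the two on-disk qualifiers. */
enum disk_req { REQ_EXISTS, REQ_NEW };

/* Returns 0 if `path` satisfies the qualifier, -1 otherwise.
   Simplification: any stat failure is treated as "not on disk". */
int check_disk_req(const char *path, enum disk_req req)
{
    struct stat st;
    int present = (stat(path, &st) == 0);

    if (req == REQ_EXISTS && !present) {
        fprintf(stderr, "error: %s must exist on disk\n", path);
        return -1;
    }
    if (req == REQ_NEW && present) {
        fprintf(stderr, "error: %s already exists\n", path);
        return -1;
    }
    return 0;
}
```

Running such checks before the first phase starts means a mistyped -U argument is rejected up front rather than overwriting an existing signature.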