The 45-day trace contains approximately 24,000,000 HTTP requests,
representing the web-surfing behaviour of over 8,000 unique clients. For
each HTTP request it saw, the trace capture tool collected the following
information:
- the time at which the client made the request, the time that
the first byte of the server response was seen, and the time that
the last byte of the server response was seen,
- the client and server IP addresses and ports,
- the values of the no-cache, keep-alive,
cache-control, if-modified-since, user-agent, and
unless client headers (if present),
- the values of the no-cache, cache-control,
expires, and last-modified server headers (if present),
- the length of the response HTTP header and response data, and
- the request URL.
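As a rough illustration of what one such record holds, the fields listed above could be modelled as follows. This is only a sketch: the class name, field names, and types are hypothetical and do not describe the binary layout IPSE actually wrote to disk.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TraceRecord:
    """One HTTP request as described in the list above (hypothetical layout)."""

    # Timestamps, in seconds; the unit is an assumption for illustration.
    request_time: float      # client made the request
    first_byte_time: float   # first byte of the server response seen
    last_byte_time: float    # last byte of the server response seen

    client_ip: str
    client_port: int
    server_ip: str
    server_port: int

    # Only the headers that were present are stored, keyed by header name.
    client_headers: Dict[str, str] = field(default_factory=dict)
    server_headers: Dict[str, str] = field(default_factory=dict)

    response_header_length: int = 0  # bytes of HTTP response header
    response_data_length: int = 0    # bytes of response data
    url: str = ""

    def time_to_first_byte(self) -> float:
        """Delay between the request and the first response byte."""
        return self.first_byte_time - self.request_time

    def transfer_time(self) -> float:
        """Time between the first and last response bytes."""
        return self.last_byte_time - self.first_byte_time
```

The two helper methods show the kind of derived quantity the three timestamps make available; they are not part of the capture tool.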
IPSE wrote this information to disk in a compact, binary form. Every four
hours, IPSE was shut down and restarted, because a memory leak that we were
unable to eliminate made its memory image grow extremely large over time.
These restarts introduce two potential weaknesses in the traces:
- Any connection active when the engine was brought down will have a
possibly incorrect timestamp for the last byte seen from the server, and a
possibly incorrect reported size.
- Any connection that was opened in the very small time window (about
300 milliseconds) between when the engine was shut down and restarted will
not appear in the logs.
We estimate that no more than 150 entries (out of roughly
90,000-100,000) are misreported in each 4-hour period.
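The figures quoted in this section can be cross-checked with a little arithmetic. The sketch below uses only numbers stated above; the variable names are ours, and taking 90,000 entries as the denominator is an assumption chosen to give the worst-case fraction.

```python
TRACE_DAYS = 45
TOTAL_REQUESTS = 24_000_000          # approximate, as stated above
PERIOD_HOURS = 4                     # IPSE restart interval
MAX_MISREPORTED_PER_PERIOD = 150     # upper-bound estimate
LOW_ENTRIES_PER_PERIOD = 90_000      # low end of the quoted range

# Number of 4-hour periods in the 45-day trace.
periods = TRACE_DAYS * 24 // PERIOD_HOURS

# Average requests per period implied by the trace totals.
avg_requests_per_period = TOTAL_REQUESTS / periods

# Worst-case share of misreported entries in a period.
worst_case_fraction = MAX_MISREPORTED_PER_PERIOD / LOW_ENTRIES_PER_PERIOD

print(periods)
print(round(avg_requests_per_period))
print(f"{worst_case_fraction:.3%}")
```

The implied average of about 89,000 requests per period sits just below the quoted 90,000-100,000 range, which is consistent with both figures being approximate, and the misreported share stays under 0.2% of the entries in a period.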
Steve Gribble
Tue Oct 21 15:56:39 PDT 1997