Next: Acknowledgements Up: System Design Issues for Previous: Related Work

Conclusions

In this paper, we presented the results of an extensive, unintrusive client-side HTTP tracing effort. These traces were gathered from a 10 Mb/s Ethernet over which traffic from 600 modems (used by more than 8,000 UC Berkeley Home IP users) flowed. Forty-five days' worth of traces were gathered. We used a custom module written on top of the Internet Protocol Scanning Engine (IPSE) to perform on-the-fly traffic reconstruction, HTTP protocol parsing, and trace file generation. Doing this on the fly allowed us to write out only the information that interested us, giving us smaller and more manageable trace files.
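As a rough illustration of how on-the-fly parsing keeps trace files small, the sketch below reduces one reconstructed request/response pair to a compact record. It is not the IPSE module or its trace format: the `TraceRecord` fields and the `parse_headers` and `make_record` helpers are hypothetical names chosen for this example, and the inputs are assumed to be already-reassembled byte streams.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class TraceRecord:
    """Hypothetical compact record; the fields are illustrative only."""
    timestamp: float
    method: str
    url: str
    status: Optional[int]
    content_length: Optional[int]


def parse_headers(raw: bytes) -> Tuple[str, Dict[str, str]]:
    """Split a raw HTTP message into its first line and a lowercase header dict."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # skip malformed lines that have no colon
            headers[name.strip().lower()] = value.strip()
    return lines[0], headers


def make_record(timestamp: float, request: bytes, response: bytes) -> TraceRecord:
    """Keep only the fields of interest and discard the message bodies."""
    request_line, _ = parse_headers(request)
    req_parts = request_line.split()
    method = req_parts[0] if req_parts else ""
    url = req_parts[1] if len(req_parts) > 1 else ""

    status_line, resp_headers = parse_headers(response)
    resp_parts = status_line.split(" ", 2)
    status = int(resp_parts[1]) if len(resp_parts) > 1 and resp_parts[1].isdigit() else None

    length = resp_headers.get("content-length")
    content_length = int(length) if length is not None and length.isdigit() else None
    return TraceRecord(timestamp, method, url, status, content_length)
```

Because the body is dropped as soon as the headers are parsed, each record stays at a few dozen bytes regardless of the size of the object transferred.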

We measured and observed a number of interesting properties in our Home IP HTTP traces, from which we have drawn a number of conclusions related to Internet middleware service design:

Although most web clients can be classified as accessing Internet services using PC-based browsers on desktop machines, there is significant heterogeneity in the client population that Internet middleware services must be prepared to handle.
There is an extremely prominent diurnal cycle affecting the rate at which clients access services. Furthermore, clients' activity is relatively smooth at large time scales (on the order of tens of minutes, hours, or days), but increasingly bursty at smaller time scales (on the order of minutes or seconds). Internet middleware services can thus provision their resources based on the request rate observed over several hours, provided they can afford to smooth the bursts observed over second-long time scales.
There is a very large amount of locality of reference within clients' requests. The amount of locality increases with the client population size, as does the working set of the client population. Thus, caches that take advantage of this locality must grow in size in parallel with the client population that they service in order to avoid thrashing.
Although Internet services tend to be very reactive, the latency of delivering data to clients is quite high, implying that there could potentially be many hundreds or thousands of outstanding, parallel requests being handled by a middleware service. Services must thus minimize the amount of state and switching overhead associated with these outstanding, mostly idle tasks.
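The last point can be made concrete with a back-of-the-envelope estimate. The sketch below applies Little's law (mean number in system = arrival rate x mean time in system), which is a standard queueing result and not a calculation taken from this paper. The function name and the numbers in the usage note are assumptions chosen for illustration, not measurements from the traces.

```python
def outstanding_requests(arrival_rate_per_s: float, mean_latency_s: float) -> float:
    """Estimate the mean number of in-flight requests via Little's law.

    arrival_rate_per_s: mean request arrival rate (requests per second).
    mean_latency_s: mean time from request arrival to delivery of the
        last byte to the client (seconds).
    """
    if arrival_rate_per_s < 0 or mean_latency_s < 0:
        raise ValueError("rate and latency must be non-negative")
    return arrival_rate_per_s * mean_latency_s
```

For example, a hypothetical service receiving 200 requests per second with a mean delivery latency of 5 seconds would hold about `outstanding_requests(200.0, 5.0)` = 1000 requests in flight on average, most of them idle while waiting on slow client links.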


Steve Gribble
Tue Oct 21 15:56:39 PDT 1997