Response Time vs. Data Set Size


A deeper investigation of the effect of data set size on server latency provides more insight into the blocking problems, as well as a surprising result. Figure 8 shows mean and median latencies as functions of data set size. The mean latency remains relatively flat for the in-memory workload, but begins to grow when the data set size exceeds the physical memory of the machine, 1 GB. This increase in mean latency is expected, since filesystem cache misses require disk access, and the disk latency raises the mean.

The increase in median latency is quite surprising for this workload: the measured cache hit rate is more than 99%, suggesting that most requests should be comfortably served out of the filesystem cache. The cache hit rate is in line with what we showed in Table 2. These tests confirm that the small amount of cache miss activity is interfering with accesses that should be cache hits.
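The distinction between mean and median can be illustrated with a toy model. The sketch below is not the measured server or workload: the function `simulate`, the single-process FIFO assumption, and every parameter value (1 ms hits, 20 ms misses, one miss per 100 requests, one arrival every 1.25 ms) are hypothetical choices made only to show the effect. With independent requests, a 99% hit rate leaves the median at the hit latency while the mean rises; once hits must queue behind a blocked miss, the median rises as well.

```python
import statistics


def simulate(n_requests, miss_every, hit_us, miss_us, interval_us, blocking):
    """Return per-request latencies in microseconds for a toy server.

    Every `miss_every`-th request is a cache miss; all others are hits.
    blocking=False: each request pays only its own service time.
    blocking=True: one process serves requests in FIFO order, so a request
    also waits for everything queued ahead of it (assumed model).
    """
    latencies = []
    free_at = 0  # time at which the single process next becomes idle
    for i in range(n_requests):
        arrival = i * interval_us
        service = miss_us if (i + 1) % miss_every == 0 else hit_us
        if blocking:
            start = max(arrival, free_at)
            free_at = start + service
            latencies.append(free_at - arrival)
        else:
            latencies.append(service)
    return latencies


# Hypothetical parameters, chosen for illustration only (99% hit rate).
PARAMS = dict(n_requests=10_000, miss_every=100,
              hit_us=1_000, miss_us=20_000, interval_us=1_250)

if __name__ == "__main__":
    for blocking in (False, True):
        lat = simulate(blocking=blocking, **PARAMS)
        print(f"blocking={blocking}: "
              f"mean={statistics.mean(lat):.0f} us, "
              f"median={statistics.median(lat):.0f} us")
```

The hit rate is identical in both runs; only the queuing behind blocked requests differs. Under these assumed parameters the non-blocking median stays at the 1 ms hit latency, whereas in the blocking run most hits arrive while a backlog from the previous miss is still draining, which pulls the median up along with the mean.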

This observation is problematic, because it implies that, for non-trivial workloads, server latency is tied to disk performance, even for cached requests. Without server or operating system modification, latency scalability is therefore tied to mechanical improvements, rather than to the faster improvements in electronic components. The expected latency behavior would have been precisely the opposite: as the number of disk accesses increased and the overall throughput decreased, the median latency would actually decrease, since fewer requests would be contending for the CPU at any time. Queuing delays related to CPU scheduling would be mitigated, as would any network contention effects.

**Figure 9:** Service inversion example. Assume three requests (A, B, and C) arrive at the same time, and A is processed first. If it is cached and is sent to the networking code in the kernel bottom half, interrupt-based processing for it can continue even if the process gets blocked. In this case, even if A is large, it may finish before processing on C even starts.


Yaoping Ruan
2006-04-18