Check out the new USENIX Web site. next up previous
Next: Rate compensation Up: PCP Design Previous: PCP Design

Emulating Request-and-Set

Figure: Example of a successful and a failed PCP probe.
\begin{figure}
\centering
\epsfig{file=graphs/simple.eps, scale=.85}
\end{figure}

PCP is very simple at its most basic (Figure 1): endpoints send probe packets, which are short sequences of packets spaced at a target test rate, to determine if the network has the capacity to accommodate the request. If this probe is successful, the end host can immediately increase its base rate by the target rate of the probe; it then transmits the baseline packets in a paced (equally spaced) manner at the new base rate. The probe rate is adjusted (primarily) by changing the duration over which the probe data is sent, not the amount of data that is sent. If the initial probe is unsuccessful (e.g., the network does not have the spare capacity), the end host must try again. A key element of our approach is that endpoints only increase their base sending rates immediately after a successful probe, and at no other time; thus, modulo the round trip delay, a successful probe indicates that the network resource is unallocated.

Probe success and failure is defined by whether the probe packets induce queueing inside the network, measured by whether the delay increases during the probe. For each PCP data packet, a timestamp is recorded at the receiver and returned to the sender, akin to the TCP timestamp option [26]; measuring changes in delay at the receiver allows us to eliminate variability induced by the reverse path. Our test for available bandwidth is similar to the one suggested by Jain and Dovrolis [29], but is designed to be more robust to the noise caused by PCP's probe process. Specifically, we use least squares to fit a line through the sequence of delay measurements and accept the test rate if the measurements are consistent with flat or decreasing delay; otherwise, the probe is rejected. Furthermore, since measurements are rarely determinative of the true state of the network, we use a ``probabilistic accept'' rule that accepts probes non-deterministically. The least squares fit yields a probability distribution function characterized by the estimated delay slope and a standard error for the estimated value. A low error indicates a good fit, while a high value might be due to measurement noise or variability in cross traffic. We randomly choose whether to accept a probe based on this probability distribution.

We do not assume a hard real-time operating system implementation; some jitter is acceptable in the scheduling of packets as the fitting process is robust to small variations in packet spacing. We however assume the availability of fine-grained clocks; nanosecond clocks and timers have become commonplace on modern processors [40], sufficient for scaling to gigabit speeds. Generic timestamp and packet pacing logic is also becoming increasingly common on network interface hardware.

To enable an apples to apples comparison, we set the initial probe size to match the initial TCP packet. After an initial packet exchange to verify the receiver is willing to accept packets (a mirror of the TCP SYN exchange), PCP sends, as its first probe, data equal to a maximum size packet, but divided into $k$ separate chunks (currently, 5). Probe packets may carry live data, so that if the data to be transferred is sufficiently small, it may complete within the probe packets - that is, whether or not the network can accept packets indefinitely at the probe rate. Although some have proposed using a larger initial window size in TCP to speed its discovery of the available bandwidth for high capacity paths, this would come at the potential cost of added congestion on low capacity paths. By separating the probe rate from the size of the probe, PCP avoids having to make this trade-off; as we describe below, we can safely use history to select an aggressive rate for the initial test.

If the probe is successful, PCP immediately jumps to the requested rate. As a minor optimization, once an end host is successful in obtaining sufficient bandwidth, we convert to using $k$ maximum sized packets as probes, again, paced at the target bit rate. This provides better accuracy for high bandwidth paths.


next up previous
Next: Rate compensation Up: PCP Design Previous: PCP Design
Arvind Krishnamurthy 2006-04-06