A latency service based on NCs exploits several properties of NCs that help satisfy the design goals from Section 2.
To achieve simple application integration, we propose two different architectures: a stand-alone NC service and a per-application NC library. Both approaches have the advantage that they provide applications with a correct implementation of NCs. As will be explained in Section 4, the application programmer does not have to deal with the complexity of latency measurement.
Network Coordinate Service. If the network infrastructure is cooperative and under the control of a single authority, such as PlanetLab, an efficient solution is to deploy an NC service on all the nodes. Each application then accesses the locally running NC service. This has the advantage that the cost of inter-node measurements is amortized across all applications that share the service. A drawback of this approach is that parameters such as the measurement frequency, which determines how quickly the NCs converge, must be set globally for all applications.
Above we show the API of the latency service that is part of our SBON deployment [17] on PlanetLab. The function estimateLat returns the latency estimate between the local node and a remote node, given the remote node's NC. The local NC and confidence are returned by the getNC and getConfidence calls, respectively. A call to getRelError returns the current median relative error over the most recent latency measurements that were used for coordinate updates. If the application needs an up-to-date latency to a remote node, a call to forceUpdate causes the NC service to perform a measurement to the remote node and return the observed latency. This API assumes that nodes in a distributed application are identified by an IP address and NC pair, (IPAddr,NC). As a result, any node can obtain a latency estimate to any other node about which it has learned.
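The calls described above can be sketched as a small in-process stand-in for the service. This is a minimal sketch under stated assumptions, not the SBON implementation: the argument lists, the plain Euclidean distance used by estimateLat, the window size, and the probe callable are all hypothetical, and the coordinate update that a real service would perform after each measurement is omitted.

```python
import math
from collections import deque
from statistics import median


class NCService:
    """Hypothetical stand-in for the locally running NC service."""

    def __init__(self, nc, probe, confidence=0.0, window=16):
        self._nc = tuple(nc)            # local network coordinate
        self._confidence = confidence   # assumed to be a plain number
        self._probe = probe             # assumed callable: ip_addr -> latency in ms
        self._rel_errors = deque(maxlen=window)  # recent relative errors

    def getNC(self):
        return self._nc

    def getConfidence(self):
        return self._confidence

    def estimateLat(self, remote_nc):
        # Assumption: latency is estimated as the Euclidean distance
        # between the local and the remote coordinate.
        return math.dist(self._nc, remote_nc)

    def getRelError(self):
        # Median relative error over the most recent measurements,
        # or None if nothing has been measured yet.
        return median(self._rel_errors) if self._rel_errors else None

    def forceUpdate(self, ip_addr, remote_nc):
        # Measure the remote node now and return the observed latency.
        # Assumes the observed latency is positive.
        observed = self._probe(ip_addr)
        predicted = self.estimateLat(remote_nc)
        self._rel_errors.append(abs(predicted - observed) / observed)
        return observed
```

An application that has learned the pair (IPAddr,NC) of a peer would call estimateLat with the peer's NC for a cheap estimate, and fall back to forceUpdate only when it needs a fresh measurement.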
Network Coordinate Library. In some cases, an application should include a module for latency estimation without relying on an externally running service. This is true for peer-to-peer applications that are deployed on a varying set of heterogeneous nodes. To address this, we also propose an NC library that any application can link against to support NCs. To avoid duplicating functionality, the library handles only the computation of coordinates and leaves the network communication needed for probing to the application. This enables the application to exploit application traffic as much as possible for measurements.
In addition to the functions provided by the stand-alone service, the NC library API has a function updateNC that is used by the application to feed in new network measurements from application-level traffic. Only if the application-level traffic is not frequent enough or does not cover a large enough set of nodes to compute an accurate NC does the library request additional latency measurements from the application. As will be explained in Section 4.2, the NC library monitors its relative error to decide whether the NC is converging sufficiently. If this is not the case, it uses the forceUpdate callback to the application to request more diverse measurements by initiating a latency measurement to a new remote node.
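The interaction between updateNC and the forceUpdate callback can be illustrated with a minimal sketch. Everything beyond the two names is an assumption made for illustration: the callback takes no arguments, the coordinate update is a simplified spring-style step rather than the algorithm of Section 4, and the error threshold, window size, and step size are hypothetical parameters.

```python
import math
from collections import deque
from statistics import median


class NCLibrary:
    """Hypothetical per-application NC library: it computes coordinates
    only and leaves all network probing to the application."""

    def __init__(self, dims, force_update, err_threshold=0.5, window=8, step=0.25):
        self.nc = [0.0] * dims
        self._force_update = force_update  # application callback (assumed: no args)
        self._err_threshold = err_threshold
        self._step = step
        self._rel_errors = deque(maxlen=window)

    def getRelError(self):
        return median(self._rel_errors) if self._rel_errors else None

    def updateNC(self, remote_nc, latency_ms):
        # Feed in one measurement taken from application-level traffic.
        # Assumes latency_ms is positive.
        predicted = math.dist(self.nc, remote_nc)
        self._rel_errors.append(abs(predicted - latency_ms) / latency_ms)

        # Simplified spring-style step (an assumption): move along the line
        # through the remote node, away from it if the estimate was too
        # short and toward it if the estimate was too long.
        if predicted > 0:
            direction = [(a - b) / predicted for a, b in zip(self.nc, remote_nc)]
        else:
            # Coordinates coincide; pick a fixed axis for this sketch.
            direction = [1.0] + [0.0] * (len(self.nc) - 1)
        delta = self._step * (latency_ms - predicted)
        self.nc = [a + delta * d for a, d in zip(self.nc, direction)]

        # Once the window is full, a high median relative error is taken
        # as a sign of poor convergence, and the application is asked to
        # measure a new remote node.
        window_full = len(self._rel_errors) == self._rel_errors.maxlen
        if window_full and self.getRelError() > self._err_threshold:
            self._force_update()
```

In this sketch the application would respond to the callback by probing a node it has not measured recently and passing the result back through updateNC, so the library itself never opens a socket.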
Jonathan Ledlie 2005-10-18