
Service Placement in a Shared Wide-Area Platform

David Oppenheimer[1], Brent Chun[2], David Patterson[3], Alex C. Snoeren[1], and Amin Vahdat[1]
[1]UC San Diego, [2]Arched Rock Corporation, [3]UC Berkeley
{doppenhe,snoeren,vahdat}@cs.ucsd.edu, bnc@theether.org, pattrsn@cs.berkeley.edu

Abstract:

Emerging federated computing environments offer attractive platforms to test and deploy global-scale distributed applications. When nodes in these platforms are time-shared among competing applications, available resources vary across nodes and over time. Thus, one open architectural question in such systems is how to map applications to available nodes--that is, how to discover and select resources. Using a six-month trace of PlanetLab resource utilization data and of resource demands from three long-running PlanetLab services, we quantitatively characterize resource availability and application usage behavior across nodes and over time, and investigate the potential to mitigate the application impact of resource variability through intelligent service placement and migration.

We find that usage of CPU and network resources is heavy and highly variable. We argue that this variability calls for intelligently mapping applications to available nodes. Further, we find that node placement decisions can become ill-suited after about 30 minutes, suggesting that some applications can benefit from migration at that timescale, and that placement and migration decisions can be safely based on data collected at roughly that timescale. We find that inter-node latency is stable and is a good predictor of available bandwidth; this observation argues for collecting latency data at relatively coarse timescales and bandwidth data at even coarser timescales, using the former to predict the latter between measurements. Finally, we find that although the utilization of a particular resource on a particular node is a good predictor of that node's utilization of that resource in the near future, there do not exist correlations to support predicting one resource's availability based on availability of other resources on the same node at the same time, on availability of the same resource on other nodes at the same site, or on time-series forecasts that assume a daily or weekly regression to the mean.
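As a rough illustration of the placement and migration idea summarized above, the following minimal sketch uses a node's most recent load sample as the prediction for the next interval and re-evaluates placement once per measurement period (the abstract suggests roughly 30 minutes). It is not the method evaluated in the paper: the function names, the single scalar "load" metric, and the `min_gain` migration threshold are all hypothetical choices made for this example.

```python
# Illustrative sketch only: names, the scalar load metric, and the
# min_gain threshold are assumptions, not taken from the paper.

def predict_next(history):
    """Last-value predictor: assume the next interval resembles the latest sample."""
    return history[-1]

def best_node(load_history):
    """Return the node whose predicted load is lowest."""
    return min(load_history, key=lambda node: predict_next(load_history[node]))

def next_placement(current, load_history, min_gain=1.0):
    """Decide where to run for the next interval.

    Migrate only if the predicted load reduction is at least min_gain,
    so that small fluctuations do not trigger a move.
    """
    target = best_node(load_history)
    gain = predict_next(load_history[current]) - predict_next(load_history[target])
    return target if gain >= min_gain else current

# Hypothetical per-node load samples, oldest first, one per measurement period.
load_history = {
    "node-a": [2.0, 5.0],
    "node-b": [4.0, 1.5],
    "node-c": [3.0, 4.8],
}
placement = next_placement("node-a", load_history)
```

In this example a service on `node-a` would move to `node-b`, since the predicted load drops from 5.0 to 1.5. A higher `min_gain` makes migration less frequent, which matters when moving a service is expensive.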



David Oppenheimer 2006-04-14