Workshop Program

All sessions will be held in Columbus Ballroom unless otherwise noted.

The full papers published by USENIX for this workshop are available as a download or individually below. Copyright to the individual works is retained by the author(s).


Tuesday, June 17, 2014

8:00 a.m.–9:00 a.m. Tuesday

Continental Breakfast

Columbus Foyer

9:00 a.m.–10:00 a.m. Tuesday

Joint Keynote with HotStorage

Cloudy with a Chance of …

Michael Franklin, Thomas M. Siebel Professor of Computer Science, University of California, Berkeley

Available Media

10:00 a.m.–10:30 a.m. Tuesday

Break with Refreshments

Columbus Foyer

10:30 a.m.–12:10 p.m. Tuesday

Joint Session with HotStorage

Session Chair: Curt Kolovson, VMware

Convergent Dispersal: Toward Storage-Efficient Security in a Cloud-of-Clouds

Mingqiang Li, Chuan Qin, Patrick P. C. Lee, The Chinese University of Hong Kong; Jin Li, Guangzhou University

Cloud-of-clouds storage exploits diversity of cloud storage vendors to provide fault tolerance and avoid vendor lock-ins. Its inherent diversity property also enables us to offer keyless data security via dispersal algorithms. However, the keyless security of existing dispersal algorithms relies on the embedded random information, which breaks data deduplication of the dispersed data. To simultaneously enable keyless security and deduplication, we propose a novel dispersal approach called convergent dispersal, which replaces original random information with deterministic cryptographic hash information that is derived from the original data but cannot be inferred by attackers without knowing the whole data. We develop two convergent dispersal algorithms, namely CRSSS and CAONT-RS. Our evaluation shows that CRSSS and CAONT-RS provide complementary performance advantages for different parameter settings.
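
The key idea, deriving the "random" information deterministically from the data itself, can be illustrated with a minimal Python sketch. This is not CRSSS or CAONT-RS (the Reed-Solomon coding and AONT machinery are omitted); the XOR keystream stands in for a real transform, and the function names are hypothetical:

```python
import hashlib

def convergent_key(data: bytes) -> bytes:
    # Deterministic "key" derived from the data itself: identical
    # plaintexts yield identical keys, so ciphertexts deduplicate,
    # yet the key cannot be inferred without the whole data.
    return hashlib.sha256(data).digest()

def xor_transform(data: bytes, key: bytes) -> bytes:
    # Toy deterministic transform (XOR keystream) standing in for
    # a real AONT/secret-sharing step; XOR is its own inverse.
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)

data = b"the same plaintext stored by two different tenants"
c1 = xor_transform(data, convergent_key(data))
c2 = xor_transform(data, convergent_key(data))
assert c1 == c2                                         # deduplicable
assert xor_transform(c1, convergent_key(data)) == data  # reversible
```

Deterministic ciphertexts are what make cross-user deduplication possible; the trade-off, as with convergent encryption generally, is that an attacker who can guess the exact plaintext can confirm the guess.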

Available Media

On the Feasibility of Data Loss Insurance for Personal Cloud Storage

Xiaosong Ma, Qatar Computing Research Institute

Personal data are important assets that people nowadays entrust to cloud storage services for the convenience of easy, ubiquitous access. To attract/retain customers, cloud storage companies aggressively replicate and geo-replicate data. Such replication may be over-cautious for the majority of data objects and contributes to the relatively high price of cloud storage. Yet cloud storage companies are reluctant to provide customers with any guarantee against permanent data loss.

In this paper, we discuss the viability of cloud storage services offering optional data insurance. We examine the major risks associated with cloud storage data loss and derive a crude model for premium calculation. The estimated premium level (per unit declared value) in most scenarios is found to be significantly smaller than that accepted in mature businesses like shipping. Therefore, optional insurance can potentially provide cloud storage services with more flexibility and cost-effectiveness in resource management, and customers with both peace of mind and lowered cost.
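
A crude premium model of the kind the abstract describes can be sketched as expected annual loss plus a loading factor; the numbers and the loading parameter below are illustrative, not the paper's estimates:

```python
def premium(annual_loss_prob: float, declared_value: float,
            loading: float = 0.3) -> float:
    # Crude actuarial sketch: expected annual loss plus a
    # proportional overhead/profit loading.
    return annual_loss_prob * declared_value * (1.0 + loading)

# With an illustrative one-in-a-million annual loss probability,
# insuring $1,000 of declared value costs well under a cent a year.
annual_premium = premium(1e-6, 1000.0)
assert annual_premium < 0.01
```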

Available Media

Harmonium: Elastic Cloud Storage via File Motifs

Helgi Sigurbjarnarson, Petur Orri Ragnarsson, Ymir Vigfusson, Reykjavik University; Mahesh Balakrishnan, Microsoft Research

Modern applications expand to fill the space available to them, exploiting local storage to improve performance by caching, prefetching and precomputing data. In virtualized settings, this behavior compromises storage elasticity owing to a rigid contract between the hypervisor and the guest OS: once space is allocated to a virtual disk and used by an application, it cannot be reclaimed by the hypervisor. In this paper, we propose a new guest filesystem called Harmonium that exploits the ephemeral or derivative nature of application data. Each file in Harmonium optionally has a motif that describes how the file can be reconstructed via computation, network accesses, or operations on other files. Harmonium expands files from their motifs when space is available, and contracts them back to their motifs when it is scarce. Given a target size, the system selects files to expand or contract based on the load on the CPU, network, and storage, as well as expected access patterns. As a result, Harmonium enables elastic cloud storage, allowing the hypervisor to dynamically balance storage across multiple VMs.
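
The expand/contract life cycle of a motif-backed file can be sketched in a few lines of Python; the class and method names are hypothetical, not Harmonium's actual interface:

```python
class ElasticFile:
    # Minimal sketch of a motif-backed file: `contract()` drops the
    # materialized bytes, keeping only a recipe for rebuilding them,
    # and `read()` re-expands on demand.
    def __init__(self, motif):
        self.motif = motif   # recipe: how to regenerate the data
        self.data = None     # materialized bytes (may be absent)

    def read(self):
        if self.data is None:    # expand from the motif when needed
            self.data = self.motif()
        return self.data

    def contract(self):
        self.data = None         # reclaim space under pressure

f = ElasticFile(lambda: bytes(i * i % 256 for i in range(1000)))
first = f.read()
f.contract()                     # e.g. the hypervisor needs the space
assert f.data is None
assert f.read() == first         # regenerated deterministically
```

The interesting policy questions, which files to contract first given CPU, network, and storage load, sit on top of this simple mechanism.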

Available Media

Impression Store: Compressive Sensing-based Storage for Big Data Analytics

Jiaxing Zhang, Ying Yan, Liang Jeff Chen, Minjie Wang, Thomas Moscibroda, and Zheng Zhang, Microsoft Research

For many big data analytics workloads, approximate results suffice. This raises the question of whether and how the underlying system architecture can take advantage of such relaxations, thereby lifting constraints inherent in today’s architectures. This position paper explores one of the possible directions. Impression Store is a distributed storage system with the abstraction of big data vectors. It aggregates updates internally and responds to the retrieval of top-K high-value entries. With proper extension, Impression Store supports various aggregations, top-K queries, outlier and major mode detection. While restricted in scope, such queries represent a substantial and important portion of many production workloads. In return, the system has unparalleled scalability; any node in the system can process any query, both reads and updates. The key technique we leverage is compressive sensing, a technique that substantially reduces the amount of active memory state, IO, and traffic volume needed to achieve such scalability.

Available Media

12:10 p.m.–1:30 p.m. Tuesday

FCW '14 Luncheon

Grand Ballroom ABC

1:30 p.m.–3:35 p.m. Tuesday

Systems and Architecture

Session Chair: Paolo Costa, Microsoft Research

Academic Cloud Computing Research: Five Pitfalls and Five Opportunities

Adam Barker, Blesson Varghese, Jonathan Stuart Ward, and Ian Sommerville, University of St Andrews

This discussion paper argues that five fundamental pitfalls can restrict academics from conducting cloud computing research at the infrastructure level, the level at which the vast majority of academic research currently lies. Instead, academics should be conducting higher-risk research in order to gain understanding and open up entirely new areas.

We call for a renewed mindset and argue that academic research should focus less upon physical infrastructure and embrace the abstractions provided by clouds through five opportunities: user-driven research, new programming models, PaaS environments, and improved tools to support elasticity and large-scale debugging. The objective of this paper is to foster discussion and to define a roadmap forward that will allow academia to make longer-term impacts on the cloud computing community.

Available Media

Towards a Leaner Geo-distributed Cloud Infrastructure

Iyswarya Narayanan, The Pennsylvania State University; Aman Kansal, Microsoft Corporation; Anand Sivasubramaniam and Bhuvan Urgaonkar, The Pennsylvania State University; Sriram Govindan, Microsoft Corporation

Modern cloud infrastructures are geo-distributed. Geo-distribution offers many advantages but can increase the total cloud capacity required. To achieve low latency, geo-distribution forfeits the statistical multiplexing of demand that a single data center could benefit from. Geo-distribution also complicates software design due to storage consistency issues. On the other hand, geo-distribution can lower costs by eliminating redundancies at individual sites or exploiting regional differences in energy prices. We discuss several factors that influence geo-distributed capacity provisioning, and quantify the latency, availability, and capacity trade-offs that emerge. We describe open research challenges in designing software that efficiently uses cloud capacity.

Available Media

A Way Forward: Enabling Operating System Innovation in the Cloud

Dan Schatzberg, James Cadden, Orran Krieger, and Jonathan Appavoo, Boston University

Cloud computing has not resulted in a fundamental change to the underlying operating systems. Rather, distributed applications are built over middleware that provides high-level abstractions to exploit the cloud’s scale and elasticity. This middleware conjoins many general purpose OS instances.

Others have demonstrated that a new operating system built specifically for the cloud can achieve increased efficiency, scale and functionality [11, 14]. However, this work does not take into account the way applications are being deployed in cloud environments. In particular, entire physical or virtual machines are being dedicated to run a single application, rather than concurrently supporting many users and multiple applications.

In this paper we introduce a new model for distributed applications that embraces a reduced role of the OS in the cloud. It allows for the construction of application-driven compositions of OS functionality wherein each application can employ its own customized operating system.

Available Media

Software Defining System Devices with the "Banana" Double-Split Driver Model

Dan Williams and Hani Jamjoom, IBM T. J. Watson Research Center; Hakim Weatherspoon, Cornell University

This paper introduces a software defined device driver layer that enables new ways of wiring devices within and across cloud environments. It builds on the split driver model, which is used in paravirtualization (e.g., Xen) to multiplex hardware devices across all VMs. In our approach, called the Banana Double-Split Driver Model, the back-end portion of the driver is resplit and rewired such that it can be connected to a different back-end driver on another hypervisor. Hypervisors supporting Banana cooperate with each other to (1) expose a consistent interface to rewire the back-end drivers, (2) allow different types of connections (e.g., tunnels, RDMA, etc.) to coexist and be hot-swapped to optimize for placement, proximity, and hardware, and (3) migrate backend connections between hypervisors to maintain connectivity irrespective of physical location. We have implemented an early prototype of Banana for network devices. We show how network devices can be split, rewired, and live migrated across cloud providers with as low as 1.4 sec of downtime, while fully maintaining the logical topology between application components.

Available Media

Building Scalable Multimedia Search Engine Using Infiniband

Qi Chen, Peking University; Yisheng Liao, Christopher Mitchell, and Jinyang Li, New York University; Zhen Xiao, Peking University

The approach of vertically partitioning the index has long been considered impractical for building a distributed search engine due to its high communication cost. With the recent surge of interest in using High Performance Computing networks such as InfiniBand in the data center, we argue that vertical partitioning is not only practical but also highly scalable. To demonstrate our point, we built a distributed image search engine (VertiCut) that performs multi-round approximate neighbor searches to find similar images in a large image collection.

Available Media

3:35 p.m.–4:05 p.m. Tuesday

Break with Refreshments

Columbus Foyer

4:05 p.m.–5:45 p.m. Tuesday

Mobility and Security

Session Chair: Phillip Gibbons, Intel Labs

POMAC: Properly Offloading Mobile Applications to Clouds

Mohammed A. Hassan, George Mason University; Kshitiz Bhattarai, SAP Lab; Qi Wei and Songqing Chen, George Mason University

Prior research on mobile computation offloading has mainly focused on how to offload as well as what to offload. However, the problem of whether the offloading should be done has attracted much less attention. In addition, existing offloading schemes either require special compilation or modification to the applications’ source code or binary, making them difficult to deploy in practice. In this work, we introduce POMAC, a framework to enable dynamic and transparent mobile application offloading to clouds. A prototype has been implemented on the Dalvik virtual machine, and our preliminary evaluations show that POMAC can outperform existing schemes significantly and work with real-world applications seamlessly.

Available Media

Mobile App Acceleration via Fine-Grain Offloading to the Cloud

Chit-Kwan Lin, UpShift Labs, Inc.; H. T. Kung, Harvard University

Mobile device hardware can limit the sophistication of mobile applications. One strategy for side-stepping these constraints is to opportunistically offload computations to the cloud, where more capable hardware can do the heavy lifting. We propose a platform that accomplishes this via compressive offloading, a novel application of compressive sensing in a distributed shared memory setting. Our prototype gives up to an order-of-magnitude acceleration and 60% longer battery life to the end user of an example handwriting recognition app. We argue that offloading is beneficial to both end users and cloud providers—the former experiences a performance boost and the latter receives a steady stream of small computations to fill periods of under-utilization. Such workloads, originating from ARM-based mobile devices, are especially well-suited for offloading to emerging ARM-based data centers.

Available Media

Leveraging Virtual Machine Introspection for Hot-Hardening of Arbitrary Cloud-User Applications

Sebastian Biedermann and Stefan Katzenbeisser, Technische Universität Darmstadt; Jakub Szefer, Yale University

Correctly applying the security settings of various different applications is a time-consuming and in some cases very difficult task. Moreover, with the explosion in cloud computing's popularity, cloud users can download and run pre-packaged virtual appliances. Many users may assume that these come with correct security settings and never bother to check or update those settings. In this paper we propose an architecture that can automatically and transparently improve the security settings of arbitrary network applications in a cloud computing setup. Users can deploy virtual machines with different applications, and our system will attempt to find and test better security settings tailored to their specific setup. We call this approach “hot-hardening” since our techniques are applied to running applications.

Note: The video and audio of this presentation have been removed at the request of the authors.

Available Media

Practical Confidentiality Preserving Big Data Analysis

Julian James Stephen, Savvas Savvides, Russell Seidel, and Patrick Eugster, Purdue University

The “pay-as-you-go” cloud computing model has strong potential for efficiently supporting big data analysis jobs expressed via data-flow languages such as Pig Latin. Due to security concerns, in particular leakage of data, government and enterprise institutions are however reluctant to move data and corresponding computations to public clouds. We present Crypsis, a system that allows execution of MapReduce-style data analysis jobs directly on encrypted data. Crypsis transforms data analysis scripts written in Pig Latin so that they can be executed on encrypted data. Crypsis to that end employs existing practical partially homomorphic encryption (PHE) schemes, and adopts a global perspective in that it can perform partial computations on the client side when PHE alone would fail. We outline the original program transformations underlying Crypsis for reducing the cost of data analysis computations in this larger perspective. We show the practicality of our approach by evaluating Crypsis on standard benchmarks.

Available Media

6:00 p.m.–7:00 p.m. Tuesday

Tuesday Happy Hour

Columbus Foyer

 

Wednesday, June 18, 2014

8:00 a.m.–9:00 a.m. Wednesday

Continental Breakfast

Columbus Foyer

9:00 a.m.–10:00 a.m. Wednesday

Keynote Address

Programming Cloud Infrastructure

Albert Greenberg, Director of Development, Microsoft Azure Networking

Large-scale cloud infrastructure requires many management applications to run concurrently in order to function smoothly. Traditional management applications fall short, tripped up by unexpected failures and unanticipated interference. We present a graph-based data model for cloud infrastructure, enabling architects and operators to describe large-scale, complex infrastructure. A goal-state-driven framework enables quick and safe application development, sustaining infrastructure growth and maintenance at huge scale.

Available Media

10:00 a.m.–10:30 a.m. Wednesday

Break with Refreshments

Columbus Foyer

10:30 a.m.–12:10 p.m. Wednesday

Networking

Session Chair: Jonathan Appavoo, Boston University

SmartSwitch: Blurring the Line Between Network Infrastructure & Cloud Applications

Wei Zhang and Timothy Wood, The George Washington University; K.K. Ramakrishnan, Rutgers University; Jinho Hwang, IBM T. J. Watson Research Center

A revolution is beginning in communication networks with the adoption of network function virtualization, which allows network services to be run on common off-the-shelf hardware—even in virtual machines—to increase flexibility and lower cost. An exciting prospect for cloud users is that these software-based network services can be merged with compute and storage resources to flexibly integrate all of the cloud’s resources.

We are developing an application-aware networking platform that can perform not only basic packet switching, but also typical functions left to compute platforms, such as load balancing based on application-level state, localized data caching, and even arbitrary computation. Our prototype “memcached-aware smart switch” reduces request latency by half and increases throughput eightfold compared to Twitter’s TwemProxy. We also describe how a Hadoop-aware switch could automatically cache data blocks near worker nodes, or perform some computation directly on the data stream. This approach enables a new breed of application designs that blur the line between the cloud’s network and its servers.

Available Media

Rethinking the Network Stack for Rack-scale Computers

Paolo Costa, Hitesh Ballani, and Dushyanth Narayanan, Microsoft Research

The rack is increasingly replacing individual servers as the basic building block of modern data centers. Future rack-scale computers will comprise a large number of tightly integrated systems-on-chip, interconnected by a switch-less internal fabric. This design enables thousands of cores per rack and provides high bandwidth for rack-scale applications. Most of the benefits promised by these new architectures, however, can only be achieved with adequate support from the software stack.

In this paper, we take a step in this direction by focusing on the network stack for rack-scale computers. Using routing and rate control as examples, we show how the peculiarities of rack architectures allow for new approaches that are attuned to the underlying hardware. We also discuss other exciting research challenges posed by rack-scale computers.

Available Media

LOOM: Optimal Aggregation Overlays for In-Memory Big Data Processing

William Culhane, Kirill Kogan, Chamikara Jayalath, and Patrick Eugster, Purdue University

Aggregation underlies the distillation of information from big data. Many well-known basic operations, including top-k matching and word count, hinge on fast aggregation across large datasets. Common frameworks, including MapReduce, support aggregation but do not explicitly consider or optimize it. Optimizing aggregation, however, becomes even more relevant in recent “online” approaches to expressive big data analysis, which store data in main memory across nodes. This shifts the bottlenecks from disk I/O to distributed computation and network communication and significantly increases the impact of aggregation time on total job completion time.

This paper presents LOOM, a (sub)system for efficient big data aggregation for use within big data analysis frameworks. LOOM efficiently supports two-phased (sub)computations consisting of a first phase performed on individual data subsets (e.g., word count, top-k matching) followed by a second aggregation phase that consolidates the individual results of the first phase (e.g., count sum, top-k). Using characteristics of an aggregation function, LOOM constructs a specifically configured aggregation overlay to minimize aggregation costs. We present optimality heuristics and experimentally demonstrate the benefits of optimizing aggregation overlays in this way, using microbenchmarks and real-world examples.
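
The two-phase pattern that LOOM optimizes can be sketched as a balanced fan-in tree over partial results; here the fan-in is fixed, whereas LOOM derives it from the aggregation function's characteristics:

```python
from collections import Counter
from functools import reduce

def tree_aggregate(partials, combine, fanin=2):
    # Repeatedly combine groups of `fanin` partial results until one
    # remains, mimicking a balanced aggregation overlay.
    level = list(partials)
    while len(level) > 1:
        level = [reduce(combine, level[i:i + fanin])
                 for i in range(0, len(level), fanin)]
    return level[0]

# Phase 1: per-partition word counts; Phase 2: consolidate.
partitions = [["a", "b"], ["b", "c"], ["a", "a"], ["c"]]
partials = [Counter(p) for p in partitions]
total = tree_aggregate(partials, lambda x, y: x + y)
assert total == Counter({"a": 3, "b": 2, "c": 2})
```

A wider fan-in means fewer tree levels but more work per node, which is exactly the kind of trade-off an overlay optimizer must weigh.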

Available Media

Cicada: Introducing Predictive Guarantees for Cloud Networks

Katrina LaCurts, MIT/CSAIL; Jeffrey C. Mogul, Google, Inc.; Hari Balakrishnan, MIT/CSAIL; Yoshio Turner, HP Labs

In cloud-computing systems, network-bandwidth guarantees have been shown to improve predictability of application performance and cost [1, 28]. Most previous work on cloud-bandwidth guarantees has assumed that cloud tenants know what bandwidth guarantees they want [1, 17]. However, as we show in this work, application bandwidth demands can be complex and time-varying, and many tenants might lack sufficient information to request a guarantee that is well-matched to their needs, which can lead to over-provisioning (and thus reduced cost-efficiency) or under-provisioning (and thus poor user experience).

We analyze traffic traces gathered over six months from an HP Cloud Services datacenter, finding that application bandwidth consumption is both time-varying and spatially inhomogeneous. This variability makes it hard to predict requirements. To solve this problem, we develop a prediction algorithm usable by a cloud provider to suggest an appropriate bandwidth guarantee to a tenant. With tenant VM placement using these predictive guarantees, we find that the inter-rack network utilization in certain datacenter topologies can be more than doubled.

Available Media

12:10 p.m.–1:30 p.m. Wednesday

FCW '14 Luncheon

Grand Ballroom ABC

1:30 p.m.–3:35 p.m. Wednesday

Diagnostics and Testing

Session Chair: John Arrasjid, VMware

A Novel Technique for Long-Term Anomaly Detection in the Cloud

Owen Vallis, Jordan Hochenbaum, and Arun Kejariwal, Twitter Inc.

High availability and performance of a web service are key, amongst other factors, to the overall user experience (which in turn directly impacts the bottom line). Exogenic and/or endogenic factors often give rise to anomalies that make maintaining high availability and delivering high performance very challenging. Although there exists a large body of prior research in anomaly detection, existing techniques are not suitable for detecting long-term anomalies owing to a predominant underlying trend component in the time series data.

To this end, we developed a novel statistical technique to automatically detect long-term anomalies in cloud data. Specifically, the technique employs statistical learning to detect anomalies in both application and system metrics. Further, the technique uses robust statistical metrics, viz., the median and the median absolute deviation (MAD), and piecewise approximation of the underlying long-term trend to accurately detect anomalies even in the presence of intra-day and/or weekly seasonality. We demonstrate the efficacy of the proposed technique using production data and report Precision, Recall, and F-measure. Multiple teams at Twitter are currently using the proposed technique on a daily basis.
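
The robust-statistics core of such a detector (median plus MAD, without the piecewise trend and seasonality handling the paper adds) can be sketched as:

```python
import statistics

def mad_anomalies(series, threshold=3.0):
    # Flag points whose distance from the median exceeds `threshold`
    # robust standard deviations, using the median absolute deviation
    # (MAD) as the robust scale. The 1.4826 factor makes MAD
    # consistent with the standard deviation under normality.
    med = statistics.median(series)
    mad = statistics.median([abs(x - med) for x in series])
    scale = 1.4826 * mad or 1e-9      # guard against zero MAD
    return [i for i, x in enumerate(series)
            if abs(x - med) / scale > threshold]

metrics = [10, 11, 10, 12, 11, 10, 95, 11, 10, 12]
assert mad_anomalies(metrics) == [6]  # only the spike is flagged
```

Because the median and MAD are barely moved by the outlier itself, the detector stays accurate even when the anomaly is extreme, which is precisely why the paper prefers them over mean and standard deviation.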

Available Media

PerfCompass: Toward Runtime Performance Anomaly Fault Localization for Infrastructure-as-a-Service Clouds

Daniel J. Dean, Hiep Nguyen, Peipei Wang, and Xiaohui Gu, North Carolina State University

Infrastructure-as-a-service (IaaS) clouds are becoming widely adopted. However, as multiple tenants share the same physical resources, performance anomalies have become one of the top concerns for users. Unfortunately, performance anomaly diagnosis in the production IaaS cloud often takes a long time due to its inherent complexity and sharing nature. In this paper, we present PerfCompass, a runtime performance anomaly fault localization tool using online system call trace analysis techniques. Specifically, PerfCompass tackles a challenging fault localization problem for IaaS clouds, that is, differentiating whether a production-run performance anomaly is caused by an external fault (e.g., interference from other co-located applications) or an internal fault (e.g., software bug). PerfCompass does not require any application source code or runtime instrumentation, which makes it practical for production IaaS clouds. We have tested PerfCompass using a set of popular software systems (e.g., Apache, MySQL, Squid, Cassandra, Hadoop) and a range of common cloud environment issues and real software bugs. The results show that PerfCompass accurately diagnoses all the faults while imposing low overhead during normal application execution time.

Available Media

MrLazy: Lazy Runtime Label Propagation for MapReduce

Sherif Akoush, Lucian Carata, Ripduman Sohan, and Andy Hopper, University of Cambridge

Organisations are starting to publish datasets containing potentially sensitive information in the Cloud; hence it is important that there is a clear audit trail showing that the parties involved are respecting data sharing laws and policies.

Information Flow Control (IFC) has been proposed as a solution. However, fine-grained IFC has various deployment challenges and runtime overhead issues that have limited wide adoption so far.

In this paper we present MrLazy, a system that practically addresses some of these issues for MapReduce. Within one trust domain, we relax the need to continuously check policies. We instead rely on lineage (information about the origin of a piece of data) as a mechanism to retrospectively apply policies on demand. We show that MrLazy imposes manageable temporal and spatial overheads while enabling fine-grained data regulation.

Available Media

Mechanisms and Architectures for Tail-Tolerant System Operations in Cloud

Qinghua Lu, China University of Petroleum and NICTA; Liming Zhu, Xiwei Xu, and Len Bass, NICTA; Shanshan Li, Weishan Zhang, and Ning Wang, China University of Petroleum

Paper Only/No Presentation

Conducting system operations (such as upgrade, reconfiguration, and deployment) for large-scale systems in the cloud is error-prone and complex. These operations rely heavily on unreliable cloud infrastructure APIs to complete. The inherent uncertainties and inevitable errors cause a long tail in the completion-time distribution of operations. In this paper, we propose mechanisms and deployment architecture tactics to tolerate the long tail. We wrapped cloud provisioning API calls and implemented deployment tactics at the architecture level for system operations. Our initial evaluation shows that the mechanisms and deployment tactics can effectively reduce the long tail.
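
The simplest such mechanism, retrying a wrapped, unreliable provisioning call, can be sketched as follows; backoff, per-attempt timeouts, and the deployment tactics themselves are omitted, and the names are hypothetical:

```python
import random

def with_retries(op, attempts=3):
    # Wrap an unreliable cloud-API call: retry on failure so a single
    # transient error does not push the whole operation into the
    # long tail of completion times.
    last = None
    for _ in range(attempts):
        try:
            return op()
        except Exception as exc:
            last = exc
    raise last

random.seed(1)        # make the simulated flakiness deterministic
calls = {"n": 0}

def flaky_provision():
    # Stand-in for an unreliable provisioning API call.
    calls["n"] += 1
    if random.random() < 0.5:
        raise RuntimeError("transient API error")
    return "provisioned"

assert with_retries(flaky_provision) == "provisioned"
assert calls["n"] == 2   # first attempt failed, second succeeded
```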

Available Media

The Case for System Testing with Swift Hierarchical VM Fork

Junji Zhi, Sahil Suneja, and Eyal de Lara, University of Toronto

System testing is an essential part of software development. Unfortunately, comprehensive testing of large systems is often resource-intensive and time-consuming. In this paper, we explore the possibility of leveraging hierarchical virtual machine (VM) fork to optimize system testing in the cloud. Testing using VM fork has the potential to save system configuration effort, obviate the need to run redundant common steps, and reduce disk and memory requirements by sharing resources across test cases. A preliminary experiment that uses VM fork to run a subset of the MySQL database test suite shows that the technique reduces the VM run time needed to complete all test cases by 60%.

Available Media

3:35 p.m.–4:05 p.m. Wednesday

Break with Refreshments

Columbus Foyer

4:05 p.m.–5:20 p.m. Wednesday

Economics

Session Chair: Michael Kozuch, Intel Labs

BitBill: Scalable, Robust, Verifiable Peer-to-Peer Billing for Cloud Computing

Li Chen and Kai Chen, The Hong Kong University of Science and Technology

Accounting and billing of cloud resources is vital for the operation of cloud service providers and their tenants. In this paper, we categorize the trust models of current industrial and academic cloud billing solutions, and discuss the problems with these models in terms of degree of trust, scalability, and robustness. Based on this analysis, we propose a novel public trust model to ensure natural and intuitive verification of billable events in the cloud. Leveraging a Bitcoin-like mechanism, we design BitBill, a scalable, robust, and mutually verifiable billing system for cloud computing. Our initial results show that BitBill has significantly better scalability (supporting 10x as many concurrent tenants using the billing service) than the state-of-the-art third-party centralized billing system.
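
A hash-chained event log is the simplest building block behind this kind of mutual verifiability; the sketch below is not BitBill's actual Bitcoin-like protocol (there is no distributed consensus here), and the function names are hypothetical:

```python
import hashlib
import json

def append_event(chain, event):
    # Append a billable event linked to the previous entry's hash,
    # so later tampering is detectable by re-walking the chain.
    prev = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"event": event, "prev": prev}, sort_keys=True)
    chain.append({"event": event, "prev": prev,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify(chain):
    # Recompute every hash from the recorded events and links.
    prev = "0" * 64
    for entry in chain:
        body = json.dumps({"event": entry["event"], "prev": prev},
                          sort_keys=True)
        if entry["prev"] != prev or \
           entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

chain = []
append_event(chain, {"tenant": "t1", "cpu_hours": 3})
append_event(chain, {"tenant": "t1", "cpu_hours": 5})
assert verify(chain)
chain[0]["event"]["cpu_hours"] = 300   # tamper with a recorded charge
assert not verify(chain)
```

In a Bitcoin-like design, many parties would independently hold and extend such a chain, which is what removes the need to trust a single centralized biller.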

Available Media

A Day Late and a Dollar Short: The Case for Research on Cloud Billing Systems

Robert Jellinek, Yan Zhai, Thomas Ristenpart, and Michael Swift, University of Wisconsin—Madison

Cloud computing platforms such as Amazon Web Services, Google Compute Engine, and Rackspace Public Cloud have been the subject of numerous measurement studies considering performance, reliability, and cost efficiency. However, little attention has been paid to billing. Cloud providers rely upon complex, large-scale billing systems that track customer resource usage at fine granularity and generate bills reflecting measured usage. However, it is not known how visible such usage is to customers, and how closely provider charges correspond to customers’ view of their resource usage.

We initiate a study of cloud billing systems, focusing on Amazon EC2, Google Compute Engine, and Rackspace, and uncover a variety of issues, including: inherent difficulties in predicting charges; bugs that lead to free CPU time on EC2 and over-charging for storage in Rackspace; and long and unpredictable billing-update latency. Our measurements motivate further study on billing systems, and so we conclude with a brief discussion of open questions for future work.

Available Media

A Case for Virtualizing the Electric Utility in Cloud Data Centers

Cheng Wang, Bhuvan Urgaonkar, George Kesidis, Uday V. Shanbhag, and Qian Wang, The Pennsylvania State University

Since energy-related costs make up an increasingly significant component of overall costs for data centers run by cloud providers, it is important that these costs be propagated to their tenants in ways that are fair and promote workload modulation that is aligned with overall cost-efficacy. We argue that there exists a big gap in how electric utilities charge data centers for their energy consumption (on the one hand) and the pricing interface exposed by cloud providers to their tenants (on the other). Whereas electric utilities employ complex features such as peak-based, time-varying, or tiered (load-dependent) pricing schemes, cloud providers charge tenants based on IT abstractions. This gap can create shortcomings such as unfairness in how tenants are charged and may also hinder overall cost-effective resource allocation. To overcome these shortcomings, we propose a novel idea of a virtual electric utility (VEU) that cloud providers should expose to individual tenants (in addition to their existing IT-based offerings). We discuss initial ideas underlying VEUs and challenges that must be addressed to turn them into a practical idea whose merits can be systematically explored.

Available Media

6:30 p.m.–8:00 p.m. Wednesday

Wednesday Reception

Grand Ballroom AB