7:30 am–9:00 am, Monday
Continental Breakfast
Ballroom Foyer
9:00 am–9:15 am, Monday
Program Co-Chairs: Austin Clements, Google, and Tyson Condie, University of California, Los Angeles
9:15 am–10:30 am, Monday
Session Chair: Fred Douglis, EMC
Prateek Sharma, David Irwin, and Prashant Shenoy, University of Massachusetts Amherst
Cloud providers have begun to allow users to bid for surplus servers on a spot market. These servers are allocated if a user’s bid price is higher than the market price and revoked otherwise. Thus, analyzing price data to derive optimal bidding strategies has become a popular research topic. In this paper, we argue that sophisticated bidding strategies, in practice, provide no advantage over simple strategies, for multiple reasons. First, due to price characteristics, there is a wide range of bid prices that yield the optimal cost and availability. Second, given the large number of spot markets, there is always a market with available surplus resources. Thus, if resources become unavailable due to a price spike, users need not wait until the spike subsides, but can instead provision a new spot resource elsewhere and migrate to it. Third, current spot market rules enable users to place maximum bids for resources without any penalty. Given bidding’s irrelevance, users can adopt trivial bidding strategies and focus instead on modifying applications to efficiently seek out and migrate to the lowest-cost resources.
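The "trivial" maximum-bid strategy the abstract argues for can be sketched in a few lines. This is our own illustration, not the paper's code: the price trace is made up, and it assumes (as in typical spot-market rules) that an allocated user is charged the market price, not their bid.

```python
# Hypothetical sketch of the trivial maximum-bid strategy. Prices and the
# MAX_BID cap are illustrative assumptions, not real market data.

MAX_BID = 10.0  # provider-imposed bid cap (assumed)

def charge(prices, bid):
    """Return (hours_allocated, total_cost) for a bid over an hourly
    price trace. The server is allocated whenever bid >= market price,
    and the user is charged the market price, not the bid."""
    hours, cost = 0, 0.0
    for p in prices:
        if bid >= p:
            hours += 1
            cost += p
    return hours, cost

hourly_prices = [0.03, 0.03, 0.04, 0.90, 0.03]  # one brief price spike
print(charge(hourly_prices, MAX_BID))  # max bid rides out the spike
print(charge(hourly_prices, 0.05))     # modest bid is revoked during it
```

Because the user pays the market price either way, the maximum bid costs the same as a "clever" bid in calm periods and only differs by whether the server survives spikes, which is the paper's point.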
Murad Kablan and Eric Keller, University of Colorado Boulder; Hani Jamjoom, IBM Research
Cloud services today are increasingly built using functionality from other running services. In this paper, we question whether legacy Quality of Service (QoS) metrics and enforcement techniques are sufficient, as they are producer-centric. We argue that, similar to customer rating systems found in banking and many sharing-economy apps (e.g., Uber and Airbnb), Quality of Consumption (QoC) should be introduced to capture different metrics about service consumers. We show how the combination of QoS and QoC, dubbed QoX, can be used by consumers and providers to improve the security and management of their infrastructure. In addition, we demonstrate how sharing information among other consumers and providers increases the value of QoX. To address the main challenge with sharing information, namely Sybil attacks and misinformation, we describe how we can leverage cloud providers as vouching authorities to ensure the integrity of information. We explore the motivations, challenges, and potential of introducing such a framework in the cloud environment.
Supreeth Subramanya, Amr Rizk, and David Irwin, University of Massachusetts Amherst
Computational spot markets enable users to bid on servers, which are then continuously allocated to the highest bidder: if a user is “outbid” for a server, the market revokes it and re-allocates it to the new highest bidder. Spot markets are common when trading commodities to balance real-time supply and demand—cloud platforms use them to sell their idle capacity, which varies over time. However, server-time differs from other commodities in that it is “stateful”: losing a spot server incurs an overhead that decreases the useful work it performs. Thus, variations in the spot price actually affect the inherent value of server-time bought in the spot market. As the spot market matures, we argue that price volatility will significantly decrease the value of spot servers. Thus, somewhat counter-intuitively, spot markets may not maximize the value of idle server capacity. To address this problem, we propose a more sustainable alternative that offers a variable amount of idle capacity to users for a fixed price, but with transient guarantees.
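The "stateful server-time" argument can be made concrete with a back-of-the-envelope model. This is our own illustration, not the paper's model: it assumes each revocation costs a fixed overhead (checkpoint, restart, lost work), so a more volatile market yields less useful work per server-hour bought.

```python
# Illustrative model (our assumption): fixed overhead per revocation.

def useful_fraction(hours_bought, revocations, overhead_hours):
    """Fraction of purchased server-time that does useful work after
    subtracting revocation overhead; floors at zero."""
    lost = revocations * overhead_hours
    return max(0.0, (hours_bought - lost) / hours_bought)

# Same 100 hours purchased, stable vs. volatile market:
print(useful_fraction(100, 1, 2.0))    # stable: few revocations
print(useful_fraction(100, 20, 2.0))   # volatile: value drops sharply
```

Under these assumptions, twenty revocations with a two-hour overhead wipe out 40% of the purchased server-time, which is the sense in which volatility erodes the value of spot capacity.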
10:30 am–11:15 am, Monday
Break with Refreshments
Ballroom Foyer
11:15 am–12:30 pm, Monday
Session Chair: Irene Zhang, University of Washington
Muhammad Ali Gulzar, Xueyuan Han, Matteo Interlandi, and Shaghayegh Mardani, University of California, Los Angeles; Sai Deep Tetali, Google, Inc.; Todd Millstein and Miryung Kim, University of California, Los Angeles
An abundance of data in many disciplines has accelerated the adoption of distributed technologies such as Hadoop and Spark, which provide simple programming semantics and an active ecosystem. However, the current cloud computing model lacks the kinds of expressive and interactive debugging features found in traditional desktop computing. We seek to address these challenges with the development of BIGDEBUG, a framework providing interactive debugging primitives and tool-assisted fault localization services for big data analytics. We showcase the data provenance and optimized incremental computation features to effectively and efficiently support interactive debugging, and investigate new research directions on how to automatically pinpoint and repair the root cause of errors in large-scale distributed data processing.
Deniz Altınbüken and Robbert van Renesse, Cornell University
We present Ovid, a framework for building evolvable large-scale distributed systems that run in the cloud. Ovid constructs and deploys distributed systems as a collection of simple components, creating systems suited for containerization in the cloud. Ovid supports evolution of systems through transformations, which are automated refinements. Examples of transformations include replication, batching, sharding, and encryption. Ovid transformations guarantee that an evolving system still implements the same specification. Moreover, systems built with transformations can be combined with other systems to implement more complex infrastructure services. The result of this framework is a software-defined distributed system, in which a logically centralized controller specifies the components, their interactions, and their transformations.
Scott Hendrickson, Stephen Sturdevant, and Tyler Harter, University of Wisconsin—Madison; Venkateshwaran Venkataramani; Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau, University of Wisconsin—Madison
We present OpenLambda, a new, open-source platform for building next-generation web services and applications in the burgeoning model of serverless computation. We describe the key aspects of serverless computation and present numerous research challenges that must be addressed in the design and implementation of such systems. We also include a brief study of current web applications, so as to better motivate some aspects of serverless application construction.
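The serverless model the abstract refers to can be sketched minimally: a handler is a stateless function invoked per request, with all state arriving in the event or living in external storage. The handler signature and dispatch shim below are our own illustration, not OpenLambda's actual API.

```python
# Minimal sketch of the serverless programming model (names assumed,
# not OpenLambda's real interface).

import json

def handler(event):
    """A stateless function: instances may be created and torn down
    between invocations, so nothing persists in local variables."""
    name = event.get("name", "world")
    return {"status": 200, "body": f"hello, {name}"}

def invoke(raw_request):
    """Toy dispatch shim standing in for the platform's front end."""
    return json.dumps(handler(json.loads(raw_request)))

print(invoke('{"name": "HotCloud"}'))
```

The statelessness constraint is what lets a platform scale handlers elastically and bill per invocation, which is the research setting the paper motivates.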
12:30 pm–2:00 pm, Monday
Luncheon for Workshop Attendees
Colorado Ballroom E
2:00 pm–3:15 pm, Monday
Session Chair: Angela Demke Brown, University of Toronto
Alexey Khrabrov and Eyal de Lara, University of Toronto
The ability to move data quickly between the nodes of a distributed system is important for the performance of cluster computing frameworks such as Hadoop and Spark. We show that in a cluster with modern networking technology, data serialization is the main bottleneck and source of overhead in the transfer of rich data in systems based on high-level programming languages such as Java. We propose a new data transfer mechanism that avoids serialization altogether by using a shared cluster-wide address space to store data. We describe the design and a prototype implementation of this approach, show that our mechanism is significantly faster than serialized data transfer, and propose a number of possible applications for it.
Dingming Wu, Xiaoye Sun, Yiting Xia, Xin Huang, and T. S. Eugene Ng, Rice University
Multicast has long been a performance bottleneck for data centers. Traditional solutions relying on IP multicast suffer from poor congestion control and loss recovery on the data plane, as well as slow and complex group membership and multicast tree management on the control plane. Some recent proposals have employed alternate optical circuit-switched paths to enable lossless multicast and a centralized control architecture to quickly configure multicast trees. However, the high circuit reconfiguration delay of optical switches has substantially limited multicast performance.
In this paper, we propose to eliminate this reconfiguration delay with an unconventional optical multicast architecture called HyperOptics that directly interconnects top-of-rack (ToR) switches by low-cost optical splitters, thereby eliminating the need for optical switches. The ToRs are organized to form the connectivity of a regular graph. We analytically show that this architecture is scalable and efficient for multicast. Preliminary simulations show that running multicasts on HyperOptics can on average be 2.1x faster than on an optical circuit-switched network.
Akshay Jajoo, Rohan Gandhi, and Y. Charlie Hu, Purdue University
In this paper, we make a key observation: by using multiple priority queues and weighted fair sharing at each port, Aalo does a good job of approximating shortest-job-first (SJF) scheduling, but only at queue granularity, because using FIFO to schedule CoFlows within each queue is rather simplistic and bears no resemblance to SJF.
Instead, we discuss three insights into Aalo’s scheduler, where exploiting the spatial dimension of the problem domain, i.e., the width (number of ports) of the CoFlows, can lead to better scheduling policies within each priority queue, improving the overall CoFlow completion time (CCT).
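The queue-internal change discussed above can be sketched simply: within a priority queue, order CoFlows by a spatial metric such as width instead of arrival order. The CoFlow records and the exact ordering rule below are our own illustration, not the paper's algorithm.

```python
# Sketch: FIFO vs. width-aware ordering within one priority queue.
# The width-first rule is an illustrative stand-in for the paper's
# spatial-dimension insights.

from collections import namedtuple

Coflow = namedtuple("Coflow", ["name", "width", "arrival"])

def fifo_order(queue):
    """Aalo's in-queue policy: schedule by arrival time."""
    return sorted(queue, key=lambda c: c.arrival)

def width_order(queue):
    """Narrower CoFlows first: they occupy fewer ports, so finishing
    them early can reduce average CoFlow completion time (CCT)."""
    return sorted(queue, key=lambda c: (c.width, c.arrival))

queue = [Coflow("A", width=40, arrival=0),
         Coflow("B", width=2,  arrival=1),
         Coflow("C", width=8,  arrival=2)]

print([c.name for c in fifo_order(queue)])   # arrival order
print([c.name for c in width_order(queue)])  # narrow CoFlows first
```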
3:15 pm–4:00 pm, Monday
Break with Refreshments
Ballroom Foyer
4:00 pm–5:15 pm, Monday
Session Chair: Tyson Condie, University of California, Los Angeles
I. Stephen Choi, Byoung Young Ahn, and Yang-Suk Kee, Samsung Memory Solutions Lab
We present Mlcached, a multi-level DRAM-NAND key-value cache designed to enable independent resource provisioning of DRAM and NAND flash memory by completely decoupling the caching layers. Mlcached utilizes DRAM for the L1 cache and our new KV-cache device for the L2 cache. The index-integrated FTL is implemented in the KV-cache device to eliminate the in-memory indexes that prohibit independent resource provisioning. We show that Mlcached is only 12.8% slower than a DRAM-only Web caching service in average RTT with an 80% L1 cache hit rate, while saving two-thirds of its TCO. Moreover, our model-based study shows that Mlcached can provide up to 6X lower cost or 4X lower latency at the same SLA or TCO, respectively.
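The lookup path of a two-level cache like the one described can be sketched as follows. This is a toy simplification of our own: the dict-backed "device" and the promote-on-hit policy stand in for the real DRAM tier and the paper's index-integrated FTL, which they do not reproduce.

```python
# Toy two-level (L1 DRAM / L2 flash-device) cache lookup sketch.
# Capacities, eviction, and promotion policy are illustrative assumptions.

class MultiLevelCache:
    def __init__(self, l1_capacity):
        self.l1 = {}                  # fast DRAM tier (bounded)
        self.l1_capacity = l1_capacity
        self.l2 = {}                  # stands in for the KV-cache device

    def get(self, key):
        if key in self.l1:            # L1 hit: DRAM-speed path
            return self.l1[key]
        value = self.l2.get(key)      # L1 miss: consult the flash tier
        if value is not None:
            self._promote(key, value)
        return value

    def put(self, key, value):
        self.l2[key] = value          # write to L2; L1 fills on access

    def _promote(self, key, value):
        if len(self.l1) >= self.l1_capacity:
            self.l1.pop(next(iter(self.l1)))  # evict oldest insertion
        self.l1[key] = value

cache = MultiLevelCache(l1_capacity=2)
cache.put("a", 1)
print(cache.get("a"))  # first access served from L2, then promoted
print(cache.get("a"))  # now an L1 hit
```

Keeping the L2 index inside the device (rather than in a structure like `self.l2` held in DRAM) is what lets the two tiers be provisioned independently, which is the design point the abstract highlights.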
Yu-Ting Chen, Jason Cong, Zhenman Fang, Jie Lei, and Peng Wei, University of California, Los Angeles
FPGA-enabled datacenters have shown great potential for improving performance and energy efficiency. In this paper we aim to answer one key question: how can we efficiently integrate FPGAs into state-of-the-art big-data computing frameworks like Apache Spark? To provide a generalized methodology and insights for efficient integration, we conduct an in-depth analysis of challenges at the single-thread, single-node multi-thread, and multi-node levels, and propose solutions including batch processing and the FPGA-as-a-Service framework to address them. With a step-by-step case study of a next-generation DNA sequencing application, we demonstrate how a straightforward integration with 1,000x slowdown can be tuned into an efficient integration with 2.6x overall system speedup and 2.4x energy efficiency improvement.
Dan Williams and Ricardo Koller, IBM T. J. Watson Research Center
Recently, unikernels have emerged as an exploration of minimalist software stacks to improve the security of applications in the cloud. In this paper, we propose extending the notion of minimalism beyond an individual virtual machine to include the underlying monitor and the interface it exposes. We propose unikernel monitors: each unikernel is bundled with a tiny, specialized monitor that contains only what the unikernel needs, both in terms of interface and implementation. Unikernel monitors improve isolation through minimal interfaces, reduce complexity, and boot unikernels quickly. Our initial prototype, ukvm, is less than 5% the code size of a traditional monitor, and boots MirageOS unikernels in as little as 10ms (8× faster than a traditional monitor).
6:00 pm–7:00 pm, Monday
Joint Poster Session and Happy Hour with HotStorage
Colorado Ballroom A–E