7:30 am–9:00 am, Monday
Continental Breakfast
Ballroom Foyer
9:00 am–9:15 am, Monday
Program Co-Chairs: Austin Clements, Google, and Tyson Condie, University of California, Los Angeles
9:15 am–10:30 am, Monday
Session Chair: Fred Douglis, EMC
Prateek Sharma, David Irwin, and Prashant Shenoy, University of Massachusetts Amherst
Cloud providers have begun to allow users to bid for surplus servers on a spot market. These servers are allocated if a user’s bid price is higher than the market price and revoked otherwise. Thus, analyzing price data to derive optimal bidding strategies has become a popular research topic. In this paper, we argue that sophisticated bidding strategies, in practice, provide no advantage over simple strategies, for multiple reasons. First, due to price characteristics, there is a wide range of bid prices that yield the optimal cost and availability. Second, given the large number of spot markets, there is always a market with available surplus resources. Thus, if resources become unavailable due to a price spike, users need not wait until the spike subsides, but can instead provision a new spot resource elsewhere and migrate to it. Third, current spot market rules enable users to place maximum bids for resources without any penalty. Given bidding’s irrelevance, users can adopt trivial bidding strategies and focus instead on modifying applications to efficiently seek out and migrate to the lowest-cost resources.
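The "trivial" maximum-bid strategy the abstract argues for can be sketched in a few lines. This is our own illustration, not the paper's code: the price trace is made up, and it assumes (as in typical spot-market rules) that an allocated user is charged the market price, not their bid.

```python
# Hypothetical sketch of the trivial maximum-bid strategy. Prices and the
# MAX_BID cap are illustrative assumptions, not real market data.

MAX_BID = 10.0  # provider-imposed bid cap (assumed)

def charge(prices, bid):
    """Return (hours_allocated, total_cost) for a bid over an hourly
    price trace. The server is allocated whenever bid >= market price,
    and the user is charged the market price, not the bid."""
    hours, cost = 0, 0.0
    for p in prices:
        if bid >= p:
            hours += 1
            cost += p
    return hours, cost

hourly_prices = [0.03, 0.03, 0.04, 0.90, 0.03]  # one brief price spike
print(charge(hourly_prices, MAX_BID))  # max bid rides out the spike
print(charge(hourly_prices, 0.05))     # modest bid is revoked during it
```

Because the user pays the market price either way, the maximum bid costs the same as a "clever" bid in calm periods and only differs by whether the server survives spikes, which is the paper's point.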
Murad Kablan and Eric Keller, University of Colorado Boulder; Hani Jamjoom, IBM Research
Cloud services today are increasingly built using functionality from other running services. In this paper, we question whether legacy Quality of Service (QoS) metrics and enforcement techniques are sufficient, as they are producer-centric. We argue that, similar to customer rating systems found in banking and many sharing-economy apps (e.g., Uber and Airbnb), Quality of Consumption (QoC) should be introduced to capture different metrics about service consumers. We show how the combination of QoS and QoC, dubbed QoX, can be used by consumers and providers to improve the security and management of their infrastructure. In addition, we demonstrate how sharing information among other consumers and providers increases the value of QoX. To address the main challenge with sharing information, namely Sybil attacks and misinformation, we describe how we can leverage cloud providers as vouching authorities to ensure the integrity of information. We explore the motivations, challenges, and potential of introducing such a framework in the cloud environment.
Supreeth Subramanya, Amr Rizk, and David Irwin, University of Massachusetts Amherst
Computational spot markets enable users to bid on servers, which are then continuously allocated to the highest bidder: if a user is “outbid” for a server, the market revokes it and re-allocates it to the new highest bidder. Spot markets are common when trading commodities to balance real-time supply and demand—cloud platforms use them to sell their idle capacity, which varies over time. However, server-time differs from other commodities in that it is “stateful”: losing a spot server incurs an overhead that decreases the useful work it performs. Thus, variations in the spot price actually affect the inherent value of server-time bought in the spot market. As the spot market matures, we argue that price volatility will significantly decrease the value of spot servers. Thus, somewhat counter-intuitively, spot markets may not maximize the value of idle server capacity. To address this problem, we propose a more sustainable alternative that offers a variable amount of idle capacity to users for a fixed price, but with transient guarantees.
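The "stateful server-time" argument can be made concrete with a back-of-the-envelope model. This is our own illustration, not the paper's model: it assumes each revocation costs a fixed overhead (checkpoint, restart, lost work), so a more volatile market yields less useful work per server-hour bought.

```python
# Illustrative model (our assumption): fixed overhead per revocation.

def useful_fraction(hours_bought, revocations, overhead_hours):
    """Fraction of purchased server-time that does useful work after
    subtracting revocation overhead; floors at zero."""
    lost = revocations * overhead_hours
    return max(0.0, (hours_bought - lost) / hours_bought)

# Same 100 hours purchased, stable vs. volatile market:
print(useful_fraction(100, 1, 2.0))    # stable: few revocations
print(useful_fraction(100, 20, 2.0))   # volatile: value drops sharply
```

Under these assumptions, twenty revocations with a two-hour overhead wipe out 40% of the purchased server-time, which is the sense in which volatility erodes the value of spot capacity.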
10:30 am–11:15 am, Monday
Break with Refreshments
Ballroom Foyer
11:15 am–12:30 pm, Monday
Session Chair: Irene Zhang, University of Washington
Muhammad Ali Gulzar, Xueyuan Han, Matteo Interlandi, and Shaghayegh Mardani, University of California, Los Angeles; Sai Deep Tetali, Google, Inc.; Todd Millstein and Miryung Kim, University of California, Los Angeles
An abundance of data in many disciplines has accelerated the adoption of distributed technologies such as Hadoop and Spark, which provide simple programming semantics and an active ecosystem. However, the current cloud computing model lacks the kinds of expressive and interactive debugging features found in traditional desktop computing. We seek to address these challenges with the development of BIGDEBUG, a framework providing interactive debugging primitives and tool-assisted fault localization services for big data analytics. We showcase the data provenance and optimized incremental computation features to effectively and efficiently support interactive debugging, and investigate new research directions on how to automatically pinpoint and repair the root cause of errors in large-scale distributed data processing.
Deniz Altınbüken and Robbert van Renesse, Cornell University
We present Ovid, a framework for building evolvable large-scale distributed systems that run in the cloud. Ovid constructs and deploys distributed systems as a collection of simple components, creating systems suited for containerization in the cloud. Ovid supports evolution of systems through transformations, which are automated refinements. Examples of transformations include replication, batching, sharding, and encryption. Ovid transformations guarantee that an evolving system still implements the same specification. Moreover, systems built with transformations can be combined with other systems to implement more complex infrastructure services. The result of this framework is a software-defined distributed system, in which a logically centralized controller specifies the components, their interactions, and their transformations.
Scott Hendrickson, Stephen Sturdevant, and Tyler Harter, University of Wisconsin—Madison; Venkateshwaran Venkataramani; Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau, University of Wisconsin—Madison
We present OpenLambda, a new, open-source platform for building next-generation web services and applications in the burgeoning model of serverless computation. We describe the key aspects of serverless computation and present numerous research challenges that must be addressed in the design and implementation of such systems. We also include a brief study of current web applications, so as to better motivate some aspects of serverless application construction.
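The serverless model the abstract refers to can be sketched minimally: a handler is a stateless function invoked per request, with all state arriving in the event or living in external storage. The handler signature and dispatch shim below are our own illustration, not OpenLambda's actual API.

```python
# Minimal sketch of the serverless programming model (names assumed,
# not OpenLambda's real interface).

import json

def handler(event):
    """A stateless function: instances may be created and torn down
    between invocations, so nothing persists in local variables."""
    name = event.get("name", "world")
    return {"status": 200, "body": f"hello, {name}"}

def invoke(raw_request):
    """Toy dispatch shim standing in for the platform's front end."""
    return json.dumps(handler(json.loads(raw_request)))

print(invoke('{"name": "HotCloud"}'))
```

The statelessness constraint is what lets a platform scale handlers elastically and bill per invocation, which is the research setting the paper motivates.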
12:30 pm–2:00 pm, Monday
Luncheon for Workshop Attendees
Colorado Ballroom E
2:00 pm–3:15 pm, Monday
Session Chair: Angela Demke Brown, University of Toronto
Alexey Khrabrov and Eyal de Lara, University of Toronto
The ability to move data quickly between the nodes of a distributed system is important for the performance of cluster computing frameworks such as Hadoop and Spark. We show that in a cluster with modern networking technology, data serialization is the main bottleneck and source of overhead in the transfer of rich data in systems based on high-level programming languages such as Java. We propose a new data transfer mechanism that avoids serialization altogether by using a shared cluster-wide address space to store data. We describe the design and a prototype implementation of this approach, show that our mechanism is significantly faster than serialized data transfer, and propose a number of possible applications for it.
Dingming Wu, Xiaoye Sun, Yiting Xia, Xin Huang, and T. S. Eugene Ng, Rice University
Multicast has long been a performance bottleneck for data centers. Traditional solutions relying on IP multicast suffer from poor congestion control and loss recovery on the data plane, as well as slow and complex group membership and multicast tree management on the control plane. Some recent proposals have employed alternate optical circuit-switched paths to enable lossless multicast and a centralized control architecture to quickly configure multicast trees. However, the high circuit reconfiguration delay of optical switches has substantially limited multicast performance.
In this paper, we propose to eliminate this reconfiguration delay with an unconventional optical multicast architecture called HyperOptics that directly interconnects top-of-rack (ToR) switches by low-cost optical splitters, thereby eliminating the need for optical switches. The ToRs are organized to form the connectivity of a regular graph. We analytically show that this architecture is scalable and efficient for multicast. Preliminary simulations show that running multicasts on HyperOptics can on average be 2.1x faster than on an optical circuit-switched network.
Akshay Jajoo, Rohan Gandhi, and Y. Charlie Hu, Purdue University
In this paper, we make a key observation: by using multiple priority queues and weighted fair sharing at each port, Aalo does a good job of approximating shortest-job-first (SJF) scheduling, but only at queue granularity, because using FIFO to schedule CoFlows within each queue is rather simplistic and bears no resemblance to SJF.
Instead, we discuss three insights into Aalo’s scheduler, where exploiting the spatial dimension of the problem domain, i.e., the width (number of ports) of the CoFlows, can lead to better scheduling policies within each priority queue, improving the overall CoFlow completion time (CCT).
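The queue-internal change discussed above can be sketched simply: within a priority queue, order CoFlows by a spatial metric such as width instead of arrival order. The CoFlow records and the exact ordering rule below are our own illustration, not the paper's algorithm.

```python
# Sketch: FIFO vs. width-aware ordering within one priority queue.
# The width-first rule is an illustrative stand-in for the paper's
# spatial-dimension insights.

from collections import namedtuple

Coflow = namedtuple("Coflow", ["name", "width", "arrival"])

def fifo_order(queue):
    """Aalo's in-queue policy: schedule by arrival time."""
    return sorted(queue, key=lambda c: c.arrival)

def width_order(queue):
    """Narrower CoFlows first: they occupy fewer ports, so finishing
    them early can reduce average CoFlow completion time (CCT)."""
    return sorted(queue, key=lambda c: (c.width, c.arrival))

queue = [Coflow("A", width=40, arrival=0),
         Coflow("B", width=2,  arrival=1),
         Coflow("C", width=8,  arrival=2)]

print([c.name for c in fifo_order(queue)])   # arrival order
print([c.name for c in width_order(queue)])  # narrow CoFlows first
```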
3:15 pm–4:00 pm, Monday
Break with Refreshments
Ballroom Foyer
4:00 pm–5:15 pm, Monday
Session Chair: Tyson Condie, University of California, Los Angeles
I. Stephen Choi, Byoung Young Ahn, and Yang-Suk Kee, Samsung Memory Solutions Lab
We present Mlcached, a multi-level DRAM-NAND key-value cache designed to enable independent resource provisioning of DRAM and NAND flash memory by completely decoupling the caching layers. Mlcached utilizes DRAM for the L1 cache and our new KV-cache device for the L2 cache. The index-integrated FTL is implemented in the KV-cache device to eliminate the in-memory indexes that prohibit independent resource provisioning. We show that Mlcached is only 12.8% slower than a DRAM-only Web caching service in average RTT with an 80% L1 cache hit rate, while saving two-thirds of its TCO. Moreover, our model-based study shows that Mlcached can provide up to 6X lower cost or 4X lower latency at the same SLA or TCO, respectively.
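The lookup path of a two-level cache like the one described can be sketched as follows. This is a toy simplification of our own: the dict-backed "device" and the promote-on-hit policy stand in for the real DRAM tier and the paper's index-integrated FTL, which they do not reproduce.

```python
# Toy two-level (L1 DRAM / L2 flash-device) cache lookup sketch.
# Capacities, eviction, and promotion policy are illustrative assumptions.

class MultiLevelCache:
    def __init__(self, l1_capacity):
        self.l1 = {}                  # fast DRAM tier (bounded)
        self.l1_capacity = l1_capacity
        self.l2 = {}                  # stands in for the KV-cache device

    def get(self, key):
        if key in self.l1:            # L1 hit: DRAM-speed path
            return self.l1[key]
        value = self.l2.get(key)      # L1 miss: consult the flash tier
        if value is not None:
            self._promote(key, value)
        return value

    def put(self, key, value):
        self.l2[key] = value          # write to L2; L1 fills on access

    def _promote(self, key, value):
        if len(self.l1) >= self.l1_capacity:
            self.l1.pop(next(iter(self.l1)))  # evict oldest insertion
        self.l1[key] = value

cache = MultiLevelCache(l1_capacity=2)
cache.put("a", 1)
print(cache.get("a"))  # first access served from L2, then promoted
print(cache.get("a"))  # now an L1 hit
```

Keeping the L2 index inside the device (rather than in a structure like `self.l2` held in DRAM) is what lets the two tiers be provisioned independently, which is the design point the abstract highlights.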
Yu-Ting Chen, Jason Cong, Zhenman Fang, Jie Lei, and Peng Wei, University of California, Los Angeles
FPGA-enabled datacenters have shown great potential for improving performance and energy efficiency. In this paper we aim to answer one key question: how can we efficiently integrate FPGAs into state-of-the-art big-data computing frameworks like Apache Spark? To provide a generalized methodology and insights for efficient integration, we conduct an in-depth analysis of challenges at the single-thread, single-node multi-thread, and multi-node levels, and propose solutions including batch processing and the FPGA-as-a-Service framework to address them. With a step-by-step case study of a next-generation DNA sequencing application, we demonstrate how a straightforward integration with 1,000x slowdown can be tuned into an efficient integration with 2.6x overall system speedup and 2.4x energy efficiency improvement.
Dan Williams and Ricardo Koller, IBM T. J. Watson Research Center
Recently, unikernels have emerged as an exploration of minimalist software stacks to improve the security of applications in the cloud. In this paper, we propose extending the notion of minimalism beyond an individual virtual machine to include the underlying monitor and the interface it exposes. We propose unikernel monitors: each unikernel is bundled with a tiny, specialized monitor that contains only what the unikernel needs, both in terms of interface and implementation. Unikernel monitors improve isolation through minimal interfaces, reduce complexity, and boot unikernels quickly. Our initial prototype, ukvm, is less than 5% the code size of a traditional monitor, and boots MirageOS unikernels in as little as 10ms (8× faster than a traditional monitor).
6:00 pm–7:00 pm, Monday
Joint Poster Session and Happy Hour with HotStorage
Colorado Ballroom A–E