8:15 a.m.–8:45 a.m. | Monday
Continental Breakfast
Tamaya Prefunction North
8:45 a.m.–9:00 a.m. | Monday
Program Chair: Petros Maniatis, Intel Labs
9:00 a.m.–10:00 a.m. | Monday
Session Chair: Andrew Warfield, University of British Columbia
Pravin Shinde, Antoine Kaufmann, Timothy Roscoe, and Stefan Kaestle, Systems Group, ETH Zurich
Operating systems fail both to efficiently exploit, and to effectively manage, the considerable hardware resources of modern network interface controllers. We survey the kinds of hardware facilities available and their applicability, and then investigate (and critique) the reasons why OS designers eschew core support for such features. We then describe Dragonet, a new network stack design based on explicit descriptions of NIC capabilities, aimed at making the best use of today’s and tomorrow’s networking hardware. Dragonet represents both the physical capabilities of the network hardware and the current protocol state of the machine as dataflow graphs. We then embed the former into the latter, instantiating the remainder in software.
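To make the dataflow-graph idea concrete, here is a minimal Python sketch of embedding a protocol graph into a NIC-capability graph. The node names, the linear graph shape, and the first-mismatch fallback rule are all invented for illustration and are not Dragonet's actual representation.

```python
# Hypothetical illustration of the "embed the protocol graph into the
# hardware graph" idea; names and the matching rule are invented.

# Protocol-state graph: processing steps the host must perform per packet.
protocol_graph = ["rx-queue", "ip-checksum", "udp-demux", "app-deliver"]

# Hardware-capability graph: functions this (imaginary) NIC can offload.
nic_capabilities = {"rx-queue", "ip-checksum", "udp-demux"}

def embed(protocol, capabilities):
    """Assign each protocol node to hardware if the NIC offers it,
    otherwise instantiate it in the software stack."""
    placement = {}
    in_hardware = True
    for node in protocol:
        # Once a node falls back to software, downstream nodes stay in
        # software too, since packets have already left the NIC pipeline.
        if in_hardware and node in capabilities:
            placement[node] = "NIC"
        else:
            placement[node] = "software"
            in_hardware = False
    return placement

if __name__ == "__main__":
    for node, where in embed(protocol_graph, nic_capabilities).items():
        print(f"{node:12s} -> {where}")
```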
Jeffrey C. Mogul, Jayaram Mudigonda, Jose Renato Santos, and Yoshio Turner, HP Labs
Cloud computing does not inherently require the use of virtual machines, and some cloud customers prefer or even require “bare metal” systems, where no hypervisor separates the guest operating system from the CPU. Even for bare-metal nodes, the cloud provider must find a means to isolate the guest system from other cloud resources, and to manage the instantiation and removal of guests. We argue that an enhanced NIC, together with standard features of modern servers, can provide all of the functions for which a hypervisor would normally be required.
William Jannen, Chia-che Tsai, and Donald E. Porter, Stony Brook University
When one uses virtual machines for application compatibility, such as running Windows programs on Linux, the user only wants the API components, yet must emulate a disk drive and execute a second, counterproductive level of media heuristics and I/O scheduling. Systems should have a clean interface between API implementation and media optimization, which would lead to more efficient paravirtualization and facilitate rapid, independent evolution of media optimizations and API features. We describe a design that meets these goals, called Zoochory.
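As a rough illustration of the interface split the abstract argues for, here is a hypothetical Python sketch in which a guest-side API layer delegates all media decisions to a pluggable backend; the class names and methods are invented and are not the Zoochory design.

```python
# Hypothetical sketch of separating API implementation from media
# optimization; interfaces and names are invented, not taken from Zoochory.
from abc import ABC, abstractmethod

class MediaBackend(ABC):
    """Host-side layer: owns layout, scheduling, and device heuristics."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryBackend(MediaBackend):
    """Stand-in backend; a real one would optimize for the actual medium."""
    def __init__(self):
        self._objects = {}
    def put(self, key, data):
        self._objects[key] = data
    def get(self, key):
        return self._objects[key]

class GuestFileAPI:
    """Guest-side layer: implements only file-API semantics and delegates
    all media decisions to whatever backend it was given."""
    def __init__(self, backend: MediaBackend):
        self._backend = backend
    def write_file(self, path: str, contents: bytes) -> None:
        self._backend.put(path, contents)
    def read_file(self, path: str) -> bytes:
        return self._backend.get(path)

if __name__ == "__main__":
    fs = GuestFileAPI(InMemoryBackend())
    fs.write_file("/tmp/hello.txt", b"hello")
    print(fs.read_file("/tmp/hello.txt"))
```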
Animesh Trivedi, Patrick Stuedi, Bernard Metzler, and Roman Pletka, IBM Research Zurich; Blake G. Fitch, IBM Research; Thomas R. Gross, ETH Zurich
Fast non-volatile memories are exposing inefficiencies in traditional I/O stacks. Though there have been fragmented efforts to deal with the issues, there is a pressing need for a high-performance storage stack. Interestingly, 20 years ago networks faced similar challenges, which led to the development of concepts and implementations of multiple high-performance network stacks. In this paper we draw parallels to illustrate synergies between high-performance storage requirements and concepts from the networking space. We identify common high-performance I/O properties and recent efforts in storage to achieve those properties. Instead of reinventing the performance wheel, we make a case for using mature high-performance networking abstractions and frameworks to meet storage demands, and discuss the opportunities and challenges that arise with this unification.
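One way to picture borrowing networking abstractions for storage is a queue-pair and completion-queue style interface, sketched below in Python; the WorkRequest, Completion, and StorageQueuePair names and behavior are invented for this illustration and are not from the paper.

```python
# Toy illustration of a network-style (queue-pair / completion-queue)
# interface applied to storage; all names are invented for this sketch.
from collections import deque
from dataclasses import dataclass

@dataclass
class WorkRequest:
    op: str            # "read" or "write"
    offset: int
    data: bytes = b""

@dataclass
class Completion:
    op: str
    offset: int
    data: bytes

class StorageQueuePair:
    """Applications post requests asynchronously and poll for completions,
    mirroring how high-performance NICs are driven from user space."""
    def __init__(self, size: int):
        self._device = bytearray(size)
        self._submissions = deque()
        self._completions = deque()

    def post(self, wr: WorkRequest) -> None:
        self._submissions.append(wr)

    def process(self) -> None:
        # A real stack would do this on the device; here we fake it inline.
        while self._submissions:
            wr = self._submissions.popleft()
            if wr.op == "write":
                self._device[wr.offset:wr.offset + len(wr.data)] = wr.data
                self._completions.append(Completion("write", wr.offset, b""))
            else:
                data = bytes(self._device[wr.offset:wr.offset + 8])
                self._completions.append(Completion("read", wr.offset, data))

    def poll(self):
        return self._completions.popleft() if self._completions else None

if __name__ == "__main__":
    qp = StorageQueuePair(64)
    qp.post(WorkRequest("write", 0, b"fast-nvm"))
    qp.post(WorkRequest("read", 0))
    qp.process()
    while (c := qp.poll()) is not None:
        print(c)
```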
10:00 a.m.–10:30 a.m. | Monday
Open Mike
Each session of papers will be followed by an Open Mike session, during which attendees can interactively discuss any issues, ideas, or controversy that arose during the paper session. Open Mike sessions are not meant as extended Q&A for the paper presenters, but as discussions sparked by the papers.
10:30 a.m.–11:00 a.m. | Monday
Break with Refreshments
11:00 a.m.–noon | Monday
Session Chair: George Candea, EPFL
Attention, registered attendees! Please check your email soon for more information about this session.
Early in the workshop, attendees will have an opportunity to take two minutes each to address the audience with work-in-progress reports, solicitations for topics to discuss during the four unconference sessions, and even advertisements for talks to be presented later in the workshop (i.e., during the third day). No Q&A will be allowed during this session, since its purpose is to drive discussion during the rest of the workshop and the unconference sessions. Sign-up for these slots will open in the weeks before the workshop.
Noon–1:30 p.m. | Monday
Workshop Luncheon
Puma
1:30 p.m.–2:15 p.m. | Monday
This is one of four sessions during which attendees can break out into parallel groups to discuss topics of their choosing. The idea for these sessions is to foster interaction without excessive structure. Topics might include issues raised by the papers presented, demos of systems being built, ideas for a HotOS paper to write for 2015, and so on. Sign-up for topics will open before the workshop.
2:15 p.m.–2:30 p.m. | Monday
Break
2:30 p.m.–3:15 p.m. | Monday
Session Chair: Alex C. Snoeren, University of California, San Diego
Seungyeop Han, University of Washington; Matthai Philipose, Microsoft Research
Much has been said recently about off-loading computation from the phone. In particular, workloads such as speech and visual recognition that involve models based on “big data” are thought to be prime candidates for cloud processing. We posit that the next few years will see the arrival of mobile usages that require continuous processing of audio and video data from wearable devices. We argue that these usages are unlikely to flourish unless substantial computation is moved back onto the phone. We outline possible solutions to the problems inherent in such a move. We advocate a close partnership between perception and systems researchers to realize these usages.
Ariel Rabkin, Matvey Arye, Siddhartha Sen, Vivek Pai, and Michael J. Freedman, Princeton University
Many data sets, such as system logs, are generated from widely distributed locations. Current distributed systems often discard this data because they lack the ability to backhaul it efficiently, or to do anything meaningful with it at the distributed sites. This leads to lost functionality, efficiency, and business opportunities. The problem with traditional backhaul approaches is that they are slow and costly, and require analysts to define the data they are interested in up-front. We propose a new architecture that stores data at the edge (i.e., near where it is generated) and supports rich real-time and historical queries on this data, while adjusting data quality to cope with the vagaries of wide-area bandwidth. In essence, this design transforms a distributed data collection system into a distributed data analysis system, where decisions about collection do not preclude decisions about analysis.
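A tiny Python sketch of the degrade-rather-than-backhaul idea described above: the edge site keeps full-fidelity data and ships a coarsened summary sized to the currently available wide-area bandwidth. The summarize function, its averaging policy, and the numbers are invented for illustration and are not the paper's mechanism.

```python
# Hypothetical sketch: an edge site keeps full-fidelity data locally and
# ships a coarsened summary that fits the available WAN budget.

def summarize(samples, budget_bytes, bytes_per_value=8):
    """Coarsen a list of numeric samples so the summary fits the budget,
    by averaging over progressively larger windows."""
    max_values = max(1, budget_bytes // bytes_per_value)
    window = max(1, -(-len(samples) // max_values))  # ceiling division
    return [
        sum(samples[i:i + window]) / len(samples[i:i + window])
        for i in range(0, len(samples), window)
    ]

if __name__ == "__main__":
    edge_log = list(range(100))           # full data stays at the edge
    for bandwidth in (800, 160, 40):      # shrinking WAN budget in bytes
        shipped = summarize(edge_log, bandwidth)
        print(f"budget={bandwidth:4d}B -> {len(shipped)} values shipped")
```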
Pengyu Zhang, Deepak Ganesan, and Boyan Lu, University of Massachusetts Amherst
As sensors penetrate deeply embedded settings such as implantables, wearables, and textiles, they present new challenges due to the tiny energy buffers and extremely low harvesting conditions under which they must operate. However, existing low-power operating systems are not designed to scale down to such severely constrained environments. We address these challenges with QuarkOS, an OS that scales down by carefully dividing every communication, sensing, and computation task into tiny fragments (e.g., half a bit or one pixel) and introducing sleeps between such fragments to re-charge. In addition, QuarkOS is designed to have minimal run-time overhead while still adapting performance to harvesting conditions. Our results are promising: continuous communication from an RF-powered CRFID can occur at a third of the harvesting levels required by prior approaches, and continuous image sensing can be performed with a tiny solar panel under natural indoor light.
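A simplified Python model of the fragment-then-sleep scheduling the abstract describes: execute one tiny fragment of work whenever the energy buffer allows, otherwise sleep and harvest. The constants and the function are invented; only the overall structure follows the abstract.

```python
# Hypothetical model of QuarkOS-style fragmented execution: do one tiny
# fragment of work when the energy buffer permits, sleep to harvest
# otherwise. All constants are invented for illustration.

FRAGMENT_COST = 3      # energy units to send, e.g., half a bit or one pixel
HARVEST_PER_TICK = 1   # energy trickling in while sleeping
BUFFER_CAP = 10        # tiny energy buffer

def run_task(total_fragments):
    energy, ticks, done = BUFFER_CAP, 0, 0
    while done < total_fragments:
        if energy >= FRAGMENT_COST:
            energy -= FRAGMENT_COST    # execute one tiny fragment
            done += 1
        else:
            energy = min(BUFFER_CAP, energy + HARVEST_PER_TICK)  # sleep
        ticks += 1
    return ticks

if __name__ == "__main__":
    print("ticks to complete 20 fragments:", run_task(20))
```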
3:15 p.m.–3:45 p.m. | Monday
Open Mike
3:45 p.m.–4:15 p.m. | Monday
Break with Refreshments
4:15 p.m.–5:00 p.m. | Monday
5:00 p.m.–5:15 p.m. | Monday
Break
5:15 p.m.–6:00 p.m. | Monday
Session Chair: Michael Walfish, The University of Texas at Austin
Zhenyu Guo, Sean McDirmid, Mao Yang, and Li Zhuang, Microsoft Research Asia; Pu Zhang, Microsoft Research Asia and Peking University; Yingwei Luo, Peking University; Tom Bergan, Microsoft Research and University of Washington; Madan Musuvathi, Zheng Zhang, and Lidong Zhou, Microsoft Research Asia
Cloud services inevitably fail: machines lose power, networks become disconnected, pesky software bugs cause sporadic crashes, and so on. Unfortunately, failure recovery itself is often faulty; e.g., recovery can accidentally and recursively replicate small failures to other machines until the entire cloud service fails in a catastrophic outage, amplifying a small cold into a contagious deadly plague! We propose that failure recovery should be engineered foremost according to the maxim of primum non nocere, that it “does no harm.” Accordingly, we must consider the system holistically when failure occurs and recover only when observed activity safely allows for it.
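Read operationally, the do-no-harm maxim suggests gating recovery actions on observed system health. The sketch below, with an invented healthy-fraction threshold, is one hypothetical reading of that idea rather than the authors' mechanism.

```python
# Hypothetical "primum non nocere" recovery gate; the threshold and the
# notion of a healthy fraction are invented to illustrate the idea.

def safe_to_recover(healthy_nodes, total_nodes, min_healthy_fraction=0.7):
    """Only allow a recovery action (e.g., re-replication, failover) when
    enough of the system is observably healthy to absorb its cost."""
    return healthy_nodes / total_nodes >= min_healthy_fraction

def handle_failure(healthy_nodes, total_nodes):
    if safe_to_recover(healthy_nodes, total_nodes):
        return "recover now"
    # Recovering during a widespread incident could amplify the outage,
    # so hold off and keep serving with what is still working.
    return "defer recovery, keep serving"

if __name__ == "__main__":
    print(handle_failure(healthy_nodes=9, total_nodes=10))
    print(handle_failure(healthy_nodes=4, total_nodes=10))
```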
Ryan Stutsman and John Ousterhout, Stanford University
There are no widely accepted design patterns for writing distributed, concurrent, fault-tolerant code. Each programmer develops her own techniques for writing this type of complex software. The use of a common pattern for fault-tolerant programming has the potential to produce correct code more quickly and increase shared understanding between developers.
We describe rules, tasks, and pools, patterns extracted from the development of RAMCloud, a fault-tolerant datacenter storage system. We illustrate their application and discuss their relationship to concurrent programming models. Our goal is to generate discussion that will ultimately lead to common techniques for fault-tolerant programming.
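A bare-bones Python sketch of what a rules/tasks/pools structure might look like: a task holds state, each rule pairs a precondition with an action, and a pool repeatedly fires applicable rules until its tasks complete. The class names and the example task are invented here and are not taken from RAMCloud's code.

```python
# Hypothetical sketch of the rules/tasks/pools pattern; names invented.

class ReplicateSegmentTask:
    """Toy task: get a segment copied to a backup, tolerating retries."""
    def __init__(self):
        self.backup_chosen = False
        self.copied = False

    def done(self):
        return self.copied

    # Each "rule" is a (precondition, action) pair over the task's state.
    def rules(self):
        return [
            (lambda: not self.backup_chosen, self._choose_backup),
            (lambda: self.backup_chosen and not self.copied, self._copy),
        ]

    def _choose_backup(self):
        self.backup_chosen = True

    def _copy(self):
        self.copied = True

class Pool:
    """Drives all outstanding tasks by firing whichever rules apply."""
    def __init__(self):
        self.tasks = []

    def schedule(self, task):
        self.tasks.append(task)

    def poll(self):
        for task in list(self.tasks):
            for applies, action in task.rules():
                if applies():
                    action()
            if task.done():
                self.tasks.remove(task)

if __name__ == "__main__":
    pool = Pool()
    pool.schedule(ReplicateSegmentTask())
    while pool.tasks:
        pool.poll()
    print("all tasks complete")
```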
Shriram Rajagopalan, IBM T. J. Watson Research Center and University of British Columbia; Dan Williams and Hani Jamjoom, IBM T. J. Watson Research Center; Andrew Warfield, University of British Columbia
Software is modular, and so is run-time state. We argue that by allowing individual layers of the software stack to store isolated runtime state, we cripple the ability of systems to effectively scale or respond to failures. Given the strong desire to build elastic and highly available applications for the cloud, we propose Slice, an abstraction that allows applications to declare appropriate granularities of scale-oriented state, and allows layers to contribute the appropriate layer-specific data to those containers. Slices can be transparently migrated and replicated between application instances, thereby simplifying the design of elastic and highly available systems while retaining the modularity of modern software.
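A toy Python sketch of a Slice-like container, with invented API names (not the paper's interface): each layer contributes its own piece of per-session state, and the container can be serialized so it can be migrated or replicated as one unit.

```python
# Hypothetical sketch of a Slice-like container: each layer of the stack
# contributes its own layer-specific state, and the container moves as a
# unit. API names are invented for illustration.
import json

class Slice:
    def __init__(self, slice_id):
        self.slice_id = slice_id
        self._state = {}          # layer name -> layer-specific state

    def contribute(self, layer, state):
        self._state[layer] = state

    def serialize(self):
        return json.dumps({"id": self.slice_id, "state": self._state})

    @classmethod
    def deserialize(cls, blob):
        data = json.loads(blob)
        s = cls(data["id"])
        s._state = data["state"]
        return s

if __name__ == "__main__":
    s = Slice("session-42")
    s.contribute("load-balancer", {"backend": "b1"})
    s.contribute("app", {"cart": ["book"]})
    # "Migrate" the slice to another instance by moving its serialized form.
    migrated = Slice.deserialize(s.serialize())
    print(migrated.slice_id, migrated._state)
```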
6:00 p.m.–6:30 p.m. | Monday
Open Mike
6:30 p.m.–7:00 p.m. | Monday
Break
7:00 p.m.–9:00 p.m. | Monday
Workshop Dinner
Puma and Patio