History-Based Harvesting of Spare Cycles and Storage in Large-Scale Datacenters

Authors: 

Yunqi Zhang, University of Michigan and Microsoft Research; George Prekas, École Polytechnique Fédérale de Lausanne (EPFL) and Microsoft Research; Giovanni Matteo Fumarola and Marcus Fontoura, Microsoft; Inigo Goiri and Ricardo Bianchini, Microsoft Research

Abstract: 

An effective way to increase utilization and reduce costs in datacenters is to co-locate their latency-critical services and batch workloads. In this paper, we describe systems that harvest spare compute cycles and storage space for co-location purposes. The main challenge is minimizing the performance impact on the services, while accounting for their utilization and management patterns. To overcome this challenge, we propose techniques for giving the services priority over the resources, and leveraging historical information about them. Based on this information, we schedule related batch tasks on servers that exhibit similar patterns and will likely have enough available resources for the tasks’ durations, and place data replicas at servers that exhibit diverse patterns. We characterize the dynamics of how services are utilized and managed in ten large-scale production datacenters. Using real experiments and simulations, we show that our techniques eliminate data loss and unavailability in many scenarios, while protecting the co-located services and improving batch job execution time.
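The abstract does not spell out the replica placement algorithm. As a rough illustration of the idea of placing data replicas at servers that exhibit diverse patterns, the following minimal sketch assumes each server has already been assigned a pattern label derived from its history. The labels, the function name `place_replicas`, and the round-robin policy are all hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from typing import Dict, List


def place_replicas(server_pattern: Dict[str, str], num_replicas: int) -> List[str]:
    """Pick servers for one block's replicas, spreading them across pattern classes.

    server_pattern maps a server name to a hypothetical label describing its
    historical utilization/management pattern (e.g. "periodic", "constant").
    """
    if num_replicas > len(server_pattern):
        raise ValueError("not enough servers for the requested replicas")

    # Group servers by pattern label; sort names for deterministic output.
    by_pattern = defaultdict(list)
    for server in sorted(server_pattern):
        by_pattern[server_pattern[server]].append(server)

    # Round-robin over pattern classes so replicas land in as many
    # distinct classes as possible before any class is reused.
    groups = [by_pattern[label] for label in sorted(by_pattern)]
    chosen: List[str] = []
    index = 0
    while len(chosen) < num_replicas:
        for group in groups:
            if index < len(group) and len(chosen) < num_replicas:
                chosen.append(group[index])
        index += 1
    return chosen


# Hypothetical fleet: two servers share a pattern, two have distinct ones.
servers = {
    "s1": "periodic",
    "s2": "periodic",
    "s3": "constant",
    "s4": "unpredictable",
}
print(place_replicas(servers, 3))
```

With three replicas and three pattern classes, each replica lands in a different class, so a utilization spike or management event that hits one class of servers at the same time is less likely to affect every copy of the data.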

BibTeX
@inproceedings {199307,
author = {Yunqi Zhang and George Prekas and Giovanni Matteo Fumarola and Marcus Fontoura and Inigo Goiri and Ricardo Bianchini},
title = {{History-Based} Harvesting of Spare Cycles and Storage in {Large-Scale} Datacenters},
booktitle = {12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16)},
year = {2016},
isbn = {978-1-931971-33-1},
address = {Savannah, GA},
pages = {755--770},
url = {https://www.usenix.org/conference/osdi16/technical-sessions/presentation/zhang-yunqi},
publisher = {USENIX Association},
month = nov
}
