sponsors
usenix conference policies
Why Let Resources Idle? Aggressive Cloning of Jobs with Dolly
12 Tuesday | 13 Wednesday | 14 Thursday | 15 Friday |
---|---|---|---|
HotCloud '12 | TaPP '12 | ||
WiAC '12 | USENIX ATC '12 | ||
UCMS '12 | HotStorage '12 | NSDR '12 | |
USENIX Cyberlaw '12 | WebApps '12 |
Ganesh Ananthanarayanan, Ali Ghodsi, Scott Shenker, and Ion Stoica, University of California, Berkeley
Despite prior research on outlier mitigation, our analysis of jobs from the Facebook cluster shows that outliers still occur, especially in small jobs. Small jobs are particularly sensitive to long-running outlier tasks because of their interactive nature. Outlier mitigation strategies rely on comparing different tasks of the same job and launching speculative copies for the slower tasks. However, small jobs execute all their tasks simultaneously, thereby not providing sufficient time to observe and compare tasks. Building on the observation that clusters are underutilized, we take speculation to its logical extreme—run full clones of jobs to mitigate the effect of outliers. The heavy-tail distribution of job sizes implies that we can impact most jobs without using much resources. Trace-driven simulations show that average completion time of all the small jobs improves by 47% using cloning, at the cost of just 3% extra resources.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {Ganesh Ananthanarayanan and Ali Ghodsi and Scott Shenker and Ion Stoica},
title = {Why Let Resources Idle? Aggressive Cloning of Jobs with Dolly},
booktitle = {4th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 12)},
year = {2012},
address = {Boston, MA},
url = {https://www.usenix.org/conference/hotcloud12/workshop-program/presentation/ananthanarayanan},
publisher = {USENIX Association},
month = jun
}
connect with us