Solving the Straggler Problem with Bounded Staleness
James Cipar, Qirong Ho, Jin Kyu Kim, Seunghak Lee, Gregory R. Ganger, and Garth Gibson, Carnegie Mellon University; Kimberly Keeton, HP Labs; Eric Xing, Carnegie Mellon University
Many important applications fall into the broad class of iterative convergent algorithms. Parallel implementations of these algorithms are naturally expressed using the Bulk Synchronous Parallel (BSP) model of computation. However, implementations using BSP are plagued by the straggler problem, where every transient slowdown of any given thread can delay all other threads. This paper presents the Stale Synchronous Parallel (SSP) model as a generalization of BSP that preserves many of its advantages, while avoiding the straggler problem. Algorithms using SSP can execute efficiently, even with significant delays in some threads, addressing the oft-faced straggler problem.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {James Cipar and Qirong Ho and Jin Kyu Kim and Seunghak Lee and Gregory R. Ganger and Garth Gibson and Kimberly Keeton and Eric Xing},
title = {Solving the Straggler Problem with Bounded Staleness},
booktitle = {14th Workshop on Hot Topics in Operating Systems (HotOS XIV)},
year = {2013},
address = {Santa Ana Pueblo, NM},
url = {https://www.usenix.org/conference/hotos13/session/cipar},
publisher = {USENIX Association},
month = may
}
connect with us