Exploiting Bounded Staleness to Speed Up Big Data Analytics
Henggang Cui, James Cipar, Qirong Ho, Jin Kyu Kim, Seunghak Lee, Abhimanu Kumar, Jinliang Wei, Wei Dai, and Gregory R. Ganger, Carnegie Mellon University; Phillip B. Gibbons, Intel Labs; Garth A. Gibson and Eric P. Xing, Carnegie Mellon University
Many modern machine learning (ML) algorithms are iterative, converging on a final solution via many iterations over the input data. This paper explores approaches to exploiting these algorithms' convergent nature to improve performance, by allowing parallel and distributed threads to use loose consistency models for shared algorithm state. Specifically, we focus on bounded staleness, in which each thread can see a view of the current intermediate solution that may be a limited number of iterations out-of-date. Allowing staleness reduces communication costs (batched updates and cached reads) and synchronization (less waiting for locks or straggling threads). One approach is to increase the number of iterations between barriers in the oft-used Bulk Synchronous Parallel (BSP) model of parallelizing, which mitigates these costs when all threads proceed at the same speed. A more flexible approach, called Stale Synchronous Parallel (SSP), avoids barriers and allows threads to be a bounded number of iterations ahead of the current slowest thread. Extensive experiments with ML algorithms for topic modeling, collaborative filtering, and PageRank show that both approaches significantly increase convergence speeds, behaving similarly when there are no stragglers, but SSP outperforms BSP in the presence of stragglers.
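The bounded-staleness rule described in the abstract can be illustrated with a small thread-clock sketch. This is a minimal illustration, not the authors' implementation: the names `SSPClock`, `tick`, and `run_workers` are hypothetical, and the code models only the "wait if too far ahead of the slowest thread" condition, not the batched updates or cached reads. Under these assumptions, a staleness bound of 0 behaves like a BSP barrier after every iteration.

```python
import threading
import time


class SSPClock:
    """Hypothetical per-thread iteration clock with a staleness bound."""

    def __init__(self, num_threads, staleness):
        self.clocks = [0] * num_threads   # completed iterations per thread
        self.staleness = staleness        # max allowed lead over the slowest thread
        self.max_lead = 0                 # largest lead seen when a thread was released
        self.cond = threading.Condition()

    def tick(self, tid):
        """Record one finished iteration for `tid`, then block if it is too far ahead."""
        with self.cond:
            self.clocks[tid] += 1
            # A tick may raise the minimum clock, so wake any waiting threads.
            self.cond.notify_all()
            # The slowest thread always has lead 0, so it never blocks here.
            while self.clocks[tid] - min(self.clocks) > self.staleness:
                self.cond.wait()
            lead = self.clocks[tid] - min(self.clocks)
            self.max_lead = max(self.max_lead, lead)


def run_workers(num_threads, staleness, iterations, slow_tid=0, delay=0.001):
    """Run `num_threads` workers; thread `slow_tid` is an artificial straggler."""
    clock = SSPClock(num_threads, staleness)

    def worker(tid):
        for _ in range(iterations):
            if tid == slow_tid:
                time.sleep(delay)         # simulate a straggling thread
            clock.tick(tid)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return clock
```

Calling `run_workers(num_threads=4, staleness=2, iterations=20)` lets the fast threads run up to two iterations ahead of the straggler before they wait, whereas `staleness=0` forces every thread to wait for the slowest one after each iteration.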
@inproceedings{cui2014exploiting,
author = {Henggang Cui and James Cipar and Qirong Ho and Jin Kyu Kim and Seunghak Lee and Abhimanu Kumar and Jinliang Wei and Wei Dai and Gregory R. Ganger and Phillip B. Gibbons and Garth A. Gibson and Eric P. Xing},
title = {Exploiting Bounded Staleness to Speed Up Big Data Analytics},
booktitle = {2014 USENIX Annual Technical Conference (USENIX ATC 14)},
year = {2014},
isbn = {978-1-931971-10-2},
address = {Philadelphia, PA},
pages = {37--48},
url = {https://www.usenix.org/conference/atc14/technical-sessions/presentation/cui},
publisher = {USENIX Association},
month = jun
}