A Community Cache with Complete Information

Website Maintenance Alert

Due to scheduled maintenance, the USENIX website may not be available on Monday, March 17, from 10:00 am–6:00 pm Pacific Daylight Time (UTC -7). We apologize for the inconvenience and thank you for your patience.

If you would like to register for NSDI '25, SREcon25 Americas, or PEPR '25, please complete your registration before or after this time period.

Authors: 

Mania Abdi, Northeastern University; Amin Mosayyebzadeh, Boston University; Mohammad Hossein Hajkazemi, Northeastern University; Emine Ugur Kaynar, Boston University; Ata Turk, State Street; Larry Rudolph, TwoSigma; Orran Krieger, Boston University; Peter Desnoyers, Northeastern University

Abstract: 

Kariz is a new architecture for caching data from datalakes accessed, potentially concurrently, by multiple analytic platforms. It integrates rich information from analytics platforms with global knowledge about demand and resource availability to enable sophisticated cache management and prefetching strategies that, for example, combine historical run time information with job dependency graphs (DAGs), information about the cache state and sharing across compute clusters. Our prototype supports multiple analytic frameworks (Pig/Hadoop and Spark), and we show that the required changes are modest. We have implemented three algorithms in Kariz for optimizing the caching of individual queries (one from the literature, and two novel to our platform) and three policies for optimizing across queries from, potentially, multiple different clusters. With an algorithm that fully exploits the rich information available from Kariz, we demonstrate major speedups (as much as 3×) for TPC-H and TPC-DS.

FAST '21 Open Access Sponsored by NetApp

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX

Presentation Video