Lucas Kuhring, IMDEA Software Institute, Madrid, Spain; Eva Garcia, Universidad Autónoma de Madrid, Spain; Zsolt István, IMDEA Software Institute, Madrid, Spain
To keep up with big data workloads, distributed storage needs to offer low-latency, high-bandwidth, and energy-efficient access to data. To achieve these properties, most state-of-the-art solutions focus exclusively on either software or hardware-based implementations. FPGAs are an example of the latter and a promising platform for building storage nodes, but they are more cumbersome to program and less flexible than software, which limits their adoption.
We make the case that, in order to be feasible in the cloud, solutions designed around programmable hardware, such as FPGAs, have to follow a service provider-centric methodology: the hardware should only provide functionality that is useful across all tenants and rarely changes. Conversely, application-specific functionality should be delivered through software that, in a cloud setting, is under the provider's control. Deploying FPGAs this way is less cumbersome, requires less hardware programming, and increases overall flexibility.
We demonstrate the benefits of this approach by building application-aware storage for Parquet files, a columnar data format widely used in big data frameworks. Our prototype offers transparent 10 Gbps deduplication in hardware without sacrificing low-latency operation, and it specializes to Parquet files using a companion library. This work paves the way for in-storage filtering of columnar data without having to implement file-type-specific and tenant-specific parsing in the FPGA.
author = {Lucas Kuhring and Eva Garcia and Zsolt Istv{\'a}n},
title = {Specialize in {Moderation{\textemdash}Building} Application-aware Storage Services using {FPGAs} in the Datacenter},
booktitle = {11th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 19)},
year = {2019},
address = {Renton, WA},
url = {https://www.usenix.org/conference/hotstorage19/presentation/kuhring},
publisher = {USENIX Association},
month = jul
}