More Than Capacity: Performance-oriented Evolution of Pangu in Alibaba

Authors: 

Qiang Li, Alibaba Group; Qiao Xiang, Xiamen University; Yuxin Wang, Haohao Song, and Ridi Wen, Xiamen University and Alibaba Group; Wenhui Yao, Yuanyuan Dong, Shuqi Zhao, Shuo Huang, Zhaosheng Zhu, Huayong Wang, Shanyang Liu, Lulu Chen, Zhiwu Wu, Haonan Qiu, Derui Liu, Gexiao Tian, Chao Han, Shaozong Liu, Yaohui Wu, Zicheng Luo, Yuchao Shao, Junping Wu, Zheng Cao, Zhongjie Wu, Jiaji Zhu, and Jinbo Wu, Alibaba Group; Jiwu Shu, Xiamen University; Jiesheng Wu, Alibaba Group

Abstract: 

This paper presents how the Pangu storage system continuously evolves with hardware technologies and the business model to provide high-performance, reliable storage services with I/O latency at the 100-microsecond level. Pangu’s evolution comprises two phases. In the first phase, Pangu embraces the emergence of solid-state drive (SSD) storage and remote direct memory access (RDMA) network technologies by innovating its file system and designing a user-space storage operating system, substantially reducing I/O latency while providing high throughput and IOPS. In the second phase, Pangu evolves from a volume-oriented storage provider to a performance-oriented one. To adapt to this change of business model, Pangu upgrades its infrastructure with storage servers of much higher SSD volume and raises RDMA bandwidth from 25 Gbps to 100 Gbps. It introduces a series of key designs, including traffic amplification reduction, remote direct cache access, and CPU computation offloading, to ensure that Pangu fully harvests the performance improvement brought by the hardware upgrade. Beyond these technical innovations, we also share our operating experiences during Pangu’s evolution and discuss important lessons learned from them.

Category: 
Deployed-Systems Paper

BibTeX
@inproceedings {285788,
author = {Qiang Li and Qiao Xiang and Yuxin Wang and Haohao Song and Ridi Wen and Wenhui Yao and Yuanyuan Dong and Shuqi Zhao and Shuo Huang and Zhaosheng Zhu and Huayong Wang and Shanyang Liu and Lulu Chen and Zhiwu Wu and Haonan Qiu and Derui Liu and Gexiao Tian and Chao Han and Shaozong Liu and Yaohui Wu and Zicheng Luo and Yuchao Shao and Junping Wu and Zheng Cao and Zhongjie Wu and Jiaji Zhu and Jinbo Wu and Jiwu Shu and Jiesheng Wu},
title = {More Than Capacity: Performance-oriented Evolution of Pangu in Alibaba},
booktitle = {21st USENIX Conference on File and Storage Technologies (FAST 23)},
year = {2023},
isbn = {978-1-939133-32-8},
address = {Santa Clara, CA},
pages = {331--346},
url = {https://www.usenix.org/conference/fast23/presentation/li-qiang-deployed},
publisher = {USENIX Association},
month = feb
}
