Finding Crash-Consistency Bugs with Bounded Black-Box Crash Testing

Authors: 

Jayashree Mohan, Ashlie Martinez, Soujanya Ponnapalli, and Pandian Raju, University of Texas at Austin; Vijay Chidambaram, University of Texas at Austin and VMware Research

Abstract: 

We present a new approach to testing file-system crash consistency: bounded black-box crash testing (B3). B3 tests the file system in a black-box manner using workloads of file-system operations. Since the space of possible workloads is infinite, B3 bounds this space based on parameters such as the number of file-system operations or which operations to include, and exhaustively generates workloads within this bounded space. Each workload is tested on the target file system by simulating power-loss crashes while the workload is being executed, and checking if the file system recovers to a correct state after each crash. B3 builds upon insights derived from our study of crash-consistency bugs reported in Linux file systems in the last five years. We observed that most reported bugs can be reproduced using small workloads of three or fewer file-system operations on a newly-created file system, and that all reported bugs result from crashes after fsync() related system calls. We build two tools, CrashMonkey and Ace, to demonstrate the effectiveness of this approach. Our tools are able to find 24 out of the 26 crash-consistency bugs reported in the last five years. Our tools also revealed 10 new crash-consistency bugs in widely-used, mature Linux file systems, seven of which existed in the kernel since 2014. The new bugs result in severe consequences like broken rename atomicity and loss of persisted files.

Additional media

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {222585,
author = {Jayashree Mohan and Ashlie Martinez and Soujanya Ponnapalli and Pandian Raju and Vijay Chidambaram},
title = {Finding {Crash-Consistency} Bugs with Bounded {Black-Box} Crash Testing},
booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},
year = {2018},
isbn = {978-1-939133-08-3},
address = {Carlsbad, CA},
pages = {33--50},
url = {https://www.usenix.org/conference/osdi18/presentation/mohan},
publisher = {USENIX Association},
month = oct
}

Presentation Audio