Rex: Preventing Bugs and Misconfiguration in Large Services Using Correlated Change Analysis

Authors: 

Sonu Mehta, Ranjita Bhagwan, and Rahul Kumar, Microsoft Research India; Chetan Bansal, Microsoft Research; Chandra Maddila and B. Ashok, Microsoft Research India; Sumit Asthana, University of Michigan; Christian Bird, Microsoft Research; Aditya Kumar

Abstract: 

Large services experience extremely frequent changes to code and configuration. In many cases, these changes are correlated across files. For example, an engineer introduces a new feature and then changes a configuration file to enable the feature only on a small number of experimental machines. This example captures only one of numerous types of correlations that emerge organically in large services. Unfortunately, in almost all such cases, no documentation or specification guides engineers on how to make correlated changes, and they often miss making them. Such misses can be vastly disruptive to the service.

We have designed and deployed Rex, a tool that uses a combination of machine learning and program analysis to learn change-rules that capture such correlations. When an engineer changes only a subset of the files in a change-rule, Rex suggests additional changes to the engineer based on that change-rule. Rex has been deployed for 14 months on 360 repositories within Microsoft that hold code and configuration for services such as Office 365 and Azure. Rex has so far positively affected 4926 changes; without its suggestions, at the very least, code quality would have degraded and, in some cases, the service would have been severely disrupted.
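To make the idea of a change-rule concrete, the following is a minimal sketch, not Rex's actual method. It assumes a change-rule can be approximated by pairwise co-change counts over commit history, filtered by support and confidence thresholds. The abstract only states that Rex combines machine learning and program analysis, so the function names, thresholds, and file names here are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def learn_change_rules(commits, min_support=2, min_confidence=0.8):
    """Learn pairwise rules "if a changes, b usually changes too".

    commits: iterable of lists of file paths changed together.
    Returns {a: {b: confidence}}. The thresholds are illustrative assumptions.
    """
    file_count = Counter()
    pair_count = Counter()
    for files in commits:
        unique = set(files)
        file_count.update(unique)
        # Ordered pairs (a, b): a changed, and b changed in the same commit.
        pair_count.update(permutations(sorted(unique), 2))

    rules = {}
    for (a, b), together in pair_count.items():
        confidence = together / file_count[a]
        if together >= min_support and confidence >= min_confidence:
            rules.setdefault(a, {})[b] = confidence
    return rules


def suggest_missing_files(rules, changed_files):
    """Suggest files that a rule expects but the current change omits."""
    changed = set(changed_files)
    suggestions = {}
    for a in changed:
        for b, confidence in rules.get(a, {}).items():
            if b not in changed:
                suggestions[b] = max(confidence, suggestions.get(b, 0.0))
    # Highest confidence first; ties broken by file name.
    return sorted(suggestions.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical history: a feature file that usually changes with a config file.
history = [
    ["feature.cs", "flights.ini"],
    ["feature.cs", "flights.ini"],
    ["feature.cs", "flights.ini", "README.md"],
    ["README.md"],
]
rules = learn_change_rules(history)
print(suggest_missing_files(rules, ["feature.cs"]))
```

In this toy history, a change that touches only `feature.cs` triggers a suggestion to also edit `flights.ini`, mirroring the feature-plus-configuration example in the abstract. A real system would presumably need finer-grained rules than whole-file co-occurrence.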

NSDI '20 Open Access Sponsored by NetApp


BibTeX
@inproceedings {246332,
author = {Sonu Mehta and Ranjita Bhagwan and Rahul Kumar and Chetan Bansal and Chandra Maddila and B. Ashok and Sumit Asthana and Christian Bird and Aditya Kumar},
title = {Rex: Preventing Bugs and Misconfiguration in Large Services Using Correlated Change Analysis},
booktitle = {17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20)},
year = {2020},
isbn = {978-1-939133-13-7},
address = {Santa Clara, CA},
pages = {435--448},
url = {https://www.usenix.org/conference/nsdi20/presentation/mehta},
publisher = {USENIX Association},
month = feb
}
