Venue
DoubleTree by Hilton Dublin - Burlington Road
Leeson Street Upper
Dublin 4, Ireland
Bridging the Safety Gap from Scripts to Full Auto-Remediation
David Mah, Dropbox
At Dropbox, to bridge the gap between “scripts” and “fully automatic automation”, we’ve introduced a concept of “Human Authorized Execution”. This means that a tool automatically finds problems and decides how to fix them, but a human operator is required to audit the tool’s decisions before the automation may run.
Why do we need this? Because it’s terrifying to have automation run fully automatically. With a human involved, their intuition can answer a really important question: Why might I NOT want to run this script? If we took a simple approach, for instance deploying a cron job to run our scripts whenever alerts fire, we would lose that human’s sense of danger.
At Dropbox, we’ve built an alert auto-remediation platform which forces us to build our maintenance automation in a way that adheres to these principles. Through it, we’ve been able to overcome our discomfort with risky automation and transition our way into actually running scripts fully automatically.
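As a rough illustration of the idea above, the following is a minimal sketch of "Human Authorized Execution", not a description of how Dropbox's platform is actually built. Every name in it (`Remediation`, `run_remediations`, `human_authorizer`, `trusted_authorizer`) is hypothetical. It assumes the tool has already found the problems and proposed fixes, and that the only thing that changes between the human-authorized mode and the fully automatic mode is the authorizer passed in.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Remediation:
    """A proposed fix for one alert (hypothetical structure)."""
    alert: str
    description: str
    action: Callable[[], str]


def run_remediations(
    proposals: List[Remediation],
    authorize: Callable[[Remediation], bool],
) -> List[str]:
    """Run only the proposals that the authorizer approves."""
    results = []
    for proposal in proposals:
        if authorize(proposal):
            results.append(proposal.action())
        else:
            # Nothing runs without approval; record the skip instead.
            results.append(f"skipped: {proposal.alert}")
    return results


def human_authorizer(proposal: Remediation) -> bool:
    """Ask an operator to audit the decision before anything runs."""
    answer = input(f"[{proposal.alert}] {proposal.description}. Run? [y/N] ")
    return answer.strip().lower() == "y"


def trusted_authorizer(proposal: Remediation) -> bool:
    """Fully automatic mode, assumed to be used only once a fix has earned trust."""
    return True
```

Under these assumptions, a team would start by calling `run_remediations(proposals, human_authorizer)` and switch a given remediation to `trusted_authorizer` only after operators have repeatedly approved it without finding a reason not to run it.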
In this talk we will discuss the thought process we bring towards building trustworthy automation, how we’ve driven our infrastructure organization towards a culture of embracing it, and simple steps that you could take to start gaining similar benefits in your organization.
This talk is aimed at organisations that do not currently have extensive automation but wish to put together a road map for moving towards fully automated operational infrastructure.
David Mah is an SRE at Dropbox, where he built several of the verification and safety subsystems for Dropbox’s “Magic Pocket” storage system. More recently, he built Naoru, Dropbox’s automation platform that is used to de-risk dangerous maintenance automation tasks.
On the flip-side of career interests, David cares a lot about how to keep folks growing and happy. Towards this, he runs Dropbox’s engineering internship program and is heavily involved in SRE recruiting, particularly university recruiting.
author = {David Mah},
title = {Bridging the Safety Gap from Scripts to Full {Auto-Remediation}},
year = {2016},
address = {Dublin},
publisher = {USENIX Association},
month = jul
}