The Value of Reliability

Thursday, 12 October, 2023 - 11:50–12:30

Niall Murphy, Stanza Systems

Abstract: 

Niall Murphy will deliver an overview of how we measure and articulate the value of reliability in our organisations.

How do we evaluate downtime? What are the highest-value parts of your stack? How do you prioritise your engineering effort to best improve the set of likely outcomes?

Furthermore, once we know the value of our systems, how do we communicate it effectively to the rest of the business? What ways are there to prioritise reliability work over feature work?

Niall Murphy, Stanza Systems

Niall is the CEO of Stanza Systems, has occupied various engineering and leadership roles at Microsoft, Google, and Amazon, and is the instigator of the best-selling and prize-winning Site Reliability Engineering, which he hopes at some stage to live down. His most recent book is Reliable Machine Learning, with Todd Underwood and many others.
