Postmortem Action Items: Plan the Work and Work the Plan

Tuesday, March 14, 2017 - 10:55am–11:20am

Sue Lueder and Betsy Beyer, Google

Abstract: 

In the 2016 O'Reilly book Site Reliability Engineering, Google described our culture of blameless postmortems, and recommended that organizations institute a similar culture of postmortems after production incidents. This talk shares some best practices and challenges in designing an appropriate action item plan and subsequently executing that plan in a complex environment of competing priorities, resource limitations, and operational realities. We discuss best practices for developing high-quality action items (AIs) for a postmortem, plus methods of ensuring these AIs actually get implemented, so that we don't suffer the same outage, or an even worse one, again. It's worth noting that Google teams are by no means perfect at formulating and executing postmortem action items. We still have a lot to learn in this difficult area, and are sharing our thoughts and strategies to give a starting point for discussion throughout the industry.

Sue Lueder, Google

Sue Lueder joined Google as a Site Reliability Program Manager in 2014 and is on the team responsible for disaster testing and readiness, incident management processes and tools, and incident analysis. Before joining Google, Sue was a technical program manager and a systems, software, and quality engineer in the wireless and smart energy industries (Ingenu Wireless, Texas Instruments, Qualcomm). She has an M.S. in Organization Development from Pepperdine University and a B.S. in Physics from UCSD.

Betsy Beyer, Google

Betsy Beyer is a Technical Writer for Google Site Reliability Engineering in NYC. She has previously written documentation for Google Datacenters and Hardware Operations teams. Before moving to New York, Betsy was a lecturer on technical writing at Stanford University. She holds degrees from Stanford and Tulane.

BibTeX
@conference {201854,
author = {Sue Lueder and Betsy Beyer},
title = {Postmortem Action Items: Plan the Work and Work the Plan},
year = {2017},
address = {San Francisco, CA},
publisher = {USENIX Association},
month = mar
}
