Bad Machinery— Managing Interrupts Under Load
Dave O'Connor, Google Dublin
A lot of thought goes into how to organise oncall rotations: around people's schedules, around periods of critical coverage, and with fairness in mind. Less thought goes into the human aspect of oncall: how it affects people's ability to get other work done, their general cognitive flow state, and burnout rates. This talk presents a paper used internally by several SRE teams at Google to organise rotations around people, bearing in mind that people are not machines.
Dave O'Connor is a Senior Site Reliability Manager at Google. He has been at Google for almost 11 years, 9 of which were spent oncall and organising oncall rotations. He has worked on several teams in Google SRE and currently manages the teams that run Google's storage in its Dublin, Ireland office. His specialty is being spectacularly grumpy about being interrupted, both on his own behalf and on other people's; he has spent many years building teams that handle heavy interrupt load without everyone hating their lives.

author = {Dave O{\textquoteright}Connor},
title = {Bad {Machinery{\textemdash}} Managing Interrupts Under Load},
year = {2015},
address = {Dublin},
publisher = {USENIX Association},
month = may
}