Josh Simon, University of Michigan
A disaster recovery plan (DRP) documents policies and detailed procedures for recovering your organization's critical technology infrastructure, systems, and applications after a disaster. Hopefully your organization has DRPs, but how complete are they really? How do you test them, and how often?
In this talk, we'll help you better understand what a DRP is and what it contains, as well as why it's important to write, test, and maintain service-specific DRPs and their associated documentation. We'll talk about how we're developing and using collaborative, discussion-based thought experiments to test our DRPs, including things you should and shouldn't do when you write and test your own. You may even get some insights on how to design your own services for reliability and recovery!

Josh is a senior systems administrator with over 30 years of experience across industry and higher education. His areas of expertise include systems administration, project management, technical writing, and facilitation. Among his many roles and responsibilities is coordinating his team's disaster recovery planning process. He enjoys sharing his experiences, especially if doing so saves other people from problems in the future.

author = {Josh Simon},
title = {Running {DRP} Tabletop Exercises},
year = {2025},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = mar
}