What Do SRE ICs Do? How to Build SRE Skillsets

Wednesday, March 26, 2025 - 11:00 am–12:35 pm PDT

Beth Adele Long, Adaptive Capacity Labs, and Fred Hebert, Honeycomb.io

Format: Breakout Group Discussion

Abstract: 

The focus of this session is developing individual contributor skills in SRE. SREs do a lot of different things, including but not limited to: load testing, setting up and maintaining infrastructure services, building integration test pipelines, setting SLOs, maintaining alerts, serving as incident commander, writing post-incident reviews, designing systems for scalability, building automation, contributing code changes to core products that aim to increase reliability or performance, and troubleshooting. Few engineers come to SRE with all of these skills. How, as practitioners, should we think about building skills in new areas? Does it make sense to be an SRE jack-of-all-trades, or should one specialize?

Beth Adele Long is a writer and engineer with wide experience building, maintaining, and repairing web systems (mostly repairing). She’s a founding member of the Resilience in Software Foundation and a Principal at Adaptive Capacity Labs.

Fred Hebert is a staff SRE at Honeycomb.io, caring for SLOs and error budgets, on-call health, alert hygiene, incident response, and operational readiness. He’s a published technical author who loves distributed systems and systems engineering, and has a strong interest in resilience engineering and human factors.
BibTeX
@conference {305902,
author = {Beth Adele Long and Fred Hebert},
title = {What Do {SRE} {ICs} Do? How to Build {SRE} Skillsets},
year = {2025},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = mar
}