Chris Sinjakli, PlanetScale
Service Level Objectives (SLOs) are a familiar topic in SRE circles. They provide a framework for measuring and thinking about the reliability of a service in terms of a percentage of successful operations, such as HTTP requests.
That key strength of SLOs - viewing reliability as a percentage game - can also be a weakness. Within that framing, there are certain solutions we're likely to overlook.
This talk explores another lens for reliability - one that's complementary to SLOs: structuring software in a way that rules out entire classes of problem.
We'll explore this idea via three worked examples, and finish with some concrete take-aways, including how to spot problems that fit this shape.
Chris enjoys working on the strange parts of computing where software and systems meet. He particularly likes the challenges of databases and distributed systems.
All his programs are made from organic, hand-picked, artisanal keypresses.
author = {Chris Sinjakli},
title = {Making the Impossible Impossible: Improving Reliability by Preventing Classes of Problems},
year = {2022},
address = {Amsterdam},
publisher = {USENIX Association},
month = oct
}