Monitoring Systems as a Service – Walking the Line between Giving Your Devs Good M&O and Setting All Your Money on Fire

Thursday, 31 October, 2024 - 09:00–09:40 GMT

Joan O'Callaghan, Udemy

Abstract: 

Monitoring-as-a-Service products like Datadog and Honeycomb are amazing for implementing monitoring & observability with minimal effort, but like Anything-as-a-Service, they come at a cost.

We are a very normal company, with all the tech debt and orphaned code that any company over a certain age has. Like everyone else, we had staff who heard "measure everything!" but didn't know what the monitoring bill looked like, or that "everything" included a lot of junk.

In this talk, I'll discuss how we managed to reduce wasted spend, enable extra vendor features, improve M&O knowledge within the engineering organisation, and keep the bill the same or lower, despite 60% growth in infrastructure at our company.

A note on the vendor: I won't say who the vendor is, but I think our experience was universal enough that our fixes and techniques will be helpful to other companies.

Joan O'Callaghan is a Monitoring and Observability Director at Udemy. She has worked in SRE, Incident Management, and M&O (in one form or another) for many, many years. She likes to host and write blameless incident reviews and take long walks on the beach, where she has imaginary arguments with people who don't like resilience as much as she does. She is always very happy when she meets people more paranoid than she is.

BibTeX
@conference {302249,
author = {Joan O{\textquoteright}Callaghan},
title = {Monitoring Systems as a Service {\textendash} Walking the Line between Giving Your Devs Good {M\&O} and Setting All Your Money on Fire},
year = {2024},
address = {Dublin},
publisher = {USENIX Association},
month = oct
}
