Kubernetes the Very Hard Way

Tuesday, October 29, 2019 - 2:00 pm-2:45 pm

Laurent Bernaille, Datadog

Abstract: 

Running large Kubernetes clusters is challenging. At large scales, practitioners need to adapt and tune both their architectures and component configurations in specialized ways.

Our organisation has been running large-scale Kubernetes clusters (up to 2000 nodes, and growing) for more than a year, and we have learned several lessons the hard way. This talk will dive into complex runtime and networking issues that occur when running Kubernetes in production at scale. We will provide examples of how to improve the architecture of clusters to increase scalability and performance, both on the control plane and the data plane. Further, tools from the greater ecosystem will be examined, as they are rarely tested within the context of very large clusters.

Finally, the talk will discuss the mutually beneficial relationship we built with the larger Kubernetes community by providing feedback on the tools and contributing both fixes and improvements upstream.

Laurent Bernaille, Datadog

Laurent is a Staff Engineer at Datadog and works in the Compute team, which is responsible for setting up and scaling Kubernetes platforms. He has given several talks on application deployment and containers at conferences such as DockerCon, Open Source Summit, EuroBSDcon, and KubeCon.

BibTeX
@conference {240806,
author = {Laurent Bernaille},
title = {Kubernetes the Very Hard Way},
year = {2019},
address = {Portland, OR},
publisher = {USENIX Association},
month = oct
}