Productionizing Machine-Learning Services: Lessons from Google SRE

Thursday, June 7, 2018 - 1:20 pm–2:15 pm

Salim Virji and Carlos Villavieja, Google

Abstract: 

Have you considered that a model trained on a Monday might not work on a Saturday? Or that a model trained on users in Florida might not work for all Spanish-speaking users? In this talk, we present lessons learned from deploying and productionizing ML systems across various products at Google.

Salim Virji, Google

Salim Virji is a Site Reliability Engineer at Google, where he has built distributed compute, consensus, and storage systems.

Carlos Villavieja, Google

Carlos Villavieja is a Computer Architect/Researcher working as a Software/Site Reliability Engineer at Google. He works on storage optimizations, and his interests range from microarchitecture to machine learning.

BibTeX
@conference {214969,
author = {Salim Virji and Carlos Villavieja},
title = {Productionizing {Machine-Learning} Services: Lessons from Google {SRE}},
year = {2018},
publisher = {USENIX Association},
month = jun
}