Monday, May 20 • 4:40pm - 5:00pm
Fast, Reliable, Yet Catastrophically Failing!?! Safely Avoiding Incidents When Putting Machine Learning into Production


Safely releasing machine-learning-based services into production presents a host of challenges that even the most experienced SRE may not expect. We'll outline some severe outages seen in the wild and their causes, then detail how emerging, cutting-edge techniques from the DevOps and SRE world around "testing in prod", progressive delivery, and deterministic simulation are the PERFECT solution for increasing safety, resilience, and confidence for SREs operating and managing ML-based services at scale.


Ramin Keene

Ramin has spent the last 5 years working with data teams and large enterprises to put machine learning, A/B testing, and data science products into production. He’s made ALL the mistakes and then some, helping companies lose thousands, if not millions, of dollars along the way...

Monday May 20, 2019 4:40pm - 5:00pm
Lawrence/San Tomas/Lafayette Rooms
