Monday, May 20 • 2:30pm - 2:50pm
The Power of Metrics—How to Monitor and Improve ML Efficiency

This talk presents an ML operations tool that grew out of the rapid growth of ML training workloads, the need for visibility into performance issues, and the search for best practices in ML training. The tool helps make the most of limited computing resources and helps ensure that production models are efficient, reliable, and scalable. You will learn our motivation for developing the tool, the challenges we faced, its main features and use cases, and how diverse users have leveraged it in their work to improve productivity.


Yan Yan

Yan Yan is a production engineer at Facebook. She belongs to the Ads Ranking PE team, which improves efficiency, reliability, and scalability for machine learning at Facebook. Her mission is to share her knowledge to help society by anticipating ML problems and solve the existing...

Zhilan Zweiger

Zhilan Zweiger is a staff engineer and tech lead on the Production Engineering team at Facebook. She is primarily responsible for the reliability, efficiency, and scalability of the Ads Machine Learning infrastructure stack. Before that, she worked at Twitter in the Data Platform SRE...

Monday May 20, 2019 2:30pm - 2:50pm PDT
Stevens Creek Room