Monday, May 20 • 10:50am - 11:10am
Accelerating Large Scale Deep Learning Inference through DeepCPU at Microsoft


Deep learning models have brought significant improvements to many Microsoft services and products. In this paper, we share our experience and methodology in developing and applying the DeepCPU library to serve deep learning models in production at large scale, with remarkable latency improvement and infrastructure cost reduction. We describe two ways to use the library, customized optimization and framework integration, each targeting different scenarios.

Speakers

Minjia Zhang

Microsoft AI and Research

Samyam Rajbhandari

Microsoft AI and Research

Wenhan Wang

Microsoft AI and Research

Elton Zheng

Microsoft

Olatunji Ruwase

Microsoft AI and Research

Jeff Rasley

Microsoft AI and Research

Jason Li

Microsoft

Junhua Wang

Microsoft

Yuxiong He

Microsoft


Monday May 20, 2019 10:50am - 11:10am
Winchester Room
