Schedule:
6:30pm: Pizza + Beer networking
7:00pm: 10-minute talks from NVIDIA, Bloomberg, & Dataiku
7:30pm: Open Q&A
Putting models into production is often seen as the completion of machine learning projects - but what happens post-deployment? This meetup will focus on this often underappreciated (and unpredicted) side of machine learning, addressing how models evolve & tackling the organizational & engineering challenges of maintenance, such as managing technical debt & compiling complexity.
In a series of 10-minute talks with NVIDIA, Bloomberg, & Dataiku, we will discuss different industry approaches to model maintenance. These talks will be followed by a Q&A panel with all of the speakers - come ready with a question or two!
Abstracts:
Challenges & learnings to bring research into production by Nicolas Koumchatzky, NVIDIA:
As a manager of a large team working on ML problems, one issue I encountered was the difficulty of translating valuable research work into production systems. Researchers need flexibility & fast iteration, while production systems need safety, scale & robustness. In this talk, we will go over a few of the experiences I had over the years, & what I tried to do to improve the situation, with varying degrees of success.
System Design v. Infrastructure by Michael Burkholder, Bloomberg:
When developing machine learning models for production, there is no one-size-fits-all recipe for system design. It's important to consider the engineering & product objectives early on in the model development lifecycle, & to spend effort developing supporting infrastructure. In this talk, I will give an anecdotal case study illustrating the modeling tradeoffs faced while developing an ML platform to compute the prices of financial instruments, & the infrastructure requirements to support the platform.
Exploring & Preventing Technical Debt by Patrick Masi-Phelps, Dataiku:
Patrick will discuss the concept of technical debt in production machine learning projects, a concept refined in 2015 by Google researchers (Sculley et al.), building on the software engineering concept introduced in the 1990s. Taking the extra time to simplify pipelines & account for changes in model inputs & configuration parameters can save time & mitigate risks of models in production. He'll talk theory, then present a couple of examples from clients in healthcare & aviation.
Bios:
Nicolas Koumchatzky started as a Quant, using models to evaluate the price of complex financial derivatives. He quickly joined a startup called DerivExperts in Paris to deliver that service to third-party buyers. After spending 5 years there as a manager, he embarked on another startup adventure at Madbits, focused on deep learning for image+text search, which was promptly acquired by Twitter. There, he developed deep learning models for image & spam filtering, moving on to create the first iteration of Twitter's first deep learning platform, DeepBird. He then became a manager for the Twitter Cortex team, developing the ML platform with automation, better recommender systems & an improved version of the deep learning platform. A year ago, he joined NVIDIA as a Director of AI Infrastructure to build an ML platform to develop self-driving cars.
Michael Burkholder received his PhD in Mechanical Engineering from Carnegie Mellon University, studying nonlinear, chaotic, & stochastic electrochemical systems. He leads an ML team at Bloomberg LP developing high-performance models & infrastructure to power Bloomberg's risk analysis engine. Michael enjoys roasting his own coffee & listening to vinyl.
Patrick Masi-Phelps is a Data Scientist at Dataiku, where he helps clients build & deploy predictive models. Before joining Dataiku, he studied math & economics at Wesleyan University & was a fellow at NYC Data Science Academy. Patrick is always keeping up with the latest ML techniques in astronomical & public policy research.