In this meetup, we will focus on the art & science of doing Machine Learning on Big Data. We will have talks on best practices for ML models, & then dive deep into what a scalable ML infra looks like. It's an evening not to be missed!
Food & Drinks sponsored by Lyft
6:00 - 6:30 pm: Check in, food, networking
6:30 - 6:35 pm: Intros
6:35 - 8:30 pm: 3 Talks
8:30 - 8:45 pm: Wrap up
Important Note: You must register for the event (free) on ti.to before the event. You will then be sent an eNDA, which needs to be signed 24 hours before the event for security reasons. A pre-printed badge will be ready for you when you arrive at the event. Please register here (https://ti.to/big-data/machine-learning-on-big-data/with/pv-t9pxogse). If for some reason you are not able to sign the eNDA online, you can still attend; however, you may have to wait in a long line at the sign-in desk.
Talk #1: Ridesharing - Accounting for uncertainty in dispatch decisions to optimize marketplace balance
Dispatch is one of the most powerful levers to optimize a two-sided marketplace of physical goods, as it is able to use rider payments to reallocate supply within a network. However, uncertainty of user behavior, such as riders canceling or drivers rejecting dispatches, makes achieving perfect optimality a challenge.
In this talk, Parker discusses how Lyft has accounted for uncertainty in ride-sharing networks to achieve better overall outcomes. This talk will dive into modeling challenges with sparsity & non-continuity of various ML models, preventing moral hazard in user behavior from these assumptions, & understanding the biases different model assumptions have on the overall objective.
Parker Spielman has extensive experience in ridesharing, at Lyft & previously at Uber, where he has worked on a variety of problems including dynamic pricing, dispatch, & incentives. All of these areas contribute to a set of levers focused on better overall control systems for real-time marketplaces.
Talk #2: More Data Science with Less Engineering: ML Infrastructure at Netflix
Netflix is known for its unique culture that gives an extraordinary amount of freedom & responsibility to individual engineers & data scientists. Our data scientists are expected to develop & operate large machine learning workflows autonomously. However, we do not expect all our scientists to be deeply experienced with systems or data engineering. Instead, we provide them with delightfully usable machine learning infrastructure that they can use to manage the whole lifecycle of a data science project.
In this talk, we will share the key concepts that have made our ML infrastructure successful at Netflix.
Ville Tuulos manages the machine learning infrastructure team at Netflix. Prior to Netflix, Ville spent over a decade designing & leading ML & data infrastructure efforts at various startups & large companies in the Bay Area, with a particular focus on human-centric tooling.
Talk #3: Machine learning & large-scale data analysis on a centralized platform at Walmart
In this talk, the speakers explore the design of a centralized risk & abuse management platform & how this highly sophisticated platform enables dynamic & complex analytics of large-scale data from different domains. They share a study of protecting customer accounts by linking customer behaviors across purchases, returns, & financial services.
You'll get an introduction to the Walmart risk & abuse management platform, risk & abuse problems in the Walmart ecosystem, the data-driven analytics & advanced machine learning algorithms used to defend against fraud & abuse, & case studies of customer account protection.
James Tang is a senior director of engineering at Walmart Labs. Yiyi Zeng is a senior manager & principal data scientist at Walmart Labs. Linhong Kang is a manager & staff data scientist at Walmart Labs.