Thanks for your interest in this Dataiku NYC Meetup! The health & safety of our attendees & speakers is our primary concern. While this is currently a tricky time for public gatherings, Dataiku is still committed to providing great tech content & facilitating discussions in the data science space. As such, we've decided to pivot towards online webinars via our partner platform, BrightTalk.
IMPORTANT - RSVP HERE:
Tentative Schedule: (EST)
7:05pm: Optimizing Performance of ML Models Through a Bayesian Lens with Tripadvisor
Bayesian Imputation of Missing Feature Values in Product Sort & Recommendation at Tripadvisor
Do you encounter missing values in your model features, but don't give them much thought? I have two goals in this talk: 1) use my work with sort algorithms at Tripadvisor to show how ad-hoc imputation of missing values severely hurts the performance of real-world ML models, & 2) cast the missing value problem as a probabilistic model which one can solve through Bayesian inference. I will end by showing that the most widely used missing value imputation technique in the statistics community, Multiple Imputation by Chained Equations (MICE), which scikit-learn implements in its IterativeImputer, can be better understood as approximate Bayesian inference in a simple probabilistic model.
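To make the first goal concrete, here is a minimal sketch on synthetic data. It is not the speaker's method and not MICE itself: it only compares filling a missing feature with its observed mean against predicting it from a correlated feature with a single least-squares regression, loosely in the spirit of one chained-equation step. The data, the 30% missingness rate, and the helper names (`fit_line`, `rmse`) are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def mean(values):
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(pred, truth):
    """Root mean squared error between imputed and true values."""
    return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)) ** 0.5


rng = random.Random(0)

# Synthetic data (assumption): feature y depends strongly on feature x.
xs = [rng.uniform(0, 10) for _ in range(500)]
ys = [2.0 * x + rng.gauss(0, 1) for x in xs]

# Hide roughly 30% of the y values completely at random.
missing = [rng.random() < 0.3 for _ in xs]
obs_x = [x for x, m in zip(xs, missing) if not m]
obs_y = [y for y, m in zip(ys, missing) if not m]
mis_x = [x for x, m in zip(xs, missing) if m]
true_y = [y for y, m in zip(ys, missing) if m]

# Ad-hoc imputation: every missing y gets the mean of the observed y values.
mean_fill = [mean(obs_y)] * len(true_y)

# Model-based imputation: regress y on x using the complete rows only.
a, b = fit_line(obs_x, obs_y)
model_fill = [a + b * x for x in mis_x]

rmse_mean = rmse(mean_fill, true_y)
rmse_model = rmse(model_fill, true_y)
print(f"mean imputation RMSE:       {rmse_mean:.2f}")
print(f"regression imputation RMSE: {rmse_model:.2f}")
```

In this toy setup the mean fill ignores the relationship between the two features, so its error is driven by the full spread of y, while the regression fill is limited mainly by the noise. A full multiple-imputation procedure would additionally draw several imputations from a predictive distribution rather than plugging in a single point estimate.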
This talk will have content that should appeal to data- & ML-related researchers of all skill levels. For beginning data-related practitioners, part 1 of my talk will demonstrate why it is important to think about missing values carefully during feature engineering & how to examine their role in a model's predictive performance. For more experienced attendees, part 2 of my talk will try to build a bridge between the statistical literature on missing value imputation & the world of the machine learning practitioner through a Bayesian lens.
Narendra is a longtime Bayesian interested in the connections between statistics, causal inference & machine learning. Currently, he is a Machine Learning Scientist at Tripadvisor based at their global headquarters in Needham, MA. His work at Tripadvisor spans the entire range of customer-centric ML problems from recommendation engines to building probabilistic models of user-generated content creation. Before Tripadvisor, Narendra obtained his PhD in systems neuroscience from Brandeis University where he developed probabilistic latent variable models of stimulus coding in the brain. He got into the world of Bayesian machine learning during his PhD, & has been in love with that world ever since! Outside of Bayes & ML, he is an avid cyclist & has explored much of the northeastern US on his bike. To learn more about Narendra, visit his webpage: https://narendramukherjee.github.io
Disclaimer: All views, thoughts, & opinions expressed in the webinar belong solely to the panelists, & not to the panelists' employer, organization, committee, or any other group or individual.