Scikit-learn is a machine learning library for Python that has become a valuable tool for many data science practitioners.
This talk will cover some of the more advanced aspects of scikit-learn, such as building complex machine learning pipelines, model evaluation, parameter search, and out-of-core learning.
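As a rough illustration of what pipelines and parameter search look like in practice, here is a minimal sketch. It is not taken from the talk material: the synthetic dataset, the step names "scale" and "clf", and the searched values of C are assumptions chosen for illustration, and the imports follow the current scikit-learn module layout (sklearn.model_selection).

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only so the sketch is self-contained (an assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# A pipeline chains preprocessing and a model, so that the scaler is re-fit
# on the training part of every cross-validation split.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Grid search: evaluates every listed value of C (4 candidates here).
grid = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1, 10]}, cv=3)
grid.fit(X, y)

# Randomized search: samples a fixed number of candidates (n_iter) from a
# distribution instead of enumerating a grid.
rand = RandomizedSearchCV(
    pipe,
    {"clf__C": loguniform(1e-3, 1e2)},
    n_iter=4,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

print("grid search best params:", grid.best_params_)
print("randomized search best params:", rand.best_params_)
```

The cost of grid search grows with the product of the grid sizes across all parameters, while randomized search keeps the budget fixed at n_iter regardless of how many parameters are searched.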
Apart from metrics for model evaluation, we will cover how to evaluate model complexity and how to tune parameters with grid search and randomized parameter search, including their trade-offs. We will also cover out-of-core text feature processing via feature hashing.
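Out-of-core text processing via feature hashing can be sketched roughly as follows. This is an illustrative assumption, not the talk's own code: stream_batches is a hypothetical stand-in for reading chunks of documents from disk, and the tiny texts, labels, and n_features value are made up for the example.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier


def stream_batches():
    """Hypothetical batch source; a real one would read chunks from disk."""
    yield ["good great fine", "bad awful poor"], [1, 0]
    yield ["great good nice", "poor bad terrible"], [1, 0]


# HashingVectorizer is stateless: it maps tokens to column indices with a
# hash function, so no vocabulary has to be built or kept in memory.
vectorizer = HashingVectorizer(n_features=2**10)

# SGDClassifier supports incremental learning through partial_fit.
clf = SGDClassifier(random_state=0)

for texts, labels in stream_batches():
    X = vectorizer.transform(texts)  # no fit step is needed
    # The full set of classes must be known up front for partial_fit.
    clf.partial_fit(X, labels, classes=[0, 1])

pred = clf.predict(vectorizer.transform(["good great"]))
print(pred)
```

Because the vectorizer keeps no state, each batch can be transformed independently, which is what allows training on data that does not fit in memory. The trade-off is that hashed column indices cannot be mapped back to the original tokens.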
---------------------------------------------------------
Andreas is an Assistant Research Scientist at the NYU Center for Data Science, building a group to work on open source software for data science. Previously he worked as a Machine Learning Scientist at Amazon, working on computer vision and forecasting problems. He is one of the core developers of the scikit-learn machine learning library and maintained it for several years.
Material will be posted here:
https://github.com/amueller/pydata-nyc-advanced-sklearn
Blog:
peekaboo-vision.blogspot.com
Twitter:
https://twitter.com/t3kcit