EVENT DETAILS
About This Event
General Assembly & Dataiku will be hosting Brian Lavery & John Paletto of the New York Times to discuss tools for productionalizing Data Science.
Tentative Schedule:
6:30pm: Pizza + Beer networking
7:00pm: TBD with Data Scientist at Dataiku
7:30pm: Intro to Airflow for Data Analysts & Data Scientists with Brian Lavery, Senior Data Engineer, & John Paletto, Data Scientist, from The New York Times
Talk Abstracts:
Why do Data Scientists & Engineers at the New York Times use Apache Airflow to chain together their batch jobs into workflows & how well does it scale? What is Airflow's role in productionalizing models & what other tools come into play for training models? And if you want to get started with Airflow, what are your options & how hard is the system to maintain?
Preparation
None!
About the Panelists
Brian Lavery
Senior Data Engineer,
New York Times
Brian Lavery is a Senior Data Engineer at the New York Times. He currently co-hosts the NYC Apache Airflow Meetup Group. His IT career has spanned 20 years, but he has been in the data engineering world for the past 12, most of that at the New York Times. During his time there, the Times has moved off of star schemas on relational databases & into the big data world. The Times has tried a lot of technologies, so Brian has gotten to play with a lot of different tools: from EMR & Redshift on AWS, to an in-house Hadoop cluster, to where the Times is now, on BigQuery & Airflow on Google Cloud Platform.
John Paletto
Data Scientist,
The New York Times
John Paletto is a Data Scientist at the New York Times. He has over 3 years of experience working in & deploying data science at scale. Prior to The Times, his work focused on applying data science to predictive maintenance in the fields of aerospace & performance materials. Since joining The Times, he has worked on machine learning data products for advertising, subscription growth, & print distribution. He currently spends most of his time in Python, Airflow, & Google Cloud Platform. He enjoys most sports, all ice cream & lots of cookies (preferably together).
About Our Partners
Dataiku
Dataiku DSS is the collaborative data science software platform for teams of data scientists, data analysts, & engineers to explore, prototype, build, & deliver their own data products more efficiently.