Get a basic understanding of data analysis techniques & tackle problems involving multi-dimensional data.
Mon, Mar 19, 2018 @ 08:30 AM   $1900   Online
 
   
 
 
              

    
 
EVENT DETAILS
Introduction to Data Science Overview

Data science has become the central approach to tackling data-heavy problems in both business & academia. In this course, students learn how data science is done in the wild, with a focus on data acquisition, cleaning, & aggregation, exploratory data analysis & visualization, feature engineering, & model creation & validation. Students use the Python scientific stack to work through real-world examples that illustrate these concepts. Concurrently, students learn some of the statistical & mathematical foundations that power the data-scientific approach to problem solving.

Who is this course for?

Introduction to Data Science is for anyone with a basic understanding of data analysis techniques who wants to improve their ability to tackle problems involving multi-dimensional data in a systematic, principled way. Familiarity with a programming language is helpful but not required, provided the pre-work for the course is completed (more on that below). No advanced mathematical training beyond an introductory statistics course is necessary.

Prerequisites

Students should have some experience with Python & have some familiarity with basic statistical & linear algebraic concepts such as mean, median, mode, standard deviation, correlation, & the difference between a vector & a matrix. In Python, it will be helpful to know basic data structures such as lists, tuples, & dictionaries, & what distinguishes them (that is, when they should be used).

Students should skip the pre-work if they can accomplish all of the following:

Write a program in Python that finds the most frequently occurring word in a given sentence.
Explain the difference between correlation & covariance, & why the difference between the two terms matters.
Multiply two small matrices together (e.g., a 3×2 matrix & a 2×4 matrix).

Otherwise, students should complete the following pre-work (approximately 8 hours) before the first day of class:

Exercises 1-7, 13, 18-21, 27-35, 38, & 39 of Learn Python The Hard Way.
Videos 1-6 of the Linear Algebra review from Andrew Ng's Machine Learning course (labeled "III. Linear Algebra Review (Week 1, Optional)").
The exercises in Chapters 2 & 3 of OpenIntro Statistics.
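
As a rough gauge of the level the three self-check tasks above are aiming at, here is a minimal Python sketch using only the standard library. It is one possible approach, not an official course solution; the function names (most_frequent_word, covariance, correlation, matmul) are illustrative assumptions, & the choices to ignore case & punctuation & to use the sample (n - 1) form of covariance are assumptions as well.

```python
from collections import Counter
import math
import re


def most_frequent_word(sentence):
    """Return the most common word, ignoring case and punctuation (an assumption)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if not words:
        raise ValueError("sentence contains no words")
    # most_common(1) returns [(word, count)]; ties go to the word seen first.
    return Counter(words).most_common(1)[0][0]


def covariance(xs, ys):
    """Sample covariance: its value carries the units of x times the units of y."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance rescaled to a unitless value in [-1, 1]."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


def matmul(a, b):
    """Multiply an m x n matrix by an n x p matrix, each given as a list of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
```

For x = [1, 2, 3] & y = [2, 4, 6], covariance returns 2.0 & correlation returns 1.0. Multiplying every x value by 100 raises the covariance to 200.0 but leaves the correlation at 1.0, which is why the distinction matters: covariance depends on the scale of the data, while correlation does not. Calling matmul on a 3×2 matrix & a 2×4 matrix yields a 3×4 result.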

Outcomes

Upon completing the course, students have:
An understanding of problems solvable with data science & an ability to attack those problems from a statistical perspective.
An understanding of when to use supervised & unsupervised statistical learning methods on data-rich problems with labeled & unlabeled data.
The ability to create data analytical pipelines & applications in Python.
Familiarity with the Python data science ecosystem & the various tools one can use to continue developing as a data scientist.
 
 
 
 