Comprehensive introduction to data analysis with the Python programming language.
Sunday, June 11, 2017 at 01:00 PM    Cost: $1590
NYC Data Science Academy, 500 8th Ave, Ste 905
 
     
 
 
              

              
 
DESCRIPTION
This class is a comprehensive introduction to data analysis with the Python programming language. It targets people who have some basic knowledge of programming and want to take it to the next level. It introduces how to work with different data structures in Python and covers the most popular data analytics and visualization modules, including NumPy, SciPy, pandas, matplotlib, and seaborn. We use the IPython Notebook to demonstrate the results of code and to change code interactively throughout the class.

Prerequisites

Some rudimentary knowledge of programming


Syllabus

Unit 1: Introduction to Python
Python is a high-level programming language. You will learn the basic syntax and data structures in Python. We demonstrate and run code within the IPython Notebook, a tool that provides a robust and productive environment for interactive and exploratory computing.

Introduction to the IPython Notebook
Basic objects in Python
Variables and user-defined functions
Control flow
Data structures
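
As a rough illustration of the Unit 1 topics (not taken from the course materials), the sketch below defines a function, uses control flow, and works with a list and a dictionary. The function name `word_lengths` and the sample words are made up for this example.

```python
def word_lengths(words):
    """Map each non-empty word to its length (hypothetical helper)."""
    lengths = {}                 # dictionary: word -> length
    for word in words:           # loop over a list
        if not word:             # skip empty strings
            continue
        lengths[word] = len(word)
    return lengths


sample = ["numpy", "", "pandas", "scipy"]
result = word_lengths(sample)
print(result)
```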

Unit 2: Explore Deeper with Python
Python is an object-oriented programming (OOP) language. Having some basic knowledge of OOP will help you understand how Python code works. More often than not, you will have to deal with data that is dirty and unstructured. You will learn several ways to clean your data, such as applying regular expressions.

Introduction to object-oriented programming
Working with files
Running Python scripts
Handling and processing strings
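
To give a flavor of cleaning dirty data with regular expressions, here is a minimal sketch using Python's standard `re` module. The helper name `clean_price` and the sample strings are assumptions made for illustration; they do not come from the course.

```python
import re


def clean_price(raw):
    """Extract a number from a messy price string; return None if none is found.

    Hypothetical helper: commas are dropped first, then the first run of
    digits (with an optional decimal part) is converted to a float.
    """
    match = re.search(r"\d+(?:\.\d+)?", raw.replace(",", ""))
    return float(match.group()) if match else None


raw_values = ["$1,590", " 20 USD", "n/a", "12.50 "]
cleaned = [clean_price(value) for value in raw_values]
print(cleaned)
```

Returning `None` for unparseable entries keeps missing values visible, so they can be handled explicitly later rather than silently turned into zeros.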

Unit 3: Scientific Computation Tools
Two modules for scientific computation make Python powerful for data analysis: NumPy and SciPy. NumPy is the fundamental package for scientific computing in Python. SciPy is an expanding collection of packages addressing scientific computing.

NumPy
SciPy
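
As a small sketch of what array-based computing looks like, the example below standardizes a NumPy array without writing a loop. It assumes NumPy is installed; the sample values are invented for illustration.

```python
import numpy as np

values = np.array([2.0, 4.0, 6.0, 8.0])

# Arithmetic applies element-wise; the scalar mean is broadcast across the array.
centered = values - values.mean()

# np.std defaults to the population standard deviation (ddof=0).
scaled = centered / values.std()

print(centered)
print(scaled)
```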

Unit 4: Data Visualization
Python can also generate graphics easily using “Matplotlib” and “Seaborn”. Matplotlib is the most popular Python library for producing plots and other 2D data visualizations. Seaborn is a Python visualization library built on Matplotlib that provides a high-level interface for drawing statistical graphics.

Seaborn
Matplotlib

Unit 5: Data Manipulation with Pandas
Pandas provides rich data structures and functions for working with structured data. The “DataFrame” object in Pandas is similar to the “data.frame” object in R. Pandas makes data manipulation (filter, select, group, aggregate, etc.) as easy as in R.

Pandas
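
The filter, select, group, and aggregate operations mentioned above might look like the following sketch. It assumes pandas is installed; the column names and values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["NYC", "NYC", "Boston", "Boston"],
    "attendees": [30, 45, 12, 20],
    "price": [100, 150, 80, 90],
})

# Filter: keep rows with at least 20 attendees.
big = df[df["attendees"] >= 20]

# Select: keep only the columns of interest.
subset = big[["city", "price"]]

# Group and aggregate: mean price per city.
summary = subset.groupby("city")["price"].mean()

print(summary)
```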

Final Project

After 20 hours of structured lectures, students are encouraged to work on an exploratory data analysis project based on their own interests. A project presentation demo will be arranged afterwards.
 
 
 
 
© 2017 GarysGuide