Please RSVP for this class at http://nycdatascience.com/course/hadoop-data-analytic-platform/
Sign up for the newsletter for free Data Science learning material & upcoming classes at http://nycdatascience.com/
Sign up for NYC Open Data Meetup for free workshops twice/week!
Instructor: Vivian Zhang, CTO of SupStat Inc, Founder of NYC Data Science Academy
Length of class: 35 hours
Dates: April 23rd, 25th, 30th, May 2nd, 7th, 9th, 14th, 16th (total 4 weeks, 8 nights on Wednesday & Friday)
Time: 6:30pm - 9:30pm
Extra teaching: 1 hour video / week for 3 weeks
NYC Data Science Academy is pleased to offer our introductory Hadoop class: a 5 week intensive weekend course that will bring your basic understanding of Hadoop to a professional level. Hadoop Data Analytic Platform aims to provide a firm foundation for the start of a professional career using Hadoop.
Potential students should be familiar with Java & basic Linux commands.
What is Hadoop?
Hadoop is an open source software framework for storing & processing large data sets in parallel across clusters of machines. Built on the MapReduce programming model (popularized by Google) & the Hadoop Distributed File System (HDFS), Hadoop offers scalability, flexibility & fault tolerance. It is designed to handle massive quantities of structured, semi-structured, or unstructured data, which makes it well suited to Big Data.
Hadoop is an Apache Software Foundation project, & a range of complementary Apache projects such as Hive, Pig & ZooKeeper further extend its applications & usability.
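For readers new to the MapReduce model mentioned above, here is a minimal sketch of its three phases (map, shuffle, reduce) as a word count. It is plain single-machine Python rather than Hadoop code, and the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. On a real cluster the framework runs the map and reduce steps in parallel across many machines and handles the grouping itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between map and reduce.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data is big", "hadoop handles big data"]
    print(word_count(sample))
```

Because each line is mapped independently and each key is reduced independently, both steps can be split across machines, which is where the scalability comes from.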
Project Demo Day & Certificates
The course takes you from building your first cluster to enterprise-level application of Hadoop clusters, and ends with Project Demo Day.
On Demo Day you will showcase a project of your choosing, using the tools & skills
taught throughout this course. We encourage you to be creative!
After the successful completion of the course, you will qualify for one of three certificates:
Extraordinary Standing, Honorable Graduation, & Active Participation.
Certificates are awarded according to your understanding, skill, & participation.
Week 1: Origins & architecture of Hadoop; building a Hadoop cluster
Week 2: Principles & operation of the Hadoop Distributed File System (HDFS); HDFS API programming
Week 3: Principles, architecture & working mechanism of MapReduce; Hadoop data flow; MapReduce programming practice; connecting Eclipse to a Hadoop cluster
Week 4: Advanced Hadoop applications; installing & using Pig; architecture & installation of Hive; using HiveQL; data mining with Mahout
Week 5: Architecture of HBase & ZooKeeper; installing & managing HBase; the HBase data model; application analysis