Please RSVP for this class at http://nycdatascience.com/course/hadoop-data-analytic-platform/
Sign up for the newsletter for free Data Science learning material and upcoming classes at http://nycdatascience.com/
Sign up for the NYC Open Data Meetup for free workshops twice a week!
Instructor: Vivian Zhang, CTO of SupStat Inc, Founder of NYC Data Science Academy
Length of class: 35 hours
Dates: April 23rd, 25th, 30th, May 2nd, 7th, 9th, 14th, 16th (total 4 weeks, 8 nights on Wednesday and Friday)
Time: 6:30pm – 9:30pm
Extra teaching: 1 hour video / week for 3 weeks
NYC Data Science Academy is pleased to offer our introductory Hadoop class: a 5-week intensive weekend course that will bring your basic understanding of Hadoop to a professional level. The Hadoop Data Analytic Platform course aims to provide a firm foundation for a professional career using Hadoop.
Potential students should be familiar with Java and basic Linux commands.
What is Hadoop?
Hadoop is an open-source framework for the distributed storage and parallel processing of large data sets across clusters of computers. Built on the MapReduce programming model (introduced by Google) and the Hadoop Distributed File System (HDFS), Hadoop offers scalability, flexibility, and fault tolerance. Hadoop is optimized to handle massive quantities of structured, semi-structured, or unstructured data, which makes it well suited to Big Data workloads.
Hadoop is an Apache project, and a number of complementary Apache projects, such as Hive, Pig, and ZooKeeper, further extend its applications and usability.
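To make the MapReduce idea concrete, here is a minimal sketch of the classic word-count example in plain Java. It does not use the Hadoop API and runs on a single machine; the class and method names (WordCountSketch, mapLine, shuffle, reduce, wordCount) are illustrative assumptions, not Hadoop identifiers. On a real cluster, Hadoop would run the map and reduce steps in parallel across many nodes and handle the grouping step itself.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Single-machine illustration of the map -> shuffle -> reduce flow.
// This is NOT the Hadoop API; all names here are hypothetical.
public class WordCountSketch {

    // Map phase: turn one input line into a list of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> mapLine(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key (the word).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values for one key into a single count.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three phases in sequence over a small in-memory input.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(mapLine(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("big data big clusters", "data everywhere");
        // Prints {big=2, clusters=1, data=2, everywhere=1}
        System.out.println(wordCount(lines));
    }
}
```

Because each line is mapped independently and each word is reduced independently, both steps can be spread across machines, which is the property that lets this model scale.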
Project Demo Day and Certificates
The course takes you from building your first cluster to enterprise-level
applications of Hadoop clusters, and ends with Project Demo Day.
On Demo Day you will showcase a project of your choosing, using the tools and skills
taught throughout the course. We encourage you to be creative!
After successfully completing the course, you will qualify for one of three certificates:
Extraordinary Standing, Honorable Graduation, and Active Participation.
Certificates are awarded according to your understanding, skill, and participation.
Week 1: Introduction to the origins and architecture of Hadoop; building a Hadoop cluster
Week 2: Principles and operation of the Hadoop Distributed File System (HDFS); HDFS API programming
Week 3: Principles, architecture, and working mechanism of MapReduce; Hadoop data flow; MapReduce programming practice; connecting Eclipse to a Hadoop cluster
Week 4: Advanced Hadoop applications; installing and using Pig; architecture and installation of Hive; using HiveQL; data mining with Mahout
Week 5: Architecture of HBase and ZooKeeper; installing and managing HBase; the HBase data model; analysis of applications