


 

DESCRIPTION 
This 35-hour course introduces both the theoretical foundations of machine learning algorithms and their practical application in R. It covers data mining, performance measures & dimension reduction, linear & generalized linear regression models, kNN & Naïve Bayes models, tree models, & SVMs, as well as association rule analysis. After successfully completing this course, you will be able to break down the mathematics behind major machine learning algorithms, explain the principles of these algorithms, & implement these methods to solve real-world problems.
Prerequisites
Knowledge of R programming
Able to munge, analyze, & visualize data in R
Syllabus
Unit 1: Foundations of Statistics & Simple Linear Regression
Understand your data
Statistical inference
Introduction to machine learning
Simple linear regression
Diagnostics & transformations
The coefficient of determination
Unit 2: Multiple Linear Regression & Generalized Linear Model
Multiple linear regression
Assumptions & diagnostics
Extending model flexibility
Generalized linear models
Logistic regression
Maximum likelihood estimation
Model interpretation
Assessing model fit
Unit 3: kNN & Naive Bayes, the Curse of Dimensionality
The K-Nearest Neighbors Algorithm
The choice of K & distance measure
Conditional probability: Bayes' Theorem
The Naive Bayes Algorithm
The Laplace estimator
Dimension reduction
The PCA procedure
Ridge & Lasso regression
Cross-validation
Unit 4: Tree Models & SVMs
Decision trees
Bagging
Random forests
Boosting
Variable importance
Hyperplanes & maximal margin classifier
Soft margin & support vector classifier
Kernels & support vector machines
Unit 5: Cluster Analysis & Neural Networks
Cluster analysis
K-means clustering
Hierarchical clustering
Neural networks & perceptrons
Sigmoid neurons
Network topology & hidden features
Backpropagation learning with gradient descent
Final Project
After 35 hours of structured lectures, students are encouraged to work on an exploratory data analysis project based on their own interests. A project presentation demo will be arranged afterwards.





