


 

DESCRIPTION 
This 35-hour course introduces both the theoretical foundations of machine learning algorithms and the practical application of machine learning techniques in R. It will introduce you to data mining, performance measures and dimension reduction, regression models (both linear and generalized), KNN and Naive Bayes models, tree models, and SVMs, as well as association rule analysis. After successfully completing this course, you will be able to break down the mathematics behind major machine learning algorithms, explain their underlying principles, and implement these methods to solve real-world problems.
Prerequisites
Knowledge of R programming
Able to munge, analyze, and visualize data in R
Syllabus
Unit 1: Foundations of Statistics and Simple Linear Regression
Understand your data
Statistical inference
Introduction to machine learning
Simple linear regression
Diagnostics and transformations
The coefficient of determination
Unit 2: Multiple Linear Regression and Generalized Linear Models
Multiple linear regression
Assumptions and diagnostics
Extending model flexibility
Generalized linear models
Logistic regression
Maximum likelihood estimation
Model interpretation
Assessing model fit
Unit 3: kNN and Naive Bayes, the Curse of Dimensionality
The K-Nearest Neighbors Algorithm
The choice of K and distance measure
Conditional probability: Bayes' Theorem
The Naive Bayes Algorithm
The Laplace estimator
Dimension reduction
The PCA procedure
Ridge and Lasso regression
Cross-validation
Unit 4: Tree Models and SVMs
Decision trees
Bagging
Random forests
Boosting
Variable Importance
Hyperplanes and maximal margin classifier
Soft margin and support vector classifier
Kernels and support vector machines
Unit 5: Cluster Analysis and Neural Networks
Cluster analysis
K-means clustering
Hierarchical clustering
Neural networks and perceptrons
Sigmoid neurons
Network topology and hidden features
Backpropagation learning with gradient descent
Final Project
After 35 hours of structured lectures, students are encouraged to work on an exploratory data analysis project based on their own interests. A project presentation demo will be arranged afterwards.





