Instructor: Paul Trowbridge
Dates: July 19th, 26th, August 2nd, 9th, 16th
Our Venue: 500 7th Ave, 17th floor, New York, NY
(close to Times Square, between 37th and 38th street)
This course introduces statistical computing and statistical modeling with C++. All of the computational work will be programmed in C++, and the compiled code will then be linked into R functions. Students gain experience coding statistical models in a compiled language, while end users keep the familiar R interface for working with the code. All statistical and numerical methods are introduced through contemporary examples. Each week, students will complete homework assignments based on contemporary applications, analyzing real datasets and coding the analysis in C++ to practice the skills introduced in that week's session. Students will also complete a course project of their choosing: each student identifies a topic or problem of interest and applies the skills and concepts taught in the course to address it. Students are encouraged to be creative with their projects, and the instructor will provide feedback as they work to complete them.
After completing the course, students will have a solid understanding of the core statistical modeling methods most commonly encountered in applied work, and will know how to code these models from scratch.
- Eddelbuettel, Dirk (2013). Seamless R and C++ Integration with Rcpp. New York: Springer. ISBN: 978-1-4614-6867-7.
- Monahan, John F. (2011). Numerical Methods of Statistics. 2nd Edition. Cambridge, UK: Cambridge University Press, pp. xiv + 428. ISBN: 0-521-79168-5/hbk. DOI: 10.1017/CBO9780511812231.
- Press, W. H. et al. (2007). Numerical Recipes: The Art of Scientific Computing. 3rd Edition. Cambridge, UK: Cambridge University Press.
Week 1: Introduction to C and C++
- Introduction to the Course
- Creating R packages
- Introduction to the .C and .Call interfaces in R
- Review of Probability for Statistical Modeling
- Linear regression
- Non-parametric regression via splines
- Solving linear systems
- Computing matrix inverse
- Least squares fit
Week 2: Maximum Likelihood Estimation and Non-Linear Models
- Generalized linear models
- Non-linear regression models
- Numerical Differentiation
- Non-linear Optimization
- Fisher Scoring algorithm
Week 3: Numerical Integration and Generalized Linear Mixed Models
- Generalized linear mixed models
- Laplace method and Quadrature
- Numerical Integration
Week 4: Monte Carlo Methods; Hypothesis testing and Goodness-of-fit
- Network analysis; testing hypotheses about network characteristics
- Evaluating Goodness-of-fit when Chi-Square assumptions are violated
- Monte Carlo Integration
Week 5: Markov Chain Monte Carlo
- Gaussian Copula models
- Discrete Choice models with random coefficients
- Markov chains
- Gibbs sampler
- Metropolis-Hastings algorithm
- Statistical Genetics
- Spatial Epidemiology
- Markov Chain Monte Carlo maximum likelihood estimation