Description
When characteristics have been measured on two or more groups, supervised learning techniques help us understand how those attributes can be used to predict the group membership of future observations. This course teaches best practices for algorithm selection, training, validation and the delivery of predictive model results.
Prerequisites
- Basic understanding of statistics
- Knowledge of R Programming
- R 3.4.x
- RStudio 1.0.x
Outline
1. Motivation – We’ll begin the course with an overview of the Omni Analytics statistical modeling process that highlights all of the required steps to construct and complete a successful supervised learning project.
2. Data Wrangling – In this section, techniques for recoding variables, handling missing values, and restructuring data will be discussed as part of the first set of preparatory steps required for model fitting.
3. Exploratory Data Analysis – Our focus in this section will be on data summary and graphical approaches used to analyze multivariate data structures for model selection.
4. Model Fitting – This section will provide an overview of the most popular statistical techniques used for prediction, along with examples of how to fit the models and interpret their output.
5. Validation Techniques – For this unit, we’ll introduce accuracy metrics and cross-validation techniques designed to assess model bias.
6. Delivery – The main portion of the course will conclude with approaches designed to facilitate effective communication of the modeling results to project stakeholders.
7. Special Topics – Here we’ll end with a discussion of recent developments in supervised machine learning, such as deep learning and artificial intelligence.