Description
R is an open-source language for statistical analysis and graphics. In this course we will learn the nuts and bolts of using R for basic data analysis, including how to get important statistics out of a data set, how to reshape data to make it easier to analyze, and how to create basic plots. We also describe how to create reports using R Markdown for reproducible data analysis.
Prerequisites
- Basic understanding of statistics
- Exposure to spreadsheets
- R 3.4.x
- RStudio 1.0.x
Outline
1. Motivation – Before we jump into the syntax and fundamentals of R, we go through simple R scripts to get an idea of R’s capabilities for data analysis.
2. Basics – We will cover how to set up your workspace, read and write files, work with data types in R, calculate summary statistics, and use basic programming constructs.
3. Functions and Packages – Here we will create and use functions, explore some of the many packages available for R, and see how they simplify our code and make it faster.
4. Data Manipulation – In this section we learn the concept of ‘tidy data’ and how to manipulate raw data into a format that makes data analysis easier. We then utilize functions to easily extract answers from the cleaned data.
5. Visualization in R – Here you will get to think about how to visualize data effectively. We will learn to create different chart types and assign data to appropriate chart elements using ggplot2. We will also learn how to polish these plots and make them presentable to an audience.
6. Reproducible Data Analysis – We will learn to generate dynamic reports and web pages with R, using the knitr package to create R Markdown documents that embed your code along with its output and plots.
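
To give a rough feel for the summary statistics mentioned under Basics, here is a minimal sketch in base R. It assumes the built-in `mtcars` data set, which is chosen purely for illustration and is not necessarily the data used in the course. The variable names are likewise hypothetical.

```r
# Illustrative only: mtcars ships with base R; the course may use other data.
data(mtcars)

# Summary statistics for a single column (miles per gallon).
mpg_mean <- mean(mtcars$mpg)
mpg_sd <- sd(mtcars$mpg)
print(summary(mtcars$mpg))

# Mean mpg for each cylinder count, returned as one row per group.
avg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(avg_by_cyl)
```

The same grouped summary can also be written with add-on packages. Base R is used here only so that the sketch runs without installing anything.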