Quick Overview

Outline

R is rapidly becoming the standard platform for data manipulation, visualization and analysis, and it has a number of advantages over other statistical software packages. A wide community of users contributes to R, resulting in an enormous coverage of statistical procedures, including many that are not available in any other statistical program. R is also highly flexible for programming and scripting, for example when manipulating data or creating professional plots. However, unlike SPSS, for example, R lacks standard GUI menus from which to choose a statistical test or a graph. As a consequence, R is more challenging to master. This course therefore offers an elaborate introduction to statistical programming in R. Students learn to operate R, make plots, fit, assess and interpret a variety of basic statistical models, and do advanced statistical programming and data manipulation. The topics in this course include regression models for linear, dichotomous, ordinal and multivariate data, statistical inference, statistical learning, bootstrapping and Monte Carlo simulation techniques.

Materials covered:

Day 1:
  • Installing R/RStudio (done at home)
  • Getting comfortable with notebooks/projects/scripts
  • Getting help
  • Variables in R: basic data types (character, numeric, integer, logical, date) and data structures (vectors, matrices, lists, data.frames)
  • Filtering using logical operators
  • Type conversion (as.integer/as.numeric/as.factor)
  • Understanding/installing packages
  • Reading a CSV and calculating descriptive statistics
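
As a preview of the Day 1 topics, the sketch below touches on basic data types, a data.frame, filtering with logical operators, type conversion, and reading a CSV to compute descriptive statistics. The data and column names are made up for illustration and are not part of the course materials; the CSV is written to a temporary file so that the example is self-contained.

```r
# Hypothetical example data; not part of the course materials.
people <- data.frame(
  name   = c("Ann", "Bob", "Cem", "Dee"),
  age    = c("34", "27", "45", "31"),   # stored as character on purpose
  member = c(TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Basic data types
class(people$name)    # character
class(people$member)  # logical

# Type conversion: character -> integer
people$age <- as.integer(people$age)

# Filtering with logical operators: members older than 30
older_members <- people[people$member & people$age > 30, ]

# Write and read a CSV, then compute descriptive statistics
csv_file <- tempfile(fileext = ".csv")
write.csv(people, csv_file, row.names = FALSE)
people_csv <- read.csv(csv_file)

mean_age <- mean(people_csv$age)
summary(people_csv$age)
```

All of this uses base R only, so it runs before any additional packages are installed.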
Day 2:
  • Control flow (if-else statements and for loops)
  • Functions: creating your own functions
  • Principles of tidy data and short comparison of base R and the tidyverse
  • Reading and writing files in several formats
  • Inferential statistics: A 5-min primer on linear regression
  • Best practices in R
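
As a preview of the Day 2 topics, the sketch below combines a user-defined function, if-else control flow, a for loop and a simple linear regression. It uses only base R and the built-in `mtcars` data set; the function `describe_mpg()` and its threshold are hypothetical choices made for illustration, not course code.

```r
# A small user-defined function with if-else control flow.
# describe_mpg() is a hypothetical helper, not a course function.
describe_mpg <- function(mpg, threshold = 20) {
  if (mpg >= threshold) {
    "efficient"
  } else {
    "thirsty"
  }
}

# A for loop over the first three cars of the built-in mtcars data
labels <- character(3)
for (i in 1:3) {
  labels[i] <- describe_mpg(mtcars$mpg[i])
}
labels

# Linear regression: miles per gallon as a function of car weight
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)     # intercept and slope
summary(fit)  # standard errors, t-tests and R-squared
```

The tidyverse comparison from the list above is left out here because it requires packages beyond the base installation.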

Daily schedule

Start  End    What?
09.00  09.15  Welcome
09.15  09.45  Lecture
09.45  10.30  Practical
10.30  10.50  Discussion
10.50  11.05  Break
11.05  11.45  Lecture
11.45  12.30  Practical
12.30  13.00  Discussion

How to prepare

Preparing your machine for the course

To help you learn faster, we will use some functionality that is not part of the base installation of R. The steps below guide you through installing both R and the necessary additions.

System requirements

Bring a laptop computer to the course and make sure that you have full write access and administrator rights to the machine. We will explore programming and compiling in this course, which means that you need full access to your machine. Some corporate laptops come with limited access for their users; we therefore advise you to bring a personal laptop computer, if you have one.

1. Install the latest version of R

R can be obtained here. We won’t use R directly in the course, but rather call R through RStudio. RStudio still requires a working R installation, so R needs to be installed first.

2. Install the latest RStudio Desktop

RStudio is an Integrated Development Environment (IDE). It can be obtained as stand-alone software here. The free and open-source RStudio Desktop version is sufficient.

3. Start RStudio and install the following packages.

Execute the following lines of code in the console window:

install.packages(c("ggplot2", "tidyverse", "magrittr", "knitr", "rmarkdown", 
                   "plotly", "shiny", "devtools", "boot", "class", 
                   "car", "MASS", "ggplot2movies", "ISLR", "DAAG", "mice", 
                   "purrr", "furrr", "future"), dependencies = TRUE)
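
Once the installation has finished, it can be useful to check that every package is actually available. The following is a minimal sketch of one way to do that, assuming the package list from the installation step above; the variable names `pkgs`, `installed` and `missing_pkgs` are illustrative only.

```r
# Packages from the installation step above
pkgs <- c("ggplot2", "tidyverse", "magrittr", "knitr", "rmarkdown",
          "plotly", "shiny", "devtools", "boot", "class",
          "car", "MASS", "ggplot2movies", "ISLR", "DAAG", "mice",
          "purrr", "furrr", "future")

# requireNamespace() returns TRUE if a package can be loaded,
# without attaching it to the search path
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of packages that are still missing
# (character(0) if everything is installed)
missing_pkgs <- names(installed)[!installed]
missing_pkgs
```

If `missing_pkgs` is not empty, rerunning `install.packages()` for just those names and reading the console messages usually shows what went wrong.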

If you are not sure where to execute code, use the following figure to identify the console; ignore the outdated version shown in the example: