- Gentle introduction to:
  - Programming languages
  - R for data science
- Provide a foundation for Gerko Vink’s course
Introduction to R and RStudio
install.packages("ggplot2") # Install a new package (you only need to do it once)
library(ggplot2) # Load the package
- Source (editor): Write your R code (load data, clean it, model it, etc.)
- Environment: All the variables that you have defined
- Files: File explorer to find your files
- Help: Get information about code (super useful!)
- Console: Write R code (not recommended at this point) and see the output of your R scripts
- Plots: See your plots and export them
- History: All the code you have run
- Packages: All packages that you have loaded (I don’t recommend loading/unloading packages this way)
- Terminal: Run commands on your terminal (this is not R, you won’t need to use this)
- "../somefile.csv": find "somefile.csv" one level up (in the parent directory)
- "../../somefile.csv": find "somefile.csv" two levels up
- "./somefile.csv": find "somefile.csv" in the current directory (not so useful, it is identical to "somefile.csv")
- "~/somefile.csv": find "somefile.csv" in your home directory

Tell the computer to save an object (a number, a string, a spreadsheet) with a name.
Creating variables in R is very straightforward: use <- (the assignment operator).
For example, if you assign the value 100 (an element) to variable a, you would type

a <- 100
print(a)
## [1] 100
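The same idea works for any kind of object. A minimal sketch, where the variable names and values (n_students, course_name, scores) are made up for illustration:

```r
# Store a number, a string, and a small table under names of our choosing
n_students <- 25                      # a number
course_name <- "Introduction to R"    # a string
scores <- data.frame(                 # a spreadsheet-like table
  student = c("A", "B", "C"),
  grade = c(7.5, 8.0, 6.5)
)

# Once stored, a variable can be reused in later computations
mean_grade <- mean(scores$grade)
print(mean_grade)
## [1] 7.333333
```

Nothing is printed when you assign; you only see a value when you print it or type its name.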
- character: "some text"
- numeric: e.g., 2.1
- integer: e.g., 2L
- logical: TRUE/FALSE
- factor: e.g., factor("amsterdam")
- vector: c(2, 4, 2)
- list: list(first_col = 1, second = "a", third = TRUE)
- matrix: matrix(c(4, 4, 4, 4), nrow = 2, ncol = 2)
- data.frame: The most important ~ spreadsheet

Everything that is published on the Comprehensive R Archive Network (CRAN) and is aimed at R users must be accompanied by a help file.
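Going back to the data types for a moment: a quick way to get a feel for them is to create one object of each type and ask R what it is. This is only a sketch; the x_* names and the contents of the data frame are made up for illustration.

```r
# One object per data type (names and values are illustrative)
x_chr <- "some text"
x_num <- 2.1
x_int <- 2L
x_lgl <- TRUE
x_fct <- factor("amsterdam")
x_vec <- c(2, 4, 2)
x_lst <- list(first_col = 1, second = "a", third = TRUE)
x_mat <- matrix(c(4, 4, 4, 4), nrow = 2, ncol = 2)
x_df  <- data.frame(city = c("amsterdam", "utrecht"), visits = c(3, 1))

# class() tells you what kind of object you have
class(x_chr)   # "character"
class(x_int)   # "integer"
class(x_vec)   # "numeric": a vector of numbers
class(x_df)    # "data.frame"

# str() shows the structure, which is handy for lists and data frames
str(x_lst)
str(x_df)
```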
If you know the name of the function that performs an operation, e.g. anova(), then you just type ?anova or help(anova) in the console, or use the “Help” menu.
If you do not know the name of the function: type ?? followed by your search criterion. For example, ??anova returns a list of all help pages that contain the word ‘anova’.
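Put together, the two ways of asking for help look like this in the console. help.search() is the long form of ??, in the same way that help() is the long form of ?:

```r
# You know the function name: open its help page
?anova
help(anova)            # same as ?anova

# You do not know the name: search all installed help pages
??anova
help.search("anova")   # same as ??anova
```

In RStudio the results show up in the Help pane.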
Alternatively, the internet will tell you almost everything you’d like to know. Sites such as http://www.stackoverflow.com and http://www.stackexchange.com, as well as Google and LLMs, can be of tremendous help.
When searching for R-related issues, use ‘R:’ as a prefix in your search term.

To call an object, you just type the name you have given to it.
For example, we assigned the value 100 to object a.
a <- 100
To call object a, we would type
a
## [1] 100
# This is a comment, it won't be read by R
student_number <- 4
paste("The number of students is:", student_number, sep = " ")
## [1] "The number of students is: 4"
# sep can be any string, e.g. "\n" (newline) or "\t" (tab)
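To see what sep does, here is a small sketch that pastes the same two pieces together with different separators. The variable names are made up for illustration.

```r
student_number <- 4

with_space      <- paste("Students:", student_number, sep = " ")
without_space   <- paste("Students:", student_number, sep = "")
with_underscore <- paste("Students", student_number, sep = "_")
with_tab        <- paste("Students", student_number, sep = "\t")

with_space        # "Students: 4"
without_space     # "Students:4"
with_underscore   # "Students_4"

# paste0() is a shortcut for sep = ""
paste0("Students:", student_number)

# print() shows the tab as \t; cat() writes it out as a real tab
print(with_tab)
cat(with_tab, "\n")
```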
# install.packages("tidyverse") # installing packages
library(readr) # loading the library to read csv, usually on top of the file

# Using the readr library (the readr:: prefix is optional, but useful)
data <- readr::read_csv("../common_datasets/dataset_boys.csv", col_select = c("age","hgt"))
## Rows: 748 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (2): age, hgt
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Summary statistics
summary(data)
##       age              hgt
##  Min.   : 0.035   Min.   : 50.00
##  1st Qu.: 1.581   1st Qu.: 84.88
##  Median :10.505   Median :147.30
##  Mean   : 9.159   Mean   :132.15
##  3rd Qu.:15.267   3rd Qu.:175.22
##  Max.   :21.177   Max.   :198.00
##                   NA's   :20
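If you do not have the course dataset at hand, you can try the same workflow on a toy file. The sketch below is not the course data: the values are made up, the names toy, csv_path and boys_toy are hypothetical, and it uses base R's read.csv() instead of readr::read_csv() so that it runs without installing anything.

```r
# Made-up toy data, written to a temporary file so the example is self-contained
toy <- data.frame(age = c(0.5, 10.2, 15.8), hgt = c(60.1, 145.3, NA))
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Read the file back and keep only the columns we need
boys_toy <- read.csv(csv_path)[, c("age", "hgt")]

# Summary statistics; missing values are counted under NA's
summary(boys_toy)

# Many functions return NA if any value is missing, unless you set na.rm = TRUE
mean(boys_toy$hgt)
mean(boys_toy$hgt, na.rm = TRUE)
```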
Goal: Get used to RStudio by using R as a calculator, and install one package
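A few lines to get started with R as a calculator. Type them in a script and run them one by one (the name result is just an example):

```r
1 + 2        # addition
7 * 6        # multiplication
10 / 4       # division
2^10         # power
sqrt(16)     # square root
10 %% 3      # remainder after division
10 %/% 3     # integer division

# Parentheses work as in ordinary arithmetic
result <- (3 + 5) * 2
result
## [1] 16
```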