Introduction to R 21/11/2016 C3BI Vincent Guillemot & Anne Biton

Introduction to R 21/11/2016

C3BI

Vincent Guillemot & Anne Biton

R: presentation and installation

Where?

https://cran.r-project.org/

How to install and use it?

- Follow the installation steps: you don't need administrator rights to install it!
- Open the R GUI.
- Test a command: plot(-10:10, (-10:10)^2).
- Open an R script and save it in your working directory.

RStudio

https://www.rstudio.com/

The few commands you must know

Command       What it does
read.table    Read a tabulated file.
write.table   Write a matrix or data frame.
plot          Command for graphical representation.
x <- 1        Assign something (here 1) to object x.
1:10          Create a vector containing integers 1 to 10.
x[1:10]       Extract a subvector from x.

The few commands you must know (continued)

Command       What it does
c(2, 5)       Create a vector containing 2 and 5.
A[, 2:5]      Extract columns 2 to 5 of matrix A.
DF$variable   Extract from data frame DF its column called variable.
?rnorm        Get help on the function called rnorm.
??gaussian    Search the help pages for the topic gaussian.
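
Most of the commands listed so far can be tried together in one short session. The following is only an illustrative sketch: the objects v, A and DF and their contents are made up for the example and are not part of the course material.

```r
# Assign a value to an object
x <- 1

# Create vectors with `:` and with c()
v <- 1:10
pair <- c(2, 5)

# Extract a subvector (elements 2 to 4)
sub <- v[2:4]

# Extract columns 2 to 5 of a matrix (hypothetical 2 x 6 matrix)
A <- matrix(1:12, nrow = 2, ncol = 6)
cols <- A[, 2:5]

# Extract a column from a data frame (hypothetical data frame)
DF <- data.frame(variable = c(10, 20, 30), other = c("a", "b", "c"))
col <- DF$variable
```

Typing the name of any of these objects (for example cols) in the console prints its content.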

A beginner’s test

- If you already know the previous commands → move to the back of the room: you can work independently on the handout and the exercises, and go home whenever you are finished;
- If you are not familiar with these commands, move to the front of the room.
- In any case, please ask us any R-related question during the class!

Basic commands

Prompt

A prompt is a special character that appears in the R console:

- > means that R is waiting for an R command;
- + means that R is waiting for the end of the current command;
- a blank prompt means that R is computing something.

E.g., type:

1+1
2*3
Sys.sleep(10)

Brackets

Brackets   Use
()         In functions, e.g. sin(2*pi).
[]         While indexing, e.g. x[1:2].
{}         In code blocks, e.g.

{
  x <- rnorm(10)
  y <- x[1:2]
  mean(y)
}

Writing your scripts: survival tips

1. Use spaces:
   - x <- -1 is OK,
   - x<--1 not so much...
2. Indent!
3. Save your scripts, it's so easy with RStudio.
4. Comment, comment, comment (use #).

"You are collaborating with at least one person: your future self!"
~ Hadley Wickham

What this course is about

- Write short R programs
- Read and predict the outcome of simple R functions / programs
- Make graphical representations
- Read data and write tables

To go further (or more slowly), there is a lot of material available online: Quick-R, TryR, Data Camp, cookbook-r, etc.

Types of exercises

Three levels:

1. Copy & paste some code and see what it does.
2. Read some code and explain what it does.
3. Create your own code to answer a question.

Ex. 1

1. Copy and execute the following command: log(exp(2)).
2. What does this code do: log10(10^3)?
3. Find a function to run a t-test.

R Objects

Types...

The type of an object is directly associated with the way it is stored in memory:

- character: let <- "a"
- double: nbr <- 2.0
- integer: intg <- 1L
- logical: TRUE or T or FALSE or F
- Particular values: NA, +Inf, NaN
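
The type of an object can be checked with typeof. The short sketch below reuses the object names from the list above; the name flag is an assumption added for the logical case.

```r
let <- "a"
nbr <- 2.0
intg <- 1L
flag <- TRUE   # hypothetical name, not on the slide

typeof(let)    # "character"
typeof(nbr)    # "double"
typeof(intg)   # "integer"
typeof(flag)   # "logical"

# Particular values
is.na(NA)      # TRUE
1 / 0          # Inf
is.nan(0 / 0)  # TRUE
```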

Types... and classes

The class of an object describes how different values are structured within the object:

- vector: v <- c("a", "b", "a")
- factor: fac <- factor(v)
- matrix: M <- matrix(1:4, 2, 2)
- data.frame: D <- data.frame(v, fac)
- list,
- etc.
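
The objects above can be inspected with class and the is.* helpers. This is a small sketch rather than an exhaustive tour; note that for a plain vector, class reports the type of its elements instead of the word "vector".

```r
v <- c("a", "b", "a")
fac <- factor(v)
M <- matrix(1:4, 2, 2)
D <- data.frame(v, fac)

class(v)          # "character"
is.vector(v)      # TRUE
class(fac)        # "factor"
levels(fac)       # "a" "b"
is.matrix(M)      # TRUE
is.data.frame(D)  # TRUE
```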

Transformations

- as.integer
- as.numeric
- as.character
- as.factor
- as.vector
- ...
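
A few conversions, as a minimal sketch with made-up values. The last two lines illustrate a classic pitfall: converting a factor directly to numeric returns the internal level codes, not the labels.

```r
n_int <- as.integer("42")              # 42L
n_num <- as.numeric("3.14")            # 3.14
s_chr <- as.character(10)              # "10"
flat  <- as.vector(matrix(1:4, 2, 2))  # 1 2 3 4

f <- as.factor(c("10", "20", "10"))
codes  <- as.numeric(f)                # 1 2 1 (level codes)
values <- as.numeric(as.character(f))  # 10 20 10 (the labels)
```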

Classes

Here are the classes you need to know:

- vectors and factors,
- matrices,
- data frames,
- lists,
- functions.

FAMuSSS

FAMuSSS: the Functional Single Nucleotide Polymorphisms Associated with Human Muscle Size and Strength study

Load an RData file

In the RData file famusss.RData, there is an example of each of the 5 R classes we mentioned:

Name        Class        Content
ndrm.diff   Vector       Difference in strength in the non-dominant arm
snp1        Factor       SNP rs577x located in the gene ACTN3
M           Matrix       Matrix containing the age, height and weight of the individuals
D           Data frame   Sample data extracted from the FAMuSSS data
L           List         List containing various objects
bmi         Function     Computes the BMI of an individual from their weight (lb) and height (in)
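
The body of the bmi function stored in famusss.RData is not shown in the slides. As a rough sketch only, assuming it relies on the usual imperial-unit formula (703 × weight in lb / squared height in inches), a function of this kind could look as follows. The name bmi_sketch and its argument names are hypothetical.

```r
# Hypothetical stand-in for the course's bmi function (assumed formula)
bmi_sketch <- function(weight_lb, height_in) {
  703 * weight_lb / height_in^2
}

# Example call with made-up values: 180 lb, 72 in
bmi_sketch(weight_lb = 180, height_in = 72)
```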

Ex. 2

1. Load the objects with the following command:

load("famusss.RData")

2. Print all the objects: what type of data do they contain?
3. What is the BMI of a 70-inch-tall person weighing 150 lb?
4. What does L$Dimensions do? What does names(L) do?
5. Extract the element called GenderTable from L.

Vectors

Create them with the combine function c or with the : operator:

x <- c(1, 10, -4, 5.0)
i <- 1:10

Access the elements of a vector with square brackets:

x[1]

## [1] 1

x[3:4]

## [1] -4 5

Factors

You can create factors in a number of ways; one of them is the function gl:

f0 <- gl(n = 3, k = 6, labels = c("CRTL", "A", "B"))
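
gl builds a factor with n levels, each repeated k times in a row. The sketch below re-creates f0 and inspects it with a few standard helpers (length, levels, table) that are not mentioned on the slide.

```r
f0 <- gl(n = 3, k = 6, labels = c("CRTL", "A", "B"))

length(f0)   # 18 values: 3 levels x 6 repetitions
levels(f0)   # the three labels, in the order given
table(f0)    # number of values per level (6 each)
```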

Ex. 3

1. What does f0 == "A" do?
2. What do rep and seq do?
3. Create a vector called v of length 18.
4. What does v[f0 == "A"] do?
5. Extract from v the values for which f0 is equal to B.

Matrices

A matrix is a two-dimensional kind of vector:

A <- matrix(0, 2, 2)
B <- matrix(c("un","deux","trois","quatre"), 2, 2)
A[1,]

## [1] 0 0

B[2,2]

## [1] "quatre"
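
One detail worth seeing in action is that matrix fills its values column by column by default. The matrix N below is a made-up example, not part of the course data.

```r
# 2 x 3 matrix filled column by column: columns are (1,2), (3,4), (5,6)
N <- matrix(1:6, nrow = 2, ncol = 3)

dim(N)      # 2 3
N[2, 3]     # 6: row 2, column 3
N[, 2]      # 3 4: the whole second column
t(N)        # transpose: a 3 x 2 matrix
```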

Data frames

A data frame is a two-dimensional structure that allows different types for its columns:

D <- data.frame(a=1:10, b=letters[1:10], cos=cos(1:10))
D[1:2,2:3]

##   b        cos
## 1 a  0.5403023
## 2 b -0.4161468

D$a[3]

## [1] 3

D[[1]]

## [1] 1 2 3 4 5 6 7 8 9 10

Lists

In R, data frames are special lists:

L <- list(1:10, b=3, f=cos, char=letters[5:7])
names(L)

## [1] "" "b" "f" "char"
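
List elements are accessed by position with [[ ]] or by name with $. The sketch below reuses the list L from the slide; the small data frame D2 is a made-up example showing that a data frame behaves like a list of columns.

```r
L <- list(1:10, b = 3, f = cos, char = letters[5:7])

first <- L[[1]]                 # unnamed element, accessed by position
b_val <- L$b                    # named element, accessed with $
f_val <- L$f(0)                 # the stored function cos, called on 0
second_char <- L[["char"]][2]   # second letter of the element "char"

# A data frame is itself a list whose elements are the columns
D2 <- data.frame(a = 1:3, b = c("x", "y", "z"))
is.list(D2)   # TRUE
length(D2)    # 2: the number of columns
```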

Block of code

A block gathers several commands so that they are all executed at once!

{
  a <- 1
  b <- 2
}

It is used in

- functions,
- loops (for, while...),
- control-flow constructs (?Control).

Functions

- Syntax: f <- function(arg1=, ...) {Commands}.
- f ends with a return (without an explicit return, R returns the value of the last evaluated expression).
- What can f return? Whatever you like (e.g. in a list).
- Indent!
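
As a minimal sketch of this syntax, here is a function that returns two results at once by wrapping them in a list. The name summarise_vector and its arguments are hypothetical, chosen only for illustration.

```r
# Return the mean and the standard deviation of a numeric vector in a list
summarise_vector <- function(x, na.rm = FALSE) {
  result <- list(
    mean = mean(x, na.rm = na.rm),
    sd   = sd(x, na.rm = na.rm)
  )
  return(result)
}

out <- summarise_vector(c(1, 2, 3, 4))
out$mean   # 2.5
out$sd
```

Returning a named list keeps each result accessible with $, which is convenient once a function computes more than one value.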

Ex. 4

- Create a matrix filled with random numbers (rnorm).
- Compute the sum of each column (colSums).
- Which elements are > 0?
- Create a second matrix filled with 1s. It should have the same dimensions as the first matrix.
- Combine it with the first matrix (rbind or cbind).
- Write a function returning the square and the square root of a positive real number.

for loops

Repeat a block, depending on an iterator i, n times.

for (i in 1:10) {
  j <- i^2 + i + 1
  print(j)
}

In general, we want to save the result:

s <- rep(NA, 10)
for (i in 1:10) {
  s[i] <- i^2 + i + 1
}
s
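
For a computation this simple, a loop is not strictly necessary: arithmetic operators in R act element-wise on vectors. The sketch below compares the loop with a vectorised version and an sapply version; the names s_loop, s_vec and s_apply are assumptions introduced for the comparison.

```r
# Loop version, storing each result
s_loop <- rep(NA, 10)
for (i in 1:10) {
  s_loop[i] <- i^2 + i + 1
}

# Vectorised version: the whole vector is computed in one expression
i <- 1:10
s_vec <- i^2 + i + 1

# sapply version: apply a small function to each value
s_apply <- sapply(1:10, function(k) k^2 + k + 1)

all.equal(s_loop, s_vec)    # TRUE
all.equal(s_loop, s_apply)  # TRUE
```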

Ex. 5

What does this loop do?

library(tm)
library(stringr)
aveu <- removePunctuation(scan("phedre.txt", what = ""))
nba <- 0 ; nbe <- 0 ; nbi <- 0 ; nbo <- 0 ; nbu <- 0

for (mot in aveu) {
  nba <- nba + str_count(mot, "a")
  nbe <- nbe + str_count(mot, "e")
  nbi <- nbi + str_count(mot, "i")
  nbo <- nbo + str_count(mot, "o")
  nbu <- nbu + str_count(mot, "u")
}

c(a=nba, e=nbe, i=nbi, o=nbo, u=nbu)

if, else

The random p-value generator:

r <- runif(1)

if (r < 0.05) {
  print("Youpi !")
} else if (r < 0.1) {
  print("I still trust my result!")
} else {
  print(" :'( ")
}

Read and write data

Many available commands

Command       Read   Save
data          Yes    No
load          Yes    No
save          No     Yes
read.table    Yes    No
write.table   No     Yes
read.*        Yes    No
write.*       No     Yes

Correspondence

Figure 1: diagrammer

data

- Example: data(cars).
- Before and after: ls().
- Class of the loaded object: class(cars).
- Quick object exploration: str(cars).
- Only the beginning of the table: head(cars).

Working directory

You may (will) want to change the working directory, in which your commands look for data and save your outputs.

You can do this:

- with the commands setwd and getwd,
- in a much simpler way with RStudio: Session → Set working directory → ...
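
With the command-line approach, getwd reports the current working directory and setwd changes it. The sketch below uses R's temporary folder (tempdir()) as a stand-in destination, which is an assumption made so that the example runs anywhere; in practice you would give the path of your own project folder.

```r
old_wd <- getwd()                  # remember the current working directory
setwd(tempdir())                   # move to another existing folder (here a scratch one)
inside <- normalizePath(getwd())   # where we are now
setwd(old_wd)                      # go back to where we started
```

Storing the previous directory before calling setwd makes it easy to come back to it afterwards.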

Tabulated data

- Column names,
- lines separated with an EOL (end of line),
- column separator (tab, ;, etc.),
- the same number of columns per line.

long and wide formats: a wide table

##     ctrl trt1 trt2
##  1: 4.17 4.81 6.31
##  2: 5.58 4.17 5.12
##  3: 5.18 4.41 5.54
##  4: 6.11 3.59 5.50
##  5: 4.50 5.87 5.37
##  6: 4.61 3.83 5.29
##  7: 5.17 6.03 4.92
##  8: 4.53 4.89 6.15
##  9: 5.33 4.32 5.80
## 10: 5.14 4.69 5.26

long and wide formats: a long table

##     values  ind
##  1:   4.17 ctrl
##  2:   5.58 ctrl
##  3:   5.18 ctrl
##  4:   6.11 ctrl
##  5:   4.50 ctrl
## ---
## 26:   5.29 trt2
## 27:   4.92 trt2
## 28:   6.15 trt2
## 29:   5.80 trt2
## 30:   5.26 trt2
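
The slides do not say how these two tables were produced. The values appear to match R's built-in PlantGrowth dataset, and the column names values and ind are those produced by the base function stack, so one possible way to move between the two formats (an assumption, not necessarily the authors' code) is:

```r
# PlantGrowth ships with R: 30 rows, columns `weight` and `group`
wide <- unstack(PlantGrowth)   # wide format: one column per group
long <- stack(wide)            # long format: columns `values` and `ind`

head(wide)
head(long)
```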

read.table

5 important parameters:

- file → where the file is,
- header → whether the first line contains the names of the columns,
- sep → column separator,
- dec → decimal mark (3,1419 or 3.1419?),
- skip → how many lines should be skipped.

write.table

4 important parameters:

- x → matrix or data.frame to save,
- file → where the file should be stored,
- sep → column separator,
- dec → decimal mark (3,1419 or 3.1419?).
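
A write-then-read round trip shows how sep and dec must agree between write.table and read.table. This is a minimal sketch: the data frame df is made up, the file is written to a temporary path, and row.names = FALSE is an extra argument (not listed above) used here so that the file contains only the data columns.

```r
# Hypothetical data to save
df <- data.frame(id = 1:3, value = c(1.5, 2.25, 3.125))

# Write with ";" as column separator and "," as decimal mark
path <- tempfile(fileext = ".txt")
write.table(df, file = path, sep = ";", dec = ",", row.names = FALSE)

# Read it back with the same conventions
back <- read.table(path, header = TRUE, sep = ";", dec = ",")
unlink(path)   # remove the temporary file

all.equal(df, back)
```

If dec were left at its default (".") when reading, the value column would not be parsed as numbers.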

save and load

- save can write any R object into an RData file.
- load reads RData files.

Example:

x <- 1:10 ; a <- "toto" ; objetaunomtreslong <- pi
save(x, a, objetaunomtreslong, file="Sauvegarde.RData")
rm(list=ls())
load("Sauvegarde.RData")

Plots

plot

Syntax: plot(object, ...)!

Parameter     Role
main          Main title
xlab & ylab   Axis titles
xlim & ylim   Axis limits
type          Type of graph: points, lines, etc.
col           Color, e.g. "black", "red", "green"...
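
The parameters in the table can be combined in a single call. The sketch below draws a parabola; the data and the label texts are made up for the illustration.

```r
x <- seq(-10, 10, by = 0.5)
y <- x^2

plot(x, y,
     main = "A parabola",       # main title
     xlab = "x",                # x-axis title
     ylab = "x squared",        # y-axis title
     xlim = c(-10, 10),         # x-axis limits
     ylim = c(0, 100),          # y-axis limits
     type = "l",                # "l" for lines, "p" for points, "b" for both
     col = "red")               # color
```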

Ex. 6

Apply plot to any function, e.g. choose one among the built-in functions: sin, cos, exp, log, sqrt...

Ex. 7

With plot and grid, reproduce this plot:

Figure 2: image

Add points, lines or a function

You can draw on an existing plot with the following commands:

- points to add points,
- lines to add lines,
- plot(f, add=TRUE, ...) to add a function.
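
The three commands can be layered on top of a first plot, as in the sketch below. The choice of sin and cos and of the highlighted points is arbitrary and only serves the illustration.

```r
# First plot: the sine function between -pi and pi
plot(sin, -pi, pi, main = "Sine and cosine", ylab = "f(x)")

# Add two points at the extrema of the sine
xs <- c(-pi / 2, pi / 2)
ys <- sin(xs)
points(xs, ys, pch = 19, col = "red")

# Add a dashed horizontal line at y = 0
lines(c(-pi, pi), c(0, 0), lty = 2)

# Add a second function on the same plot
plot(cos, -pi, pi, add = TRUE, col = "blue")
```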

Ex. 8

1. Generate two variables, x and y, linearly linked to one another (do not forget to add some noise).
2. Represent the scatter plot of the two variables with plot.
3. Add to the plot the underlying linear model with lines or plot.

Colors, dashes, symbols and width

4 important parameters:

- pch: to choose the type of point (circle, triangle, etc.),
- lty (line type): to choose the line type,
- col (color): to choose the color,
- lwd (line width): to set the width.

legend

Argument   Meaning
x, y       Legend position...
legend     Legend text.
bty        Type of box = "o" (with) or "n" (without).
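
The sketch below puts the graphical parameters (pch, lty, col, lwd) and legend together on a made-up example with two curves. Passing the keyword "bottomright" as the position is one of the forms accepted by legend; numerical coordinates work as well.

```r
x <- 1:10

# First curve: solid line with filled circles
plot(x, x^2, type = "b", pch = 19, lty = 1, col = "steelblue", lwd = 2,
     ylab = "y")

# Second curve: dashed line, no symbols
lines(x, 10 * x, lty = 2, col = "tomato", lwd = 2)

# Legend matching the two curves, drawn without a box
leg <- legend("bottomright",
              legend = c("x^2", "10 * x"),
              col = c("steelblue", "tomato"),
              lty = c(1, 2),
              pch = c(19, NA),
              lwd = 2,
              bty = "n")
```

The vectors given to col, lty and pch follow the order of the legend texts, so each entry shows the same symbol and line as its curve.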

Ex. 9

Add a legend to this graph:

plot(1:10, type="b", col="steelblue", lwd=2)

1. Add a legend at the following coordinates: (1, 7).
2. Add a legend without a box around it, in the upper left corner of the graph.
3. Add the legend wherever you want it with locator(1).

Combining plots. . .

. . . is easy with layout!

1. Create the layout, a matrix indicating the positions and order of the plots.
2. plot the graphs to populate the layout.

Ex:

x <- rnorm(100)       # Data
M <- rbind(1, 2:3)    # 3 graphs in the layout
layout(M)             # Create the layout and put the
plot(x)               # 1st ...
hist(x)               # 2nd ...
boxplot(x)            # and 3rd graphs

Here is the layout we used:

1

2 3

The resulting plot

[Figure: three panels. Top: scatter plot of x against Index. Bottom left: "Histogram of x" (x on the horizontal axis, Frequency on the vertical axis). Bottom right: boxplot of x.]