Reinhard Furrer, UZH
I-Math, 12. 2. 2014 NZZ.ch
Introduction to R
Contents
• Basics
• Data handling and storing
• Plotting
• Linear models
• Simple programming tricks
Part 1
Basics
• What is R?
• The R-environment
• Getting started
• R rules
What is R?
• R is a language and environment for statistical computing and graphics.
• R provides a wide variety of statistical and graphical techniques, and is highly extensible.
• R produces well-designed publication-quality plots with a careful choice of default values.
• R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form.
What is R?
Crude classification:
• Symbolic software:
– Mathematica
– Maple
– Magma
– ...
• Numeric software:
– MATLAB, Octave
– NCL, IDL
– ...
– R
The R-environment: micro
• R is an integrated suite of software facilities
• Emphasis on statistical analysis and graphical display
• Perform an entire analysis from raw data to reports
• Essentially command-line interpreted; links to precompiled code are possible
The R-environment: macro
Due to licence:
• freely available: cran.r-project.org
• huge community
• many packages (>5100): cran.r-project.org/web/packages/
• abundant documentation in the form of:
FAQs (cran.r-project.org/doc/FAQ/R-FAQ.html), manuals (cran.r-project.org/manuals.html or cran.r-project.org/other-docs.html), wikis, books, ... see www.r-project.org
• several mailing lists: www.r-project.org/mail.html
The R-environment: macro
Slides are mainly based on the following sources:
• An Introduction to R (IR):
cran.r-project.org/doc/manuals/R-intro.pdf
• The R Primer (RP):
www.stat.washington.edu/cggreen/rprimer/
• The R Inferno (RI):
www.burns-stat.com/pages/Tutor/R_inferno.pdf
and some 10 years of personal use ...
Getting started: install R
Done through “The Comprehensive R Archive Network” (CRAN):
cran.r-project.org
Easy to follow instructions in Chapter 1 of RP:
www.stat.washington.edu/cggreen/rprimer/
Getting started: run R (Linux)
Launch R in your console:
furrer@furrer-laptop:~/teaching/intro2R> R

R version 2.15.0 (2012-03-30)
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i686-pc-linux-gnu (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>
Getting started: run R
RStudio
Runs under Windows, Linux, OS X (free; AGPLv3) rstudio.org
Getting started: run R
Tinn-R (Tinn stands for the recursive acronym ’Tinn is not Notepad’)
Runs under Windows (free; GPL) sciviews.org/Tinn-R
Getting started: run R
EMACS environment for R (and other statistics software)
Runs under Windows, Linux, OS X (GPL)
Getting started
> pi
[1] 3.141593
> cos( pi)
[1] -1
> 2 + 2.3
[1] 4.3
> sqrt( -1) # Oops
[1] NaN
> myvar <- exp( -2.3) # Assigning
> print( myvar)
[1] 0.1002588
> print( myvar, digits=16)
[1] 0.1002588437228037
RStudio
Hands-on tasks 1
1. Open RStudio and familiarize yourself with it.
2. What is the 15th digit of π?
3. Interpret the result of sin( pi).
Getting started
> nrcyclones <- c(6, 5, 4, 6, 6, 3, 12, 7, 4, 2, 6, 7, 4)
> # "c" is a function... creating a vector out of its elements
> summary( nrcyclones)
Min. 1st Qu. Median Mean 3rd Qu. Max.
2.000 4.000 6.000 5.538 6.000 12.000
> hist( nrcyclones)
[Figure: Histogram of nrcyclones; x-axis: nrcyclones, y-axis: Frequency]
Getting started
> plot( nrcyclones, type="b")
[Figure: nrcyclones plotted against Index, points connected by lines]
> cor( nrcyclones[-1], nrcyclones[-13])
[1] -0.1113836
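The last call pairs each count with its predecessor: nrcyclones[-1] drops the first element and nrcyclones[-13] drops the last, so the two vectors are shifted against each other by one position. A minimal sketch of the same idea; the names n, current, previous and lag1 are introduced here for illustration only:

```r
nrcyclones <- c(6, 5, 4, 6, 6, 3, 12, 7, 4, 2, 6, 7, 4)
n <- length(nrcyclones)        # 13 observations

current  <- nrcyclones[-1]     # observations 2, ..., 13
previous <- nrcyclones[-n]     # observations 1, ..., 12

# Correlation between each value and the one before it (lag 1);
# this reproduces cor( nrcyclones[-1], nrcyclones[-13]) from above.
lag1 <- cor(current, previous)
print(lag1)
```

Writing -n instead of -13 keeps the call valid if the data vector changes length. The lag-1 value reported by acf is computed slightly differently (it uses the overall mean and variance), so the two numbers need not agree exactly.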
Getting started
> par( mfrow=c(1,2))
> acf( nrcyclones)
> pacf( nrcyclones)
[Figure: two panels, each titled "Series nrcyclones"; left: ACF against Lag, right: Partial ACF against Lag]
Getting started
> help( acf)
acf                    package:stats                    R Documentation

Auto- and Cross- Covariance and -Correlation Function Estimation

Description:

     The function 'acf' computes (and by default plots) estimates of
     the autocovariance or autocorrelation function.  Function 'pacf'
     is the function used for the partial autocorrelations.  Function
     'ccf' computes the cross-correlation or cross-covariance of two
     univariate series.

Usage:

     acf(x, lag.max = NULL,
         type = c("correlation", "covariance", "partial"),
         plot = TRUE, na.action = na.fail, demean = TRUE, ...)

     pacf(x, lag.max, plot, na.action, ...)

     ## Default S3 method:
     pacf(x, lag.max = NULL, plot = TRUE, na.action = na.fail,
          ...)
Getting started: getting help
Various possibilities:
> ?mean # Shortcut for help( mean)
> ?"%*%" # The quotes are required!
> help.start() # Interactive html-based help!
Further illustrative help is accessed via:
> example("image") # example code in the help of "image"
> demo("image") # run the demo "image"
> demo() # lists all available demos
We hardly ever use the following command:
> q()
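When the exact name of a function is not known, a search by name can come before ?. A small sketch using apropos and args, both part of base R; the pattern "^median" and the name hits are only illustrations:

```r
# apropos() lists visible objects whose names match a pattern.
hits <- apropos("^median")
print(hits)

# args() shows the argument list of a function without opening its help page.
args(median)
```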
R rules
• R is case-sensitive.
• Variable names, function names, etc., should contain only alphanumeric characters (A-Z, a-z, 0-9), the “.” (and “_”). A name cannot be a reserved word or start with a digit or “_”.
• Commands are separated by semicolons (“;”) or by a newline. Commands are grouped with curly braces ({ }).
• # is the comment sign. The remainder of the line is ignored.
• If a command is not complete at the end of a line, R will give a continuation prompt, “+ ”, on subsequent lines until the command is complete.
• As long as they are matched, single quotes (’) and double quotes (") are equivalent.
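The rules above can be seen together in a few lines. This is only a sketch; all variable names (x, y, z, tmp, total, same, my.var_2) are made up for the illustration:

```r
x <- 1; y <- 2          # two commands on one line, separated by ";"

z <- {                  # braces group commands; the value is that of the last one
  tmp <- x + y
  tmp * 10
}

total <- x +            # incomplete at the end of the line, so R keeps reading
  y + z

same <- identical('text', "text")   # matched single and double quotes are equivalent

my.var_2 <- total       # valid name: letters, digits, "." and "_"
```

At the interactive prompt, the second line of the total assignment would be shown with the continuation prompt "+ ".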
R rules: reserved words
The reserved words in R’s parser are:
if, else, repeat, while, for, in, next, break, function
TRUE, FALSE, NULL, Inf, NaN, NA and NA-specific types.
... and ...-derivatives, which are used to refer to arguments
passed down from an enclosing function.
There are (unprotected) shortcuts T and F, for TRUE and FALSE:
> T
[1] TRUE
> T <- F # How not to do it!!
> T
[1] FALSE
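Because T and F are ordinary variables, an accidental assignment can be undone by removing the user-defined copy. A minimal sketch, assuming the assignment was made in the current workspace; masked and restored are illustrative names:

```r
T <- FALSE               # allowed: T is an ordinary variable, now masking the built-in T
masked <- isTRUE(T)      # FALSE

rm(T)                    # remove the user-defined T; the built-in one is visible again
restored <- isTRUE(T)    # TRUE

# TRUE and FALSE themselves are reserved words and cannot be reassigned,
# which is why writing them out in full is the safer habit.
```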
R rules: functions and operators
Most R statements are composed of functions and operators:
> y <- sqrt(2 + 2)
is evaluated by applying the + operator, then the sqrt function, and finally the assignment operator.
Functions are of the form function( list of arguments )
Operators are of the form lhs operator rhs
Hands-on tasks 2
1. What are operators and what are functions in the following calls:
2 + 1
sin( pi)
2 + cos( 0)
2. What does the function median calculate?
3. Notice the difference between ?mean, ?"mean" and ?in, ?"in".
4. Create a variable named my1var containing log( 3).
5. Which of the following are valid variable names:
yo, beHappy!, I_am_2, myvar;val, getvar1, getvar$char.
6.? Many operators can be used as functions: "operator"(lhs, rhs).
Compare: 2 + 2 and "+"(2,2)
R rules: syntax
R has the following operators (highest to lowest):
:: :::             access variables in a name space
$ @                component / slot extraction
[ [[               indexing
^                  exponentiation (right to left)
- +                unary minus and plus
:                  sequence operator
%any%              special operators
* /                multiply, divide
+ -                (binary) add, subtract
< > <= >= == !=    ordering and comparison
!                  negation
& &&               and
| ||               or
~                  as in formulae
-> ->>             rightwards assignment
<- <<-             assignment (right to left)
=                  assignment (right to left)
?                  help (unary and binary)
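A few one-liners show how the precedence order plays out in practice. This is a sketch with arbitrary variable names (a, b, c1, c2, d):

```r
a  <- -2^2       # ^ binds tighter than unary minus: -(2^2)
b  <- (-2)^2     # parentheses force the other reading
c1 <- 2 * 1:3    # ":" binds tighter than "*": 2 * (1:3)
c2 <- 1:3^2      # "^" binds tighter than ":": 1:(3^2)
d  <- 2^3^2      # "^" groups right to left: 2^(3^2)
```

When in doubt, parentheses cost nothing and make the intended grouping explicit.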
Hands-on tasks 3
1. Compare:
1:-3
1:(-3)
-1:3
-(1:3)
2. Compare:
2^1/2
2^(1/2)
3.? Be aware of floating point arithmetic:
pi==3.14159265358979
pi==3.141592653589793
pi==3.141592653589793116
Part 2
Data handling and storage
• Objects
• Indexing
• Functions
• Reading from files
Objects
R uses the following “core” objects:
• vectors
• matrices
• arrays
• factors
• lists
• data frames
• functions
Objects: vectors
Intrinsic attributes: mode and length
> v <- 1:4
> v
[1] 1 2 3 4
The mode is one of logical, numeric, complex, character (or raw).
> length( v)
[1] 4
> mode( v)
[1] "numeric"
> mode( 1i) # to give another example
[1] "complex"
The mode numeric has storage mode integer or double.
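The distinction between mode and storage mode can be checked directly with storage.mode and typeof. A short sketch; v and w are illustrative vectors holding the same values:

```r
v <- 1:4               # created with ":", stored as integer
w <- c(1, 2, 3, 4)     # created from numeric constants, stored as double

mode(v); mode(w)       # both "numeric"
storage.mode(v)        # "integer"
storage.mode(w)        # "double"
typeof(v)              # "integer", the same information as storage.mode here
```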
Hands-on tasks 4
1. All elements of a vector are of the same mode.
What is the mode of c("char", pi), c(2,1i)?
2. Interpret the result of sqrt(-1) and sqrt(-1+0i)
3. is.integer and as.integer query and coerce to integer format.
What is the storage mode of the output of length (two ways to verify)?
4.? Compare the results of identical(1,1.0) and
identical( as.integer(1),1.0)
5.? What is the result and storage mode of 3L, 3L*1, 3L*1L, 3L/1L,
3L/3L?
Objects: vectors: generation
Concatenation operator:
> v <- c( 1, 2, 3, 4)
Generate sequences (several additional possibilities exist):
> seq( 4) # identical to 1:4
[1] 1 2 3 4
> seq( 1, 12, by=2)
[1] 1 3 5 7 9 11
> seq( 1, by=2, length.out=12)
[1] 1 3 5 7 9 11 13 15 17 19 21 23
> rep( 1:4, 2) # identical to rep.int( 1:4, 2)
[1] 1 2 3 4 1 2 3 4
> rep( 1:4, each=2)
[1] 1 1 2 2 3 3 4 4
> rep( 1:4, 2:5) # identical to rep( 1:4, times=2:5)
[1] 1 1 2 2 2 3 3 3 3 4 4 4 4 4
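Among the additional possibilities, seq_along and seq_len are worth knowing because they behave sensibly for empty vectors. A minimal sketch with made-up vectors x and empty:

```r
x <- c(10, 20, 30)

seq_along(x)     # 1 2 3, one index per element of x
seq_len(3)       # 1 2 3
rev(seq_len(3))  # 3 2 1

# For an empty vector the difference to ":" matters:
empty <- numeric(0)
seq_along(empty)      # integer(0), a loop over it runs zero times
1:length(empty)       # 1 0, usually not what is intended
```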
Hands-on tasks 5
1. Interpret the output of the following calls:
seq( from=1, to=13, by=2)
seq( from=1, to=13, length.out=3)
seq( from=1, by=2, length.out=3)
seq( from=1, to=12, by=2, length.out=3)
2. What calls generate the sequence: 1, 4, 4, 7, 7, 7, 10, 10, 10,
10, 13, 13, 13, 13, 13?
3. Create a sequence containing TRUE and FALSE according to the
parity of the last sequence.
4. Why is it not advisable to use the command: c <- c(1, 2, 3, 4)?
Objects: matrices
A vector with (minimal) attribute dim
> m <- matrix( 1:16, 4, 4)
> m
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
> length( m)
[1] 16
> attributes( m)
$dim
[1] 4 4
Objects: matrices
A matrix can contain additional attributes
> rownames( m) <- paste( "r", 1:4, sep="")
> attributes( m)
$dim
[1] 4 4
$dimnames
$dimnames[[1]]
[1] "r1" "r2" "r3" "r4"
$dimnames[[2]]
NULL
The function attr( object, name) can be used to specify an attribute:
> attr( m, "dim") <- c(2, 8) # What is the result?
Objects: matrices: generation
> m1 <- matrix( 1:8, nrow=4, ncol=4, byrow=TRUE) # recycling
> m2 <- diag( 1:4)
> m3 <- cbind( 1:3, 2:4, 1)
> m3
[,1] [,2] [,3]
[1,] 1 2 1
[2,] 2 3 1
[3,] 3 4 1
> t( m3) # transpose
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 3 4
[3,] 1 1 1
Hands-on tasks 6
1. What is the effect of dim( m) <- c( 2, 8)? Try other values.
2. What is the result of
matrix( 1:7, nrow=4, ncol=4)
diag( m1)
rbind( 1:3, 2:4, 1)
cbind( rbind( 1:2, 3:4), 0) ?
3. Construct a block diagonal matrix with 2 blocks of sizes 2×2.
Objects: arrays
Arrays are higher-dimensional “matrices”
> a <- array( 1:24, c( 3, 4, 2))
> a
, , 1
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
, , 2
[,1] [,2] [,3] [,4]
[1,] 13 16 19 22
[2,] 14 17 20 23
[3,] 15 18 21 24
Hands-on tasks 7
1. What is the length of a?
2. What are its attributes?
3. aperm is the generalization of t.
Trace the elements of aperm(a,c(2,1,3)) and aperm(a,c(3,2,1)).
Objects: factors
Strange concept, neither numeric nor character.
> as.factor( 1:3)
[1] 1 2 3
Levels: 1 2 3
> as.factor( 1:3) + 1
[1] NA NA NA
Used in the context of categorical data.
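A factor stores integer codes together with a set of labels (the levels), which is why arithmetic on it yields NA. A minimal sketch of how to get at both parts, using a hypothetical factor f:

```r
f <- factor(c("10", "20", "10"))

levels(f)                               # "10" "20"
codes  <- as.integer(f)                 # 1 2 1: the internal level codes
values <- as.numeric(as.character(f))   # 10 20 10: the labels, converted to numbers

table(f)                                # counts per category
```

Converting via as.character first is the usual way to recover numbers that were read in as a factor; as.integer alone returns the codes, not the labels.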
Objects: lists
A vector whose elements can be of ‘any’ type.
> l <- list(1:2, as.factor(1:2), paste(1:2))
> l
[[1]]
[1] 1 2
[[2]]
[1] 1 2
Levels: 1 2
[[3]]
[1] "1" "2"
> length(l)
[1] 3
Objects: data frames
Matrix-like structures, in which the columns can be of different types.
> d <- data.frame( m)
> d
X1 X2 X3 X4 X5 X6 X7 X8
1 1 3 5 7 9 11 13 15
2 2 4 6 8 10 12 14 16
> attributes( d)
$names
[1] "X1" "X2" "X3" "X4" "X5" "X6" "X7" "X8"
$row.names
[1] 1 2
$class
[1] "data.frame"
Objects: data frames
While rownames and colnames are for matrices, names and row.names are
for data frames.
> names( d)
[1] "X1" "X2" "X3" "X4" "X5" "X6" "X7" "X8"
> row.names( d)
[1] "1" "2"
Luckily, the former work as well:
> colnames( d)
[1] "X1" "X2" "X3" "X4" "X5" "X6" "X7" "X8"
> rownames( d)
[1] "1" "2"
In general, work with dimnames.
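The advantage of dimnames is that the same call works for a matrix and for a data frame. A small sketch with a fresh 2×2 example; mm and dd are illustrative objects, not the m and d used on the slides:

```r
mm <- matrix(1:4, 2, 2, dimnames = list(c("r1", "r2"), c("c1", "c2")))
dd <- data.frame(mm)

dimnames(mm)          # list of row names and column names of the matrix
dimnames(dd)          # the same two components for the data frame

dimnames(dd)[[2]]     # column names, equivalent to names(dd)
```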
Hands-on tasks 8
1. Can factors be ordered?
2. What is the difference between l[1] and l[[1]] ?
(use is.list(..) to probe the result).
3. Internally, a data.frame is a list with class data.frame .
Check d[[3]] .
4. What is the length of d? Is the result intuitive?
Objects: functions
R is built upon itself. Many of the functions are “visible”:
> sd
function (x, na.rm = FALSE)
sqrt(var(if (is.vector(x)) x else as.double(x), na.rm = na.rm))
<bytecode: 0x25d9408>
<environment: namespace:stats>
More later . . .
Objects: coercion and testing
An object type obj usually has three associated functions:
obj(), as.obj(), and is.obj().
> is.matrix( a)
[1] FALSE
> as.matrix( v) # here equivalent to "matrix(v)"
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
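The same constructor / coercion / test pattern holds for the basic vector types. A minimal sketch for numeric and character; x, s and back are illustrative names:

```r
x <- numeric(3)              # constructor: a numeric vector of length 3, filled with 0
is.numeric(x)                # TRUE
is.character(x)              # FALSE

s <- as.character(x)         # coercion: "0" "0" "0"
is.character(s)              # TRUE

back <- as.numeric(s)        # and back again: 0 0 0
```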
Hands-on tasks 9
1. Notice the difference between matrix( a, nrow=3)
and as.matrix( a, nrow=3)
2. What is the result of c( 0, NULL, 3),
is.array( m), is.matrix( m)
is.array( a), is.matrix( a)
3. Not all coercions work. What is the result of
as.integer( pi)
as.integer( 2i)
as.numeric( "a")
Objects: summary
Source: RI
Indexing
Basically, extraction is done via the [ operator:
> v
[1] 1 2 3 4
> v[1]
[1] 1
> v[-c(2:3)] # or v[-c(2,3)] or v[-(2:3)]
[1] 1 4
Similarly, replacement is done via the [<- operator:
> v[ 1] <- 1.1
> v[-c(2:3)] <- c(2.2, 3.3)
> v
[1] 2.2 2.0 3.0 3.3
Indexing: vectors
Extraction is done via the [ operator:
> v
[1] 2.2 2.0 3.0 3.3
> v[ c(1, 4)]
[1] 2.2 3.3
> v[-c(1, 4)]
[1] 2 3
> v[c(TRUE, FALSE, TRUE, FALSE)]
[1] 2.2 3.0
> v[c(TRUE, FALSE, TRUE)] # note the recycling!
[1] 2.2 3.0 3.3
Extraction for (very) long vectors:
> tail( v, 2)
[1] 3.0 3.3
> head( v, -1)
[1] 2.2 2.0 3.0
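In practice the logical index is rarely typed by hand; it usually comes from a comparison. A short sketch on the same values of v; keep, big and pos are illustrative names:

```r
v <- c(2.2, 2.0, 3.0, 3.3)

keep <- v > 2.5          # FALSE FALSE TRUE TRUE
big  <- v[keep]          # 3.0 3.3
pos  <- which(keep)      # 3 4: the positions instead of the values

v[v > 2.5] <- 0          # a condition can also select elements for replacement
```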
Indexing: matrices
> m <- matrix( 1:16, 4, 4)
> m[2, 3]
[1] 10
> m[1,]
[1] 1 5 9 13
> m[,1]
[1] 1 2 3 4
> m[ c(1,8,12)] # ordered columnwise
[1] 1 8 12
> m[ c(1,2,4), c(4,2,1)] # note the ordering
[,1] [,2] [,3]
[1,] 13 5 1
[2,] 14 6 2
[3,] 16 8 4
> m[cbind( c(1,2,4), c(4,2,1))] # What is the result when using rbind?
[1] 13 6 4
Indexing: matrices
If the matrix has appropriate dimnames attributes:
> rownames( m) <- paste( "r", 1:4, sep="")
> m
[,1] [,2] [,3] [,4]
r1 1 5 9 13
r2 2 6 10 14
r3 3 7 11 15
r4 4 8 12 16
> m["r1",]
[1] 1 5 9 13
> m[,1, drop=FALSE]
[,1]
r1 1
r2 2
r3 3
r4 4
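The argument drop=FALSE matters whenever later code expects a matrix: by default, extracting a single row or column discards the dim attribute. A minimal sketch; col_vec and col_mat are illustrative names:

```r
m <- matrix(1:16, 4, 4)
rownames(m) <- paste("r", 1:4, sep = "")

col_vec <- m[, 1]                 # default drop=TRUE: a plain (named) vector
col_mat <- m[, 1, drop = FALSE]   # stays a 4 x 1 matrix

dim(col_vec)                      # NULL
dim(col_mat)                      # 4 1
```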
Indexing: matrices
Extract or replace the diagonal values:
> n <- min( dim( m))
> diag( m)
[1] 1 6 11 16
> diag( m) <- -(1:n)
How to extract the values above the diagonal?
> m[ (1:(n-1))*(n+1)]
[1] 5 10 15
> m
[,1] [,2] [,3] [,4]
r1 -1 5 9 13
r2 2 -2 10 14
r3 3 7 -3 15
r4 4 8 12 -4
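The index expression for the values above the diagonal follows from column-wise storage. The sketch below spells out the arithmetic and shows one alternative based on row() and col(); by_position and by_rowcol are illustrative names, and the matrix is rebuilt so the example is self-contained:

```r
m <- matrix(1:16, 4, 4)
n <- nrow(m)

# Element (i, j) of an n-row matrix sits at linear position i + (j - 1) * n.
# For the entries just above the diagonal, j = i + 1, which gives i * (n + 1).
by_position <- m[(1:(n - 1)) * (n + 1)]

# The same entries, selected by comparing row and column indices.
by_rowcol <- m[row(m) == col(m) - 1]
```

The row()/col() form is longer but does not rely on remembering the storage order.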
Hands-on tasks 10
1. Suppose that m only has rownames, interpret the result of m[,"c1"].
2. Use diag to extract the values above the diagonal.
3. Set the values of m below the diagonal to -1.
4. Compare m[cbind( c(1,2,4), c(4,2,1))] with the result when using
rbind instead.
Indexing: lists
Extraction is done via the [, [[ and $ operators:
> l[[1]]
[1] 1 2
> l[1]
[[1]]
[1] 1 2
> ll <- list( a=2, b=3, cde=10)
> ll$a
[1] 2
> ll$c # note the partial matching
[1] 10
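The difference between the three operators can be sketched as follows. The list l below is a hypothetical stand-in (its definition is not shown on this slide); ll is the list from the slide:

```r
# Hypothetical list standing in for l
l <- list(1:2, "text")

class(l[1])     # "list": single brackets return a sub-list
class(l[[1]])   # "integer": double brackets return the element itself

ll <- list(a = 2, b = 3, cde = 10)
ll$c                       # partial matching finds cde
ll[["c"]]                  # NULL: [[ matches names exactly by default
ll[["c", exact = FALSE]]   # partial matching must be requested explicitly
```

Partial matching with $ is convenient interactively, but spelling names out in full is safer in scripts.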

Indexing: data frames
Column extraction is also possible with the $ operator:
> d$X1 # a data frame is primarily a list!
[1] 1 2
> d[,1]
[1] 1 2
> d[,"X1"]
[1] 1 2
Similarly:
> d[1,]
X1 X2 X3 X4 X5 X6 X7 X8
1 1 3 5 7 9 11 13 15
> d["1",]
X1 X2 X3 X4 X5 X6 X7 X8
1 1 3 5 7 9 11 13 15
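A short sketch of the list-style and matrix-style access. The construction of d is an assumption chosen to reproduce the printed output (two rows, columns X1 to X8):

```r
# Assumed construction that reproduces the printed d
d <- data.frame(matrix(1:16, nrow = 2))

d$X1                   # column as a vector (list-style)
d[["X1"]]              # same
d[, 1]                 # same (matrix-style)
d[, 1, drop = FALSE]   # stays a one-column data frame
d[1, ]                 # a row stays a data frame, since columns may differ in type
```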

Indexing: other details
I Matrices are stored column-wise.
I Arrays are stored with the first index varying fastest.
I Objects can have length zero, e.g. v[0].
I Indexing starts at one; an index vector may instead consist entirely of
negative values, which exclude the corresponding elements.
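These points can be checked directly. A minimal sketch; the objects A, arr and v are illustrative:

```r
A <- matrix(1:6, nrow = 2)      # filled column by column
as.vector(A)                    # 1 2 3 4 5 6: the storage order
A[2, 3] == A[2 + (3 - 1) * 2]   # element (i, j) sits at position i + (j - 1) * nrow

arr <- array(1:24, dim = c(2, 3, 4))
arr[2, 1, 1]                    # 2: the first index varies fastest

v <- c(10, 20, 30)
length(v[0])                    # 0: a zero-length vector
v[-1]                           # 20 30: negative indices exclude elements
v[-c(1, 3)]                     # 20
```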

Hands-on tasks 11
1. What happens if ll <- list( a=2, b=3, cd=10, ce=12) is indexed
with ll$c?
2. What elements are extracted with m[1:6], a[1:4*2]?
3. Let exist <- 1:14. What elements are extracted with exist[-c(1:3)],
exist[c(1:3)]? What is the result of exist[-1:3]?
4.? Examine the code
nonexist[2] <- 1
nonexist <- numeric(0)
length(nonexist)
nonexist[0]
nonexist[1]
nonexist[2] <- 1
nonexist

Functions
Example:
> x <- mean( x, trim=.1)
General structure:
> res <- fcn( defarg1, defarg2,..., optarg1, optarg2, ...)
I The result res may be NULL.
I Required arguments are matched by position and need to be in order.
I Optional arguments are matched by name.
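A small sketch of these rules using mean, the function from the example above. The data vector x is illustrative:

```r
x <- c(1, 2, 3, 4, 100)

mean(x)                    # 22: x is matched by position
mean(x, trim = 0.2)        # 3: the optional argument is matched by name
mean(trim = 0.2, x = x)    # 3: named arguments may be given in any order

res <- cat("hello\n")      # cat prints its arguments and returns NULL
is.null(res)
```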

Functions: “Math” group
Math(x, ...): abs, sign, sqrt, floor, ceiling, trunc
round, signif, exp, log, expm1, log1p
cos, sin, tan, acos, asin, atan
cosh, sinh, tanh, acosh, asinh, atanh
lgamma, gamma, digamma, trigamma
cumsum, cumprod, cummax, cummin
Ops(e1, e2): "+", "-", "*", "/", "^", "%%", "%/%"
"&", "|", "!"
"==", "!=", "<", "<=", ">=", ">"
Summary(..., na.rm=FALSE): all, any, sum, prod, min, max, range
Complex(z): Arg, Conj, Im, Mod, Re

Hands-on tasks 12
1. What is the result of min( c( 1, 3, NA)) ?
Does it differ from min( 1, 3, NA) ?
How do you obtain the result 1?
2. What is the result of 17 %% 7 and 17 %/% 7 ? Why?
3.? It is possible to define functions without a function name:
(function(x,y) { z <- x**2 + y**2; x+y+z } )(0:7, 1)

Functions: matrices
For matrices, special operators are defined:
> m1 <- m2 <- matrix(1, 2, 2)
> m1[2, 2] <- 2
> m1 %*% m2
[,1] [,2]
[1,] 2 2
[2,] 3 3
> solve( m1)
[,1] [,2]
[1,] 2 -1
[2,] -1 1
> det( m1)
[1] 1
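The results above can be verified with a few extra calls. This is only a sketch; the right-hand side c(1, 2) is an arbitrary illustrative choice:

```r
m1 <- matrix(1, 2, 2)
m1[2, 2] <- 2
m2 <- matrix(1, 2, 2)

m1 * m2                 # element-wise product, not the matrix product
m1 %*% m2               # matrix product
m1 %*% solve(m1)        # identity matrix, up to rounding
solve(m1, c(1, 2))      # solves m1 %*% z = c(1, 2) without forming the inverse
crossprod(m1, m2)       # t(m1) %*% m2
```

Passing the right-hand side to solve is usually preferable to computing the inverse explicitly and multiplying.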

Functions: matrices: factorization
> svd( m1) # X = U D V'
$d
[1] 2.618034 0.381966

$u
           [,1]       [,2]
[1,] -0.5257311 -0.8506508
[2,] -0.8506508  0.5257311

$v
           [,1]       [,2]
[1,] -0.5257311 -0.8506508
[2,] -0.8506508  0.5257311

> chol( m1) # X = R' R
     [,1] [,2]
[1,]    1    1
[2,]    0    1
> eigen( m1) # X = G D G' ## We see eigen and chol again!
$values
[1] 2.618034 0.381966

$vectors
          [,1]       [,2]
[1,] 0.5257311 -0.8506508
[2,] 0.8506508  0.5257311

Functions: matrices: factorization
> qr( m1)
$qr
           [,1]       [,2]
[1,] -1.4142136 -2.1213203
[2,]  0.7071068  0.7071068

$rank
[1] 2

$qraux
[1] 1.7071068 0.7071068

$pivot
[1] 1 2

attr(,"class")
[1] "qr"
There are several additional associated functions: qr.qy, qr.qty, qr.Q, qr.R, . . .
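The compact qr object is rarely used directly; the helper functions extract what is needed. A brief sketch with m1 rebuilt so that the block is self-contained; dec, Q and R are illustrative names:

```r
m1 <- matrix(c(1, 1, 1, 2), 2, 2)
dec <- qr(m1)

Q <- qr.Q(dec)          # orthogonal factor
R <- qr.R(dec)          # upper triangular factor
Q %*% R                 # recovers m1 (no column pivoting occurred, see $pivot)
crossprod(Q)            # identity: the columns of Q are orthonormal
qr.qty(dec, c(1, 2))    # t(Q) %*% c(1, 2), computed without forming Q
qr.solve(m1, c(1, 2))   # solves m1 %*% z = c(1, 2) via the QR decomposition
```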

Hands-on tasks 13
Let M <- m1 %*% t( m1)
1. What is the eigendecomposition of M ?
2. What are the singular values of the same matrix?
3. Propose several approaches to construct an inverse of
M + diag( 2)
4. How can you calculate the trace of an arbitrary matrix A ?

Functions: probability distributions
Function names are constructed from a prefix and a root:
I prefix: d density, p CDF, q quantile, r random numbers
I root: beta, binom, pois, norm, t, and many more
For example:
> runif( 5)
[1] 0.2282756 0.1472576 0.8364201 0.8430635 0.0640814
> dnorm( 0)
[1] 0.3989423
> qt( 0.975, df=1)
[1] 12.7062
The parametrizations are “quite” standard; consult the help for details.
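A few more prefix and root combinations, as a sketch. The numeric arguments are arbitrary illustrative choices:

```r
pnorm(1.96)                        # CDF: P(Z <= 1.96), about 0.975
qnorm(0.975)                       # quantile: about 1.96
qnorm(pnorm(1.3))                  # 1.3: the q and p functions are inverses

dbinom(2, size = 5, prob = 0.5)    # P(X = 2) for X ~ Binomial(5, 0.5)
pbinom(2, size = 5, prob = 0.5)    # P(X <= 2)

rnorm(3, mean = 10, sd = 2)        # three random draws; values differ between runs
```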

Functions: apply
apply applies a function to the margins (1 = rows, 2 = columns) of an array or matrix.
> d
X1 X2 X3 X4 X5 X6 X7 X8
1 1 3 5 7 9 11 13 15
2 2 4 6 8 10 12 14 16
> apply( d, 2, mean)
X1 X2 X3 X4 X5 X6 X7 X8
1.5 3.5 5.5 7.5 9.5 11.5 13.5 15.5
> apply( d, 1, range)
[,1] [,2]
[1,] 1 2
[2,] 15 16
> apply( d, 1, function(x, tr) { x[2] - mean(x, trim=tr)}, tr=.4)
[1] -5 -5
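The shape of the result depends on what the applied function returns. A sketch, again assuming that d was built as below:

```r
d <- data.frame(matrix(1:16, nrow = 2))   # assumed construction of d

apply(d, 2, mean)      # margin 2: one value per column
colMeans(d)            # same values, with a specialized function
apply(d, 1, sum)       # margin 1: one value per row
rowSums(d)             # same values

# A function returning k values per row yields a matrix with k rows and
# one column per row of d, hence the layout of apply(d, 1, range)
r <- apply(d, 1, range)
dim(r)
```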

Hands-on tasks 14
1. Draw a normal sample of size 100 and draw a histogram of the
sample.
What are the mean and standard deviation of the sample?
2. Repeat the previous exercise 1000 times and calculate the mean
of the means and the standard deviations.
3. How do the results compare to the ones from your peers?
Is there a way to “homogenize” the procedure?

Reading from files: data files
I Several possibilities of reading ASCII files:
> read.table(file, header = FALSE, sep = "")
> read.csv(file, header = TRUE, sep = ",", quote="\"")
> scan(file, ...)
I scan is a powerful (complex) alternative.
I Fixed-width formatted files are read with read.fwf.
I Common open source storage formats are supported:
netCDF, GRIB, HDF, . . .
(specific packages need to be loaded).
I Base R cannot read Excel files directly (non-free format); export
the data as text or use an add-on package.
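The reading functions can be tried without external data by writing small files first. A sketch under that assumption; the file contents and the names tmp, fw and dat1 to dat3 are made up for illustration:

```r
# Write a small illustrative comma-separated file to a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,height,group",
             "1,1.75,a",
             "2,1.62,b"), tmp)

dat1 <- read.csv(tmp)                              # header = TRUE, sep = "," by default
dat2 <- read.table(tmp, header = TRUE, sep = ",")  # same data via read.table
str(dat1)

# Fixed-width file: widths gives the number of characters per field
fw <- tempfile()
writeLines(c("A01 1.5", "B02 2.5"), fw)
dat3 <- read.fwf(fw, widths = c(1, 2, 4))
dat3

unlink(c(tmp, fw))                                 # remove the temporary files
```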

Reading from files: R code/objects
I R “source code” is read and evaluated with source("filename.R")
I R data files are read with load("file.RData")
I To save R objects use
> save.image()
> save(..., file="file.RData") # symbols or character strings
Note that R asks whether to save the workspace image when quitting.
I data() lists the datasets of all packages on the search path (directly
available).
data( package=.packages( all.available=TRUE)) lists the datasets of all
installed packages.
I data( name, package="packagename") loads name from the package
packagename.
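A round trip through an R-native file can be sketched as follows. Temporary files are used so that nothing is left behind; the objects x and y are illustrative:

```r
x <- 1:5
y <- list(a = "text", b = pi)

f <- tempfile(fileext = ".RData")
save(x, y, file = f)                  # objects given as symbols ...
save(list = c("x", "y"), file = f)    # ... or as character strings via list=

rm(x, y)
loaded <- load(f)     # restores the objects under their original names
loaded                # load returns the names of the restored objects

# saveRDS and readRDS store a single object without its name
g <- tempfile(fileext = ".rds")
saveRDS(y, g)
y2 <- readRDS(g)
```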

Hands-on tasks 15
On www.math.uzh.ch/furrer/software/workshop/ the three datasets
data1.dat, data2.dat and data3.dat are available (use the entire link).
1. Download the datasets and look at the content thereof.
What are the differences?
2. Load these three datasets into R, keeping the column and
row names of the original data.
Try specifying the URL directly instead of the filename. What
do you notice?
3. Save one of the datasets in R-native format.
4.? Are there ways to reduce the file size?
Part 3
Plotting
I Plotting in R
I High-level plotting (HLP) functions
I Low-level plotting (LLP) functions
I Interactive graphics functions
I Graphical parameters

Plotting in R
R distinguishes different types of plotting functions:
I High-level plotting (HLP) functions create a new plot on the
graphics device, possibly with axes, labels, titles and so on.
I Low-level plotting (LLP) functions add more information to an
existing plot, such as extra points, lines and labels.
I Interactive graphics functions allow you to interactively add
information to, or extract information from, an existing plot, using a
pointing device such as a mouse.
R maintains a list of graphical parameters which can be manipulated
to customize your plots.

Plotting in R: workflow
General workflow:
1. Choosing a device (screen, PDF file, . . . )
2. Setting graphical parameters
3. Calling a high-level plotting function
4. Calling low-level plotting functions
5. More calls to high-level and low-level functions
6. Closing the device
Simplest example (i.e., step 3 only):
> plot(0)

Plotting in R: workflow: example
> x <- rnorm( 100) # 100 random numbers
> pdf( "figure1.pdf") # Output to a PDF file
> par( mfrow=c(1, 2)) # Two panels for this plot
> hist( x) # high-level call
> abline( v=mean( x)) # low-level call
> qqnorm( x) # second high-level call
> dev.off() # close the device
produces:
[Figure: two panels. Left: "Histogram of x" (Frequency against x) with a vertical line at the mean. Right: "Normal Q-Q Plot" (Sample Quantiles against Theoretical Quantiles).]

Plotting in R: workflow
I If no device is open, the default one will be used (usually screen).
I When producing files, dev.off() is required.
I Each new high-level plot overwrites the current plotting area, unless
specified otherwise (usually with add=TRUE).
I Several devices can be open, but only one is active. Use dev.cur()
and dev.set() to query and set the active device.
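Handling several devices can be sketched as follows. Two PDF devices writing to temporary files are assumed here so that the example also runs without a screen device; f1, f2, d1 and d2 are illustrative names:

```r
f1 <- tempfile(fileext = ".pdf")
f2 <- tempfile(fileext = ".pdf")

pdf(f1); d1 <- dev.cur()    # first device, active after opening
pdf(f2); d2 <- dev.cur()    # second device becomes the active one

dev.set(d1)                 # make the first device active again
plot(1:10)                  # high-level call, drawn into f1
abline(h = 5)               # low-level call, added to the same plot

dev.set(d2)
hist(c(1, 2, 2, 3, 3, 3))   # drawn into f2

dev.off(d1)                 # closing the devices writes the files
dev.off(d2)
file.exists(c(f1, f2))
```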

HLP functions
Common high-level plotting functions:
plot(x, y)         most basic plotting command, flexible
hist(x)            histogram (specify breaks for discrete data)
boxplot(x)         boxplot of one or several variables
qqnorm(y)          quantile-quantile plot (empirical vs normal)
qqplot(x, y)       quantile-quantile plot (empirical vs arbitrary)
pairs(x)           scatterplots for multidimensional data
curve(expr)        plots a function
image(x, y, z)     z = f(x, y) is provided in a matrix
contour(x, y, z)   z = f(x, y) is provided in a matrix
persp(x, y, z)     basic 3D plotting with shading

Hands-on tasks 16
1. Draw a random sample of size 15 from a normal distribution.
Plot a histogram and superimpose the true density.
2. Repeat the experiment 100 times and superimpose a histogram
of the means.

HLP functions: 3D plotting
Consider $X_1, \dots, X_n \overset{\text{iid}}{\sim} \mathcal{N}(\mu, \sigma^2)$.
Investigate the likelihood function $L(\mu, \sigma) = \prod_{i=1}^{n} f_X(x_i; \mu, \sigma)$.
For numerical stability, we work with the log-likelihood.
> mu <- 2
> sigma <- 2
> n <- 20
> x <- rnorm(n,mu,sigma)
> loglikelihood <- function(pars, x) {
+ return( sum( dnorm( x, pars[1], pars[2], log=T) ) )
+ }
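Why the logarithm matters numerically can be seen with a larger sample: the product of many densities underflows to zero in double precision, whereas the sum of log-densities stays finite. A small sketch; the sample size 2000 and the seed are arbitrary illustrative choices:

```r
set.seed(42)                          # arbitrary seed, for reproducibility only
z <- rnorm(2000)

lik    <- prod(dnorm(z))              # product of 2000 densities
loglik <- sum(dnorm(z, log = TRUE))   # sum of the log-densities

lik        # 0: the product underflows
loglik     # a finite negative number
```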

HLP functions: 3D plotting
Evaluate the log-likelihood over a grid
> ns <- 50
> m <- seq( 1, to=4, length=ns)
> s <- seq( 1, to=5, length=ns)
> grid <- expand.grid( m, s)
> ll <- apply( grid, 1, loglikelihood, x=x) # What is ll?
> llmat <- matrix( ll, ns) # What is dim(llmat)? Why?
> image( m, s, llmat)
[Figure: image plot of llmat, with m on the horizontal and s on the vertical axis]
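The reshaping step relies on the ordering of expand.grid: its first argument varies fastest, which matches the column-wise filling of matrix. A sketch on a tiny grid; the values of m and s and the function f are illustrative stand-ins:

```r
m <- c(1, 2, 3)          # small stand-ins for the grids on the slide
s <- c(10, 20)
grid <- expand.grid(m, s)
grid                     # Var1 (m) runs fastest, Var2 (s) slowest

f <- function(pars) pars[1] + pars[2]     # illustrative function of (m, s)
val <- apply(grid, 1, f)                  # one value per grid row
mat <- matrix(val, nrow = length(m))      # column-wise filling matches the grid order

dim(mat)                 # length(m) x length(s)
mat[2, 1]                # f evaluated at m[2], s[1]
```

Rows of the resulting matrix therefore correspond to m and columns to s, which is the layout that image, contour and persp expect.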

HLP functions: 3D plotting
> ncol <- 64
> mx <- unlist( grid[ which.max( ll),])
> image( m, s, llmat, col=topo.colors(ncol),
+ xlab=expression(mu), ylab=expression(sigma))
> abline( v=mx[1], h=mx[2])
[Figure: image plot of llmat in topo colors, axes labelled µ and σ, with lines marking the grid maximum]

HLP functions: 3D plotting
> image( m, s, llmat, col=topo.colors(64),
+ xlab=expression(mu), ylab=expression(sigma))
> abline( v=mx[1], h=mx[2])
> box()
> contour( m, s, llmat, add=T)
[Figure: the same image plot with a box and contour lines of the log-likelihood (levels from −70 to −40)]

HLP functions: 3D plotting
> persp( m, s, llmat)
[Figure: perspective plot of llmat over m and s]

HLP functions: 3D plotting
> persp( m, s, llmat, phi=45, theta=30)
[Figure: perspective plot of llmat over m and s, rotated view]

HLP functions: 3D plotting
> persp( m, s, llmat, phi=45, theta=30, axes=FALSE, box=FALSE)

HLP functions: 3D plotting
> zfacet <- llmat[-1,-1]+llmat[-1,-ns]+llmat[-ns,-1]+llmat[-ns,-ns]
> facetcol <- cut( zfacet, ncol)
> brcol <- colorRampPalette( c("white","yellow", "red") )
> persp( m, s, llmat, phi=45, theta=30, axes=FALSE, box=FALSE,
+ col=brcol( ncol)[facetcol]) -> out

HLP functions: 3D plotting
> persp( m, s, llmat, phi=45, theta=30, axes=FALSE, box=FALSE,
+ col=brcol(ncol)[facetcol], border=NA)
> points( trans3d(mx[1], mx[2], max( llmat), out), cex=4, col=4,
+ pch=4)

HLP functions: 3D plotting
The grid search gives the maximum:
> c( value=max(ll), mx)
value Var1 Var2
-39.924118 2.408163 1.816327
Numerical optimum is at:
> par <- optim( mx, function( theta) -loglikelihood( theta, x),
+ method="L-BFGS-B", lower=c( -Inf, 0))
> c( value=-par[["value"]], par[["par"]])
value Var1 Var2
-39.913951 2.381047 1.780258
> par[4] # _ALWAYS_ check!
$convergence
[1] 0
optim minimizes by default, hence the sign change above. To maximize directly, set control$fnscale to a negative value.
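A sketch of the fnscale variant, compared with the closed-form maximum likelihood estimates of the normal model. The seed, the starting values c(1, 1) and the lower bound 0.01 for the standard deviation are assumptions made for this illustration:

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
x <- rnorm(20, mean = 2, sd = 2)

loglikelihood <- function(pars, x) {
  sum(dnorm(x, pars[1], pars[2], log = TRUE))
}

# fnscale = -1 turns optim's minimization into a maximization
fit <- optim(c(1, 1), loglikelihood, x = x,
             method = "L-BFGS-B", lower = c(-Inf, 0.01),
             control = list(fnscale = -1))

fit$convergence                    # 0 indicates successful convergence
fit$par                            # numerical maximizer (mu, sigma)

# Closed-form maximum likelihood estimates for comparison
mle <- c(mean(x), sqrt(mean((x - mean(x))^2)))
mle
```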

Hands-on tasks 17
Draw a random sample of size two from a normal density.
1. Plot the log-likelihood as a function of x1 and x2.
2. Plot the log-likelihood as a function of µ and σ.

LLP functions
Common low-level plotting functions:
points, lines      similar to plot
title              main/sub above/below the panel
abline             v, h, or intercept/slope
text               like points with text instead
mtext              quite flexible
legend             flexible through many parameters
axis               add additional axis (see xaxt, yaxt)
box                around the panel
arrows, segments   . . .
polygon            . . .
rect               . . .
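Several of these functions combined in one plot, as a sketch. An off-screen device is assumed (pdf(NULL) creates no file); the curves and labels are illustrative:

```r
pdf(NULL)                               # off-screen device; no file is written
x <- seq(0, 2 * pi, length.out = 50)

plot(x, sin(x), type = "l")             # high-level call sets up the plot
lines(x, cos(x), lty = 2)               # add a second curve
points(pi / 2, 1, pch = 19)             # mark the maximum of sin
abline(h = 0, v = pi, col = "grey")     # horizontal and vertical reference lines
text(pi / 2, 0.8, "max of sin")         # text at plot coordinates
mtext("added with mtext", side = 3)     # text in the top margin
legend("bottomleft", legend = c("sin", "cos"), lty = 1:2)
title(sub = "low-level additions")
invisible(dev.off())
```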

Interactive graphics functions
I locator(n=512):
returns the coordinates of the graphics cursor each time the left
mouse button is pressed (up to n positions).
I identify(x, y, n=length(x)):
after each left mouse click, finds the point among (x, y) closest to
the cursor position. Returns the indices of the identified points.
I Both functions quit when any other button is pressed.
I For more interaction, use package rgl.

Graphical parameters
The function par queries and sets plotting parameters (similar to
options for “system” parameters).
> par("bty") # Frame is a rectangle
[1] "o"
> par(bty="n") # no frame/box is drawn
> par("bty")
[1] "n"
Many options are available, see for example:
> par()
?par is my most frequent help call.
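When setting parameters, par invisibly returns their previous values, which makes it easy to restore them afterwards. A sketch on an off-screen device; the names old, bty_during and bty_after are illustrative:

```r
pdf(NULL)                          # off-screen device; no file is written

old <- par(bty = "n", las = 1)     # set parameters and keep the old values
bty_during <- par("bty")           # "n"

par(old)                           # restore the previous settings
bty_after <- par("bty")            # back to the default "o"

invisible(dev.off())
```

Parameters belong to a device, so a newly opened device starts again from the defaults.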

Graphical parameters
Further parameters:
adj            text adjustment (.5 is default, centering)
bg, fg         background and foreground (default) color
cex, cex.*     magnification of text and symbols relative to the default
col, col.*     color specification (numbers 0:7, words, rgb hex string)
las            rotation style of axis labels
lty            line type (1=solid, 2=dashed, 3=dotted, ...)
lwd            line width
mfrow, mfcol   array of subplots filled by row/column
new            if TRUE the next HLP will not clean the frame
pch            specifying the symbol used for points
pty            if s use square plotting area
xaxs, yaxs     i for precise axis bounds
xaxt, yaxt     n to suppress axis drawing
xlog, ylog     if TRUE use logarithmic scale
where “*” is one of: axis, lab, main, sub

Graphical parameters
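mai and omi (in inches), or mar and oma (in ’lines’), set the figure margins and the outer margins.
mgp (defaults to c(3,1,0)) sets the margin lines for the axis title, axis labels and axis line.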

Graphical parameters: example
> sample <- rt(100, df=2)
> boxplot( sample)
[Figure: boxplot of sample, vertical axis from −10 to 20, with several outliers]

Graphical parameters: example
> par(bty="l", col=5, col.main=2, cex=2)
> boxplot( sample, main="Boxplot")
[Figure: boxplot of sample with the title "Boxplot", drawn with the modified colors and magnification]

Graphical parameters: example
> par(bty="l", col.main=2, col.axis=4, cex=2, mai=c(.1,.7,.5,.1),
+ mgp=c(3,.8,0), adj=1, las=1, pch="-")
> boxplot( sample, main="Boxplot", col=5)
[Figure: boxplot titled "Boxplot" with horizontal axis labels (−10 to 20) and outliers drawn as "-"]

Graphical parameters: example
> par(bty="l", col.main=2, col.axis=4, cex=2, mai=c(.1,.7,.5,.1),
+ mgp=c(3,.8,0), adj=1, las=1, pch="-")
> boxplot( sample, col=5)
> title("Boxplot", adj=.5)
[Figure: the same boxplot, with the title "Boxplot" added by title() and centered]

Hands-on tasks 18
Create the following plot. The data is available at:
www.math.uzh.ch/furrer/software/workshop/wheat.csv
[Figure: "Durum": US production (mio bushel) and price (USD per bushel) on two vertical axes, for 2008/09 to 2011/12]

Hands-on tasks 19
Create the following plot.
[Figure: two panels of f(x) against x for x from −3 to 3. First panel: e^x and ln(x). Second panel: cosh(x) and arcosh(x).]
Part 4
Linear models
I A regression example
I Objects of class formula
I lm object
I Another regression example
I Other uses of formula objects

A regression example
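Suppose we have a response $y$ for a set of predictors $x_1, \dots, x_p$.
Assume a linear model
$$y_i = \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i, \qquad \varepsilon_i \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2), \quad i = 1, \dots, n,$$
in matrix notation $y = X\beta + \varepsilon$.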
Given response and predictors “solve” the regression problem:
I What are the estimates β̂?
I Which predictors are significant?
I Is the model adequate?
I . . .

A regression example
Artificial data, so we know the “truth”:
> n <- 10
> x <- runif( n, -1, 2)
> beta <- c( 1, 1)
> sigma <- .5
> y <- beta[1] + beta[2]*x + rnorm( n, sd=sigma)
> plot( x, y)
[Figure: scatterplot of y against x]

A regression example
A linear model is fitted with
> lm1 <- lm( y~x)
> summary( lm1)

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-1.1663 -0.3133  0.1224  0.3003  0.6425

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.9993     0.2257   4.426 0.002208 **
x             1.0673     0.2031   5.255 0.000769 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.577 on 8 degrees of freedom
Multiple R-squared:  0.7754, Adjusted R-squared:  0.7473
F-statistic: 27.62 on 1 and 8 DF,  p-value: 0.000769

A regression example
> coef( lm1)
(Intercept) x
0.9992531 1.0673167
> fitted( lm1)
1 2 3 4 5 6 7
0.7820819 1.1234585 1.7661843 2.8399724 0.5777119 2.8085353 2.9567395
8 9 10
2.0477779 1.9463282 0.1297729
> resid( lm1)
1 2 3 4 5
-0.39579007 0.23662769 0.32153818 0.17254164 -0.12536026
6 7 8 9 10
0.64252432 0.07220797 -0.37600485 -1.16633597 0.61805134
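The extractor functions can be related to the matrix notation y = Xβ + ε: the least squares estimate solves the normal equations (X'X) b = X'y, the fitted values are X b and the residuals are y minus the fitted values. A sketch with freshly simulated data (the seed is arbitrary, so the numbers differ from those on the slides); note that lm itself uses a QR decomposition rather than the normal equations:

```r
set.seed(1)                          # arbitrary seed; the slides do not state one
n <- 10
x <- runif(n, -1, 2)
y <- 1 + 1 * x + rnorm(n, sd = 0.5)
lm1 <- lm(y ~ x)

X <- model.matrix(lm1)               # design matrix, including the intercept column
beta.hat <- solve(crossprod(X), crossprod(X, y))   # solves (X'X) b = X'y
fit <- as.vector(X %*% beta.hat)     # fitted values X b

cbind(lm = coef(lm1), normal.eq = as.vector(beta.hat))
all.equal(unname(fitted(lm1)), fit)
all.equal(unname(resid(lm1)), y - fit)
```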

A regression example
> par( mfrow=c(2, 2))
> plot( lm1)
[Figure: the four diagnostic plots produced by plot( lm1): Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage (with Cook's distance contours). Observations 6, 9 and 10 are labelled in each panel.]
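The quantities behind these diagnostic plots can also be extracted directly. The sketch below again uses simulated stand-in data (an assumption, as x and y are not shown here).

```r
# Simulated stand-in data (an assumption)
set.seed(1)
x <- runif(10, -0.5, 1.5)
y <- 1 + x + rnorm(10, sd = 0.5)
lm1 <- lm(y ~ x)

h  <- hatvalues(lm1)        # leverages (x-axis of "Residuals vs Leverage")
rs <- rstandard(lm1)        # standardized residuals
cd <- cooks.distance(lm1)   # Cook's distances

sum(h)                      # equals the number of estimated coefficients
which.max(cd)               # observation with the largest Cook's distance
```

A single panel can be drawn with the which argument, e.g. plot( lm1, which=1) for "Residuals vs Fitted" only.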
![Page 110: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/110.jpg)
A regression example
110
> pre <- predict( lm1, newdata=data.frame(x=0))
> pre
1
0.9992531
> plot( x, y)
> points( 0, pre, col=2, cex=2)
[Figure: scatter plot of y against x, with the predicted value at x = 0 added as a large red circle.]
![Page 111: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/111.jpg)
A regression example
111
> new <- data.frame( x = seq(-2, 3, by=0.25))
> pred.w.plim <- predict( lm1, new, interval="prediction")
> pred.w.clim <- predict( lm1, new, interval="confidence")
> plot( x, y)
> points( 0, pre, col=2, cex=2)
> matlines( new$x, cbind(pred.w.clim, pred.w.plim[,-1]), lty=1)
[Figure: the same scatter plot of y against x with the predicted value at x = 0, now with the fitted line and the confidence and prediction bands added.]
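What predict() returns can be inspected without plotting. A minimal sketch with simulated stand-in data (an assumption): the prediction at x = 0 is the intercept, and the prediction band contains the confidence band.

```r
# Simulated stand-in data (an assumption)
set.seed(1)
x <- runif(10, -0.5, 1.5)
y <- 1 + x + rnorm(10, sd = 0.5)
lm1 <- lm(y ~ x)

new  <- data.frame(x = seq(-2, 3, by = 0.25))
plim <- predict(lm1, new, interval = "prediction")
clim <- predict(lm1, new, interval = "confidence")

colnames(plim)                       # columns: fit, lwr, upr
all(plim[, "upr"] > clim[, "upr"])   # prediction band is wider above ...
all(plim[, "lwr"] < clim[, "lwr"])   # ... and below

# predicting at x = 0 returns the estimated intercept
pre <- predict(lm1, newdata = data.frame(x = 0))
all.equal(unname(pre), unname(coef(lm1)[1]))
```

This explains the call cbind(pred.w.clim, pred.w.plim[,-1]) above: the first column (fit) is identical in both matrices and is therefore dropped from the second.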
![Page 112: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/112.jpg)
Objects of class formula
112
General structure: LHS ~ RHS
I ~ is used to define a model formula
I LHS is usually a single vector, the response
I RHS is of the form
op1 term1 op2 term2 ...
where each op is either + or - and each term is a formula expression consisting
of factors, vectors or matrices connected by formula operators.
I Examples of formula operators are in RI p52.
I I(object) is treated as is, which inhibits the interpretation of operators as
model operators.
I offset(object) adds a term with known coefficient (=1) to a linear model
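A few of these formula constructs in action. This is a minimal sketch; the data frame dat and its columns are made up for illustration.

```r
# Made-up data for illustration
set.seed(2)
dat   <- data.frame(x = runif(20), z = runif(20))
dat$y <- 1 + 2 * dat$x^2 + dat$z + rnorm(20, sd = 0.1)

f <- y ~ x + I(x^2)     # I() keeps ^ as arithmetic, not as a formula operator
class(f)                # "formula"
all.vars(f)             # variables appearing in the formula

# z enters the model with its coefficient fixed at 1
m1 <- lm(y ~ I(x^2) + offset(z), data = dat)
names(coef(m1))         # no coefficient is estimated for z

# "- 1" removes the intercept
m2 <- lm(y ~ x - 1, data = dat)
names(coef(m2))
```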
![Page 113: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/113.jpg)
lm object
113
Generic functions with methods for lm objects:
plot
summary
residuals (or resid)
coef
predict
add1
drop1
step
deviance
formula
anova
vcov
kappa
effects
There exist some more . . .
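The available methods can be listed from within R. The sketch below uses the built-in swiss data with an arbitrarily chosen model and checks two of the generics against quantities computed by hand.

```r
fit <- lm(Fertility ~ Education + Catholic, data = swiss)

methods(class = "lm")      # generics that have a method for class "lm"

formula(fit)               # the model formula
deviance(fit)              # residual sum of squares
sum(resid(fit)^2)          # the same, computed by hand

# standard errors in summary() are the square roots of diag(vcov())
se <- sqrt(diag(vcov(fit)))
all.equal(se, summary(fit)$coefficients[, "Std. Error"])
```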
![Page 114: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/114.jpg)
Another regression example
114
> pairs( swiss, panel = panel.smooth, main = "swiss data",
+ col = 3 + (swiss$Catholic > 50), gap=0)
[Figure: scatter plot matrix titled "swiss data" of the variables Fertility, Agriculture, Examination, Education, Catholic and Infant.Mortality, with a smooth curve in each panel; points are coloured according to whether Catholic exceeds 50.]
![Page 115: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/115.jpg)
Another regression example
115
> summary( lmswiss <- lm(Fertility ~ . , data = swiss))
Call:
lm(formula = Fertility ~ ., data = swiss)

Residuals:
     Min       1Q   Median       3Q      Max
-15.2743  -5.2617   0.5032   4.1198  15.3213

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)      66.91518   10.70604   6.250 1.91e-07 ***
Agriculture      -0.17211    0.07030  -2.448  0.01873 *
Examination      -0.25801    0.25388  -1.016  0.31546
Education        -0.87094    0.18303  -4.758 2.43e-05 ***
Catholic          0.10412    0.03526   2.953  0.00519 **
Infant.Mortality  1.07705    0.38172   2.822  0.00734 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.165 on 41 degrees of freedom
Multiple R-squared:  0.7067, Adjusted R-squared:  0.671
F-statistic: 19.76 on 5 and 41 DF,  p-value: 5.594e-10
![Page 116: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/116.jpg)
Another regression example
116
> drop1( lmswiss, test="F")
Single term deletions
Model:
Fertility ~ Agriculture + Examination + Education + Catholic +
Infant.Mortality
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 2105.0 190.69
Agriculture 1 307.72 2412.8 195.10 5.9934 0.018727 *
Examination 1 53.03 2158.1 189.86 1.0328 0.315462
Education 1 1162.56 3267.6 209.36 22.6432 2.431e-05 ***
Catholic 1 447.71 2552.8 197.75 8.7200 0.005190 **
Infant.Mortality 1 408.75 2513.8 197.03 7.9612 0.007336 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
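Each row of the drop1() table is an F-test comparing the full model with the model lacking that term. As a sketch, the Examination row can be reproduced by fitting the two nested models and comparing them with anova().

```r
full    <- lm(Fertility ~ ., data = swiss)
reduced <- lm(Fertility ~ . - Examination, data = swiss)

cmp <- anova(reduced, full)        # F-test for the nested models
d1  <- drop1(full, test = "F")

cmp$F[2]                           # F statistic from anova()
d1["Examination", "F value"]       # the same value from drop1()
```

With a single degree of freedom, this F value is also the square of the t value reported by summary().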
![Page 117: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/117.jpg)
Another regression example
117
> add1( lm( Fertility ~ 1, data=swiss), ~ Agriculture +
+ Examination + Education + Catholic + Infant.Mortality)
Single term additions
Model:
Fertility ~ 1
Df Sum of Sq RSS AIC
<none> 7178.0 238.34
Agriculture 1 894.8 6283.1 234.09
Examination 1 2994.4 4183.6 214.97
Education 1 3162.7 4015.2 213.04
Catholic 1 1543.3 5634.7 228.97
Infant.Mortality 1 1245.5 5932.4 231.39
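The function step() automates repeated calls of add1() and drop1(), using AIC as the criterion. A minimal sketch of forward selection from the intercept-only model follows; the choice of direction and scope is for illustration only.

```r
null <- lm(Fertility ~ 1, data = swiss)

sel <- step(null,
            scope = ~ Agriculture + Examination + Education +
                      Catholic + Infant.Mortality,
            direction = "forward",
            trace = 0)               # trace = 0 suppresses the step-by-step log

formula(sel)                         # the selected model
```

Education gives the lowest AIC in the add1() table above, so it is the first term to enter.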
![Page 118: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/118.jpg)
Other uses of formula objects
118
I Functions like plot or boxplot can be fed with a formula object.
I Generalized linear models, extensions of linear models:
glm( formula, family = gaussian, data, weights, subset, ...)
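Both uses can be tried with the swiss data. In this sketch the column majority is a made-up helper variable; the second part illustrates that glm() with the gaussian family reproduces the lm() coefficients.

```r
# boxplot with a formula: response ~ grouping variable
swiss2 <- transform(swiss, majority = Catholic > 50)   # hypothetical helper column
boxplot(Fertility ~ majority, data = swiss2)

# scatter plot with a formula
plot(Fertility ~ Education, data = swiss)

# a gaussian glm gives the same coefficients as lm
g <- glm(Fertility ~ Education, family = gaussian, data = swiss)
l <- lm(Fertility ~ Education, data = swiss)
all.equal(coef(g), coef(l))
```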
![Page 119: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/119.jpg)
119
Part 5
Programming tricks
![Page 120: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/120.jpg)
120
I Search path
I Scripting
I Functions
I Writing packages
I Customize the environment
I Writing documents
![Page 121: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/121.jpg)
Search path
121
R objects of a session are stored in environments.
The global environment is called the workspace.
> ls()
[1] "a" "beta" "brcol" "d"
[5] "facetcol" "grid" "l" "ll"
[9] "llmat" "lm1" "loglikelihood" "m"
[13] "m1" "m2" "m3" "mu"
[17] "mx" "myvar" "n" "ncol"
[21] "nrcyclones" "ns" "par" "s"
[25] "sample" "sigma" "v" "x"
[29] "y" "zfacet"
> rm( m1, m2, m3, loglikelihood, nrcyclones, facetcol, grid,
+ llmat, zfacet, ncol, mx, brcol, ll, myvar, lm1, sample)
> ls()
[1] "a" "beta" "d" "l" "m" "mu" "n" "ns"
[9] "par" "s" "sigma" "v" "x" "y"
![Page 122: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/122.jpg)
Search path
122
To list all attached environments (databases) on the search path:
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:utils" "package:datasets"
[7] "package:methods" "Autoloads" "package:base"
Variables are looked up in these databases, in the order listed, until a
match is found.
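One consequence of this lookup order is that an object in the workspace masks an object of the same name further down the search path. A small sketch, assuming it is run at top level in a fresh session:

```r
pi <- 3        # creates a variable named pi in the workspace
find("pi")     # all databases on the search path containing an object "pi"
pi             # the workspace copy is found first

rm(pi)         # remove the workspace copy
pi             # the value from package:base is found again
```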
![Page 123: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/123.jpg)
Search path: data frames
123
attach allows you to put the “columns” of the argument in your
“search path”, i.e., they are directly accessible.
> X1
Error in try(X1) : object 'X1' not found
> attach( d) # reverse is done with a detach(d)
> X1
[1] 1 2
> search()
[1] ".GlobalEnv" "d" "package:stats"
[4] "package:graphics" "package:grDevices" "package:utils"
[7] "package:datasets" "package:methods" "Autoloads"
[10] "package:base"
> detach( d)
> search()[1:3]
[1] ".GlobalEnv" "package:stats" "package:graphics"
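An alternative that leaves the search path unchanged is with(), which evaluates an expression using the columns of a data frame. In this sketch the data frame d is a made-up stand-in, as the original d is defined elsewhere in the slides.

```r
# hypothetical stand-in for the data frame d
d <- data.frame(X1 = 1:2, X2 = c("a", "b"))

with(d, mean(X1))     # X1 is found inside d, without attach()
d$X1                  # direct access to the column

"d" %in% search()     # d was never put on the search path
```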
![Page 124: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/124.jpg)
Hands-on tasks 20
124
1. What does the command rm( list=ls()) do?
2. Attach d, change an entry in X1, then attach d again.
What do you notice?
![Page 125: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/125.jpg)
Scripting
125
I Save R commands in a file.
File is executed with source( filename ),
where filename is a character string.
I Scripting is faster than line by line evaluation.
I Better programming practice compared to history re-evaluation!
I Make use of # to comment the code.
I Add plenty of spaces or newlines to structure the code.
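A self-contained sketch of the mechanism: the script is written to a temporary file (only so that the example does not depend on an existing file) and then executed with source().

```r
# write a tiny script to a temporary file
script <- tempfile(fileext = ".R")
writeLines(c(
  "# compute the mean of a small vector",
  "z    <- c(2, 4, 6)",
  "zbar <- mean(z)"
), script)

source(script)   # executes the commands in the file
zbar             # objects created by the script are now available
```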
![Page 126: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/126.jpg)
Scripting: flow control
126
I if-statements:
> if(condition) expr
> if(condition) cons.expr else alt.expr
I Control:
> stop('message')
> warning('message') # evaluation is continued
I Loops:
> for(var in seq) expr
> while(condition) expr
> repeat expr # needs a break
Most loops can be avoided by “vectorizing” the commands.
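The constructs above, each in a small example (the variable names are arbitrary):

```r
x <- 7
if (x %% 2 == 0) parity <- "even" else parity <- "odd"

total <- 0
for (i in 1:10) total <- total + i    # sum of 1, ..., 10

k <- 0
while (2^k < 100) k <- k + 1          # smallest k with 2^k >= 100

n <- 0
repeat {                              # loops until break is reached
  n <- n + 1
  if (n >= 3) break
}

sum(1:10)                             # vectorized version of the for loop
```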
![Page 127: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/127.jpg)
Scripting: flow control: vectorizing
127
Instead of:
> rns <- matrix(0, 90, 100)
> sol <- numeric( 90)
> for ( i in 1:90) {
+ rns[i,] <- rnorm(100)
+ sol[i] <- mean( rns[i,])
+ }
> rns
Use:
> rns <- array( rnorm( 90*100), c(90,100))
> sol <- apply( rns, 1, mean)
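Note that apply() still loops over the rows internally. For row means a dedicated function exists; the following sketch compares both approaches.

```r
set.seed(3)
rns <- array(rnorm(90 * 100), c(90, 100))

sol1 <- apply(rns, 1, mean)   # one call of mean() per row
sol2 <- rowMeans(rns)         # dedicated function for row means

all.equal(sol1, sol2)
```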
![Page 128: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/128.jpg)
Hands-on tasks 21
128
1. Convince yourself that if ( cond ) expr and if(cond)expr
are equivalent (note the spaces).
2. Create a script executing a few commands and evaluate the script.
E.g. drawing 1000 random numbers from a gamma distribution,
plotting the histogram and indicating the mean and median with
vertical lines.
3. Implement a statement causing an error in the last call. What do
you notice?
![Page 129: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/129.jpg)
Functions
129
I A function is defined by an assignment of the form
> functionname <- function(arg_1, arg_2, ...) expression
expression is usually a series of R expressions (evaluations) grouped
by { and }.
I The last (evaluated) expression is returned.
I It is recommended to end the function with an explicit return() or invisible().
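A sketch of both ways of returning a value; the function names center and report are made up for illustration.

```r
# returns its result visibly
center <- function(v, scale = FALSE) {
  out <- v - mean(v)
  if (scale) out <- out / sd(v)
  return(out)
}

# prints a message and returns its result invisibly
report <- function(v) {
  cat("mean:", mean(v), "\n")
  invisible(mean(v))     # not printed automatically, but can be assigned
}

center(c(1, 2, 3))
m <- report(c(1, 2, 3))
m
```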
![Page 130: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/130.jpg)
Functions
130
Example:
two functions that transform between Cartesian (x, y) and polar coordinates
(θ, ρ):
> cart2polar <- function(x) {
+ return( cbind( atan2(x[,2], x[,1]), sqrt( x[,1]^2 + x[,2]^2)))
+ }
> polar2cart <- function(x) {
+ return( cbind( x[,2]*cos(x[,1]), x[,2]*sin(x[,1])) )
+ }
> n <- 1500
> po <- cbind( runif(n, 0, 2*pi), runif( n, 0, 1))
![Page 131: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/131.jpg)
Functions
131
> par( pty="s")
> plot( polar2cart( po))
[Figure: scatter plot of the 1500 transformed points inside the unit disc; x-axis polar2cart(po)[,1], y-axis polar2cart(po)[,2], both ranging from −1 to 1.]
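A quick consistency check of the two functions is a round trip starting from Cartesian coordinates. The sketch repeats the definitions so that it runs on its own; the test points xy are made up.

```r
cart2polar <- function(x) {
  return(cbind(atan2(x[, 2], x[, 1]), sqrt(x[, 1]^2 + x[, 2]^2)))
}
polar2cart <- function(x) {
  return(cbind(x[, 2] * cos(x[, 1]), x[, 2] * sin(x[, 1])))
}

set.seed(4)
xy <- cbind(runif(5, -1, 1), runif(5, -1, 1))   # made-up test points

# Cartesian -> polar -> Cartesian recovers the input
all.equal(polar2cart(cart2polar(xy)), xy)

# atan2() returns angles between -pi and pi
range(cart2polar(xy)[, 1])
```

The reverse round trip, starting from po, recovers the angles only up to multiples of 2π, because po contains angles between 0 and 2π.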
![Page 132: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/132.jpg)
Functions
132
Some input checking might be useful:
> cart2polar <- function(x) {
+ if ((length(dim(x))!=2) || (dim(x)[2]!=2))
+ stop("Need a nx2 matrix/array")
+ return( cbind(atan2(x[,2],x[,1]), sqrt( x[,1]^2+x[,2]^2)))
+ }
> cart2polar(rep(1,2))
Error in cart2polar(rep(1, 2)) : Need a nx2 matrix/array
> cart2polar(cbind(1,2))
[,1] [,2]
[1,] 1.107149 2.236068
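In a script, an error raised by stop() halts the evaluation. If the caller should continue instead, the error can be caught with tryCatch(). A minimal sketch, repeating the definition from above:

```r
cart2polar <- function(x) {
  if ((length(dim(x)) != 2) || (dim(x)[2] != 2))
    stop("Need a nx2 matrix/array")
  return(cbind(atan2(x[, 2], x[, 1]), sqrt(x[, 1]^2 + x[, 2]^2)))
}

# catch the error and keep only its message
msg <- tryCatch(cart2polar(rep(1, 2)),
                error = function(e) conditionMessage(e))
msg

ok <- cart2polar(cbind(1, 2))   # valid input passes the check
dim(ok)
```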
![Page 133: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/133.jpg)
Hands-on tasks 22
133
1. Extend the function cart2polar such that an optional argument
allows scaling of the coordinates.
2. Extend the function polar2cart such that degrees as input are
possible.
![Page 134: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/134.jpg)
Packages
134
I All R functions and datasets are stored in packages.
I Only when a package is loaded are its contents available.
This is done both for efficiency and to aid package developers,
who are protected from name clashes with other code.
I Packages come along with help files for each function and dataset!
I A few packages are standard and loaded by default:
stats, graphics, grDevices, utils, datasets, methods, base.
I More than 3800 packages are publicly available on CRAN (at the time of this
presentation), and the number grows daily.
![Page 135: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/135.jpg)
Packages
135
I To see which packages are installed at your site, issue
> library()
I To see which packages are currently loaded, use
> search()
I To load a package, use
> library( abind)
I To detach a loaded package from the search path, use
> detach( package:abind)
I A basic description of the package is often given by
> help( "package.name")
RStudio
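The load and detach cycle can be followed on the search path. This sketch uses the package splines instead of abind, on the assumption that splines ships with every R installation while abind may not be installed.

```r
library(splines)                      # load and attach the package
"package:splines" %in% search()       # now on the search path

detach("package:splines")             # take it off the search path again
"package:splines" %in% search()

# check whether a package is installed without attaching it
requireNamespace("splines", quietly = TRUE)
```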
![Page 136: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/136.jpg)
Packages: namespaces
136
Packages have a NAMESPACE
:: accessing public (exported) objects
::: accessing private (non-exported) objects
Works for not-loaded packages as well!
> exists( "diag.spam")
[1] FALSE
> spam::diag.spam( 1)
[,1]
[1,] 1
Class 'spam'
> spam::.spam.addsparsefull
Error : '.spam.addsparsefull' is not an exported object from 'namespace:spam'
> # The following would work:
> # spam:::.spam.addsparsefull
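The same operators work with packages that ship with R. The sketch below uses tools and stats in place of spam, which may not be installed; plot.lm serves as an example of a function that is registered as a method but, to the best of our knowledge, not exported.

```r
# :: reaches an exported function of a package that is not attached
"package:tools" %in% search()
tools::file_ext("analysis.R")

# plot.lm is the lm method behind plot( lm1); check whether it is exported
"plot.lm" %in% getNamespaceExports("stats")

# ::: reaches the function regardless of whether it is exported
f <- stats:::plot.lm
is.function(f)
```

Code relying on ::: may break when a package changes its internals, so it is best reserved for exploration.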
![Page 137: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/137.jpg)
Packages: writing packages
137
I Disseminate R code (globally or locally)
I Thorough code and documentation checking
Documentation:
cran.r-project.org/doc/manuals/R-exts.html
cran.r-project.org/doc/contrib/Leisch-CreatingPackages.pdf
![Page 138: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/138.jpg)
Customize the environment
138
Within RStudio, set preferences (→ Tools → Options)
![Page 139: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/139.jpg)
Customize the environment
139
Global and local initialization files (Section 10.8 in RI).
I global: file taken from the R_PROFILE environment variable
I local: .Rprofile in the current working directory or, failing that, in the home directory
Launching R executes (“sources”)
1. site profile
2. user profile (local or home)
3. .RData
4. .First()
![Page 140: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/140.jpg)
Customize the environment
140
Example:
> .First <- function() {
+ library( spam)
+ source( "/home/furrer/R/usefulfcn.R")
+ options( width=120)
+ }
Similarly, before closing R, .Last() is executed:
> .Last <- function() {
+ cat( "Thanks for using R - good night or enjoy your coffee\n")
+ }
![Page 141: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/141.jpg)
Customize the environment: ESS
141
ESS: Emacs Speaks Statistics
An Emacs environment for R (and other statistics software)
![Page 142: Introduction to R · 2016. 2. 29. · Reinhard Furrer, UZH I-Math, 12. 2. 2014 NZZ.ch Introduction to R. Contents 2 I Basics I Data handling and storing I Plotting I Linear models](https://reader033.fdocuments.us/reader033/viewer/2022061005/60b313fa29fe4851fd15180f/html5/thumbnails/142.jpg)
Writing documents
142
Sweave() merges LaTeX with R code and R code output
within one document.
Structure of a LaTeX file with embedded R code:
<<tag, eval=TRUE, echo=TRUE, fig=TRUE>>=
plot( x, y, xlab='Diameter', ylab='Height')
@
This prints and evaluates the code and includes the figure.
Documentation:
stat.ethz.ch/R-manual/R-devel/library/utils/doc/Sweave.pdf
This presentation has been prepared with Sweave and the LaTeX package pfuef.
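The workflow can be tried from within R. In this sketch the file name demo.Rnw and its content are made up; the file is written to a temporary directory and processed with Sweave(), which produces a LaTeX file to be compiled as usual.

```r
# write a minimal Sweave source file (made-up content)
rnw <- file.path(tempdir(), "demo.Rnw")
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean is computed as follows.",
  "<<tag, eval=TRUE, echo=TRUE>>=",
  "x <- 1:10",
  "mean(x)",
  "@",
  "\\end{document}"
), rnw)

old <- setwd(tempdir())
Sweave("demo.Rnw")          # writes demo.tex into the working directory
setwd(old)

file.exists(file.path(tempdir(), "demo.tex"))
```

The resulting demo.tex contains the R code and its output; compiling it requires the Sweave style file that comes with R.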