
Statistics Workshop

R graphics handout

Statistical Consulting Service

The Department of Statistics

The University of Auckland

June 30, 2011

List of Figures

1 Boxplot with 5-number-summary
2 Boxplot showing right-skewed distribution
3 Boxplot showing left-skewed distribution
4 Univariate Normal QQ plot
5 Histogram showing normal distribution
6 Histogram and boxplot showing normal distribution
7 Histogram showing left skewed distribution
8 Histogram and boxplot showing left skewed distribution
9 Histogram showing uniform distribution
10 Histogram and boxplot showing uniform distribution
11 Piechart Example
12 Barplot Example
13 Scatterplot Example
14 Scatterplot pointing out the negative x values
15 Scatterplot with outlier
16 Scatterplot with mistake?
17 Scatterplot with NOBEL PRIZE!!
18 Scatterplot without the outlier and the negative weights
19 Scatterplot with the fitted line
20 Two way frequency table of counts on a barplot
21 Two way frequency table of percentage on a barplot
22 Boxplots with homogeneous variances
23 Boxplots with heterogeneous variances
24 Q-Q Plot
25 Three way frequency on a barplot
26 Polynomial relationship between two continuous variables
27 Trellis graph with three continuous variables
28 Trellis graph with two qualitative variables and one quantitative variable
29 Iris scatterplot of Petal Length against Petal Width
30 Iris scatterplot with different species
31 Iris trellis scatterplot graphs
32 Iris pairs plot
33 Iris trellis graphs with more than three variables
34 Iris trellis graphs with measurements from four different days
35 PCA on iris data with the first two components


plot.new()

plot.window(xlim = c(1, 2), ylim = c(0, 55))

boxplot(1:45, add = T, at = 1.3, border = "red")

lines(x = c(0, 1.2), y = rep(1, 2), lty = 3, col = "grey")

lines(x = c(0, 1.1), y = rep(12, 2), lty = 3, col = "grey")

lines(x = c(0, 1.1), y = rep(23, 2), lty = 3, col = "grey")

lines(x = c(0, 1.1), y = rep(34, 2), lty = 3, col = "grey")

lines(x = c(0, 1.2), y = rep(45, 2), lty = 3, col = "grey")

text(1.9, 1, "Minimum", cex = 1.5)

text(1.9, 12, "1st Quartile", cex = 1.5)

text(1.9, 23, "Median", cex = 1.5)

text(1.9, 34, "3rd Quartile", cex = 1.5)

text(1.9, 45, "Maximum", cex = 1.5)

text(1.9, 50, "Outlier?", cex = 1.5)

arrows(rep(1.51, 6), c(1, 12, 23, 34, 45, 50), rep(1.7, 6), c(1,

12, 23, 34, 45, 50), length = 0.1, code = 3)

points(1.3, 50, col = "red")

[Figure: boxplot of 1:45 on a 0 to 50 axis; arrows label the minimum, first quartile, median, third quartile and maximum, plus a separate point at 50 labelled "Outlier?"]

Figure 1: Boxplot with 5-number-summary
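The five labelled positions in Figure 1 can be cross-checked numerically. The sketch below is an addition, not part of the original handout. It assumes the same data (1:45) and uses base R's fivenum(), which returns the hinges that boxplot() draws. The names attached to the result and the variables box_length and upper_fence are illustrative.

```r
# Five-number summary of the data drawn in Figure 1 (assumed to be 1:45)
x <- 1:45
five <- fivenum(x)
names(five) <- c("Minimum", "1st Quartile", "Median", "3rd Quartile", "Maximum")
print(five)

# By default boxplot() flags points lying more than 1.5 box lengths beyond the box
box_length <- five[["3rd Quartile"]] - five[["1st Quartile"]]
upper_fence <- five[["3rd Quartile"]] + 1.5 * box_length
print(upper_fence)
```

For these data the summary is 1, 12, 23, 34, 45, which matches the dotted guide lines in the plotting code, and the upper fence is 67. A value of 50 would therefore not be flagged automatically, which fits the way the figure adds that point by hand with points() and labels it with a question mark.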


boxplot(exp(seq(1, 5, length.out = 500)))

[Figure: boxplot; y-axis 0 to 150, with a run of points beyond the upper whisker]

Figure 2: Boxplot showing right-skewed distribution

boxplot(log(seq(1, 20, length.out = 500)))

[Figure: boxplot; y-axis 0.0 to 3.0, with a run of points beyond the lower whisker]

Figure 3: Boxplot showing left-skewed distribution
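The skew direction in Figures 2 and 3 can also be checked without a plot. This is a minimal sketch, not from the handout, reusing the two sequences from the boxplot() calls. The names right_skewed and left_skewed are illustrative, and comparing mean with median is a rule of thumb rather than a formal test.

```r
right_skewed <- exp(seq(1, 5, length.out = 500))
left_skewed  <- log(seq(1, 20, length.out = 500))

# A long right tail pulls the mean above the median; a long left tail pulls it below
print(c(mean = mean(right_skewed), median = median(right_skewed)))
print(c(mean = mean(left_skewed),  median = median(left_skewed)))
```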


x <- exp(seq(1, 3, length.out = 100)) + rnorm(100)

qqnorm(x, main = "")

qqline(x, col = "red", lwd = 2)

[Figure: normal Q-Q plot with red reference line; x-axis Theoretical Quantiles, -2 to 2; y-axis Sample Quantiles, 5 to 20]

Figure 4: Univariate Normal QQ plot

hist1 <- rnorm(10000)

hist(hist1, breaks = 40, main = "Histogram showing Normal Distribution",
     xlab = "", ylim = c(0, 1100))

box()

[Figure: histogram titled "Histogram showing Normal Distribution"; x-axis -4 to 4; y-axis Frequency, 0 to 1000]

Figure 5: Histogram showing normal distribution

hist(hist1, breaks = 40, main = "Histogram and boxplot showing Normal Distribution",
     xlab = "", ylim = c(0, 1100))

boxplot(hist1, horizontal = T, at = 1000, add = T, boxwex = 250)

[Figure: histogram titled "Histogram and boxplot showing Normal Distribution" with a horizontal boxplot overlaid near the top; x-axis -4 to 4; y-axis Frequency, 0 to 1000]

Figure 6: Histogram and boxplot showing normal distribution

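# log() of a negative draw is NaN, so keep only the positive draws before taking logs
z <- rnorm(10000)

hist2 <- log(z[z > 0]) + 10

hist(hist2, breaks = 60, main = "Histogram showing left skewed distribution",
     xlab = "", ylim = c(0, 600))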

box()

[Figure: histogram titled "Histogram showing left skewed distribution"; x-axis 2 to 10; y-axis Frequency, 0 to 600]

Figure 7: Histogram showing left skewed distribution

hist(hist2, breaks = 60, main = "Histogram and boxplot showing left skewed distribution",
     xlab = "", ylim = c(0, 600))

boxplot(hist2, horizontal = T, at = 550, add = T, boxwex = 120)

[Figure: histogram titled "Histogram and boxplot showing left skewed distribution" with a horizontal boxplot overlaid near the top; x-axis 2 to 10; y-axis Frequency, 0 to 600]

Figure 8: Histogram and boxplot showing left skewed distribution

hist3 <- runif(1000)

hist(hist3, breaks = 30, main = "Histogram showing uniform distribution",
     xlab = "", ylim = c(0, 80))

box()

[Figure: histogram titled "Histogram showing uniform distribution"; x-axis 0.0 to 1.0; y-axis Frequency, 0 to 80]

Figure 9: Histogram showing uniform distribution

hist(hist3, breaks = 30, main = "Histogram and boxplot showing uniform distribution",
     xlab = "", ylim = c(0, 80))

boxplot(hist3, horizontal = T, at = 75, add = T, boxwex = 10)

[Figure: histogram titled "Histogram and boxplot showing uniform distribution" with a horizontal boxplot overlaid near the top; x-axis 0.0 to 1.0; y-axis Frequency, 0 to 80]

Figure 10: Histogram and boxplot showing uniform distribution

count <- c(250,34,101,22)

order.count <- order(count, decreasing = T)

phylum <- c("Molluscs", "Annelids", "Arthropods", "Echinoderms")

pie(count[order.count], labels = phylum[order.count], col = 2:5)

[Figure: pie chart with slices labelled Molluscs, Arthropods, Annelids, Echinoderms]

Figure 11: Piechart Example

barplot(count[order.count]/sum(count) * 100, names.arg =

phylum[order.count], ylab = "Percentage", xlab = "Phylum")

[Figure: bar plot; x-axis Phylum (Molluscs, Arthropods, Annelids, Echinoderms); y-axis Percentage, 0 to 60]

Figure 12: Barplot Example

set.seed(07012011)

x <- c(rnorm(99, 0, 10) + 1:99, 2)

y <- c(rnorm(99, 0, 10) + 150:248, 260)

plot(x, y, xlab = "Weight (g)", ylab = "Food Consumption (mg/day)")

[Figure: scatterplot; x-axis Weight (g), -20 to 100; y-axis Food Consumption (mg/day), 140 to 260]

Figure 13: Scatterplot Example

plot(x, y, xlab = "Weight (g)", ylab = "Food Consumption (mg/day)")

points(x[which(x<0)], y[which(x<0)], pch = 20, col = "black", bg = "black")

abline(v = 0, lty = 2)

[Figure: scatterplot with the negative-weight points filled in black and a dashed vertical line at x = 0; x-axis Weight (g), y-axis Food Consumption (mg/day)]

Figure 14: Scatterplot pointing out the negative x values

plot(x, y, xlab = "Weight (g)", ylab = "Food Consumption (mg/day)")

points(2, 260, pch = 20, col = "black", bg = "black")

[Figure: scatterplot with the point at (2, 260) filled in black; x-axis Weight (g), y-axis Food Consumption (mg/day)]

Figure 15: Scatterplot with outlier

plot(x, y, xlab = "Weight (g)", ylab = "Food Consumption (mg/day)")

points(2, 260, pch = 20, col = "black", bg = "black")

text(20, 261, labels = "Mistake??", cex = 1.5)

[Figure: scatterplot with the point at (2, 260) filled in black and labelled "Mistake?"; x-axis Weight (g), y-axis Food Consumption (mg/day)]

Figure 16: Scatterplot with mistake?

plot(x, y, xlab = "Weight (g)", ylab = "Food Consumption (mg/day)")

points(2, 260, pch = 20, col = "black", bg = "black")

text(28, 261, labels = "NOBEL PRIZE!!", cex = 1.5)

[Figure: scatterplot with the point at (2, 260) filled in black and labelled "NOBEL PRIZE!!"; x-axis Weight (g), y-axis Food Consumption (mg/day)]

Figure 17: Scatterplot with NOBEL PRIZE!!

plot(x, y, xlab = "Weight (g)", ylab = "Food Consumption (mg/day)", type = "n")

points(x[-c(100, which(x<0))],y[-c(100,which(x<0))])

[Figure: scatterplot with the outlier and the negative-weight points omitted; x-axis Weight (g), -20 to 100; y-axis Food Consumption (mg/day), 140 to 260]

Figure 18: Scatterplot without the outlier and the negative weights

plot(x, y, xlab = "Weight (g)", ylab = "Food Consumption (mg/day)", type = "n")

points(x[-c(100, which(x<0))],y[-c(100,which(x<0))])

abline(coef(lm(y[-c(100, which(x<0))]~x[-c(100, which(x<0))])),

col = "red", lwd = 2)

text(10, 260, labels =

paste("y = ", round(coef(lm(y[-c(100, which(x<0))]~x[-c(100,

which(x<0))]))[2], 2), "x + ",

round(coef(lm(y[-c(100, which(x<0))]~x[-c(100, which(x<0))]))[1], 2),

sep = ""), cex = 1.5)

[Figure: scatterplot with a red fitted line annotated "y = 0.91x + 153.66"; x-axis Weight (g), y-axis Food Consumption (mg/day)]

Figure 19: Scatterplot with the fitted line
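The code for Figure 19 refits the same regression three times and repeats the index expression in each call. A possible tidier variant is sketched below; it is an addition, not the handout's own code. The names keep, clean, fit, b and eq are illustrative, and the model is the same simple linear regression of y on x after dropping observation 100 and any negative weights.

```r
# Recreate the data from the scatterplot example
set.seed(07012011)
x <- c(rnorm(99, 0, 10) + 1:99, 2)
y <- c(rnorm(99, 0, 10) + 150:248, 260)

# Drop observation 100 (the outlier) and any negative weights, once
keep <- setdiff(seq_along(x), c(100, which(x < 0)))
clean <- data.frame(x = x[keep], y = y[keep])

# Fit the line a single time and reuse the model object
fit <- lm(y ~ x, data = clean)
b <- round(coef(fit), 2)
eq <- paste0("y = ", b[["x"]], "x + ", b[["(Intercept)"]])

plot(x, y, xlab = "Weight (g)", ylab = "Food Consumption (mg/day)", type = "n")
points(clean$x, clean$y)
abline(fit, col = "red", lwd = 2)
text(10, 260, labels = eq, cex = 1.5)
```

Storing the fitted model means the plotted line and the printed equation are guaranteed to come from the same fit, and setdiff() keeps working even when no weights are negative.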


count <- matrix(c(130,29,82,16,120,5,19,6), nrow = 2, ncol = 4, byrow = T)

colnames(count) <- c("Molluscs", "Annelids", "Arthropods", "Echinoderms")

rownames(count) <- c("Hahei","Leigh")

order.count = order(colSums(count), decreasing = T)

barplot(count[, order.count], beside = T, legend.text = T,

ylab = "Frequency (Absolute Abundance)")

[Figure: grouped bar plot for Molluscs, Arthropods, Annelids, Echinoderms with legend Hahei, Leigh; y-axis Frequency (Absolute Abundance), 0 to 120]

Figure 20: Two way frequency table of counts on a barplot

proportion <- round(count / rowSums(count), 3)

barplot(proportion[,order.count], beside = T, legend.text = T,

ylab = "Percentage (Relative Abundance)")

[Figure: grouped bar plot for Molluscs, Arthropods, Annelids, Echinoderms with legend Hahei, Leigh; y-axis Percentage (Relative Abundance), 0.0 to 0.8]

Figure 21: Two way frequency table of percentage on a barplot
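The bars in Figure 21 are on a 0 to 1 scale, that is, proportions within each site, even though the axis is labelled as a percentage. If true percentages are wanted, one option is prop.table(), sketched below with the same counts. This is an addition rather than the handout's code, and the object name percentage is illustrative.

```r
count <- matrix(c(130, 29, 82, 16, 120, 5, 19, 6), nrow = 2, byrow = TRUE,
                dimnames = list(c("Hahei", "Leigh"),
                                c("Molluscs", "Annelids", "Arthropods", "Echinoderms")))

# margin = 1 divides each row (site) by its own total
percentage <- round(prop.table(count, margin = 1) * 100, 1)
order.count <- order(colSums(count), decreasing = TRUE)

barplot(percentage[, order.count], beside = TRUE, legend.text = TRUE,
        ylab = "Percentage (Relative Abundance)")
```

Naming the margin explicitly avoids relying on vector recycling in count / rowSums(count), which happens to work here only because the divisor has one entry per row.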


group1 <- rgamma(1000, 2, 5)

group2 <- rgamma(1000, 3, 6)

groupname <- rep(c("Group One","Group Two"),c(1000,1000))

boxplot(c(group1,group2)~groupname)

[Figure: side-by-side boxplots for Group One and Group Two; y-axis 0.0 to 2.0]

Figure 22: Boxplots with homogeneous variances

group1 <- rgamma(1000, 2, 5)

group2 <- seq(1, 10, length.out = 1000)

groupname <- rep(c("Group One","Group Two"),c(1000,1000))

boxplot(c(group1,group2)~groupname)

[Figure: side-by-side boxplots for Group One and Group Two; y-axis 0 to 10]

Figure 23: Boxplots with heterogeneous variances
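The spread shown by the boxplots in Figures 22 and 23 can be summarised with per-group sample variances. This is a minimal sketch that assumes the same generating code as the two figures; it is not from the handout. The seed and the names homog and heterog are illustrative additions.

```r
set.seed(1)  # illustrative seed, not from the handout
groupname <- rep(c("Group One", "Group Two"), c(1000, 1000))

# Figure 22: two gamma samples with similar theoretical variances (2/25 and 3/36)
homog <- tapply(c(rgamma(1000, 2, 5), rgamma(1000, 3, 6)), groupname, var)

# Figure 23: a gamma sample against an evenly spaced sequence from 1 to 10
heterog <- tapply(c(rgamma(1000, 2, 5), seq(1, 10, length.out = 1000)), groupname, var)

print(round(homog, 3))
print(round(heterog, 3))

# Ratio of largest to smallest variance: values near 1 suggest similar spread
print(max(homog) / min(homog))
print(max(heterog) / min(heterog))
```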


x1 <- rnorm(100, 3, 1)

y1 <- rnorm(100, 3.5, 1)

qqplot(x1, y1, asp = 1, xlim = c(0, 7), ylim = c(0, 7), xlab = "Group One",

ylab = "Group Two")

abline(0, 1, col = "red", lwd = 2)

[Figure: Q-Q plot with red y = x reference line; x-axis Group One, 0 to 7; y-axis Group Two, 0 to 7]

Figure 24: Q-Q Plot

count <- matrix(c(39,9,25,6,91,20,57,10,32,1,6,2,88,4,13,4), nrow = 4, ncol = 4)

order.count <- order(rowSums(count), decreasing = T)

par(las = 3, cex.axis = 0.8 )

plot.new()

plot.window(xlim = c(0,20), ylim = c(0, max(count) + 2))

rect(seq(0, 7.5, by= 2.5),0,seq(0, 7.5, by=2.5)+1,count[order.count,1],

col = "grey")

rect(seq(0, 7.5, by= 2.5)+1,0,seq(0, 7.5, by=2.5)+2,count[order.count,2],

col = "green")

rect(seq(0, 7.5, by= 2.5)+11,0,seq(0, 7.5, by=2.5)+12,count[order.count,3],

col = "grey")

rect(seq(0, 7.5, by= 2.5)+12,0,seq(0, 7.5, by=2.5)+13,count[order.count,4],

col = "green")

text(c(5, 15), max(count)+1, c("Hahei", "Leigh"), cex = 2)

axis(1, at = c(seq(1,8.5, 2.5), seq(12, 19.5, by = 2.5)),

labels = rep(c("Molluscs", "Arthropods", "Annelids","Echinoderms"),2), cex.lab = 0.5)

axis(2)

abline(v = 10, lty = 2)

box()

legend(15, 80, fill = c("grey", "green"), legend = c("Winter", "Summer"), cex = 1.2)

title(ylab = "Frequency (Absolute Abundance)")

[Figure: two side-by-side bar panels, Hahei and Leigh, each showing Molluscs, Arthropods, Annelids, Echinoderms; legend Winter (grey), Summer (green); y-axis Frequency (Absolute Abundance), 0 to 80]

Figure 25: Three way frequency on a barplot

x1 <- seq(-2, 4, length.out = 1200)

x2 <- -2*x1^2 + 5*x1 + 2 + rnorm(1200, 0, 2)

x2 = x2 - min(x2)

plot(x1,x2, xlab = "-log (Concentration)", ylab = "Outcome")

[Figure: scatterplot; x-axis -log (Concentration), -2 to 4; y-axis Outcome, 0 to 30]

Figure 26: Polynomial relationship between two continuous variables

• You may need to install the lattice package in R

• In R, click Packages ⇒ Install Package(s) ⇒ New Zealand ⇒ lattice
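The menu route depends on the R GUI being used. The same step can be done from the console, as sketched below; this is an addition, not from the handout. lattice is a recommended package that ships with standard R installations, so the install branch is usually skipped.

```r
# Install lattice only if it is not already available
if (!requireNamespace("lattice", quietly = TRUE)) {
  install.packages("lattice")  # may prompt for a CRAN mirror
}
```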

library(lattice)

Oxygen <- equal.count(runif(1200), number = 12, overlap = 0)

xyplot(x2~x1|Oxygen, xlab = "-log (Concentration)", ylab = "Outcome")

[Figure: trellis of 12 scatterplot panels, each with the strip label Oxygen; x-axis -log (Concentration), -2 to 4; y-axis Outcome, 0 to 30]

Figure 27: Trellis graph with three continuous variables

IQ <- c(rnorm(500, 10, 2), rnorm(500, 12, 2), rnorm(500, 12, 2), rnorm(500, 10, 2))

Group1 <- rep(c("Male", "Female"), rep(1000, 2))

Group2 <- rep(rep(c("Biology", "Statistics"), rep(500, 2)), 2)

bwplot(IQ~Group1|Group2, ylab = "IQ (in 10s)")

[Figure: box-and-whisker panels Biology and Statistics, each comparing Female and Male; y-axis IQ (in 10s), 5 to 15]

Figure 28: Trellis graph with two qualitative variables and one quantitative variable

with(iris, plot(Petal.Length, Petal.Width))

[Figure: scatterplot; x-axis Petal.Length, 1 to 7; y-axis Petal.Width, 0.5 to 2.5]

Figure 29: Iris scatterplot of Petal Length against Petal Width

with(iris, plot(Petal.Length, Petal.Width, type = "n"))

with(iris, for (i in 1:length(unique(Species))){

points(Petal.Length[Species == unique(Species)[i]],

Petal.Width[Species == unique(Species)[i]],

pch = 20+i, col = i, bg = i)

}

)

[Figure: scatterplot with a different symbol and colour per species and a legend listing setosa, versicolor, virginica; x-axis Petal.Length, 1 to 7; y-axis Petal.Width, 0.5 to 2.5]

Figure 30: Iris scatterplot with different species
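The per-species loop used for Figure 30 can also be written without an explicit loop, because pch, col and bg accept one value per point. The sketch below is an addition under that assumption. The name sp is illustrative, and the legend() call and its "bottomright" position are guesses, since the original code for this figure does not include a legend call.

```r
# Integer codes 1, 2, 3 for setosa, versicolor, virginica
sp <- as.integer(iris$Species)

with(iris, plot(Petal.Length, Petal.Width, pch = 20 + sp, col = sp, bg = sp))
legend("bottomright", legend = levels(iris$Species),
       pch = 21:23, col = 1:3, pt.bg = 1:3)
```

This gives the same symbol and colour assignment as the loop because the species appear in iris in the same order as their factor levels.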


key.variety <- list(columns = 1, space = "right", text = list(levels(iris$Species)),

points = list(pch = 21:23, fill = 1:3, col = 1:3))

with(iris, xyplot(Petal.Width~Petal.Length|Species,layout = c(1,3),

groups = Species, pch = 21:23,

fill =1:3, col = 1:3, key = key.variety))

[Figure: three stacked scatterplot panels (setosa, versicolor, virginica) with a key on the right; x-axis Petal.Length, 1 to 7; y-axis Petal.Width, 0.0 to 2.5]

Figure 31: Iris trellis scatterplot graphs

pairs(iris)

[Figure: scatterplot matrix with diagonal panels Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species]

Figure 32: Iris pairs plot

xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species,

data = iris, scales = "free", layout = c(2, 2),

auto.key = list(x = .6, y = .7, corner = c(0, 0)))

[Figure: trellis panels setosa, versicolor, virginica with free scales; x-axis Petal.Length + Petal.Width; y-axis Sepal.Length + Sepal.Width; key: Sepal.Length * Petal.Length, Sepal.Length * Petal.Width, Sepal.Width * Petal.Length, Sepal.Width * Petal.Width]

Figure 33: Iris trellis graphs with more than three variables

key.day <- list(columns = 3, space = "top", text = list(levels(iris$Species)),

points = list(pch = 21:23, fill = 1:3, col = 1:3))

day <- rep(rep(c("Day 1", "Day 2", "Day 3", "Day 4"), c(12, 12, 12, 14)), 3)

with(iris, xyplot(Petal.Width~Petal.Length|day, groups = Species,

pch = 21:23, fill = 1:3, col = 1:3, key = key.day))

[Figure: four scatterplot panels, Day 1 to Day 4, with a key above listing setosa, versicolor, virginica; x-axis Petal.Length, 1 to 7; y-axis Petal.Width, 0.0 to 2.5]

Figure 34: Iris trellis graphs with measurements from four different days

pca <- princomp(iris[, 1:4])

plot.pch <- with(iris, ifelse(Species == levels(Species)[1], 21,

ifelse(Species == levels(Species)[2], 22, 23)))

plot.col <- with(iris, ifelse(Species == levels(Species)[1], 1,

ifelse(Species == levels(Species)[2], 2, 3)))

plot(pca$scores[, 1:2], type = "n", asp = 1,

xlab = paste("Component One (",

round(pca$sdev[1]^2/sum(pca$sdev^2), 3)*100, "%)",sep = ""),

ylab = paste("Component Two (",

round(pca$sdev[2]^2/sum(pca$sdev^2), 3)*100, "%)",sep = ""),

main = paste(round(sum(pca$sdev[1:2]^2)/sum(pca$sdev^2), 3)*100,

"% of the total variance is explained",

"\n", "in this two dimensional graph", sep = ""))

points(pca$scores[, 1:2], pch = plot.pch, col = plot.col, bg = plot.col)

legend(2, 3, pch = 21:23, col = 1:3, legend = levels(iris$Species), pt.bg = 1:3)

[Figure: PCA score plot titled "97.8% of the total variance is explained in this two dimensional graph"; x-axis Component One (92.5%); y-axis Component Two (5.3%); legend setosa, versicolor, virginica]

Figure 35: PCA on iris data with the first two components
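The percentages in the axis labels and title of Figure 35 come from the component variances. A compact sketch of that calculation on its own is given below as an addition; var_explained is an illustrative name.

```r
pca <- princomp(iris[, 1:4])

# Each component's variance as a share of the total variance
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained * 100, 1))

# Cumulative share; the second entry is the figure quoted in the plot title
print(round(cumsum(var_explained) * 100, 1))
```

Computing the shares once and reusing them keeps the axis labels and the title consistent with each other.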
