Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R...

197
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Data Visualization with R Data Visualization with R Dhafer Malouche essai.academia.edu/DhaferMalouche Center of Political Studies, Institute of Social Research University of Michigan Ecole Sup´ erieure de la Statistique et de l’Analyse de l’Information, University of Carthage March 29th, 2017, 12:00-1:30 PM 5670 and 5769 Haven Hall Department of Political Science, University of Michigan D. Malouche | LSA, UoM, 29/3/17 1/ 115

Transcript of Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R...

Page 1: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Data Visualization with R

Dhafer Maloucheessai.academia.edu/DhaferMalouche

Center of Political Studies,Institute of Social Research

University of Michigan

Ecole Superieure de la Statistiqueet de l’Analyse de l’Information,

University of Carthage

March 29th, 2017, 12:00-1:30 PM 5670 and 5769 Haven HallDepartment of Political Science, University of Michigan

D. Malouche | LSA, UoM, 29/3/171 /

115

Page 2: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

D. Malouche | LSA, UoM, 29/3/172 /

115

Page 3: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Source: Knoema websiteR package: Knoema on Github

D. Malouche | LSA, UoM, 29/3/173 /

115

Page 4: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Outline

1 R packagesggplot2sjPlottabplot

2 Visualizing multivariate:Categorical DataQuantitative Data

3 Visualizing Data with target variable and results of statisticalmodels.

D. Malouche | LSA, UoM, 29/3/174 /

115

Page 5: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

R packages

ggplot2, programminggraphssjPlot, for Social Scientistsfsmb, Radar Chartstabplot, Large data

D. Malouche | LSA, UoM, 29/3/175 /

115

Page 6: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

D. Malouche | LSA, UoM, 29/3/176 /

115

Page 7: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

HadleyWickham, 2005

D. Malouche | LSA, UoM, 29/3/177 /

115

Page 8: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> dat <- data.frame(+ time = factor(c("Lunch","Dinner"), levels=c("Lunch","Dinner")),+ total_bill = c(14.89, 17.23)+ )> dat

time total_bill1 Lunch 14.892 Dinner 17.23

> library(ggplot2)> ggplot(data=dat, aes(x=time, y=total_bill, fill=time)) ++ geom_bar(colour="black", fill="#DD8888", width=.8, stat="identity") ++ guides(fill=FALSE) ++ xlab("Time of day") + ylab("Total bill") ++ ggtitle("Average bill for 2 people")

0

5

10

15

Lunch Dinner

Time of day

Tota

l bill

Average bill for 2 people

D. Malouche | LSA, UoM, 29/3/178 /

115

Page 9: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> dat <- data.frame(+ time = factor(c("Lunch","Dinner"), levels=c("Lunch","Dinner")),+ total_bill = c(14.89, 17.23)+ )> dat

time total_bill1 Lunch 14.892 Dinner 17.23

> library(ggplot2)> ggplot(data=dat, aes(x=time, y=total_bill, fill=time)) ++ geom_bar(colour="black", fill="#DD8888", width=.8, stat="identity") ++ guides(fill=FALSE) ++ xlab("Time of day") + ylab("Total bill") ++ ggtitle("Average bill for 2 people")

0

5

10

15

Lunch Dinner

Time of day

Tota

l bill

Average bill for 2 people

D. Malouche | LSA, UoM, 29/3/178 /

115

Page 10: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> dat <- data.frame(+ time = factor(c("Lunch","Dinner"), levels=c("Lunch","Dinner")),+ total_bill = c(14.89, 17.23)+ )> dat

time total_bill1 Lunch 14.892 Dinner 17.23

> library(ggplot2)> ggplot(data=dat, aes(x=time, y=total_bill, fill=time)) ++ geom_bar(colour="black", fill="#DD8888", width=.8, stat="identity") ++ guides(fill=FALSE) ++ xlab("Time of day") + ylab("Total bill") ++ ggtitle("Average bill for 2 people")

0

5

10

15

Lunch Dinner

Time of day

Tota

l bill

Average bill for 2 people

D. Malouche | LSA, UoM, 29/3/178 /

115

Page 11: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> library(reshape2)> data(tips)> head(tips)

total_bill tip sex smoker day time size1 16.99 1.01 Female No Sun Dinner 22 10.34 1.66 Male No Sun Dinner 33 21.01 3.50 Male No Sun Dinner 34 23.68 3.31 Male No Sun Dinner 25 24.59 3.61 Female No Sun Dinner 46 25.29 4.71 Male No Sun Dinner 4> levels(tips$day)[1] "Fri" "Sat" "Sun" "Thur"> tips$day=factor(tips$day,levels=levels(tips$day)[c(4,1,2,3)])

D. Malouche | LSA, UoM, 29/3/179 /

115

Page 12: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> library(ggplot2)> ggplot(data=tips, aes(x=day)) ++ geom_bar(stat="count")

0

25

50

75

Thur Fri Sat Sun

day

coun

t

D. Malouche | LSA, UoM, 29/3/1710 /

115

Page 13: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> library(ggplot2)> ggplot(data=tips, aes(x=day)) ++ geom_bar(stat="count")

0

25

50

75

Thur Fri Sat Sun

day

coun

t

D. Malouche | LSA, UoM, 29/3/1710 /

115

Page 14: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> library(plyr)> # Calculate the mean of tip for each day> mtips <- ddply(tips, "day", summarise, mtip = mean(tip))> mtips$day=factor(mtips$day,levels=levels(mtips$day)[c(4,1,2,3)])> mtips

day mtip1 Thur 2.7714522 Fri 2.7347373 Sat 2.9931034 Sun 3.255132

> ggplot(data=mtips, aes(x=day,y=mtip)) ++ geom_bar(stat="identity",fill="red",alpha=.6)+theme_bw()+xlab("Day")++ ylab("Average of tips")

0

1

2

3

Sun Thur Fri Sat

Day

Ave

rage

of t

ips

D. Malouche | LSA, UoM, 29/3/1711 /

115

Page 15: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> library(plyr)> # Calculate the mean of tip for each day> mtips <- ddply(tips, "day", summarise, mtip = mean(tip))> mtips$day=factor(mtips$day,levels=levels(mtips$day)[c(4,1,2,3)])> mtips

day mtip1 Thur 2.7714522 Fri 2.7347373 Sat 2.9931034 Sun 3.255132

> ggplot(data=mtips, aes(x=day,y=mtip)) ++ geom_bar(stat="identity",fill="red",alpha=.6)+theme_bw()+xlab("Day")++ ylab("Average of tips")

0

1

2

3

Sun Thur Fri Sat

Day

Ave

rage

of t

ips

D. Malouche | LSA, UoM, 29/3/1711 /

115

Page 16: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> library(plyr)> # Calculate the mean of tip for each day> mtips <- ddply(tips, "day", summarise, mtip = mean(tip))> mtips$day=factor(mtips$day,levels=levels(mtips$day)[c(4,1,2,3)])> mtips

day mtip1 Thur 2.7714522 Fri 2.7347373 Sat 2.9931034 Sun 3.255132

> ggplot(data=mtips, aes(x=day,y=mtip)) ++ geom_bar(stat="identity",fill="red",alpha=.6)+theme_bw()+xlab("Day")++ ylab("Average of tips")

0

1

2

3

Sun Thur Fri Sat

Day

Ave

rage

of t

ips

D. Malouche | LSA, UoM, 29/3/1711 /

115

Page 17: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> library(plyr)> # Calculate the mean of tip for each day> mtips <- ddply(tips, "day", summarise, mtip = mean(tip),stip=sd(tip))> mtips$day=factor(mtips$day,levels=levels(mtips$day)[c(4,1,2,3)])> mtips$lower=mtips$mtip-2*mtips$stip> mtips$upper=mtips$mtip+2*mtips$stip> mtips$day=factor(mtips$day,levels=levels(mtips$day)[c(4,1,2,3)])> mtips

day mtip stip lower upper1 Thur 2.771452 1.240223 0.2910052 5.2518982 Fri 2.734737 1.019577 0.6955827 4.7738913 Sat 2.993103 1.631014 -0.2689252 6.2551324 Sun 3.255132 1.234880 0.7853710 5.724892

D. Malouche | LSA, UoM, 29/3/1712 /

115

Page 18: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(mtips,aes(x=day,y=mtip,group=day))++ geom_errorbar(aes(ymin=lower,ymax=upper,width=.2))++ geom_point(size=3)+theme_bw()+xlab("Day")+ylab("Average of tips")

0

2

4

6

Sat Sun Thur Fri

Day

Ave

rage

of t

ips

D. Malouche | LSA, UoM, 29/3/1713 /

115

Page 19: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(mtips,aes(x=day,y=mtip,group=day))++ geom_errorbar(aes(ymin=lower,ymax=upper,width=.2))++ geom_point(size=3)+theme_bw()+xlab("Day")+ylab("Average of tips")

0

2

4

6

Sat Sun Thur Fri

Day

Ave

rage

of t

ips

D. Malouche | LSA, UoM, 29/3/1713 /

115

Page 20: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> library(plyr)> # Calculate the mean of tip for each day> mtips <- ddply(tips, c("day","sex","smoker"), summarise, mtip = mean+ (tip),stip=sd(tip))> mtips$day=factor(mtips$day,levels=levels(mtips$day)[c(4,1,2,3)])> mtips$lower=mtips$mtip-2*mtips$stip> mtips$upper=mtips$mtip+2*mtips$stip> mtips$day=factor(mtips$day,levels=levels(mtips$day)[c(4,1,2,3)])> mtips

day sex smoker mtip stip lower upper1 Thur Female No 2.459600 1.0783687 0.30286265 4.6163372 Thur Female Yes 2.990000 1.2040487 0.58190255 5.3980973 Thur Male No 2.941500 1.4856233 -0.02974659 5.9127474 Thur Male Yes 3.058000 1.1115735 0.83485308 5.2811475 Fri Female No 3.125000 0.1767767 2.77144661 3.4785536 Fri Female Yes 2.682857 1.0580125 0.56683212 4.7988827 Fri Male No 2.500000 1.4142136 -0.32842712 5.3284278 Fri Male Yes 2.741250 1.1668081 0.40763386 5.0748669 Sat Female No 2.724615 0.9619045 0.80080640 4.64842410 Sat Female Yes 2.868667 1.4613783 -0.05409002 5.79142311 Sat Male No 3.256562 1.8397486 -0.42293469 6.93606012 Sat Male Yes 2.879259 1.7443379 -0.60941660 6.36793513 Sun Female No 3.329286 1.2823564 0.76457293 5.89399814 Sun Female Yes 3.500000 0.4082483 2.68350342 4.31649715 Sun Male No 3.115349 1.2164005 0.68254779 5.54815016 Sun Male Yes 3.521333 1.4174316 0.68647010 6.356197

D. Malouche | LSA, UoM, 29/3/1714 /

115

Page 21: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> pd <- position_dodge(0.4)> ggplot(mtips,aes(x=sex,y=mtip,col=smoker,group=smoker))++ geom_errorbar(aes(ymin=lower,ymax=upper),position=pd,width=.2)++ geom_point(size=3,position=pd)+theme_bw()+xlab("Gender")++ ylab("Average of tips")+facet_wrap(˜day)

Thur Fri

Sat Sun

Female Male Female Male

0

2

4

6

0

2

4

6

Gender

Ave

rage

of t

ips

smoker

No

Yes

D. Malouche | LSA, UoM, 29/3/1715 /

115

Page 22: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> pd <- position_dodge(0.4)> ggplot(mtips,aes(x=sex,y=mtip,col=smoker,group=smoker))++ geom_errorbar(aes(ymin=lower,ymax=upper),position=pd,width=.2)++ geom_point(size=3,position=pd)+theme_bw()+xlab("Gender")++ ylab("Average of tips")+facet_wrap(˜day)Thur Fri

Sat Sun

Female Male Female Male

0

2

4

6

0

2

4

6

Gender

Ave

rage

of t

ips

smoker

No

Yes

D. Malouche | LSA, UoM, 29/3/1715 /

115

Page 23: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=sex,y=tip,col=smoker,fill=smoker))++ geom_boxplot(position=pd,width=.2,alpha=.5)+theme_bw()+xlab("Gender")++ ylab("Tips")+facet_wrap(˜day)

Sat Sun

Thur Fri

Female Male Female Male

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Gender

Tip

s

smoker

No

Yes

D. Malouche | LSA, UoM, 29/3/1716 /

115

Page 24: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=sex,y=tip,col=smoker,fill=smoker))++ geom_boxplot(position=pd,width=.2,alpha=.5)+theme_bw()+xlab("Gender")++ ylab("Tips")+facet_wrap(˜day)

Sat Sun

Thur Fri

Female Male Female Male

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Gender

Tip

s

smoker

No

Yes

D. Malouche | LSA, UoM, 29/3/1716 /

115

Page 25: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=sex,y=tip,col=smoker,fill=smoker))++ geom_violin(position=pd,width=.2,alpha=.5)+theme_bw()+xlab("Gender")++ ylab("Tips")++ facet_wrap(˜day)

Sat Sun

Thur Fri

Female Male Female Male

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Gender

Tip

s

smoker

No

Yes

D. Malouche | LSA, UoM, 29/3/1717 /

115

Page 26: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=sex,y=tip,col=smoker,fill=smoker))++ geom_violin(position=pd,width=.2,alpha=.5)+theme_bw()+xlab("Gender")++ ylab("Tips")++ facet_wrap(˜day)

Sat Sun

Thur Fri

Female Male Female Male

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Gender

Tip

s

smoker

No

Yes

D. Malouche | LSA, UoM, 29/3/1717 /

115

Page 27: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=day,y=tip,col=time,fill=time))++ geom_boxplot(alpha=.4)+theme_bw()+xlab("Tips")+ylab("")++ facet_grid(sex˜smoker)+ggtitle("Tips in term of Smoker x Gender")

No Yes

Fem

aleM

ale

Thur Fri Sat Sun Thur Fri Sat Sun

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Tips

time

Dinner

Lunch

Tips in term of Smoker x Gender

D. Malouche | LSA, UoM, 29/3/1718 /

115

Page 28: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=day,y=tip,col=time,fill=time))++ geom_boxplot(alpha=.4)+theme_bw()+xlab("Tips")+ylab("")++ facet_grid(sex˜smoker)+ggtitle("Tips in term of Smoker x Gender")

No Yes

Fem

aleM

ale

Thur Fri Sat Sun Thur Fri Sat Sun

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Tips

time

Dinner

Lunch

Tips in term of Smoker x Gender

D. Malouche | LSA, UoM, 29/3/1718 /

115

Page 29: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=total_bill,y=tip,col=time,fill=time))++ geom_smooth(alpha=.4)+theme_bw()+xlab("Tips")+ylab("")++ facet_grid(sex˜smoker)+ggtitle("Tips in term of Smoker x Gender")

No Yes

Fem

aleM

ale

10 20 30 40 50 10 20 30 40 50

0

4

8

12

0

4

8

12

Tips

time

Dinner

Lunch

Tips in term of Smoker x Gender

D. Malouche | LSA, UoM, 29/3/1719 /

115

Page 30: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=total_bill,y=tip,col=time,fill=time))++ geom_smooth(alpha=.4)+theme_bw()+xlab("Tips")+ylab("")++ facet_grid(sex˜smoker)+ggtitle("Tips in term of Smoker x Gender")

No Yes

Fem

aleM

ale

10 20 30 40 50 10 20 30 40 50

0

4

8

12

0

4

8

12

Tips

time

Dinner

Lunch

Tips in term of Smoker x Gender

D. Malouche | LSA, UoM, 29/3/1719 /

115

Page 31: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=total_bill,y=tip,col=time,fill=time))+geom_point()++ geom_smooth(method='lm',alpha=.4)+theme_bw()+xlab("Tips")+ylab("")++ facet_grid(sex˜smoker)+ggtitle("Tips in term of Smoker x Gender")

No Yes

Fem

aleM

ale

10 20 30 40 50 10 20 30 40 50

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Tips

time

Dinner

Lunch

Tips in term of Smoker x Gender

D. Malouche | LSA, UoM, 29/3/1720 /

115

Page 32: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=total_bill,y=tip,col=time,fill=time))+geom_point()++ geom_smooth(method='lm',alpha=.4)+theme_bw()+xlab("Tips")+ylab("")++ facet_grid(sex˜smoker)+ggtitle("Tips in term of Smoker x Gender")

No Yes

Fem

aleM

ale

10 20 30 40 50 10 20 30 40 50

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Tips

time

Dinner

Lunch

Tips in term of Smoker x Gender

D. Malouche | LSA, UoM, 29/3/1720 /

115

Page 33: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=total_bill,y=tip,col=time,fill=time,size=size))+geom_point()++ geom_smooth(method='lm',alpha=.4)+theme_bw()+xlab("Tips")+ylab("")++ facet_grid(sex˜smoker)+ggtitle("Tips in term of Smoker x Gender")

No Yes

Fem

aleM

ale

10 20 30 40 50 10 20 30 40 50

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Tips

time

Dinner

Lunch

size

1

2

3

4

5

6

Tips in term of Smoker x Gender

D. Malouche | LSA, UoM, 29/3/1721 /

115

Page 34: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggplot2

> ggplot(tips,aes(x=total_bill,y=tip,col=time,fill=time,size=size))+geom_point()++ geom_smooth(method='lm',alpha=.4)+theme_bw()+xlab("Tips")+ylab("")++ facet_grid(sex˜smoker)+ggtitle("Tips in term of Smoker x Gender")

No Yes

Fem

aleM

ale

10 20 30 40 50 10 20 30 40 50

2.5

5.0

7.5

10.0

2.5

5.0

7.5

10.0

Tips

time

Dinner

Lunch

size

1

2

3

4

5

6

Tips in term of Smoker x Gender

D. Malouche | LSA, UoM, 29/3/1721 /

115

Page 35: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

GUI for ggplot2

JGR, Deducer...Rcmdr,RmcdrPlugin.KMggplot2

D. Malouche | LSA, UoM, 29/3/1722 /

115

Page 36: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

GUI for ggplot2

JGR, Deducer...

Rcmdr,RmcdrPlugin.KMggplot2

D. Malouche | LSA, UoM, 29/3/1722 /

115

Page 37: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

GUI for ggplot2

JGR, Deducer...

Rcmdr,RmcdrPlugin.KMggplot2

D. Malouche | LSA, UoM, 29/3/1722 /

115

Page 38: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot

D. Malouche | LSA, UoM, 29/3/1723 /

115

Page 39: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot

Author: Daniel Lüdecke [email protected]: http://www.strengejacke.de/sjPlot/It’s a Data Visualization package for Statistics in Social ScienceIt contains functions to import data from different formats: SPSS,STATA, SAS. . . etc.Labeling and handling factor variables in the data.

D. Malouche | LSA, UoM, 29/3/1724 /

115

Page 40: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Bar charts

> ## Load the package and define your theme (there are a lot...).> library(sjPlot)> library(sjmisc)> library(ggplot2)> sjp.setTheme(geom.outline.color = "antiquewhite4",+ geom.outline.size = 1,+ geom.label.size = 2,+ geom.label.color = "black",+ title.color = "red",+ title.size = 1.5,+ axis.textcolor = "blue",+ base = theme_bw())> ## Load data and represent the bar chart of one the variables.> data(efc)> attr(efc$e42dep,"labels")

independent slightly dependent moderately dependent1 2 3

severely dependent4

> sjp.frq(efc$e42dep,coord.flip = T,geom.size = .4)

66 (7.3%)

225 (25.0%)

306 (34.0%)

304 (33.7%)

independent

slightly dependent

moderately dependent

severely dependent

0 100 200 300

elde

r's d

epen

denc

y

D. Malouche | LSA, UoM, 29/3/1725 /

115

Page 41: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Bar charts

> ## Load the package and define your theme (there are a lot...).> library(sjPlot)> library(sjmisc)> library(ggplot2)> sjp.setTheme(geom.outline.color = "antiquewhite4",+ geom.outline.size = 1,+ geom.label.size = 2,+ geom.label.color = "black",+ title.color = "red",+ title.size = 1.5,+ axis.textcolor = "blue",+ base = theme_bw())> ## Load data and represent the bar chart of one the variables.> data(efc)> attr(efc$e42dep,"labels")

independent slightly dependent moderately dependent1 2 3

severely dependent4

> sjp.frq(efc$e42dep,coord.flip = T,geom.size = .4)

66 (7.3%)

225 (25.0%)

306 (34.0%)

304 (33.7%)

independent

slightly dependent

moderately dependent

severely dependent

0 100 200 300

elde

r's d

epen

denc

y

D. Malouche | LSA, UoM, 29/3/1725 /

115

Page 42: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Bar charts

> sjp.frq(efc$e42dep,show.prc = T,show.n = F)

7.3%

25.0%

34.0% 33.7%

0

100

200

300

independent slightly dependent moderately dependent severely dependentelder's dependency

D. Malouche | LSA, UoM, 29/3/1726 /

115

Page 43: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Bar charts

> sjp.frq(efc$e42dep,show.prc = T,show.n = F)

7.3%

25.0%

34.0% 33.7%

0

100

200

300

independent slightly dependent moderately dependent severely dependentelder's dependency

D. Malouche | LSA, UoM, 29/3/1726 /

115

Page 44: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Contingency tables

> xtabs(˜efc$e16sex+efc$e42dep)efc$e42dep

efc$e16sex 1 2 3 41 23 70 109 932 43 154 197 211

> sjp.xtab(x = efc$e42dep, grp = efc$e16sex)

7.3%(n=66)

7.1%(n=43)

7.8%(n=23)

24.9%(n=224)

25.4%(n=154)

23.7%(n=70)

34.0%(n=306)32.6%

(n=197)

37.0%(n=109)

33.8%(n=304)

34.9%(n=211)

31.5%(n=93)

0%

20%

40%

independent slightlydependent

moderatelydependent

severelydependent

elder's dependency

elder's gender

male

female

Total

D. Malouche | LSA, UoM, 29/3/1727 /

115

Page 45: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Contingency tables

> xtabs(˜efc$e16sex+efc$e42dep)efc$e42dep

efc$e16sex 1 2 3 41 23 70 109 932 43 154 197 211

> sjp.xtab(x = efc$e42dep, grp = efc$e16sex)

7.3%(n=66)

7.1%(n=43)

7.8%(n=23)

24.9%(n=224)

25.4%(n=154)

23.7%(n=70)

34.0%(n=306)32.6%

(n=197)

37.0%(n=109)

33.8%(n=304)

34.9%(n=211)

31.5%(n=93)

0%

20%

40%

independent slightlydependent

moderatelydependent

severelydependent

elder's dependency

elder's gender

male

female

Total

D. Malouche | LSA, UoM, 29/3/1727 /

115

Page 46: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Contingency tables

> xtabs(˜efc$e16sex+efc$e42dep)efc$e42dep

efc$e16sex 1 2 3 41 23 70 109 932 43 154 197 211

> sjp.xtab(x = efc$e42dep, grp = efc$e16sex)

7.3%(n=66)

7.1%(n=43)

7.8%(n=23)

24.9%(n=224)

25.4%(n=154)

23.7%(n=70)

34.0%(n=306)32.6%

(n=197)

37.0%(n=109)

33.8%(n=304)

34.9%(n=211)

31.5%(n=93)

0%

20%

40%

independent slightlydependent

moderatelydependent

severelydependent

elder's dependency

elder's gender

male

female

Total

D. Malouche | LSA, UoM, 29/3/1727 /

115

Page 47: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Contingency tables, Other options

> sjp.xtab(x = efc$e42dep, grp = efc$e16sex,bar.pos = "stack",+ margin = "row",show.n = F,show.total = F,+ summary.pos = "l")

34.9%31.2%

35.6%30.6%

65.2%68.8%

64.4%69.4%

0%

20%

40%

60%

80%

100%

independent slightlydependent

moderatelydependent

severelydependent

elder's dependency

elder's gender

male

female

D. Malouche | LSA, UoM, 29/3/1728 /

115

Page 48: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Contingency tables, Other options

> sjp.xtab(x = efc$e42dep, grp = efc$e16sex,bar.pos = "stack",+ margin = "row",show.n = F,show.total = F,+ summary.pos = "l")

34.9%31.2%

35.6%30.6%

65.2%68.8%

64.4%69.4%

0%

20%

40%

60%

80%

100%

independent slightlydependent

moderatelydependent

severelydependent

elder's dependency

elder's gender

male

female

D. Malouche | LSA, UoM, 29/3/1728 /

115

Page 49: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Contingency tables, Other options

We replace bars with lines

> sjp.xtab(x = efc$e42dep, grp = efc$e16sex,+ show.n = F,show.total = F,+ type="line")

7.1%7.8%

25.4%

23.7%

32.6%

37.0%

34.9%

31.5%

0%

20%

40%

independent slightlydependent

moderatelydependent

severelydependent

elder's dependency

group

male

female

D. Malouche | LSA, UoM, 29/3/1729 /

115

Page 50: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Contingency tables, Other options

We replace bars with lines

> sjp.xtab(x = efc$e42dep, grp = efc$e16sex,+ show.n = F,show.total = F,+ type="line")

7.1%7.8%

25.4%

23.7%

32.6%

37.0%

34.9%

31.5%

0%

20%

40%

independent slightlydependent

moderatelydependent

severelydependent

elder's dependency

group

male

female

D. Malouche | LSA, UoM, 29/3/1729 /

115

Page 51: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Stacked bar plot

Plot multiple variables with same categories.

> # recveive first item of COPE-index scale> start <- which(colnames(efc) == "c82cop1")> # recveive first item of COPE-index scale> end <- which(colnames(efc) == "c90cop9")> sjp.stackfrq(efc[, start:end], expand.grid = TRUE,+ geom.size = .4,sort.frq = "last.desc")

0.3%10.8% 65.6% 23.3%

20.6% 60.6% 14.4% 4.3%

57.2% 27.9% 9.1% 5.8%

45.5% 38.5% 9.5% 6.5%

69.4% 23.4% 5.5%1.7%

79.2% 14.6% 4.3%1.9%

37.3% 41.6% 12.6% 8.6%

34.7% 26.3% 26.7% 12.2%

8.6% 23.6% 33.8% 34.0%

does caregiving causedifficulties in your

relationship with your family?(n=902)

does caregiving causefinancial difficulties?

(n=900)

do you find caregiving toodemanding? (n=902)

does caregiving causedifficulties in your

relationship with yourfriends? (n=902)

does caregiving have negativeeffect on your physical

health? (n=898)

do you feel trapped in yourrole as caregiver? (n=900)

do you feel supported byfriends/neighbours? (n=901)

do you feel you cope well ascaregiver? (n=901)

do you feel caregivingworthwhile? (n=888)

0% 20% 40% 60% 80% 100%

never

sometimes

often

always

D. Malouche | LSA, UoM, 29/3/1730 /

115

Page 52: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Stacked bar plot

Plot multiple variables with same categories.

> # recveive first item of COPE-index scale> start <- which(colnames(efc) == "c82cop1")> # recveive first item of COPE-index scale> end <- which(colnames(efc) == "c90cop9")> sjp.stackfrq(efc[, start:end], expand.grid = TRUE,+ geom.size = .4,sort.frq = "last.desc")

0.3%10.8% 65.6% 23.3%

20.6% 60.6% 14.4% 4.3%

57.2% 27.9% 9.1% 5.8%

45.5% 38.5% 9.5% 6.5%

69.4% 23.4% 5.5%1.7%

79.2% 14.6% 4.3%1.9%

37.3% 41.6% 12.6% 8.6%

34.7% 26.3% 26.7% 12.2%

8.6% 23.6% 33.8% 34.0%

does caregiving causedifficulties in your

relationship with your family?(n=902)

does caregiving causefinancial difficulties?

(n=900)

do you find caregiving toodemanding? (n=902)

does caregiving causedifficulties in your

relationship with yourfriends? (n=902)

does caregiving have negativeeffect on your physical

health? (n=898)

do you feel trapped in yourrole as caregiver? (n=900)

do you feel supported byfriends/neighbours? (n=901)

do you feel you cope well ascaregiver? (n=901)

do you feel caregivingworthwhile? (n=888)

0% 20% 40% 60% 80% 100%

never

sometimes

often

always

D. Malouche | LSA, UoM, 29/3/1730 /

115

Page 53: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Likert-scales plots

Create a dummy data set withfive items (columns)500 observations.Each items has 4 category values, two so-called “positive” values(agree and strongly agree) versus two negative values (disagree andstrongly disagree).

D. Malouche | LSA, UoM, 29/3/1731 /

115

Page 54: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Likert-scales plots

> ## Data> mydf <- data.frame(+ question1 = as.factor(sample(1:4, 500, replace = TRUE,+ prob = c(0.25, 0.33, 0.14, 0.28))),+ question2 = as.factor(sample(1:4, 500, replace = TRUE,+ prob = c(0.5, 0.25, 0.15, 0.1))),+ question3 = as.factor(sample(1:4, 500, replace = TRUE,+ prob = c(0.25, 0.1, 0.39, 0.26))),+ question4 = as.factor(sample(1:4, 500, replace = TRUE,+ prob = c(0.17, 0.27, 0.38, 0.16))),+ question5 = as.factor(sample(1:4, 500, replace = TRUE,+ prob = c(0.37, 0.26, 0.16, 0.21)))+ )

D. Malouche | LSA, UoM, 29/3/1732 /

115

Page 55: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Likert-scales plots

> ## Create labels> labels <- c("Strongly agree", "Agree", "Disagree",+ "Strongly disagree")

> ## Create item labels> items <- c("Question 1", "Question 2", "Question 3",+ "Question 4", "Question 5")

D. Malouche | LSA, UoM, 29/3/1733 /

115

Page 56: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Likert-scales plots

> ## Create labels> labels <- c("Strongly agree", "Agree", "Disagree",+ "Strongly disagree")

> ## Create item labels> items <- c("Question 1", "Question 2", "Question 3",+ "Question 4", "Question 5")

D. Malouche | LSA, UoM, 29/3/1733 /

115

Page 57: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Likert-scales plots

> sjp.likert(mydf, axis.labels = items,+ legend.labels = labels,+ geom.size = 0.4)

26.4 36.0

27.6 16.0

10.6 29.0

27.4 49.4

31.8 22.6

14.623.0

37.818.6

34.625.8

14.09.2

16.029.6

Question 5 (n=500)

Question 4 (n=500)

Question 3 (n=500)

Question 2 (n=500)

Question 1 (n=500)

100% 80% 60% 40% 20% 0% 20% 40% 60% 80% 100%

Strongly agree

Agree

Disagree

Strongly disagree

D. Malouche | LSA, UoM, 29/3/1734 /

115

Page 58: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

sjPlot, Likert-scales plots

> sjp.likert(mydf, axis.labels = items,+ legend.labels = labels,+ geom.size = 0.4)

26.4 36.0

27.6 16.0

10.6 29.0

27.4 49.4

31.8 22.6

14.623.0

37.818.6

34.625.8

14.09.2

16.029.6

Question 5 (n=500)

Question 4 (n=500)

Question 3 (n=500)

Question 2 (n=500)

Question 1 (n=500)

100% 80% 60% 40% 20% 0% 20% 40% 60% 80% 100%

Strongly agree

Agree

Disagree

Strongly disagree

D. Malouche | LSA, UoM, 29/3/1734 /

115

Page 59: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Radar Charts

D. Malouche | LSA, UoM, 29/3/1735 /

115

Page 60: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Radar Charts

Radar charts arecalled Spider or Web or Polar charts.a way of comparing multiple quantitative variables.are also useful for seeing which variables are scoring high or lowwithin a dataset.

We can use fmsb package to draw radar charts.

D. Malouche | LSA, UoM, 29/3/1736 /

115

Page 61: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Radar Charts

> library(fmsb)>> # Create data: note in High school for several students> set.seed(99)> data=as.data.frame(matrix( sample( 0:20 , 15 , replace=F) , ncol=5))> colnames(data)=c("math" , "english" , "biology" , "music" , "R-coding" )> rownames(data)=paste("mister" , letters[1:3] , sep="-")> # We add 2 lines to the dataframe: the max and min of each> # topic to show on the plot!> data=rbind(rep(20,5) , rep(0,5) , data)> data

math english biology music R-coding1 20 20 20 20 202 0 0 0 0 0mister-a 12 17 10 19 1mister-b 2 9 4 6 16mister-c 13 15 18 5 20

D. Malouche | LSA, UoM, 29/3/1737 /

115

Page 62: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Radar Charts

> colors_border=c( rgb(0.2,0.5,0.5,0.9),+ rgb(0.8,0.2,0.5,0.9) ,+ rgb(0.7,0.5,0.1,0.9) )> colors_in=c( rgb(0.2,0.5,0.5,0.4),+ rgb(0.8,0.2,0.5,0.4) ,+ rgb(0.7,0.5,0.1,0.4) )> radarchart( data , axistype=1 ,+ #custom polygon+ pcol=colors_border , pfcol=colors_in , plwd=4 , plty=1,+ #custom the grid+ cglcol="grey", cglty=1, axislabcol="grey",+ caxislabels=seq(0,20,5), cglwd=0.8,+ #custom labels+ vlcex=0.8+ )> legend(x=0.7, y=1,+ legend = rownames(data[-c(1,2),]),+ bty = "n", pch=20 ,+ col=colors_in , text.col = "grey", cex=1.2, pt.cex=3)

0

5

10

15

20

math

english

biology music

R−coding

mister−amister−bmister−c

D. Malouche | LSA, UoM, 29/3/1738 /

115

Page 63: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Radar Charts

> colors_border=c( rgb(0.2,0.5,0.5,0.9),+ rgb(0.8,0.2,0.5,0.9) ,+ rgb(0.7,0.5,0.1,0.9) )> colors_in=c( rgb(0.2,0.5,0.5,0.4),+ rgb(0.8,0.2,0.5,0.4) ,+ rgb(0.7,0.5,0.1,0.4) )> radarchart( data , axistype=1 ,+ #custom polygon+ pcol=colors_border , pfcol=colors_in , plwd=4 , plty=1,+ #custom the grid+ cglcol="grey", cglty=1, axislabcol="grey",+ caxislabels=seq(0,20,5), cglwd=0.8,+ #custom labels+ vlcex=0.8+ )> legend(x=0.7, y=1,+ legend = rownames(data[-c(1,2),]),+ bty = "n", pch=20 ,+ col=colors_in , text.col = "grey", cex=1.2, pt.cex=3)

0

5

10

15

20

math

english

biology music

R−coding

mister−amister−bmister−c

D. Malouche | LSA, UoM, 29/3/1738 /

115

Page 64: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

tabplot, Large Data

D. Malouche | LSA, UoM, 29/3/1739 /

115

Page 65: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

tabplot, Large data visualization

1 Explore and analyse large datasets.

2 Discover strange data patterns.

3 Check the occurrence and selectivity of missing values.

D. Malouche | LSA, UoM, 29/3/1740 /

115

Page 66: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Data

> require(ggplot2)Loading required package: ggplot2> data(diamonds)> head(diamonds)# A tibble: 6 » 10

carat cut color clarity depth table price x y z<dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>

1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.432 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.313 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.314 0.29 Premium I VS2 62.4 58 334 4.20 4.23 2.635 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.756 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48

D. Malouche | LSA, UoM, 29/3/1741 /

115

Page 67: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Data

> summary(diamonds)carat cut color clarity

Min. :0.2000 Fair : 1610 D: 6775 SI1 :130651st Qu.:0.4000 Good : 4906 E: 9797 VS2 :12258Median :0.7000 Very Good:12082 F: 9542 SI2 : 9194Mean :0.7979 Premium :13791 G:11292 VS1 : 81713rd Qu.:1.0400 Ideal :21551 H: 8304 VVS2 : 5066Max. :5.0100 I: 5422 VVS1 : 3655

J: 2808 (Other): 2531depth table price x

Min. :43.00 Min. :43.00 Min. : 326 Min. : 0.0001st Qu.:61.00 1st Qu.:56.00 1st Qu.: 950 1st Qu.: 4.710Median :61.80 Median :57.00 Median : 2401 Median : 5.700Mean :61.75 Mean :57.46 Mean : 3933 Mean : 5.7313rd Qu.:62.50 3rd Qu.:59.00 3rd Qu.: 5324 3rd Qu.: 6.540Max. :79.00 Max. :95.00 Max. :18823 Max. :10.740

y zMin. : 0.000 Min. : 0.0001st Qu.: 4.720 1st Qu.: 2.910Median : 5.710 Median : 3.530Mean : 5.735 Mean : 3.5393rd Qu.: 6.540 3rd Qu.: 4.040Max. :58.900 Max. :31.800

D. Malouche | LSA, UoM, 29/3/1742 /

115

Page 68: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Exploring Data

> require(tabplot)> tableplot(diamonds)

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

row bins: 100

objects:53,940

539 (per bin)

carat

0.0 1.0 2.0

cut

FairGoodVery GoodPremiumIdeal

missing

color

DEFGHIJ

missing

clarity

I1SI2SI1VS2VS1VVS2VVS1IF

missing

depth

058 60 62 64

table

0 54 56 58 60

price

0.0 0.5 1.0 1.5

x 1e+04

x

0 2 4 6 8

y

0 2 6 10

z

0 2 4

D. Malouche | LSA, UoM, 29/3/1743 /

115

Page 69: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Exploring Data

> require(tabplot)> tableplot(diamonds)

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

row bins: 100

objects:53,940

539 (per bin)

carat

0.0 1.0 2.0

cut

FairGoodVery GoodPremiumIdeal

missing

color

DEFGHIJ

missing

clarity

I1SI2SI1VS2VS1VVS2VVS1IF

missing

depth

058 60 62 64

table

0 54 56 58 60

price

0.0 0.5 1.0 1.5

x 1e+04

x

0 2 4 6 8

y

0 2 6 10

z

0 2 4

D. Malouche | LSA, UoM, 29/3/1743 /

115

Page 70: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Exploring Data, how it works?

> tableplot(diamonds, nBins=2,select =c(carat,color),decreasing = T)

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

row bins: 2

objects:53,94026,970 (per bin)

carat

0.0 0.5 1.0 1.5

color

DEFGHIJ

missing

D. Malouche | LSA, UoM, 29/3/1744 /

115

Page 71: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Exploring Data, how it works?

> tableplot(diamonds, nBins=2,select =c(carat,color),decreasing = T)

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

row bins: 2

objects:53,94026,970 (per bin)

carat

0.0 0.5 1.0 1.5

color

DEFGHIJ

missing

D. Malouche | LSA, UoM, 29/3/1744 /

115

Page 72: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Zooming data,

> tableplot(diamonds, nBins=5, select = c(carat, price, cut, color, clarity),+ sortCol = price, from = 0, to = 5)

0.0%

0.5%

1.0%

1.5%

2.0%

2.5%

3.0%

3.5%

4.0%

4.5%

5.0%

row bins: 5

objects:2,697 539 (per bin)

carat

0.0 0.5 1.0 1.5 2.0

price

0.0 0.5 1.0 1.5

x 1e+04

cut

FairGoodVery GoodPremiumIdeal

missing

color

DEFGHIJ

missing

clarity

I1SI2SI1VS2VS1VVS2VVS1IF

missing

D. Malouche | LSA, UoM, 29/3/1745 /

115

Page 73: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Zooming data,

> tableplot(diamonds, nBins=5, select = c(carat, price, cut, color, clarity),+ sortCol = price, from = 0, to = 5)

0.0%

0.5%

1.0%

1.5%

2.0%

2.5%

3.0%

3.5%

4.0%

4.5%

5.0%

row bins: 5

objects:2,697 539 (per bin)

carat

0.0 0.5 1.0 1.5 2.0

price

0.0 0.5 1.0 1.5

x 1e+04

cut

FairGoodVery GoodPremiumIdeal

missing

color

DEFGHIJ

missing

clarity

I1SI2SI1VS2VS1VVS2VVS1IF

missing

D. Malouche | LSA, UoM, 29/3/1745 /

115

Page 74: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Filtering data

> tableplot(diamonds, subset = price < 5000 & cut == "Premium")

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

row bins: 100

objects:9,070

91 (per bin)

carat

0.0 0.5 1.0

cut

FairGoodVery GoodPremiumIdeal

missing

color

DEFGHIJ

missing

clarity

I1SI2SI1VS2VS1VVS2VVS1IF

missing

depth

0 59 60 61 62

table

0 56 58 60

price

0 1 2 3 4

x 1e+03

x

0 2 4 6

y

0 2 4 6

z

0 1 2 3 4

D. Malouche | LSA, UoM, 29/3/1746 /

115

Page 75: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Filtering data

> tableplot(diamonds, subset = price < 5000 & cut == "Premium")

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

row bins: 100

objects:9,070

91 (per bin)

carat

0.0 0.5 1.0

cut

FairGoodVery GoodPremiumIdeal

missing

color

DEFGHIJ

missing

clarity

I1SI2SI1VS2VS1VVS2VVS1IF

missing

depth

0 59 60 61 62

table

0 56 58 60

price

0 1 2 3 4

x 1e+03

x

0 2 4 6

y

0 2 4 6

z

0 1 2 3 4

D. Malouche | LSA, UoM, 29/3/1746 /

115

Page 76: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Change colors

> tableplot(diamonds, pals = list(cut="Set1(6)", color="Set5", clarity=rainbow(8)))

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

row bins: 100

objects:53,940

539 (per bin)

carat

0.0 1.0 2.0

cut

FairGoodVery GoodPremiumIdeal

missing

color

DEFGHIJ

missing

clarity

I1SI2SI1VS2VS1VVS2VVS1IF

missing

depth

0 60 62 64

table

0 54565860

price

0.0 0.5 1.0 1.5

x 1e+04

x

0 2 4 6 8

y

0 2 6 10

z

0 2 4

D. Malouche | LSA, UoM, 29/3/1747 /

115

Page 77: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Change colors

> tableplot(diamonds, pals = list(cut="Set1(6)", color="Set5", clarity=rainbow(8)))

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

row bins: 100

objects:53,940

539 (per bin)

carat

0.0 1.0 2.0

cut

FairGoodVery GoodPremiumIdeal

missing

color

DEFGHIJ

missing

clarity

I1SI2SI1VS2VS1VVS2VVS1IF

missing

depth

0 60 62 64

table

0 54 5658 60

price

0.0 0.5 1.0 1.5

x 1e+04

x

0 2 4 6 8

y

0 2 6 10

z

0 2 4

D. Malouche | LSA, UoM, 29/3/1747 /

115

Page 78: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualizing multivariate:Categorical DataQuantitative Data

D. Malouche | LSA, UoM, 29/3/1748 /

115

Page 79: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Categorical Data, Mosaic plots

D. Malouche | LSA, UoM, 29/3/1749 /

115

Page 80: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic Plots with ggmosaic, Titanic Data

> data(Titanic)> titanic <- as.data.frame(Titanic)> titanic$Survived <- factor(titanic$Survived, levels=c("Yes", "No"))> head(titanic)

Class Sex Age Survived Freq1 1st Male Child No 02 2nd Male Child No 03 3rd Male Child No 354 Crew Male Child No 05 1st Female Child No 06 2nd Female Child No 0

D. Malouche | LSA, UoM, 29/3/1750 /

115

Page 81: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic of table Class x Survived

> library(ggplot2)> library(ggmosaic)Loading required package: productplots

Attaching package: 'ggmosaic'The following objects are masked from 'package:productplots':

ddecker, hspine, mosaic, prodcalc, spine, vspine> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class), fill=Survived))

0.00

0.25

0.50

0.75

1.00

1st 2nd 3rd Crew

product(Class)

Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1751 /

115

Page 82: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic of table Class x Survived

> library(ggplot2)> library(ggmosaic)Loading required package: productplots

Attaching package: 'ggmosaic'The following objects are masked from 'package:productplots':

ddecker, hspine, mosaic, prodcalc, spine, vspine> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class), fill=Survived))

0.00

0.25

0.50

0.75

1.00

1st 2nd 3rd Crew

product(Class)

Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1751 /

115

Page 83: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic of table Class x Survived, how it works?

> margin.table(Titanic,margin = c(1,4))Survived

Class No Yes1st 122 2032nd 167 1183rd 528 178Crew 673 212

> prop.table(margin.table(Titanic,margin = c(1,4)),1)Survived

Class No Yes1st 0.3753846 0.62461542nd 0.5859649 0.41403513rd 0.7478754 0.2521246Crew 0.7604520 0.2395480

D. Malouche | LSA, UoM, 29/3/1752 /

115

Page 84: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Customizing the Mosaic plot

> library(scales)> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class), fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))

0%

25%

50%

75%

100%

1st 2nd3rd

Crew

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1753 /

115

Page 85: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Customizing the Mosaic plot

> library(scales)> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class), fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))

0%

25%

50%

75%

100%

1st 2nd3rd

Crew

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1753 /

115

Page 86: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic plot with 3 variables and more

> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class, Age), fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))

0%

25%

50%

75%

100%

Yes:C

hild

No:Child

Yes:A

dult

No:Adult

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1754 /

115

Page 87: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic plot with 3 variables and more

> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class, Age), fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))

0%

25%

50%

75%

100%

Yes:C

hild

No:Child

Yes:A

dult

No:Adult

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1754 /

115

Page 88: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic plot with 3 variables and more

> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Age), fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))++ facet_wrap(˜Class)

3rd Crew

1st 2nd

ChildAdult

ChildAdult

0%

25%

50%

75%

100%

0%

25%

50%

75%

100%

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1755 /

115

Page 89: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic plot with 3 variables and more

> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Age), fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))++ facet_wrap(˜Class)

3rd Crew

1st 2nd

ChildAdult

ChildAdult

0%

25%

50%

75%

100%

0%

25%

50%

75%

100%

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1755 /

115

Page 90: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic plot with 3 variables and more

> margin.table(Titanic,margin = c(1,3,4)), , Survived = No

AgeClass Child Adult

1st 0 1222nd 0 1673rd 52 476Crew 0 673

, , Survived = Yes

AgeClass Child Adult

1st 6 1972nd 24 943rd 27 151Crew 0 212

D. Malouche | LSA, UoM, 29/3/1756 /

115

Page 91: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic plot with 3 variables and more

> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class),fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))++ facet_wrap(˜Age)

Child Adult

1st 2nd3rd

Crew 1st 2nd3rd

Crew

0%

25%

50%

75%

100%

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1757 /

115

Page 92: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Mosaic plot with 3 variables and more

> ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class),fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))++ facet_wrap(˜Age)

Child Adult

1st 2nd3rd

Crew 1st 2nd3rd

Crew

0%

25%

50%

75%

100%

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1757 /

115

Page 93: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Adding frequencies to the mosaic plot

> p<-ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class),fill=Survived))> x=ggplot_build(p)> z=prop.table(margin.table(Titanic,margin = c(1,4)),1)> z=z[,levels(titanic$Survived)]> z1=paste(round(100*as.vector(t(z)),1),"%",sep="")> df=data.frame(xtext=(x$data[[1]]$xmin+x$data[[1]]$xmax)/2,+ ytext=(x$data[[1]]$ymin+x$data[[1]]$ymax)/2,+ value=z1)> df

xtext ytext value1 0.07161517 0.3093300 62.5%2 0.07161517 0.8140973 37.5%3 0.21603135 0.2050437 41.4%4 0.21603135 0.7098110 58.6%5 0.44440254 0.1248604 25.2%6 0.44440254 0.6296277 74.8%7 0.80498637 0.1186320 24%8 0.80498637 0.6233993 76%

D. Malouche | LSA, UoM, 29/3/1758 /

115

Page 94: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Adding frequencies to the mosaic plot

> p<-ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class),fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))> p<-p+geom_text(data=df,aes(x=xtext,y=ytext,label=value))> p

62.5%

37.5%

41.4%

58.6%

25.2%

74.8%

24%

76%

0%

25%

50%

75%

100%

1st 2nd3rd

Crew

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1759 /

115

Page 95: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Adding frequencies to the mosaic plot

> p<-ggplot(data=titanic) ++ geom_mosaic(aes(weight=Freq, x=product(Class),fill=Survived))++ scale_y_continuous(labels=percent) ++ labs(x = "Class",+ y = "Percentage") ++ theme(panel.background = NULL, axis.text.x = element_text(angle=40, vjust=1))> p<-p+geom_text(data=df,aes(x=xtext,y=ytext,label=value))> p

62.5%

37.5%

41.4%

58.6%

25.2%

74.8%

24%

76%

0%

25%

50%

75%

100%

1st 2nd3rd

Crew

Class

Per

cent

age Survived

Yes

No

D. Malouche | LSA, UoM, 29/3/1759 /

115

Page 96: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Quantitative Data, Correlationmatrix

D. Malouche | LSA, UoM, 29/3/1760 /

115

Page 97: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

corrplot package

Display of a correlation matrix, confidence interval.

Contains some algorithms to do matrix reordering.

Good at details, including choosing color, text labels, color labels,layout, etc.

D. Malouche | LSA, UoM, 29/3/1761 /

115

Page 98: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, circles

> library(corrplot)> data(mtcars)> head(mtcars)

mpg cyl disp hp drat wt qsec vs am gear carbMazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1> M <- cor(mtcars)> corrplot(M, method = "circle")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1762 /

115

Page 99: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, circles

> library(corrplot)> data(mtcars)> head(mtcars)

mpg cyl disp hp drat wt qsec vs am gear carbMazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1> M <- cor(mtcars)> corrplot(M, method = "circle")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1762 /

115

Page 100: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, squares

> corrplot(M, method = "square")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1763 /

115

Page 101: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, squares

> corrplot(M, method = "square")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1763 /

115

Page 102: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, ellipses

> corrplot(M, method = "ellipse")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1764 /

115

Page 103: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, ellipses

> corrplot(M, method = "ellipse")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1764 /

115

Page 104: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, numbers

> corrplot(M, method = "number")

1

−0.85

−0.85

−0.78

0.68

−0.87

0.42

0.66

0.6

0.48

−0.55

−0.85

1

0.9

0.83

−0.7

0.78

−0.59

−0.81

−0.52

−0.49

0.53

−0.85

0.9

1

0.79

−0.71

0.89

−0.43

−0.71

−0.59

−0.56

0.39

−0.78

0.83

0.79

1

−0.45

0.66

−0.71

−0.72

−0.24

−0.13

0.75

0.68

−0.7

−0.71

−0.45

1

−0.71

0.09

0.44

0.71

0.7

−0.09

−0.87

0.78

0.89

0.66

−0.71

1

−0.17

−0.55

−0.69

−0.58

0.43

0.42

−0.59

−0.43

−0.71

0.09

−0.17

1

0.74

−0.23

−0.21

−0.66

0.66

−0.81

−0.71

−0.72

0.44

−0.55

0.74

1

0.17

0.21

−0.57

0.6

−0.52

−0.59

−0.24

0.71

−0.69

−0.23

0.17

1

0.79

0.06

0.48

−0.49

−0.56

−0.13

0.7

−0.58

−0.21

0.21

0.79

1

0.27

−0.55

0.53

0.39

0.75

−0.09

0.43

−0.66

−0.57

0.06

0.27

1

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1765 /

115

Page 105: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, numbers

> corrplot(M, method = "number")

1

−0.85

−0.85

−0.78

0.68

−0.87

0.42

0.66

0.6

0.48

−0.55

−0.85

1

0.9

0.83

−0.7

0.78

−0.59

−0.81

−0.52

−0.49

0.53

−0.85

0.9

1

0.79

−0.71

0.89

−0.43

−0.71

−0.59

−0.56

0.39

−0.78

0.83

0.79

1

−0.45

0.66

−0.71

−0.72

−0.24

−0.13

0.75

0.68

−0.7

−0.71

−0.45

1

−0.71

0.09

0.44

0.71

0.7

−0.09

−0.87

0.78

0.89

0.66

−0.71

1

−0.17

−0.55

−0.69

−0.58

0.43

0.42

−0.59

−0.43

−0.71

0.09

−0.17

1

0.74

−0.23

−0.21

−0.66

0.66

−0.81

−0.71

−0.72

0.44

−0.55

0.74

1

0.17

0.21

−0.57

0.6

−0.52

−0.59

−0.24

0.71

−0.69

−0.23

0.17

1

0.79

0.06

0.48

−0.49

−0.56

−0.13

0.7

−0.58

−0.21

0.21

0.79

1

0.27

−0.55

0.53

0.39

0.75

−0.09

0.43

−0.66

−0.57

0.06

0.27

1

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1765 /

115

Page 106: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, pies

> corrplot(M, method = "pie")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1766 /

115

Page 107: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, pies

> corrplot(M, method = "pie")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1766 /

115

Page 108: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, mixed

> corrplot.mixed(M, lower = "ellipse", upper = "number")

1 −0.85

1

−0.85

0.9

1

−0.78

0.83

0.79

1

0.68

−0.7

−0.71

−0.45

1

−0.87

0.78

0.89

0.66

−0.71

1

0.42

−0.59

−0.43

−0.71

0.09

−0.17

1

0.66

−0.81

−0.71

−0.72

0.44

−0.55

0.74

1

0.6

−0.52

−0.59

−0.24

0.71

−0.69

−0.23

0.17

1

0.48

−0.49

−0.56

−0.13

0.7

−0.58

−0.21

0.21

0.79

1

−0.55

0.53

0.39

0.75

−0.09

0.43

−0.66

−0.57

0.06

0.27

1

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1767 /

115

Page 109: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualization Methods, mixed

> corrplot.mixed(M, lower = "ellipse", upper = "number")

1 −0.85

1

−0.85

0.9

1

−0.78

0.83

0.79

1

0.68

−0.7

−0.71

−0.45

1

−0.87

0.78

0.89

0.66

−0.71

1

0.42

−0.59

−0.43

−0.71

0.09

−0.17

1

0.66

−0.81

−0.71

−0.72

0.44

−0.55

0.74

1

0.6

−0.52

−0.59

−0.24

0.71

−0.69

−0.23

0.17

1

0.48

−0.49

−0.56

−0.13

0.7

−0.58

−0.21

0.21

0.79

1

−0.55

0.53

0.39

0.75

−0.09

0.43

−0.66

−0.57

0.06

0.27

1

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1767 /

115

Page 110: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Reorder A Correlation Matrix

AOE based on the angle of eigen vector of the correlation matrix.

FPC for the first principal component order.

hclust for hierarchical clustering order, and hclust.method forthe agglomeration method to be used.

hclust.method should be one of ward, single, complete,average, mcquitty, median or centroid.

alphabet for alphabetical order.

D. Malouche | LSA, UoM, 29/3/1768 /

115

Page 111: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Reorder A Correlation Matrix, AOC

> corrplot(M, order = "AOE")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

gear

am drat

mpg

vs qsec

wt

disp

cyl

hp carb

gear

am

drat

mpg

vs

qsec

wt

disp

cyl

hp

carb

D. Malouche | LSA, UoM, 29/3/1769 /

115

Page 112: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Reorder A Correlation Matrix, AOC

> corrplot(M, order = "AOE")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

gear

am drat

mpg

vs qsec

wt

disp

cyl

hp carb

gear

am

drat

mpg

vs

qsec

wt

disp

cyl

hp

carb

D. Malouche | LSA, UoM, 29/3/1769 /

115

Page 113: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Reorder A Correlation Matrix, hclust

> corrplot(M, order = "hclust")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

carb

wt

hp cyl

disp

qsec

vs mpg

drat

am gear

carb

wt

hp

cyl

disp

qsec

vs

mpg

drat

am

gear

D. Malouche | LSA, UoM, 29/3/1770 /

115

Page 114: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Reorder A Correlation Matrix, hclust

> corrplot(M, order = "hclust")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

carb

wt

hp cyl

disp

qsec

vs mpg

drat

am gear

carb

wt

hp

cyl

disp

qsec

vs

mpg

drat

am

gear

D. Malouche | LSA, UoM, 29/3/1770 /

115

Page 115: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Reorder A Correlation Matrix, FPC

> corrplot(M, order = "FPC")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

cyl

disp

wt

hp carb

qsec

gear

am drat

vs mpg

cyl

disp

wt

hp

carb

qsec

gear

am

drat

vs

mpg

D. Malouche | LSA, UoM, 29/3/1771 /

115

Page 116: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Reorder A Correlation Matrix, FPC

> corrplot(M, order = "FPC")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

cyl

disp

wt

hp carb

qsec

gear

am drat

vs mpg

cyl

disp

wt

hp

carb

qsec

gear

am

drat

vs

mpg

D. Malouche | LSA, UoM, 29/3/1771 /

115

Page 117: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Reorder a Correlation Matrix, alphabet

> corrplot(M, order = "alphabet")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

am carb

cyl

disp

drat

gear

hp mpg

qsec

vs wt

am

carb

cyl

disp

drat

gear

hp

mpg

qsec

vs

wt

D. Malouche | LSA, UoM, 29/3/1772 /

115

Page 118: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Reorder a Correlation Matrix, alphabet

> corrplot(M, order = "alphabet")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

am carb

cyl

disp

drat

gear

hp mpg

qsec

vs wt

am

carb

cyl

disp

drat

gear

hp

mpg

qsec

vs

wt

D. Malouche | LSA, UoM, 29/3/1772 /

115

Page 119: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Customizing the correlation matrix, adding rectangles

> corrplot(M, order = "hclust",addrect = 3)

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

carb

wt

hp cyl

disp

qsec

vs mpg

drat

am gear

carb

wt

hp

cyl

disp

qsec

vs

mpg

drat

am

gear

D. Malouche | LSA, UoM, 29/3/1773 /

115

Page 120: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Customizing the correlation matrix, adding rectangles

> corrplot(M, order = "hclust",addrect = 3)

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

carb

wt

hp cyl

disp

qsec

vs mpg

drat

am gear

carb

wt

hp

cyl

disp

qsec

vs

mpg

drat

am

gear

D. Malouche | LSA, UoM, 29/3/1773 /

115

Page 121: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Customizing the correlation matrix, changing colors

> mycol <- colorRampPalette(c("red", "white", "blue"))> corrplot(M, order = "hclust",addrect = 2,col=mycol(50))

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

carb

wt

hp cyl

disp

qsec

vs mpg

drat

am gear

carb

wt

hp

cyl

disp

qsec

vs

mpg

drat

am

gear

D. Malouche | LSA, UoM, 29/3/1774 /

115

Page 122: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Customizing the correlation matrix, changing colors

> mycol <- colorRampPalette(c("red", "white", "blue"))> corrplot(M, order = "hclust",addrect = 2,col=mycol(50))

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

carb

wt

hp cyl

disp

qsec

vs mpg

drat

am gear

carb

wt

hp

cyl

disp

qsec

vs

mpg

drat

am

gear

D. Malouche | LSA, UoM, 29/3/1774 /

115

Page 123: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Customizing the correlation matrix, changingbackground

> wb <- c("white", "black")> corrplot(M, order = "hclust", addrect = 2, col = wb, bg = "gold2")

−1

0

1

carb

wt

hp cyl

disp

qsec

vs mpg

drat

am gear

carb

wt

hp

cyl

disp

qsec

vs

mpg

drat

am

gear

D. Malouche | LSA, UoM, 29/3/1775 /

115

Page 124: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Customizing the correlation matrix, changingbackground

> wb <- c("white", "black")> corrplot(M, order = "hclust", addrect = 2, col = wb, bg = "gold2")

−1

0

1

carb

wt

hp cyl

disp

qsec

vs mpg

drat

am gear

carb

wt

hp

cyl

disp

qsec

vs

mpg

drat

am

gear

D. Malouche | LSA, UoM, 29/3/1775 /

115

Page 125: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Correlation Independence test

> cor.mtest <- function(mat, conf.level = 0.95) {+ mat <- as.matrix(mat)+ n <- ncol(mat)+ p.mat <- lowCI.mat <- uppCI.mat <- matrix(NA, n, n)+ diag(p.mat) <- 0+ diag(lowCI.mat) <- diag(uppCI.mat) <- 1+ for (i in 1:(n - 1)) {+ for (j in (i + 1):n) {+ tmp <- cor.test(mat[, i], mat[, j], conf.level = conf.level)+ p.mat[i, j] <- p.mat[j, i] <- tmp$p.value+ lowCI.mat[i, j] <- lowCI.mat[j, i] <- tmp$conf.int[1]+ uppCI.mat[i, j] <- uppCI.mat[j, i] <- tmp$conf.int[2]+ }+ }+ return(list(p.mat, lowCI.mat, uppCI.mat))+ }

D. Malouche | LSA, UoM, 29/3/1776 /

115

Page 126: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Correlation Independence test

> res1 <- cor.mtest(mtcars, 0.95)> res1[[1]][1:4,1:4]

[,1] [,2] [,3] [,4][1,] 0.000000e+00 6.112687e-10 9.380327e-10 1.787835e-07[2,] 6.112687e-10 0.000000e+00 1.802838e-12 3.477861e-09[3,] 9.380327e-10 1.802838e-12 0.000000e+00 7.142679e-08[4,] 1.787835e-07 3.477861e-09 7.142679e-08 0.000000e+00> corrplot(M, p.mat = res1[[1]], sig.level = 0.1)

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1777 /

115

Page 127: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Correlation Independence test

> res1 <- cor.mtest(mtcars, 0.95)> res1[[1]][1:4,1:4]

[,1] [,2] [,3] [,4][1,] 0.000000e+00 6.112687e-10 9.380327e-10 1.787835e-07[2,] 6.112687e-10 0.000000e+00 1.802838e-12 3.477861e-09[3,] 9.380327e-10 1.802838e-12 0.000000e+00 7.142679e-08[4,] 1.787835e-07 3.477861e-09 7.142679e-08 0.000000e+00> corrplot(M, p.mat = res1[[1]], sig.level = 0.1)

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1777 /

115

Page 128: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Correlation Independence test

> corrplot(M, p.mat = res1[[1]], sig.level = 0.01,insig = "p-value")

0.02 0.01

0.03

0.18

0.49

0.62

0.01

0.62

0.34

0.01

0.02

0.01

0.62

0.34

0.21

0.24

0.01

0.36

0.26

0.18

0.21

0.36

0.75

0.49

0.24

0.26

0.13

0.03

0.62

0.01

0.75

0.13

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1778 /

115

Page 129: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Correlation Independence test

> corrplot(M, p.mat = res1[[1]], sig.level = 0.01,insig = "p-value")

0.02 0.01

0.03

0.18

0.49

0.62

0.01

0.62

0.34

0.01

0.02

0.01

0.62

0.34

0.21

0.24

0.01

0.36

0.26

0.18

0.21

0.36

0.75

0.49

0.24

0.26

0.13

0.03

0.62

0.01

0.75

0.13

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1778 /

115

Page 130: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Correlation Independence test

> corrplot(M, p.mat = res1[[1]], sig.level = 0.01,insig = "blank")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1779 /

115

Page 131: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Correlation Independence test

> corrplot(M, p.mat = res1[[1]], sig.level = 0.01,insig = "blank")

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

mpg

cyl

disp

hp drat

wt

qsec

vs am gear

carb

mpg

cyl

disp

hp

drat

wt

qsec

vs

am

gear

carb

D. Malouche | LSA, UoM, 29/3/1779 /

115

Page 132: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Visualizing data with targetvariable and results of

Statistical Models.

D. Malouche | LSA, UoM, 29/3/1780 /

115

Page 133: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Regression models, sjPlot

D. Malouche | LSA, UoM, 29/3/1781 /

115

Page 134: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ANOVA

> library(sjmisc)> library(sjPlot)#refugeeswelcome> data(efc)> attr(efc$e42dep,"labels")

independent slightly dependent moderately dependent1 2 3

severely dependent4

D. Malouche | LSA, UoM, 29/3/1782 /

115

Page 135: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ANOVA

> summary(lm(efc$c12hour˜as.factor(efc$e42dep)))

Call:lm(formula = efc$c12hour ˜ as.factor(efc$e42dep))

Residuals:Min 1Q Median 3Q Max

-71.901 -24.520 -7.538 9.099 150.462

Coefficients:Estimate Std. Error t value Pr(>|t|)

(Intercept) 9.909 5.445 1.820 0.0691 .as.factor(efc$e42dep)2 7.629 6.193 1.232 0.2183as.factor(efc$e42dep)3 24.611 6.004 4.099 4.52e-05 ***as.factor(efc$e42dep)4 65.992 6.007 10.985 < 2e-16 ***---Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 44.24 on 897 degrees of freedom(7 observations deleted due to missingness)

Multiple R-squared: 0.2448, Adjusted R-squared: 0.2422F-statistic: 96.91 on 3 and 897 DF, p-value: < 2.2e-16>

D. Malouche | LSA, UoM, 29/3/1783 /

115

Page 136: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ANOVA

> x=sjp.aov1(efc$c12hour, efc$e42dep)> names(x)[1] "plot" "data"> x$data

term estimate conf.low conf.high p.value p.string xpos1 (Intercept) 9.909091 -0.7779303 20.59611 6.913035e-02 9.91 12 var.grp2 7.628687 -4.5251081 19.78248 2.183125e-01 7.63 23 var.grp3 24.610517 12.8272036 36.39383 4.523872e-05 24.61 *** 34 var.grp4 65.992225 54.2020366 77.78241 1.994596e-26 65.99 *** 4

geom.color1 #3366a02 #3366a03 #3366a04 #3366a0> x$plot ## to plot the ANOVA

9.91

7.63

24.61 ***

65.99 ***

independent (Intercept)

slightly dependent

moderately dependent

severely dependent

0 20 40 60 80

elder's dependency by average number of hours ofcare per week

D. Malouche | LSA, UoM, 29/3/1784 /

115

Page 137: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ANOVA

> x=sjp.aov1(efc$c12hour, efc$e42dep)> names(x)[1] "plot" "data"> x$data

term estimate conf.low conf.high p.value p.string xpos1 (Intercept) 9.909091 -0.7779303 20.59611 6.913035e-02 9.91 12 var.grp2 7.628687 -4.5251081 19.78248 2.183125e-01 7.63 23 var.grp3 24.610517 12.8272036 36.39383 4.523872e-05 24.61 *** 34 var.grp4 65.992225 54.2020366 77.78241 1.994596e-26 65.99 *** 4

geom.color1 #3366a02 #3366a03 #3366a04 #3366a0> x$plot ## to plot the ANOVA

9.91

7.63

24.61 ***

65.99 ***

independent (Intercept)

slightly dependent

moderately dependent

severely dependent

0 20 40 60 80

elder's dependency by average number of hours ofcare per week

D. Malouche | LSA, UoM, 29/3/1784 /

115

Page 138: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Pearson’s Chi2-tests

> # create data frame with 5 dichotomous (dummy) variables> mydf <- data.frame(as.factor(sample(1:2, 100, replace=TRUE)),+ as.factor(sample(1:2, 100, replace=TRUE)),+ as.factor(sample(1:2, 100, replace=TRUE)),+ as.factor(sample(1:2, 100, replace=TRUE)),+ as.factor(sample(1:2, 100, replace=TRUE)))> colnames(mydf)=c("x1","x2","x3","x4","x5")> # create variable labels> items <- list(c("Item 1", "Item 2", "Item 3", "Item 4", "Item 5"))>> # plot Chi2-contingency-table> x=sjp.chi2(mydf, axis.labels = items)> x$mydf[1:2,]

Row Column Chi.Square df p.value1 x1 x1 95.9370 1 0.00002 x2 x1 0.0054 1 0.9417> chisq.test(xtabs(˜mydf$x1+mydf$x2))

Pearson's Chi-squared test with Yates' continuity correction

data: xtabs(˜mydf$x1 + mydf$x2)X-squared = 0.0053545, df = 1, p-value = 0.9417> x$plot

0.000 0.942 0.343 0.839 0.871

0.942 0.000 0.816 0.841 0.747

0.343 0.816 0.000 0.161 0.398

0.839 0.841 0.161 0.000 0.108

0.871 0.747 0.398 0.108 0.000

Item 1

Item 2

Item 3

Item 4

Item 5

Item 1 Item 2 Item 3 Item 4 Item 5

Pearson's Chi2−Test of Independence

D. Malouche | LSA, UoM, 29/3/1785 /

115

Page 139: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Pearson’s Chi2-tests

> # create data frame with 5 dichotomous (dummy) variables> mydf <- data.frame(as.factor(sample(1:2, 100, replace=TRUE)),+ as.factor(sample(1:2, 100, replace=TRUE)),+ as.factor(sample(1:2, 100, replace=TRUE)),+ as.factor(sample(1:2, 100, replace=TRUE)),+ as.factor(sample(1:2, 100, replace=TRUE)))> colnames(mydf)=c("x1","x2","x3","x4","x5")> # create variable labels> items <- list(c("Item 1", "Item 2", "Item 3", "Item 4", "Item 5"))>> # plot Chi2-contingency-table> x=sjp.chi2(mydf, axis.labels = items)> x$mydf[1:2,]

Row Column Chi.Square df p.value1 x1 x1 95.9370 1 0.00002 x2 x1 0.0054 1 0.9417> chisq.test(xtabs(˜mydf$x1+mydf$x2))

Pearson's Chi-squared test with Yates' continuity correction

data: xtabs(˜mydf$x1 + mydf$x2)X-squared = 0.0053545, df = 1, p-value = 0.9417> x$plot

0.000 0.942 0.343 0.839 0.871

0.942 0.000 0.816 0.841 0.747

0.343 0.816 0.000 0.161 0.398

0.839 0.841 0.161 0.000 0.108

0.871 0.747 0.398 0.108 0.000

Item 1

Item 2

Item 3

Item 4

Item 5

Item 1 Item 2 Item 3 Item 4 Item 5

Pearson's Chi2−Test of Independence

D. Malouche | LSA, UoM, 29/3/1785 /

115

Page 140: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients

> # fit linear model> fit <- lm(Ozone ˜ Wind + Temp + Solar.R, data=airquality)> summary(fit)

Call:lm(formula = Ozone ˜ Wind + Temp + Solar.R, data = airquality)

Residuals:Min 1Q Median 3Q Max

-40.485 -14.219 -3.551 10.097 95.619

Coefficients:Estimate Std. Error t value Pr(>|t|)

(Intercept) -64.34208 23.05472 -2.791 0.00623 **Wind -3.33359 0.65441 -5.094 1.52e-06 ***Temp 1.65209 0.25353 6.516 2.42e-09 ***Solar.R 0.05982 0.02319 2.580 0.01124 *---Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 21.18 on 107 degrees of freedom(42 observations deleted due to missingness)

Multiple R-squared: 0.6059, Adjusted R-squared: 0.5948F-statistic: 54.83 on 3 and 107 DF, p-value: < 2.2e-16

D. Malouche | LSA, UoM, 29/3/1786 /

115

Page 141: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients

> x=sjp.lm(fit, grid.breaks = 2)> x$data# A tibble: 3 » 8

xpos term estimate conf.low conf.high p.string p.value* <fctr> <chr> <dbl> <dbl> <dbl> <chr> <dbl>1 1 Wind -3.33359131 -4.63087706 -2.0363055 -3.33 *** 1.515934e-062 2 Solar.R 0.05982059 0.01385613 0.1057851 0.06 * 1.123664e-023 3 Temp 1.65209291 1.14949967 2.1546862 1.65 *** 2.423506e-09# ... with 1 more variables: group <lgl>> x$plot

−3.33 ***

0.06 *

1.65 ***

Wind

Solar.R

Temp

−4.7 −2.7 −0.7 1.3

Estimates

Ozone

D. Malouche | LSA, UoM, 29/3/1787 /

115

Page 142: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients

> x=sjp.lm(fit, grid.breaks = 2)> x$data# A tibble: 3 » 8

xpos term estimate conf.low conf.high p.string p.value* <fctr> <chr> <dbl> <dbl> <dbl> <chr> <dbl>1 1 Wind -3.33359131 -4.63087706 -2.0363055 -3.33 *** 1.515934e-062 2 Solar.R 0.05982059 0.01385613 0.1057851 0.06 * 1.123664e-023 3 Temp 1.65209291 1.14949967 2.1546862 1.65 *** 2.423506e-09# ... with 1 more variables: group <lgl>> x$plot

−3.33 ***

0.06 *

1.65 ***

Wind

Solar.R

Temp

−4.7 −2.7 −0.7 1.3

Estimates

Ozone

D. Malouche | LSA, UoM, 29/3/1787 /

115

Page 143: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, slopes for each predictor

> x=sjp.lm(fit, grid.breaks = 2,type = "slope")> x$df.list[[1]][1:3,]

x y1 7.4 412 8.0 363 12.6 12> airquality[1:3,]

Ozone Solar.R Wind Temp Month Day1 41 190 7.4 67 5 12 36 118 8.0 72 5 23 12 149 12.6 74 5 3> x$plot.list[[1]]> x$plot.list[[2]]> x$plot.list[[3]]> x$plot.list[[1]]> x$plot.list[[2]]> x$plot.list[[3]]

0

50

100

150

5 10 15 20

Wind

Ozo

ne

Ozone

0

50

100

150

60 70 80 90

Temp

Ozo

ne

Ozone

0

50

100

150

0 100 200 300

Solar.R

Ozo

ne

Ozone

D. Malouche | LSA, UoM, 29/3/1788 /

115

Page 144: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, slopes for each predictor

> x=sjp.lm(fit, grid.breaks = 2,type = "slope")> x$df.list[[1]][1:3,]

x y1 7.4 412 8.0 363 12.6 12> airquality[1:3,]

Ozone Solar.R Wind Temp Month Day1 41 190 7.4 67 5 12 36 118 8.0 72 5 23 12 149 12.6 74 5 3> x$plot.list[[1]]> x$plot.list[[2]]> x$plot.list[[3]]> x$plot.list[[1]]> x$plot.list[[2]]> x$plot.list[[3]]

0

50

100

150

5 10 15 20

Wind

Ozo

ne

Ozone

0

50

100

150

60 70 80 90

Temp

Ozo

ne

Ozone

0

50

100

150

0 100 200 300

Solar.R

Ozo

ne

Ozone

D. Malouche | LSA, UoM, 29/3/1788 /

115

Page 145: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, slopes for each predictor

> x=sjp.lm(fit, grid.breaks = 2,type = "slope")> x$df.list[[1]][1:3,]

x y1 7.4 412 8.0 363 12.6 12> airquality[1:3,]

Ozone Solar.R Wind Temp Month Day1 41 190 7.4 67 5 12 36 118 8.0 72 5 23 12 149 12.6 74 5 3> x$plot.list[[1]]> x$plot.list[[2]]> x$plot.list[[3]]> x$plot.list[[1]]> x$plot.list[[2]]> x$plot.list[[3]]

0

50

100

150

5 10 15 20

Wind

Ozo

ne

Ozone

0

50

100

150

60 70 80 90

Temp

Ozo

ne

Ozone

0

50

100

150

0 100 200 300

Solar.R

Ozo

ne

Ozone

D. Malouche | LSA, UoM, 29/3/1788 /

115

Page 146: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, slopes for each predictor

> x=sjp.lm(fit, grid.breaks = 2,type = "slope")> x$df.list[[1]][1:3,]

x y1 7.4 412 8.0 363 12.6 12> airquality[1:3,]

Ozone Solar.R Wind Temp Month Day1 41 190 7.4 67 5 12 36 118 8.0 72 5 23 12 149 12.6 74 5 3> x$plot.list[[1]]> x$plot.list[[2]]> x$plot.list[[3]]> x$plot.list[[1]]> x$plot.list[[2]]> x$plot.list[[3]]

0

50

100

150

5 10 15 20

Wind

Ozo

ne

Ozone

0

50

100

150

60 70 80 90

Temp

Ozo

ne

Ozone

0

50

100

150

0 100 200 300

Solar.R

Ozo

ne

Ozone

D. Malouche | LSA, UoM, 29/3/1788 /

115

Page 147: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, residuals for eachpredictor

> x=sjp.lm(fit, grid.breaks = 2,type = "resid")

0

50

100

5 10 15 20

Wind

resi

dual

s

Ozone

0

50

100

60 70 80 90

Temp

resi

dual

s

Ozone

0

50

100

0 100 200 300

Solar.R

resi

dual

s

Ozone

D. Malouche | LSA, UoM, 29/3/1789 /

115

Page 148: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, residuals for eachpredictor

> x=sjp.lm(fit, grid.breaks = 2,type = "resid")

0

50

100

5 10 15 20

Wind

resi

dual

s

Ozone

0

50

100

60 70 80 90

Temp

resi

dual

s

Ozone

0

50

100

0 100 200 300

Solar.R

resi

dual

s

Ozone

D. Malouche | LSA, UoM, 29/3/1789 /

115

Page 149: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, residuals for eachpredictor

> x=sjp.lm(fit, grid.breaks = 2,type = "resid")

0

50

100

5 10 15 20

Wind

resi

dual

s

Ozone

0

50

100

60 70 80 90

Temp

resi

dual

s

Ozone

0

50

100

0 100 200 300

Solar.R

resi

dual

s

Ozone

D. Malouche | LSA, UoM, 29/3/1789 /

115

Page 150: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, residuals for eachpredictor

> x=sjp.lm(fit, grid.breaks = 2,type = "resid")

0

50

100

5 10 15 20

Wind

resi

dual

s

Ozone

0

50

100

60 70 80 90

Temp

resi

dual

s

Ozone

0

50

100

0 100 200 300

Solar.R

resi

dual

s

Ozone

D. Malouche | LSA, UoM, 29/3/1789 /

115

Page 151: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, checking modelassumptions

> x=sjp.lm(fit, type = "ma")Removed 3 cases during 1 step(s).Rˆ2 / adj. Rˆ2 of original model: 0.605895 / 0.594845Rˆ2 / adj. Rˆ2 of updated model: 0.663962 / 0.654268AIC of original model: 998.717103AIC of updated model: 926.512020

good

tolerable

0.0

2.5

5.0

7.5

10.0

Solar.R

Wind

Tem

p

Variance Inflation Factors (multicollinearity)

−2.5

0.0

2.5

5.0

0 50 100

Theoretical quantiles (predicted values)

Stu

dent

ized

Res

idua

ls

Dots should be plotted along the line

Non−normality of residuals and outliers

0.00

0.05

0.10

0 50 100

Residuals

Den

sity

Distribution should look like normal curve

Non−normality of residuals

0

50

100

0 50 100

Fitted values

Res

idua

ls

Amount and distance of points scattered above/below line is equal or randomly spread

Homoscedasticity (constant variance of residuals)

−3.33 ***

0.06 *

1.65 ***

Wind

Solar.R

Temp

−5 −4 −3 −2 −1 0 1 2

Estimates

Ozone

D. Malouche | LSA, UoM, 29/3/1790 /

115

Page 152: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, checking modelassumptions

> x=sjp.lm(fit, type = "ma")Removed 3 cases during 1 step(s).Rˆ2 / adj. Rˆ2 of original model: 0.605895 / 0.594845Rˆ2 / adj. Rˆ2 of updated model: 0.663962 / 0.654268AIC of original model: 998.717103AIC of updated model: 926.512020

good

tolerable

0.0

2.5

5.0

7.5

10.0

Solar.R

Wind

Tem

p

Variance Inflation Factors (multicollinearity)

−2.5

0.0

2.5

5.0

0 50 100

Theoretical quantiles (predicted values)

Stu

dent

ized

Res

idua

ls

Dots should be plotted along the line

Non−normality of residuals and outliers

0.00

0.05

0.10

0 50 100

Residuals

Den

sity

Distribution should look like normal curve

Non−normality of residuals

0

50

100

0 50 100

Fitted values

Res

idua

ls

Amount and distance of points scattered above/below line is equal or randomly spread

Homoscedasticity (constant variance of residuals)

−3.33 ***

0.06 *

1.65 ***

Wind

Solar.R

Temp

−5 −4 −3 −2 −1 0 1 2

Estimates

Ozone

D. Malouche | LSA, UoM, 29/3/1790 /

115

Page 153: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, checking modelassumptions

> x=sjp.lm(fit, type = "ma")Removed 3 cases during 1 step(s).Rˆ2 / adj. Rˆ2 of original model: 0.605895 / 0.594845Rˆ2 / adj. Rˆ2 of updated model: 0.663962 / 0.654268AIC of original model: 998.717103AIC of updated model: 926.512020

good

tolerable

0.0

2.5

5.0

7.5

10.0

Solar.R

Wind

Tem

p

Variance Inflation Factors (multicollinearity)

−2.5

0.0

2.5

5.0

0 50 100

Theoretical quantiles (predicted values)

Stu

dent

ized

Res

idua

ls

Dots should be plotted along the line

Non−normality of residuals and outliers

0.00

0.05

0.10

0 50 100

Residuals

Den

sity

Distribution should look like normal curve

Non−normality of residuals

0

50

100

0 50 100

Fitted values

Res

idua

ls

Amount and distance of points scattered above/below line is equal or randomly spread

Homoscedasticity (constant variance of residuals)

−3.33 ***

0.06 *

1.65 ***

Wind

Solar.R

Temp

−5 −4 −3 −2 −1 0 1 2

Estimates

Ozone

D. Malouche | LSA, UoM, 29/3/1790 /

115

Page 154: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, checking modelassumptions

> x=sjp.lm(fit, type = "ma")Removed 3 cases during 1 step(s).Rˆ2 / adj. Rˆ2 of original model: 0.605895 / 0.594845Rˆ2 / adj. Rˆ2 of updated model: 0.663962 / 0.654268AIC of original model: 998.717103AIC of updated model: 926.512020

good

tolerable

0.0

2.5

5.0

7.5

10.0

Solar.R

Wind

Tem

p

Variance Inflation Factors (multicollinearity)

−2.5

0.0

2.5

5.0

0 50 100

Theoretical quantiles (predicted values)

Stu

dent

ized

Res

idua

ls

Dots should be plotted along the line

Non−normality of residuals and outliers

0.00

0.05

0.10

0 50 100

Residuals

Den

sity

Distribution should look like normal curve

Non−normality of residuals

0

50

100

0 50 100

Fitted values

Res

idua

ls

Amount and distance of points scattered above/below line is equal or randomly spread

Homoscedasticity (constant variance of residuals)

−3.33 ***

0.06 *

1.65 ***

Wind

Solar.R

Temp

−5 −4 −3 −2 −1 0 1 2

Estimates

Ozone

D. Malouche | LSA, UoM, 29/3/1790 /

115

Page 155: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, checking modelassumptions

> x=sjp.lm(fit, type = "ma")Removed 3 cases during 1 step(s).Rˆ2 / adj. Rˆ2 of original model: 0.605895 / 0.594845Rˆ2 / adj. Rˆ2 of updated model: 0.663962 / 0.654268AIC of original model: 998.717103AIC of updated model: 926.512020

good

tolerable

0.0

2.5

5.0

7.5

10.0

Solar.R

Wind

Tem

p

Variance Inflation Factors (multicollinearity)

−2.5

0.0

2.5

5.0

0 50 100

Theoretical quantiles (predicted values)

Stu

dent

ized

Res

idua

ls

Dots should be plotted along the line

Non−normality of residuals and outliers

0.00

0.05

0.10

0 50 100

Residuals

Den

sity

Distribution should look like normal curve

Non−normality of residuals

0

50

100

0 50 100

Fitted values

Res

idua

ls

Amount and distance of points scattered above/below line is equal or randomly spread

Homoscedasticity (constant variance of residuals)

−3.33 ***

0.06 *

1.65 ***

Wind

Solar.R

Temp

−5 −4 −3 −2 −1 0 1 2

Estimates

Ozone

D. Malouche | LSA, UoM, 29/3/1790 /

115

Page 156: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, checking modelassumptions

> x=sjp.lm(fit, type = "ma")Removed 3 cases during 1 step(s).Rˆ2 / adj. Rˆ2 of original model: 0.605895 / 0.594845Rˆ2 / adj. Rˆ2 of updated model: 0.663962 / 0.654268AIC of original model: 998.717103AIC of updated model: 926.512020

good

tolerable

0.0

2.5

5.0

7.5

10.0

Solar.R

Wind

Tem

p

Variance Inflation Factors (multicollinearity)

−2.5

0.0

2.5

5.0

0 50 100

Theoretical quantiles (predicted values)

Stu

dent

ized

Res

idua

ls

Dots should be plotted along the line

Non−normality of residuals and outliers

0.00

0.05

0.10

0 50 100

Residuals

Den

sity

Distribution should look like normal curve

Non−normality of residuals

0

50

100

0 50 100

Fitted values

Res

idua

ls

Amount and distance of points scattered above/below line is equal or randomly spread

Homoscedasticity (constant variance of residuals)

−3.33 ***

0.06 *

1.65 ***

Wind

Solar.R

Temp

−5 −4 −3 −2 −1 0 1 2

Estimates

Ozone

D. Malouche | LSA, UoM, 29/3/1790 /

115

Page 157: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, Variance Inflation factor

> x=sjp.lm(fit, type = "vif")> x$vifval

Wind Temp Solar.R1.329070 1.431367 1.095253> x$plot

good

tolerable

0.0

2.5

5.0

7.5

10.0

Solar.R

Wind

Tem

p

Variance Inflation Factors (multicollinearity)

D. Malouche | LSA, UoM, 29/3/1791 /

115

Page 158: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Linear models, β coefficients, Variance Inflation factor

> x=sjp.lm(fit, type = "vif")> x$vifval

Wind Temp Solar.R1.329070 1.431367 1.095253> x$plot good

tolerable

0.0

2.5

5.0

7.5

10.0

Solar.R

Wind

Tem

p

Variance Inflation Factors (multicollinearity)

D. Malouche | LSA, UoM, 29/3/1791 /

115

Page 159: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, CA and MCA

D. Malouche | LSA, UoM, 29/3/1792 /

115

Page 160: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA

> library(FactoMineR)> library(factoextra)Loading required package: ggplot2> data(decathlon)> head(decathlon)

100m Long.jump Shot.put High.jump 400m 110m.hurdle DiscusSEBRLE 11.04 7.58 14.83 2.07 49.81 14.69 43.75CLAY 10.76 7.40 14.26 1.86 49.37 14.05 50.72KARPOV 11.02 7.30 14.77 2.04 48.37 14.09 48.95BERNARD 11.02 7.23 14.25 1.92 48.93 14.99 40.87YURKOV 11.34 7.09 15.19 2.10 50.42 15.31 46.26WARNERS 11.11 7.60 14.31 1.98 48.68 14.23 41.10

Pole.vault Javeline 1500m Rank Points CompetitionSEBRLE 5.02 63.19 291.7 1 8217 DecastarCLAY 4.92 60.15 301.5 2 8122 DecastarKARPOV 4.92 50.31 300.2 3 8099 DecastarBERNARD 5.32 62.77 280.1 4 8067 DecastarYURKOV 4.72 63.44 276.4 5 8036 DecastarWARNERS 4.92 51.77 278.1 6 8030 Decastar> pc1=PCA(decathlon,ncp=3,scale.unit = T,quanti.sup=11:12,quali.sup=13,graph = F)

D. Malouche | LSA, UoM, 29/3/1793 /

115

Page 161: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, scree plot

> fviz_screeplot(pc1)

0

10

20

30

1 2 3 4 5 6 7 8 9 10Dimensions

Per

cent

age

of e

xpla

ined

var

ianc

es

Scree plot

D. Malouche | LSA, UoM, 29/3/1794 /

115

Page 162: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, Representing individuals

> fviz_pca_ind(pc1,axes=c(1,2),repel = T,habillage = "Competition",+ addEllipses=TRUE, ellipse.level=0.95)

SEBRLECLAY

KARPOV

BERNARD

YURKOV

WARNERS

ZSIVOCZKY

McMULLEN

MARTINEAUHERNU

BARRAS

NOOL

BOURGUIGNON

Sebrle

Clay

Karpov

Macey

Warners

Zsivoczky

Hernu

Nool

Bernard

Schwarzl

Pogorelov

Schoenbeck

Barras

Smith

Averyanov

Ojaniemi

Smirnov

Qi

Drews

Parkhomenko

Terek

Gomez

Turi

Lorenzo

Karlivans

Korkizoglou

Uldal

Casarsa

−4

−2

0

2

4

−5.0 −2.5 0.0 2.5 5.0Dim1 (32.7%)

Dim

2 (1

7.4%

)

Competition

a

a

Decastar

OlympicG

Individuals − PCA

D. Malouche | LSA, UoM, 29/3/1795 /

115

Page 163: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, Representing individuals

> fviz_pca_ind(pc1,axes=c(1,2),repel = T,habillage = "Competition",+ addEllipses=TRUE, ellipse.level=0.95)

SEBRLECLAY

KARPOV

BERNARD

YURKOV

WARNERS

ZSIVOCZKY

McMULLEN

MARTINEAUHERNU

BARRAS

NOOL

BOURGUIGNON

Sebrle

Clay

Karpov

Macey

Warners

Zsivoczky

Hernu

Nool

Bernard

Schwarzl

Pogorelov

Schoenbeck

Barras

Smith

Averyanov

Ojaniemi

Smirnov

Qi

Drews

Parkhomenko

Terek

Gomez

Turi

Lorenzo

Karlivans

Korkizoglou

Uldal

Casarsa

−4

−2

0

2

4

−5.0 −2.5 0.0 2.5 5.0Dim1 (32.7%)

Dim

2 (1

7.4%

)

Competition

a

a

Decastar

OlympicG

Individuals − PCA

D. Malouche | LSA, UoM, 29/3/1795 /

115

Page 164: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, Circle of correlations

> fviz_pca_var(pc1,axes=c(1,2),repel = T,col.circle = "red")

100m

Long.jump

Shot.put

High.jump

400m

110m.hurdle

Discus

Pole.vault

Javeline1500m

Rank Points

−1.0

−0.5

0.0

0.5

1.0

−1.0 −0.5 0.0 0.5 1.0Dim1 (32.7%)

Dim

2 (1

7.4%

)

Variables − PCA

D. Malouche | LSA, UoM, 29/3/1796 /

115

Page 165: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, Circle of correlations

> fviz_pca_var(pc1,axes=c(1,2),repel = T,col.circle = "red")100m

Long.jump

Shot.put

High.jump

400m

110m.hurdle

Discus

Pole.vault

Javeline1500m

Rank Points

−1.0

−0.5

0.0

0.5

1.0

−1.0 −0.5 0.0 0.5 1.0Dim1 (32.7%)

Dim

2 (1

7.4%

)

Variables − PCA

D. Malouche | LSA, UoM, 29/3/1796 /

115

Page 166: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, Biplot

> fviz_pca_biplot(pc1,axes=c(1,2),repel = T)

SEBRLECLAY

KARPOV

BERNARD

YURKOV

WARNERS

ZSIVOCZKY

McMULLEN

MARTINEAU HERNU

BARRAS

NOOL

BOURGUIGNON

SebrleClay

Karpov

Macey

Warners

Zsivoczky

Hernu

Nool

Bernard

Schwarzl

Pogorelov

Schoenbeck

Barras

Smith

Averyanov

Ojaniemi

Smirnov

Qi

Drews

Parkhomenko

Terek

Gomez

Turi

Lorenzo

Karlivans

Korkizoglou

Uldal

Casarsa

100m

Long.jump

Shot.put

High.jump

400m

110m.hurdle

Discus

Pole.vault

Javeline

1500m

Rank

Points

−2

0

2

4

−2.5 0.0 2.5 5.0Dim1 (32.7%)

Dim

2 (1

7.4%

)

PCA − Biplot

D. Malouche | LSA, UoM, 29/3/1797 /

115

Page 167: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, Biplot

> fviz_pca_biplot(pc1,axes=c(1,2),repel = T) SEBRLECLAY

KARPOV

BERNARD

YURKOV

WARNERS

ZSIVOCZKY

McMULLEN

MARTINEAU HERNU

BARRAS

NOOL

BOURGUIGNON

SebrleClay

Karpov

Macey

Warners

Zsivoczky

Hernu

Nool

Bernard

Schwarzl

Pogorelov

Schoenbeck

Barras

Smith

Averyanov

Ojaniemi

Smirnov

Qi

Drews

Parkhomenko

Terek

Gomez

Turi

Lorenzo

Karlivans

Korkizoglou

Uldal

Casarsa

100m

Long.jump

Shot.put

High.jump

400m

110m.hurdle

Discus

Pole.vault

Javeline

1500m

Rank

Points

−2

0

2

4

−2.5 0.0 2.5 5.0Dim1 (32.7%)

Dim

2 (1

7.4%

)

PCA − Biplot

D. Malouche | LSA, UoM, 29/3/1797 /

115

Page 168: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

CA, Correspondence Analysis> library(vcd)Loading required package: grid> data("Suicide")> head(Suicide)

Freq sex method age age.group method21 4 male poison 10 10-20 poison2 0 male cookgas 10 10-20 gas3 0 male toxicgas 10 10-20 gas4 247 male hang 10 10-20 hang5 1 male drown 10 10-20 drown6 17 male gun 10 10-20 gun> suicide.tab1=xtabs(Freq˜sex+method2,data=Suicide)> suicide.tab1

method2sex poison gas hang drown gun knife jump other

male 8917 2089 14740 946 2945 628 1340 2214female 8648 318 5637 1703 173 309 1505 1070

> suicide.tab2=xtabs(Freq˜age.group+method2,data=Suicide)> suicide.tab2

method2age.group poison gas hang drown gun knife jump other

10-20 2081 375 1736 97 537 58 320 56425-35 4495 996 3326 352 916 180 642 103840-50 4689 716 5417 601 927 263 571 83955-65 3814 246 5595 886 506 257 661 59070-90 2486 74 4303 713 232 179 651 253

> suicide.tab=rbind(suicide.tab2,suicide.tab1)

D. Malouche | LSA, UoM, 29/3/1798 /

115

Page 169: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

CA> suicide.ca=CA(suicide.tab,row.sup = 6:7,graph = F)> summary(suicide.ca)

Call:CA(X = suicide.tab, row.sup = 6:7, graph = F)

The chi square of independence between the two variables is equal to 3422.466(p-value = 0 ).

EigenvaluesDim.1 Dim.2 Dim.3 Dim.4

Variance 0.060 0.002 0.001 0.000% of var. 93.901 3.248 2.298 0.554Cumulative % of var. 93.901 97.149 99.446 100.000

RowsIner*1000 Dim.1 ctr cos2 Dim.2 ctr cos2 Dim.3

10-20 | 10.361 | 0.292 15.339 0.895 | 0.003 0.053 0.000 | -0.10025-35 | 20.579 | 0.297 32.748 0.962 | 0.046 22.935 0.023 | 0.03740-50 | 1.563 | 0.038 0.614 0.237 | -0.063 50.755 0.679 | 0.01655-65 | 10.683 | -0.210 17.271 0.977 | -0.014 2.064 0.004 | -0.00170-90 | 21.168 | -0.351 34.028 0.971 | 0.055 24.193 0.024 | -0.009

ctr cos210-20 73.762 0.105 |25-35 20.723 0.015 |40-50 4.607 0.044 |55-65 0.008 0.000 |70-90 0.900 0.001 |

ColumnsIner*1000 Dim.1 ctr cos2 Dim.2 ctr cos2 Dim.3

poison | 3.561 | 0.103 5.798 0.984 | 0.003 0.101 0.001 | 0.012gas | 16.819 | 0.599 26.878 0.966 | 0.023 1.151 0.001 | 0.107hang | 15.187 | -0.197 24.643 0.981 | -0.025 11.560 0.016 | -0.008drown | 9.633 | -0.431 15.344 0.963 | 0.047 5.251 0.011 | 0.058gun | 8.069 | 0.360 12.572 0.941 | -0.062 10.814 0.028 | -0.063knife | 0.623 | -0.159 0.735 0.712 | -0.042 1.507 0.051 | 0.091jump | 2.032 | -0.088 0.690 0.205 | 0.164 68.814 0.708 | -0.050other | 8.430 | 0.361 13.340 0.956 | 0.016 0.801 0.002 | -0.059

ctr cos2poison 3.011 0.013 |gas 34.719 0.031 |hang 1.786 0.002 |drown 11.486 0.018 |gun 15.697 0.029 |knife 9.881 0.234 |jump 9.056 0.066 |other 14.365 0.025 |

Supplementary rowsDim.1 cos2 Dim.2 cos2 Dim.3 cos2

male | 0.060 0.066 | -0.135 0.336 | -0.051 0.048 |female | -0.105 0.066 | 0.235 0.336 | 0.089 0.048 |

D. Malouche | LSA, UoM, 29/3/1799 /

115

Page 170: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

CA

> fviz_ca_biplot(suicide.ca)

10−20

25−35

40−50

55−65

70−90

male

female

poison

gas

hang

drown

gun

knife

jump

other

−0.1

0.0

0.1

0.2

−0.25 0.00 0.25 0.50Dim1 (93.9%)

Dim

2 (3

.2%

)

CA − Biplot

D. Malouche | LSA, UoM, 29/3/17100 /

115

Page 171: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

CA

> fviz_ca_biplot(suicide.ca)

10−20

25−35

40−50

55−65

70−90

male

female

poison

gas

hang

drown

gun

knife

jump

other

−0.1

0.0

0.1

0.2

−0.25 0.00 0.25 0.50Dim1 (93.9%)

Dim

2 (3

.2%

)

CA − Biplot

D. Malouche | LSA, UoM, 29/3/17100 /

115

Page 172: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

ggfortify

D. Malouche | LSA, UoM, 29/3/17101 /

115

Page 173: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Time series

> library(ggfortify)Loading required package: ggplot2> head(AirPassengers)[1] 112 118 132 129 121 135> class(AirPassengers)[1] "ts"> autoplot(AirPassengers)

200

400

600

1950 1955 1960

D. Malouche | LSA, UoM, 29/3/17102 /

115

Page 174: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Time series

> library(ggfortify)Loading required package: ggplot2> head(AirPassengers)[1] 112 118 132 129 121 135> class(AirPassengers)[1] "ts"> autoplot(AirPassengers)

200

400

600

1950 1955 1960

D. Malouche | LSA, UoM, 29/3/17102 /

115

Page 175: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Time series, Customizing

> p <- autoplot(AirPassengers)> p + ggtitle('AirPassengers') + xlab('Year') + ylab('Passengers')

200

400

600

1950 1955 1960

Year

Pas

seng

ers

AirPassengers

D. Malouche | LSA, UoM, 29/3/17103 /

115

Page 176: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Time series, Customizing

> p <- autoplot(AirPassengers)> p + ggtitle('AirPassengers') + xlab('Year') + ylab('Passengers')

200

400

600

1950 1955 1960

Year

Pas

seng

ers

AirPassengers

D. Malouche | LSA, UoM, 29/3/17103 /

115

Page 177: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Clustering

> set.seed(1)> head(iris)

Sepal.Length Sepal.Width Petal.Length Petal.Width Species1 5.1 3.5 1.4 0.2 setosa2 4.9 3.0 1.4 0.2 setosa3 4.7 3.2 1.3 0.2 setosa4 4.6 3.1 1.5 0.2 setosa5 5.0 3.6 1.4 0.2 setosa6 5.4 3.9 1.7 0.4 setosa> p <- autoplot(kmeans(iris[-5], 3), data = iris)> p

−0.2

−0.1

0.0

0.1

0.2

−0.10 −0.05 0.00 0.05 0.10 0.15

PC1

PC

2

cluster

1

2

3

D. Malouche | LSA, UoM, 29/3/17104 /

115

Page 178: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Clustering

> set.seed(1)> head(iris)

Sepal.Length Sepal.Width Petal.Length Petal.Width Species1 5.1 3.5 1.4 0.2 setosa2 4.9 3.0 1.4 0.2 setosa3 4.7 3.2 1.3 0.2 setosa4 4.6 3.1 1.5 0.2 setosa5 5.0 3.6 1.4 0.2 setosa6 5.4 3.9 1.7 0.4 setosa> p <- autoplot(kmeans(iris[-5], 3), data = iris)> p

−0.2

−0.1

0.0

0.1

0.2

−0.10 −0.05 0.00 0.05 0.10 0.15

PC1

PC

2

cluster

1

2

3

D. Malouche | LSA, UoM, 29/3/17104 /

115

Page 179: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA

> df <- iris[c(1, 2, 3, 4)]> autoplot(prcomp(df))

−0.2

−0.1

0.0

0.1

0.2

−0.10 −0.05 0.00 0.05 0.10 0.15

PC1

PC

2

D. Malouche | LSA, UoM, 29/3/17105 /

115

Page 180: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA

> df <- iris[c(1, 2, 3, 4)]> autoplot(prcomp(df))

−0.2

−0.1

0.0

0.1

0.2

−0.10 −0.05 0.00 0.05 0.10 0.15

PC1

PC

2

D. Malouche | LSA, UoM, 29/3/17105 /

115

Page 181: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, by showing groups! Ellipses

> autoplot(prcomp(df), data = iris, colour = 'Species',+ shape = FALSE, label.size = 3, frame=T, frame.type = 'norm')

1

23

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

2122

2324

25

26

27

2829

3031

32

33

34

35

36

37

38

39

4041

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

6768

6970

7172

73

74

75

76

77 78

79

80

8182

83

8485

86

87

88

89

90 91

92

93

94

95

9697

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116117

118

119

120

121

122

123

124

125

126

127128

129

130

131

132

133134

135

136

137

138

139

140

141

142

143

144145

146

147

148149

150

−0.2

−0.1

0.0

0.1

0.2

−0.1 0.0 0.1

PC1

PC

2

Species

a

a

a

setosa

versicolor

virginica

D. Malouche | LSA, UoM, 29/3/17106 /

115

Page 182: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, by showing groups! Ellipses

> autoplot(prcomp(df), data = iris, colour = 'Species',+ shape = FALSE, label.size = 3, frame=T, frame.type = 'norm')

1

23

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

2122

2324

25

26

27

2829

3031

32

33

34

35

36

37

38

39

4041

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

6768

6970

7172

73

74

75

76

77 78

79

80

8182

83

8485

86

87

88

89

90 91

92

93

94

95

9697

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116117

118

119

120

121

122

123

124

125

126

127128

129

130

131

132

133134

135

136

137

138

139

140

141

142

143

144145

146

147

148149

150

−0.2

−0.1

0.0

0.1

0.2

−0.1 0.0 0.1

PC1

PC

2

Species

a

a

a

setosa

versicolor

virginica

D. Malouche | LSA, UoM, 29/3/17106 /

115

Page 183: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, by showing groups! Convexes

> autoplot(prcomp(df), data = iris, colour = 'Species',+ shape = FALSE, label.size = 3, frame=T)

1

23

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

2122

2324

25

26

27

2829

3031

32

33

34

35

36

37

38

39

4041

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

6768

6970

7172

73

74

75

76

77 78

79

80

8182

83

8485

86

87

88

89

90 91

92

93

94

95

9697

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116117

118

119

120

121

122

123

124

125

126

127128

129

130

131

132

133134

135

136

137

138

139

140

141

142

143

144145

146

147

148149

150

−0.2

−0.1

0.0

0.1

0.2

−0.10 −0.05 0.00 0.05 0.10 0.15

PC1

PC

2

Species

a

a

a

setosa

versicolor

virginica

D. Malouche | LSA, UoM, 29/3/17107 /

115

Page 184: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

PCA, by showing groups! Convexes

> autoplot(prcomp(df), data = iris, colour = 'Species',+ shape = FALSE, label.size = 3, frame=T)

1

23

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

2122

2324

25

26

27

2829

3031

32

33

34

35

36

37

38

39

4041

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

6768

6970

7172

73

74

75

76

77 78

79

80

8182

83

8485

86

87

88

89

90 91

92

93

94

95

9697

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116117

118

119

120

121

122

123

124

125

126

127128

129

130

131

132

133134

135

136

137

138

139

140

141

142

143

144145

146

147

148149

150

−0.2

−0.1

0.0

0.1

0.2

−0.10 −0.05 0.00 0.05 0.10 0.15

PC1

PC

2

Species

a

a

a

setosa

versicolor

virginica

D. Malouche | LSA, UoM, 29/3/17107 /

115

Page 185: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Biplot for a PCA

> autoplot(prcomp(df), data = iris, colour = 'Species',+ loadings = TRUE, loadings.colour = 'blue',+ loadings.label = TRUE, loadings.label.size = 3)

Sepal.LengthSepal.Width

Petal.Length

Petal.Width

−0.2

−0.1

0.0

0.1

0.2

−0.10 −0.05 0.00 0.05 0.10 0.15

PC1

PC

2

Species

setosa

versicolor

virginica

D. Malouche | LSA, UoM, 29/3/17108 /

115

Page 186: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Biplot for a PCA

> autoplot(prcomp(df), data = iris, colour = 'Species',+ loadings = TRUE, loadings.colour = 'blue',+ loadings.label = TRUE, loadings.label.size = 3)

Sepal.LengthSepal.Width

Petal.Length

Petal.Width

−0.2

−0.1

0.0

0.1

0.2

−0.10 −0.05 0.00 0.05 0.10 0.15

PC1

PC

2

Species

setosa

versicolor

virginica

D. Malouche | LSA, UoM, 29/3/17108 /

115

Page 187: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Regression diagnostic

> m <- lm(Petal.Width ˜ Petal.Length, data = iris)> autoplot(m, which = 1:6, colour = 'dodgerblue3',+ smooth.colour = 'black', smooth.linetype = 'dashed',+ ad.colour = 'blue',+ label.size = 3, label.n = 5, label.colour = 'blue',+ ncol = 3)

115

135

142146145

−0.6

−0.3

0.0

0.3

0.6

0.0 0.5 1.0 1.5 2.0 2.5

Fitted values

Res

idua

ls

Residuals vs Fitted

115

135

142146145

−3

−2

−1

0

1

2

3

−2 −1 0 1 2

Theoretical Quantiles

Sta

ndar

dize

d re

sidu

als

Normal Q−Q

115135142

146145

0.0

0.5

1.0

1.5

0.0 0.5 1.0 1.5 2.0 2.5

Fitted values

Sta

ndar

dize

d re

sidu

als

Scale−Location

123135108115

145

0.00

0.02

0.04

0 50 100 150

Obs. Number

Coo

k's

dist

ance

Cook's distance

123

135

108

115

145

−3

−2

−1

0

1

2

3

0.00 0.01 0.02

Leverage

Sta

ndar

dize

d R

esid

uals

Residuals vs Leverage

123135 108115

145

0.00

0.02

0.04

0.00 0.01 0.02

Leverage

Coo

k's

dist

ance

Cook's dist vs Leverage

D. Malouche | LSA, UoM, 29/3/17109 /

115

Page 188: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Regression diagnostic

> m <- lm(Petal.Width ˜ Petal.Length, data = iris)> autoplot(m, which = 1:6, colour = 'dodgerblue3',+ smooth.colour = 'black', smooth.linetype = 'dashed',+ ad.colour = 'blue',+ label.size = 3, label.n = 5, label.colour = 'blue',+ ncol = 3)

115

135

142146145

−0.6

−0.3

0.0

0.3

0.6

0.0 0.5 1.0 1.5 2.0 2.5

Fitted values

Res

idua

lsResiduals vs Fitted

115

135

142146145

−3

−2

−1

0

1

2

3

−2 −1 0 1 2

Theoretical Quantiles

Sta

ndar

dize

d re

sidu

als

Normal Q−Q

115135142

146145

0.0

0.5

1.0

1.5

0.0 0.5 1.0 1.5 2.0 2.5

Fitted values

Sta

ndar

dize

d re

sidu

als

Scale−Location

123135108115

145

0.00

0.02

0.04

0 50 100 150

Obs. Number

Coo

k's

dist

ance

Cook's distance

123

135

108

115

145

−3

−2

−1

0

1

2

3

0.00 0.01 0.02

Leverage

Sta

ndar

dize

d R

esid

uals

Residuals vs Leverage

123135 108115

145

0.00

0.02

0.04

0.00 0.01 0.02

LeverageC

ook'

s di

stan

ce

Cook's dist vs Leverage

D. Malouche | LSA, UoM, 29/3/17109 /

115

Page 189: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Local Fisher Discriminant Analysis

> library(lfda)> model <- lfda(x = iris[-5], y = iris[, 5], r = 3, metric="plain")> autoplot(model, data = iris, frame = TRUE, frame.colour = 'Species')

−1.5

−1.0

−0.5

0.0

0.5

1.0

−2 0 2

PC1

PC

2

Species

setosa

versicolor

virginica

D. Malouche | LSA, UoM, 29/3/17110 /

115

Page 190: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Local Fisher Discriminant Analysis

> library(lfda)> model <- lfda(x = iris[-5], y = iris[, 5], r = 3, metric="plain")> autoplot(model, data = iris, frame = TRUE, frame.colour = 'Species')

−1.5

−1.0

−0.5

0.0

0.5

1.0

−2 0 2

PC1

PC

2

Species

setosa

versicolor

virginica

D. Malouche | LSA, UoM, 29/3/17110 /

115

Page 191: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

GGally package, showing the whole data!

> library(GGally)> data(tips, package = "reshape")> pm <- ggpairs(tips,bins=10)> pm

Corr:

0.676

Corr:

0.598

Corr:

0.489

total_bill tip sex smoker day time sizetotal_bill

tipsex

smoker

daytim

esize

10 20 30 40 50 2.5 5.0 7.5 10.00 2550751000 2550751000 25 50 75 0 25 50 75 02040020400204002040 0 2550751000 255075100 2 4 6

0.00

0.02

0.04

2.5

5.0

7.5

10.0

05

1015

05

1015

05

1015

05

1015

0.02.55.07.510.0

0.02.55.07.510.0

0.02.55.07.510.0

0.02.55.07.510.0

05

1015

05

1015

2

4

6

D. Malouche | LSA, UoM, 29/3/17111 /

115

Page 192: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

GGally package, showing the whole data!

> library(GGally)> data(tips, package = "reshape")> pm <- ggpairs(tips,bins=10)> pm

Corr:

0.676

Corr:

0.598

Corr:

0.489

total_bill tip sex smoker day time sizetotal_bill

tipsex

smoker

daytim

esize

10 20 30 40 50 2.5 5.0 7.5 10.00 2550751000 2550751000 25 50 75 0 25 50 75 02040020400204002040 0 2550751000 255075100 2 4 6

0.00

0.02

0.04

2.5

5.0

7.5

10.0

05

1015

05

1015

05

1015

05

1015

0.02.55.07.510.0

0.02.55.07.510.0

0.02.55.07.510.0

0.02.55.07.510.0

05

1015

05

1015

2

4

6

D. Malouche | LSA, UoM, 29/3/17111 /

115

Page 193: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

GGally package, selecting some variables

> library(ggplot2)> pm <- ggpairs(tips, bins=5, mapping = aes(color = sex), columns = c("total_bill", "time", "tip"))> pm

Cor : 0.676

Female: 0.683

Male: 0.67

total_bill time tiptotal_bill

time

tip

10 20 30 40 50 0 10 20 30 0 10 20 30 2.5 5.0 7.5 10.0

0.00

0.02

0.04

0.06

0

5

10

15

0

5

10

15

2.5

5.0

7.5

10.0

D. Malouche | LSA, UoM, 29/3/17112 /

115

Page 194: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

GGally package, selecting some variables

> library(ggplot2)> pm <- ggpairs(tips, bins=5, mapping = aes(color = sex), columns = c("total_bill", "time", "tip"))> pm

Cor : 0.676

Female: 0.683

Male: 0.67

total_bill time tiptotal_bill

time

tip

10 20 30 40 50 0 10 20 30 0 10 20 30 2.5 5.0 7.5 10.0

0.00

0.02

0.04

0.06

0

5

10

15

0

5

10

15

2.5

5.0

7.5

10.0

D. Malouche | LSA, UoM, 29/3/17112 /

115

Page 195: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

Resources

D. Malouche | LSA, UoM, 29/3/17113 /

115

Page 196: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

The R Graph Gallery

http://www.r-graph-gallery.com/all-graphs/

D. Malouche | LSA, UoM, 29/3/17114 /

115

Page 197: Data Visualization with R - storage.googleapis.com · Data Visualization with R Outline 1 R packages ggplot2 sjPlot tabplot 2 Visualizing multivariate: Categorical Data Quantitative

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Data Visualization with R

R for data sciences

http://r4ds.had.co.nz/

D. Malouche | LSA, UoM, 29/3/17115 /

115