[Event Series] Data exploration with modern R
Posted: 21-Apr-2017
Category: Data & Analytics
![Page 1: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/1.jpg)
Exploring data with modern R
Winston Chang, RStudio
2016–12–21
![Page 2: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/2.jpg)
https://hea-www.harvard.edu/~fine/Observatory/women.html
![Page 3: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/3.jpg)
Modern R
![Page 4: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/4.jpg)
A brief history of R
• In the beginning, there was S, developed at Bell Labs in the 1970s.
• S was owned and licensed by AT&T.
• In the 1990s, two professors from New Zealand created a free, open source reimplementation of S, called R.
• Many of the unusual features of R exist because they came from S.
• R itself is somewhat different from S and has a very flexible syntax.
![Page 5: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/5.jpg)
The tidyverse
install.packages("tidyverse")
# Automatically installs ggplot2, dplyr, tidyr,
# and others.
library(tidyverse)
tidyverse_update()
# Checks whether the tidyverse packages are up to date
# and helps update them to the latest version.
![Page 6: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/6.jpg)
Getting started
![Page 7: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/7.jpg)
Looking at data with R
faithful
head(faithful)
str(faithful)
View(faithful) # In RStudio
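A few more base R calls give a quick numeric overview of a data frame before plotting it. This is a minimal sketch using the built-in faithful data set; the variable names n_rows, n_cols and col_names are illustrative and do not come from the slides.

```r
# faithful ships with base R (datasets package), so no library() call is needed.
n_rows <- nrow(faithful)      # number of observations
n_cols <- ncol(faithful)      # number of variables
col_names <- names(faithful)  # column names

print(c(rows = n_rows, cols = n_cols))
print(col_names)

# Minimum, quartiles, median, mean and maximum for each column
print(summary(faithful))
```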
![Page 8: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/8.jpg)
[Figure: scatter plot of the faithful data set, with eruptions on the x-axis (2 to 5) and waiting on the y-axis (50 to 90)]
library(ggplot2)
ggplot(data=faithful, mapping=aes(x=eruptions, y=waiting)) + geom_point()
# More concisely:
ggplot(faithful, aes(eruptions, waiting)) + geom_point()
![Page 9: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/9.jpg)
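[Figures: two histograms of eruptions (x-axis, 2 to 5) with count on the y-axis, one for each call on this slide; the count axis runs from 0 to 25 in the first and from 0 to 40 in the second]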
ggplot(faithful, aes(x=eruptions)) + geom_histogram()
ggplot(faithful, aes(x=eruptions)) + geom_histogram(binwidth=.25)
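To see what binwidth actually changes, the bins that geom_histogram() computes can be pulled out of the plot with ggplot2's layer_data(). This is a sketch for inspection only; the names p, bins and total are illustrative.

```r
library(ggplot2)

# Build the plot object without drawing it.
p <- ggplot(faithful, aes(x = eruptions)) +
  geom_histogram(binwidth = 0.25)

# layer_data() returns the computed data for a layer: one row per bin here.
bins <- layer_data(p, 1)
print(bins[, c("xmin", "xmax", "count")])

# Every observation falls into exactly one bin.
total <- sum(bins$count)
print(total)
```

A smaller binwidth gives more rows in bins, each with a lower count; the total stays the same.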
![Page 10: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/10.jpg)
Your turn
Inspect the diamonds data set.
With diamonds, make a histogram of the carat variable. Experiment with different bin sizes. What patterns do you see?
Inspect the mpg data set.
With mpg, make a scatter plot showing the relationship between displ and hwy.
![Page 11: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/11.jpg)
[Figure: histogram of carat (x-axis, 0 to 5) with count on the y-axis (0 to 15000)]
ggplot(diamonds, aes(x=carat)) + geom_histogram()
![Page 12: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/12.jpg)
ggplot(diamonds, aes(x=carat)) + geom_histogram(binwidth=0.3)
[Figure: histogram of carat (x-axis, 0 to 5) with count on the y-axis (0 to 15000)]
![Page 13: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/13.jpg)
[Figure: histogram of carat (x-axis, 0 to 5) with count on the y-axis (0 to 10000)]
ggplot(diamonds, aes(x=carat)) + geom_histogram(binwidth=0.25)
![Page 14: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/14.jpg)
ggplot(diamonds, aes(x=carat)) + geom_histogram(binwidth=0.01)
[Figure: histogram of carat (x-axis, 0 to 5) with count on the y-axis (0 to 2000)]
![Page 15: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/15.jpg)
ggplot(mpg, aes(x=displ, y=hwy)) + geom_point()
[Figure: scatter plot of hwy (y-axis, 20 to 40) against displ (x-axis, 2 to 7)]
![Page 16: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/16.jpg)
ggplot(mpg, aes(x=displ, y=hwy)) + geom_point() + geom_smooth(method=lm)
[Figure: scatter plot of hwy (y-axis, 10 to 40) against displ (x-axis, 2 to 7) with a fitted linear model line]
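With method=lm, geom_smooth() fits a linear model of y on x. The same model can be fitted directly with base R's lm() to read off the numbers behind the line. A minimal sketch; fit and slope are illustrative names.

```r
library(ggplot2)  # provides the mpg data set

# Fit hwy as a linear function of displ, the same formula (y ~ x)
# that geom_smooth(method = lm) uses by default.
fit <- lm(hwy ~ displ, data = mpg)
print(coef(fit))

# The slope is the change in highway mpg per extra litre of displacement.
slope <- coef(fit)[["displ"]]
```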
![Page 17: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/17.jpg)
head(mpg)
str(mpg)
View(mpg)
![Page 18: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/18.jpg)
ggplot(mpg, aes(x=displ, y=hwy, color=drv)) + geom_point()
[Figure: scatter plot of hwy (y-axis, 20 to 40) against displ (x-axis, 2 to 7), with points colored by drv (legend: 4, f, r)]
![Page 19: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/19.jpg)
[Figure: scatter plot of hwy (y-axis, 20 to 40) against displ (x-axis, 2 to 7), with points colored by class (legend: 2seater, compact, midsize, minivan, pickup, subcompact, suv)]
ggplot(mpg, aes(x=displ, y=hwy, color=class)) + geom_point()
![Page 20: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/20.jpg)
Your turn
What happens if you use shape instead of color?
Run ?geom_smooth to see the documentation. Then remove the confidence region from the model line. What happens if you add a model line and map a variable to color?
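One possible answer to the geom_smooth questions, offered as a sketch rather than the presenter's own solution. The se argument is documented in ?geom_smooth; the names p_no_band, p_by_drv and n_lines are illustrative.

```r
library(ggplot2)

# se = FALSE removes the confidence region around the model line.
p_no_band <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE)

# Mapping drv to color in ggplot() applies to both layers,
# so a separate model line is fitted for each drv group.
p_by_drv <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE)

# Layer 2 is the smooth; count how many separate lines it contains.
n_lines <- length(unique(layer_data(p_by_drv, 2)$group))
print(n_lines)
```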
![Page 21: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/21.jpg)
Faceting
![Page 22: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/22.jpg)
ggplot(mpg, aes(x=displ, y=hwy)) + geom_point() + facet_wrap(~class)
[Figure: scatter plots of hwy (y-axis, 20 to 40) against displ (x-axis, 2 to 7), one panel per class: 2seater, compact, midsize, minivan, pickup, subcompact, suv]
![Page 23: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/23.jpg)
[Figure: scatter plots of hwy (y-axis, 20 to 40) against displ (x-axis, 2 to 7) in a single row of panels, one column per cyl value: 4, 5, 6, 8]
ggplot(mpg, aes(x=displ, y=hwy)) + geom_point() + facet_grid(. ~ cyl)
[Figure: scatter plots of hwy (y-axis, 20 to 40) against displ (x-axis, 2 to 7) in a single column of panels, one row per drv value: 4, f, r]
ggplot(mpg, aes(x=displ, y=hwy)) + geom_point() + facet_grid(drv ~ .)
![Page 24: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/24.jpg)
[Figure: grid of scatter plots of hwy (y-axis, 20 to 40) against displ (x-axis, 2 to 7), with one column per cyl value (4, 5, 6, 8) and one row per drv value (4, f, r)]
ggplot(mpg, aes(x=displ, y=hwy)) + geom_point() + facet_grid(drv ~ cyl)
Columns: cyl
Rows: drv
![Page 25: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/25.jpg)
Your turn
Try faceting with a histogram
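One way to approach this exercise, as a sketch: facet a histogram of hwy by drv. The choice of variables and of binwidth = 2 is an assumption made for illustration, not something the slides specify.

```r
library(ggplot2)

# One histogram of highway mpg per drive type.
p <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2) +
  facet_wrap(~drv)

# Each facet is a separate panel in the computed layer data.
n_panels <- length(unique(layer_data(p, 1)$PANEL))
print(n_panels)
```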
![Page 26: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/26.jpg)
ggplot2 concepts
![Page 27: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/27.jpg)
Geoms
Points
Lines
Bars
Error bars
Box plot
![Page 28: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/28.jpg)
Aesthetics
Y position
X position
Color
Size
![Page 29: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/29.jpg)
Aesthetics
![Page 30: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/30.jpg)
Mapping data values to aesthetics
var1 var2 var3
2 2 5
3 4 0
5 8 4
7 5 1
[Figures: scatter plot of var2 (y-axis, 2 to 8) against var1 (x-axis, 2 to 7), shown once with plain points and once with points colored by var3 (legend from 0 to 5)]
ggplot(dat, aes(x=var1, y=var2)) + geom_point()
ggplot(dat, aes(x=var1, y=var2, color=var3)) + geom_point()
![Page 31: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/31.jpg)
Setting data values to aesthetics
var1 var2 var3
2 2 5
3 4 0
5 8 4
7 5 1
ggplot(dat, aes(x=var1, y=var2)) + geom_point(color="red")
ggplot(dat, aes(x=var1, y=var2)) + geom_point(color="red", size=6)
[Figures: scatter plot of var2 (y-axis, 2 to 8) against var1 (x-axis, 2 to 7), shown once for each call above]
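The difference between mapping and setting can be checked by looking at the colors ggplot2 assigns to each point. A minimal sketch: dat is never defined in the slides, so it is rebuilt here from the table shown, and p_map, p_set, mapped_colours and set_colours are illustrative names.

```r
library(ggplot2)

# Assumption: dat is the four-row table shown on the slides.
dat <- data.frame(
  var1 = c(2, 3, 5, 7),
  var2 = c(2, 4, 8, 5),
  var3 = c(5, 0, 4, 1)
)

# Mapping: color inside aes() varies with the data.
p_map <- ggplot(dat, aes(x = var1, y = var2, color = var3)) + geom_point()

# Setting: color outside aes() is a constant for the whole layer.
p_set <- ggplot(dat, aes(x = var1, y = var2)) + geom_point(color = "red")

mapped_colours <- layer_data(p_map, 1)$colour
set_colours <- layer_data(p_set, 1)$colour
print(mapped_colours)
print(set_colours)
```

Writing aes(color="red") would instead map a one-level discrete variable whose only value is the string "red", which produces a legend and a default palette color rather than red points.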
![Page 32: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/32.jpg)
Different geoms
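ggplot(dat, aes(x=var1, y=var2)) + geom_point()
[Figure: var2 (y-axis, 2 to 8) against var1 (x-axis, 2 to 7) drawn as points]
ggplot(dat, aes(x=var1, y=var2)) + geom_line()
[Figure: var2 (y-axis, 2 to 8) against var1 (x-axis, 2 to 7) drawn as a line]
ggplot(dat, aes(x=var1, y=var2)) + geom_bar(stat="identity")
[Figure: var2 (y-axis, 0 to 8) against var1 drawn as bars]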
![Page 33: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/33.jpg)
Using multiple geoms
[Figure: var2 (y-axis, 2 to 8) against var1 (x-axis, 2 to 7) drawn as points connected by a line]
ggplot(dat, aes(x=var1, y=var2)) + geom_point() + geom_line()
# Equivalent to
ggplot(dat) + geom_point(aes(x=var1, y=var2)) + geom_line(aes(x=var1, y=var2))
ggplot() + geom_point(aes(x=var1, y=var2), data=dat) + geom_line(aes(x=var1, y=var2), data=dat)
Default data: the data frame passed to ggplot()
Default mapping: the aes() passed to ggplot()
Override defaults in each geom
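Because each geom can override the defaults, layers in one plot can differ in their mappings and settings. A sketch under the assumption that dat is the four-row table from the earlier slides; the specific colors and size are arbitrary choices for illustration.

```r
library(ggplot2)

# Assumption: dat is the four-row table shown on the earlier slides.
dat <- data.frame(
  var1 = c(2, 3, 5, 7),
  var2 = c(2, 4, 8, 5),
  var3 = c(5, 0, 4, 1)
)

# x and y come from the defaults in ggplot().
# The line layer sets a fixed color; the point layer adds its own color mapping.
p <- ggplot(dat, aes(x = var1, y = var2)) +
  geom_line(color = "grey50") +
  geom_point(aes(color = var3), size = 4)

line_colours <- layer_data(p, 1)$colour
point_colours <- layer_data(p, 2)$colour
print(line_colours)
print(point_colours)
```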
![Page 34: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/34.jpg)
Aesthetic  Discrete                  Continuous
Color      Rainbow of colors         Gradient from light blue to dark blue
Size       Discrete size steps       Point area scales with value (scale_radius() maps radius instead)
Shape      Different shape for each  Shouldn’t work
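The discrete and continuous columns of this table can be compared on one variable. In mpg, cyl is stored as a number, so wrapping it in factor() switches it from the continuous to the discrete color scale. A minimal sketch; the plot and variable names are illustrative.

```r
library(ggplot2)

# cyl is numeric, so by default it gets a continuous color gradient.
p_cont <- ggplot(mpg, aes(x = displ, y = hwy, color = cyl)) + geom_point()

# factor(cyl) is discrete, so each level gets its own distinct hue.
p_disc <- ggplot(mpg, aes(x = displ, y = hwy, color = factor(cyl))) + geom_point()

cont_colours <- sort(unique(layer_data(p_cont, 1)$colour))
disc_colours <- sort(unique(layer_data(p_disc, 1)$colour))
print(cont_colours)
print(disc_colours)
```

Both plots use four colors because cyl has four distinct values, but the colors themselves come from different scales.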
![Page 35: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/35.jpg)
Mapping discrete variables
var1 var2 var3
A G1 5
B G0 0
A G2 4
B G1 1
A G0 6
B G2 3
ggplot(dat2, aes(x=var1, y=var3)) + geom_point()
[Figure: var3 (y-axis, 0 to 6) against the discrete variable var1 (x-axis: A, B)]
ggplot(dat2, aes(x=var1, y=var3, color=var2)) + geom_point()
[Figure: the same plot with points colored by var2 (legend: G0, G1, G2)]
![Page 36: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/36.jpg)
Data wrangling with modern R
![Page 37: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/37.jpg)
Tidyverse = Tidy + universe
Source: https://www.flickr.com/photos/rubbermaid/7203340384
Source: http://hubblesite.org/newscenter/archive/releases/2014/27/image/a/
![Page 38: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/38.jpg)
Tibbles
faithful
as_tibble(faithful) # as.tbl(faithful) in older dplyr; as.tbl() is deprecated since dplyr 1.0.0
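A tibble is a data frame with an extra class that mainly changes how it prints. A minimal sketch using as_tibble() from the tibble package (also loaded by library(tidyverse)); tb is an illustrative name.

```r
library(tibble)

tb <- as_tibble(faithful)

# A tibble is still a data frame; the extra classes change printing and subsetting.
print(class(tb))

# Printing shows only the first few rows plus the column types,
# instead of all rows as a plain data frame would.
print(tb)
```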
![Page 39: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/39.jpg)
Tidy data
Each variable is in a column
Each observation is in a row
![Page 40: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/40.jpg)
Example of non-tidy data
subject sex cond1 cond2 cond3
1 M 7.9 12.3 10.7
2 F 6.3 10.6 11.1
3 F 9.5 13.1 13.8
4 M 11.5 13.4 12.9
Each row has 3 observations
Not Tidy
![Page 41: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/41.jpg)
Converting to tidy data
Not Tidy
subject sex cond1 cond2 cond3
1 M 7.9 12.3 10.7
2 F 6.3 10.6 11.1
3 F 9.5 13.1 13.8
4 M 11.5 13.4 12.9
Tidy
subject sex condition value
1 M cond1 7.9
1 M cond2 12.3
1 M cond3 10.7
2 F cond1 6.3
2 F cond2 10.6
2 F cond3 11.1
3 F cond1 9.5
3 F cond2 13.1
3 F cond3 13.8
4 M cond1 11.5
4 M cond2 13.4
4 M cond3 12.9
![Page 42: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/42.jpg)
• filter: Keep rows
• select: Keep columns
• mutate: Add new columns
• arrange: Sort rows
• summarise: Reduce variables
![Page 43: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/43.jpg)
Filter: get a subset of rows
# Traditional R
mpg[mpg$hwy > 30, ]
# dplyr
filter(mpg, hwy > 30)
![Page 44: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/44.jpg)
Filter: get a subset of rows
# AND
filter(mpg, hwy > 30, class == "compact")
filter(mpg, hwy > 30 & class == "compact")
# OR
filter(mpg, hwy > 30 | class == "compact")
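The two AND forms can be checked against each other, and against the OR form, by comparing the rows they return. A minimal sketch; and_comma, and_amp, either and same_rows are illustrative names.

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

and_comma <- filter(mpg, hwy > 30, class == "compact")
and_amp   <- filter(mpg, hwy > 30 & class == "compact")
either    <- filter(mpg, hwy > 30 | class == "compact")

# Separating conditions with a comma is the same as combining them with &.
same_rows <- identical(and_comma, and_amp)
print(same_rows)

# Any row that satisfies both conditions also satisfies at least one,
# so the OR result is never smaller than the AND result.
print(c(and = nrow(and_comma), or = nrow(either)))
```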
![Page 45: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/45.jpg)
%>%
![Page 46: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/46.jpg)
Piping with %>%
filter(mpg, hwy > 30)
mpg %>% filter(hwy > 30)
select(filter(mpg, hwy > 30), model, hwy, class)
mpg %>% filter(hwy > 30) %>% select(model, hwy, class)
mpg %>% filter(hwy > 30) %>% select(model, hwy, class) %>% View()
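The pipe passes the value on its left as the first argument of the call on its right, so the nested and piped forms give the same result. A short sketch to confirm this; nested, piped and same_result are illustrative names.

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

# Nested calls are read from the inside out.
nested <- select(filter(mpg, hwy > 30), model, hwy, class)

# Piped calls are read from left to right, in the order they run.
piped <- mpg %>%
  filter(hwy > 30) %>%
  select(model, hwy, class)

same_result <- identical(nested, piped)
print(same_result)
```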
![Page 47: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/47.jpg)
Select: get a subset of columns
# Traditional R
mpg[, c("model", "displ", "cyl", "drv", "class", "hwy")]
# dplyr
select(mpg, model, displ, cyl, drv, class, hwy)
select(mpg, -manufacturer, -fl)
![Page 48: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/48.jpg)
Mutate: add new columns
# Traditional R
mpg$avg <- (mpg$cty + mpg$hwy)/2
# dplyr
mpg %>% mutate(avg = (cty+hwy)/2)
mpg %>%
  mutate(
    avg = (cty+hwy)/2,
    ratio = hwy/cty
  )
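One difference from the traditional form is worth checking: mutate() returns a new data frame and leaves its input alone, so the result has to be assigned to be kept. A minimal sketch; has_avg_before and mpg2 are illustrative names.

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

# mutate() returns a modified copy; mpg itself is unchanged.
mpg %>% mutate(avg = (cty + hwy) / 2)
has_avg_before <- "avg" %in% names(mpg)
print(has_avg_before)

# Assign the result to keep the new columns.
mpg2 <- mpg %>%
  mutate(
    avg = (cty + hwy) / 2,
    ratio = hwy / cty
  )
print(names(mpg2))
```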
![Page 49: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/49.jpg)
Arrange: sort rows
# Traditional R
mpg[order(mpg$hwy), ]
# dplyr
arrange(mpg, hwy)
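arrange() sorts in ascending order by default. dplyr's desc() reverses the order for a column, and extra columns break ties. A short sketch; ascending, descending and by_class are illustrative names.

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

ascending <- arrange(mpg, hwy)          # lowest hwy first
descending <- arrange(mpg, desc(hwy))   # highest hwy first

# Sort by class, then by descending hwy within each class.
by_class <- arrange(mpg, class, desc(hwy))
print(head(by_class))
```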
![Page 50: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/50.jpg)
Summarise: reduce variables
# Traditional R
mean(mpg$hwy)
sd(mpg$hwy)
# dplyr
summarise(mpg, hwy_m = mean(hwy))
summarise(mpg,
  hwy_m = mean(hwy),
  hwy_sd = sd(hwy),
  cty_m = mean(cty),
  cty_sd = sd(cty)
)
summarise = summarize (dplyr accepts both spellings)
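In dplyr, summarise() and summarize() are documented as synonyms, which can be confirmed by comparing their output. A minimal sketch; the names british, american, same_result and with_n are illustrative. Other packages may export their own summarize(), so dplyr::summarize() is the unambiguous form when several are loaded.

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

british <- summarise(mpg, hwy_m = mean(hwy), hwy_sd = sd(hwy))
american <- summarize(mpg, hwy_m = mean(hwy), hwy_sd = sd(hwy))

same_result <- identical(british, american)
print(same_result)

# n() counts the rows that went into each summary.
with_n <- summarise(mpg, n = n(), hwy_m = mean(hwy))
print(with_n)
```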
![Page 51: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/51.jpg)
Group operations
![Page 52: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/52.jpg)
Why is this important?
![Page 53: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/53.jpg)
Summarise
subject sex condition value
1 M cond1 7.9
1 M cond2 12.3
1 M cond3 10.7
2 F cond1 6.3
2 F cond2 10.6
2 F cond3 11.1
3 F cond1 9.5
3 F cond2 13.1
3 F cond3 13.8
4 M cond1 11.5
4 M cond2 13.4
4 M cond3 12.9
value
11.1
data %>% summarise(value = mean(value))
![Page 54: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/54.jpg)
Group-wise summarise
subject sex condition value
1 M cond1 7.9
1 M cond2 12.3
1 M cond3 10.7
2 F cond1 6.3
2 F cond2 10.6
2 F cond3 11.1
3 F cond1 9.5
3 F cond2 13.1
3 F cond3 13.8
4 M cond1 11.5
4 M cond2 13.4
4 M cond3 12.9
subject value
1 10.3
2 9.3
3 12.1
4 12.6
data %>% group_by(subject) %>% summarise(value = mean(value))
![Page 55: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/55.jpg)
Group-wise summarise
subject sex condition value
1 M cond1 7.9
1 M cond2 12.3
1 M cond3 10.7
2 F cond1 6.3
2 F cond2 10.6
2 F cond3 11.1
3 F cond1 9.5
3 F cond2 13.1
3 F cond3 13.8
4 M cond1 11.5
4 M cond2 13.4
4 M cond3 12.9
sex condition value
F cond1 7.9
F cond2 11.9
F cond3 12.5
M cond1 9.7
M cond2 12.9
M cond3 11.8
data %>% group_by(sex, condition) %>% summarise(value = mean(value))
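The group means can be recomputed from the input table. This is a sketch under one assumption: data is never defined in the slides, so it is rebuilt here from the twelve rows shown. The names overall, by_subject and by_sex_condition are illustrative.

```r
library(dplyr)

# Assumption: data is the tidy table shown on the slide.
data <- data.frame(
  subject = rep(1:4, each = 3),
  sex = rep(c("M", "F", "F", "M"), each = 3),
  condition = rep(c("cond1", "cond2", "cond3"), times = 4),
  value = c(7.9, 12.3, 10.7, 6.3, 10.6, 11.1,
            9.5, 13.1, 13.8, 11.5, 13.4, 12.9)
)

# No grouping: one mean over all rows.
overall <- data %>% summarise(value = mean(value))

# One mean per subject.
by_subject <- data %>%
  group_by(subject) %>%
  summarise(value = mean(value))

# One mean per combination of sex and condition.
# .groups = "drop" returns an ungrouped result (needs dplyr 1.0.0 or later).
by_sex_condition <- data %>%
  group_by(sex, condition) %>%
  summarise(value = mean(value), .groups = "drop")

print(overall)
print(by_subject)
print(by_sex_condition)
```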
![Page 56: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/56.jpg)
Mutate
subject sex condition value
1 M cond1 7.9
1 M cond2 12.3
1 M cond3 10.7
2 F cond1 6.3
2 F cond2 10.6
2 F cond3 11.1
3 F cond1 9.5
3 F cond2 13.1
3 F cond3 13.8
4 M cond1 11.5
4 M cond2 13.4
4 M cond3 12.9
data %>% mutate(norm = value - mean(value))
subject sex condition value norm
1 M cond1 7.9 -3.2
1 M cond2 12.3 1.2
1 M cond3 10.7 -0.4
2 F cond1 6.3 -4.8
2 F cond2 10.6 -0.5
2 F cond3 11.1 0
3 F cond1 9.5 -1.6
3 F cond2 13.1 2
3 F cond3 13.8 2.7
4 M cond1 11.5 0.4
4 M cond2 13.4 2.3
4 M cond3 12.9 1.8
![Page 57: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/57.jpg)
Group-wise mutate
subject sex condition value
1 M cond1 7.9
1 M cond2 12.3
1 M cond3 10.7
2 F cond1 6.3
2 F cond2 10.6
2 F cond3 11.1
3 F cond1 9.5
3 F cond2 13.1
3 F cond3 13.8
4 M cond1 11.5
4 M cond2 13.4
4 M cond3 12.9
data %>% group_by(subject) %>% mutate(norm = value - mean(value))
subject sex condition value norm
1 M cond1 7.9 -2.4
1 M cond2 12.3 2
1 M cond3 10.7 0.4
2 F cond1 6.3 -3
2 F cond2 10.6 1.3
2 F cond3 11.1 1.8
3 F cond1 9.5 -2.6
3 F cond2 13.1 1
3 F cond3 13.8 1.7
4 M cond1 11.5 -1.1
4 M cond2 13.4 0.8
4 M cond3 12.9 0.3
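The effect of group_by() on mutate() can be seen by computing norm both ways on the same table. A sketch assuming data is the twelve-row tidy table from the slide, which the slides never define in code; overall_norm, subject_norm and check are illustrative names.

```r
library(dplyr)

# Assumption: data is the tidy table shown on the slide.
data <- data.frame(
  subject = rep(1:4, each = 3),
  sex = rep(c("M", "F", "F", "M"), each = 3),
  condition = rep(c("cond1", "cond2", "cond3"), times = 4),
  value = c(7.9, 12.3, 10.7, 6.3, 10.6, 11.1,
            9.5, 13.1, 13.8, 11.5, 13.4, 12.9)
)

# Without grouping, mean(value) is the mean of all 12 rows.
overall_norm <- data %>% mutate(norm = value - mean(value))

# With group_by(), mean(value) is computed separately within each subject.
subject_norm <- data %>%
  group_by(subject) %>%
  mutate(norm = value - mean(value)) %>%
  ungroup()

# Within each subject the deviations from the subject mean sum to about zero.
check <- subject_norm %>%
  group_by(subject) %>%
  summarise(total = sum(norm))

print(subject_norm)
print(check)
```

Unlike summarise(), mutate() keeps every input row; grouping only changes which rows each mean is taken over.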
![Page 58: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/58.jpg)
Tidying data with tidyr
![Page 60: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/60.jpg)
Converting to tidy data
data
subject sex cond1 cond2 cond3
1 M 7.9 12.3 10.7
2 F 6.3 10.6 11.1
3 F 9.5 13.1 13.8
4 M 11.5 13.4 12.9
gather(data, condition, value, cond1:cond3)
subject sex condition value
1 M cond1 7.9
1 M cond2 12.3
1 M cond3 10.7
2 F cond1 6.3
2 F cond2 10.6
2 F cond3 11.1
3 F cond1 9.5
3 F cond2 13.1
3 F cond3 13.8
4 M cond1 11.5
4 M cond2 13.4
4 M cond3 12.9
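The reshaping can be run end to end on the wide table. This is a sketch under the assumption that data is the four-row wide table shown, which the slides never define in code. gather() returns rows grouped by condition, so arrange() is added here only to match the row order of the tidy table on the slide. pivot_longer(), from tidyr 1.0.0 or later, is the newer interface for the same operation and is included for comparison.

```r
library(tidyr)
library(dplyr)

# Assumption: data is the wide table shown on the slide.
data <- data.frame(
  subject = 1:4,
  sex = c("M", "F", "F", "M"),
  cond1 = c(7.9, 6.3, 9.5, 11.5),
  cond2 = c(12.3, 10.6, 13.1, 13.4),
  cond3 = c(10.7, 11.1, 13.8, 12.9)
)

# gather() stacks cond1:cond3 into a key column (condition)
# and a value column (value).
tidy_gather <- gather(data, condition, value, cond1:cond3) %>%
  arrange(subject, condition)

# The same reshaping with the newer pivot_longer() interface.
tidy_pivot <- pivot_longer(data, cond1:cond3,
                           names_to = "condition", values_to = "value")

print(tidy_gather)
```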
![Page 61: [系列活動] Data exploration with modern R](https://reader030.fdocuments.us/reader030/viewer/2022020301/58f9a903760da3da068b69fc/html5/thumbnails/61.jpg)
Thank you!