Count Data. HT Cleopatra VII & Marcus Antony C c Aa.
![Page 1: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/1.jpg)
Count Data
![Page 2: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/2.jpg)
Outcomes: H, T

$$X_H \sim \mathrm{Bin}(n, p_H)$$

$$(X_H, X_T) \sim \mathrm{MNom}(n;\ p_H, p_T)$$

$$(X_1, X_2, \ldots, X_6) \sim \mathrm{MNom}(n;\ p_1, p_2, \ldots, p_6)$$
![Page 3: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/3.jpg)
Cleopatra VII & Marcus Antony
$$(X_{CA}, X_{Ca}, X_{cA}, X_{ca}) \sim \mathrm{MNom}(n;\ p_{CA}, p_{Ca}, p_{cA}, p_{ca})$$

2×2 layout: rows C, c; columns A, a.
![Page 4: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/4.jpg)
[Figure: betting layout with the labels EVEN, ODD, 1st 12, 2nd 12, 3rd 12]
![Page 5: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/5.jpg)
Gregor Mendel, 1822-1884
![Page 6: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/6.jpg)
![Page 7: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/7.jpg)
RY Ry rY ry Total
Obs. 950 250 350 50 1600
Expect ($H_0$) 900 300 300 100 1600

$$H_0: (p_{RY} : p_{Ry} : p_{rY} : p_{ry}) = (9 : 3 : 3 : 1)$$

$$H_1: (p_{RY} : p_{Ry} : p_{rY} : p_{ry}) \neq (9 : 3 : 3 : 1)$$

Which statement is right, $H_0$ or $H_1$?
![Page 8: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/8.jpg)
RY Ry rY ry Total
Obs. 950 250 350 50 1600
Expect 900 300 300 100 1600
O-E 50 -50 50 -50 0
(O-E)^2 2500 2500 2500 2500 10000
X
![Page 9: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/9.jpg)
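$$(X_1, X_2, X_3, X_4) \sim \mathrm{MNom}(n;\ p_1, p_2, p_3, p_4)$$

$$X_i \sim \mathrm{Poisson}(\lambda_i), \qquad \lambda_i = n p_i, \qquad i = 1, 2, 3, 4$$

$$\mathrm{Poisson}(\lambda_i) \approx N(\lambda_i, \lambda_i) \quad \text{for large } \lambda_i$$

$$\frac{X_i - \lambda_i}{\sqrt{\lambda_i}} \sim N(0, 1), \qquad \frac{(X_i - \lambda_i)^2}{\lambda_i} \sim \chi^2(1)$$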
![Page 10: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/10.jpg)
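$$(X_1, X_2, X_3, X_4) \sim \mathrm{MNom}(n;\ p_1, p_2, p_3, p_4)$$

$$\frac{(X_i - \lambda_i)^2}{\lambda_i} \sim \chi^2(1)$$

$$X_1 + X_2 + X_3 + X_4 = n$$

$$\sum_{i=1}^{4} \frac{(X_i - \lambda_i)^2}{\lambda_i} \sim \chi^2(4 - 1)$$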
![Page 11: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/11.jpg)
1 2 3 4 Total
Obs. 950 250 350 50 1600
Expect 900 300 300 100 1600
O-E 50 -50 50 -50 0
(O-E)^2/E 25/9 25/3 25/3 25 25*16/9

$$X^2 = \sum_{i=1}^{4} \frac{(X_i - \lambda_i)^2}{\lambda_i} = \sum_{i=1}^{4} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2(3)$$
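A minimal sketch in R of the computation in this table, done term by term. The variable names (`obs`, `expected`, `terms`, `x2`, `crit`) are illustrative and do not come from the slides.

```r
# Observed counts from the table: RY, Ry, rY, ry
obs <- c(RY = 950, Ry = 250, rY = 350, ry = 50)

# Expected counts under H0: 9:3:3:1
expected <- sum(obs) * c(9, 3, 3, 1) / 16

# Pearson statistic: sum of (O - E)^2 / E
terms <- (obs - expected)^2 / expected
x2 <- sum(terms)

# Upper 5% point of the chi-square distribution with 4 - 1 = 3 df
crit <- qchisq(0.95, df = 3)

print(terms)
print(x2)
print(x2 > crit)
```

The four terms are 25/9, 25/3, 25/3, and 25, so their sum is 400/9, which is the value that `chisq.test(x, p = p)` reports for the same data.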
![Page 12: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/12.jpg)
Upper-tail critical values $\chi^2_{\alpha,n}$ ($n$: degrees of freedom)

α \ n 1 3 5 8 15 24 ∞
0.975 0.001 0.216 0.831 2.180 6.262 12.401 ∞
0.95 0.004 0.352 1.145 2.733 7.261 13.848 ∞
0.05 3.841 7.815 11.071 15.507 24.996 36.415 ∞
0.025 5.024 9.348 12.833 17.535 27.488 39.364 ∞

[Figure: two plots of the chi-square distribution with the critical value $\chi^2_{\alpha}(n)$ marked]
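The tabulated critical values can be reproduced with `qchisq()`. This is a sketch that assumes the table lists upper-tail areas, which matches the entry 3.841 for α = 0.05 and n = 1. The object name `tab` is illustrative.

```r
# Upper-tail probabilities (alpha) and degrees of freedom (n) from the table
alpha <- c(0.975, 0.95, 0.05, 0.025)
n <- c(1, 3, 5, 8, 15, 24)

# chi^2_{alpha, n} has area alpha to its right, so it is the (1 - alpha) quantile
tab <- outer(alpha, n, function(a, df) qchisq(1 - a, df))
dimnames(tab) <- list(alpha = alpha, n = n)

print(round(tab, 3))
```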
![Page 13: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/13.jpg)
RY Ry rY ry Total
Obs. 950 250 350 50 1600
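Expect ($H_0$) 900 300 300 100 1600

$$H_0: (p_{RY}, p_{Ry}, p_{rY}, p_{ry}) \propto (9 : 3 : 3 : 1)$$

$$H_1: (p_{RY}, p_{Ry}, p_{rY}, p_{ry}) \not\propto (9 : 3 : 3 : 1)$$

$$\sum_{i=1}^{4} \frac{(O_i - E_i)^2}{E_i} = 25 \cdot 16 / 9 = 44.44 > 7.815 = \chi^2_{0.05,3}$$

The statistic exceeds the critical value, so $H_0$ is rejected at the 5% level.

> x <- c(950,250,350,50)
> p <- c(9,3,3,1)/16
> chisq.test(x, p=p)

        Chi-squared test for given probabilities

data:  x
X-squared = 44.4444, df = 3, p-value = 1.214e-09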
![Page 14: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/14.jpg)
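$$p = (p_{RY}, p_{Ry}, p_{rY}, p_{ry})^{\top}, \qquad p^{*} = \tfrac{1}{16}\,(9, 3, 3, 1)^{\top}$$

$$H_0: p = p^{*} \quad \text{vs.} \quad H_1: p \neq p^{*}$$

Under $H_0 \cup H_1$, $p \in M_1$; under $H_0$, $p \in M_0 = \{p^{*}\}$.

$$df = \dim(H_0 \cup H_1) - \dim(H_0) = 3 - 0 = 3$$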
![Page 15: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/15.jpg)
Y y Total
R 950 250 1200
r 350 50 400
Total 1300 300 1600
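$$H_0: p_{(s,c)} = p_s\, p_c \quad \text{vs.} \quad H_1: p_{(s,c)} \neq p_s\, p_c$$

s \ c Y y Total
R $p_{(R,Y)}$ $p_{(R,y)}$ $p_R = 12/16$
r $p_{(r,Y)}$ $p_{(r,y)}$ $p_r = 4/16$
Total $p_Y = 13/16$ $p_y = 3/16$ 1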
![Page 16: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/16.jpg)
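Chi-square test for independence

$$H_0: p_{(s,c)} = p_s\, p_c \quad \text{vs.} \quad H_1: p_{(s,c)} \neq p_s\, p_c$$

General cell probabilities:

s \ c Y y Total
R $p_{(R,Y)}$ $p_{(R,y)}$ $p_R = 12/16$
r $p_{(r,Y)}$ $p_{(r,y)}$ $p_r = 4/16$
Total $p_Y = 13/16$ $p_y = 3/16$ 1

Cell probabilities under $H_0$:

s \ c Y y Total
R (12/16)(13/16) (12/16)(3/16) 12/16
r (4/16)(13/16) (4/16)(3/16) 4/16
Total 13/16 3/16 1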
![Page 17: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/17.jpg)
RY Ry rY ry Total
Obs. 950 250 350 50 1600
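Expect ($H_0$) 975 225 325 75 1600

Cell probabilities under $H_0$:

s \ c Y y Total
R (12/16)(13/16) (12/16)(3/16) 12/16
r (4/16)(13/16) (4/16)(3/16) 4/16
Total 13/16 3/16 1

Expected counts under $H_0$:

s \ c Y y Total
R 1600(12/16)(13/16) = 975 1600(12/16)(3/16) = 225 1200
r 1600(4/16)(13/16) = 325 1600(4/16)(3/16) = 75 400
Total 1300 300 1600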
![Page 18: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/18.jpg)
RY Ry rY ry Total
Obs. 950 250 350 50 1600
Expect ($H_0$) 975 225 325 75 1600
(O-E)^2/E 0.64 2.78 1.92 8.33 13.67

$$H_0: p_{(s,c)} = p_s\, p_c \quad \text{vs.} \quad H_1: p_{(s,c)} \neq p_s\, p_c$$

$$\sum_{i=1}^{4} \frac{(O_i - E_i)^2}{E_i} = 13.67 > 3.84 = \chi^2_{0.05,1}$$

> mx <- matrix(c(950,250,350,50),2,)
> mx
     [,1] [,2]
[1,]  950  350
[2,]  250   50
> chisq.test(mx, correct=F)

        Pearson's Chi-squared test

data:  mx
X-squared = 13.6752, df = 1, p-value = 0.0002173
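The expected counts and the statistic can also be computed directly from the margins. The following is a sketch with illustrative names (`tab`, `expected`, `x2`, `p_value`). It stores the table with shape in rows and color in columns, the transpose of `mx`, which does not change the statistic.

```r
# Observed 2x2 table: rows = shape (R, r), columns = color (Y, y)
tab <- matrix(c(950, 250,
                350,  50),
              nrow = 2, byrow = TRUE,
              dimnames = list(shape = c("R", "r"), color = c("Y", "y")))

n <- sum(tab)

# Expected counts under independence: n * (row proportion) * (column proportion)
expected <- outer(rowSums(tab), colSums(tab)) / n

# Pearson statistic and its degrees of freedom (2 - 1) * (2 - 1) = 1
x2 <- sum((tab - expected)^2 / expected)
df <- (nrow(tab) - 1) * (ncol(tab) - 1)
p_value <- pchisq(x2, df = df, lower.tail = FALSE)

print(expected)
print(c(x2 = x2, df = df, p_value = p_value))
```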
![Page 19: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/19.jpg)
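s \ c Y y Total
R $p_{(R,Y)}$ $p_{(R,y)}$ $p_R$
r $p_{(r,Y)}$ $p_{(r,y)}$ $p_r$
Total $p_Y$ $p_y$ 1

$$p_R + p_r = 1, \qquad p_Y + p_y = 1$$

Under $H_0 \cup H_1$: $p = (p_{RY}, p_{Ry}, p_{rY}, p_{ry})^{\top}$, so $\dim(H_0 \cup H_1) = 3$.

Under $H_0$: $p^{*} = (p_R p_Y,\ p_R p_y,\ p_r p_Y,\ p_r p_y)^{\top}$, so $\dim(H_0) = 2$.

$$df = \dim(H_0 \cup H_1) - \dim(H_0) = 3 - 2 = 1 = (2 - 1)(2 - 1)$$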
![Page 20: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/20.jpg)
s \ c y1 … ym Tot
r1
…
rk
Tot 1

$$\dim(H_0 \cup H_1) = mk - 1$$

$$\dim(H_0) = (m - 1) + (k - 1)$$

$$df = \dim(H_0 \cup H_1) - \dim(H_0) = (m - 1)(k - 1)$$
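The dimension count can be checked numerically. The helper below is a hypothetical sketch (the name `df_independence` is not from the slides) that subtracts the two dimensions exactly as in the formulas for a k × m table.

```r
# Degrees of freedom for the independence test in a k x m table,
# written as dim(H0 u H1) - dim(H0)
df_independence <- function(k, m) {
  dim_full <- k * m - 1          # free cell probabilities (they sum to 1)
  dim_null <- (k - 1) + (m - 1)  # free row and column marginal probabilities
  dim_full - dim_null
}

print(df_independence(2, 2))  # 2 x 2 table
print(df_independence(4, 3))  # 4 x 3 table
```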
![Page 21: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/21.jpg)
Outcome 1 2 3 4 5 6 Total
Obs. 8 12 7 14 9 10 60
Expect ($H_0$) 10 10 10 10 10 10 60
(O-E)^2/E 0.4 0.4 0.9 1.6 0.1 0 3.4

$$H_0: (p_1 : p_2 : p_3 : p_4 : p_5 : p_6) = (1 : 1 : 1 : 1 : 1 : 1)$$

$$H_1: (p_1 : p_2 : p_3 : p_4 : p_5 : p_6) \neq (1 : 1 : 1 : 1 : 1 : 1)$$

$$\chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i} = 3.4 < 11.07 = \chi^2_{0.05,5}$$

> x <- c(8,12,7,14,9,10)
> p <- rep(1,6)/6
> chisq.test(x, p=p)

        Chi-squared test for given probabilities

data:  x
X-squared = 3.4, df = 5, p-value = 0.6386
![Page 22: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/22.jpg)
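Outcome H T Total
Obs. 60 40 100
Expect ($H_0$) 50 50 100
(O-E)^2/E 2 2 4

$$H_0: (p_H : p_T) = (1 : 1) \quad \text{vs.} \quad H_1: (p_H : p_T) \neq (1 : 1)$$

Equivalently:

$$H_0: p_H = 1/2 \quad \text{vs.} \quad H_1: p_H \neq 1/2$$

$$\chi^2 = \sum_{i=1}^{2} \frac{(O_i - E_i)^2}{E_i} = 4 > 3.84 = \chi^2_{0.05,1}$$

> chisq.test(c(60,40), p=c(1,1)/2)

        Chi-squared test for given probabilities

data:  c(60, 40)
X-squared = 4, df = 1, p-value = 0.0455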
![Page 23: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/23.jpg)
Chi-square test for homogeneity of distributions

[Figure: two coins compared, with counts 560 : 440 and 640 : 360]

Caesar Ptolemy
Head 560 640
Tail 440 360

> head2 <- c(560, 640)
> toss2 <- c(1000, 1000)
> prop.test(head2, toss2)

        2-sample test for equality of proportions ….

data:  head2 out of toss2
X-squared = 13.0021, df = 1, p-value = 0.0003111
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.12379728 -0.03620272
sample estimates:
prop 1 prop 2
  0.56   0.64

> mx <- matrix(c(560,440,640,360),2,)
> mx
     [,1] [,2]
[1,]  560  640
[2,]  440  360
> chisq.test(mx, cor=F)

        Pearson's Chi-squared test

data:  mx
X-squared = 13.3333, df = 1, p-value = 0.0002607

> chisq.test(mx)

        Pearson's Chi-squared test with Yates' continuity correction

data:  mx
X-squared = 13.0021, df = 1, p-value = 0.0003111
![Page 24: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/24.jpg)
Coin 1 Coin 2 Coin 3 Coin 4
Head 83 90 129 70
Tail 3 3 7 12
Total 86 93 136 82

The same table can be read with Hospital 1 to Hospital 4 as columns and Alive / Dead in place of Head / Tail.

> # H0 : all four coins have the same proportion of heads
> # H1 : at least one coin has a different proportion from the others
> head4 <- c(83, 90, 129, 70)
> toss4 <- c(86, 93, 136, 82)
> prop.test(head4, toss4)

        4-sample test for equality of proportions without continuity correction

data:  head4 out of toss4
X-squared = 12.6004, df = 3, p-value = 0.005585
alternative hypothesis: two.sided
sample estimates:
   prop 1    prop 2    prop 3    prop 4
0.9651163 0.9677419 0.9485294 0.8536585

> mx <- matrix(c(83,3,90,3,129,7,70,12),2,)
> chisq.test(mx)

        Pearson's Chi-squared test

data:  mx
X-squared = 12.6004, df = 3, p-value = 0.005585
![Page 25: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/25.jpg)
Australia rare plants data

Common (C) & Rare (R) in (South Australia, Victoria) and (Tasmania)

The number of plants in Dry (D), Wet (W), and Wet or Dry (WD) regions:

D W WD
CC 37 190 94
CR 23 59 23
RC 10 141 28
RR 15 58 16

Question (null hypothesis):
Is the distribution of plants over (D, W, WD) the same for all of CC, CR, RC, and RR?
![Page 26: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/26.jpg)
Australia rare plants data
> rareplants <- matrix(c(37,23,10,15,190,59,141,58,94,23,28,16),4,)
> dimnames(rareplants) <- list(c("CC","CR","RC","RR"), c("D","W","WD"))
> rareplants
> (sout <- chisq.test(rareplants))

        Pearson's Chi-squared test

data:  rareplants
X-squared = 34.9863, df = 6, p-value = 4.336e-06

> round(sout$expected, 1)
      D     W   WD
CC 39.3 207.2 74.5
CR 12.9  67.8 24.4
RC 21.9 115.6 41.5
RR 10.9  57.5 20.6
> round(sout$residuals, 3)
        D      W     WD
CC -0.369 -1.196  2.263
CR  2.828 -1.067 -0.275
RC -2.547  2.368 -2.099
RR  1.242  0.072 -1.023
![Page 27: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/27.jpg)
The lady tasting tea
http://www.youtube.com/watch?v=lgs7d5saFFc
http://en.wikipedia.org/wiki/Fisher's_exact_test
![Page 28: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/28.jpg)
Fisher's exact test for 2×2 tables with small n (n < 25)

> chisq.test(matrix(c(7,2,1,5),2,))

        Pearson's Chi-squared test with Yates' continuity correction

X-squared = 3.2254, df = 1, p-value = 0.0725
Warning message: Chi-squared approximation may be incorrect

> fisher.test(matrix(c(7,2,1,5),2,))

        Fisher's Exact Test for Count Data

p-value = 0.04056
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
   0.8646648 934.0087368
sample estimates:
odds ratio
  13.59412

> fisher.test(matrix(c(7,2,1,5),2,), alter="greater")

        Fisher's Exact Test for Count Data

p-value = 0.03497
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 1.179718      Inf
sample estimates:
odds ratio
  13.59412
Guess\Making Milk 1st Tea 1st Sum
Milk 1st 7 1 8
Tea 1st 2 5 7
sum 9 6 15
![Page 29: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/29.jpg)
There are 7 possible tables for the given marginal counts.

G\M M 1st T 1st Sum
M 1st 8 0 8
T 1st 1 6 7
Sum 9 6 15

G\M M 1st T 1st Sum
M 1st 7 1 8
T 1st 2 5 7
Sum 9 6 15

G\M M 1st T 1st Sum
M 1st 6 2 8
T 1st 3 4 7
Sum 9 6 15

G\M M 1st T 1st Sum
M 1st 5 3 8
T 1st 4 3 7
Sum 9 6 15

G\M M 1st T 1st Sum
M 1st 4 4 8
T 1st 5 2 7
Sum 9 6 15

G\M M 1st T 1st Sum
M 1st 3 5 8
T 1st 6 1 7
Sum 9 6 15

G\M M 1st T 1st Sum
M 1st 2 6 8
T 1st 7 0 7
Sum 9 6 15

What is the probability of observing each table in the experiment?
![Page 30: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/30.jpg)
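Counts, with $n = a + b + c + d$:

G\M M 1st T 1st Sum
M 1st a b a+b
T 1st c d c+d
Sum a+c b+d n

Column proportions:

G\M M 1st T 1st Sum
M 1st r q v
T 1st 1-r 1-q 1-v
Sum 1 1 1

Odds ratio:

$$\psi = \frac{r(1-q)}{q(1-r)} = \frac{r/(1-r)}{q/(1-q)}$$

$\psi = 1$ means no discernible ability.

$$H_0: \psi = 1 \quad \text{vs.} \quad H_1: \psi \neq 1 \ \text{(two-sided) or}\ \psi > 1 \ \text{(one-sided)}$$

Probability of a table given the margins, when $\psi = 1$:

$$p = \frac{\binom{a+c}{a}\binom{b+d}{b}}{\binom{n}{a+b}} = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{n}{a+c}} = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$

Estimated odds ratio: $\hat{\psi} = \dfrac{ad}{bc}$, with some correction.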
![Page 31: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/31.jpg)
Probability of each of the 7 tables when $\psi = 1$. Cells are listed as a, b (guessed Milk 1st) and c, d (guessed Tea 1st); all tables have row sums 8 and 7 and column sums 9 and 6.

a b c d Probability
8 0 1 6 0.00140
7 1 2 5 0.03357 (observed)
6 2 3 4 0.19580
5 3 4 3 0.39161
4 4 5 2 0.29371
3 5 6 1 0.07832
2 6 7 0 0.00559

0.00140 + 0.03357 + 0.00559 = 0.04056 (the p-value of Fisher's exact test; two-sided test)
0.00140 + 0.03357 = 0.03497 (one-sided test)
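The table probabilities and both p-values can be reproduced with the hypergeometric density `dhyper()`. This is a sketch with illustrative variable names. The two-sided rule used here (sum the tables that are no more probable than the observed one) is the rule that matches the sums shown on the slide.

```r
# Margins of the tea-tasting table: 9 cups made milk first, 6 made tea first,
# 8 cups guessed as milk first
made_milk <- 9
made_tea <- 6
guess_milk <- 8

# Possible values of a = cups made milk first AND guessed milk first
a <- max(0, guess_milk - made_tea):min(made_milk, guess_milk)

# Hypergeometric probability of each table when the odds ratio is 1
prob <- dhyper(a, made_milk, made_tea, guess_milk)
names(prob) <- a

a_obs <- 7
p_obs <- prob[as.character(a_obs)]

# One-sided: tables at least as favourable to the lady as the observed one
p_one_sided <- sum(prob[a >= a_obs])
# Two-sided: tables no more probable than the observed one
# (small relative tolerance guards against floating-point ties)
p_two_sided <- sum(prob[prob <= p_obs * (1 + 1e-7)])

print(round(prob, 5))
print(c(one_sided = p_one_sided, two_sided = p_two_sided))
```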
![Page 32: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/32.jpg)
100% correct answers:

G\M M 1st T 1st Sum
M 1st 9 0 9
T 1st 0 6 6
Sum 9 6 15

Some are misclassified:

G\M M 1st T 1st Sum
M 1st 4 4 8
T 1st 5 2 7
Sum 9 6 15

Fisher's exact test considers only tables with the same fixed margins.
The probabilities of tables with different margins are ignored completely.
This is sometimes referred to as data-respecting (?) inference.
![Page 33: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/33.jpg)
Use Fisher's exact test only for small n (less than 25).

Guess\Making Milk 1st Tea 1st Sum
Milk 1st 14 2 16
Tea 1st 4 10 14
Sum 18 12 30

        Pearson's Chi-squared test

X-squared = 10.8036, df = 1, p-value = 0.001013

> chisq.test(matrix(c(14,4,2,10),2,))

        Pearson's Chi-squared test with Yates' continuity correction

X-squared = 8.4877, df = 1, p-value = 0.003576

> fisher.test(matrix(c(14,4,2,10),2,))

        Fisher's Exact Test for Count Data

p-value = 0.002185
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
   2.123319 202.143800
sample estimates:
odds ratio
  15.40804

No big difference when n is large!
![Page 34: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/34.jpg)
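Yates' continuity correction

G\M M 1st T 1st Sum
M 1st a b a+b
T 1st c d c+d
Sum a+c b+d n

with $n = a + b + c + d$.

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i} = \frac{n\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)} = 10.8036$$

$$\chi^2_{\text{corrected}} = \sum_{i} \frac{(|O_i - E_i| - 1/2)^2}{E_i} = \frac{n\,(|ad - bc| - n/2)^2}{(a+b)(c+d)(a+c)(b+d)} = 8.4877$$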
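Both closed forms are easy to evaluate directly. The function below is a hypothetical helper (its name `chisq_2x2` and the argument names `n11`, `n12`, `n21`, `n22` for the cells a, b, c, d are assumptions, not part of the slides), applied to the n = 30 tea-tasting table.

```r
# Pearson chi-square for a 2x2 table with cells n11, n12 / n21, n22,
# without and with Yates' continuity correction
chisq_2x2 <- function(n11, n12, n21, n22) {
  n <- n11 + n12 + n21 + n22
  denom <- (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
  cross <- n11 * n22 - n12 * n21
  plain <- n * cross^2 / denom
  # The corrected deviation is not allowed to drop below zero,
  # which is also how chisq.test() caps the correction
  yates <- n * pmax(abs(cross) - n / 2, 0)^2 / denom
  c(plain = plain, yates = yates)
}

print(round(chisq_2x2(14, 2, 4, 10), 4))
```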
![Page 35: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/35.jpg)
Odds ratio:

$$\psi = \frac{r/(1-r)}{q/(1-q)}$$

$$\log(\psi) = \log\frac{r}{1-r} - \log\frac{q}{1-q}$$

[Figure: plot of logit(y) against y on (0, 1), and plot of logistic(x) against x on (-6, 6)]
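The two plotted functions are inverses of each other. A minimal sketch in R, with `logit` and `logistic` defined here for illustration; base R offers the same pair as `qlogis()` and `plogis()`.

```r
# Logit and its inverse (the logistic function)
logit <- function(p) log(p / (1 - p))
logistic <- function(x) 1 / (1 + exp(-x))

p <- c(0.1, 0.5, 0.9)
print(logit(p))
print(logistic(logit(p)))   # recovers p

# Base R provides the same pair as qlogis() and plogis()
print(all.equal(logit(p), qlogis(p)))
print(all.equal(logistic(c(-2, 0, 2)), plogis(c(-2, 0, 2))))
```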
![Page 36: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/36.jpg)
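Linear Model (LM), a special case of the Generalized Linear Model (GLM)

Regression: $Y_i \sim N(\mu_i, \sigma^2)$, independent, with $\mu_i = \beta_0 + \beta_1 X_i$

ANOVA: $Y_{ij} \sim N(\mu_{ij}, \sigma^2)$, independent, with $\mu_{ij} = \alpha_i$

Combined: $Y_{ij} \sim N(\mu_{ij}, \sigma^2)$, independent, with $\mu_{ij} = \alpha_i + \beta X_{ij}$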
![Page 37: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/37.jpg)
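Linear Model (LM): Regression, ANOVA

$Y_{ij} \sim N(\mu_{ij}, \sigma^2)$, independent, with $\mu_{ij} = \alpha_i + \beta X_{ij}$

Generalized Linear Model (GLM)

Poisson Regression: $Y_{ij} \sim \mathrm{Poisson}(\lambda_{ij})$, independent, with

$$\log(\lambda_{ij}) = \alpha_i + \beta X_{ij}$$

Binomial Regression (Logistic Regression): $Y_{ij} \sim \mathrm{Bin}(n, p_{ij})$, independent, with

$$\log\frac{p_{ij}}{1 - p_{ij}} = \alpha_i + \beta X_{ij}$$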
![Page 38: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/38.jpg)
Guess\Making Milk 1st Tea 1st Sum
Milk 1st 7 1 8
Tea 1st 2 5 7
sum 9 6 15
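$$Y_i \sim \mathrm{Bin}(n_i, p_i), \qquad i = 1, 2 \qquad (i: \text{Making};\ j: \text{Guess})$$

$$Y_1 \sim \mathrm{Bin}(9, p_1), \qquad Y_2 \sim \mathrm{Bin}(6, p_2)$$

$$V_1 = 9 - Y_1, \qquad V_2 = 6 - Y_2$$

$Y_1 = 7$, $Y_2 = 1$ are observed!

Logistic regression:

$$\log\frac{p_i}{1 - p_i} = \alpha + \beta X_i$$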
![Page 39: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/39.jpg)
Logistic regression with the lady tasting tea data

> tm <- data.frame(gm=c(7,1), gt=c(2,5), making=c("M","T"))
> summary( glm(cbind(gm,gt)~making, family=binomial, data=tm) )

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.2528     0.8018   1.562    0.118
makingT      -2.8622     1.3575  -2.108    0.035 *

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 5.7863e+00  on 1  degrees of freedom
Residual deviance: 8.8818e-16  on 0  degrees of freedom
AIC: 8.1909

Number of Fisher Scoring iterations: 4
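The fitted coefficients can be read directly off the 2×2 table. The sketch below refits the model and compares the `makingT` coefficient with the log of the sample odds ratio ad/(bc); `fit` and `odds_ratio` are illustrative names.

```r
# Lady-tasting-tea counts by how the cup was made (M = milk first, T = tea first):
# gm = cups guessed "milk first", gt = cups guessed "tea first"
tm <- data.frame(gm = c(7, 1), gt = c(2, 5), making = factor(c("M", "T")))

fit <- glm(cbind(gm, gt) ~ making, family = binomial, data = tm)

# Sample odds ratio ad / (bc) from the 2x2 table
odds_ratio <- (7 * 5) / (1 * 2)

# The coefficient for making = T is the log odds of guessing "milk first"
# for tea-first cups minus that for milk-first cups, i.e. -log(ad / bc)
print(coef(fit))
print(-log(odds_ratio))
```

The unconditional sample odds ratio used here (17.5) is not the same quantity as the conditional estimate 13.59412 reported by `fisher.test()` for these data.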
![Page 40: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/40.jpg)
A B C D E F
10 11 0 3 3 11
7 17 1 5 5 9
20 21 7 12 3 15
14 11 2 6 5 22
14 16 3 4 3 15
12 14 1 3 6 16
10 17 2 5 1 13
23 17 1 5 1 10
17 19 3 5 3 26
20 21 0 5 2 26
14 7 1 2 6 24
13 13 4 4 4 13
InsectSprays data

[Figure: insect count (vertical axis, 0 to 25) by type of spray (A to F)]
![Page 41: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/41.jpg)
> sx <- rep(LETTERS[1:6], each=12)
> dx <- c(10,7,20,14,14,12,10,23,17,20,14,13,11,17,21,11,16,14,17,17,19,21,7,13,
+         0,1,7,2,3,1,2,1,3,0,1,4,3,5,12,6,4,3,5,5,5,5,2,4,3,5,3,5,3,6,1,1,3,2,6,
+         4,11,9,15,22,15,16,13,10,26,26,24,13)
> ax <- 30 - dx   # each count is treated as "dead" out of 30
> insect <- data.frame(dead=dx, alive=ax, spray=sx)
> gout <- glm(cbind(dead,alive)~spray, family=binomial, data=insect)
> summary( gout )

Call: glm(formula = cbind(dead, alive) ~ spray, family = binomial, data = insect)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.06669    0.10547  -0.632   0.5272
sprayB       0.11114    0.14913   0.745   0.4561
sprayC      -2.52856    0.23259 -10.871   <2e-16 ***
sprayD      -1.56288    0.17719  -8.821   <2e-16 ***
sprayE      -1.95769    0.19513 -10.033   <2e-16 ***
sprayF       0.28983    0.14958   1.938   0.0527 .

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 614.07  on 71  degrees of freedom
Residual deviance: 171.24  on 66  degrees of freedom
AIC: 416.16

Number of Fisher Scoring iterations: 4

$$\log\frac{p_i}{1 - p_i} = \eta_i$$
![Page 42: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/42.jpg)
> gres <- rbind(unique(fitted(gout)), unique(predict(gout)))
> dimnames(gres)[[2]] <- LETTERS[1:6]
> gres
               A          B           C          D          E         F
[1,]  0.48333333 0.51111111  0.06944445  0.1638889  0.1166667 0.5555556
[2,] -0.06669137 0.04445176 -2.59525468 -1.6295728 -2.0243818 0.2231436

Row 1 is the fitted probability $p_i$; row 2 is the linear predictor $\eta_i$, where

$$\log\frac{p_i}{1 - p_i} = \eta_i$$

> anova(gout)
Analysis of Deviance Table

Model: binomial, link: logit

Response: cbind(dead, alive)

Terms added sequentially (first to last)

      Df Deviance Resid. Df Resid. Dev
NULL                     71     614.07
spray  5   442.83        66     171.24
![Page 43: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/43.jpg)
Correlation and causality
APT prices in Seoul
Do apartment (APT) prices rise as the number of Starbucks (STBK) stores increases?
The more Starbucks stores, the higher the APT price!
![Page 44: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/44.jpg)
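District STBK APT price
Gangnam-gu 45 1030
Gangdong-gu 2 530
Jung-gu 24 520
Jungnang-gu 0 330

STBK: the number of Starbucks stores
APT price: average apartment price per 1 m²

$$Y_i \sim \mathrm{Poisson}(\lambda_i), \qquad \log(\lambda_i) = \alpha + \beta X_i$$

$Y_i$: STBK, $X_i$: APT price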
![Page 45: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/45.jpg)
y <- c(45,2,1,4,4,6,4,2,1,0,2,3,10,8,21,3,5,5,3,12,7,1,20,24,0)
x <- c(3373,1907,1115,1413,1286,1861,1218,1018,1250,1135,1240,1528,
       1675,1220,2854,1644,1247,2427,2034,1723,2594,1138,1634,1729,1101)
xm <- x/(3.3)   # price per pyeong (about 3.3 m^2) converted to price per m^2
( res <- glm(y~xm, family=poisson) )
anova(res)
summary(res)
plot(xm, y, ylab="Starbucks", xlab="APT price/m2")
points(xm, fitted(res), col="red", pch=16)   # exp(predict(res)) = fitted(res)

[Figure: Starbucks count (0 to 40) against APT price/m2 (300 to 1000), with the fitted values $\hat{\lambda}_i$ overlaid]
![Page 46: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/46.jpg)
> summary(res)

Call:
glm(formula = y ~ xm, family = poisson)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.6923  -1.7239  -0.6041   0.5783   5.3036

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.0072064  0.2128074  -0.034    0.973
xm           0.0035630  0.0003009  11.841   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 235.19  on 24  degrees of freedom
Residual deviance: 111.52  on 23  degrees of freedom
AIC: 195.4

Number of Fisher Scoring iterations: 5
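With a log link, the slope is easiest to read on the multiplicative scale. The sketch below refits the model from the data on the slides and converts the slope into a rate ratio. The increment `step` of 100 price units is an arbitrary illustrative choice, not something taken from the slides.

```r
# Data from the slides: Starbucks counts (y) and APT price (x, as given there)
y <- c(45, 2, 1, 4, 4, 6, 4, 2, 1, 0, 2, 3, 10, 8, 21, 3, 5, 5, 3, 12, 7, 1, 20, 24, 0)
x <- c(3373, 1907, 1115, 1413, 1286, 1861, 1218, 1018, 1250, 1135, 1240, 1528,
       1675, 1220, 2854, 1644, 1247, 2427, 2034, 1723, 2594, 1138, 1634, 1729, 1101)
xm <- x / 3.3   # same rescaling as on the slide

res <- glm(y ~ xm, family = poisson)
beta <- coef(res)[["xm"]]

# With a log link, adding `step` to xm multiplies the expected count by exp(beta * step)
step <- 100   # illustrative increment
rate_ratio <- exp(beta * step)
print(rate_ratio)

# Fitted means are the inverse link applied to the linear predictor
print(all.equal(fitted(res), exp(predict(res))))
```

A rate ratio above 1 describes association only; as the slide title says, it does not show that Starbucks stores cause higher prices.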
![Page 47: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/47.jpg)
> anova(res)
Analysis of Deviance Table

Model: poisson, link: log

Response: y

Terms added sequentially (first to last)

     Df Deviance Resid. Df Resid. Dev
NULL                    24     235.19
xm    1   123.67        23     111.52
![Page 48: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/48.jpg)
A 0.75 0.05 0.05 0.05 0.05 0.05
B 0.1 0.5 0.1 0.1 0.1 0.1
C 0.05 0.05 0.05 0.05 0.05 0.75
distribution & likelihood
![Page 49: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/49.jpg)
$$X \sim \mathrm{Bin}(n, p), \qquad f(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$$

$X = x_0$ is observed.

What is $p$?

$$p \in (0, 1)$$
![Page 50: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/50.jpg)
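distribution & likelihood

[Figure: binomial distributions of x (0 to 6) for p = 0.1 and p = 0.7, shown over the (p, x) plane]

[Figure: binomial probabilities for x = 0, 1, ..., 6]

[Figure: likelihood as a function of p on (0, 1); the likelihood equals 0.15 at p = 0.133 and p = 0.587]

$$f(2) = \binom{6}{2} p^{2} (1 - p)^{4}$$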
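The likelihood curve for x = 2 out of n = 6 can be evaluated and maximised numerically. This is a sketch with illustrative names (`lik`, `opt`); the analytic maximiser of a binomial likelihood is x/n.

```r
# Likelihood of p after observing x = 2 successes in n = 6 trials
lik <- function(p) choose(6, 2) * p^2 * (1 - p)^4

# Maximise numerically over (0, 1); the analytic maximiser is x / n = 2 / 6
opt <- optimize(lik, interval = c(0, 1), maximum = TRUE)
print(opt$maximum)
print(opt$objective)

# The likelihood is close to 0.15 at the two marked values of p
print(lik(c(0.133, 0.587)))
```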
![Page 51: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/51.jpg)
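$Y_j \sim N(\mu_j, \sigma^2)$, independent, $j = 1, 2, \ldots, n$, with $\mu_j = ?$

$$f(y) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(y - \mu)^2 / (2\sigma^2)}$$

$$\text{likelihood} = \prod_{j=1}^{n} f(y_j \mid \mu_j) = (2\pi\sigma^2)^{-n/2}\, e^{-\sum_{j=1}^{n} (y_j - \mu_j)^2 / (2\sigma^2)}$$

$$-2\log(\text{likelihood}) = -2\sum_{j=1}^{n} \log f(y_j \mid \mu_j) = \sum_{j=1}^{n} (y_j - \mu_j)^2 / \sigma^2 + n\log(2\pi\sigma^2)$$

$$SSE = \sum_{j} (Y_j - \mu_j)^2$$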
![Page 52: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/52.jpg)
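$$\text{Deviance} = -2\log(\text{likelihood})$$

$Y_j \sim \mathrm{Poisson}(\lambda_j)$, independent, $j = 1, 2, \ldots, n$

$$f(y) = \frac{e^{-\lambda} \lambda^{y}}{y!}, \qquad y = 0, 1, 2, \ldots$$

$$\text{likelihood} = \prod_{j=1}^{n} f(y_j \mid \lambda_j) = \prod_{j=1}^{n} e^{-\lambda_j} \lambda_j^{y_j} / (y_j!)$$

$$-2\log(\text{likelihood}) = -2\sum_{j=1}^{n} \log f(y_j \mid \lambda_j) = 2\sum_{j=1}^{n} \left( \lambda_j - y_j \log\lambda_j + \log(y_j!) \right)$$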
![Page 53: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/53.jpg)
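$Y_j \sim \mathrm{Poisson}(\lambda_j)$, independent

Link function (for Poisson family), with linear modeling for the link function:

$$\log(\lambda_j) = \alpha + \beta X_j$$

$$-2\log(\text{likelihood}) = -2\sum_{j=1}^{n} \log f(y_j \mid \lambda_j) = 2\sum_{j=1}^{n} \left( \lambda_j - y_j \log\lambda_j + \log(y_j!) \right)$$

$$= 2\sum_{j=1}^{n} \left( e^{\alpha + \beta x_j} - y_j (\alpha + \beta x_j) + \log(y_j!) \right)$$

$$AIC = -2\log(\text{likelihood}) + 2k, \qquad k: \text{the number of parameters}$$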
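The AIC formula can be checked against R's own value for the Starbucks model. The sketch below evaluates -2 log(likelihood) from the Poisson formula, using the fitted means from `glm()`. The names `lambda`, `m2ll`, and `aic_by_hand` are illustrative.

```r
# Data from the slides: Starbucks counts (y) and rescaled APT price (xm)
y <- c(45, 2, 1, 4, 4, 6, 4, 2, 1, 0, 2, 3, 10, 8, 21, 3, 5, 5, 3, 12, 7, 1, 20, 24, 0)
x <- c(3373, 1907, 1115, 1413, 1286, 1861, 1218, 1018, 1250, 1135, 1240, 1528,
       1675, 1220, 2854, 1644, 1247, 2427, 2034, 1723, 2594, 1138, 1634, 1729, 1101)
xm <- x / 3.3

res <- glm(y ~ xm, family = poisson)
lambda <- fitted(res)

# -2 log(likelihood) for independent Poisson counts; lgamma(y + 1) = log(y!)
m2ll <- 2 * sum(lambda - y * log(lambda) + lgamma(y + 1))

k <- length(coef(res))   # number of parameters (intercept and slope)
aic_by_hand <- m2ll + 2 * k

print(c(by_hand = aic_by_hand, from_R = AIC(res)))
```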
![Page 54: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/54.jpg)
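$$\text{Deviance} = -2\log(\text{likelihood})$$

$Y_j \sim \mathrm{Bin}(n, p_j)$, independent, $j = 1, 2, \ldots, n$

$$f(y) = f(y \mid p) = \binom{n}{y} p^{y} (1 - p)^{n - y}, \qquad y = 0, 1, \ldots, n$$

$$\text{likelihood} = \prod_{j=1}^{n} f(y_j \mid p_j) = \prod_{j=1}^{n} \binom{n}{y_j} p_j^{y_j} (1 - p_j)^{n - y_j}$$

$$-2\log(\text{likelihood}) = -2\sum_{j=1}^{n} \left[ n\log(1 - p_j) + y_j \log\frac{p_j}{1 - p_j} + \log\binom{n}{y_j} \right]$$

Link function (for binomial family), with linear modeling for the link function:

$$\log\frac{p_j}{1 - p_j} = \beta_0 + \beta_1 X_j$$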
![Page 55: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/55.jpg)
Independence test in GLM for Australia rare plants data

D W WD
CC 37 190 94
CR 23 59 23
RC 10 141 28
RR 15 58 16

> rareplants <- matrix(c(37,23,10,15,190,59,141,58,94,23,28,16),4,)
> dimnames(rareplants) <- list(c("CC","CR","RC","RR"), c("D","W","WD"))
> (sout <- chisq.test(rareplants))

        Pearson's Chi-squared test

data:  rareplants
X-squared = 34.9863, df = 6, p-value = 4.336e-06

> wdx <- rep(c("D","W","WD"), each=4)
> crx <- rep(c("CC","CR","RC","RR"), 3)
> rplants <- data.frame(wd=wdx, cr=crx, r=c(rareplants))
> anova( glm(r~wd*cr, family=poisson, data=rplants) )
Analysis of Deviance Table

Model: poisson, link: log, Response: r

Terms added sequentially (first to last)

      Df Deviance Resid. Df Resid. Dev
NULL                     11     522.11
wd     2   305.28         9     216.83
cr     3   181.88         6      34.95
wd:cr  6    34.95         0  -9.77e-15

$$-2\log(\text{likelihood}) \sim \chi^2(df)$$

> 1-pchisq(34.95,6)
[1] 4.406699e-06
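The deviance attributed to the `wd:cr` interaction is the likelihood-ratio statistic for independence, which can be computed from observed and expected counts. The sketch below uses illustrative names (`expected`, `g2`, `fit_add`) and compares the hand computation with the residual deviance of the additive Poisson model.

```r
rareplants <- matrix(c(37, 23, 10, 15, 190, 59, 141, 58, 94, 23, 28, 16), nrow = 4,
                     dimnames = list(c("CC", "CR", "RC", "RR"), c("D", "W", "WD")))

# Expected counts under independence
expected <- outer(rowSums(rareplants), colSums(rareplants)) / sum(rareplants)

# Likelihood-ratio statistic G^2 = 2 * sum(O * log(O / E)) (all O are positive here)
g2 <- 2 * sum(rareplants * log(rareplants / expected))

# The same quantity as the residual deviance of the additive Poisson model
rplants <- as.data.frame(as.table(rareplants))   # columns Var1, Var2, Freq
fit_add <- glm(Freq ~ Var1 + Var2, family = poisson, data = rplants)

df <- (nrow(rareplants) - 1) * (ncol(rareplants) - 1)
print(c(g2 = g2, deviance = deviance(fit_add), df = df))
print(pchisq(g2, df = df, lower.tail = FALSE))
```

The Pearson statistic from `chisq.test()` and this likelihood-ratio statistic are different summaries of the same departure from independence, so they are close but not identical.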
![Page 56: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/56.jpg)
Homogeneity test in GLM for coin tossing example

Coin 1 Coin 2 Coin 3 Coin 4
Head 83 90 129 70
Tail 3 3 7 12
Total 86 93 136 82

The same table can be read with Hosp'l 1 to Hosp'l 4 as columns and Alive / Dead in place of Head / Tail.

> # H0 : all four coins have the same proportion of heads
> # H1 : at least one coin has a different proportion from the others
> head4 <- c(83, 90, 129, 70)
> toss4 <- c(86, 93, 136, 82)
> prop.test(head4, toss4)

        4-sample test for equality of proportions without continuity correction

X-squared = 12.6004, df = 3, p-value = 0.005585
alternative hypothesis: two.sided

> coins <- factor(LETTERS[1:4])
> anova(glm(cbind(head4, toss4-head4)~coins, family=binomial))
Analysis of Deviance Table

Terms added sequentially (first to last)

      Df Deviance Resid. Df Resid. Dev
NULL                      3     10.667
coins  3   10.667         0  1.132e-14

$$-2\log(\text{likelihood}) \sim \chi^2(df)$$

> 1-pchisq(10.667,3)
[1] 0.01366980
![Page 57: Count Data. HT Cleopatra VII & Marcus Antony C c Aa.](https://reader036.fdocuments.us/reader036/viewer/2022062309/5697bf9e1a28abf838c94878/html5/thumbnails/57.jpg)
Thank you !!