
Page 1: Categorical Data Analysis - Seoul National Universityhosting03.snu.ac.kr/~hokim/sas/2013/catE.pdf · 2018-10-20 · Woodward (Epidemiology-study design and data analysis) Woodward

Categorical Data Analysis

updated 2018-10-20

Ho Kim

2 × 2 table

             1        2        Total
1            n11      n12      n1+
2            n21      n22      n2+
Total        n+1      n+2      N

Chi-square statistics

Mantel-Haenszel chi-square:

\[ Q = \frac{(n_{11} - m_{11})^2}{v_{11}} \]

Pearson chi-square:

\[ Q_P = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(n_{ij} - m_{ij})^2}{m_{ij}} \]

data respire;
  /* one record per cell of the 2 x 2 table; count is the cell frequency */
  input treat $ outcome $ count;
  cards;
test    f 40
test    u 20
placebo f 16
placebo u 48
;
proc freq;
  weight count;                    /* treat count as a frequency weight */
  tables treat*outcome / chisq;    /* chi-square tests and Fisher's exact test */
run;

The SAS System

The FREQ Procedure

Table of treat by outcome

treat      outcome

Frequency |
Percent   |
Row Pct   |
Col Pct   |f       |u       |  Total
----------+--------+--------+
placebo   |     16 |     48 |     64
          |  12.90 |  38.71 |  51.61
          |  25.00 |  75.00 |
          |  28.57 |  70.59 |
----------+--------+--------+
test      |     40 |     20 |     60
          |  32.26 |  16.13 |  48.39
          |  66.67 |  33.33 |
          |  71.43 |  29.41 |
----------+--------+--------+
Total           56       68      124
             45.16    54.84   100.00

Statistics for Table of treat by outcome

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1     21.7087    <.0001
Likelihood Ratio Chi-Square    1     22.3768    <.0001
Continuity Adj. Chi-Square     1     20.0589    <.0001
Mantel-Haenszel Chi-Square     1     21.5336    <.0001
Phi Coefficient                      -0.4184
Contingency Coefficient               0.3860
Cramer's V                           -0.4184

Fisher's Exact Test
------------------------------------
Cell (1,1) Frequency (F)          16
Left-sided Pr <= F         2.838E-06
Right-sided Pr >= F           1.0000

Table Probability (P)      2.397E-06
Two-sided Pr <= P          4.754E-06

Sample Size = 124
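As a cross-check on the listing above, the following Python sketch recomputes the Pearson and Mantel-Haenszel chi-square statistics for the respire table. It assumes the usual definitions, which the slides do not spell out: expected counts m_ij = n_i+ n_+j / N and the hypergeometric variance v_11 = n_1+ n_2+ n_+1 n_+2 / (N^2 (N - 1)). The function name is illustrative only.

```python
def chi_square_2x2(n11, n12, n21, n22):
    """Return (Pearson Q_P, Mantel-Haenszel Q) for a 2 x 2 table of counts."""
    r1, r2 = n11 + n12, n21 + n22          # row totals n1+, n2+
    c1, c2 = n11 + n21, n12 + n22          # column totals n+1, n+2
    N = r1 + r2
    observed = [[n11, n12], [n21, n22]]
    # Expected counts under independence: m_ij = n_i+ * n_+j / N (assumed definition)
    expected = [[r * c / N for c in (c1, c2)] for r in (r1, r2)]
    q_p = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    # Hypergeometric variance of n11 given the margins (assumed definition)
    v11 = r1 * r2 * c1 * c2 / (N ** 2 * (N - 1))
    q_mh = (n11 - expected[0][0]) ** 2 / v11
    return q_p, q_mh


# Rows in the order SAS printed them: placebo (16, 48), then test (40, 20).
q_p, q_mh = chi_square_2x2(16, 48, 40, 20)
print(f"Pearson = {q_p:.4f}, Mantel-Haenszel = {q_mh:.4f}")
```

The two statistics differ only by the factor (N - 1)/N, which is why they are nearly identical for N = 124.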

data severe;
  input treat $ outcome $ count;
  cards;
Test    f 10
Test    u 2
Control f 2
Control u 4
;
proc freq order=data;              /* keep rows and columns in input order */
  tables treat*outcome / chisq nocol;
  weight count;
run;

The SAS System

The FREQ Procedure

Table of treat by outcome

treat      outcome

Frequency |
Percent   |
Row Pct   |f       |u       |  Total
----------+--------+--------+
Test      |     10 |      2 |     12
          |  55.56 |  11.11 |  66.67
          |  83.33 |  16.67 |
----------+--------+--------+
Control   |      2 |      4 |      6
          |  11.11 |  22.22 |  33.33
          |  33.33 |  66.67 |
----------+--------+--------+
Total           12        6       18
             66.67    33.33   100.00

Statistics for Table of treat by outcome

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      4.5000    0.0339
Likelihood Ratio Chi-Square    1      4.4629    0.0346
Continuity Adj. Chi-Square     1      2.5313    0.1116
Mantel-Haenszel Chi-Square     1      4.2500    0.0393
Phi Coefficient                       0.5000
Contingency Coefficient               0.4472
Cramer's V                            0.5000

WARNING: 75% of the cells have expected counts less than 5.
         Chi-Square may not be a valid test.

Fisher's Exact Test
------------------------------------
Cell (1,1) Frequency (F)          10
Left-sided Pr <= F            0.9961
Right-sided Pr >= F           0.0573

Table Probability (P)         0.0533
Two-sided Pr <= P             0.1070

Sample Size = 18

Exact Test

         Table Cell
(1,1)   (1,2)   (2,1)   (2,2)    Probability
  12       0       0       6        .0001
  11       1       1       5        .0039
  10       2       2       4        .0533
   9       3       3       3        .2370
   8       4       4       2        .4000
   7       5       5       1        .2560
   6       6       6       0        .0498

Table Probabilities

One-tailed p-value:  p = 0.0533 + 0.0039 + 0.0001 = 0.0573

Two-tailed p-value:  p = 0.0533 + 0.0039 + 0.0001 + 0.0498 = 0.1071
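The table probabilities and p-values above can be reproduced with a short Python sketch. It assumes the standard hypergeometric model with all margins fixed and the usual two-sided rule (sum the probabilities of every table no more likely than the observed one); the function name is hypothetical. Summing unrounded probabilities gives a two-sided value of 0.1070, as in the SAS listing, whereas adding the rounded table entries gives 0.1071.

```python
from math import comb


def fisher_exact_2x2(n11, n12, n21, n22):
    """Return (table probabilities keyed by n11, right-sided p, two-sided p)."""
    r1, r2 = n11 + n12, n21 + n22   # row totals
    c1 = n11 + n21                  # first column total
    N = r1 + r2

    def prob(x):
        # Hypergeometric probability that cell (1,1) equals x given the margins
        return comb(r1, x) * comb(r2, c1 - x) / comb(N, c1)

    lo, hi = max(0, c1 - r2), min(r1, c1)
    probs = {x: prob(x) for x in range(lo, hi + 1)}
    p_obs = probs[n11]
    right = sum(p for x, p in probs.items() if x >= n11)
    # Small relative tolerance guards against floating-point ties
    two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return probs, right, two_sided


# Severe-infection data: Test (10, 2), Control (2, 4)
probs, right, two_sided = fisher_exact_2x2(10, 2, 2, 4)
for x in sorted(probs, reverse=True):
    print(x, f"{probs[x]:.4f}")
print(f"one-tailed p = {right:.4f}, two-tailed p = {two_sided:.4f}")
```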

Difference in Proportions

With d = p_1 - p_2:

\[ E\{p_1 - p_2\} = \pi_1 - \pi_2 \]

\[ v_d = \frac{p_1(1 - p_1)}{n_{1+} - 1} + \frac{p_2(1 - p_2)}{n_{2+} - 1} \]

Confidence interval:

\[ d \pm \left\{ z_{\alpha/2}\sqrt{v_d} + \frac{1}{2}\left(\frac{1}{n_{1+}} + \frac{1}{n_{2+}}\right) \right\} \]
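A minimal Python sketch of this interval, applied to the respire counts from the first example (test: 40 of 60 favorable; placebo: 16 of 64). It assumes the interval has the form d ± {z √v_d + ½(1/n1+ + 1/n2+)} with n − 1 in the variance denominators; the function name and the choice of data are illustrative, and the slides show no SAS output for this quantity.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(n11, n12, n21, n22, level=0.95):
    """Return (d, lower, upper) for p1 - p2 with a continuity-corrected Wald interval."""
    n1, n2 = n11 + n12, n21 + n22          # row totals n1+, n2+
    p1, p2 = n11 / n1, n21 / n2
    d = p1 - p2
    # Variance estimate with (n - 1) denominators, as assumed above
    v_d = p1 * (1 - p1) / (n1 - 1) + p2 * (1 - p2) / (n2 - 1)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sqrt(v_d) + 0.5 * (1 / n1 + 1 / n2)
    return d, d - half_width, d + half_width


d, lower, upper = diff_proportions_ci(40, 20, 16, 48)
print(f"d = {d:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```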

Odds Ratio and Relative Risk

\[ OR = \frac{p_1/(1 - p_1)}{p_2/(1 - p_2)} = \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}} \]

\[ f = \log\{OR\} = \log\left\{\frac{p_1/(1 - p_1)}{p_2/(1 - p_2)}\right\} = \log\{p_1/(1 - p_1)\} - \log\{p_2/(1 - p_2)\} \]

\[ v_f = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}} \]

Confidence interval for the OR:

\[ \exp\left(f \pm z_{\alpha/2}\sqrt{v_f}\right) \]

\[ RR = \frac{p_1}{p_2} \]

\[ RR = OR \times \frac{1 + n_{21}/n_{22}}{1 + n_{11}/n_{12}} \]

RR ≈ OR if n11 and n21 are small relative to n12 and n22 (rare outcome assumption).

             Yes      No       Total    Proportion Yes
Group 1      n11      n12      n1+      p1 = n11 / n1+
Group 2      n21      n22      n2+      p2 = n21 / n2+
Total        n+1      n+2      N

Odds Ratio and Relative Risk

Risk is the preferable measure because it is a probability.

Why odds?

– The OR is often a good approximation to the RR.

– Sometimes the OR is all we can estimate (case-control studies).

– Sometimes the OR is the most convenient measure to calculate (logistic regression analysis).

Sun protection during childhood and cutaneous melanoma

Sun protection?    Cases    Controls    Total
Yes                   99         132      231
No                   303         290      593
Total                402         422      824

The RR is meaningful when the sample is randomly selected. In a case-control study, the sample is stratified by case-control status, so the proportion of melanoma in the sample, 402/824 = 0.49, is set by the design and does not estimate the risk.

Woodward (Epidemiology-study design and data analysis)

Woodward (Epidemiology-study design and data analysis)

(a) Population values and (b) expected values in the sample, where f1 and f2 are the sampling fractions for cases and controls

               (a) Population                     (b) Expected values in the sample
               Diseased   Not diseased   Total    Cases      Controls   Total
Exposed        A          B              A+B      f1A        f2B        f1A+f2B
Not exposed    C          D              C+D      f1C        f2D        f1C+f2D
Total          A+C        B+D            N        f1(A+C)    f2(B+D)    n

\[ RR = \frac{A/(A+B)}{C/(C+D)} \]

Risk for the exposed:

\[ \frac{f_1 A}{f_1 A + f_2 B} \neq \frac{A}{A+B} \]

Risk for the unexposed:

\[ \frac{f_1 C}{f_1 C + f_2 D} \neq \frac{C}{C+D} \]

The RR computed from (b), the expected sample values, does not equal the RR from (a), the population values:

\[ RR_{(b)} = \frac{f_1 A\,(f_1 C + f_2 D)}{f_1 C\,(f_1 A + f_2 B)} \neq \frac{A\,(C + D)}{C\,(A + B)} = RR_{(a)} \]

Odds for the exposed:

\[ \frac{f_1 A}{f_2 B} \neq \frac{A}{B} \]

Odds for the unexposed:

\[ \frac{f_1 C}{f_2 D} \neq \frac{C}{D} \]

But the sampling fractions cancel in the odds ratio:

\[ OR = \frac{(f_1 A)(f_2 D)}{(f_2 B)(f_1 C)} = \frac{AD}{BC} \]

We can use a case-control study to estimate the OR, but not the risk, the RR, or the odds.
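The cancellation of the sampling fractions can be illustrated numerically. In the Python sketch below, the population counts A, B, C, D and the sampling fractions f1 and f2 are made-up values chosen only for illustration (they are not from Woodward or the slides): the OR from the expected sample counts equals the population OR, while the RR does not.

```python
def rr_and_or(a, b, c, d):
    """Return (relative risk, odds ratio) for exposed (a, b) and unexposed (c, d) counts."""
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio


# Hypothetical population: exposed (diseased A, not diseased B), unexposed (C, D)
A, B, C, D = 200, 9800, 100, 9900
# Hypothetical sampling fractions: most cases are sampled, few controls
f1, f2 = 0.9, 0.01

rr_pop, or_pop = rr_and_or(A, B, C, D)
rr_smp, or_smp = rr_and_or(f1 * A, f2 * B, f1 * C, f2 * D)

print(f"population: RR = {rr_pop:.4f}, OR = {or_pop:.4f}")
print(f"sample:     RR = {rr_smp:.4f}, OR = {or_smp:.4f}")
```

Because the disease is rare in this made-up population, the population OR is also close to the population RR, in line with the rare outcome assumption.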

Measures of association (Boston University): http://sphweb.bumc.bu.edu/otlt/MPH-Modules/EP/EP713_Association/

data stress;
  input stress $ outcome $ count;
  cards;
low  f 48
low  u 12
high f 96
high u 94
;
proc freq order=data;
  /* measures adds association measures, the odds ratio, and relative risks */
  tables stress*outcome / chisq measures nocol nopercent;
  weight count;
run;

The FREQ Procedure

Table of stress by outcome

stress     outcome

Frequency |
Row Pct   |f       |u       |  Total
----------+--------+--------+
low       |     48 |     12 |     60
          |  80.00 |  20.00 |
----------+--------+--------+
high      |     96 |     94 |    190
          |  50.53 |  49.47 |
----------+--------+--------+
Total          144      106      250

Statistics for Table of stress by outcome

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1     16.2198    <.0001
Likelihood Ratio Chi-Square    1     17.3520    <.0001
Continuity Adj. Chi-Square     1     15.0354    0.0001
Mantel-Haenszel Chi-Square     1     16.1549    <.0001
Phi Coefficient                       0.2547
Contingency Coefficient               0.2468
Cramer's V                            0.2547

Fisher's Exact Test
------------------------------------
Cell (1,1) Frequency (F)          48
Left-sided Pr <= F            1.0000
Right-sided Pr >= F        3.247E-05

Table Probability (P)      2.472E-05
Two-sided Pr <= P          4.546E-05

Statistic                              Value       ASE
------------------------------------------------------
Gamma                                 0.5932    0.1147
Kendall's Tau-b                       0.2547    0.0551
Stuart's Tau-c                        0.2150    0.0489
Somers' D C|R                         0.2947    0.0631
Somers' D R|C                         0.2201    0.0499
Pearson Correlation                   0.2547    0.0551
Spearman Correlation                  0.2547    0.0551
Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0000    0.0000
Lambda Symmetric                      0.0000    0.0000
Uncertainty Coefficient C|R           0.0509    0.0231
Uncertainty Coefficient R|C           0.0630    0.0282
Uncertainty Coefficient Symmetric     0.0563    0.0253

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value     95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      3.9167      1.9575      7.8366
Cohort (Col1 Risk)             1.5833      1.3104      1.9131
Cohort (Col2 Risk)             0.4043      0.2389      0.6841

Sample Size = 250
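The odds ratio and column 1 relative risk in this listing can be recomputed by hand. The Python sketch below uses the log-scale interval exp(f ± z √v_f) with v_f = 1/n11 + 1/n12 + 1/n21 + 1/n22 for the OR. For the RR it assumes the standard log-scale Wald variance (1 − p1)/n11 + (1 − p2)/n21, which the slides do not state; the function name is illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def or_rr_with_ci(n11, n12, n21, n22, level=0.95):
    """Return {'or': (est, lo, hi), 'rr': (est, lo, hi)} for Row1/Row2, column 1."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)

    odds_ratio = (n11 * n22) / (n12 * n21)
    se_log_or = sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)

    p1 = n11 / (n11 + n12)
    p2 = n21 / (n21 + n22)
    rr = p1 / p2
    # Assumed log-scale variance for the relative risk (not given in the slides)
    se_log_rr = sqrt((1 - p1) / n11 + (1 - p2) / n21)

    def interval(estimate, se):
        return (estimate,
                exp(log(estimate) - z * se),
                exp(log(estimate) + z * se))

    return {"or": interval(odds_ratio, se_log_or), "rr": interval(rr, se_log_rr)}


# Stress data: low (48 f, 12 u), high (96 f, 94 u)
result = or_rr_with_ci(48, 12, 96, 94)
for name, (est, lo, hi) in result.items():
    print(f"{name.upper()}: {est:.4f}  95% CI ({lo:.4f}, {hi:.4f})")
```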

data respire;
  input treat $ outcome $ count;
  cards;
test    yes 29
test    no  16
placebo yes 14
placebo no  31
;
proc freq order=data;
  tables treat*outcome / measures chisq nocol nopercent;
  weight count;
run;

The FREQ Procedure

Table of treat by outcome

treat      outcome

Frequency |
Row Pct   |yes     |no      |  Total
----------+--------+--------+
test      |     29 |     16 |     45
          |  64.44 |  35.56 |
----------+--------+--------+
placebo   |     14 |     31 |     45
          |  31.11 |  68.89 |
----------+--------+--------+
Total           43       47       90

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value     95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      4.0134      1.6680      9.6564
Cohort (Col1 Risk)             2.0714      1.2742      3.3675
Cohort (Col2 Risk)             0.5161      0.3325      0.8011

Sample Size = 90

McNemar Test: Matched pairs

                   Response 2
Response 1     Yes      No       Total
Yes            n11      n12      n1+
No             n21      n22      n2+
Total          n+1      n+2      n

The question is whether

\[ p_1 = \frac{n_{1+}}{n} \quad \text{and} \quad p_2 = \frac{n_{+1}}{n} \]

are the same.

\[ Q_M = \frac{(n_{12} - n_{21})^2}{n_{12} + n_{21}} \]

data approval;
  input hus_resp $ wif_resp $ count;
  cards;
yes yes 20
yes no   5
no  yes 10
no  no  10
;
proc freq order=data;
  weight count;
  tables hus_resp*wif_resp / agree;   /* McNemar's test and kappa */
run;

The FREQ Procedure

Table of hus_resp by wif_resp

hus_resp   wif_resp

Frequency |
Percent   |
Row Pct   |
Col Pct   |yes     |no      |  Total
----------+--------+--------+
yes       |     20 |      5 |     25
          |  44.44 |  11.11 |  55.56
          |  80.00 |  20.00 |
          |  66.67 |  33.33 |
----------+--------+--------+
no        |     10 |     10 |     20
          |  22.22 |  22.22 |  44.44
          |  50.00 |  50.00 |
          |  33.33 |  66.67 |
----------+--------+--------+
Total           30       15       45
             66.67    33.33   100.00

Statistics for Table of hus_resp by wif_resp

McNemar's Test
-----------------------
Statistic (S)    1.6667
DF                    1
Pr > S           0.1967

Simple Kappa Coefficient
--------------------------------
Kappa                     0.3077
ASE                       0.1402
95% Lower Conf Limit      0.0329
95% Upper Conf Limit      0.5825

Sample Size = 45

The 95% confidence interval for kappa (0.0329, 0.5825) does not include 0, so we reject H0: no agreement.
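The McNemar statistic and the simple kappa estimate for the husband/wife approval table can be recomputed as follows. This Python sketch uses Q_M = (n12 − n21)² / (n12 + n21) and takes kappa as (observed agreement − chance agreement) / (1 − chance agreement), with chance agreement computed from the margins; the function names are illustrative, and the ASE and confidence limits are not reproduced here.

```python
from math import erfc, sqrt


def mcnemar(n12, n21):
    """Return (Q_M, p-value) using the chi-square distribution with 1 df."""
    q = (n12 - n21) ** 2 / (n12 + n21)
    # For 1 df, P(chi-square > q) = P(|Z| > sqrt(q)) = erfc(sqrt(q / 2))
    return q, erfc(sqrt(q / 2))


def simple_kappa(table):
    """Cohen's kappa for a square table given as a list of rows."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_obs = sum(table[i][i] for i in range(k)) / n                 # observed agreement
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2  # chance agreement
    return (p_obs - p_exp) / (1 - p_exp)


table = [[20, 5], [10, 10]]   # rows: husband yes/no; columns: wife yes/no
q_m, p_value = mcnemar(table[0][1], table[1][0])
kappa = simple_kappa(table)
print(f"Q_M = {q_m:.4f}, p = {p_value:.4f}, kappa = {kappa:.4f}")
```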

Sets of 2*2 tables

data ca;
  input gender $ ECG $ disease $ count;
  datalines;
female <0.1  yes  4
female <0.1  no  11
female >=0.1 yes  8
female >=0.1 no  10
male   <0.1  yes  9
male   <0.1  no   9
male   >=0.1 yes 21
male   >=0.1 no   6
;
options ls=75 nonumber nodate;
proc freq;
  weight count;
  tables gender*disease / nocol nopct chisq;
  /* gender is the stratification variable in the three-way request */
  tables gender*ECG*disease / nocol nopct cmh chisq measures;
run;


The FREQ Procedure

Table of gender by disease

gender disease

Frequency|

Row Pct |no |yes | Total

---------+--------+--------+

female | 21 | 12 | 33

| 63.64 | 36.36 |

---------+--------+--------+

male | 15 | 30 | 45

| 33.33 | 66.67 |

---------+--------+--------+

Total 36 42 78

Statistics for Table of gender by disease

Statistic DF Value Prob

------------------------------------------------------

Chi-Square 1 7.0346 0.0080

Likelihood Ratio Chi-Square 1 7.1209 0.0076

Continuity Adj. Chi-Square 1 5.8681 0.0154

Mantel-Haenszel Chi-Square 1 6.9444 0.0084

Phi Coefficient 0.3003

Contingency Coefficient 0.2876

Cramer's V 0.3003


Table 1 of ECG by disease

Controlling for gender=female

ECG disease

Frequency|

Row Pct |no |yes | Total

---------+--------+--------+

<0.1 | 11 | 4 | 15

| 73.33 | 26.67 |

---------+--------+--------+

>=0.1 | 10 | 8 | 18

| 55.56 | 44.44 |

---------+--------+--------+

Total 21 12 33

Statistics for Table 1 of ECG by disease

Controlling for gender=female

Statistic DF Value Prob

------------------------------------------------------

Chi-Square 1 1.1175 0.2905

Likelihood Ratio Chi-Square 1 1.1337 0.2870

Continuity Adj. Chi-Square 1 0.4813 0.4879

Mantel-Haenszel Chi-Square 1 1.0836 0.2979

Phi Coefficient 0.1840

Contingency Coefficient 0.1810

Cramer's V 0.1840

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value     95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      2.2000      0.5036      9.6107
Cohort (Col1 Risk)             1.3200      0.7897      2.2063
Cohort (Col2 Risk)             0.6000      0.2240      1.6073

Sample Size = 33


Table 2 of ECG by disease

Controlling for gender=male

ECG disease

Frequency|

Row Pct |no |yes | Total

---------+--------+--------+

<0.1 | 9 | 9 | 18

| 50.00 | 50.00 |

---------+--------+--------+

>=0.1 | 6 | 21 | 27

| 22.22 | 77.78 |

---------+--------+--------+

Total 15 30 45

Statistics for Table 2 of ECG by disease

Controlling for gender=male

Statistic DF Value Prob

------------------------------------------------------

Chi-Square 1 3.7500 0.0528

Likelihood Ratio Chi-Square 1 3.7288 0.0535

Continuity Adj. Chi-Square 1 2.6042 0.1066

Mantel-Haenszel Chi-Square 1 3.6667 0.0555

Phi Coefficient 0.2887

Contingency Coefficient 0.2774

Cramer's V 0.2887

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value     95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      3.5000      0.9587     12.7775
Cohort (Col1 Risk)             2.2500      0.9680      5.2298
Cohort (Col2 Risk)             0.6429      0.3883      1.0642

Sample Size = 45


Summary Statistics for ECG by disease

Controlling for gender

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic Alternative Hypothesis DF Value Prob

---------------------------------------------------------------

1 Nonzero Correlation 1 4.5026 0.0338

2 Row Mean Scores Differ 1 4.5026 0.0338

3 General Association 1 4.5026 0.0338

Estimates of the Common Relative Risk (Row1/Row2)

Type of Study Method Value 95% Confidence Limits

-------------------------------------------------------------------------

Case-Control Mantel-Haenszel 2.8467 1.0765 7.5279

(Odds Ratio) Logit 2.8593 1.0807 7.5650

Cohort Mantel-Haenszel 1.6414 1.0410 2.5879

(Col1 Risk) Logit 1.5249 0.9833 2.3647

Cohort Mantel-Haenszel 0.6299 0.3980 0.9969

(Col2 Risk) Logit 0.6337 0.4046 0.9926

Breslow-Day Test for
Homogeneity of the Odds Ratios
------------------------------
Chi-Square              0.2155
DF                           1
Pr > ChiSq              0.6425

Total Sample Size = 78
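To see where the Cochran-Mantel-Haenszel statistic and the Mantel-Haenszel common odds ratio come from, the Python sketch below recomputes both from the two gender strata. It assumes the standard stratified forms, which the slides do not write out: Q = {Σ_h (n_h11 − m_h11)}² / Σ_h v_h11 and OR_MH = Σ_h (n_h11 n_h22 / N_h) / Σ_h (n_h12 n_h21 / N_h). The function name is illustrative, and confidence limits are not reproduced.

```python
def cmh_2x2(strata):
    """strata: iterable of (n11, n12, n21, n22). Return (Q_CMH, Mantel-Haenszel OR)."""
    diff_sum = var_sum = num = den = 0.0
    for n11, n12, n21, n22 in strata:
        r1, r2 = n11 + n12, n21 + n22
        c1, c2 = n11 + n21, n12 + n22
        N = r1 + r2
        diff_sum += n11 - r1 * c1 / N                        # observed minus expected
        var_sum += r1 * r2 * c1 * c2 / (N ** 2 * (N - 1))    # hypergeometric variance
        num += n11 * n22 / N
        den += n12 * n21 / N
    return diff_sum ** 2 / var_sum, num / den


# Rows: ECG <0.1, >=0.1; columns: disease no, yes (the order SAS printed)
strata = [
    (11, 4, 10, 8),   # female
    (9, 9, 6, 21),    # male
]
q_cmh, or_mh = cmh_2x2(strata)
print(f"Q_CMH = {q_cmh:.4f}, Mantel-Haenszel OR = {or_mh:.4f}")
```

A single pooled estimate like this is most meaningful when the stratum-specific odds ratios are similar, which is what the Breslow-Day test examines.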

data ca;
  input gender $ ECG $ disease $ count;
  datalines;
female <0.1  yes  8
female <0.1  no  11
female >=0.1 yes  8
female >=0.1 no  20
male   <0.1  yes  9
male   <0.1  no   9
male   >=0.1 yes 30
male   >=0.1 no   6
;
options ls=75 nonumber nodate;
proc freq;
  weight count;
  tables gender*ECG*disease / nocol nopct cmh measures;
run;


Table 1 of ECG by disease

Controlling for gender=female

ECG disease

Frequency|

Row Pct |no |yes | Total

---------+--------+--------+

<0.1 | 11 | 8 | 19

| 57.89 | 42.11 |

---------+--------+--------+

>=0.1 | 20 | 8 | 28

| 71.43 | 28.57 |

---------+--------+--------+

Total 31 16 47

Estimates of the Relative Risk (Row1/Row2)

Type of Study Value 95% Confidence Limits

-----------------------------------------------------------------

Case-Control (Odds Ratio) 0.5500 0.1615 1.8731

Cohort (Col1 Risk) 0.8105 0.5171 1.2703

Cohort (Col2 Risk) 1.4737 0.6701 3.2407

Sample Size = 47


Table 2 of ECG by disease

Controlling for gender=male

ECG disease

Frequency|

Row Pct |no |yes | Total

---------+--------+--------+

<0.1 | 9 | 9 | 18

| 50.00 | 50.00 |

---------+--------+--------+

>=0.1 | 6 | 30 | 36

| 16.67 | 83.33 |

---------+--------+--------+

Total 15 39 54

Estimates of the Relative Risk (Row1/Row2)

Type of Study Value 95% Confidence Limits

-----------------------------------------------------------------

Case-Control (Odds Ratio) 5.0000 1.3992 17.8677

Cohort (Col1 Risk) 3.0000 1.2641 7.1198

Cohort (Col2 Risk) 0.6000 0.3696 0.9740

Sample Size = 54


Summary Statistics for ECG by disease

Controlling for gender

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic Alternative Hypothesis DF Value Prob

---------------------------------------------------------------

1 Nonzero Correlation 1 1.2063 0.2721

2 Row Mean Scores Differ 1 1.2063 0.2721

3 General Association 1 1.2063 0.2721

Estimates of the Common Relative Risk (Row1/Row2)

Type of Study Method Value 95% Confidence Limits

-------------------------------------------------------------------------

Case-Control Mantel-Haenszel 1.5604 0.6767 3.5982

(Odds Ratio) Logit 1.5893 0.6572 3.8433

Cohort Mantel-Haenszel 1.2447 0.8393 1.8460

(Col1 Risk) Logit 1.0708 0.7187 1.5954

Cohort Mantel-Haenszel 0.8135 0.5412 1.2227

(Col2 Risk) Logit 0.7677 0.5081 1.1600

Breslow-Day Test for

Homogeneity of the Odds Ratios

------------------------------

Chi-Square 6.2128

DF 1

Pr > ChiSq 0.0127

For a square table (s = r), we will see Kappa and weighted Kappa.

\[ \kappa = \frac{\Pi_0 - \Pi_e}{1 - \Pi_e} \]

where

\[ \Pi_0 = \sum_i p_{ii} \] is the probability of agreement, and

\[ \Pi_e = \sum_i p_{i+}\, p_{+i} \] is the probability of agreement if the two ratings are independent.

If there is perfect agreement, then \Pi_0 = 1 and \kappa = 1.

Weighted Kappa also takes the off-diagonal terms into account.

Kappa > 0.8 : excellent agreement

Kappa > 0.4 : moderate agreement

2 × r table, the case r = 3

With column scores a_j, the mean score for row 1 is

\[ \bar{f}_1 = \sum_{j=1}^{3} \frac{a_j\, n_{1j}}{n_{1+}} \]

\[ E\{\bar{f}_1 \mid H_0\} = \sum_{j=1}^{3} a_j \frac{n_{+j}}{n} = \mu_a \]

\[ V\{\bar{f}_1 \mid H_0\} = \frac{n - n_{1+}}{n_{1+}(n - 1)} \sum_{j=1}^{3} (a_j - \mu_a)^2 \frac{n_{+j}}{n} = \frac{(n - n_{1+})\, v_a}{n_{1+}(n - 1)} \]

\[ Q_S = \frac{(\bar{f}_1 - \mu_a)^2}{\{(n - n_{1+}) / (n_{1+}(n - 1))\}\, v_a} \]

data arth;
  /* @@ holds the line so several records can be read from each data line */
  input gender $ treat $ response $ count @@;
  cards;
female test    none    6  female test    some    5
female test    marked 16  female placebo none   19
female placebo some    7  female placebo marked  6
male   test    none    7  male   test    some    2
male   test    marked  5  male   placebo none   10
male   placebo some    0  male   placebo marked  1
;
proc freq data=arth order=data;
  weight count;
  tables treat*response / chisq nocol nopercent;
run;

The FREQ Procedure

Table of treat by response

treat      response

Frequency |
Row Pct   |none    |some    |marked  |  Total
----------+--------+--------+--------+
test      |     13 |      7 |     21 |     41
          |  31.71 |  17.07 |  51.22 |
----------+--------+--------+--------+
placebo   |     29 |      7 |      7 |     43
          |  67.44 |  16.28 |  16.28 |
----------+--------+--------+--------+
Total           42       14       28       84

Statistics for Table of treat by response

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     2     13.0550    0.0015
Likelihood Ratio Chi-Square    2     13.5298    0.0012
Mantel-Haenszel Chi-Square     1     12.8590    0.0003
Phi Coefficient                       0.3942
Contingency Coefficient               0.3668
Cramer's V                            0.3942

Sample Size = 84
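The Mantel-Haenszel chi-square with 1 degree of freedom in this listing can be reproduced with the mean score statistic for a 2 × r table. The Python sketch below assumes integer table scores a = (1, 2, 3) for none, some, marked, and the form Q_S = (f̄_1 − μ_a)² / [{(n − n1+) / (n1+ (n − 1))} v_a]; the function name is illustrative.

```python
def mean_score_statistic(row1, row2, scores):
    """Mean score statistic Q_S for a 2 x r table with the given column scores."""
    n1 = sum(row1)
    n = n1 + sum(row2)
    col_tot = [x + y for x, y in zip(row1, row2)]
    # Mean score in row 1
    f1 = sum(a * x for a, x in zip(scores, row1)) / n1
    # Expected mean score and score variance under H0 (from the column margins)
    mu_a = sum(a * c for a, c in zip(scores, col_tot)) / n
    v_a = sum((a - mu_a) ** 2 * c for a, c in zip(scores, col_tot)) / n
    var_f1 = (n - n1) / (n1 * (n - 1)) * v_a
    return (f1 - mu_a) ** 2 / var_f1


# treat by response, pooled over gender: test (13, 7, 21), placebo (29, 7, 7)
q_s = mean_score_statistic([13, 7, 21], [29, 7, 7], scores=[1, 2, 3])
print(f"Q_S = {q_s:.4f}")
```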

data arth;
  input gender $ treat $ response $ count @@;
  cards;
female test    none    6  female test    some    5
female test    marked 16  female placebo none   19
female placebo some    7  female placebo marked  6
male   test    none    7  male   test    some    2
male   test    marked  5  male   placebo none   10
male   placebo some    0  male   placebo marked  1
;
proc freq data=arth order=data;
  weight count;
  /* stratify by gender and request CMH statistics */
  tables gender*treat*response / cmh nocol nopercent;
run;

The FREQ Procedure

Table 1 of treat by response
Controlling for gender=female

treat      response

Frequency |
Row Pct   |none    |some    |marked  |  Total
----------+--------+--------+--------+
test      |      6 |      5 |     16 |     27
          |  22.22 |  18.52 |  59.26 |
----------+--------+--------+--------+
placebo   |     19 |      7 |      6 |     32
          |  59.38 |  21.88 |  18.75 |
----------+--------+--------+--------+
Total           25       12       22       59

Table 2 of treat by response
Controlling for gender=male

treat      response

Frequency |
Row Pct   |none    |some    |marked  |  Total
----------+--------+--------+--------+
test      |      7 |      2 |      5 |     14
          |  50.00 |  14.29 |  35.71 |
----------+--------+--------+--------+
placebo   |     10 |      0 |      1 |     11
          |  90.91 |   0.00 |   9.09 |
----------+--------+--------+--------+
Total           17        2        6       25

The FREQ Procedure

Summary Statistics for treat by response
Controlling for gender

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1     14.6319    0.0001
    2        Row Mean Scores Differ     1     14.6319    0.0001
    3        General Association        2     14.6323    0.0007

Total Sample Size = 84

proc freq data=arth order=data;
  weight count;
  /* scores=modridit replaces table scores with modified ridit scores */
  tables gender*treat*response / cmh scores=modridit nocol nopercent;
run;

The FREQ Procedure

Summary Statistics for treat by response
Controlling for gender

Cochran-Mantel-Haenszel Statistics (Modified Ridit Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1     14.9918    0.0001
    2        Row Mean Scores Differ     1     15.0041    0.0001
    3        General Association        2     14.6323    0.0007

Total Sample Size = 84

data colds;
  input gender $ residnce $ per_cold count @@;
  cards;
female urban 0  45  female urban 1  64
female urban 2  71  female rural 0  80
female rural 1 104  female rural 2 116
male   urban 0  84  male   urban 1 124
male   urban 2  82  male   rural 0 106
male   rural 1 117  male   rural 2  87
;
proc freq data=colds order=data;
  weight count;
  tables gender*residnce*per_cold / all cmh nocol nopercent;
run;

The FREQ Procedure

Table 1 of residnce by per_cold
Controlling for gender=female

residnce   per_cold

Frequency |
Row Pct   |       0|       1|       2|  Total
----------+--------+--------+--------+
urban     |     45 |     64 |     71 |    180
          |  25.00 |  35.56 |  39.44 |
----------+--------+--------+--------+
rural     |     80 |    104 |    116 |    300
          |  26.67 |  34.67 |  38.67 |
----------+--------+--------+--------+
Total          125      168      187      480

Statistics for Table 1 of residnce by per_cold
Controlling for gender=female

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     2      0.1629    0.9218
Likelihood Ratio Chi-Square    2      0.1634    0.9215
Mantel-Haenszel Chi-Square     1      0.1059    0.7448
Phi Coefficient                       0.0184
Contingency Coefficient               0.0184
Cramer's V                            0.0184

Statistic                              Value       ASE
------------------------------------------------------
Gamma                                -0.0242    0.0775
Kendall's Tau-b                      -0.0134    0.0430
Stuart's Tau-c                       -0.0149    0.0478
Somers' D C|R                        -0.0159    0.0509
Somers' D R|C                        -0.0113    0.0363
Pearson Correlation                  -0.0149    0.0455
Spearman Correlation                 -0.0142    0.0455
Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0000    0.0000
Lambda Symmetric                      0.0000    0.0000
Uncertainty Coefficient C|R           0.0002    0.0008
Uncertainty Coefficient R|C           0.0003    0.0013
Uncertainty Coefficient Symmetric     0.0002    0.0010

Sample Size = 480

Table 2 of residnce by per_cold
Controlling for gender=male

residnce   per_cold

Frequency |
Row Pct   |       0|       1|       2|  Total
----------+--------+--------+--------+
urban     |     84 |    124 |     82 |    290
          |  28.97 |  42.76 |  28.28 |
----------+--------+--------+--------+
rural     |    106 |    117 |     87 |    310
          |  34.19 |  37.74 |  28.06 |
----------+--------+--------+--------+
Total          190      241      169      600

Statistics for Table 2 of residnce by per_cold
Controlling for gender=male

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     2      2.2344    0.3272
Likelihood Ratio Chi-Square    2      2.2376    0.3267
Mantel-Haenszel Chi-Square     1      0.7412    0.3893
Phi Coefficient                       0.0610
Contingency Coefficient               0.0609
Cramer's V                            0.0610

Statistic                              Value       ASE
------------------------------------------------------
Gamma                                -0.0591    0.0669
Kendall's Tau-b                      -0.0339    0.0385
Stuart's Tau-c                       -0.0390    0.0442
Somers' D C|R                        -0.0390    0.0442
Somers' D R|C                        -0.0296    0.0335
Pearson Correlation                  -0.0352    0.0407
Spearman Correlation                 -0.0360    0.0408
Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0241    0.0529
Lambda Symmetric                      0.0108    0.0238
Uncertainty Coefficient C|R           0.0017    0.0023
Uncertainty Coefficient R|C           0.0027    0.0036
Uncertainty Coefficient Symmetric     0.0021    0.0028

Sample Size = 600

Summary Statistics for residnce by per_cold
Controlling for gender

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1      0.7379    0.3903
    2        Row Mean Scores Differ     1      0.7379    0.3903
    3        General Association        2      1.9707    0.3733

Total Sample Size = 1080

s × 2 table, the case s = 3

With row scores c_i and column scores a_j:

\[ \bar{f} = \sum_{i=1}^{3} c_i\, \bar{f}_i\, \frac{n_{i+}}{n} = \sum_{i=1}^{3} \sum_{j=1}^{2} \frac{c_i\, a_j\, n_{ij}}{n} \]

\[ E\{\bar{f} \mid H_0\} = \sum_{i=1}^{3} c_i \frac{n_{i+}}{n} \sum_{j=1}^{2} a_j \frac{n_{+j}}{n} = \mu_c\, \mu_a \]

\[ V\{\bar{f} \mid H_0\} = \frac{v_c\, v_a}{n - 1} \]

\[ Q_{CS} = \frac{(\bar{f} - E\{\bar{f} \mid H_0\})^2}{\mathrm{Var}\{\bar{f} \mid H_0\}} = (n - 1)\, r_{ac}^2 \]

data tobacco;
  length risk $11.;     /* room for the 11-character value "substantial" */
  input f_usage $ risk $ usage $ count @@;
  cards;
no  minimal     no  59  no  minimal     yes 25
no  moderate    no 169  no  moderate    yes 29
no  substantial no 196  no  substantial yes  9
yes minimal     no  11  yes minimal     yes  8
yes moderate    no  33  yes moderate    yes 11
yes substantial no  22  yes substantial yes  2
;
proc freq;
  weight count;
  tables f_usage*risk*usage / cmh chisq measures;
  tables f_usage*risk*usage / cmh scores=modridit;
run;

The FREQ Procedure

Table 1 of risk by usage
Controlling for f_usage=no

risk         usage

Frequency   |
Percent     |
Row Pct     |
Col Pct     |no      |yes     |  Total
------------+--------+--------+
minimal     |     59 |     25 |     84
            |  12.11 |   5.13 |  17.25
            |  70.24 |  29.76 |
            |  13.92 |  39.68 |
------------+--------+--------+
moderate    |    169 |     29 |    198
            |  34.70 |   5.95 |  40.66
            |  85.35 |  14.65 |
            |  39.86 |  46.03 |
------------+--------+--------+
substantial |    196 |      9 |    205
            |  40.25 |   1.85 |  42.09
            |  95.61 |   4.39 |
            |  46.23 |  14.29 |
------------+--------+--------+
Total            424       63      487
               87.06    12.94   100.00

Statistics for Table 1 of risk by usage
Controlling for f_usage=no

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     2     34.9217    <.0001
Likelihood Ratio Chi-Square    2     34.0684    <.0001
Mantel-Haenszel Chi-Square     1     34.2843    <.0001
Phi Coefficient                       0.2678
Contingency Coefficient               0.2587
Cramer's V                            0.2678

Statistic                              Value       ASE
------------------------------------------------------
Gamma                                -0.5948    0.0772
Kendall's Tau-b                      -0.2477    0.0395
Stuart's Tau-c                       -0.1863    0.0339
Somers' D C|R                        -0.1484    0.0267
Somers' D R|C                        -0.4135    0.0628
Pearson Correlation                  -0.2656    0.0439
Spearman Correlation                 -0.2602    0.0415
Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0709    0.0211
Lambda Symmetric                      0.0580    0.0169
Uncertainty Coefficient C|R           0.0908    0.0290
Uncertainty Coefficient R|C           0.0339    0.0112
Uncertainty Coefficient Symmetric     0.0493    0.0161

Sample Size = 487
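The Mantel-Haenszel chi-square in this listing is the single-table correlation statistic (n-1) r^2, where r is the Pearson correlation computed from the table scores. A quick check using the rounded values printed above (n = 487, r = -0.2656):

```latex
\[
Q_{MH} = (n-1)\, r^2 = 486 \times (-0.2656)^2 \approx 34.28
\]
```

This agrees with the reported 34.2843 up to the rounding of r.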

Table 2 of risk by usage
Controlling for f_usage=yes

risk         usage

Frequency   |
Percent     |
Row Pct     |
Col Pct     |no      |yes     |  Total
------------+--------+--------+
minimal     |     11 |      8 |     19
            |  12.64 |   9.20 |  21.84
            |  57.89 |  42.11 |
            |  16.67 |  38.10 |
------------+--------+--------+
moderate    |     33 |     11 |     44
            |  37.93 |  12.64 |  50.57
            |  75.00 |  25.00 |
            |  50.00 |  52.38 |
------------+--------+--------+
substantial |     22 |      2 |     24
            |  25.29 |   2.30 |  27.59
            |  91.67 |   8.33 |
            |  33.33 |   9.52 |
------------+--------+--------+
Total             66       21       87
               75.86    24.14   100.00

Statistics for Table 2 of risk by usage
Controlling for f_usage=yes

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     2      6.6413    0.0361
Likelihood Ratio Chi-Square    2      7.0461    0.0295
Mantel-Haenszel Chi-Square     1      6.5644    0.0104
Phi Coefficient                       0.2763
Contingency Coefficient               0.2663
Cramer's V                            0.2763

Statistic                              Value       ASE
------------------------------------------------------
Gamma                                -0.5309    0.1626
Kendall's Tau-b                      -0.2622    0.0905
Stuart's Tau-c                       -0.2500    0.0917
Somers' D C|R                        -0.2014    0.0726
Somers' D R|C                        -0.3413    0.1171
Pearson Correlation                  -0.2763    0.0966
Spearman Correlation                 -0.2761    0.0955
Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0000    0.0000
Lambda Symmetric                      0.0000    0.0000
Uncertainty Coefficient C|R           0.0733    0.0510
Uncertainty Coefficient R|C           0.0392    0.0276
Uncertainty Coefficient Symmetric     0.0511    0.0357

Sample Size = 87

Summary Statistics for risk by usage
Controlling for f_usage

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1     40.6639    <.0001
    2        Row Mean Scores Differ     2     41.0577    <.0001
    3        General Association        2     41.0577    <.0001

Total Sample Size = 574
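As a check on statistic 1 (Nonzero Correlation, 40.6639), the value can be rebuilt by hand from the two risk by usage crosstabs. This is a sketch under stated assumptions: row scores c = (1, 2, 3) for minimal, moderate, substantial, and column scores a = (0, 1) for usage no, yes. SAS table scores for the columns would be (1, 2), but shifting the scores leaves the statistic unchanged. The symbols T_h, E_h and V_h are notation introduced here, with T_h = n_h \bar f_h the score sum in stratum h.

```latex
\[
T_h = \sum_i \sum_j c_i a_j n_{hij}, \qquad
E_h = \frac{\left(\sum_i c_i n_{hi+}\right)\left(\sum_j a_j n_{h+j}\right)}{n_h},
\]
\[
V_h = \frac{\left[\sum_i c_i^2 n_{hi+} - \left(\sum_i c_i n_{hi+}\right)^2 / n_h\right]
            \left[\sum_j a_j^2 n_{h+j} - \left(\sum_j a_j n_{h+j}\right)^2 / n_h\right]}{n_h - 1}
\]

% f_usage = no (n = 487): row totals 84, 198, 205; column "yes" total 63
\[
T_1 = 25 + 2(29) + 3(9) = 110, \quad
E_1 = \frac{1095 \times 63}{487} \approx 141.653, \quad
V_1 \approx \frac{258.936 \times 54.850}{486} \approx 29.224
\]

% f_usage = yes (n = 87): row totals 19, 44, 24; column "yes" total 21
\[
T_2 = 8 + 2(11) + 3(2) = 36, \quad
E_2 = \frac{179 \times 21}{87} \approx 43.207, \quad
V_2 \approx \frac{42.713 \times 15.931}{86} \approx 7.912
\]

\[
Q_{CSMH} = \frac{\left[(T_1 - E_1) + (T_2 - E_2)\right]^2}{V_1 + V_2}
         \approx \frac{(-38.860)^2}{37.136} \approx 40.66
\]
```

Taken one stratum at a time, (T_h - E_h)^2 / V_h gives about 34.28 and 6.56, the Mantel-Haenszel chi-squares reported for the two individual tables.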

Summary Statistics for risk by usage
Controlling for f_usage

Cochran-Mantel-Haenszel Statistics (Modified Ridit Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1     39.3048    <.0001
    2        Row Mean Scores Differ     2     41.0826    <.0001
    3        General Association        2     41.0577    <.0001

Total Sample Size = 574
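For reference, a sketch of how modified ridit scores are formed, assuming the usual PROC FREQ definition: the rank score of row i is the count in the earlier rows plus half of (n_{i+} + 1), and the modified ridit score divides it by n + 1, computed separately within each stratum. The symbols R_i and M_i are notation introduced here. For the rows of the f_usage=no table (row totals 84, 198, 205; n = 487):

```latex
\[
R_i = \sum_{k<i} n_{k+} + \frac{n_{i+} + 1}{2}, \qquad
M_i = \frac{R_i}{n + 1}
\]
\[
R = (42.5,\ 183.5,\ 385), \qquad
M = \frac{R}{488} \approx (0.087,\ 0.376,\ 0.789)
\]
```

The General Association statistic does not use scores, which is why its value (41.0577) is the same under table scores and modified ridit scores.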

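data pain;
   input dgnosis $ treatmnt $ response $ count @@;   /* @@ reads two cells per data line */
   cards;
I  placebo no 26   I  placebo yes  6
I  dosage1 no 26   I  dosage1 yes  7
I  dosage2 no 23   I  dosage2 yes  9
I  dosage3 no 18   I  dosage3 yes 14
I  dosage4 no  9   I  dosage4 yes 23
II placebo no 26   II placebo yes  6
II dosage1 no 12   II dosage1 yes 20
II dosage2 no 13   II dosage2 yes 20
II dosage3 no  1   II dosage3 yes 31
II dosage4 no  1   II dosage4 yes 31
;
/* order=data keeps the levels in the order entered (placebo, dosage1, ..., dosage4)
   rather than alphabetical order, which matters for the score-based statistics */
proc freq order=data;
   weight count;
   tables treatmnt*response / chisq;                         /* pooled over diagnosis */
   tables dgnosis*treatmnt*response / chisq cmh;             /* stratified, table scores */
   tables dgnosis*treatmnt*response / scores=modridit cmh;   /* stratified, modified ridit scores */
run;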

Table of treatmnt by response

treatmnt   response

Frequency  |
Percent    |
Row Pct    |
Col Pct    |no      |yes     |  Total
-----------+--------+--------+
placebo    |     52 |     12 |     64
           |  16.15 |   3.73 |  19.88
           |  81.25 |  18.75 |
           |  33.55 |   7.19 |
-----------+--------+--------+
dosage1    |     38 |     27 |     65
           |  11.80 |   8.39 |  20.19
           |  58.46 |  41.54 |
           |  24.52 |  16.17 |
-----------+--------+--------+
dosage2    |     36 |     29 |     65
           |  11.18 |   9.01 |  20.19
           |  55.38 |  44.62 |
           |  23.23 |  17.37 |
-----------+--------+--------+
dosage3    |     19 |     45 |     64
           |   5.90 |  13.98 |  19.88
           |  29.69 |  70.31 |
           |  12.26 |  26.95 |
-----------+--------+--------+
dosage4    |     10 |     54 |     64
           |   3.11 |  16.77 |  19.88
           |  15.63 |  84.38 |
           |   6.45 |  32.34 |
-----------+--------+--------+
Total           155      167      322
              48.14    51.86   100.00

Statistics for Table of treatmnt by response

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     4     68.0752    <.0001
Likelihood Ratio Chi-Square    4     73.2533    <.0001
Mantel-Haenszel Chi-Square     1     65.4730    <.0001
Phi Coefficient                       0.4598
Contingency Coefficient               0.4178
Cramer's V                            0.4598

Sample Size = 322
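The three association measures in this listing are simple functions of the Pearson chi-square. A short check with the printed values (chi-square 68.0752, n = 322, and min(rows - 1, columns - 1) = 1 for this 5 x 2 table); the symbols \phi, P and V are notation introduced here for the phi coefficient, contingency coefficient and Cramer's V:

```latex
\[
\phi = \sqrt{\frac{Q_P}{n}} = \sqrt{\frac{68.0752}{322}} \approx 0.4598
\]
\[
P = \sqrt{\frac{Q_P}{Q_P + n}} = \sqrt{\frac{68.0752}{390.0752}} \approx 0.4178
\]
\[
V = \sqrt{\frac{Q_P / n}{\min(R-1,\ C-1)}} = \sqrt{\frac{68.0752 / 322}{1}} \approx 0.4598
\]
```

Because one dimension of the table has only two levels, Cramer's V and the phi coefficient coincide here.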

Table 1 of treatmnt by response
Controlling for dgnosis=I

treatmnt   response

Frequency  |
Percent    |
Row Pct    |
Col Pct    |no      |yes     |  Total
-----------+--------+--------+
placebo    |     26 |      6 |     32
           |  16.15 |   3.73 |  19.88
           |  81.25 |  18.75 |
           |  25.49 |  10.17 |
-----------+--------+--------+
dosage1    |     26 |      7 |     33
           |  16.15 |   4.35 |  20.50
           |  78.79 |  21.21 |
           |  25.49 |  11.86 |
-----------+--------+--------+
dosage2    |     23 |      9 |     32
           |  14.29 |   5.59 |  19.88
           |  71.88 |  28.13 |
           |  22.55 |  15.25 |
-----------+--------+--------+
dosage3    |     18 |     14 |     32
           |  11.18 |   8.70 |  19.88
           |  56.25 |  43.75 |
           |  17.65 |  23.73 |
-----------+--------+--------+
dosage4    |      9 |     23 |     32
           |   5.59 |  14.29 |  19.88
           |  28.13 |  71.88 |
           |   8.82 |  38.98 |
-----------+--------+--------+
Total           102       59      161
              63.35    36.65   100.00

Statistics for Table 1 of treatmnt by response
Controlling for dgnosis=I

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     4     26.6025    <.0001
Likelihood Ratio Chi-Square    4     26.6689    <.0001
Mantel-Haenszel Chi-Square     1     22.8188    <.0001
Phi Coefficient                       0.4065
Contingency Coefficient               0.3766
Cramer's V                            0.4065

Sample Size = 161

Table 2 of treatmnt by response
Controlling for dgnosis=II

treatmnt   response

Frequency  |
Percent    |
Row Pct    |
Col Pct    |no      |yes     |  Total
-----------+--------+--------+
placebo    |     26 |      6 |     32
           |  16.15 |   3.73 |  19.88
           |  81.25 |  18.75 |
           |  49.06 |   5.56 |
-----------+--------+--------+
dosage1    |     12 |     20 |     32
           |   7.45 |  12.42 |  19.88
           |  37.50 |  62.50 |
           |  22.64 |  18.52 |
-----------+--------+--------+
dosage2    |     13 |     20 |     33
           |   8.07 |  12.42 |  20.50
           |  39.39 |  60.61 |
           |  24.53 |  18.52 |
-----------+--------+--------+
dosage3    |      1 |     31 |     32
           |   0.62 |  19.25 |  19.88
           |   3.13 |  96.88 |
           |   1.89 |  28.70 |
-----------+--------+--------+
dosage4    |      1 |     31 |     32
           |   0.62 |  19.25 |  19.88
           |   3.13 |  96.88 |
           |   1.89 |  28.70 |
-----------+--------+--------+
Total            53      108      161
              32.92    67.08   100.00

Statistics for Table 2 of treatmnt by response
Controlling for dgnosis=II

Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     4     60.5073    <.0001
Likelihood Ratio Chi-Square    4     68.7446    <.0001
Mantel-Haenszel Chi-Square     1     52.3306    <.0001
Phi Coefficient                       0.6130
Contingency Coefficient               0.5226
Cramer's V                            0.6130

Sample Size = 161

Summary Statistics for treatmnt by response
Controlling for dgnosis

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1     71.7263    <.0001
    2        Row Mean Scores Differ     4     74.5307    <.0001
    3        General Association        4     74.5307    <.0001

Total Sample Size = 322

Summary Statistics for treatmnt by response
Controlling for dgnosis

Cochran-Mantel-Haenszel Statistics (Modified Ridit Scores)

Statistic    Alternative Hypothesis    DF       Value      Prob
---------------------------------------------------------------
    1        Nonzero Correlation        1     71.6471    <.0001
    2        Row Mean Scores Differ     4     74.5307    <.0001
    3        General Association        4     74.5307    <.0001

Total Sample Size = 322

Summary of Extended Mantel-Haenszel Statistics

Table Dimensions   Statistic   DF   Corresponding PROC FREQ MH label
-------------------------------------------------------------------
2 x 2              Q_MH        1    Nonzero Correlation
                                    Row Mean Scores Differ
                                    General Association
2 x r              Q_SMH       1    Nonzero Correlation
                                    Row Mean Scores Differ
s x 2              Q_CSMH      1    Nonzero Correlation

Extended Mantel-Haenszel Statistics (s x r table)

Alternative Hypothesis       SAS output label         DF            Scale Requirement                  Nonparametric Equivalents
----------------------------------------------------------------------------------------------------------------------------
General Association          General Association      (s-1)x(r-1)   none
Mean Score Location Shifts   Row Mean Scores Differ   (s-1)         Column score ordinal               Kruskal-Wallis
Linear Association           Nonzero Correlation      1             Row and column variables ordinal   Spearman Correlation
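The SAS output labels in this table are the three lines printed by the CMH option. As a sketch, assuming the pain data set from the earlier example is still available, the CMH1 and CMH2 options of the TABLES statement request only the first one or two of them:

```sas
/* Sketch: requesting the extended Mantel-Haenszel statistics selectively */
proc freq data=pain order=data;
   weight count;
   tables dgnosis*treatmnt*response / cmh1 noprint;   /* Nonzero Correlation only */
   tables dgnosis*treatmnt*response / cmh2 noprint;   /* adds Row Mean Scores Differ */
   tables dgnosis*treatmnt*response / cmh  noprint;   /* adds General Association */
run;
```

Which request is appropriate depends on the scale requirement in the table: the correlation statistic assumes both variables are ordinal, the mean score statistic assumes only the column (response) variable is ordinal, and general association makes no ordering assumption.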