Model Selections and Comparisons

Page 1: Model Selections and Comparisons

Model Selections and Comparisons

(Categorical Data Analysis, Ch 9.2)

Yumi Kubo, Alvin Hsieh

Model 1

Model 2

Page 2: Model Selections and Comparisons

Survey Data

1992 survey by Wright State University School of Medicine and United Health Services in Dayton, Ohio

• 2276 students in the last year of high school (nonurban area)

• We add more dimensions to the data from Section 8.2.4

• Variables: Alcohol (A), Cigarette (C), Marijuana (M)

• Added variables: Gender (G), Race (R)

Page 3: Model Selections and Comparisons

Association Graphs (Definitions)

• association graph - set of vertices, each vertex is a variable

• edge - conditional association between 2 variables

• path - sequence of edges leading from one variable to another
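These definitions can be sketched in a few lines of base R. The edge set below is an assumption chosen for illustration (the two-factor terms of the model (AC, AM, CM, AG, AR, GM, GR) from the goodness-of-fit table), and `has_path` is a hypothetical helper, not a function from any package.

```r
# Vertices are the five survey variables, using the slides' single-letter codes.
vars <- c("A", "C", "M", "G", "R")

# Edges assumed for illustration: one per two-factor term of the model.
edges <- c("AC", "AM", "CM", "AG", "AR", "GM", "GR")

# Symmetric adjacency matrix: adj[i, j] is TRUE when an edge joins i and j.
adj <- matrix(FALSE, length(vars), length(vars), dimnames = list(vars, vars))
for (e in edges) {
  v <- strsplit(e, "")[[1]]
  adj[v[1], v[2]] <- TRUE
  adj[v[2], v[1]] <- TRUE
}

# Breadth-first search: is there a path from `from` to `to`
# that avoids every vertex listed in `blocked`?
has_path <- function(adj, from, to, blocked = character(0)) {
  visited <- from
  queue <- from
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    if (current == to) return(TRUE)
    nbrs <- setdiff(names(which(adj[current, ])), c(visited, blocked))
    visited <- c(visited, nbrs)
    queue <- c(queue, nbrs)
  }
  FALSE
}

path_c_g <- has_path(adj, "C", "G")                             # C and G are connected
path_c_g_blocked <- has_path(adj, "C", "G", blocked = c("A", "M"))  # every C-G path uses A or M
```

In this edge set there is no edge between C and G, and every path from C to G passes through A or M, so blocking those two vertices disconnects them.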

Page 4: Model Selections and Comparisons

Association Graphs (Saturated)

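[Figure: association graph of the saturated model on the vertices M, A, C, R, G, annotated with the labels Variable, Conditional Association, and Path (M, R, G)]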

Page 5: Model Selections and Comparisons

Association Graphs (Reduced)

[Figure: association graph of the reduced model on the vertices M, A, C, R, G]

Page 6: Model Selections and Comparisons

Data Set

Marijuana Use
=================================================================
                       Race = White            Race = Other
                    ===================     ===================
                    Female      Male        Female      Male
Alcohol  Cigarette  yes   no    yes   no    yes   no    yes   no
yes      yes        405   268   453   228   23    23    30    19
         no         13    218   28    201   2     19    1     18
no       yes        1     17    1     17    0     1     1     8
         no         1     117   1     133   0     12    0     17

Page 7: Model Selections and Comparisons

SAS Program

Too large to place here:

Go to survey.sas

Page 8: Model Selections and Comparisons

R Program

survey <- data.frame(
  expand.grid(cigarette = c("Yes", "No"), alcohol = c("Yes", "No"),
              marijuana = c("Yes", "No"), gender = c("female", "male"),
              race = c("white", "other")),
  count = c(405, 13, 1, 1, 268, 218, 17, 117, 453, 28, 1, 1, 228, 201, 17, 133,
            23, 2, 0, 0, 23, 19, 1, 12, 30, 1, 1, 0, 19, 18, 8, 17))

library(MASS)

fit.GR <- glm(count ~ . + gender*race, data = survey, family = poisson)  # mutual independence + GR
fit.homog.assoc <- glm(count ~ .^2, data = survey, family = poisson)     # homogeneous association
fit.3fact <- glm(count ~ .^3, data = survey, family = poisson)           # all three factor terms

summary(res <- stepAIC(fit.homog.assoc,
                       scope = list(lower = ~ cigarette + alcohol + marijuana + gender*race),
                       direction = "backward"))

fit.AC.AM.CM.AG.AR.GM.GR.MR <- res
fit.AC.AM.CM.AG.AR.GM.GR <- update(fit.AC.AM.CM.AG.AR.GM.GR.MR, ~ . - marijuana:race)
fit.AC.AM.CM.AG.AR.GR <- update(fit.AC.AM.CM.AG.AR.GM.GR, ~ . - marijuana:gender)

Original code (modified for these slides): http://math.cl.uh.edu/~thompsonla/RCode.txt

Page 9: Model Selections and Comparisons

R Program (P-values)

1-pchisq((15.8-15.3),1)

1-pchisq((16.7-15.8),1)

1-pchisq((19.9-16.7),1)

1-pchisq((28.8-19.9),1)

1-pchisq((40.3-28.8),1)
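The same five P-values can be computed in one vectorized call. This is a minimal sketch: the G2 values are the ones typed on this slide, each comparison is assumed to drop a single term (1 df), and the 0.05 cutoff is the default alpha level named on the Model Selection slide.

```r
# G2 values in the order they appear in the slide's differences.
g2 <- c(15.3, 15.8, 16.7, 19.9, 28.8, 40.3)

# Change in G2 between consecutive models; each step removes one term (1 df).
delta_g2 <- diff(g2)
delta_df <- rep(1, length(delta_g2))

# Upper-tail chi-squared probability, equivalent to 1 - pchisq(delta_g2, delta_df).
p_values <- pchisq(delta_g2, df = delta_df, lower.tail = FALSE)

alpha <- 0.05
comparison <- data.frame(delta_g2, delta_df,
                         p_value = signif(p_values, 3),
                         significant = p_values < alpha)
print(comparison)
```

Using `lower.tail = FALSE` avoids the subtraction from 1, which keeps more precision when the P-value is very small.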

Page 10: Model Selections and Comparisons

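Model Selection

1. Select an alpha level (0.05 by default)

2. Look at the P-value for the change in G2 when a term is removed

• Use (in R): 1-pchisq(G2, df), with G2 and df taken as the differences between the two models

3. Stop removing terms once a P-value falls below the alpha level chosen in (1)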

4. Model 1: G+R+A+C+M+GR

5. Model 2: G+R+A+C+M+GR+(all pairs)

Page 11: Model Selections and Comparisons

Model Selection (Continued)

6. Model 3: G+R+A+C+M+GR+(all pairs)+(all 3 factors)

7. Model 4g: lowest change in G2, taking out CR

8. Model 5: lowest change in G2, taking out CG

9. Model 6: lowest change in G2, taking out MR

10. Model 7: lowest change in G2, taking out GM

11. Consider: A+C+M+AC+AM+CM
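The removal steps above can be sketched as a loop around `drop1()`. This is an illustration under stated assumptions, not the slides' own procedure: the R Program slide uses `stepAIC`, which selects by AIC, whereas this sketch uses likelihood-ratio P-values. `backward_lrt` is a hypothetical helper, and the terms in `keep` mirror the lower scope used on the R Program slide.

```r
# Survey counts in the cell order of the R Program slide
# (cigarette varies fastest, then alcohol, marijuana, gender, race).
survey <- data.frame(
  expand.grid(cigarette = c("Yes", "No"), alcohol = c("Yes", "No"),
              marijuana = c("Yes", "No"), gender = c("female", "male"),
              race = c("white", "other")),
  count = c(405, 13, 1, 1, 268, 218, 17, 117, 453, 28, 1, 1, 228, 201, 17, 133,
            23, 2, 0, 0, 23, 19, 1, 12, 30, 1, 1, 0, 19, 18, 8, 17))

# Start from Model 2, the homogeneous association model.
fit2 <- glm(count ~ .^2, data = survey, family = poisson)

# Hypothetical helper: repeatedly drop the term whose removal changes G2 least
# (largest P-value) until the weakest remaining term is significant at `alpha`.
backward_lrt <- function(fit, keep = character(0), alpha = 0.05) {
  repeat {
    candidates <- setdiff(drop.scope(fit), keep)
    if (length(candidates) == 0) break
    tab <- drop1(fit, scope = candidates, test = "Chisq")
    tab <- tab[rownames(tab) != "<none>", , drop = FALSE]
    p <- tab[["Pr(>Chi)"]]
    weakest <- which.max(p)
    if (p[weakest] < alpha) break
    fit <- update(fit, as.formula(paste(". ~ . -", rownames(tab)[weakest])))
  }
  fit
}

keep <- c("cigarette", "alcohol", "marijuana", "gender", "race", "gender:race")
selected <- backward_lrt(fit2, keep = keep, alpha = 0.05)
selected_terms <- attr(terms(selected), "term.labels")
print(selected_terms)
```

Because the stopping rule depends on `alpha`, a different alpha level can end the search at a different model in the sequence.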

Page 12: Model Selections and Comparisons

Goodness-of-Fit Tests (Table 9.2)

(G = Gender, R = Race, A = Alcohol, C = Cigarette, M = Marijuana)

Model                                        G2       df
1.  Mutual independence + GR                 1325.1   25
2.  Homogeneous association                  15.3     16
3.  All three-factor terms                   5.3      6
4a. (2) - AC                                 201.2    17
4b. (2) - AM                                 107.0    17
4c. (2) - CM                                 513.5    17
4d. (2) - AG                                 18.7     17
4e. (2) - AR                                 20.3     17
4f. (2) - CG                                 16.3     17
4g. (2) - CR                                 15.8     17
4h. (2) - GM                                 25.2     17
4i. (2) - MR                                 18.9     17
5.  (AC, AM, CM, AG, AR, GM, GR, MR)         16.7     18
6.  (AC, AM, CM, AG, AR, GM, GR)             19.9     19
7.  (AC, AM, CM, AG, AR, GR)                 28.8     20
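The G2 and df columns correspond to the residual deviance and residual degrees of freedom of a Poisson loglinear fit. The sketch below recomputes three rows as a check. `g2_df` is a hypothetical helper, and the data frame repeats the counts from the R Program slide so that the block runs on its own.

```r
# Survey counts in the cell order of the R Program slide
# (cigarette varies fastest, then alcohol, marijuana, gender, race).
survey <- data.frame(
  expand.grid(cigarette = c("Yes", "No"), alcohol = c("Yes", "No"),
              marijuana = c("Yes", "No"), gender = c("female", "male"),
              race = c("white", "other")),
  count = c(405, 13, 1, 1, 268, 218, 17, 117, 453, 28, 1, 1, 228, 201, 17, 133,
            23, 2, 0, 0, 23, 19, 1, 12, 30, 1, 1, 0, 19, 18, 8, 17))

# Hypothetical helper: fit a loglinear model and return its G2 and df.
g2_df <- function(formula) {
  fit <- glm(formula, data = survey, family = poisson)
  c(G2 = round(deviance(fit), 1), df = df.residual(fit))
}

fit_table <- rbind(
  "Model 1: mutual independence + GR" = g2_df(count ~ . + gender:race),
  "Model 2: homogeneous association"  = g2_df(count ~ .^2),
  "Model 4g: Model 2 without CR"      = g2_df(count ~ .^2 - cigarette:race)
)
print(fit_table)
```

With 32 cells in the table, the residual df is 32 minus the number of fitted parameters, which gives the 25, 16, and 17 shown for these three models.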

Page 13: Model Selections and Comparisons

Thank You!

Any Questions???