HSRP 734: Advanced Statistical Methods June 19, 2008


Page 1: HSRP 734:  Advanced Statistical Methods June 19, 2008

HSRP 734: Advanced Statistical Methods

June 19, 2008

Page 2: HSRP 734:  Advanced Statistical Methods June 19, 2008

Extensions of Logistic Regression

• Outcomes with more than 2 categories

– Categories have order

– Unordered

• Conditional logistic regression

– Analysis of matched data

Page 3: HSRP 734:  Advanced Statistical Methods June 19, 2008

Extensions of Logistic Regression

• Exact methods for small samples

– Fisher’s exact

– Exact logistic regression

• Correlated/Clustered data

– GEE method

– Mixed models

Page 4: HSRP 734:  Advanced Statistical Methods June 19, 2008

Extensions of Logistic Regression

• Outcomes with more than 2 categories

(polytomous or polychotomous)

• Cumulative logit model – Proportional odds model for ordinal outcomes (ordered categories)

• Generalized logit model for nominal outcomes or non-proportional odds models (unordered categories)

Page 5: HSRP 734:  Advanced Statistical Methods June 19, 2008

Extensions of Logistic Regression

• Cumulative logit model

– Fits a logistic regression model with g-1 intercepts for a g category outcome and one model coefficient for each predictor

– Models cumulative probability of being in a “lower” category
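The structure described above (g-1 intercepts, one shared coefficient per predictor) can be sketched in a few lines of Python. This is a minimal illustration, not a fitting routine: the parameterization logit P(Y <= j | x) = alpha_j + beta * x, the function names, and all numeric values are assumptions chosen for the example.

```python
import math

def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def cumulative_probs(intercepts, beta, x):
    """P(Y <= j | x) for j = 1..g-1, assuming logit P(Y <= j | x) = alpha_j + beta * x."""
    return [expit(a + beta * x) for a in intercepts]

def category_probs(intercepts, beta, x):
    """Per-category probabilities obtained by differencing the cumulative ones."""
    cum = cumulative_probs(intercepts, beta, x) + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Hypothetical parameters for a g = 4 category outcome and one predictor.
alphas = [-1.0, 0.5, 2.0]   # g - 1 = 3 increasing intercepts
beta = 0.7                  # single coefficient shared by every cut-point
probs = category_probs(alphas, beta, x=1.0)
print([round(p, 3) for p in probs])
```

The intercepts must be increasing so that the cumulative probabilities increase with the category index; otherwise some category would get a negative probability.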

Page 6: HSRP 734:  Advanced Statistical Methods June 19, 2008

Ordinal Logistic Regression

• Odds ratios take on interpretation “% increase/decrease in the odds of being in a lower/higher category”

• Subject to the “Proportional Odds” assumption
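The proportional odds assumption can be written out explicitly. The notation below (alpha_j for the intercepts, beta for the shared coefficient, and the "lower category" direction of the cumulative logit) is an assumption for illustration and is not taken from the slides.

```latex
\operatorname{logit} P(Y \le j \mid x) = \alpha_j + \beta x, \qquad j = 1, \dots, g-1

\frac{P(Y \le j \mid x+1) \,/\, P(Y > j \mid x+1)}
     {P(Y \le j \mid x) \,/\, P(Y > j \mid x)} = e^{\beta}
\quad \text{for every cut-point } j
```

Because the same odds ratio applies at every cut-point, a single coefficient per predictor is enough; if the odds ratios differ across cut-points, the assumption fails.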

Page 7: HSRP 734:  Advanced Statistical Methods June 19, 2008

Extensions of Logistic Regression

• Generalized logit model

– Fits a logistic regression model with g-1 intercepts and g-1 model coefficients for a g category outcome

– Model captures the multinomial probability of being in a particular category using generalized logits
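A small Python sketch of how generalized logits map to multinomial probabilities. It assumes the last category is the reference (linear predictor fixed at 0); the function name and parameter values are hypothetical.

```python
import math

def generalized_logit_probs(intercepts, slopes, x):
    """Multinomial probabilities for a g-category outcome.

    Each of the first g-1 categories has its own intercept and slope;
    the last category is the reference, with linear predictor 0.
    """
    etas = [a + b * x for a, b in zip(intercepts, slopes)] + [0.0]
    m = max(etas)                          # subtract the max for numerical stability
    weights = [math.exp(e - m) for e in etas]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical parameters for g = 3 categories and one predictor.
g_intercepts = [0.2, -0.5]   # g - 1 intercepts
g_slopes = [0.8, -0.3]       # g - 1 coefficients for the predictor
p = generalized_logit_probs(g_intercepts, g_slopes, x=1.5)
print([round(v, 3) for v in p])
```

Under this setup, log(p[j] / p[-1]) equals the j-th linear predictor, which is why each odds ratio is read as a comparison against the reference category.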

Page 8: HSRP 734:  Advanced Statistical Methods June 19, 2008

Nominal Logistic Regression

• Odds ratios have regular interpretation, just have to be careful with which comparisons are being made (reference category)

• Does not assume “Proportional Odds”

Page 9: HSRP 734:  Advanced Statistical Methods June 19, 2008

SAS

Page 10: HSRP 734:  Advanced Statistical Methods June 19, 2008

Conditional logistic regression

• Can use for matched data (e.g., case-control studies)

• Provides unbiased estimates of odds ratios and confidence intervals (CIs)
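For the special case of 1:1 matched pairs with a single binary exposure, the conditional likelihood estimate of the odds ratio reduces to the ratio of the two kinds of discordant pairs. The sketch below illustrates that special case only; the function name and the pair counts are made up for the example, and a general matched analysis would need a conditional logistic regression routine.

```python
import math

def matched_pairs_odds_ratio(pairs, z=1.96):
    """Odds ratio and Wald CI for 1:1 matched case-control pairs.

    pairs: iterable of (case_exposed, control_exposed) booleans.
    Only discordant pairs contribute to the conditional estimate.
    """
    b = sum(1 for case, ctrl in pairs if case and not ctrl)   # case exposed only
    c = sum(1 for case, ctrl in pairs if ctrl and not case)   # control exposed only
    if b == 0 or c == 0:
        raise ValueError("need both kinds of discordant pairs")
    odds_ratio = b / c
    se_log = math.sqrt(1.0 / b + 1.0 / c)                     # SE on the log scale
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, (lower, upper)

# Hypothetical data: 10 + 12 concordant pairs, 20 and 8 discordant pairs.
pairs = ([(True, True)] * 10 + [(True, False)] * 20 +
         [(False, True)] * 8 + [(False, False)] * 12)
or_hat, ci = matched_pairs_odds_ratio(pairs)
print(round(or_hat, 2), tuple(round(v, 2) for v in ci))
```

The concordant pairs drop out of the estimate entirely, which is the sense in which the analysis is conditional on the matching.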

Page 11: HSRP 734:  Advanced Statistical Methods June 19, 2008

SAS

Page 12: HSRP 734:  Advanced Statistical Methods June 19, 2008

Extensions to Logistic Regression

• Exact Logistic Regression

• Small Sample Size

• Adequate sample size but rare event (sparse data)

Page 13: HSRP 734:  Advanced Statistical Methods June 19, 2008

Fisher’s exact test

• Exact test for RxC table where Chi-square test assumptions are doubtful

• Why not always use Fisher’s exact test and Exact logistic regression?
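As a concrete illustration, here is a minimal pure-Python sketch of the two-sided Fisher's exact test for a 2x2 table (the slides mention general RxC tables; this sketch covers only the 2x2 case). The function name and the example counts are assumptions.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Conditions on both margins and sums the hypergeometric probabilities
    of all tables that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k, given the margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))

# Hypothetical 2x2 table with all margins equal to 4.
p_value = fisher_exact_2x2(3, 1, 1, 3)
print(round(p_value, 4))
```

The test enumerates every table with the same margins, and that enumeration grows quickly with table size and sample size. This is one practical reason exact methods are not used by default.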

Page 14: HSRP 734:  Advanced Statistical Methods June 19, 2008

SAS

Page 15: HSRP 734:  Advanced Statistical Methods June 19, 2008

Extensions of Logistic Regression

• Longitudinal data / repeated measures data / Clustered data with binary outcomes

• Multilevel models (nested data structures)

GEE (Generalized Estimating Equations)

GLMM (Generalized Linear Mixed Models)

Page 16: HSRP 734:  Advanced Statistical Methods June 19, 2008

Two methods for handling clustered outcomes

• Mixed models

– Likelihood based

– Use random effects to model clustered observations

– Continuous outcome (but now extended to categorical outcomes)

• Generalized Estimating Equation (GEE)

– Non-likelihood based

– Can handle large number of clusters

– Categorical outcome

Page 17: HSRP 734:  Advanced Statistical Methods June 19, 2008

GEE

• GEE can be used in

– Longitudinal studies: repeated measures of the same individual form a cluster

– Community studies: subjects clustered by neighborhood

– Familial studies: subjects clustered by family

– Epidemiological studies: different forms of clusters (e.g., pedigree)

Page 18: HSRP 734:  Advanced Statistical Methods June 19, 2008

GEE

• In general GEE has 3 sets of parameters to estimate:

– Regression parameter (population-averaged effects)

– Correlation parameter (cluster parameter)

– Scale factor (not uncommon to assume =1)
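One common way to write the estimating equation shows where the three sets of parameters enter. The notation is an assumption added for illustration and does not come from the slides: K clusters, y_i and mu_i the response and mean vectors for cluster i, beta the regression parameters, alpha the correlation parameters, and phi the scale factor.

```latex
\sum_{i=1}^{K} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) = 0,
\qquad D_i = \frac{\partial \mu_i}{\partial \beta}

V_i = \phi \, A_i^{1/2} \, R(\alpha) \, A_i^{1/2},
\qquad A_i = \operatorname{diag}\bigl( \mu_{ij} (1 - \mu_{ij}) \bigr)
```

Here R(alpha) is the working correlation matrix. Setting it to the identity and phi = 1 recovers the score equations of ordinary logistic regression.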

Page 19: HSRP 734:  Advanced Statistical Methods June 19, 2008

Comparing SLR and GEE

SLR | GEE

No dispersion allowed for variance: Var(y) = mu(1-mu) | Dispersion allowed for variance: Var(y) = mu(1-mu)*scale_factor

No need to specify correlation matrix | Need to specify correlation structure

Has odds ratio interpretation of exp(coefficient) | Has odds ratio interpretation of exp(coefficient)

Page 20: HSRP 734:  Advanced Statistical Methods June 19, 2008

GEE

• In its simplest form, GEE can be considered an extension of logistic regression for clustered data

• Clustered data are common

– Time: Longitudinal analysis with repeated measurements on individual (e.g., BL, 1m, 2m, 6m follow-up)

– Individual: Cross-sectional analysis with multiple outcomes (e.g., left eye, right eye)

– Background: Subjects clustered because of common geographical or social background (e.g., clinic)

Page 21: HSRP 734:  Advanced Statistical Methods June 19, 2008

Correlation structure

• Correlation structure

– Often called the working correlation structure in GEE

– Specifies how the observations within a cluster are related

– Often assumes correlation structure uniform throughout clusters

Page 22: HSRP 734:  Advanced Statistical Methods June 19, 2008

• Unstructured

– All correlation coefficients free to take any value

– E.g.,

1
0.3 1
0.1 0.5 1
0.05 0.2 0.4 1

Page 23: HSRP 734:  Advanced Statistical Methods June 19, 2008

• Exchangeable

– Any two responses within the same cluster have the same correlation

– Simple (1 parameter to estimate)

1
ρ 1
ρ ρ 1
ρ ρ ρ 1

Page 24: HSRP 734:  Advanced Statistical Methods June 19, 2008

• Autoregressive AR(1)

– Correlation between responses depends on the interval of time between responses

– Farther apart responses => weaker correlation

– Only 1 parameter to estimate!

1
ρ 1
ρ² ρ 1
ρ³ ρ² ρ 1
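The exchangeable and AR(1) working correlation structures described above can each be generated from a single parameter. The sketch below builds both as plain nested lists; the function names and the value of rho are assumptions for illustration.

```python
def exchangeable(n, rho):
    """n x n matrix with 1 on the diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

def ar1(n, rho):
    """n x n matrix whose (i, j) entry is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

# Hypothetical correlation parameter for clusters of 4 repeated measures.
rho = 0.5
for name, matrix in (("exchangeable", exchangeable(4, rho)), ("AR(1)", ar1(4, rho))):
    print(name)
    for row in matrix:
        print("  ", row)
```

Under AR(1) the correlation decays with the gap between measurement times. An unstructured matrix for the same cluster size would instead need 4 * 3 / 2 = 6 free parameters.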

Page 25: HSRP 734:  Advanced Statistical Methods June 19, 2008

Correlation matrix

• Selection of a “working correlation structure” is at the discretion of the researcher!

• How does the correlation structure affect the results?

Page 26: HSRP 734:  Advanced Statistical Methods June 19, 2008

Properties of GEE estimators

• What about the estimated variance of the regression coefficients if the “working” correlation matrix is not correctly specified?

• Model-based estimate => not consistent

• Empirical (robust) estimate => still consistent
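The empirical (robust) estimate is often called the sandwich estimator. One standard way to write it is given below. The notation is an assumption for illustration: D_i is the derivative of the cluster mean vector with respect to the regression parameters, V_i is the working covariance matrix, and the sums run over the K clusters.

```latex
\widehat{\operatorname{Cov}}_{\text{model}}(\hat\beta) = B^{-1},
\qquad B = \sum_{i=1}^{K} D_i^{\top} V_i^{-1} D_i

\widehat{\operatorname{Cov}}_{\text{robust}}(\hat\beta) = B^{-1} M B^{-1},
\qquad M = \sum_{i=1}^{K} D_i^{\top} V_i^{-1}
  (y_i - \hat\mu_i)(y_i - \hat\mu_i)^{\top} V_i^{-1} D_i
```

The middle term M uses the observed residuals rather than the assumed covariance, so the robust form remains valid when V_i is misspecified, provided the number of clusters is large enough.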

Page 27: HSRP 734:  Advanced Statistical Methods June 19, 2008

Properties of GEE estimators

• Even if the correlation structure is misspecified, the estimates of the logistic regression coefficients are still consistent

– if correlation misspecified, estimate not as efficient (SE is larger)

– This property contributes to the popularity of GEE

• GEE works well with larger #’s of clusters

Page 28: HSRP 734:  Advanced Statistical Methods June 19, 2008

SAS

Page 29: HSRP 734:  Advanced Statistical Methods June 19, 2008

Review

Page 30: HSRP 734:  Advanced Statistical Methods June 19, 2008