
UNIVERSITÄTSMEDIZIN BERLIN

After Work Statistics

Ulrike Grittner

Annette Aigner

Institute of Biometry and Clinical Epidemiology

ulrike.grittner@charite.de

Institute of Biometry and Clinical Epidemiology

We are…

• … open and helpful!

• … active in statistical methodology research and in medical research

• … active in teaching in many ways

Our Service Unit Biometry

• Free biometrical consulting for all medical research projects, registration online

• “Statistik-Ambulanz” (walk-in service): consultation without prior registration every Tuesday from 9am to 12pm

• Training in biometrical topics and statistical software

• Responsibility for project biometry within collaborative projects

For further information visit us online:

https://biometrie.charite.de/

Contact: Univ.-Prof. Dr. Geraldine Rauch (Head of Institute)

Institut für Biometrie und Klinische Epidemiologie (iBikE)

Mitte site (Charité Campus Mitte): Reinhardtstraße 58, 10117 Berlin

Mitte site (Charité Campus Klinik): Rahel-Hirsch-Weg 5, 10117 Berlin

Slot Topic

1 So many tests! The agony of choice.

2 So many questions! Multiple testing.

3 So many patients? Sample size calculation.

4 What is this odds ratio? Logistic regression.

5 Missing information? Dealing with missing data.

6 The right time? Survival analysis.

7 The variety of influences - Mixed models.

8 Who fits together? Patient matching.



The Diversity of Influences: Mixed Models


Overview

• What do you need mixed models for?

• Basic idea of mixed models

• Intraclass correlation coefficient (ICC)

• ICC, Random Intercept & Random Slope Model

Mixed models

= multilevel models

= hierarchical models

= random effects models

= nested models

Problem

Assumption for most statistical models: independent data

Independent data: association of body height and body weight, cross-sectional (Bundesgesundheitssurvey 1998, RKI, n=7124)

Dependent data: diastolic blood pressure before and after therapy, repeated measures (Holzgreve et al., British Medical Journal 299, 881-886, 1989, 1 study arm: n=169)

Problem

Often we have a mixture of independent and dependent data (clusters)

Examples

- Individual measures in different clusters (grouped data)

- Repeated measures in individuals

Grouped Data

Example: Mathematics achievement of pupils in different schools

Assumption:

• Math achievement of pupils within one school is more similar than math achievement of pupils from different schools

• Dependent data within schools, independent data across schools

[Diagram: school 1, school 2, …, school N, each containing its own pupils (p. 1, p. 2, …, p. n1 in school 1; p. 1, p. 2, …, p. n2 in school 2; p. 1, p. 2, …, p. nN in school N)]

Repeated measures

Example: longitudinal study, repeated measures within individuals

Assumptions:

• Measures of one individual are more similar than measures of different individuals

• Dependent data within individuals, independent data across individuals

[Diagram: Ind. 1, Ind. 2, …, Ind. N, each with its own measurement times (t 1, t 2, …, t n1 for Ind. 1; t 1, t 2, …, t n2 for Ind. 2; t 1, t 2, …, t nN for Ind. N)]

Example

• Study of math performance of 9th graders across 160 schools (7185 pupils)

Study “High School and Beyond” 1982 (Raudenbush & Bryk 2002)

data(MathAchieve) in R (shipped with package nlme)

• Research question: Is there an association between the socio-economic status (SES) of the pupils and their math performance? (SES: score based on the education and income of the parents)

ID   School  Min.  Sex     SES     MathAch
478  1499    No    Female  -0.678   5.608
479  1499    No    Female  -0.158  18.352
480  1499    No    Female  -0.468   5.949
481  1499    Yes   Female  -0.148  -1.462
482  1499    Yes   Female  -0.928   4.087
483  1499    No    Female  -0.218  10.258
484  1499    No    Male     0.662  21.791
485  1499    No    Male    -0.228   1.365
486  1499    No    Female  -0.368   1.730
487  1499    Yes   Female   0.342   5.093
…

Simple Linear Regression

Data of 8 schools (342 pupils)

$Y_i = \alpha + \beta \cdot X_i + e_i$

$\text{Math\_Perf}_i = \text{mean\_Math} + \beta \cdot \text{SES\_Score}_i + e_i$

Problem: Similarity in math performance of pupils in same school is ignored
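To make this problem concrete, here is a minimal Python sketch (standard library only, not the R analysis behind the slides). It simulates pupils nested in schools and fits one pooled regression line. It then compares the spread of the school-level mean residuals with what independent data would produce. The variance components (8.5 between schools, 33.4 residual) are borrowed from the 8-school example in these slides; the function names, the standard normal SES scores, the numbers of schools and pupils, and the seed are illustrative assumptions.

```python
import random
from statistics import mean, pvariance

N_SCHOOLS, N_PUPILS = 160, 45   # assumed sizes, roughly like the study

def simulate_schools(alpha=10.1, beta=2.7, tau00=8.5, sigma2=33.4, seed=1):
    """Simulate pupils nested in schools with a school-specific shift u_j."""
    rng = random.Random(seed)
    rows = []
    for school in range(N_SCHOOLS):
        u_j = rng.gauss(0, tau00 ** 0.5)          # shared by all pupils of one school
        for _ in range(N_PUPILS):
            ses = rng.gauss(0, 1)                 # assumed SES distribution
            e_ij = rng.gauss(0, sigma2 ** 0.5)    # pupil-level noise
            rows.append((school, ses, alpha + u_j + beta * ses + e_ij))
    return rows

def ols(xs, ys):
    """Least-squares intercept and slope of a simple linear regression."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

rows = simulate_schools()
a, b = ols([r[1] for r in rows], [r[2] for r in rows])

# Residuals of the pooled regression, collected per school.
resid_by_school = {}
for school, ses, y in rows:
    resid_by_school.setdefault(school, []).append(y - (a + b * ses))

all_resid = [e for v in resid_by_school.values() for e in v]
observed = pvariance([mean(v) for v in resid_by_school.values()])
expected_if_independent = pvariance(all_resid) / N_PUPILS

print("variance of school mean residuals:", round(observed, 2))
print("expected under independence:      ", round(expected_if_independent, 2))
```

If the residuals were independent, the two printed numbers would be of similar size. With clustered data the first one is much larger, because pupils of one school share the same school effect.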

Key-Message 1:

Simple linear regression models are not appropriate for grouped/clustered data.

Mixed = Fixed and Random Effects

• Fixed Effects:

- Allow statements on general associations (regression coefficients)

- Interpretation as in “normal” regression models

• Random Effects:

- Account for dependency between measures within a cluster

- Account for heterogeneity between clusters (clusters are independent of each other)

- Variance estimation for each level + residual variance

- Random intercept / random slope

Fixed vs Random Effects

Fixed Effects:

- Effects we are interested in (research question)

Example: SES

- Not randomly chosen

- Would be chosen again for another study

- Differences among measures are useful information

Random Effects:

- Not directly of interest

Example: Schools

- Randomly chosen

- Different schools would be chosen for another study

- Differences between schools are often not of interest

Two Main Models

Random Intercept Model

• Individual intercepts (means) for each cluster

• Association of independent and dependent variable is constant across clusters (example: association between SES and math performance)

Random Intercept and Random Slope Model

• Individual intercepts (means) for each cluster

• Strength of association of independent and dependent variable varies across clusters, individual slopes for each cluster

http://mfviz.com/hierarchical-models/

Random Intercept Model (fixed slope)

$y_{ij} = \alpha_0 + u_{0j} + \beta_1 \cdot x_{ij} + \varepsilon_{ij}$

Fixed Effects:

$\alpha_0$: intercept

$\beta_1$: fixed effect of x on y

Random Effects:

$\varepsilon_{ij}$: residual of observation i in cluster j, $\varepsilon_{ij} \sim N(0, \sigma^2)$

$u_{0j}$: residual of cluster j, $u_{0j} \sim N(0, \tau_{00})$

Mixed effects:

$(\alpha_0 + u_{0j})$: intercept of cluster j

• Individual intercepts for each cluster

• Association of x and y is fixed across all clusters

$\text{Math score}_{ij} = (10.1 + u_{0j}) + 2.7 \cdot \text{SES}_{ij} + \varepsilon_{ij}$

Random effects:

$u_{0j} \sim N(0, 8.5)$, $\varepsilon_{ij} \sim N(0, 33.4)$

library(lme4)
fit <- lmer(MathAch ~ SES + (1 | School), data = dat_8schools)
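The fitted equation can be read as one line per school: same slope, shifted intercept. A minimal Python sketch of that reading follows. It uses the two fixed-effect estimates from the slide (10.1 and 2.7); the function name and the school shifts of +2 and -2 points are hypothetical values chosen for illustration, not estimates from the data.

```python
ALPHA0 = 10.1   # overall intercept reported on the slide
BETA1 = 2.7     # fixed SES slope reported on the slide

def predict_math(ses, u0j=0.0):
    """Predicted math score for a pupil with the given SES in a school
    whose intercept deviates by u0j from the overall intercept."""
    return (ALPHA0 + u0j) + BETA1 * ses

# Two hypothetical schools: 2 points above and 2 points below the overall mean.
for u0j in (2.0, -2.0):
    predictions = [round(predict_math(ses, u0j), 2) for ses in (-1.0, 0.0, 1.0)]
    print("u0j =", u0j, "->", predictions)
```

At any SES value the two schools differ by the same 4 points, which is exactly what "random intercept, fixed slope" means.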

Key-Message 2:

Random intercept models account for differences in mean outcome between clusters.


Interpretation

• A 1 point higher SES score is associated with, on average, 2.7 points higher math performance

• Mean math performance across all schools (at SES = 0): 10.1 points

• Schools differ with regard to mean math performance (variance = 8.5 points²)

                                         Estimate (95% CI)
Fixed Effects
Intercept                                10.1 (7.8; 12.3)
SES                                      2.7 (1.8; 3.7)
Random Effects
Variance between schools ($\tau_{00}$)   8.5 (2.3; 26.2)
Residual variance ($\sigma^2$)           33.4 (28.7; 38.9)
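A short worked calculation may help to read the between-school variance. Assuming, as the model does, that the school effects are normally distributed, about 95% of the school-specific intercepts (the expected score at SES = 0) should lie within 1.96 standard deviations of the overall intercept:

```latex
\[
\alpha_0 \pm 1.96\sqrt{\tau_{00}}
  = 10.1 \pm 1.96\sqrt{8.5}
  \approx 10.1 \pm 5.7
  \quad\Rightarrow\quad [4.4,\ 15.8]
\]
```

This range describes the spread of schools around the mean. It is not the confidence interval of the intercept, which is much narrower (7.8; 12.3).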

Intraclass correlation coefficient (ICC)

$\mathrm{ICC} = \dfrac{\tau_{00}}{\tau_{00} + \sigma^2}$

$\tau_{00}$: variance between clusters

$\sigma^2$: variance within clusters = residual variance

ICC = proportion of the total variance that is due to differences between clusters

• ICC = 0 … no variance between clusters, all cluster means are equal

• ICC = 1 … all variance is explained by clusters, measures within a cluster are equal

$\mathrm{ICC} = \tau_{00} / (\tau_{00} + \sigma^2) = 8.5 / (8.5 + 33.4) = 0.20$

20% of the total variance in math performance is due to differences between schools
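The same arithmetic as a small Python helper, as a sketch only: the function name and the input check are assumptions, and the inputs are the two variance components reported on the slide.

```python
def icc(tau00, sigma2):
    """Intraclass correlation: between-cluster variance over total variance."""
    if tau00 < 0 or sigma2 < 0 or tau00 + sigma2 == 0:
        raise ValueError("variances must be non-negative and not both zero")
    return tau00 / (tau00 + sigma2)

print(round(icc(8.5, 33.4), 2))        # variance components of the 8-school example
print(icc(0.0, 33.4), icc(8.5, 0.0))   # the two boundary cases ICC = 0 and ICC = 1
```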


Key-Message 3:

The ICC is a measure of the proportion of the total variance in the outcome that is due to differences between clusters.
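The ICC can also be estimated directly from raw clustered data. The Python sketch below does this with the classical one-way ANOVA estimator for equally sized clusters. This is a different, simpler technique than the mixed model fit used in these slides, and it ignores covariates. The data are simulated with the slide's variance components (true ICC about 0.20); the function name, cluster sizes, and seed are assumptions.

```python
import random
from math import sqrt
from statistics import mean

def anova_icc(clusters):
    """One-way ANOVA estimate of the ICC for equally sized clusters."""
    k, n = len(clusters), len(clusters[0])
    if any(len(c) != n for c in clusters):
        raise ValueError("this sketch assumes equal cluster sizes")
    means = [mean(c) for c in clusters]
    grand = mean(means)   # equals the overall mean when cluster sizes are equal
    # Mean square between clusters and mean square within clusters.
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    msw = sum((v - m) ** 2 for c, m in zip(clusters, means) for v in c) / (k * (n - 1))
    return (msb - msw) / (msb + (n - 1) * msw)

# Simulate 500 clusters of 20 observations with tau00 = 8.5 and sigma2 = 33.4.
rng = random.Random(7)
clusters = []
for _ in range(500):
    u_j = rng.gauss(0, sqrt(8.5))
    clusters.append([10.1 + u_j + rng.gauss(0, sqrt(33.4)) for _ in range(20)])

estimate = anova_icc(clusters)
print("estimated ICC:", round(estimate, 2))
```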

Random Intercept and Random Slope

$y_{ij} = \alpha_0 + u_{0j} + (\beta_1 + u_{1j}) \cdot x_{ij} + \varepsilon_{ij}$

Fixed Effects:

$\alpha_0$: intercept

$\beta_1$: fixed effect of x on y

Random Effects:

$\varepsilon_{ij}$: residual of observation i in cluster j, $\varepsilon_{ij} \sim N(0, \sigma^2)$

$u_{0j}$: random effect of the intercept of cluster j, $u_{0j} \sim N(0, \tau_{00})$

$u_{1j}$: random slope of cluster j, $u_{1j} \sim N(0, \tau_{11})$

Mixed effects:

$(\alpha_0 + u_{0j})$: intercept of cluster j

$(\beta_1 + u_{1j})$: slope of cluster j

Individual intercepts AND slopes for each cluster
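The model equation can be turned into a data-generating sketch in Python. Each cluster draws its own intercept deviation and slope deviation, then produces observations around its own line. The function name, all parameter values, and the seed are hypothetical. The sketch also assumes that the intercept and slope deviations are uncorrelated, and it calls the slope variance tau11.

```python
import random
from math import sqrt

def simulate_cluster(rng, alpha0, beta1, tau00, tau11, sigma2, xs):
    """Draw one cluster from a random intercept and random slope model.
    Returns the cluster's deviations (u0j, u1j) and its outcomes ys."""
    u0j = rng.gauss(0, sqrt(tau00))   # intercept deviation of this cluster
    u1j = rng.gauss(0, sqrt(tau11))   # slope deviation of this cluster
    ys = [(alpha0 + u0j) + (beta1 + u1j) * x + rng.gauss(0, sqrt(sigma2))
          for x in xs]
    return u0j, u1j, ys

rng = random.Random(42)
days = list(range(10))
for j in range(3):
    # Hypothetical parameter values, chosen only for illustration.
    u0j, u1j, ys = simulate_cluster(rng, 250.0, 10.0, 600.0, 36.0, 650.0, days)
    print("cluster", j, "intercept", round(250.0 + u0j, 1), "slope", round(10.0 + u1j, 1))
```

Every cluster gets its own intercept and its own slope, while alpha0 and beta1 describe the average line across clusters.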

Sleepstudy (changes in reaction time after restriction of sleep time, data in R package lme4)

Design: longitudinal, 18 individuals, 10 days

Day 0: normal sleep duration

From day 1 to the end of the study: only 3 hours of sleep per night

Outcome: mean reaction time per day

Research question: How does restricted night sleep influence people's reaction ability?

[Figure: Sleepstudy (reaction time after sleep restriction); panels: Random Intercept; Random Intercept and Random Slope]

Random Intercept and Random Slope

library(lme4)
lmer(Reaction ~ Days + (1 | Subject) + (0 + Days | Subject), data = sleepstudy)

                     Estimate (95% CI)
Fixed Effects
(Intercept)          251.4 (237.6; 265.2)
Days                 10.5 (7.3; 13.6)
Random Effects (estimates are variances; the 95% CIs are on the standard deviation scale)
Intercept subjects   627.6 (15.3; 37.8)
Slope Days           35.9 (4.0; 8.8)
Residual variance    653.6 (22.9; 28.8)

Interpretation:

• The reaction time increases on average by 10.5 ms per day

• There are differences between the individuals in the mean reaction time (variance: 627.6 ms²)

• There are differences between the individuals in the increase of the reaction time (slope) over the study period (variance: 35.9 (ms/day)²)

source: sleepstudy data (R package lme4)
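The estimates in the table can be turned into concrete numbers with a few lines of Python. This is a sketch only: it takes the estimates from the table above, assumes normally distributed random effects (as the model does), and the function names are illustrative.

```python
from math import sqrt

INTERCEPT, SLOPE = 251.4, 10.5            # fixed effects (ms, ms per day)
VAR_INTERCEPT, VAR_SLOPE = 627.6, 35.9    # random-effect variances from the table

def average_reaction(day):
    """Population-average reaction time (ms) on a given study day."""
    return INTERCEPT + SLOPE * day

def range_95(center, variance):
    """Range expected to contain about 95% of the subject-specific values."""
    half_width = 1.96 * sqrt(variance)
    return center - half_width, center + half_width

print("average reaction time on day 9:", round(average_reaction(9), 1))
print("95% range of subject intercepts:", [round(v, 1) for v in range_95(INTERCEPT, VAR_INTERCEPT)])
print("95% range of subject slopes:    ", [round(v, 1) for v in range_95(SLOPE, VAR_SLOPE)])
```

The slope range reaches slightly below zero. Under this model, most subjects slow down with continued sleep restriction, but a few may show almost no change.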

Key-Message 4:

Random Intercept & Slope Models account for mean differences and different slopes in the outcome between clusters.

… More Mixed Models (mm)

Other scales of the dependent variable:

– dichotomous (binary logistic mm)

– ordinal (ordinal mm)

– multinomial (multinomial logistic mm)

Other models:

– Survival (frailty models, joint models)

– Models that account for spatial correlations

– GAMMs: Generalized Additive Mixed Models (using smooth functions)

Applications:

– cross-over designs

– meta-analyses

– multi-centre studies

– multi-rater settings

– >2 levels

Practical Hints

1. Before doing elaborate statistical analyses: use descriptive statistics + graphs => KNOW YOUR DATA!!!

2. Be economical with random effects (use only a few)

3. Try to explain as much of the variance of the outcome as possible through the fixed effects

4. Check, using 1. and the interpretation of the models, whether the regression coefficients are reasonable!
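As an illustration of hint 1, the Python sketch below computes simple descriptive statistics per cluster before any model is fitted. The ten rows are copied from the MathAchieve excerpt shown earlier (all from school 1499); the function name and the choice of summaries (n, mean, SD) are assumptions about what a first look could include.

```python
from statistics import mean, stdev

# (school, SES, MathAch) for the ten pupils in the excerpt shown earlier.
rows = [
    (1499, -0.678, 5.608), (1499, -0.158, 18.352), (1499, -0.468, 5.949),
    (1499, -0.148, -1.462), (1499, -0.928, 4.087), (1499, -0.218, 10.258),
    (1499, 0.662, 21.791), (1499, -0.228, 1.365), (1499, -0.368, 1.730),
    (1499, 0.342, 5.093),
]

def describe_by_cluster(rows):
    """Return {cluster: (n, mean outcome, SD outcome)} as a first descriptive check."""
    groups = {}
    for cluster, _ses, outcome in rows:
        groups.setdefault(cluster, []).append(outcome)
    return {
        cluster: (len(v), mean(v), stdev(v) if len(v) > 1 else float("nan"))
        for cluster, v in groups.items()
    }

summary = describe_by_cluster(rows)
for cluster, (n, m, sd) in summary.items():
    print("school", cluster, "n =", n, "mean =", round(m, 2), "SD =", round(sd, 2))
```

With the full data set, the same summary shows at a glance how much the cluster means differ and whether some clusters are very small.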

References

• Snijders TAB, Bosker RJ (1999) Multilevel analysis. An introduction to basic and advanced multilevel modeling. Sage Publications, London

• Verbeke G, Molenberghs G (2000) Linear mixed models for longitudinal data. Springer, New York

• Petrie A, Sabin C (2013) Medical Statistics at a Glance. John Wiley & Sons

• … in ecology with R. Springer, New York
