
Multi-Site Time Series Studies: Effect of Air Pollution on Morbidity and Mortality (in Pittsburgh)

Francesca Dominici

Department of Biostatistics, Harvard School of Public Health

May 7, 2013

Outline

• What is a multi-site time series study?

• Which statistical methods do we use to analyze a multi-site time series study?

• What are the benefits of multi-site versus single-site studies?

• What are the strengths/limitations of multi-site versus meta-analysis?

• What do we know now about the short-term effects of air pollution?

Single-city time series studies in the U.S.

• Steubenville, OH: Schwartz 1992

• Philadelphia, PA: Kelsall et al. 1997

• Birmingham, AL: Schwartz 1993

• Utah Valley: Pope et al. 1992

National Data Bases

• Health data (Medicare, NCHS): 120 GB

• Exposure data (EPA): 2 GB

• Weather data (NOAA): 5 GB

Daily time series data linked by county

National Morbidity Mortality Air Pollution Study, 1987-2006

What is a multi-site time series study?

NMMAPS 1987—2006

• 108 urban communities (including Pittsburgh)

• Cause-specific mortality data from NCHS

– all-cause (non-accidental), CVD, respiratory, COPD, pneumonia, accidental

• Weather from NWS

– Temperature, dew point, relative humidity

• Air pollution data from the EPA

– PM10, PM2.5, O3, NO2, SO2, CO

• U.S. Census 1990, 2000

Methods

Multi-site time series models of air pollution and mortality

• Stage 1 (within city): Poisson regressions for estimating short-term association between air pollution and mortality, controlling for time-varying confounders

• Stage 2 (between cities): Hierarchical model for pooling information across neighboring cities and obtaining a national average effect
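As a rough illustration of Stage 1, the sketch below fits a log-linear Poisson regression of daily deaths on a pollutant series for one simulated city. It is a simplified stand-in, not the NMMAPS model: it has only an intercept and a pollutant term, whereas a real analysis would add smooth functions of time and weather to control confounding. The data, the true coefficients, and the function names are all assumptions made for the example.

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def information(b0, b1, x):
    """Fitted means and Fisher information entries for log E[y] = b0 + b1 * x."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    i00 = sum(mu)
    i01 = sum(m * xi for m, xi in zip(mu, x))
    i11 = sum(m * xi * xi for m, xi in zip(mu, x))
    return mu, i00, i01, i11

def fit_poisson(x, y, n_iter=25):
    """Newton-Raphson fit; returns (b0, b1, variance of b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the log mean count
    for _ in range(n_iter):
        mu, i00, i01, i11 = information(b0, b1, x)
        # Score vector: gradient of the Poisson log-likelihood.
        g0 = sum(yi - m for yi, m in zip(y, mu))
        g1 = sum((yi - m) * xi for yi, m, xi in zip(y, mu, x))
        det = i00 * i11 - i01 * i01
        # Newton step: add (information matrix)^-1 times the score.
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    _, i00, i01, i11 = information(b0, b1, x)
    var_b1 = i00 / (i00 * i11 - i01 * i01)
    return b0, b1, var_b1

# Hypothetical city: 2000 days, about 20 deaths per day, made-up true effect.
rng = random.Random(1)
TRUE_B0, TRUE_B1 = math.log(20.0), 0.002
pm = [rng.uniform(0.0, 40.0) for _ in range(2000)]
deaths = [rpois(math.exp(TRUE_B0 + TRUE_B1 * p), rng) for p in pm]

b0_hat, b1_hat, var_b1 = fit_poisson(pm, deaths)
se_b1 = math.sqrt(var_b1)
print(f"estimated log-rate coefficient: {b1_hat:.5f} (SE {se_b1:.5f})")
```

The pair (`b1_hat`, `var_b1`) is what each city would pass on to Stage 2: a city-specific estimate and its within-city variance.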

Confounding bias

• The association between air pollution and mortality is potentially confounded by:

– Weather: mortality is higher at low and high temperatures

– Seasonality: e.g. mortality generally peaks in winter because of influenza epidemics

– Long-term trends: e.g. improvements in medical practice, lower mortality over time

• All these phenomena cannot be attributed to air pollution

[Figure: CVD per 100,000 per day by date, 2002 to 2008, with smooth trends at 2 df / year, 6 df / year, and 24 df / year]

[Figure: PM2.5 concentration by date, 2002 to 2008, with smooth trends at 2 df / year, 6 df / year, and 24 df / year]

[Figure: estimate + bounds (% increase per 10 units PM2.5), from -3 to 3, plotted against degrees of freedom per year, from 0 to 100]
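The unit "% increase per 10 units PM2.5" can be connected to a regression coefficient as follows. This is a minimal sketch that assumes the estimate comes from a log-linear Poisson model, so that the coefficient `beta` is a log rate ratio per unit of pollutant; the function names and the 1.96 multiplier for approximate 95% bounds are choices made for the example.

```python
import math

def percent_increase(beta, delta=10.0):
    """Percent change in the daily rate for a delta-unit rise in the pollutant."""
    return 100.0 * (math.exp(beta * delta) - 1.0)

def percent_interval(beta, se, delta=10.0, z=1.96):
    """Approximate bounds, computed on the log scale and then transformed."""
    return (percent_increase(beta - z * se, delta),
            percent_increase(beta + z * se, delta))

# Hypothetical coefficient and standard error, per unit of PM2.5.
beta, se = 0.001, 0.0005
print(percent_increase(beta), percent_interval(beta, se))
```

For small coefficients the result is close to `100 * beta * delta`, which is why such estimates are often read directly as percentages.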

To Pool or Not to Pool?

• Individual cities can be selected to show one point or another

• Results from individual cities are much more sensitive to model assumptions

• Results from individual cities are swamped by statistical error

• There is no reason to expect that two neighboring cities with similar sources of particles would have qualitatively different relative risks

Pooled estimate from multi-site time series studies

• It does provide:

– evidence of short-term associations between particulate matter and mortality, on average across locations

– an estimate of the excess number of deaths associated with short-term air pollution exposure

• It does not provide:

– an estimate of how premature these deaths are

– an estimate of the extra deaths associated with a sustained exposure, which are unrelated to the time of the air pollution episode

Sensitivity of the national average lag effect of PM10 on mortality to different statistical models to adjust for confounding (NMMAPS 1987-2000)

Peng, Dominici, Louis, JRSSC 2006

Reported estimate

Different statistical models to adjust for confounding

weak moderate strong

Using information only at the very short time scales

Using information at the short and long (trend and seasonality) time scales

Meta-analysis or multi-site time series study?

Meta-analysis versus multi-site time series study (Bell et al 2005)

This indicates that the lag with the highest effect is more likely to have been reported
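A small simulation can show why reporting the lag with the highest effect inflates a meta-analytic average. This is an illustrative sketch, not a reanalysis of Bell et al.: it assumes the same true effect at every lag, independent Gaussian estimation error, and made-up values for the effect size, standard error, and number of lags.

```python
import random
import statistics

rng = random.Random(0)
TRUE_EFFECT = 0.5   # hypothetical % increase, identical at every lag
SE = 0.4            # hypothetical standard error of a single-city estimate
N_LAGS = 3          # lags 0, 1, 2
N_STUDIES = 5000

def simulate_study():
    """One single-city study: a noisy estimate at each lag."""
    return [rng.gauss(TRUE_EFFECT, SE) for _ in range(N_LAGS)]

studies = [simulate_study() for _ in range(N_STUDIES)]

# Multi-site style: the same lag (here lag 1) is fixed in advance for every city.
prespecified = statistics.mean(s[1] for s in studies)

# Meta-analysis of published results: each study reports its largest lag.
selected = statistics.mean(max(s) for s in studies)

print(f"prespecified lag: {prespecified:.3f}, best lag reported: {selected:.3f}")
```

Under these assumptions the average of the selected estimates sits well above the true effect, while the prespecified-lag average does not.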

What are the benefits of multi-site versus single-site studies?

• The primary advantages of a meta-analysis or multi-city study over a single-city estimate are:

1. the statistical power gained from aggregating multiple estimates

2. the generation of an overall effect estimate

3. the possibility of exploring heterogeneity of the effect across locations

What are the benefits of multi-site time series studies versus meta-analysis?

• In the meta-analytic approach, the independently conducted single-city studies generally differ with respect to the specification of the statistical model, approaches to addressing confounding by weather and long-term trends, and adjustment for additional pollutants, complicating the interpretation of the overall effect.

• In addition, meta-analyses are subject to publication bias, and the degree of publication bias is difficult to quantify.

The Evidence

PM10 Mortality NMMAPS

Seasonally varying effect of PM10 at lag 1 by region, 100 U.S. cities, 1987—2000

[Figure: seasonal curves (Jan, Apr, July, Oct) of the % increase in mortality with a 10 µg/m³ increase in PM10 at lag 1, y-axis from -0.5 to 1.5; panels: Industrial Midwest, North East, North West, Southern California, South East, South West, Upper Midwest, All Regions]

The National Medicare Cohort Study, 1999-2008 (MCAPS)

• Medicare data include:

– Billing claims for everyone over 65 enrolled in Medicare (~48 million people), with:

• date of service

• disease (ICD-9)

• age, gender, and race

• place of residence (zip code)

• Approximately 204 counties linked to the PM2.5 monitoring network

MCAPS study population: 204 counties with populations larger than 200,000 (11.5 million people)

PM2.5 and Admissions (Dominici et al., JAMA 2006)

PM10-2.5 and Admissions (Peng et al., JAMA 2008)

US EPA PM Fact Sheet 2006: To better protect public health, EPA issued the Agency's most protective suite of national air quality standards for particle pollution ever

• Only seven of the 52 components contributed 1% or more to total mass for yearly or seasonal averages

1. OCM

2. Sulfate

3. Nitrate

4. EC

5. Silicon

6. Sodium Ion

7. Ammonium

Chemical composition data on PM2.5

[Pie chart: share of PM2.5 mass by component]

OC 33%

SO4= 30%

NO3- 14%

NH4+ 12%

EC 5%

Other 4%

Si 1%

Na+ 1%

Exposure data: Chemical composition data on PM2.5 from the STN network

1. Constructed a database of time series data for 52 PM2.5 chemical constituents from over 250 STN monitors for 2000 to 2008

2. Identified a subset of PM2.5 components that substantially contribute and/or co-vary with daily PM2.5 concentrations

3. Constructed a database that links by zip code the chemical composition data to human health data

Bell et al EHP 2007

PM2.5 chemical components and mortality rates: 1999-2008

National average estimates and 95% posterior intervals for the percent increase in hospital admissions for cardiovascular diseases per 1 IQR increase in each of the seven PM2.5 components, 119 U.S. counties, 2000-2006.

Peng et al submitted

Peng et al 2008, EHP

Concluding Thoughts

• Evidence that:

– PM2.5 effects vary by season and region, as does PM2.5 chemical composition

– Some PM sources and components are more harmful than others

• True harmful characteristics of PM not fully understood

• Policy challenge: which sources are most harmful?

• Many challenges remain for study of health and PM, as well as pollution mixtures in general

Questions?

Stage 1: City-specific model

Poisson regression model

Pollutant series

$\hat{b}_c$: estimated relative rate for city c

$b_c$: true relative rate for city c

Stage 2: Pooling information across cities

$a$: true national-average relative rate

$\hat{b}_c - a = (\hat{b}_c - b_c) + (b_c - a)$

Within city: $(\hat{b}_c - b_c)$. Across cities: $(b_c - a)$.

Stage 2: Pooling information across cities

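$\hat{b}_c = b_c + N(0, v_c)$

$b_c = a + N(0, t^2)$

where $\hat{b}_c$ is the city-specific MLE, $b_c$ is the city-specific true effect, $a$ is the national average, $t^2$ is the between-city variance (heterogeneity), and $v_c$ is the within-city variance of the MLE.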
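The two-level model above implies a simple weighted average when the between-city variance is treated as known. The sketch below is a plug-in version of that calculation, offered as an assumption-laden illustration rather than the full hierarchical analysis, which would also account for uncertainty in the between-city variance. The function names and numbers are hypothetical.

```python
def pool(b_hat, v, t2):
    """Pooled national average and its variance, for a given between-city variance t2.

    Each city is weighted by the inverse of its total variance v_c + t2.
    """
    w = [1.0 / (vc + t2) for vc in v]
    a_hat = sum(wc * bc for wc, bc in zip(w, b_hat)) / sum(w)
    var_a = 1.0 / sum(w)
    return a_hat, var_a

def shrink(b_hat_c, v_c, a_hat, t2):
    """City-specific estimate pulled toward the pooled average.

    The weight on the city's own MLE grows as t2 becomes large relative to v_c.
    """
    weight = t2 / (t2 + v_c)
    return weight * b_hat_c + (1.0 - weight) * a_hat

# Hypothetical city-specific MLEs and within-city variances.
b_hat = [0.2, 0.6, 1.0]
v = [0.04, 0.04, 0.04]
a_hat, var_a = pool(b_hat, v, t2=0.01)
print(a_hat, var_a, shrink(1.0, 0.04, a_hat, t2=0.01))
```

When `t2` is zero every city is shrunk all the way to the pooled average; when `t2` is large relative to `v_c`, each city keeps roughly its own estimate.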

[Figure: three histograms of t, x-axis from 0 to 7, y-axis frequency]

What is heterogeneity?

• It is the variance across cities of the true (not the estimated) air pollution effects

• The problem is that we do not see the true effects!
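Because only the estimates are observed, the between-city variance has to be inferred by subtracting the expected statistical noise from the spread of the estimates. The sketch below uses the DerSimonian-Laird method-of-moments estimator together with Cochran's Q, which is a common frequentist shortcut and is swapped in here for illustration; it is not the posterior-based approach shown in these slides. The inputs are hypothetical.

```python
def dersimonian_laird(b_hat, v):
    """Return (Q, t2): Cochran's Q statistic and a moment estimate of t2.

    b_hat: city-specific estimates; v: their within-city variances.
    """
    w = [1.0 / vc for vc in v]
    sw = sum(w)
    # Fixed-effect average: what the pooled value would be if t2 were zero.
    a_fixed = sum(wc * bc for wc, bc in zip(w, b_hat)) / sw
    # Q measures the spread of the estimates relative to their sampling noise.
    q = sum(wc * (bc - a_fixed) ** 2 for wc, bc in zip(w, b_hat))
    k = len(b_hat)
    c = sw - sum(wc * wc for wc in w) / sw
    # Under homogeneity Q is about k - 1; any excess is attributed to t2.
    t2 = max(0.0, (q - (k - 1)) / c)
    return q, t2

# Hypothetical estimates whose spread is larger than sampling noise alone.
q, t2 = dersimonian_laird([0.2, 0.6, 1.0], [0.04, 0.04, 0.04])
print(q, t2)
```

With equal within-city variances the estimate reduces to the sample variance of the estimates minus the common sampling variance, truncated at zero.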

What do the data say about heterogeneity?

• Small (there is evidence that it might be zero)

• Medium (still including evidence that it might be zero)

• Large (no evidence of homogeneity)

How do we know whether city-specific short-term effects of air pollution are truly different across cities?

[Figure: three panels of county-specific estimates $\hat{b}$, x-axis from -40 to 40, plotted by county, y-axis from 0 to 200]



Weather

Posterior distribution of the pooled effect

[Figure: posterior distributions of the pooled effect a, x-axis from 0.5 to 2.5, under large, medium, and small heterogeneity; legend: High, Medium, Low]



Seasonal and long-term trends
