Airports,Air Pollution, and Contemporaneous Healthws2162/articles/SchlenkerWalker.pdf ·...
Transcript of Airports,Air Pollution, and Contemporaneous Healthws2162/articles/SchlenkerWalker.pdf ·...
Airports, Air Pollution, and Contemporaneous Health
Wolfram Schlenker♣ and W. Reed Walker♠
July 2015
Abstract
We link daily air pollution exposure to measures of contemporaneous health for communities sur-rounding the 12 largest airports in California. These airports are some of the largest sources ofair pollution in the United States, and they experience large changes in daily air pollution emis-sions depending on the amount of time planes spend idling on the tarmac. Excess airplane idling,measured as residual daily taxi time, is due to network delays originating in the Eastern UnitedStates. This idiosyncratic variation in daily airplane taxi time significantly impacts the health oflocal residents, largely driven by increased levels of carbon monoxide (CO) exposure. We use thisvariation in daily airport congestion to estimate the population dose-response of health outcomesto daily CO exposure, examining hospitalization rates for asthma, respiratory, and heart relatedemergency room admissions. A one standard deviation increase in daily pollution levels leads to anadditional $540 thousand in hospitalization costs for respiratory and heart related admissions forthe 6 million individuals living within 10km (6.2 miles) of the airports in California. These healtheffects occur at levels of CO exposure far below existing EPA mandates, and our results suggestthere may be sizable morbidity benefits from lowering the existing CO standard.
We would like to thank Antonio Bento, Janet Currie, Ryan Kellogg, Mushfiq Mobarak, MatthewNeidell, Marit Rehavi, Jay Shimshack, and Christopher Timmins for comments on an earlier ver-sion of this paper. All remaining errors are ours. Walker acknowledges support from the RobertWood Johnson Foundation.
♣ Columbia University and NBER. Email: [email protected].
♠ University of California at Berkeley and NBER. Email: [email protected].
The effect of pollution on health remains a highly debated topic. The US Clean Air Act (CAA)
requires the Environmental Protection Agency (EPA) to develop and enforce regulations to protect
the general public from exposure to airborne contaminants that are known to be hazardous to
human health. In January 2011, the EPA decided against lowering the existing CAA carbon
monoxide standard due to insufficient evidence that relatively low carbon monoxide levels adversely
affect human health. In order to assess the benefits of lowering the standard, accurate estimates are
needed that link contemporaneous air pollution exposure to observable health outcomes at levels
of pollution currently faced by local populations. However, these estimates are hard to come by
as pollution is rarely randomly assigned across individuals, and individuals who live in areas of
high pollution may be in worse health for reasons unrelated to pollution. Preferences for clean
air may covary with unobservable determinants of health (e.g., exercise) which can lead to various
forms of omitted variable bias in regression analysis. Moreover, heterogeneity across individuals
in either preference for, or health responses to, ambient air pollution implies that individuals may
self-select into locations on the basis of these unobserved differences. In both cases, estimates
of the health effects of ambient air pollution may reflect the response of various subpopulations
and/or spurious correlations pertaining to omitted variables. While recent research attempts to
address the issue of non-random assignment using various econometric tools such as fixed effects or
instrumental variables, these studies often focus on infant health over longer periods of time (Chay
& Greenstone 2003, Currie & Neidell 2005). Much less is known about short-term, daily effects
of ambient air pollution on the health of the more general population, such as the non-elderly,
non-child, adult population.1
We develop a framework for estimating the contemporaneous effect of air pollution on health
using variation in local air pollution driven by airport runway congestion. Airports are one of the
largest sources of air pollution in the United States with Los Angeles International Airport (LAX)
being the largest source of carbon monoxide in the state of California (Environmental Protection
Agency 2005). A large fraction of airport emissions come from airplanes, with the largest aggregate
channel of emissions stemming from airplane idling (Transportation Research Board 2008). We
show that airport runway congestion, as measured by the total time planes spent taxiing between
the gate and the runway, is a significant predictor of local pollution levels. Since local runway
congestion may be correlated with other determinants of pollution such as weather, we exploit the
fact that California airport congestion is driven by network delays that began in large airports
outside of California.2 A recent article in the New York Times (New York Times January 27, 2012)
1There is a larger literature in epidemiology which focuses on daily responses to air pollution (see e.g. Ito, Thurston& Silverman (2007), Linn et al. (1987), Peel et al. (2005), Schildcrout et al. (2006), Schwartz et al. (1996)). The workin our paper complements the existing epidemiological literature by focusing on issues pertaining to measurementerror, avoidance behavior, and self-selection bias in the context of susceptibility to pollution exposure. Each of theseissues is critically important to providing unbiased estimates of the causal relationship between pollution and health.The instrumental variables approach in this paper exploits arguably exogenous pollution shocks that are unlikely to beknown by local residents, allowing us to simultaneously address issues of measurement error and avoidance behavior.Recent work in economics and environmental health, discussed in more detail below, suggests that short run variationin pollution exposure may be significant predictors of mortality and morbidity (Moretti & Neidell 2011, Knittel, Miller& Sanders 2011).
2This relationship is well known within the transportation literature (Welman, William & Hechtman 2010). Opti-
1
provides a useful motivation:
[Airplane] delays ripple across the country. A third of all delays around the nation each
year are caused, in some way, by the New York airports, according to the F.A.A. Or,
as Paul McGraw, an operations expert with Airlines for America, the industry trade
group, put it, “When New York sneezes, the rest of the national airspace catches a
cold.”
Our analysis hence links health outcomes of residents living near California airports to changes in
air pollution driven by runway congestion at airports on the East Coast. The identifying variation
in California pollution is caused by events several thousand miles away (e.g., weather in Atlanta),
which is unlikely to be correlated with determinants of health in California.
The goal of this paper is to identify the ways in which short run, daily variation in air pollution
affects population health. In doing so, this paper makes four primary contributions to the existing
literature in this area. First, while most existing literature focuses on the health impacts of infants
or elderly, we are able to examine the health responses of the entire population. We find that infants
as well as the elderly are most sensitive to ambient air pollution. At the same time, a one-unit
increase in pollution has much larger aggregate effects for adults aged 20-64, given their large share
of the overall population. Studies that focus on infants or the elderly significantly underestimate
overall health effects.
The second contribution of this paper is to estimate the contemporaneous effect of multiple pol-
lutants simultaneously. It has traditionally been difficult to decipher which pollutant is responsible
for adverse health outcomes since short-term fluctuations among ambient air pollutants are highly
correlated. Our solution to this identification problem is to rely on the fact that wind speed and
wind direction transport individual pollutants in different ways. By using interactions between taxi
time, wind speed, and wind angle from airports, we can pin down the direct effect of each pollutant,
while holding the others constant. We use over-identified models to instrument for several pollu-
tants simultaneously, an approach that was simultaneously developed in related work by Knittel,
Miller & Sanders (2011). We find that CO is responsible for the majority of the observed increase
in hospital admissions, although we cannot rule out that this may be driven by other unobserved
pollutants that are correlated with airplane-driven CO emissions. This finding has direct policy
implications. The EPA recently decided to maintain the current CO pollution standard, citing a
lack of evidence that reducing CO below current ambient levels would improve population health
outcomes.
We believe there are two additional features that set this paper apart from existing work in
both economics and epidemiology. Our paper is most closely related to the recent work of Moretti
& Neidell (2011) and Knittel, Miller & Sanders (2011) who also instrument daily pollution in
health regressions with variation in local transportation conditions (i.e. container shipping in Long
mal airplane scheduling incorporates anticipated ripple effect. For example, Pyrgiotisa, Maloneb & Odoni (2013) usequeuing theory to simulate how delays propagate through the system. They quote a study that found a multipliereffect of seven, i.e., each 1-hour delay of a particular airplane leads to a combined 7 hours delay for the airline.
2
Beach, CA and automobile congestion in Central and Southern CA, respectively). Relative to these
papers and the existing literature, we believe this paper is the first to use the network structure of
transportation to generate local variation in congestion that is driven by events that occur several
thousand miles away. This matters because one of the key drivers of transportation congestion is
local weather, and local weather is also likely to affect ambient pollution, violating the identifying
assumptions of the model. By way of example, we show that instrumenting local airport congestion
with network delays that are not correlated with local weather doubles our point estimates, relative
to the baseline case. We also explicitly model the spatial dispersion of air pollution emissions, as
it varies with wind speed, wind direction, and distance from the airport. Pollutant transport is
very locally heterogeneous, and failing to account for this spatial heterogeneity leads to bias when
estimating the population dose-response function.
The fourth contribution of this study is the use of newly available Emergency Discharge Data
to better capture the morbidity impacts of air pollution. Previous research has predominantly
focused on the effects of pollution on mortality or morbidity as measured in Inpatient Discharge
Records. Inpatient Discharge data consist only of observations for patients that stayed overnight
in a hospital, and thus exclude a large fraction of respiratory-related emergency room admissions
that do not require overnight hospital visits. We show that estimates using the more commonly
used Inpatient Discharge data substantially underestimate the morbidity impacts of air pollution,
relative to estimates from the combined Emergency Discharge and Inpatient Discharge datasets.3
In summary, our approach combines newly available data with arguably exogenous daily changes
in air pollution that originate several thousand miles away and are unknown to the local popula-
tion. The instrumental variables setting allows us to simultaneously address issues pertaining to
both avoidance behavior and classical forms of measurement error, each of which lead to significant
downward bias in conventional dose-response estimates. The primary estimation framework exam-
ines how zip code level emergency room admissions covary with these quasi-experimental increases
in air pollution stemming from airports.
We find that a one standard deviation increase in daily pollution explains roughly one third of
average daily admissions for asthma problems. It leads to an additional $540 thousand per day
in hospitalization costs for respiratory and heart related admissions of individuals within 10km
of one of the 12 largest airports in California. This is likely a significant lower bound of the
social costs as the willingness to pay to avoid a sickness might be significantly larger than the
medical reimbursement cost (Grossman 1972). Our baseline IV estimates are an order of magnitude
larger than uninstrumented fixed effects estimates, highlighting the importance of accounting for
measurement error and/or avoidance behavior in conventional estimators. We find no evidence that
airport runway congestion affects diagnoses unrelated to air pollution such as bone fractures, stroke,
or appendicitis. We also present a variety of evidence in favor of a non-linear dose-response function.
As pollution levels increase the marginal effect of a 1 unit increase in pollution increases but at a
3Knittel, Miller & Sanders (2011) focus on infant mortality, while Moretti & Neidell (2011) examine morbidityoutcomes, but only for individuals of Emergency Room visits that eventually get admitted to an overnight stay asthe authors rely on the Inpatient Discharge data.
3
decreasing rate. This is consistent with thresholds at which the health effect of air pollution levels
off (i.e. the dose-response function is not convex over the levels of pollution we observe), and along
the lines of what research in epidemiology has observed (Pope et al. 2009, Pope III et al. 2011).
We present several sensitivity checks of results that do not alter our conclusions. For example,
we focus on morning airport congestion in the East since it is possible that California airport delays
impact airports on the East Coast, which then feedback to California airports. Due to the difference
in time zones, very few flights from California reach East Coast airports before 12pm. Estimates
remain similar to our baseline estimates. A distributed lag model finds no evidence for delayed
impacts or forward displacement, i.e., that individuals on the brink of an asthma or heart attack
may experience an episode that would have otherwise occurred in the next few days anyway. A
Poisson model linking sickness counts to pollution levels gives comparable estimates to our baseline
linear probability model, which does not account for the truncation of daily sickness rates at zero.
Lastly, we find little evidence of treatment effect heterogeneity that would raise concerns pertaining
to forms of self-selection bias and/or the external validity of the underlying dose-response estimates.
The findings in this paper suggest that daily variation in ambient air pollution has economically
significant health effects at levels below current EPA mandates, at least for the population that
comprises our study. We believe this is particularly important due to the fact that in January 2011,
the EPA decided against lowering the existing CAA carbon monoxide standard due to insufficient
evidence that relatively low carbon monoxide levels adversely affect human health. The maximum
hourly CO concentration in our data is 7.5ppm (see Appendix Table A2), which is below the
ambient air quality standard of 35ppm for any 1-hour reading or 9ppm for any 8-hour average, i.e.,
air quality levels were always within the limit.4 Yet, fluctuations in pollution levels significantly
below the standard still have sizable health consequences. While a full-fledged benefit-cost analysis
would have to balance the cost of reducing CO against the benefits, EPA indicated there were no
appreciable benefits from lowering the standard to begin with, which we find not to be the case.
1 Background: Airports, Airplanes, and Air Pollution
Regulators have long been aware of the pollution generated by cars, trucks, and public transit.
There have been countless legislative policies designed to curtail harmful emissions from these
sources (Auffhammer & Kellogg 2011). However, aircraft and airport emissions have only recently
become the subject of regulatory scrutiny, although little has been done to reduce or manage
emissions generated by airports and air travel. While there has been some effort to curtail the
substantial CO2 emissions generated by aircraft,5 there has been relatively little effort to control or
contain some of the more pernicious air pollutants generated by jet engines. This lack of regulatory
scrutiny can be traced back to the way in which pollutants are regulated in the United States
4The same is not true for NO2. The maximum 1-hour reading in our data is 136ppb, which is above the 1-hourstandard of 100ppb.
5The European Union has recently approved greenhouse gas measures, which oblige airlines, regardless of nation-ality, that land or take off from an airport in the European Union to join the emissions trading system starting onJanuary 1, 2012.
4
under the Clean Air Act. Current Federal law preempts all federal, state, and local agencies
except the Federal Aviation Administration from establishing measures to reduce emissions from
aircraft due to potential interstate and international commerce conflicts that might arise from other
decentralized regulations.6
Aircraft jet engines, like many other mobile sources, produce carbon dioxide (CO2), nitrogen
oxides (NOx), carbon monoxide (CO), oxides of sulfur (SOx), unburned or partially combusted
hydrocarbons (also known as volatile organic compounds, or VOCs), particulates, and other trace
compounds (Federal Aviation Administration 2005a). Each of these pollutants is emitted at differ-
ent rates during various phases of operation, such as idling, taxing, takeoff, climbing, and landing.
NOx emissions are higher during high power operations like takeoff when combustor tempera-
tures are high. On the other hand, CO emissions are higher during low power operations like
taxiing when combustor temperatures are low and the engine is less efficient (Federal Aviation
Administration 2005a).7 Even though the aircraft engine is often idling during taxi-out, the per
minute CO and NOx emissions factors are higher than at any other stage of a flight (Environmental
Protection Agency 1992). Combining this with the long duration of taxi-out times during peak
periods of the day, total taxiing over the course of a day can add up to a substantial amount.
Consistent with these facts, Los Angeles International airport is estimated to be the largest point
source of CO emissions in the state of California, the second largest of NOx, the twenty-ninth
largest of SO2, and the 2763 and 2782 largest of PM10 and PM2.5, respectively (Environmental
Protection Agency 2005).
Airports provide a particularly compelling setting through which to estimate the contempo-
raneous relationship between air pollution and health. Not only are airports some of the largest
polluters of ambient air pollution in the United States but they also have extraordinarily rich data
on daily operating activity, detailing for each domestic flight the length of time spent taxiing to
and from the gate before takeoff and after landing. This allows for a precise understanding of the
aggregate amount of daily runway congestion at airports. Moreover, daily runway congestion at
airports exhibits a great degree of residual variation even after controlling for normal scheduling
patterns. Much of the variation in runway congestion is driven by network delays propagating from
major airport hub delays thousands of miles away. Network delays at distant airports serve as an
ideal instrumental variable for local pollution; the effect of a snow storm in Chicago on conges-
tion at LAX should be orthogonal to any other confounding influences of air pollution in the Los
Angeles area. In addition, local residents are likely unaware of increases in taxi time and hence
cannot engage in self-protective behavior. Lastly, every airport has detailed weather data, allowing
researchers to exploit the spatial distribution of airport-generated pollution. We can therefore es-
timate how areas downwind of an airport on a given day are disproportionately affected by runway
6Currently, the Environmental Protection Agency has an agreement with the FAA to voluntarily regulate groundsupport equipment at participating airports known as the Voluntary Airport Low Emission (VALE) program (UnitedStates Environmental Protection Agency 2004).
7As a result, reducing engine power for a given operation like takeoff or climb out generally increases the rate ofCO emissions and reduces the rate of NOx emissions.
5
congestion relative to areas upwind. Understanding this spatial variation in pollutant transport
improves the efficiency of our estimates, while also providing important tests of the validity of our
research design.
2 Data
This project uses the most comprehensive data currently available on airport traffic, air pollution,
weather, and daily measures of health in California. This data is rich in both temporal and
spatial dimension, allowing for fine-grained analysis of how daily airport congestion impacts areas
downwind of an airport on a given day. The various datasets and linkages are described in more
detail below.
2.1 Airport Traffic Data
A useful feature of a study involving airports is the detailed nature of daily flight data. The
Bureau of Transportation Statistics (BTS) Airline On-Time Performance Database contains flight-
level information by all certified U.S. air carriers that account for at least one percent of domestic
passenger revenues. It has a wealth of information on individual flights: flight number, the origin
and departure airport, scheduled departure and arrival times, actual departure and arrival times,
the time the aircraft left the runway and when it touches down. We construct a daily congestion
measure for each of the 12 major airports in California by aggregating the combined taxi time of
all airplanes at an airport. This measure consists of (i) the time airplanes spend between leaving
the gateway and taking off from the runway and (ii) the time between landing and reaching the
gate. An interesting feature of aggregate daily taxi time is the large amount of residual variation
remaining after controlling for daily airport scheduling, weather, and holidays. We relate this
variation to local measures of pollution and health in our econometric analysis. One caveat of the
BTS data is that it only includes information for major domestic airline passenger travel.8 As long
as international flights are not treated differently in the queuing system and are hence colinear to
the taxi time of domestic flights, congestion of national flights should be a good proxy for overall
congestion.
We limit our analysis to the 12 largest airports in California by passenger count. These airports
are in alphabetical order (including airport call sign in brackets): Burbank (BUR), Los Angeles
International (LAX), Long Beach (LGB), Oakland International (OAK), Ontario International
(ONT), Palm Springs (PSP), San Diego International (SAN), San Francisco International (SFO),
San Jose International (SJC), Sacramento International (SMF), Santa Barbara (SBA), and Santa
Ana / Orange County (SNA). The locations of these airports are shown as dots in Figure 1.
Average flight statistics at each of these airports are reported in Table A1 of the appendix. There
8In January 2005, international departures (both cargo and passenger) accounted for 8.5% of total departures,whereas cargo (both international and domestic) accounted for 5.9% of all United States airport departures (Depart-ment of Transportation 2009).
6
is significant variation in daily ground congestion at airports: the standard deviation of daily taxi
time at the largest airport (LAX) is 1852 minutes. Once we account for year, month, weekday and
holiday fixed effects as well as local weather, the remaining variation is still 891 minutes. Most of
the airports are close to urban areas as they serve the travel needs of these populations. Seven
airports in California rank among the top 50 busiest airports in the nation according to passenger
enplanement (Federal Aviation Administration 2005b).
A potential concern when linking daily airport activity to daily ambient air pollution levels is
that runway congestion in California airports may be highest in the late afternoon and evening.
This would lead us to erroneously misclassify some of the daily airport effects to the wrong day.
Appendix Figure A2 plots the distribution of aggregate taxi time within a day. Most ground
activity at airports is skewed towards the beginning of the day. We will address the sensitivity
of our estimates towards these issues of misclassification or across-day spillovers in subsequent
sections.
2.2 Pollution Data
We construct daily measures of air pollution surrounding airports using the monitoring network
maintained by the California Air Resource Board (CARB). This database combines pollution read-
ings for all pollution monitors administered by CARB, including information on the exact location
of the monitor. Data includes both daily and hourly pollution readings. We concentrate on the
set of monitors with hourly emission readings for CO, NO2, and O3 in the years 2005-2007.9 The
locations of all CO and NO2 monitors in relation to airports are shown in Figure 1.
A unique feature of pollution data is the significant number of missing observations in the
database. We therefore use the following algorithm when we aggregate the hourly data to daily
pollution readings: Our measure of the daily maximum pollution reading is simply the maximum
of all hourly pollution readings. The daily mean is the duration-weighted average of all hourly
pollution readings. We define the duration as the number of hours until the next reading.10 We
prefer this approach to simply taking the arithmetic average of all hourly readings on a day since
hourly pollution data exhibit great temporal dependence. A missing hourly observation is better
approximated by the previous non-missing value than the daily average. We also keep track of the
number of observations per day. In a sensitivity check (not reported) we rerun the analysis using
only monitors with at least 20 or 12 readings per day.11
9While data exists for other pollutants in California, we limit our analysis to using CO, NO2 as they are directlyemitted by airplanes and have better coverage than PM10. O3 forms from VOC and NOx. In a sensitivity check wedo not find that O3 pollution levels are impacted by airport congestion. Nevertheless, we present sensitivity analysesthat include O3 and PM10 as controls with little effect on our results. While monitor data exists as far back as 1993,portions of our hospital data, described further in this section, exists only from 2005 onwards.
10Readings occur on the hour of each day ranging from midnight to 11pm. If readings at the beginning of a day(midnight, 1am, etc.) are missing, we adjust the duration of the first reading from midnight to the second reading.For example, if readings occur on 3am, 5am, and 8am, the 3am reading would be assigned a duration of 5 hours andthe 5am reading would be assigned a duration of 3 hours. By the same token, if the last reading of a day is not 11pm,the duration of that last reading is from the time of the reading until midnight.
11If a monitor has not a single reading for a day, we approximate it’s value in a three step procedure: (i) we derive
7
We create daily zip code pollution measures by taking the average monitor reading of all mon-
itors within 15km of a zip code centroid, weighting by the inverse distance between the monitor
and the zip code centroid.12 Summary statistics are given in Panel A of Table A2 in the appendix.
Since we have both the longitude and latitude of all airports and zip code centroids, we are able to
derive (i) the distance between the airport and a zip code, and (ii) the angle at which the zip code
is located relative to the airport. In order to leverage the spatial features of our data, we normalize
the angle between a zip code centroid and an airport to 0 if the zip code is lying to the north of
the airport. Degrees are measured in clockwise fashion, e.g., a zip code that is directly east of an
airport will have an angle of 90 degrees. The angle between an airport and a zip code allows us to
explore the link between airport emissions and pollution downwind of airports using the weather
data described next.
2.3 Weather Data
We use temperature, precipitation, and wind data in our analysis to both control for the direct
effects of weather on health (Deschenes, Greenstone & Guryan 2009) and also to leverage the
quasi-experimental features of wind direction and wind speed in distributing airport pollution from
airports. Our weather data comes from Schlenker & Roberts (2009), which provides minimum and
maximum temperature as well as total precipitation at a daily frequency on a 2.5×2.5 mile grid for
the entire United States.13 To assign daily weather observations to an airport or zip code, we use
the grid cell in which the zip code centroid is located. Summary statistics for the zip-code level
data are given in Panel B of Table A2 in the appendix.
Average wind speed and wind direction come from the National Climatic Data by the National
Oceanic and Atmospheric Administration’s (NOAA) hourly weather stations. Most airports have
weather stations with hourly readings. We construct wind direction, which is normalized to equal
zero if the wind is blowing northward and counted in clockwise fashion. If the angles of the zip code
and the wind direction are identical, the zip code is hence exactly downwind from the airport. An
angle of 180 degrees implies that the zip code is upwind from the airport. The hourly wind speed
and wind direction is aggregated to the daily level by calculating the duration-weighted average
between readings comparable to the pollution data above. The distribution of wind directions is
shown in Figure 2. Airports at the ocean predominantly have winds coming from the direction of
the ocean. For example, Santa Barbara, located on the only portion of the California coast that
the cumulative density function (cdf) at each monitor; (ii) take the inverse-distance weighted average of the cdf fora given day at all monitors with non-missing data; (iii) we fill the missing observation with the same percentile ofthe station’s cdf. For example, if surrounding monitors with non-missing data on average have pollution levels thatcorrespond to the 80th percentile of their respective distributions, we fill the missing value of a station with the 80thpercentile of it’s own distribution of pollution readings. This procedure gives us a balanced panel.
12Inverse distance weighting pollution measures has been used to impute pollution in previous research. See forexample, Currie & Neidell (2005).
13There is one exception: in a set of regression models where we estimate the effect of airport weather on taxi timewe use the closest non-missing daily weather station data from NOAA’s COOP station data set for each airport. Thisis because Schlenker & Roberts (2009) use a spatial interpolation procedure that might result in artificial correlationbetween weather data at airports due to the spatial interpolation technique.
8
runs east-west has winds blowing northward. Note again that we are measuring the direction in
which the wind is blowing, not from which it is coming. In our empirical analysis, we use this daily
variation in wind speed and wind direction to predict how pollution from airports disproportionately
impacts some zip codes more than others on a given day.
2.4 Hospital Discharge and Emergency Room Data
Health effects are measured by overnight hospital admission and emergency room visits to any
hospital in the state of California. We use the California Emergency Department & Ambulatory
Surgery data set for the years 2005-2007.14 The dataset gives the exact admission date, the zip code
of the patient’s residence (as well as the hospital), the age of the patient, as well as the primary
and up to 24 secondary diagnosis codes. An important limitation of the Emergency Department
data is that any person who visits an ER and is subsequently admitted to an overnight stay drops
out of the dataset. This is done to prevent double counting in California’s hospital admissions
records, as overnight hospital stays are logged in California’s Inpatient Discharge data. Therefore
we also obtained Inpatient Discharge data for all individuals who stayed overnight in a hospital
in the years 2005-2007. In our baseline model we focus on the sum of emergency room visits and
overnight stays in a zip code-day to avoid non-random attrition in the ER data. Focusing only on
emergency room admittance would suffer from selection bias as higher pollution levels (and more
severe health outcomes) could result in more overnight stays, yet the emergency room numbers
would actually appear smaller.
We count the daily admissions of all people in a zip code who had a diagnosis code pertaining to
three respiratory illnesses: asthma, acute respiratory, and all respiratory. Note that each category
adds additional sickness counts but includes the previous. For example, asthma attacks are also
counted in all respiratory problems. We also count heart related problems, which Peters et al.
(2001) have shown to be correlated with pollution. Finally, we include three placebos: stroke, bone
fractures, and appendicitis.15 In our baseline model, we count a patient as suffering from a sickness
if either the primary or one of the secondary diagnosis codes lists the illness in question.
We merge the zip code level hospital data with age-specific population counts in each zip code
obtained from both the 2000 and 2010 Censuses. We use the weighted average between the 2000
(weight 0.4) and 2010 (weight 0.6) counts, as the midpoint of our data is 2006. We limit our analysis
to the 164 zip codes whose centroid lies within 10km of an airport and which have at least 10000
inhabitants.16 The total population of these 164 zip codes is around 6 million people, or roughly
one sixth of the overall population of California at the time. Summary statistics for the zip codes
in the study are given in Panel C of Appendix Table A2. We use these age-specific population
14The Emergency Room data was not collected prior to 2005.15The exact ICD-9 codes are: asthma: [493, 494); acute respiratory: [460,479), [493,495), [500,509), [514,515),
[516,520); all respiratory: [460, 520); heart problems: [410, 430); stroke [430, 439); bone fractures [800, 830);appendicitis: [540, 544).
16The latter sample restriction excludes 0.8 percent of the total population that lives in a zip code whose centroidis within 10km of an airport but has less than 10000 inhabitants.
9
counts to construct daily hospitalization rates for each zip code. Table A3 provides sickness rates
per 10 million inhabitants for both the entire population as well as population subgroups of those
over 65 years of age and under 5 years of age.
2.5 External Validity - Populations Close to Airports
Our analysis focuses on areas within 10km of airports. This raises the broader question as to
how our estimated results generalize to populations outside of the 10km airport radius. Table A4
investigates this question by examining zip code characteristics from the 2000 Census. We present
three comparisons: First, we look at zip codes that are in our sample in columns (1a)-(1c) but
divide them into zip codes whose centroids are within [0,5]km and (5,10]km of an airport. Second,
we compare zip codes within 10km of an airport versus neighboring zip codes that are between
10 and 20km of an airport in columns (2a)-(2c). Third, we compare zip codes within 10km of an
airport to all other zip code in California in columns (3a)-(3c).
For the first two sets of comparisons, few comparison tests are significant, roughly at a rate that
should happen due to randomness. In other words, areas [0,5]km from an airport are comparable
to areas (5,10]km or (10,20]km.17 On the other hand, the third set of comparisons shows that areas
within 10km are not comparable to the rest of the state of California, which includes more rural
areas. Zip codes closer to airports are on average more urban, more populated, wealthier, and
have higher housing prices. Therefore, we would caution against interpreting the estimated dose
response relationship as representative for the entire population at large. From the standpoint of
airport externalities, the population close to airports is the population of interest. Moreover, much
of the air pollution regulation in the United States is spatially targeted towards urban areas (i.e.
those areas with higher degrees of ambient air pollution), and in that case, these estimates may be
more appropriate for regulatory analysis than a dose response function averaged over individuals
in both urban and rural locations.
3 Empirical Methodology
We are estimating the link between ground level airport congestion, local pollution levels, and
contemporaneous hospitalization rates for major airports in the state of California. To begin, we
consider the effects of increased levels of airport traffic congestion on local measures of pollution.
3.1 Aggregate Daily Taxi Time and Local Pollution Levels
Ambient air pollution is a function of the distance between a point source and the receptor loca-
tion, as well as many other atmospheric variables including, but not limited to, wind speed, wind
direction, humidity, temperature, and precipitation. To model the effects of increases in aggregate
1747% of Californians live in a zip code within 20km of an airport.
10
airport taxi time on pollution levels, we adopt the following additive linear regression model
Model 1: pzat = α1Tat +WztΦ+ weekdayt +montht + yeart + holidayt︸ ︷︷ ︸ZztΓ
+νza + ezat (1)
where pollution pzat in zip code z that is paired with airport a on day t is specified as a function
of taxi time Tat and a vector of zip-code level controls Zzt that include weather controls Wzt.18
Our baseline regressions include 17 weather controls: A quadratic in minimum and maximum
temperature, precipitation and wind speed (8 terms) as well as 9 terms for wind direction that are
included in equation (3) below.19 To model this relationship formally, we define wind direction by
the cosine of the difference between the wind direction and the direction in which the zip code is
located. The variable will be equal to 1 in the case that the angle in which the wind is blowing
equals the direction in which the zip code is located, and the variable will be equal to zero when they
are at a right angle (the difference is 90 degrees). The vector Wzt includes all possible time-varying
interactions between distance, wind speed and angle (up and downwind) to control for pollution
formation not directly influenced by taxi time. We also control for temporal variation in pollution
by including weekday fixed effects (weekdayt), month fixed effects (montht), and year fixed effects
(yeart) as well holiday fixed effects (holidayt) to limit the influence of airport congestion outliers.20
In a sensitivity check (available upon request), we instead include day fixed effects, i.e., one for
each of the 1095 days, and the results remain robust. Since there may be time-invariant unobserved
determinants of pollution for any given zip code, all regressions include zip code fixed effects, νza.
The parameter of interest is α1, which tells us the effect of a 1000 minute increase in aggregate
daily ground congestion on local ambient air pollution levels. Increased airplane taxiing leads to an
increase in airplane emissions and presumably increases in ambient air pollution. Hence, we would
expect this coefficient to be positive.
We also estimate models similar to equation (1), where we interact taxi time (or instrumented
taxi time) with the distance between an airport and the monitor. The idea would be to allow the
marginal effect of taxi time to differ based on monitors that were closer relative to further from the
airport. This results in the following equation:
Model 2: pzat = α1Tat + α2Tatdza + ZztΓ+ νza + ezat (2)
The additional coefficient is α2. The effect of taxi time on pollution should fade out with distance,
and we would hence expect this coefficient to be negative. The marginal effect of taxi time in model
18In principle a zip-code z could be paired with more that one airport a. In practice, our baseline model uses zipcodes whose centroid is within 10km of an airport. Each zip code is assigned to exactly one airport as none is within10km of two airports.
19Specifically, our weather controls include the terms corresponding to α3, α4 and α6 −α12 in equation (3) withoutthe interaction with taxi time Tat. Results are robust to different functional forms of weather control variables.Additionally, we have estimated models that exclude all weather controls, and the coefficients for our primary pollutantof interest (CO see below) are not significantly affected (although the standard errors increase).
20We include fixed effects for New Year, Memorial Day, July 4th, Labor Day, Thanksgiving, and Christmas, as wellas the three days preceding and following the holiday.
11
2 is α1 + α2dza.
In a third step we also include interactions with wind direction and wind speed. The intuition is
that both wind direction and speed transport airport emissions across space. Thus, holding speed
constant, areas downwind should be relatively more affected by aggregate daily taxi time relative to
areas upwind. To model this relationship formally, we let vat be the wind speed and czat the cosine
of the difference between the wind direction and the direction in which the zip code is located,
which can differ upwind czat > 0 and downwind czat < 0.21 Allowing for all possible time-varying
interactions we get:22
Model 3: pzat = α1Tat + α2Tatdza + α3TatczatI[czat>0] + α4TatczatI[czat<0]
+α5Tatvat + α6TatdzaczatI[czat>0] + α7TatdzaczatI[czat<0]
+α8Tatdzavat + α9TatczatI[czat>0]vat + α10TatczatI[czat<0]vat
+α11TatdzaczatI[czat>0]vat + α12TatdzaczatI[czat<0]vat
+ZztΓ+ νza + ezat (3)
The new coefficients are α3 through α12. The predicted signs of these coefficients are less intuitive.
While higher wind speeds can clear the air they may also carry greater amounts of the pollutant
further distances.23 Moreover, downwind areas should have higher pollution levels relative to those
areas upwind, but aircrafts usually start against the wind. To better interpret the combination of all
of these interactions, we plot the marginal effects of this particular regression model using contour
plots in subsequent sections. These contour plots provide strong visual evidence of the relationship
between daily aggregate airport taxi time, wind speed, wind direction, and local pollution levels.
One potential cause for concern in equations (1)-(3) are any omitted transitory determinants of
local pollution levels that may also covary with ground congestion. If such omitted variables exist,
then least squares estimates of the coefficients on taxi time (e.g. α1) will be biased. This could
occur, for example, if weather adversely affected airport activity while also affecting local pollution
levels. To address this potential source of bias, we need an instrumental variable that is correlated
with changes in ground congestion at an airport but is unrelated to local levels of pollution. A
natural instrument comes from delays at major airport hubs outside California, which propagate
through the air network as connecting flights are delayed, leading to more ground congestion at
airports in California. The basic logic is that instead of smoothing out scheduling over the course
of the day, planes now arrive in more distinct blocks of time, leading to more waiting/taxiing by
those planes taking off as the runway space is shared. Specifically, we instrument taxi time at each
California airport with taxi time at major airports outside of California (Atlanta (ATL), Chicago
21The cosine is 0 if the angle is 90degrees, i.e., the separately estimated effect is different upwind and downwind.22The exact dispersion of pollution depends on additional factors like acceleration and height of emissions. Benson
(1984) presents a formal model of pollution dispersion around roads that includes many variables we do not observe.The standard in the literature has hence been to estimate reduced-form relationship with wind direction and speed(Batterman, Zhang & Kononowech 2010).
23Recall that we are already controlling for overall wind speed in Wzt, but it has so far not been interacted withtaxi time or any other weather measure.
12
O’Hare (ORD), and New York John F. Kennedy (JFK)), in the following first stage regression:24
Tat = αa0 +
3∑
k=1
12∑
a=1
αakTktIa + ZatΘ+ ωat (4)
Appendix Figure A1 shows the location of those airports in relation to the California airports.
We allow the coefficients αak in equation (4) to vary by airport a by interacting taxi time with
an airport indicator Ia. These interactions allow for heterogeneity in the impact of delays from
major airports outside of California Tkt on each of the California airports Tat. This is important
as the impact of delays in Atlanta on California airports is likely to differ across airports. Our
baseline model utilizes 36 instruments (3 airports outside California interacted with each of the 12
airports in California).25 We use two-way cluster robust standard errors for inference, clustering on
both zip code and day. The two-way cluster robust variance-covariance estimator implicitly adjusts
standard errors to properly account for both spatial correlation across zip codes on a given day,
which are all due to the same network delays, as well as within-zip code serial correlation in air
pollution over time.26
The standard conditions for consistent estimation of α1 in the context of our 2SLS estimator are
that αak 6= 0 in equation (4) and E[Tkt · eazt | Zzt, νza] = 0. Subsequent sections will show that the
first condition clearly holds; taxi time at airports on the East Coast leads to large increases in taxi
time at California airports. The second condition requires that the error term in the instrumental
variable regression be uncorrelated with taxi time at major airports outside of California, Tkt. This
condition would be violated if ground congestion in Chicago somehow co-varied with pollution levels
in California through reasons unrelated to California airport congestion due to network delays.
While the second condition is not explicitly testable, our data and research design permit
several indirect tests. First, we show evidence that taxi time in California is predicted by weather
fluctuations at airports inside and outside of California, but the reverse is not true: weather at the
major airports in California has no significant effect on taxi time at Eastern airports. Second, we
show that network delays propagate East to West rather than West to East. Taxi time in Atlanta is
not higher due to increased taxi time in Los Angeles.27 Further sensitivity checks show that using
only taxi time before noon at Eastern Airports or directly instrumenting with observed weather
variables at airports in the Eastern United States has little impact on our baseline estimates. In the
24These airports were chosen because they are among the largest airports in the country, they serve differentregions, and they are subject to different weather systems. The results are robust to different airport specifications.
25Model 2 instruments both Tat and Tatdaz with the taxi time outside California Tkt and Tktdaz, and thus uses 72instruments. Similarly, model 3 instruments all 12 interaction of taxi time Tat at the 12 airports by the taxi time atthe three largest airports outside California Tkt, which results in 12×12×3 = 432 instruments.
26Standard errors clustering on both airport and day tend to be smaller than those using zip code and day. Wechoose the latter when conducting inference, as they tend to be the more conservative of the two. Results with airportand day clustering are available upon request.
27This issue is largely addressed by the difference in time zones between our instrumental variable airports andCalifornia. Airplane traffic in the United States generally starts around 6am in the morning and slows down in theevening. Due to the change in time zones, a flight that leaves at LAX in the morning to go to one of the airportsdoes not reach of the three airports outside California before noon. On the other hand, a flight that leaves at 6amon the East Coast will reach California by 9am.
13
following sections we use the variation in California airport taxi time, and the spatial distribution
of emissions from an airport, as a predictor of local air pollution measures in order to better
understand contemporaneous relationships between elevated levels of air pollution and hospital
admissions.
3.2 Aggregate Daily Taxi Time, Local Pollution, and Health
To estimate the pollution-health association in our data we begin by assuming that the relationship
between health and ambient air pollution can be summarized by the following linear model:
yzat = βpzat + ZztΠ+ ηza + ǫzat (5)
where the dependent variable yzat is our observable measure of health in zip code z when paired
with airport a on day t.28 The remaining notation is consistent with the previous models, Zzt are
the same weather and time controls and ηza is a zip code fixed effect.
We focus primarily on respiratory related hospital admissions as defined by International Sta-
tistical Classification of Diseases and Related Health Problems ICD-9 (Friedman et al. 2001, Seaton
et al. 1995). The dependent variable yzat is the number of admissions to either the emergency room
or an overnight hospital stay where either the primary or one of the secondary diagnosis code fell in
one of the following admission categories: asthma, acute respiratory, all respiratory, or heart related
diagnoses. These daily zip code counts are scaled by zip code population so that the dependent
variable represents hospitalization rates per 10 million zip code residents. We also estimate models
for diagnoses unrelated to pollution: strokes, bone fractures, and appendicitis. These outcomes are
meant to serve as an important test for the internal validity of our research design. Since these
health outcomes are unrelated to pollution exposure, they should not be significantly related to
changes in pollution.
The coefficient of interest in this model is β which provides an estimate of the effect of a one
unit increase in pollution levels on daily hospitalization rates in zip code z and time t. Consistent
estimation of β requires E[pzat ·ǫzat | Zzt, ηza] = 0. The inclusion of a zip code fixed effect implicitly
controls for any time invariant determinants of local health that also covary with average pollution
levels. For example, if relatively disadvantaged households live in more polluted areas and have
poorer health for reasons unrelated to air pollution, then the zip code fixed effect will control for
this time-invariant unobserved heterogeneity. However, least squares estimation of β will be biased
if there are time-varying influences of both health and pollution (e.g., weather), and/or if there is
measurement error in pzat. Since we are proxying for pollution exposure using the average level
of pollution in a zip code on a given day, measurement error might be substantial (i.e. people’s
actual exposure to ambient air pollution might differ significantly from that which is reported by a
monitor).
28Our analysis implicitly assumes that we can summarize health responses and behavior at the zip code level andthat the effect of interest, β, is stable over time and across airports.
14
Instrumental variables provide a convenient solution to the bias from omitted variables as well
as the bias introduced from measurement error in the independent variable.29 We use airport
ground congestion as an instrumental variable for local pollution levels in the following first stage
regression equation:
First Stage (Model 1): pzat = α1Tat + ZztΓ+ νza + ezat (6)
The first stage regression, equation (6), estimates the degree to which instrumented airport taxi
time Tat predicts local pollution levels in areas surrounding airports.30 The second stage equation
uses the predicted values from the first stage to estimate the impact of local pollution variation
on health. We also estimate versions of equation (6) using models that interact Tat with distance,
wind speed, and wind direction as in equations (2) and (3), models 2 and 3, respectively.
Aside from the relationship between pollution and health, we are also explore “reduced form”
relationships between health outcomes and taxi time. These “reduced form” estimates are directly
policy relevant; namely, how does aggregate daily taxi time impact the health of nearby residents?
Understanding the degree to which variation in airport runway congestion directly impacts health
has implications for both managing congestion through either demand pricing mechanisms (e.g., a
congestion tax) or a more efficient runway queuing system.
3.3 Health Outcomes: Alternative Models
We supplement our baseline health regressions with several alternative models, exploring model
specification and model dynamics in more detail. These various regression models are described in
more detail below.
3.3.1 Health Outcomes: Non-linearities and Threshold Effects
There is reason to suspect that the relationship between pollution and health outcomes is non-
linear in the level of pollution. Do highly polluted days matter more for predicting negative health
outcomes than moderately polluted days? We test these hypotheses in two different ways. First, we
examine heterogeneity in the dose-response relationship between seasons of the year as pollution
levels of CO and NOx are higher in the winter months as shown in Figure A3. We interact all
variables in all regressions (first and second stage) with a dummy for summer (April-September),
thereby allowing the effect to be different for two subsets of the year. Marginal changes at higher
29Instrumental variables only solves the bias from measurement error in the independent variable when the mea-surement error is classical, namely mean zero and i.i.d. (Griliches & Hausman 1986).
30We are using predicted aggregate taxi time Tat as an instrumental variable in these regression models. Instandard OLS regression, inference using generated regressors should be corrected for first stage sampling variance(e.g. Murphy & Topel (2002)). When the generated regressor is used as an instrumental variable this is no longer thecase. Wooldridge (2002, p. 117) presents a weak set of assumptions for which the standard errors of 2SLS regressionsusing generated instruments are unbiased. The key assumption turns on strict exogeneity between the error termin the structural model and the covariates used to generate the instrument in the auxiliary regression. See Dahl &Lochner (2012) for a similar approach, using a predicted variable as an instrumental variable in a 2SLS setting. Theseissues are also discussed tangentially in Wooldridge (1997) and Wooldridge (2003).
15
baseline levels of pollution (i.e. winter) should be larger than marginal changes at lower levels
of baseline pollution (i.e. summer) if the dose-response function was in fact non-linear. There
may be other important differences in health outcomes across seasons that could explain these
seasonal disparities. For example, pollen levels might be higher in the winter as most precipitation
occurs in the winter and hence flowering occurs in early spring. A body that is weakened by the
elevated pollen levels might be more (or less) susceptible to pollution shocks. The nonlinearity we
are measuring might be an interaction effect with other substances and not directly related to the
average pollution level.
Second, we use the over-identified model 3 to instrument higher order polynomials of average
daily pollution levels and plot the responding dose-response function. Pollution spreads nonlinearly
in wind direction and wind speed, and our overidentified models allow us to identify higher-order
polynomials.
3.3.2 Health Outcomes: Dynamic Effects and Forward Displacement
By looking at the daily response of health outcomes to contemporaneous pollution shocks, we may
be neglecting important dynamic effects of pollution and health. For example, contemporaneous
exposure to air pollution may have lagged effects on health, leading people to seek care one or
two days after the initial pollution episode. Our contemporaneous regression models might miss
these important lagged impacts. Alternatively, health estimates may be driven by various forms
of forward displacement. Short-term spikes in pollution might lead individuals on the brink of an
asthma or heart attack to experience an episode that would have otherwise occurred in the next
few days anyway. Such behavior would overestimate the dose-response function as an increase in
hospitalization rates is followed by a decrease once pollution levels subside. We explore the dynamic
effects of pollution on health by estimating the following distributed lag model:
yzat =
3∑
k=0
βkpza(t−k) + ZztΠ+ ηza + ǫzat (7)
Instrumented pollution pzat is again obtained using either model 1, 2, or 3 from previous sections. In
the case of forward displacement, the spike in hospital admissions should be followed by a decrease in
admissions, and hence∑3
k=0 βk < β, where the latter β comes from the baseline, contemporaneous
regression. In a sensitivity check (available upon request) we include 6 lags and 3 leads.
3.3.3 Health Outcomes: Heterogeneity and Self-Selection
Our baseline models rely upon the relatively unattractive assumption that the relationship between
pollution and health is the same for everyone in the population. If there is heterogeneity in a person’s
relative susceptibility to pollution (or in how people respond to adverse health outcomes), then
people may sort themselves into locations based on these observed or unobserved differences. This
heterogeneity may manifest itself through access to medical care or through biological differences in
16
the pollution-health relationship among certain segments of the population. Previous research (e.g.,
Chay & Greenstone (2003)) and results presented in subsequent sections of this paper suggest that
health effects differ by observable characteristics of the population. If people sort themselves based
on this underlying heterogeneity, then our estimates may identify the average effect of pollution on
health for a nonrandom subpopulation in the data (Willis & Rosen 1979, Garen 1984, Wooldridge
1997, Heckman & Vytlacil 1998).
We address these issues in various ways. In a sensitivity check, we limit our estimates to
people 65 and older who have guaranteed health insurance in the form of Medicare. Thus, any
heterogeneity in hospitalization should no longer be driven by access to health insurance. Another
concern is that the severity of the particular health shock determines whether a person will seek
emergency care. We therefore also include heart problems as a category, which are severe enough
that patients will seek medical help independent of their insurance or financial situation. There
may also exist significant heterogeneity based on unobservable characteristics. Previous research
suggests that individuals engage in avoidance behavior on days where pollution is predicted to be
high (Neidell 2009), which implies there is likely heterogeneity in β as well as correlation between
β and pzat. In a previous version of this paper, we developed a framework to test whether selection
on unobserved heterogeneity leads to bias in our estimates (Schlenker & Walker 2011), but did
not find this to be the case. The lack of self-selection bias may be in part driven by our research
design, where airport-driven pollution is relatively stochastic and unforecastable, making it difficult
to select on.
3.3.4 Health Outcomes: Poisson Model
Since our dependent variable is measured as hospital visits in a given zip code day (before we
convert it to a sickness rate), we also estimate regression models that account for the non-negative
and discrete nature of the data. Specifically, we use a conditional (“fixed effects”) quasi-maximum
likelihood Poisson model (Hausman, Hall & Griliches 1984, Wooldridge 1999).31 To account for the
endogeneity of pollution exposure, we generalize the standard conditional Poisson model into an
instrumental variables setting. To do this, we adopt a control-function approach to the conditional
Poisson model (see e.g., Wooldridge (1997) and Wooldridge (2002)), whereby we include the residual
(ezat) from our first stage regression (i.e., the effect of taxi time on pollution) in our regression
equation of interest:
E[szat|pzat, Tat,Zzt, ηza] = ηza exp (βpzat + γ1ezat + ZztΠ) (8)
where szat are sickness counts (no longer rates), pzat is the observed pollution level in a county, and
ezat is the residual from one of the first-stage regression of pollution on taxi time using model 1, 2,
or 3. The fixed effect model allows the marginal effect of pollution to differ by zip code. The model
31The Poisson model is generally preferred to alternative count data models, such as the negative binomial model,because the Poisson model is more robust to distributional misspecification provided that the conditional mean isspecified correctly (Cameron & Trivedi 1998, Wooldridge 2002).
17
accounts for the fact that zip codes have different number of residents through the fixed effects ηza.
While including the first-stage error purges the estimates of the various selection biases outlined
above (Wooldridge 2002, p. 663), the standard errors need to be corrected for the variation coming
from the first stage estimation. To account for the first stage sampling error in the ezat, we bootstrap
the regression using a block-bootstrap procedure where we randomly draw the entire history of a
zip code with replacement.
4 Empirical Results
4.1 Aggregate Daily Taxi Time and Local Pollution Levels
We start by examining the effect of airport congestion on pollution levels in surrounding areas.
Appendix Table A5 gives the first-stage results when taxi time is instrumented using runway con-
gestion at the three major airports outside of California. There is one noteworthy result: For
major hubs in California, an increase in taxi time at East Coast airports increases taxi time as
delays propagate through the system. On the other hand, the sign reverses for smaller airports:
an increase in taxi time at East Coast airports decreases local taxi time. As Pyrgiotisa, Maloneb
& Odoni (2013) point out, propagation through the system can have “counter-intuitive results.” If
planes bunch up at one hub, the effects on close-by commuter airports can be the opposite as the
connectors now arrive more evenly spread, or because flights are canceled.32
Table 1 presents regression estimates using the specifications outlined in equation (1), (2), and
(3), presented in columns a, b, and c, respectively. Each column represents a different regression,
where the dependent variable in the columns (1a)-(1c) is the daily mean CO measured in parts per
billion (ppb). Columns (2a)-(2c) report regression estimates for daily mean NO2, while columns
(3a)-(3c) report estimates for ozone O3.33 Taxi time is reported in thousands so that the coefficients
in Table 1 report the marginal effect of a 1000 minute increase in taxi time on local pollution levels.
All regressions report robust standard errors, clustering on both zip code and day.34
Column (1a) suggests that a 1000 minute increase in taxi time increases ambient CO concen-
trations in zip codes within 10km of an airport by 45ppb (an 8% increase relative to the mean,
32For example, flights out of Santa Barbara frequently get canceled if Los Angeles is backed up to reduce the queueof incoming airplanes into Los Angeles.
33OLS estimates are presented in Appendix Table A6.34The heavily over-identified models from equation (3) impose significant computational burdens when estimating
IV models containing two-way, cluster-robust standard errors. To circumvent this issue, we report the results fromrunning the first stage and then using the predicted values in the second stage without accounting for the fact thatwe are using generated regressors in the second stage. Plugging in the predicted regressors is computationally mucheasier because we don’t cluster all the first-stage regressions, instead we simply recover the point estimates from eachregression. Two-way cluster robust routines require estimating three variance-covariance matrices, one correspondingto the first cluster group, one corresponding to the second cluster group, and one corresponding to the two-wayexpansion of the two groups. Since we have more than a hundred instruments in model 3 (12 variables times 12airports times 3 east coast airports = 432 first stage regressions), this imposes a significant computational burden.To understand the likely magnitude of this bias, Appendix Table A7 reports two sets of standard errors for equations(1) and (2): (i) the IV results; and (ii) running the first stage and using the predicted values in the second stagewith two-way clustered errors but no other adjustments. The results suggest that the standard errors from the IVare quite similar to those from manual 2SLS.
18
or 12% of the day-to-day standard deviation). Since the standard deviation of taxi time at LAX
in Table A1 is 1852, a one-standard deviation increase in taxi time leads to 0.23 standard devia-
tion increase in CO pollution of the zip codes around LAX. Column (1b) of Table 1 includes an
interaction of taxi time with distance to the airport. The non-interacted taxi time coefficient now
reports the effect of airplane idling on pollution levels directly at the airport. The point estimate
implies that a one standard deviation increase in taxi time at LAX leads to 0.28 standard deviation
increase in CO levels in areas adjacent to LAX. The interaction term shows how this effect decays
linearly with distance.
Lastly, column (1c) reports the coefficients from the estimated version of equation (3) that
interacts taxi time with wind speed and wind angle from an airport. The F-test for the joint
significance of these coefficients is given in the last two rows of the table and shows that they are
highly significant. Since individual coefficients are difficult to interpret, we plot the marginal effect
of an extra 1000 minutes of taxi time for four wind speeds in the first row of Figure 3. Wind
speeds increase from left to right. The color indicates the marginal impact ranging from low (blue)
to high (red). If a zip code is directly downwind, it is on the positive x-axis, while areas upwind
are on the negative x-axis.35 Figure 3 makes clear that there is significant spatial heterogeneity
in the marginal effect of taxi time, and this heterogeneity depends on distance from an airport,
wind speed, and wind direction. As such, equation (3) (i.e. model 3) is best able to capture this
heterogeneity.
Columns (2a)-(2c) of Table 1 give estimates pertaining to the effect of taxi time on NO2 levels.
The results are comparable to those from CO, although the linear decrease in distance from the
airport is not significant. A one standard deviation increase in taxi time at LAX increases NO2
concentrations by roughly 1ppb, or 10% of the day-to-day standard deviation. The second row
of Figure 3 shows again that downwind areas are much more impacted than upwind areas. Both
Table 1 and Figure 3 show that the relative impact of NO2 is different than CO: the range of
marginal impacts for CO in Figure 3 is between -71% and +43% relative to the average impact
from column (1a) in Table 1. In contrast, the marginal effect of taxi time on NO2 varies between
-60% and +33% relative to the average effect from column (2a) of Table 1. The spatial pattern
is also somewhat different. In subsequent sections, we use these relative differences in pollutant
dispersion to jointly estimate the effect of both CO and NO2. Recall from Section 1 that CO
emissions are higher during low power operation, while NO2 is higher during high power operation.
Larger wind speeds require more thrust during takeoff and hence change the mix of CO and NO2
emissions.
35Areas downwind are more affected by taxi time than areas upwind. For the very highest wind speeds, the largestmarginal impact of taxi time can be found just upwind from the centroid of the airport (although the average marginalimpact remains highest downwind). This is possibly due to the fact that airplanes start against the wind and mostlyline up in the opposite direction, i.e., the direction in which the wind is blowing. Local wind is highly predictive ofcongestion. When local wind is strong and the average local taxi time is high and the queue is long, an additionalunit of congestion due to network delays will hence “add” an additional plane that is idling upwind from the airportcentroid. For example, the four runways of LAX are between 2.7km and 3.7km long, which is significant as we areexamining monitors within 10km of the airport centroid.
19
Finally, columns (3a)-(3c) replicate the same analysis for ozone (O3), a pollutant that is not
directly emitted from airplanes.36 The results in Table 1 suggest that airport taxi time has little
significant impact on ozone levels, although some of the interaction terms are significant. In the
remainder of the analysis we focus on CO and NO2, the two criteria air pollutants for which
airplanes are large emitters, while acknowledging that we may be picking up the health effects of
other pollutants that are correlated with airplane emissions.
Our baseline pollution estimates presented above come from models in which airport taxi time
is instrumented with taxi time at large airports outside of California. We instrument taxi time
because delays and runway congestion might be correlated with local weather, which in turn might
impact pollution levels. In addition, there is likely measurement error in our taxi time variable as it
only includes domestic, commercial flight activity. While we control for weather in our regressions,
there might be unobserved weather (or other) variables that jointly impact both pollution and taxi
time. Appendix Table A6 replicates the baseline IV analysis of Table 1 using local taxi time at
California airports, which is not instrumented. The estimated effect is generally half as big for
CO and NO2. The smaller OLS estimates are consistent with adverse weather (e.g., precipitation)
causing both airport delays and at the same time reducing ambient air pollution. Alternatively,
these results could be driven by the well known attenuation bias stemming from measurement error
in fixed effects models. In the remainder of the paper we rely on instrumented taxi time stemming
from network delays.
We use taxi time at three major airports in our baseline regressions: Atlanta (ATL), Chicago
(ORD), and New York (JFK). Appendix Table A7 presents first-stage F-statistics if we instrument
taxi time at California on up to four airports outside of California. Recall that we allow the
coefficients to vary by airport, as network congestion will have different absolute effects on California
airports. Irrespective of whether we use 1, 2, 3, or 4 airports outside of California, the F-statistic
is well above 10. In our baseline model we use three airports that cover weather patterns in three
regions of the Eastern United States: Southeast (Atlanta), Midwest (Chicago), and Northeast
(New York JFK), and the first-stage F-stat is 50. The fourth large airport outside of California
that we include in columns (d) is Dallas/Fort Worth (DFW). While results are not particularly
sensitive to including DFW, we exclude it from our baseline specifications as it is significantly
closer to California airports and thus may be more endogenous than the other three airports (i.e.
Dallas/Fort Worth may be delayed because California airports are delayed).
Reverse causality is less of a concern for the other three airports: A flight that leaves a California
airport at 6am will not reach Atlanta, Chicago, or New York until roughly noon due to the change
in time zones. Table A8 in the appendix tests for reverse causality directly by regressing taxi time
at an airport on eight weather measures we generally include as controls: a quadratic in minimum
and maximum temperature, precipitation, as well as wind speed.37 The column heading gives the
36Ozone is formed through a complicated chemical reaction between both nitrogen dioxides and VOC’s in thepresence of sunlight. As Auffhammer & Kellogg (2011) have shown, increasing VOC in VOC-rich environments canhave no effect on ozone or slightly decrease it, while it will increase ozone if VOCs are limited compared to NOx.This poses a challenge for the monotonicity assumption behind IV regressions.
37Weather measures in our baseline regression also include the direction in which the wind is blowing relative to
20
airport at which the congestion is measured while the row indicates the airport at which the weather
variables are measured.38 The table reports p-values of a hypothesis test pertaining to the joint
significance of the weather variables. The diagonal is highly significant as local weather measures
impact airport taxi time. While weather at the eastern airports (ATL, ORD, or JFK) sometimes
impacts taxi time at the two largest airports in California (LAX and SFO), the reverse is not true.
This is consistent with weather at Eastern airports causing local network delays that propagate
through the airspace and impact taxi time in California. The reverse direction does not hold.
California airports do not affect East Coast airports on the same day. This result is not simply an
artifact of there being less weather variation in California, as weather at LAX significantly impacts
taxi time at SFO.
We have also run two sensitivity checks to further rule out endogeneity through reverse causality,
the results of which are reported at the end of the subsequent section on health effects and shown in
Table A13. First, we only utilize the combined taxi time between 5am and noon at the three major
Eastern airports to rule out California feedback effects. This reduces the F-stat in model 1 from
50 to 35.5, but the results remain similar to baseline estimates. Second, instead of using taxi time
at the three major Eastern airports, we use the eight weather variables at each of these airports.
Since this effectively increases the number of instruments by a factor of eight, we no longer estimate
model 3 (which had 432 instruments to begin with). The F-statistic for the weather-instrumented
regression is 5436. Again, results remain similar to our baseline estimates but the standard errors
in the second stage increase. Going forward we instrument using the overall daily taxi time, as
it has a higher F-statistic than focusing only on the mornings yet is more tractable than using
weather measures, which would result in 3456 instruments in model 3.
We conduct two last robustness checks. First, since the variation in pollution due to delays
outside of California should be uncorrelated with weather in California, we have estimated models
(not reported) that exclude California weather controls altogether. Reassuringly, our baseline
estimates for the most important pollutant (CO, see below) are similar whether we include or
exclude California weather controls, but the error terms increase. Second, there may of course
be some omitted variable that affects congestion outside of California and health outcomes in
California. This hypothesis is not directly testable, but we have estimated models (available upon
request) which include taxi time at other CA airports as a control variable in our baseline reduced
form regressions, and the results remain very similar.
To put the magnitude of these effects into perspective, it is useful to consider the current
ambient air standards in place for CO as regulated by the EPA under the Clean Air Act. The
current one hour carbon monoxide standard specifies that pollution may not exceed 35 ppm (or
35000 ppb) more than once per year. California has their own CO standard which is 20ppm. A
one standard deviation increase in LAX airplane idling (1852 minutes) translates into an 83 ppb
the direction in which the zip code is located. Since the dependent variable in the current regression is at the airportlevel and not the zip code level, these variables are not well defined and hence dropped.
38If we pair airport taxi time with weather from another airport, we also include the local weather measure ascontrol. The local weather measures are not included in the joint test of significance.
21
increase (44.78 × 1.852) in carbon monoxide levels for areas within 10km of LAX using estimates
from column (1a) of Table 1. Adding this number to the average daily maximum CO level at zip
codes from Panel A of Table A2 (1234 ppb), the estimated increase in pollution concentrations is far
below the current EPA standard. Similarly, for NO2, the current EPA 1-hour standard is 100 ppb.
Using estimates from column (2a) of Table 1, a standard deviation increase in LAX taxi time would
lead to a 1ppb increase in NO2 levels. Evaluated relative to the average daily maximum NO2 levels
of 35.5 ppb, these are again well below the ambient criteria standard. Note that the maximum of
the maximum daily NO2 levels is above the standard as some areas are out of attainment. The
remaining sections estimate the social costs of these congestion related increases in ambient air
concentrations by focusing on heath outcomes of the populations most affected by these emissions.
4.2 Effects of Taxi Time on Local Measures of Health
We begin by investigating the “reduced form” health effects of airports, relating aggregate daily
taxi time to local measures of health. Namely, how does variation in airport congestion predict
local health outcomes? Table 2 presents the results from a regression relating daily measures of
airport taxi time to local hospital admissions for the overall population as well as two susceptible
subgroups: individuals below 5 years of age and individuals aged 65 and above. The dependent
variable is measured as the daily sum of hospital and emergency room visits for persons living in
a particular zip code scaled by the population (per 10 million individuals) in that particular zip
code. The regressions are weighted by zip code population size, and taxi time is instrumented using
taxi time at three major airports in the East. The estimated coefficient on the taxi time variable
corresponds to the increased rate of hospitalizations per 10 million individuals in a zip code for an
extra 1000 minutes of taxi time. Using various diagnosis codes, we examine the impact of taxi time
on asthma, respiratory, and heart related admissions separately. As a falsification exercise, we also
estimate the incidence of taxi time on strokes, bone fractures, and appendicitis rates. The reported
standard errors are clustered on both zip code and day.
For the overall population (Panel A), all respiratory sickness rates as well as heart problems are
significantly impacted by taxi time, while the placebo effects for stroke, bone fractures, and appen-
dicitis are not significantly affected. Results become larger in magnitude for the at-risk age groups.
For the population 65 years and above, the incidence of stroke and bone fractures is marginally
significant at the 10% level. This may be do to statistical chance or may be explained by the fact
that senior citizens may also be more susceptible to sicknesses that covary with one another (e.g.,
a respiratory problem might make them fall and break a bone). Additionally, Medicare provides
doctors implicit incentives to add additional diagnosis codes to receive higher reimbursement rates.
Consistent with this explanation, models for which the dependent variable is measured only using
the primary diagnosis code, the placebo effects for 65 and older are no longer significant.
22
4.3 Hospital Admissions and Instrumented Pollution Exposure
Results thus far have shown that aggregate airplane taxi time generates variation in pollution levels
of nearby communities. We exploit this variation to examine the relationship between pollution and
health explicitly. Table 3 summarizes regression results for various pollutants and illnesses using a
variety of traditional econometric specifications. Each entry corresponds to a different regression,
where the dependent variable is measured as hospital admission rates, and the independent variable
is the daily mean ambient pollution concentration in a particular zip code. As before, regression
estimates are weighted by zip code population and standard errors are clustered on both zip code
and day.39
The first row within each panel presents estimates from a pooled OLS version of equation (5)
without any controls Zzt, which suggests that increased ambient air concentrations lead to adverse
health outcomes for respiratory and heart problems. Since various pollutants are often correlated
with one another, these estimates should be interpreted with caution, as the pollutant of interest will
proxy for other correlated air pollutants. Each consecutive row adds more controls. The second row
uses time controls (year, month, weekday, and holiday fixed effects), and the third row additionally
adds weather controls (quadratic in minimum and maximum temperature, precipitation and wind
speed as well as controls for wind direction). To control for unobserved, time-invariant determinants
of health, the fourth row of each panel in Table 3 reports regression estimates from a model
using zip code fixed effects. The model is identified by examining how within zip code changes in
pollution are related to hospitalization rates of that particular zip code. Again, pollution is often
strongly correlated with health, although the estimates in the fourth row are usually smaller than
those in the first three. These smaller point estimates are consistent with time-invariant omitted
variables introducing bias into the estimates from rows one through three. Alternatively, classical
measurement error in the pollution variable may lead to significant attenuation bias in fixed effects
models (Griliches & Hausman 1986), and this may be responsible for the smaller point estimates
in the last row.
Aside from attenuation bias, fixed effects models may also suffer from biases introduced by any
unobserved, time-varying determinants of both pollution and health (e.g., weather). To explore this
issue further, Table 4 presents instrumental variable estimates of the pollution-health relationship,
using instrumented aggregate airport taxi time as an instrumental variable for daily mean pollution.
Table 4 presents results for both the overall population in Panel A as well as children below 5 in
Panel B and people aged 65 and above in Panel C.40 The three rows (labeled model 1-3) use (i)
taxi time, (ii) taxi time interacted with distance, and (iii) taxi time interacted with distance, wind
speed, and wind direction, respectively. These are the specifications outlined in equation (1), (2),
and (3) above.
39Unweighted regressions yield similar results and are available upon request.40Results for the two remaining groups: children ages 5-19 and adults ages 19-64 are given in Appendix Table A9.
Children between 5 and 19 years of age show no sensitivity to pollution shocks. Conversely, the estimated dose-response for adults are roughly comparable to the baseline estimates, which is not surprising since they are thelargest share of the overall population.
23
The estimates in Table 4 are usually an order of magnitude larger than the OLS, fixed-effects
estimates from Table 3. To put the magnitudes into perspective: The average asthma sickness rate
for the overall population is 339 per 10 million inhabitants (Panel A1 and A2 of Table A3). The
asthma coefficient for CO (model 3) in Table 4 implies that a one standard deviation increase in
CO pollution leads to an additional 0.194×368 = 71 asthma attacks per 10 million people,41 which
is 21% of the daily mean.42 This suggests that fluctuations in air pollution are a major cause of
asthma related illnesses. For heart related problems, the relative magnitude is 18% of the daily
mean. It is important to note that the estimated CO effect may not necessarily be coming from
CO itself but from some other pollutant that’s co-emitted in jet exhaust that we do not observe
(e.g. a toxic VOC or particulate matter that’s emitted due to incomplete combustion). In addition
to measurement error or avoidance behavior, the fact that variation in CO comes from airplanes
may be a further explanation for the discrepancy between OLS and IV estimates. However, the
Federal Aviation Administration (2005a) suggests that aircraft engines produce the same types of
emissions as automobiles, which are the largest single source of carbon monoxide emissions in the
United States.
Models 2 and 3 in Table 4 estimate over-identified models instrumenting pollution with both
taxi time and taxi time interactions. While estimates in model 2 are similar to those from model 1,
estimates from model 3 are generally smaller. The reason for the difference in magnitudes between
models 2 and 3 is not entirely clear, but we believe there are two competing explanations. The
first explanation stems from the inability of models 1 and 2 to capture the spatial heterogeneity
in the effect of taxi time. Recall that model 3 uses distance as well as wind direction and wind
speed. Marginal impacts of airport congestion vary greatly across space as shown in Figure 3, much
more than in a model that only includes distance. Failing to model this heterogeneity in pollution
exposure may lead to inaccurate scaling of the reduced form relationships in our IV/2SLS setting.
A competing explanation as to why model 3 estimates differ from models 1 and 2 stems from
measurement error in the location of exposure. While we know the exact location of each pollution
monitor and hence can correctly model the pollution surface in space, we only know the zip code
of a person’s residence and the hospital, not the exact location where they fell ill. As a result,
all models will pair sickness counts with incorrect pollution measures if they are not close to the
centroid of the zip code when they fell ill, but this might be aggravated by model 3 that explicitly
uses the spatial distribution of the pollution surface. Table A10 investigates this latter hypothesis
by looking at various subsets of the data. Panel A presents our baseline results, Panel B assigns
pollution data based on the zip code of the residence, while Panel C assigns pollution based on the
hospital zip code. A few results are noteworthy: first, the estimates using model specification 3 are
very close to the estimates using specification 1 and 2 in Panel B1 where we only count sicknesses
41Panel A of Table A2 in the appendix shows that the standard deviation for CO is 368.42This back-of-the-envelope calculation increases the pollution level in each zip code by the average overall standard
deviation of pollution fluctuations. Moreover, the average sickness rate is not population weighted. In subsequentsections, we increase pollution in each zip code by the zip-code specific standard deviation in pollution fluctuationsand calculate the population-weighted average sickness count. The relative impact decreases to 17% of the dailymean under the linear probability model and 19% under a Poisson count model.
24
if both the zip code of the residence and hospital are within 10km of the same airport. On the
other hand, model specification 3 diverges in panel B2 where the hospital zip code is outside the
10km radius from airports, perhaps because we measure exposure less accurately (e.g. the person
might have been at work). In addition, panel B3 shows that there are no significant results where
the hospital is within 10km of another airport, suggesting that we are not simply picking up a
daily pattern that is common to all airports.43 As a secondary bit of evidence, model 3 in Panel B
of our baseline Table 4 gives comparable point estimates to model 1 and 2 for children under the
age of 5, whom are more likely to be at home or in a close-by day care. Due to these competing
explanations for the differences across models, we continue to present all model estimates whenever
possible, allowing the reader to choose their preferred estimate.
There are two additional explanations for the discrepancies between models 3 and 1 and 2
which we find less salient. First, there is a well-known bias of 2SLS estimators when instruments
are weak and when there are many over-identifying restrictions (Bound, Jaeger & Baker 1995). In
linear models with iid errors, Stock, Wright & Yogo (2002) propose rule-of-thumb thresholds for F-
statistics for the first stage. However, in both the non-iid case (i.e. with clustered standard errors)
and in cases with multiple endogenous variables, less is known about the relationship between
the F-statistic and the properties of instrumental variables estimates. Baum, Schaffer & Stillman
(2003) suggest comparing the test statistic to the Stock, Wright & Yogo (2002) critical values
for the Cragg-Donald F statistic with a single instrument. According to this metric, results from
Table 1 suggest that model 3 is a strong first-stage predictor of local pollution levels with a F-
statistic that is 14 for CO pollution and to a lesser extent for NO2 pollution (F-stat of 5). The
first stage in model 3 is not as strong as in models 1 and 2, and the model is highly over-identified
with 12 excluded instruments. Bound, Jaeger & Baker (1995) show how the bias of 2SLS increases
in the number of instruments and decreases in the strength of the first stage. The bias of 2SLS
in the case of weakly identified or over-identified models is towards the OLS counterpart. Since
this is consistent with model 3 estimates in Table 4 being smaller than both model 1 and 2 but
still above the OLS estimates, Table A11 in the appendix estimates models 2 and 3 using Limited
Information Maximum Likelihood (LIML), which is median-unbiased for over-identified, constant-
effects models (Davidson & MacKinnon 1993). Results remain similar, which suggests that weak
instrument attenuation is less of a concern (Angrist & Pischke 2008). Finally, a second alternative
explanation for why model 3 gives lower point estimates is that the hourly wind data represent
snapshots of the wind speed and direction and include significant measurement error. However,
this is somewhat at odds with the fact that we find such significant spatial patterns in the pollution
regressions.
Panels B and C of Table 4 present estimates for children and senior citizens. While the dose-
response relationships are larger, so are average sickness rates. In relative terms, a one standard
deviation increase in CO pollution now causes a 37% increase in asthma cases for children under
5 compared to the average daily mean. On the other hand, a one standard deviation increase
43If we assign pollution based on the hospital zip code in panels C, results are generally not significant.
25
in CO pollution causes a 24% increase in heart problems for people 65 and above. The higher
absolute sensitivity in Panel B and C suggests that there may exist significant heterogeneity in the
population response to ambient air pollution exposure. Since the population aged 65 and older
has guaranteed access to health insurance through Medicare, they may be more inclined to visit
the emergency room or hospital relative to the rest of the population, leading to larger estimated
effects. On the other hand, the relative magnitude compared to average sickness rates are only
slightly larger than for the overall population.
Columns (3)-(5) of each panel includes results for one of three placebos: strokes, bone fractures,
and appendicitis. Both strokes and appendicitis are severe enough that people should go to the
hospital. None of the results are significant for the overall population in Panel A. Consistent with
the reduced form evidence in Table 2, some of the coefficients in Panel C are significant at the
10% level. In Appendix Table A12 we replicate the analysis using only the primary diagnosis code.
None of the placebo regressions remain significant. Since we are interested in the overall effect of
pollution on hospitalization rates, our baseline models continue to count total sickness counts for
both primary and secondary diagnoses.
Appendix Table A13 further investigates the sensitivity of our IV estimates to different choices
of instrumental variables. As a point of comparison, Panel A replicates the baseline results of
Table 4 for all ages. Panel B instruments for pollution using only the taxi time between 5am and
noon at Eastern airports to rule out endogeneity through reverse causality. The results remain
robust to this change. Panel C goes one step further and instruments for taxi time at California
airports using only weather measures at the three major airports in the Eastern United States.
While the point estimates remain comparable, the standard errors generally increase.44
4.3.1 Jointly Estimating the Effect of Ambient Air Pollutants
A common challenge in studies linking health outcomes to pollution measures is that ambient air
pollutants are highly correlated. It is therefore difficult to determine empirically which pollutant
is the true cause of any observed changes in health. Our research design provides one possible
solution to the identification problem. Wind speed and wind direction differentially affect both CO
and NO2 dispersion patterns. Moreover, the rate of CO and NO2 emissions depend on the thrust
produced by the engine, and higher wind speeds require more engine thrust. Wind speed hence
impacts both the rate at which pollutants are produced and how they disperse. Table 5 estimates
the joint effect of both CO and NO2 on health using our first stage model with wind speed and
wind direction interactions (model 3).
In all specifications for which we have multiple endogenous variables, we report the Angrist &
Pischke (2008) conditional F-statistics in the tables and text, although these are somewhat hard to
interpret. As mentioned above, there are no rule-of-thumb thresholds for linear models with non-
iid errors or for models with multiple endogenous variables. When comparing the conditional F-
statistics to the Stock, Wright & Yogo (2002), the F-statistics suggest that the first stage is “weak”.
44We do not estimate model 3 using weather variables as it would include 3456 instruments.
26
Perhaps more usefully, in all specifications for which we have multiple endogenous variables, we also
present two tests that are robust to issues pertaining to weak instruments, the Anderson-Rubin test
statistic and the closely related Stock-Wright (2000) S statistic. The null hypothesis tested in both
cases is that the coefficients of the endogenous regressors in the structural equation are jointly equal
to zero, and, in addition, that the overidentifying restrictions are valid. We use a cluster-robust
version of both test statistics that has the correct size even under weak identification (Chernozhukov
& Hansen 2008). The tests are equivalent to estimating the reduced form of the equation (with the
full set of instruments as regressors) and testing that the coefficients of the excluded instruments
are jointly equal to zero. In most specifications, inference based on the Anderson-Rubin and Stock-
Wright tests are consistent with inference based on the Wald test of the same null hypothesis. This
suggests that we are not drawing spurious inferences based on weak instruments. We also present
results using LIML because LIML is approximately median unbiased for overidentified models, and
the results are similar between 2SLS and LIML. When 2SLS is subject to weak instrument bias, the
2SLS estimand will diverge significantly from the LIML estimand toward the OLS estimand. Thus,
the fact that LIML and 2SLS deliver similar results assuages our concerns pertaining to significant
biases associated with weak instruments.45
Table 5 shows that the coefficient for CO remains comparable in size to our baseline estimates
from Table 4, albeit slightly larger. Conversely, the coefficients on NO2 switch sign and are mostly
negative and insignificant. We have also used the methods proposed by Chernozhukov & Hansen
(2008) to build non-spherical confidence regions for the multiple endogenous variables. Within the
joint parameter space of CO and NO2, the joint confidence region lies in the quadrant where CO
is positive and NO2 is weakly negative.46
We interpret these findings as evidence that the returns from regulating CO exceed those from
regulating NO2, at least for the population that comprises our sample. One possible explanation for
our results stems from the work done by Auffhammer & Kellogg (2011). Figure 9 of Auffhammer
& Kellogg (2011) shows that the Southern California coastline, the location of most of the zip
codes in our study, ozone generation seems VOC limited, i.e., a reduction in VOC reduces ozone.
Conversely, regions further inland and in Northern California are NO2 limited. Reducing NO2 in
areas that are VOC limited has little effect on ozone, and this may be the reason we observe small
and insignificant results for NO2. In the remainder of the paper, we therefore focus on CO.
4.3.2 Threshold Effects and Non-Linearities in the Pollution-Health Relationship
We explore the functional form of the dose-response function in four separate ways. First, Table A14
estimates the relationship separately for the summer (April-September) and the winter (October-
March). Each panel of the table provides the point estimates for the two seasons from a joint
regression where all variables and instruments are interacted with seasonal dummies as well as the
p-values of a test whether the coefficients are the same. Especially for the case of children under
45A similar diagnostic exercise of this nature can be found on pages p. 213-215 of Angrist & Pischke (2008).46The full set of results, which consist of 2 dimensional plots for each hypothesis test, are available upon request.
27
the age of 5, the effect seem to be significantly higher during winter months when average pollution
levels are higher.
Recall that CO and NO2 pollution are higher during the winter months, so a nonlinear dose-
response function that has increasing marginal damages of pollution should exhibit larger coeffi-
cients for the winter months. The coefficient for the winter months is almost always larger than for
the summer months for the illnesses that are related to pollution (columns (1a)-(2)). These results
are consistent with increasing marginal impacts of pollution. However, there may be other impor-
tant differences in health outcomes across seasons that could explain these disparities. An obvious
candidate for differences between the summer and the winter would be the level of ambient ozone
concentrations which tend to be much higher in the summer than the winter. In additional results
(Table A15), we control for ozone levels as a potential confound and the results are nearly identical.
One possible explanation for why ozone does not impact the baseline regression results stems from
measurement error in the excluded ozone regressor and/or avoidance behavior pertaining to ozone.
We have also explored models which estimate the possible non-linear effects of pollution on
health outcomes by including higher order polynomials. Models with higher order pollution terms
increase the number of endogenous variables in our regressions, and we use the overidentified
model 3 to instrument for the higher order terms. Since higher order polynomials can be difficult
to interpret, Figure A5 plots the predicted marginal effects of the pollutant on a range of health
outcomes as a function of the level of the pollutant on the given day. That is, we plot the dose
response function, where the y-axis measures the health response and the x-axis measures the level
of pollution. Since we are fitting non-linear models, the responsiveness is allowed to vary across the
x-axis. The dashed line displays the results from our baseline, linear dose-response model (constant
marginal damage). The solid represents results from a quadratic model where the 95% confidence
interval is added in grey. The four columns represent the four sicknesses that are related to pollution
fluctuations (asthma, acute respiratory, all respiratory, and heart problems, respectively). The
predicted marginal effect is plotted over the empirical distribution of daily pollution levels, from
the 5th to the 95th percentile.47 While there is some evidence that respiratory problems (columns
1-3) exhibit increasing marginal damages as pollution levels start to increase, again especially for
children under the age of 5, the confidence intervals reflect an inability to reject the null that the
damage function is constant over the observed range of CO values.
We have investigated non-linearities in two additional ways that are broadly consistent with the
findings above (results available upon request). First, we estimated models whereby we interacted
our daily pollution variation of interest with the mean pollution level in a zip code. This allows
the dose-response curve to vary (linearly) in the level of average pollution levels of a zip code.
If this interaction term is zero, this would support the hypothesis that the marginal effect of a
one unit increase in emissions is the same regardless of the level of ambient air pollution (i.e.
a constant, linear dose-response). If the coefficient on the interaction was significantly positive,
then this would support the hypothesis that the marginal effect of ambient air pollution on health
47Figure A4 shows the observed distribution of daily CO levels in our data set.
28
outcomes is progressively worse in areas with higher than average pollution levels. A challenge
with this particular test is that the average level of ambient air pollution in a zip code can be
correlated with many observed and unobserved factors that may contribute to heterogeneity in the
dose-response relationship. For example, people in more polluted areas may lack basic preventive
health services and thus be more responsive to marginal changes in air pollution because of their
underlying health conditions rather than any non-linearity in the dose-response. Nevertheless,
results suggest (available upon request) that CO exhibits an increasing dose-response function.
Second, we explored the shape of the dose-response function in an OLS, fixed effects setting. While
we think the regression coefficient magnitudes may be attenuated by things such as measurement
error and/or avoidance type behaviors, the shape of the dose-response curve is likely less sensitive
to these concerns (unless of course the bias varied with the level of pollution - which might happen
through avoidance behavior such as “bad air day” alerts). We use this logic to explore the shape of
the dose-response function by fitting OLS, fixed-effect regression models that include polynomials
in the daily mean pollution level (i.e. quadratic, cubic, or quartic). We then plot the predicted
marginal effects of the pollutant on a range of health outcomes as a function of the level of the
pollutant on the given day (as in Figure A5). We see that for both asthma and respiratory illness,
the predicted marginal effect is increasing in the level of the pollutant. The patterns suggests some
sort of “threshold” by which the marginal effect of CO on health outcomes “flattens out”.
While the various results in this section come from different econometric models, the conclusions
pertaining to the shape of the dose-response function remain similar across the specifications. The
evidence suggests that the marginal effect of pollution is increasing in the level of the pollutant,
but at a decreasing rate. The diminishing marginal damages of the dose-response function is also
consistent with modern evidence from epidemiology (see e.g. Pope et al. (2009) and Pope III et al.
(2011)).
4.3.3 Potential Confounding Sources of Variation
While our estimates suggest that CO is primarily responsible for the observed health responses,
there may be other sources of unobserved, concomitant variation that may lead to similar rela-
tionships. For example, while we estimate the effect of CO and NO2 in the same model, we do
not directly control for other pollutants such as ozone. It seems unlikely that ozone O3 is causing
the observed relationship. As mentioned above, Appendix Table A14 estimates the relationship
separately for the summer (April-September) and the winter (October-March). Ozone is higher
during the summer, while CO and NO2 are higher during the winter. The observed health effects
are larger and more significant during the winter time when ozone is not a big problem. We have
also estimated models that directly control for ozone (Table A15), and the results remain similar
and a bit more precise than our baseline estimates. The standard errors are also much larger for
the summer, especially in the case of acute respiratory problems and overall respiratory problems.
This is not surprising, because other pollutants like ozone also impact health outcomes, which will
be part of the error term.
29
One potential omitted variable that we unfortunately cannot measure well is particulate matter,
a pollutant which may emerge from combustion emissions and has been shown in the past to increase
infant mortality due to respiratory causes (Currie & Neidell 2005). Particulate matter monitors in
California are limited in both their spatial and temporal coverage; readings on ambient particulate
monitors are conducted every few days (as opposed to hourly data from other pollutants), and
there are far fewer monitors. These limitations do not square well with our research design which
relies on high-frequency, daily variation across very localized areas. Nevertheless, we have directly
explored the degree to which particular matter predicts adverse health outcomes for the subsample
of days and locations for which we have particulate monitor data. Table A16 presents results using
the full set of particulate monitors for PM2.5.48 Table A16 suggests that PM does not have much
explanatory power in predicting health outcomes, although the standard errors preclude definitive
conclusions.49 Recall that Los Angeles Airport is not a significant point source of particulate
matter. While it is the single-largest point source for CO emissions in the state of California, it
only ranks 2763 and 2782 among emissions of PM10 and PM2.5. Even still, we believe that some
amount of caution is warranted in interpreting CO as the unique pollutant-related causal channel
leading to adverse health outcomes; there may be in fact other unobserved sources of ambient air
pollution that covary with CO that may also affect health.
4.3.4 Inpatient versus Outpatient Data
Traditionally, studies have relied on Inpatient data sets to examine health responsiveness to various
external factors such as pollution. One limitation of such data is that a person only enters the
Inpatient data set if they are admitted for an overnight stay in the hospital. Many ER visits
result in a discharge the same day and hence never result in an overnight stay. Starting in 2005,
California began collecting Outpatient (Emergency Room) data. Previous published estimates all
replied on Inpatient data only. To better understand the differences between these two datasets as
well as compare our results to those from the previous literature, we replicate the analysis using
sickness counts from only the Inpatient data in Panels A1-C1 in Table A17 of the appendix. By the
same token, panels A2-C2 only uses the Outpatient data.50 Not surprisingly, there is a significant
relationship between pollution and heart problems (column 2) in the Inpatient data for patient
ages 65 and above (as these conditions usually require an overnight stay), but no or very limited
sensitivity of asthma or overall respiratory illnesses (column 1a and 1c) to pollution. Conversely,
the Outpatient (ER) data shows a much larger sensitivity of respiratory problems to changes in
pollution. These results show the importance of Outpatient (ER) data when studying the morbidity
effects of ambient air pollution on health outcomes.
48Unfortunately, we only observe 2 PM10 pollution monitors within 15km of an airport (or equivalently 2 zip codes)which makes our research design infeasible due to the importance of distance and wind angle/speed heterogeneity.
49All of the estimates in Table A16 come from limited information maximum likelihood estimates as opposed to2SLS (although results are similar).
50Patients that enter the ER and are later admitted for an overnight stay are dropped from the ER data to avoiddouble counting.
30
4.3.5 Temporal Displacement and Dynamics
Our baseline regression models examine only the contemporaneous effect of pollution on health.
Contemporaneous estimates may lead to underestimates of the total effects of air pollution on health
if health effects respond sluggishly to changes in pollution. Conversely, estimates may overstate the
hypothesized effect due to temporal displacement: if spikes in daily pollution levels make already
sick people go to the hospital one day earlier, contemporaneous models overestimate the true effect
associated with permanently higher pollution levels. If temporal displacement is important, the
contemporaneous increase in sickness rates should be followed by a decrease in sickness rates in
subsequent periods.
We investigate both of these issues by estimating a distributed lag regression model, including
three lags in the pollution variable of interest. Table 6 presents the distributed lag results of
pollution for the overall population. We present individual coefficients as well as the combined effect
(the sum of the four) in the last row of each panel. To preserve space, we only list the results for
the sickness categories that are impacted by changing CO pollution levels. Since regulatory policy
is concerned with the health effects of a permanent change in pollution, we focus on cumulative
effects of the model over the estimated 4 day horizon. The cumulative effect is slightly larger
than the comparable baseline results in Table 4. This might be because some individuals delay
hospital visits, although the exact dynamics are hard to determine empirically given the lack of
significance of the individual coefficients. We have also experimented with different leads/lags
(available upon request). For example, in a model with 3 leads and 6 lags, the sum of the six lags
and contemporaneous terms are similar in magnitude. The three leads, on the other hand, are not
jointly significant.
4.3.6 Count Model
Our baseline health estimates consist of linear probability models, relating the population-scaled
hospital admission rates to changes in pollution. To account for the non-negative and discrete
nature of the hospital admission data, Table 7 presents estimates from a quasi-maximum likelihood,
conditional Poisson IV estimator given in equation (8). In contrast to the baseline linear probability
health models, these models are not weighted. In addition, since we use a control function to address
issues pertaining to measurement error and omitted variables, we adjust standard errors for the
first stage sampling variation using a block-bootstrap sampling procedure, resampling zip codes.51
Analogous to the linear probability model, we find that respiratory illnesses and heart problems are
sensitive to pollution fluctuations, while the three placebos are not (with the usual caveat applying
to sickness counts for people aged 65 and above).
The coefficients no longer give marginal impacts and are difficult to interpret. In order to
compare the marginal impacts of pollution exposure and congestion across all of our models, Table 8
presents the predicted increase in sickness counts from (i) a one standard deviation increase in taxi
51This is equivalent to clustering by zip code instead of two-way clustering by zip code and day. An unweightedregression of the linear probability model (available upon request) that clusters by zip code gives comparable results.
31
time, and (ii) a one standard deviation increase in pollution levels in each zip code. The results
are then added for all zip codes that are within 10km of an airport. The table also summarizes
population surrounding airports. Various admission categories are given in rows, while the columns
show the results for each of the 12 airports. The last column gives the combined impact among all
12 airports.
Panels A, B, and C give the predicted increase in hospital admissions using estimates from
the baseline linear probability model whereby pollution is instrumented using model 3 (pollution
instrumented with taxi time + interactions with distance and wind direction). These results are
presented for the overall population (Panel A), children below 5 years (Panel B), and senior citizens
65 and above (Panel C). Panel D gives the results for the overall population using the count model
shown in Table 7. Impacts are evaluated at the sample mean for the nonlinear Poisson model.
The results from the Poisson model are similar to those from the linear probability model in
Panel A. Panel E gives the average daily sickness count in 2005-2007 for the overall population for
comparison.
Pollution fluctuations have a large effect on the 6 million people living within 10km of one of
the 12 airports: A one standard deviation increase in a zip-codes specific pollution fluctuations
increases asthma counts for the overall population by 17% under the linear probability model and
19% under the Poisson count model.52 Overall, a one standard deviation increase in zip-code
specific daily pollution levels results in 107 additional admissions for respiratory problems and 49
additional admissions for heart problems, which are 17% and 9% of the daily mean. For respiratory
problems, infants only account for roughly one third of the overall impacts. Studies focusing only
on the impact on infants therefore would miss a significant portion of the overall impacts. Not
surprisingly, the elderly are responsible for the largest share of heart related impacts.
Airport congestion significantly contributes to the overall impacts: a one standard deviation
increase in taxi time increases respiratory and heart admissions by roughly 1% of the daily mean. At
LAX, the largest airport in California, a one standard deviation increase in taxi time is responsible
for roughly one-fourth of the effect of a one-standard deviation increase in pollution. On the other
hand, smaller airports (e.g., Santa Barbara or Long Beach) are responsible for a much lower share
of the overall pollution impacts.
4.3.7 Economic Cost
In order to monetize the health impacts associated with both pollution exposure, we use the
diagnosis-specific reimbursement rates offered to hospitals through Medicare.53 We view this mea-
52Recall that these estimates are smaller than what we reported under Table 4, where we increased pollution levelsin each zip code by the average overall standard deviation in pollution levels and took an average baseline sicknessrate that was not population weighted.
53This information comes from a translation between our hospital diagnosis codes (ICD-9) and Diagnosis RelatedGroup (DRG) codes. We used the crosswalk from the AMA Code Manager Online Elite. Using the set of DRGcodes, we calculate the Medicare reimbursement rates using the DRG Payment calculator provided by TRICARE(http://www.tricare.mil/drgrates/). In accordance with Medicare reimbursement policy, we adjust the DRG pay-ments using the average wage index in our sample. The average cost for respiratory problems and heart related
32
sure as a lower bound on the total health costs for several reasons: first, our methodology measures
limited impacts on both a temporal and spatial scale. By focusing on day-to-day fluctuations, we
do not address the long run, cumulative effect of pollution on health. If these are sizable relative
to the contemporaneous effects, the overall cost estimate will be higher. Similarly, our focus has
been on individuals living within 10km of an airport. Some of our estimates suggest the marginal
impact of taxi time extends beyond the 10km radius, in which case we would be understating the
overall effect. Second, we only count people that are sick enough to go to the hospital - anybody
who sees their primary care physician or stays home feeling sick will not be counted. Recent work
by Hanna & Oliva (2011) finds that pollution decreases labor supply in Mexico City, imposing
real economic costs on society not measured in our analysis. Similarly, Deschenes, Greenstone &
Shapiro (2012) find that increased levels of ambient NO2 lead to increased levels of spending on
respiratory related prescription medicines, an outcome not measured in our analysis. Third, and
most importantly, the marginal willingness to pay to avoid treatment is likely higher than the cost
of treatment. For example, severe heart related problems that are not treated within a narrow
time frame will likely result in death. The statistical value of life that EPA uses for its benefit-cost
analyses is around 6 million dollars, which is 1000 times as larger as our medical reimbursement
cost for heart-related problems. Individuals might be willing to pay significantly more than medi-
cal reimbursement rates to avoid illnesses that, if not adequately treated, have dire consequences.
Using the predicted increase in hospital visits under the linear probability model given in Table 8,
a one standard deviation increase in pollution levels amounts to about a $540,000 increase in hos-
pitalization payments related to respiratory and heart related hospital admissions under model 3.54
Since a one-standard deviation change in pollution is an extrapolation from the fluctuations caused
by airport congestion, we also analysis counterfactual where peak exposure levels are capped using
the nonlinear models of Figure A5 but find comparable results that are available upon request.55
5 Conclusions
This study has shown how daily variation in ground level airport congestion due to network delays
significantly affects both local pollution levels as well as local measures of health. In doing so, we
develop a framework through which to credibly estimate the effects of exogenous shocks to local
air pollution on contemporaneous measures of health. Daily local pollution shocks are caused by
events that occur several thousand miles away and are arguably exogenous to the local area. We
admissions are US$ 2702 and 6501, respectively.54The corresponding number under model 1 is $920 thousand. These figures are calculated by taking the estimated
increase in hospital visits and multiplying it by the average Medicare reimbursement for each of the respectivediagnoses.
55Specifically, we test the sensitive of our results to assumed linear extrapolation through a counterfactual whereall CO levels in 2005-2007 are caped at half the observed mean, i.e., values that exceed half the historic mean arereduced to equal half the historic mean. The implied pollution reduction is evaluated both using our linear baselinemodel as well as a quadratic or cubic in pollution exposure. The predicted economic benefits are 520 thousand inthe linear model and 650 thousand in the two nonlinear models, suggesting that allowing for increasing marginaldamages of pollution might give slightly larger damages.
33
address several longstanding issues pertaining to non-random selection and behavioral responses
to pollution. In addition, we show how newly available data on the universe of emergency room
provides much cleaner insight as to the sensitivity of populations to ambient pollution levels, relative
to existing Inpatient Discharge records. Our results suggest that ground operations at airports are
responsible for a tremendous amount of local ambient air pollution. Specifically, a one standard
deviation change in daily congestion at LAX is responsible for a 0.28 standard deviation increase
in levels of CO next to the airport that fades out with distance. The average impact for zip codes
within 10km is 0.23 standard deviations.
When connecting these models to measures of health, we find that admissions for respiratory
problems and heart disease are strongly related to these pollution changes. A one standard de-
viation increase in daily zip-code specific pollution levels increases asthma counts by 17% of the
baseline average, total respiratory problems by 17%, and heart problems by 9%. Infants and the
elderly show a higher sensitivity to pollution fluctuations, and marginal damages of pollution seem
to be increasing in pollution for infants. At the same time, adults age 20-64 are also impacted. For
respiratory problems, the general adult population accounts for the majority of the total impacts
despite the lower sensitivity to fluctuations as they are the largest share of the population. A one
standard deviation increase in pollution levels is responsible for 540 thousand dollars in hospital-
ization costs for the 6 million people living within 10km of one of the 12 airports of our study. This
is likely a significant lower bound as the willingness to pay to avoid such illnesses will be higher
than the Medicare reimbursement rates.
Examining various mechanisms for the observed pollution-health relationship, we find that CO
is primarily responsible for the observed health effects as opposed to NO2 or O3. We find no
evidence of forward displacement or delayed impacts of pollution. We also find no evidence that
people in areas with larger pollution shocks are less susceptible or less responsive to pollution.
These estimates suggest that relatively small amounts of ambient air pollution can have substan-
tial effects on the incidence of local respiratory illness, at least for the population that comprises our
study. While EPA recently decided against lowering the existing carbon monoxide standards due
to lack of sufficient evidence of the harmful effects of CO at levels below current EPA mandates, we
find significant impacts on morbidity. Recent research suggests that the rates of respiratory illness
in the United States are rising dramatically, even as ambient levels of air pollution have continued
to fall (Center for Disease Control 2011). Why asthma rates continue to rise is an open question,
but the increase in asthma rates is most pronounced amongst African Americans who dispropor-
tionately live in densely populated, congested areas. At the same time, traffic congestion in cities
has been rising dramatically. Results presented here suggest that at least part of the increased
rate of asthma in urban areas can be explained by increased levels of traffic congestion. The exact
mechanism remain beyond the scope of the current study, but this remains an interesting area for
further research.56
56Currently, the highest rates of asthma incidence in the United States are found in Bronx, New York (Garget al. 2003). This area of northern New York City is bisected by 5 major highways, that rank among the mostcongested in the United States (Bruner 2009).
34
References
Angrist, Joshua D, and Jorn-Steffen Pischke. 2008. Mostly harmless econometrics: An em-piricist’s companion. Princeton University Press.
Auffhammer, Maximilan, and Ryan Kellogg. 2011. “Clearing the Air? The Effects of GasolineContent Regulation on Air Quality.” American Economic Review, 101(6): 2687–2722.
Batterman, Stuart A, Kai Zhang, and Robert Kononowech. 2010. “Prediction and analysisof near-road concentrations using a reduced-form emission/dispersion model.” EnvironmentalHealth, 9(29).
Baum, Christopher F., Mark E. Schaffer, and Steven Stillman. 2003. “Instrumental vari-ables and GMM: Estimation and testing.” Stata Journal, 3(1): 1–31.
Benson, Paul E. 1984. “CALINE4 - A Dispersion Model for Predicting Air Pollutant Concentra-tions near Roadways.” California Department of Transportation - FHWA/CA/TL-84/15.
Bound, John, David A. Jaeger, and Regina M. Baker. 1995. “Problems with InstrumentalVariables Estimation When the Correlation Between the Instruments and the EndogeneousExplanatory Variable is Weak.” Journal of the American Statistical Association, 90(430): 443–450.
Bruner, Jon. 2009. “America’s Worst Intersections.” Forbes.
Cameron, A. Colin, and Pravin K. Trivedi. 1998. Regression Analysis of Count Data. Cam-bridge, MA:Cambridge University Press.
Center for Disease Control. 2011. “Asthma in the US: Growing Every Year.” CDC Vital Signs.
Chay, Kenneth Y., and Michael Greenstone. 2003. “The Impact of Air Pollution on InfantMortality: Evidence from Geographic Variation in Pollution Shocks Induced by a Recession.”Quarterly Journal of Economics, 118(3): 1121–1167.
Chernozhukov, Victor, and Christian Hansen. 2008. “The reduced form: A simple approachto inference with weak instruments.” Economics Letters, 100(1): 68–71.
Currie, Janet, and Matthew Neidell. 2005. “Air Pollution and Infant Health: What Can WeLearn From California’s Recent Experience?” Quarterly Journal of Economics, 120(3): 1003–1030.
Dahl, Gordon B, and Lance Lochner. 2012. “The impact of family income on child achieve-ment: Evidence from the earned income tax credit.” The American Economic Review,102(5): 1927–1956.
Davidson, Russell, and James G. MacKinnon. 1993. Estimation and Inference in Economet-rics. Oxofd University Press.
Deschenes, Olivier, Michael Greenstone, and Jonathan Guryan. 2009. “Climate Changeand Birth Weight.” American Economic Review: Papers & Proceedings, 99(2): 211–217.
Deschenes, Olivier, Michael Greenstone, and Joseph Shapiro. 2012. “Defensive Invest-ments and the Demand for Air Quality: Evidence from the NOx Budget Program and OzoneReductions.” Energy Institute at Haas Working Paper Series 233.
35
Environmental Protection Agency. 1992. “Procedures for Emission Inventory Prepara-tion.Volume IV: Mobile Sources (EPA420-R-92-009).”
Environmental Protection Agency. 2005. “2005 National Emissions Inventory Data And Doc-umentation.”
Federal Aviation Administration. 2005a. “Aviation & Emissions: A Primer.” Office of Envi-ronment and Energy.
Federal Aviation Administration. 2005b. Passenger Board-ing (Enplanement) and All-Cargo Data for U.S. Airports.http://www.faa.gov/airports/planning capacity/passenger allcargo stats/passenger/.
Friedman, Michael S., Kenneth E. Powell, Lori Hutwagner, LeRoy M. Graham, andW. Gerald Teague. 2001. “Impact of Changes in Transportation and Commuting BehaviorsDuring the 1996 Summer Olympic Games in Atlanta on Air Quality and Childhood Asthma.”Journal of the American Medical Association, 285(7): 897–905.
Garen, John. 1984. “The Returns to Schooling: A Selectivity Bias Approach with a ContinuousChoice Variable.” Econometrica, 52(5): 1199–1218.
Garg, Renu, Adam Karpati, Jessica Leighton, Mary Perrin, and Mona Shah. 2003.Asthma Facts. . Second ed., New York City Department of Health and Mental Hygiene.
Griliches, Zvi, and Jerry A. Hausman. 1986. “Errors in variables in panel data.” Journal ofEconometrics, 31(1): 93–118.
Grossman, Michael. 1972. “On the Concept of Health Capital and the Demand for Health.”Journal of Political Economy, 80(2): 223–255.
Hanna, Rema, and Paulina Oliva. 2011. “The Effect of Pollution on Labor Supply: Evidencefrom a Natural Experiment in Mexico City.” NBER Working Paper 17302.
Hausman, Jerry, Bronwyn H. Hall, and Zvi Griliches. 1984. “Econometric Models for CountData with an Application to the Patents-R & D Relationship.” Econometrica, 52(4): 909–938.
Heckman, J., and E. Vytlacil. 1998. “Instrumental Variables Methods for the Correlated Ran-dom Coefficient Model: Estimating the Average Rate of Return to Schooling When the Returnis Correlated with Schooling.” Journal of Human Resources, 33(4): 974–987.
Ito, Kazuhiko, George D. Thurston, and Robert A. Silverman. 2007. “Characterization ofPM2.5, gaseous pollutants, and meteorological interactions in the context of time-series healtheffects models.” Journal of Exposure Science and Environmental Epidemiology, 17: S45–S60.
Knittel, Christopher R., Douglas L. Miller, and Nicholas J. Sanders. 2011. “Caution,Drivers! Children Present: Traffic, Pollution, and Infant Health.” NBER Working Paper17222.
Linn, William S., Edward L. Avol, Ru-Chuan Peng, Deborah A. Shamoo, and Jack D.Hackney. 1987. “Replicated Dose-Response Study of Sulfur Dioxide Effects in Normal, Atopic,and Asthmatic Volunteers.” American Journal of Respiratory and Critical Care Medicine,136(5): 1127–1135.
36
Moretti, Enrico, and Matthew Neidell. 2011. “Pollution, Health, and Avoidance Behavior:Evidence from the Ports of Los Angeles.” Journal of Human Resources, 46(1): 154–175.
Murphy, Kevin M., and Robert H. Topel. 2002. “Estimation and Inference in Two-StepEconometric Models.” Journal of Business & Economic Statistics, 20(1): 88–97.
Neidell, Matthew. 2009. “Information, Avoidance Behavior, and Health: The Effect of Ozoneon Asthma Hospitalizations.” Journal of Human Resources, 44(2): 450–478.
New York Times. January 27, 2012. N.Y. Airports Account for Half of All Flight Delays.
Peel, Jennifer L., Paige E. Tolbert, Mitchel Klein, Kristi Busico Metzger, W. DanaFlanders, Knox Todd, James A. Mulholland, P. Barry Ryan, and Howard Frumkin.2005. “Ambient Air Pollution and Respiratory Emergency Department Visits.” Epidemiology,16(2): 164–174.
Peters, Annette, Douglas W. Dockery, James E. Muller, and Murray A. Mittleman.2001. “Increased Particulate Air Pollution and the Triggering of Myocardial Infarction.” Cir-culation, 103: 2810–2815.
Pope, C Arden, Richard T Burnett, Daniel Krewski, Michael Jerrett, Yuanli Shi, Eu-genia E Calle, and Michael J Thun. 2009. “Cardiovascular mortality and exposure toairborne fine particulate matter and cigarette smoke shape of the exposure-response relation-ship.” Circulation, 120(11): 941–948.
Pope III, C Arden, Richard T Burnett, Michelle C Turner, Aaron Cohen, DanielKrewski, Michael Jerrett, Susan M Gapstur, and Michael J Thun. 2011. “Lung can-cer and cardiovascular disease mortality associated with ambient air pollution and cigarettesmoke: shape of the exposure-response relationships.” Environmental health perspectives,119(11): 1616.
Pyrgiotisa, Nikolas, Kerry M. Maloneb, and Amedeo Odoni. 2013. “Modelling delay prop-agation within an airport network.” Transportation Research Part C: Emerging Technologies,27: 60–75.
Schildcrout, Jonathan S., Lianne Sheppard, Thomas Lumley, James C. Slaughter,Jane Q. Koenig, and Gail G. Shapiro. 2006. “Ambient Air Pollution and Asthma Exacer-bations in Children: An Eight-CityAnalysis.” American Journal of Epidemiology, 164(6): 505–517.
Schlenker, Wolfram, and Michael J. Roberts. 2009. “Nonlinear Temperature Effects IndicateSevere Damages to U.S. Crop Yields under Climate Change.” Proceedings of the NationalAcademy of Sciences of the United States, 106(37): 15594–15598.
Schlenker, Wolfram, and W Reed Walker. 2011. “Airports, Air Pollution, and Contempora-neous Health.” NBER Working Paper 17684.
Schwartz, J., C. Spix, G. Touloumi, L. Bacharova, T. Barumamdzadeh, A. le Tertre,T. Piekarksi, A. Ponce de Leon, A. Ponka, G. Rossi, M. Saez, and J. P. Schouten.1996. “Methodological issues in studies of air pollution and daily counts of deaths or hospitaladmissions.” Journal of Epidemiology and Community Health, 50: S3–S11.
37
Seaton, A., D. Godden, W. MacNee, and K. Donaldson. 1995. “Particulate air pollutionand acute health effects.” The Lancet, 345(8943): 176–178.
Stock, James H., Jonathan H. Wright, and Motohiro Yogo. 2002. “A survey of weakinstruments and weak identification in generalized method of moments.” Journal of Business& Economic Statistics, 20(4): 518–529.
Transportation Research Board. 2008. “Aircraft and Airport-Related Hazardous Air Pollu-tants: Research Needs and Analysis.” National Academy of Sciences, Airport CooperativeResearch Program (ACRP) Report 7.
United States Environmental Protection Agency. 2004. “Guidance on Airport EmissionsReduction Credits for Early Measures Through Voluntary Airport Low Emissions Programs.”Air Quality Strategies and Standards Division. Office of Air Quality Planning and Standards.
Welman, Stephen, Ashley William, and David Hechtman. 2010. “Calculating Delay Prop-agation Multipliers for Cost-Benefit Analysis.” Center for Advanced Aviation System Devel-opment and MITRE MP 100039.
Willis, R.J., and S. Rosen. 1979. “Education and Self-Selection.” Journal of Political Economy,87(5): S7–S36.
Wooldridge, Jeffrey M. 1997. “On two stage least squares estimation of the average treatmenteffect in a random coefficient model.” Economics Letters, 56(2): 129–133.
Wooldridge, Jeffrey M. 1999. “Distribution-free estimation of some nonlinear panel data mod-els.” Journal of Econometrics, 90(1): 77–97.
Wooldridge, Jeffrey M. 2002. Econometric analysis of cross section and panel data. MIT Press.
Wooldridge, Jeffrey M. 2003. “Further results on instrumental variables estimation of averagetreatment effects in the correlated random coefficient model.” Economics Letters, 79(2): 185–191.
38
Figure 1: Location of Airports, Pollution Monitors, and Zip Codes
Southern California
Airport LocationCARB CO Pollution MonitorCARB NO2 Pollution MonitorZip Code Centroid [0,10] km of AirportZip Code Centroid (10,oo) km of Airport
Northern California
Notes: The 12 largest airports in California are shown as dots. The location of CO pollution monitors in the California
Air Resource Board (CARB) data base are shown as x, the location of NO2 monitors as +. Zip code boundaries are
shown in grey. They are shaded if the centroid is within 10km (6.2miles) of an airport.
39
Figure 2: Histogram of Daily Wind Direction At Airports
Burbank (BUR)North
South
EastW
est
Los Angeles (LAX)North
South
EastW
est
Long Beach (LGB)North
South
EastW
est
Oakland (OAK)North
South
EastW
est
Ontario (ONT)North
South
EastW
est
Palm Springs (PSP)North
South
EastW
est
San Diego (SAN)North
South
EastW
est
Santa Barbara (SBA)North
South
EastW
est
San Francisco (SFO)North
South
EastW
est
San Jose (SJC)North
South
EastW
est
Sacramento (SMF)North
South
EastW
est
Santa Ana (SNA)North
South
EastW
est
Notes: Histogram of the distribution of daily directions in which the wind is blowing (2005-2007). Plot is normalized
to the most frequent category. The four circles indicate the quartile range. Airport locations are shown in Figure 1.
40
Figure 3: Contour Maps: Marginal Impact of Taxi Time on Pollution Levels
Carbon Monoxide (CO)
Downwind
−10km 10km
−10km
10km
Downwind
−10km 10km
−10km
10km
Downwind
−10km 10km
−10km
10km
Downwind
−10km 10km
−10km
10km
13.00
15.04
17.08
19.12
21.16
23.20
25.24
27.28
29.32
31.36
33.40
35.44
37.48
39.52
41.56
43.60
45.64
47.68
49.72
51.76
53.80
55.84
57.88
59.92
61.96
64.00
Nitrogen Dioxide (NO2)
Downwind
−10km 10km
−10km
10km
Downwind
−10km 10km
−10km
10km
Downwind
−10km 10km
−10km
10km
Downwind
−10km 10km
−10km
10km
0.23
0.25
0.27
0.29
0.31
0.34
0.36
0.38
0.40
0.42
0.44
0.46
0.48
0.51
0.53
0.55
0.57
0.59
0.61
0.63
0.65
0.68
0.70
0.72
0.74
0.76
Notes: Graphs display the marginal impact of taxi time (ppb per 1000 minute of taxi time, i.e., kmin) on pollution levels across space for different wind speeds.
The x-axis shows the direction in which the wind is blowing: positive x-values imply the location is downwind, negative value imply they are upwind. Points on
the y-axis are at a right angle to the wind direction. The wind speeds in columns 1-4 are 0.1m/s, 1m/s, 2m/s, and 3m/s corresponding to the 0.1, 10.6, 34.5, and
66.5 percentiles of the distribution of wind speeds in 2005-2007 at the 12 airports in our study (see Figure 1).
41
Table 1: Pollution Regressed On Instrumented Taxi Time
CO Pollution NO2 Pollution O3 PollutionVariable (1a) (1b) (1c) (2a) (2b) (2c) (3a) (3b) (3c)
Taxi Time 44.78∗∗∗ 56.26∗∗∗ 52.56∗∗∗ 0.57∗∗∗ 0.67∗∗∗ 0.67∗∗∗ -0.00 0.08 0.16(5.04) (9.48) (10.49) (0.09) (0.15) (0.22) (0.09) (0.11) (0.20)
Taxi x Distance -1.62 -2.13 -0.01 -0.02 -0.01 -0.03(1.22) (1.37) (0.02) (0.03) (0.01) (0.02)
Taxi x Angleu 13.16∗ 0.31 -0.50∗∗∗
(7.78) (0.22) (0.18)Taxi x Angled 5.48 0.05 0.05
(6.97) (0.18) (0.12)Taxi x Speed -2.05 -0.08∗ 0.04
(1.89) (0.04) (0.05)Taxi x Distance x Angleu -0.60 -0.02 0.05∗∗
(1.10) (0.03) (0.02)Taxi x Distance x Angled 0.16 -0.01 -0.01
(0.89) (0.03) (0.02)Taxi x Distance x Speed 0.55∗∗ 0.01∗ -0.00
(0.25) (0.01) (0.01)Taxi x Angled x Speed 1.70 0.10∗ -0.07
(2.66) (0.05) (0.06)Taxi x Angleu x Speed -10.41∗∗∗ -0.19∗∗ 0.26∗∗∗
(3.74) (0.10) (0.09)Taxi x Dist. x Angleu x Speed 1.50∗∗∗ 0.03∗∗ -0.03∗∗
(0.50) (0.01) (0.01)Taxi x Dist. x Angled x Speed -0.63∗ -0.01 0.01
(0.35) (0.01) (0.01)
Observations 179580 179580 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095 1095 1095F-stat(joint sig.) 78.48 42.24 14.11 39.67 19.80 4.88 0.00 0.76 1.26p-value (joint sig.) 1.33e-15 1.66e-15 7.89e-20 2.68e-09 2.00e-08 7.48e-07 .9773 .4705 .2452
Notes: Table regresses zip-code level pollution measures on airport congestion (total taxi time in 1000min) in 2005-2007. Taxi time at the local airport is
instrumented with the taxi time at three major airports in the Eastern United States. All regressions include weather controls (quadratic in minimum and
maximum temperature, precipitation and wind speed as well as controls for wind direction), temporal controls (year, month, weekday, and holiday fixed effects),
and zip code fixed effects. Regressions are weighted by the total population in a zip code. Errors are two-way clustered by zip code and day. Significance levels
are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
42
Table 2: Sickness Rates Regressed On Instrumented Taxi Time
Acute All All Bone Appen-Asthma Respiratory Respiratory Heart Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: All AgesTaxi Time 13.84∗∗∗ 24.77∗∗∗ 33.89∗∗∗ 19.35∗∗∗ 2.55 -1.33 0.26
(2.72) (7.73) (9.82) (5.24) (1.71) (2.87) (0.68)
Panel B: Ages Below 5Taxi Time 24.46∗∗ 84.28 116.12∗ 6.63∗ 0.80 2.16 -0.29
(11.21) (51.35) (62.46) (3.47) (0.94) (5.84) (1.38)
Panel C: Age 65 and AboveTaxi Time 36.89∗∗∗ 63.80∗∗∗ 100.53∗∗∗ 156.54∗∗∗ 22.87∗ 19.13∗ 0.75
(11.39) (16.43) (25.25) (36.98) (12.95) (9.95) (1.21)
Observations 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095
Notes: Table regresses zip-code level sickness rates (counts for primary and secondary diagnosis codes per 10 million people) on daily congestion (taxi time in
1000min) that is caused by network delays (taxi time at three major airports in the Eastern United States). All regressions include weather controls (quadratic
in minimum and maximum temperature, precipitation and wind speed as well as controls for wind direction), temporal controls (year, month, weekday, and
holiday fixed effects), and zip code fixed effects. Regressions are weighted by the total population in a zip code. Errors are two-way clustered by zip code and
day. Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
43
Table 3: Sickness Rates Regressed On Pollution
Acute All All Bone Appen-Asthma Respiratory Respiratory Heart Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: CO Pollution - All AgesNo Controls 0.070∗∗∗ 0.265∗∗∗ 0.353∗∗∗ 0.035 -0.002 -0.022∗∗∗ -0.001
(0.017) (0.041) (0.053) (0.028) (0.006) (0.007) (0.001)Time Controls 0.030 0.058 0.070 -0.022 -0.014∗ -0.008 0.001
(0.024) (0.057) (0.075) (0.040) (0.008) (0.010) (0.001)Time + Weather 0.056∗∗ 0.047 0.071 0.002 -0.005 -0.012 -0.001
(0.028) (0.068) (0.091) (0.052) (0.010) (0.012) (0.001)Time + Weather + Zip Code FE 0.011 0.049∗∗∗ 0.077∗∗∗ 0.027∗∗∗ -0.001 -0.007∗ 0.002
(0.007) (0.018) (0.022) (0.008) (0.003) (0.004) (0.001)
Panel B: NO2 Pollution - All AgesNo Controls 3.1∗∗∗ 10.7∗∗∗ 14.6∗∗∗ 4.3∗∗∗ 0.6∗∗∗ -0.3 0.1∗∗
(0.5) (1.3) (1.7) (1.1) (0.2) (0.2) (0.0)Time Controls 1.7∗∗ 6.0∗∗∗ 7.9∗∗∗ 1.0 -0.1 0.6∗ 0.1∗∗
(0.7) (1.5) (2.1) (1.4) (0.3) (0.3) (0.0)Time + Weather 4.2∗∗∗ 8.3∗∗∗ 11.5∗∗∗ 3.0 0.8∗ 0.7 -0.0
(1.0) (2.6) (3.6) (2.4) (0.5) (0.5) (0.1)Time + Weather + Zip Code FE 0.1 1.2∗ 2.5∗∗∗ 0.9∗∗∗ 0.1 -0.0 0.1∗
(0.2) (0.6) (0.8) (0.3) (0.1) (0.2) (0.0)
Observations 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095
Notes: Table regresses zip-code level sickness rates (based on primary and secondary diagnosis codes) on daily pollution (ppb) in 2005-2007. Each entry is a
separate regression. Columns use sickness rates (counts per 10 million people) for different diseases, while rows use different controls. The first specification
(row) in each panel has no controls, while the second adds time controls (year, month, weekday as well as holiday fixed effects), the third adds weather controls
(quadratic in minimum and maximum temperature, precipitation and wind speed as well as controls for wind direction), and the fourth adds zip code fixed effects.
All regressions are weighted by the total population in a zip code. Errors are two-way clustered by zip code and day. Significance levels are indicated by ∗∗∗ 1%,∗∗ 5%, ∗ 10%.
44
Table 4: Sickness Rates Regressed On Instrumented Pollution
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: All AgesModel 1: CO 0.311∗∗∗ 0.556∗∗∗ 0.761∗∗∗ 0.434∗∗∗ 0.057 -0.030 0.006
(0.065) (0.162) (0.207) (0.134) (0.039) (0.063) (0.015)Model 2: CO 0.307∗∗∗ 0.550∗∗∗ 0.755∗∗∗ 0.419∗∗∗ 0.050 -0.030 0.003
(0.062) (0.163) (0.210) (0.128) (0.038) (0.064) (0.015)Model 3: CO 0.194∗∗∗ 0.396∗∗∗ 0.515∗∗∗ 0.226∗∗∗ 0.020 -0.039 0.002
(0.047) (0.125) (0.165) (0.079) (0.030) (0.040) (0.011)Model 1: NO2 24.5∗∗∗ 43.8∗∗∗ 59.9∗∗∗ 34.2∗∗∗ 4.5 -2.4 0.5
(6.2) (16.2) (20.5) (10.5) (3.1) (5.2) (1.2)Model 2: NO2 24.3∗∗∗ 43.6∗∗∗ 59.8∗∗∗ 33.5∗∗∗ 4.2 -2.4 0.3
(6.1) (16.3) (20.8) (10.4) (3.1) (5.2) (1.2)Model 3: NO2 12.4∗∗∗ 18.9∗ 24.2∗ 17.1∗∗ 0.7 -1.0 0.3
(4.0) (11.0) (14.2) (7.1) (2.2) (3.0) (0.9)
Panel B: Ages Below 5Model 1: CO 0.565∗∗ 1.948∗ 2.683∗∗ 0.153∗ 0.018 0.050 -0.007
(0.240) (1.124) (1.353) (0.081) (0.021) (0.136) (0.032)Model 2: CO 0.579∗∗ 1.930∗ 2.624∗ 0.127 0.020 0.064 -0.013
(0.235) (1.111) (1.356) (0.078) (0.023) (0.132) (0.034)Model 3: CO 0.669∗∗∗ 2.166∗∗∗ 2.493∗∗ 0.075 0.023 -0.012 -0.009
(0.170) (0.796) (0.980) (0.057) (0.015) (0.122) (0.022)Model 1: NO2 42.2∗∗ 145.3 200.2∗ 11.4∗ 1.4 3.7 -0.5
(20.5) (95.2) (117.3) (6.4) (1.6) (10.0) (2.4)Model 2: NO2 43.2∗∗ 144.1 195.9∗ 9.5 1.5 4.8 -0.9
(20.2) (94.3) (117.6) (6.2) (1.7) (9.7) (2.5)Model 3: NO2 43.6∗∗∗ 122.2∗ 140.8∗ 4.5 2.9∗∗ 3.7 0.6
(14.9) (67.7) (82.4) (4.6) (1.3) (9.3) (2.0)
Panel C: Ages 65 and OlderModel 1: CO 0.849∗∗∗ 1.469∗∗∗ 2.314∗∗∗ 3.604∗∗∗ 0.526∗ 0.440∗ 0.017
(0.312) (0.440) (0.642) (1.001) (0.297) (0.242) (0.028)Model 2: CO 0.815∗∗∗ 1.413∗∗∗ 2.275∗∗∗ 3.529∗∗∗ 0.502∗ 0.409∗ 0.016
(0.288) (0.422) (0.637) (0.971) (0.302) (0.241) (0.028)Model 3: CO 0.493∗∗ 0.696∗∗ 1.424∗∗∗ 1.937∗∗∗ 0.198 0.187 -0.025
(0.204) (0.309) (0.511) (0.620) (0.249) (0.161) (0.026)Model 1: NO2 66.5∗∗∗ 114.9∗∗∗ 181.1∗∗∗ 282.0∗∗∗ 41.2∗ 34.5∗ 1.4
(22.8) (34.3) (52.3) (76.1) (24.2) (18.2) (2.2)Model 2: NO2 66.5∗∗∗ 115.1∗∗∗ 181.4∗∗∗ 282.2∗∗∗ 41.2∗ 34.5∗ 1.4
(22.8) (34.3) (52.3) (76.2) (24.3) (18.2) (2.2)Model 3: NO2 35.3∗∗ 38.9 75.8∗ 131.6∗∗∗ 3.6 12.2 -0.8
(14.2) (24.6) (41.3) (47.8) (16.5) (12.0) (1.6)
Observations 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095
Notes: Table regresses zip-code level sickness rates (counts for primary and secondary diagnosis codes per 10 million
people) on daily instrumented pollution levels (ppb) in 2005-2007. Each entry is a separate regression. Pollution is
instrumented on airport congestion (taxi time) that is caused by network delays (taxi time at three major airports
in the Eastern United States). Model 1 assumes a uniform impact of congestion on pollution levels at all zip codes
surrounding an airport, while model 2 adds an interaction with the distance to the airport, and model 3 furthermore
adds interactions with wind direction and speed (columns (a)-(c) in Table 1). All regressions include weather
controls (quadratic in minimum and maximum temperature, precipitation and wind speed as well as controls for
wind direction), temporal controls (year, month, weekday, and holiday fixed effects), and zip code fixed effects.
Regressions are weighted by the total population in a zip code. Errors are two-way clustered by zip code and day.
Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.45
Table 5: Sickness Rates Regressed On Instrumented Pollution - Joint Estimation
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: All AgesModel 3: CO 0.222∗∗ 0.867∗∗∗ 1.189∗∗ 0.105 0.054 -0.127∗ -0.006
(0.106) (0.313) (0.476) (0.139) (0.049) (0.073) (0.017)Model 3: NO2 -2.2 -40.1 -57.4 10.7 -2.9 7.6 0.7
(8.0) (25.6) (38.4) (13.4) (3.5) (5.6) (1.5)F(1st stage) - CO 4.18 4.18 4.18 4.18 4.18 4.18 4.18F(1st stage) - NO2 1.57 1.57 1.57 1.57 1.57 1.57 1.57P(Anderson-Rubin) 0.0000 0.0000 0.0000 0.0012 0.0027 0.1418 0.0484P(Stock-Wright S) 0.0413 0.0795 0.1141 0.1232 0.5319 0.4795 0.5710
Panel B: Ages Below 5Model 3: CO 0.901 4.685∗∗ 5.388∗∗ 0.133 -0.073 -0.385 -0.102
(0.599) (2.167) (2.483) (0.142) (0.051) (0.355) (0.072)Model 3: NO2 -18.8 -206.8 -237.6 -4.8 8.0∗ 30.9 7.7
(46.1) (174.9) (199.0) (11.5) (4.1) (27.3) (6.4)F(1st stage) - CO 3.39 3.39 3.39 3.39 3.39 3.39 3.39F(1st stage) - NO2 1.30 1.30 1.30 1.30 1.30 1.30 1.30P(Anderson-Rubin) 0.0000 0.0001 0.0000 0.0185 0.0474 0.0687 0.6100P(Stock-Wright S) 0.0588 0.0886 0.2163 0.4046 0.3567 0.3185 0.5764
Panel C: Age 65 and AboveModel 3: CO 0.268 0.775∗ 1.726∗∗ 1.279 0.490 0.147 -0.051
(0.323) (0.445) (0.772) (0.821) (0.399) (0.281) (0.046)Model 3: NO2 20.1 -6.4 -25.8 60.3 -25.5 3.7 2.2
(22.1) (36.3) (62.6) (67.7) (25.9) (21.4) (3.1)F(1st stage) - CO 4.96 4.96 4.96 4.96 4.96 4.96 4.96F(1st stage) - NO2 2.09 2.09 2.09 2.09 2.09 2.09 2.09P(Anderson-Rubin) 0.1146 0.0026 0.0035 0.0009 0.0164 0.1595 0.0017P(Stock-Wright S) 0.4530 0.1484 0.2756 0.1831 0.3849 0.2969 0.1264
Observations 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095
Notes: Table regresses zip-code level sickness rates (counts for primary and secondary diagnosis codes per 10
million people) on daily instrumented pollution levels (ppb) in 2005-2007. The effect of the two pollutants is jointly
estimated for the over-identified model 3 using LIML. Pollution is instrumented on airport congestion (taxi time)
that is caused by network delays (taxi time at three major airports in the Eastern United States). All regressions
include weather controls (quadratic in minimum and maximum temperature, precipitation and wind speed as well
as controls for wind direction), temporal controls (year, month, weekday, and holiday fixed effects), and zip code
fixed effects. Regressions are weighted by the total population in a zip code. Errors are two-way clustered by zip
code and day. Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
46
Table 6: Sickness Rates of All Ages Regressed On Instrumented CO Pollution - Lagged Pollution
Acute All HeartAsthma Respiratory Respiratory Problems
Model 1Pollution in t-3 0.026 0.220 0.345 0.022
(0.096) (0.188) (0.250) (0.139)Pollution in t-2 0.130 0.109 0.023 0.003
(0.143) (0.246) (0.336) (0.255)Pollution in t-1 -0.017 -0.060 -0.001 -0.020
(0.132) (0.251) (0.289) (0.183)Pollution in t 0.200∗∗ 0.355 0.485 0.422∗∗∗
(0.101) (0.263) (0.331) (0.134)Cum. Effect 0.339∗∗∗ 0.624∗∗∗ 0.853∗∗∗ 0.427∗∗∗
(0.070) (0.163) (0.210) (0.151)
Model 2Pollution in t-3 0.040 0.229 0.353 0.022
(0.094) (0.188) (0.250) (0.138)Pollution in t-2 0.117 0.098 0.013 -0.002
(0.141) (0.245) (0.331) (0.250)Pollution in t-1 -0.021 -0.062 -0.004 -0.028
(0.133) (0.253) (0.291) (0.184)Pollution in t 0.203∗∗ 0.352 0.485 0.415∗∗∗
(0.099) (0.262) (0.331) (0.132)Cum. Effect 0.338∗∗∗ 0.618∗∗∗ 0.847∗∗∗ 0.408∗∗∗
(0.066) (0.163) (0.214) (0.143)
Model 3Pollution in t-3 -0.002 0.126 0.121 0.045
(0.041) (0.095) (0.124) (0.057)Pollution in t-2 0.079 0.023 0.020 -0.014
(0.060) (0.116) (0.151) (0.087)Pollution in t-1 -0.059 0.008 0.020 -0.004
(0.056) (0.154) (0.191) (0.111)Pollution in t 0.177∗∗∗ 0.316 0.420 0.225∗∗
(0.067) (0.201) (0.263) (0.100)Cum. Effect 0.195∗∗∗ 0.473∗∗∗ 0.582∗∗∗ 0.252∗∗∗
(0.052) (0.115) (0.153) (0.067)
Observations 179088 179088 179088 179088Zip Codes 164 164 164 164Days 1092 1092 1092 1092
Notes: Table replicates the results of CO pollution on sickness counts for all ages in Table 4 except that three lags of
the instrumented pollution levels are included. Each column in each panel presents the coefficients from one regression
as well as the cumulative effect (sum of all four coefficients). Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗
10%.
47
Table 7: Sickness Counts Regressed On Instrumented CO Pollution - Poisson Model
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: All AgesModel 1: CO 0.834∗∗∗ 0.596∗∗∗ 0.577∗∗∗ 0.488∗∗∗ 0.270 -0.114 0.325
(0.171) (0.109) (0.112) (0.128) (0.184) (0.184) (0.454)Model 2: CO 0.846∗∗∗ 0.589∗∗∗ 0.573∗∗∗ 0.482∗∗∗ 0.246 -0.116 0.245
(0.172) (0.111) (0.116) (0.128) (0.185) (0.188) (0.466)Model 3: CO 0.561∗∗∗ 0.399∗∗∗ 0.378∗∗∗ 0.292∗∗∗ 0.132 -0.150 0.163
(0.132) (0.090) (0.087) (0.095) (0.195) (0.133) (0.325)
Panel B: Ages Below 5Model 1: CO 1.202∗∗∗ 0.237 0.303 2.061∗ 3.334 0.187 -0.369
(0.387) (0.179) (0.208) (1.148) (2.876) (0.572) (2.923)Model 2: CO 1.202∗∗∗ 0.216 0.278 1.891∗ 3.347 0.233 -0.691
(0.396) (0.179) (0.207) (1.105) (2.799) (0.567) (2.963)Model 3: CO 1.133∗∗∗ 0.261∗∗ 0.256∗ 1.297 4.238∗ -0.064 -1.290
(0.287) (0.132) (0.143) (0.966) (2.480) (0.495) (2.643)
Panel C: Ages 65 and OlderModel 1: CO 1.287∗∗∗ 0.757∗∗∗ 0.610∗∗∗ 0.634∗∗∗ 0.397∗ 0.626∗∗ 1.190
(0.364) (0.208) (0.173) (0.165) (0.219) (0.314) (1.247)Model 2: CO 1.264∗∗∗ 0.743∗∗∗ 0.608∗∗∗ 0.630∗∗∗ 0.388∗ 0.589∗ 1.135
(0.341) (0.202) (0.174) (0.166) (0.224) (0.313) (1.291)Model 3: CO 0.804∗∗∗ 0.413∗∗ 0.397∗∗ 0.369∗∗∗ 0.159 0.292 -0.852
(0.275) (0.180) (0.154) (0.126) (0.223) (0.219) (1.185)
Notes: Table replicates the results for regression models of CO in Table 4 except that we use a Poisson count model
instead of a linear probability model. Further differences are that the regressions are unweighted and standard
errors are obtained from 100 clustered bootstrap draws (drawing entire zip code histories with replacement), which
is comparable to clustering by zip code in the baseline regression. Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%,∗ 10%.
48
Table 8: Impact of CO Pollution on Health (Model 3)
LAX SFO SAN OAK SJC SMF SNA ONT BUR SBA LGB PSP TotalPanel A: Linear Probability Model - All Ages
Population 812 182 540 448 910 41 822 454 794 59 875 93 6028One Standard Deviation Increase in Taxi Time
Asthma 1.20 0.16 0.29 0.15 0.21 0.01 0.26 0.09 0.12 0.00 0.10 0.01 2.60Acute Respiratory 2.44 0.34 0.59 0.30 0.44 0.02 0.53 0.19 0.25 0.01 0.20 0.02 5.31All Respiratory 3.18 0.44 0.77 0.39 0.57 0.03 0.68 0.25 0.32 0.01 0.25 0.03 6.91Heart Disease 1.40 0.19 0.34 0.17 0.25 0.01 0.30 0.11 0.14 0.00 0.11 0.01 3.04
One Standard Deviation Increase in PollutionAsthma 4.80 0.52 4.00 1.37 5.96 0.18 5.13 1.98 5.92 0.18 6.45 0.15 36.63Acute Respiratory 9.82 1.06 8.17 2.80 12.18 0.36 10.49 4.04 12.10 0.36 13.19 0.30 74.87All Respiratory 12.78 1.38 10.63 3.64 15.85 0.47 13.65 5.26 15.75 0.47 17.17 0.40 97.44Heart Disease 5.61 0.61 4.67 1.60 6.96 0.21 5.99 2.31 6.92 0.21 7.54 0.17 42.79
Panel B: Linear Probability Model - Ages 5 and Below
Population 54 11 33 32 68 4 58 35 55 3 65 6 424One Standard Deviation Increase in Taxi Time
Asthma 0.27 0.03 0.06 0.03 0.05 0.00 0.06 0.02 0.03 0.00 0.02 0.00 0.59Acute Respiratory 0.87 0.11 0.19 0.11 0.17 0.01 0.20 0.08 0.09 0.00 0.08 0.01 1.91All Respiratory 1.00 0.13 0.22 0.13 0.20 0.01 0.23 0.09 0.10 0.00 0.09 0.01 2.20Heart Disease 0.03 0.00 0.01 0.00 0.01 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.07
One Standard Deviation Increase in PollutionAsthma 1.14 0.11 0.84 0.33 1.54 0.06 1.24 0.53 1.42 0.03 1.68 0.03 8.96Acute Respiratory 3.69 0.37 2.73 1.08 4.98 0.19 4.03 1.73 4.59 0.09 5.43 0.10 28.99All Respiratory 4.25 0.42 3.14 1.24 5.74 0.22 4.63 1.99 5.28 0.11 6.25 0.11 33.37Heart Disease 0.13 0.01 0.09 0.04 0.17 0.01 0.14 0.06 0.16 0.00 0.19 0.00 1.00
Panel C: Linear Probability Model - Ages 65 and Above
Population 82 26 54 51 88 3 79 34 79 12 89 18 615One Standard Deviation Increase in Taxi Time
Asthma 0.30 0.06 0.07 0.04 0.05 0.00 0.06 0.02 0.03 0.00 0.02 0.01 0.67Acute Respiratory 0.43 0.08 0.10 0.06 0.07 0.00 0.09 0.03 0.04 0.00 0.03 0.01 0.94All Respiratory 0.87 0.17 0.21 0.12 0.15 0.00 0.18 0.05 0.09 0.01 0.07 0.02 1.93Heart Disease 1.19 0.23 0.29 0.16 0.20 0.01 0.24 0.07 0.12 0.01 0.10 0.02 2.63
One Standard Deviation Increase in PollutionAsthma 1.23 0.19 1.02 0.39 1.46 0.03 1.28 0.38 1.51 0.09 1.69 0.08 9.33Acute Respiratory 1.73 0.26 1.44 0.55 2.06 0.04 1.80 0.54 2.13 0.13 2.38 0.11 13.15All Respiratory 3.54 0.54 2.94 1.14 4.21 0.08 3.68 1.10 4.36 0.26 4.87 0.22 26.93Heart Disease 4.82 0.74 4.00 1.54 5.73 0.11 5.01 1.49 5.93 0.35 6.62 0.29 36.63
Panel D: Poisson Model - All Ages
One Standard Deviation Increase in Taxi TimeAsthma 1.41 0.21 0.35 0.24 0.16 0.01 0.16 0.09 0.11 0.00 0.11 0.01 2.87Acute Respiratory 2.62 0.40 0.61 0.42 0.34 0.02 0.43 0.20 0.26 0.00 0.23 0.03 5.55All Respiratory 3.44 0.53 0.81 0.54 0.45 0.02 0.55 0.26 0.34 0.01 0.31 0.04 7.30Heart Disease 1.59 0.28 0.40 0.23 0.22 0.01 0.26 0.11 0.16 0.00 0.14 0.02 3.42
One Standard Deviation Increase in PollutionAsthma 6.35 0.67 5.29 2.29 5.02 0.22 3.60 2.00 6.22 0.09 8.54 0.14 40.41Acute Respiratory 11.47 1.35 8.93 3.98 10.22 0.33 9.09 4.48 13.99 0.19 16.81 0.39 81.24All Respiratory 15.00 1.76 12.01 5.07 13.26 0.41 11.79 5.91 18.16 0.28 22.70 0.57 106.93Heart Disease 6.72 0.87 5.85 2.13 6.30 0.16 5.62 2.48 8.62 0.23 10.03 0.30 49.31
Panel E: Baseline Average - All Ages
Asthma 33.1 7.9 22.3 25.4 24.2 1.6 18.1 14.9 26.0 0.9 36.0 3.0 213.6Acute Respiratory 87.4 21.7 55.2 63.2 71.8 3.6 66.9 48.0 85.8 3.1 104.3 11.8 623.0All Respiratory 121.3 30.3 78.2 85.2 98.7 4.7 91.6 67.2 117.9 4.6 149.0 18.0 866.8Heart Disease 72.8 20.3 50.0 46.4 61.7 2.4 56.9 36.9 73.4 5.0 86.1 12.3 524.2
Notes: Table gives population as well as daily hospital admissions for all zip codes that are within 10km (6.2miles)
of one of the 12 major California airports. Panels A-D give predicted changes in sickness counts, while Panel E
gives baseline averages. Panels A-C use the linear probability model 1 for CO from Table 4, while panel D uses
the Poisson model 1 for CO from Table 7. Panel E gives average daily sickness counts in the data. The first 12
columns give impacts by airport, while the last column gives the total for all 12 airports. Population is in thousand.
Predicted changes in hospitalization are for both inpatient as well as outpatient admissions.
49
A1 Online Appendix
Figure A1: Location of Airports in Study
xxx
x
xx
x
x
xx
x
x
x
x x
Atlanta (ATL)
Chicago O‘Hare (ORD) New York (JFK)
Notes: Figure displays the location of the 12 airports in California as well as the three Eastern airports used in the
baseline regressions to instrument taxi time in California.
A1
Figure A2: Boxplots of Taxi Time By Hour and Airport
Panel A: Airports in California
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
100
200
300
400
500
600
700
800
900
1000
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Los Angeles (LAX)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
100
200
300
400
500
600
700
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
San Francisco (SFO)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
100
200
300
400
500
600
700
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
San Diego (SAN)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
50
100
150
200
250
300
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Oakland (OAK)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
100
200
300
400
500
600
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
San Jose (SJC)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
50
100
150
200
250
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Sacramento (SMF)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
100
200
300
400
500
600
700
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Santa Ana (SNA)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
50
100
150
200
250
300
350
400
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Ontario (ONT)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
50
100
150
200
250
300
350
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Burbank (BUR)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
10
20
30
40
50
60
70
80
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Santa Barbara (SBA)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
50
100
150
200
250
300
350
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Long Beach (LGB)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
20
40
60
80
100
120
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Palm Springs (PSP)
Panel B: Eastern Airports Used as Instruments
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
1000
2000
3000
4000
5000
6000
7000
8000
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Atlanta (ATL)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
1000
2000
3000
4000
5000
6000
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
Chicago O‘Hare (ORD)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 240
500
1000
1500
2000
2500
3000
Time of Day
Tot
al T
axi T
ime
of A
ll P
lane
s (M
inut
es)
John F. Kennedy New York (JFK)
Notes: Boxplots of taxi time by hour of day 2005-2007. The box spans the 25%-75% range, while the median is
shown as black solid line. Whiskers extend to the minimum and maximum.
A2
Figure A3: Seasonality of Pollution
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
020
040
060
080
010
0012
00A
vera
ge C
O P
ollu
tion
(ppb
)
1 31 59 90 120 151 181 212 243 273 304 334 365Day of Year (End of Month Indicated by Dotted Lines)
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
05
1015
2025
3035
40A
vera
ge N
O2
Pol
lutio
n (p
pb)
1 31 59 90 120 151 181 212 243 273 304 334 365Day of Year (End of Month Indicated by Dotted Lines)
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
05
1015
2025
3035
40A
vera
ge O
3 P
ollu
tion
(ppb
)
1 31 59 90 120 151 181 212 243 273 304 334 365Day of Year (End of Month Indicated by Dotted Lines)
Notes: Figure displays average pollution readings over 2005-2007 in our data for CO, NO2 and O3, respectively. CO
and NO2 are highest during the winter, while O3 is highest during summer.
A3
Figure A4: Histogram of Observed Average Daily CO Exposure
05.
0e−
04.0
01.0
015
Den
sity
0 500 1000 1500 2000 2500 3000Average Daily CO Pollution (ppb)
Notes: Histogram shows the observed average daily CO pollution readings in our data set. Red lines indicate the 2.5
and 97.5 percentiles.
A4
Figure A5: Nonlinearity response to CO Pollution
Panel A: All Ages−
.3−
.2−
.10
.1.2
.3.4
.5.6
.7M
argi
nal E
ffect
on
Ast
hma
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
−1
01
2M
argi
nal E
ffect
on
Acu
te R
espi
rato
ry
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
−1
01
2M
argi
nal E
ffect
on
All
Res
pira
tory
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
−.3
−.2
−.1
0.1
.2.3
.4.5
.6.7
.8M
argi
nal E
ffect
on
Hea
rt P
robl
ems
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
Panel B: Ages 5 and Below
−1
01
23
Mar
gina
l Effe
ct o
n A
sthm
a
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
−3
−2
−1
01
23
45
67
8M
argi
nal E
ffect
on
Acu
te R
espi
rato
ry
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
−3
−2
−1
01
23
45
67
89
Mar
gina
l Effe
ct o
n A
ll R
espi
rato
ry
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
−.1
0.1
.2.3
Mar
gina
l Effe
ct o
n H
eart
Pro
blem
s
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
Panel C: Ages 65 and Above
−1
01
2M
argi
nal E
ffect
on
Ast
hma
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
−1
01
23
Mar
gina
l Effe
ct o
n A
cute
Res
pira
tory
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
−2
−1
01
23
45
Mar
gina
l Effe
ct o
n A
ll R
espi
rato
ry
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
−3
−2
−1
01
23
45
67
Mar
gina
l Effe
ct o
n H
eart
Pro
blem
s
0 500 1000 1500Average Daily CO Pollution (ppb)
Polynomial of Order: 1 2
Notes: Figure displays results when higher order polynomials of CO pollution are included in the overidentified model 3. The dashed line shows the linear baseline
model (polynomial of order 1), while the solid line uses a second-order polynomial. The four columns represent the four sicknesses that are related to pollution
fluctuations (asthma, acute respiratory, all respiratory, and heart problems, respectively). The grey shaded area gives the 95% confidence band for the model
using a second-order polynomial. The confidence band sometimes extends beyond the range of graph (fixed at rounded ±2 times the baseline estimate), but is
not shown for expositional purposes. The x-range is set to to the 95% set of observed values in Figure A4.
A5
Table A1: Summary Statistics - Airports
Airports in Southern CaliforniaLAX SAN SNA ONT BUR SBA LGB PSP
Average Flight Time (min) 125.60 108.55 102.74 80.28 77.45 56.70 185.16 79.27[s.e.] [4.40] [4.15] [3.22] [4.83] [6.02] [2.50] [58.32] [12.41]
Average Flight Distance (miles) 974 815 748 562 520 303 1256 537[s.e.] [26] [26] [18] [36] [36] [11] [181] [109]
Arrival Delays (min) 6.48 6.27 4.81 6.79 7.49 4.27 3.93 8.22[s.e.] [7.09] [6.89] [5.85] [7.02] [7.77] [7.90] [11.10] [8.45]
Average Departure Delays (min) 7.77 6.64 6.12 6.89 7.78 4.89 4.70 6.49[s.e.] [5.19] [5.77] [5.59] [5.78] [7.42] [8.13] [8.49] [8.44]
Average Taxi Time after Landing (min) 8.09 3.73 6.26 4.37 2.78 4.12 4.87 4.34[s.e.] [1.06] [0.40] [0.82] [0.37] [0.46] [0.43] [0.91] [0.51]
Average Taxi Time to Takeoff (min) 15.00 13.50 13.28 10.63 11.61 9.75 14.34 10.68[s.e.] [1.47] [1.83] [1.32] [1.27] [1.21] [1.44] [1.94] [1.51]
Daily Number of Arrivals 641.60 255.02 139.76 104.07 86.09 37.80 35.00 34.21[s.e.] [31.58] [17.29] [13.58] [11.67] [8.63] [3.75] [3.91] [7.86]
Daily Number of Departures 641.33 255.13 139.77 104.02 86.07 37.84 34.99 34.18[s.e.] [32.59] [17.48] [12.53] [11.74] [8.50] [3.78] [3.89] [7.88]
Daily Taxi Time All Flights (min) 14691 4369 2712 1553 1231 519 673 515[s.e.] [1852] [666] [399] [266] [193] [83] [140] [151]
Northern California Eastern United StatesSFO OAK SJC SMF ATL ORD JFK
Average Flight Time (min) 135.10 104.54 95.11 94.22 88.00 101.95 164.18[s.e.] [5.08] [6.38] [3.74] [2.71] [11.14] [4.28] [11.56]
Average Flight Distance (miles) 1061 749 678 687 649 719 1212[s.e.] [40] [35] [21] [19] [14] [30] [80]
Arrival Delays (min) 11.40 5.68 5.84 7.78 10.73 14.71 15.25[s.e.] [14.44] [7.51] [6.79] [7.14] [16.62] [24.17] [19.29]
Average Departure Delays (min) 10.38 8.50 6.44 8.27 14.27 17.11 13.81[s.e.] [9.81] [6.33] [5.77] [6.08] [12.98] [16.72] [16.08]
Average Taxi Time after Landing (min) 5.64 5.37 4.06 4.31 9.80 8.58 9.98[s.e.] [0.49] [0.74] [0.31] [0.40] [1.55] [1.65] [2.66]
Average Taxi Time to Takeoff (min) 16.46 10.84 11.64 10.33 19.44 19.73 32.88[s.e.] [1.73] [1.15] [0.93] [0.86] [3.48] [4.74] [9.99]
Daily Number of Arrivals 364.58 201.03 167.78 148.47 1140.18 992.96 309.53[s.e.] [24.04] [14.24] [13.52] [13.74] [85.16] [75.09] [35.32]
Daily Number of Departures 364.66 201.01 167.73 148.43 1146.63 992.92 309.48[s.e.] [24.46] [14.09] [13.61] [13.74] [90.51] [75.93] [34.51]
Daily Taxi Time All Flights (min) 7979 3235 2614 2166 33081 27170 13059[s.e.] [1061] [409] [298] [324] [5743] [4735] [3804]
Notes: Table lists average flight characteristics by airport in 2005-2007. Airports are ordered by geographic area
and then by decreasing number of flights. The first six variables in each panel are characteristics per flight, while
the last three variables are average characteristics per day.
A6
Table A2: Summary Statistics - Pollution, Weather, and Population by Distance From Airport
Within [0,10]km of Airport Within [0,5]km of Airport Within (5,10]km of AirportMean (Std) Min Max Mean (Std) Min Max Mean (Std) Min Max(1a) (1b) (1c) (1d) (2a) (2b) (2c) (2d) (3a) (3b) (3c) (3d)
Panel A: Daily Pollution DataMean CO (ppb) 576 (368) 0 2994 549 (391) 0 2849 587 (357) 0 2994Max CO (ppb) 1234 (841) 0 7505 1165 (857) 0 7505 1263 (833) 0 5791Mean NO2 (ppb) 20.5 (9.7) 0.7 66.4 19.8 (10.1) 0.7 65.0 20.7 (9.5) 1.0 66.4Max NO2 (ppb) 35.5 (13.5) 2.0 136.0 34.6 (14.1) 2.0 125.9 35.9 (13.2) 2.0 136.0Mean O3 (ppb) 22.8 (10.4) 1.1 90.0 23.4 (11.6) 1.1 90.0 22.6 (9.8) 1.1 65.7Max O3 (ppb) 43.8 (16.3) 2.6 166.0 44.0 (17.1) 2.8 166.0 43.7 (16.0) 2.6 166.0
Panel B: Daily WeatherMin Temp (◦ C) 11.8 (4.3) -4.6 32.1 12.0 (4.5) -4.2 32.1 11.7 (4.2) -4.6 27.7Max Temp (◦ C) 22.8 (5.6) 5.8 49.4 23.0 (5.9) 8.0 49.4 22.7 (5.4) 5.8 46.6Precipitation (mm) 1.05 (4.46) 0.00 101.71 1.00 (4.31) 0.00 87.14 1.08 (4.52) 0.00 101.71Wind Speed (m/s) 2.59 (1.33) 0.00 12.73 2.55 (1.34) 0.00 12.73 2.61 (1.32) 0.00 12.73
Panel C: Average PopulationPopulation (1000) 36.8 (17.8) 11.1 101.1 32.4 (12.6) 11.1 60.7 38.6 (19.3) 11.4 101.1Population Age [0,5) 2.6 (1.7) 0.4 8.7 2.2 (1.3) 0.4 5.6 2.8 (1.9) 0.5 8.7Population Age [5,20) 7.5 (5.0) 0.8 27.4 6.5 (3.6) 0.8 15.9 7.9 (5.5) 0.9 27.4Population Age [20,65) 22.9 (10.6) 7.1 57.7 20.2 (7.8) 7.2 36.8 24.1 (11.4) 7.1 57.7Population Age [65,∞) 3.7 (1.7) 0.3 9.1 3.4 (1.5) 0.3 6.6 3.9 (1.8) 0.5 9.1
Notes: Table lists summary statistics (mean, standard deviation, minimum, and maximum) of variables in the data set. The first four columns (1a)-(1d) use all
zip codes in our baseline regression, while columns (2a)-(2d) only use zip codes within 5km of an airport, and columns (3a)-(3d) use zip codes 5-10km from an
airport.
A7
Table A3: Summary Statistics - Sickness Rates by Distance From Airport
Within [0,10]km of Airport Within [0,5]km of Airport Within (5,10]km of AirportMean (Std) Min Max Mean (Std) Min Max Mean (Std) Min Max(1a) (1b) (1c) (1d) (2a) (2b) (2c) (2d) (3a) (3b) (3c) (3d)
Panel A1: Outpatient Sickness Rates - All Ages
Asthma 184 (268) 0 4162 170 (265) 0 3185 189 (270) 0 4162Acute Respiratory 608 (568) 0 6243 594 (586) 0 5843 614 (561) 0 6243All Respiratory 756 (653) 0 7012 748 (680) 0 7012 760 (641) 0 6243Heart Disease 168 (254) 0 3396 172 (267) 0 3396 166 (247) 0 2775Stroke 23 (90) 0 1456 22 (92) 0 1454 23 (89) 0 1456Bone Fracture 208 (273) 0 2909 208 (282) 0 2909 208 (269) 0 2775Appendicitis 2 (26) 0 903 2 (27) 0 903 2 (26) 0 875
Panel A2: Inpatient Sickness Rates - All Ages
Asthma 155 (237) 0 2775 153 (241) 0 2547 156 (235) 0 2775Acute Respiratory 373 (379) 0 4377 372 (387) 0 3396 373 (376) 0 4377All Respiratory 626 (522) 0 5253 635 (538) 0 5224 622 (514) 0 5253Heart Disease 728 (589) 0 7879 747 (612) 0 5214 720 (579) 0 7879Stroke 149 (235) 0 4183 151 (242) 0 2709 148 (231) 0 4183Bone Fracture 92 (181) 0 2510 95 (189) 0 1829 90 (177) 0 2510Appendicitis 32 (103) 0 1806 32 (105) 0 1806 32 (101) 0 1751
Panel B1: Outpatient Sickness Rates - Ages Below 5
Asthma 413 (1503) 0 33178 383 (1575) 0 33036 425 (1471) 0 33178Acute Respiratory 2739 (4262) 0 90992 2777 (4621) 0 90992 2724 (4104) 0 66357All Respiratory 3084 (4567) 0 90992 3113 (4930) 0 90992 3072 (4407) 0 66357Heart Disease 12 (263) 0 21358 11 (255) 0 15165 12 (266) 0 21358Stroke 1 (63) 0 10860 1 (74) 0 7651 1 (57) 0 10860Bone Fracture 165 (963) 0 33036 160 (1031) 0 33036 166 (933) 0 27785Appendicitis 2 (81) 0 13148 2 (75) 0 7651 2 (84) 0 13148
Panel B2: Inpatient Sickness Rates - Ages Below 5
Asthma 147 (883) 0 23697 138 (922) 0 23697 150 (866) 0 21358Acute Respiratory 404 (1485) 0 25562 403 (1572) 0 24155 405 (1447) 0 25562All Respiratory 483 (1635) 0 33036 483 (1734) 0 33036 483 (1592) 0 26810Heart Disease 55 (568) 0 22701 54 (592) 0 22701 56 (557) 0 21358Stroke 6 (186) 0 21834 7 (225) 0 21834 5 (168) 0 16589Bone Fracture 31 (415) 0 23697 29 (420) 0 23697 31 (412) 0 21358Appendicitis 7 (177) 0 15684 7 (181) 0 12077 7 (175) 0 15684
Panel C1: Outpatient Sickness Rates - Ages 65 and Above
Asthma 142 (696) 0 20730 127 (675) 0 12358 148 (705) 0 20730Acute Respiratory 349 (1109) 0 38168 332 (1117) 0 38168 356 (1106) 0 20730All Respiratory 752 (1636) 0 41459 736 (1655) 0 38168 759 (1628) 0 41459Heart Disease 910 (1803) 0 41459 889 (1816) 0 38168 919 (1797) 0 41459Stroke 136 (684) 0 20730 129 (686) 0 15108 138 (683) 0 20730Bone Fracture 289 (1004) 0 38168 284 (1053) 0 38168 291 (984) 0 20730Appendicitis 1 (46) 0 9705 1 (45) 0 4950 1 (46) 0 9705
Panel C2: Inpatient Sickness Rates - Ages 65 and Above
Asthma 488 (1342) 0 41459 496 (1426) 0 38168 485 (1305) 0 41459Acute Respiratory 1579 (2406) 0 41459 1582 (2530) 0 38168 1578 (2353) 0 41459All Respiratory 3143 (3472) 0 76336 3170 (3654) 0 76336 3132 (3395) 0 62189Heart Disease 4696 (4257) 0 76336 4746 (4502) 0 76336 4675 (4152) 0 41543Stroke 1018 (1895) 0 41459 1013 (1973) 0 38168 1020 (1861) 0 41459Bone Fracture 392 (1151) 0 38168 402 (1183) 0 38168 387 (1138) 0 29721Appendicitis 22 (283) 0 38168 23 (323) 0 38168 21 (265) 0 20730
Notes: Table lists summary statistics (mean, standard deviation, minimum, and maximum) of variables in the
data set. Admissions are counted if either the primary or one of the 24 other diagnosis codes include the ICD-9
classification for an illness. Sickness rates are measured in cases per 10 million people. The first four columns
(1a)-(1d) use all zip codes in our baseline regression, while columns (2a)-(2d) only use zip codes within 5km of an
airport, and columns (3a)-(3d) use zip codes 5-10km from an airport.
A8
Table A4: Summary Statistics - Variables in 2000 Census by Distance From Airport
Zip Codes in Regression Zip Codes Within 20km of Airports Zip Codes in CaliforniaMean Mean Equality Mean Mean Equality Mean Mean Equality[0,5]km (5,10]km p-value [0,10]km (10,20]km p-value [0,10]km (10,oo)km p-value(1a) (1b) (1c) (2a) (2b) (2c) (3a) (3b) (3c)
Population (1000) 32.7 38.2 .0802∗ 35.1 32.4 .155 35.1 18.7 3.09e-23∗∗∗
Percent Age [0,4] (%) 6.8 7.0 .572 6.8 6.5 .171 6.8 6.2 .0137∗∗
Percent Age [65,oo) (%) 11.3 10.2 .221 11.0 11.9 .118 11.0 12.9 .00337∗∗∗
Percent White (%) 58.8 54.3 .226 56.4 57.8 .476 56.4 71.3 2.74e-18∗∗∗
Percent Black (%) 7.2 10.5 .215 9.6 7.4 .0718∗ 9.6 3.6 2.79e-19∗∗∗
Percent Asian (%) 11.9 13.5 .439 12.9 14.4 .219 12.9 6.3 1.06e-15∗∗∗
Percent Urban (%) 99.5 99.2 .758 98.8 98.2 .533 98.8 61.6 8.75e-27∗∗∗
Percent in Poverty (%) 13.2 13.4 .916 13.0 14.5 .121 13.0 14.7 .0421∗∗
Income per Capita (1000) 25.2 25.6 .838 26.7 28.2 .317 26.7 22.9 .000355∗∗∗
Households (1000) 11.4 13.3 .0468∗∗ 12.2 11.1 .0565∗ 12.2 6.3 2.10e-28∗∗∗
On Public Assist. (%) 4.0 5.0 .176 4.5 5.0 .296 4.5 4.8 .454Housing Structures (1000) 12.1 13.8 .0891∗ 12.8 11.6 .0556∗ 12.8 6.7 6.58e-28∗∗∗
Percent Vacant (%) 5.5 4.1 .0849∗ 4.7 5.4 .251 4.7 13.1 1.06e-10∗∗∗
Median Rent (Dollars) 826 813 .774 824 810 .614 824 629 1.11e-15∗∗∗
Median Value (1000) 267 274 .756 292 311 .289 292 214 8.08e-09∗∗∗
Zip Codes 46 114 170 302 170 1491
Notes: Table presents three sets of comparisons of means. Each set of columns presents variables means in columns (a) and (b) and the p-value from a t-test for
equal sample means in column (c). Columns (1a)-(1c) compares zip codes in our baseline regression sample that are closest to airports (within [0,5]km) versus
further away (within (5,10]km). Columns (2a)-(2c) compare zip codes close to airports (within [0,10]km) to neighboring zip codes (within (10,20]km). Columns
(3a)-(3c) compare zip codes close to airports (within [0,10]km) to the rest of California. Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%. Our regression
tables have 164 zip codes while the columns (1a)-(1c) have 160 because 4 zip codes show up in the 2010 Census but not in the 2000 Census. The number of
zip codes in column (2a) is larger than 160 because columns (1a-1c) restrict zip codes to the ones that have at least 10000 inhabitants and for which we have
pollution data. Source: Decennial Census 2000.
A9
Table A5: Taxi Time at California Airports on East Coast Taxi Time
Airport LAX SFO SAN OAK SJC SMF SNA ONT BUR SBA LGB PSP
Taxi Time at ATL 71.87∗∗∗ 36.82∗∗∗ 18.64∗∗∗ 7.10∗∗∗ 8.86∗∗∗ -5.04∗∗∗ 7.87∗∗∗ -9.11∗∗∗ -6.24∗∗∗ -14.61∗∗∗ -16.60∗∗∗ -23.73∗∗∗
(8.72) (4.11) (3.67) (1.85) (1.96) (1.64) (2.22) (1.83) (1.86) (1.70) (1.92) (2.13)Taxi Time at ORD 15.86∗∗ 16.78∗∗∗ -2.54 1.27 2.18 -5.72∗∗∗ 7.37∗∗∗ -6.40∗∗∗ -4.23∗∗ -10.04∗∗∗ -10.01∗∗∗ -11.78∗∗∗
(7.71) (4.54) (3.77) (1.78) (1.84) (1.48) (2.41) (1.79) (1.77) (1.71) (2.02) (1.85)Taxi Time at JFK 181.48∗∗∗ 53.25∗∗∗ -5.43 -12.66∗∗ -43.24∗∗∗ -5.00 -25.11∗∗∗ -22.62∗∗∗ -30.58∗∗∗ -41.19∗∗∗ -28.08∗∗∗ -45.04∗∗∗
(12.89) (8.29) (6.90) (5.66) (5.71) (5.59) (6.05) (5.70) (5.57) (5.62) (5.86) (5.93)
Combined: F-stat 128.76 59.96 8.84 8.07 36.33 7.14 17.35 12.40 11.56 37.13 32.50 52.35(p-val.) (8.6e-43) (3.6e-26) (1.8e-05) (4.8e-05) (4.9e-18) (1.5e-04) (8.0e-10) (2.4e-07) (6.6e-07) (2.4e-18) (1.6e-16) (9.5e-24)
Notes: Table presents first stage of regression in column (1a) or (2a) of Table 1. All coefficients are from one single joint regression but are shown as a matrix
for easier display. Table regresses daily taxi time at each of the 12 California airports (min) on taxi time at three Eastern airports (1000 min) in 2005-2007.
Airports are ordered by the average number of daily flights from largest to smallest. All regressions include weather controls (quadratic in minimum and maximum
temperature, precipitation and wind speed as well as controls for wind direction), temporal controls (year, month, weekday, and holiday fixed effects), and zip
code fixed effects. Regressions are weighted by the total population in a zip code. Errors are two-way clustered by zip code and day. Significance levels are
indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A10
Table A6: Pollution Regressed On Uninstrumented Taxi Time
CO Pollution NO2 Pollution O3 PollutionVariable (1a) (1b) (1c) (2a) (2b) (2c) (3a) (3b) (3c)
Taxi Time 20.17∗∗∗ 25.76∗∗∗ 22.67∗∗∗ 0.27∗∗∗ 0.27∗∗ 0.31 -0.05 0.00 0.05(3.07) (6.63) (8.29) (0.07) (0.12) (0.20) (0.06) (0.08) (0.16)
Taxi x Distance -0.80 -0.92 -0.00 -0.00 -0.01 -0.02(0.86) (1.07) (0.01) (0.03) (0.01) (0.02)
Taxi x Angleu 13.94∗ 0.32 -0.51∗∗∗
(7.39) (0.22) (0.17)Taxi x Angled 4.89 0.05 0.06
(6.80) (0.17) (0.11)Taxi x Speed -2.13 -0.08∗∗ 0.04
(1.84) (0.04) (0.04)Taxi x Distance x Angleu -0.65 -0.02 0.05∗∗
(1.04) (0.03) (0.02)Taxi x Distance x Angled 0.22 -0.01 -0.01
(0.86) (0.02) (0.02)Taxi x Distance x Speed 0.58∗∗ 0.01∗∗ -0.00
(0.24) (0.01) (0.01)Taxi x Angled x Speed 2.27 0.10∗∗ -0.07
(2.71) (0.05) (0.06)Taxi x Angleu x Speed -10.32∗∗∗ -0.20∗∗ 0.26∗∗∗
(3.58) (0.09) (0.09)Taxi x Dist. x Angleu x Speed 1.54∗∗∗ 0.03∗∗ -0.03∗∗
(0.48) (0.01) (0.01)Taxi x Dist. x Angled x Speed -0.69∗ -0.01∗ 0.01
(0.36) (0.01) (0.01)
Observations 179580 179580 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095 1095 1095F-stat(joint sig.) 42.80 22.57 5.97 16.93 8.47 4.32 0.51 0.97 2.26p-value (joint sig.) 7.49e-10 2.23e-09 6.38e-12 .0000615 .0003165 3.57e-08 .4755 .3823 .002314
Notes: Table regresses zip-code level pollution on congestion (total taxi time in 1000min) at the airport in 2005-2007. Table is analogous to Table 1 except
that taxi time at California airports is not instrumented with taxi time outside California. All regressions include weather controls (quadratic in minimum and
maximum temperature, precipitation and wind speed as well as controls for wind direction), temporal controls (year, month, weekday, and holiday fixed effects),
and zip code fixed effects. Regressions are weighted by the total population in a zip code. Errors are two-way clustered by zip code and day. Significance levels
are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A11
Table A7: Pollution Regressed On Instrumented Taxi Time Using Different Airports Outside California
Variable (1a) (1b) (1c) (1d) (2a) (2b) (2c) (2d)
Panel A1 : CO PollutionTaxi Time 35.47∗∗∗ 33.96∗∗∗ 44.55∗∗∗ 45.59∗∗∗ 48.03∗∗∗ 45.38∗∗∗ 56.18∗∗∗ 58.09∗∗∗
(s.e.) (5.87) (5.62) (5.09) (5.14) (10.05) (9.49) (9.52) (9.27)[s.e.] [5.97] [5.72] [5.04] [5.09] [10.12] [9.55] [9.48] [9.22]
Taxi Time x Distance -1.77 -1.61 -1.64 -1.76(s.e.) (1.13) (1.07) (1.22) (1.19)[s.e.] [1.13] [1.07] [1.22] [1.18]
F-stat (joint sig.) 36.34 36.21 76.08 78.17 19.01 19.06 40.96 42.74p-val. (joint sig.) 1.1e-08 1.1e-08 3.0e-15 1.5e-15 3.8e-08 3.7e-08 3.9e-15 1.2e-15F-stat (1st stage) 49.46 35.32 49.92 54.13
Panel A2: NO2 PollutionTaxi Time 0.291∗∗ 0.290∗∗ 0.566∗∗∗ 0.582∗∗∗ 0.405∗∗ 0.418∗∗ 0.666∗∗∗ 0.689∗∗∗
(s.e.) (0.119) (0.118) (0.090) (0.090) (0.206) (0.201) (0.149) (0.150)[s.e.] [0.120] [0.119] [0.090] [0.090] [0.206] [0.202] [0.150] [0.150]
Taxi Time x Distance -0.016 -0.018 -0.014 -0.015(s.e.) (0.022) (0.022) (0.016) (0.016)[s.e.] [0.023] [0.022] [0.016] [0.016]
F-stat (joint sig.) 5.91 5.98 39.55 41.74 3.07 3.19 19.67 20.78p-val. (joint sig.) .01612 .01554 2.8e-09 1.2e-09 .04916 .04359 2.2e-08 9.1e-09F-stat (1st stage) 49.46 35.32 49.92 54.13
Busiest Airports 1 2 3 4 1 2 3 4Observations 179580 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095 1095
Notes: Table regresses zip-code level pollution measures on congestion (total taxi time in 1000min) at the airport in 2005-2007. Taxi time at the local airport
is instrumented on the taxi time at airports in the Eastern United States. Columns (a), (b), (c), and (d) consecutively add additional airports that are used as
instruments, respectively, ATL, ORD, JFK, DFW. Standard errors in () are obtained from joint estimation of the IV regression. Standard errors in [] are obtained
from manually estimating the first stage and using the predicted values in the second stage. All regressions include weather controls (quadratic in minimum and
maximum temperature, precipitation and wind speed as well as controls for wind direction), temporal controls (year, month, weekday, and holiday fixed effects),
and zip code fixed effects. Regressions are weighted by the total population in a zip code. Errors are two-way clustered by zip code and day. Significance levels
are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A12
Table A8: Taxi Time Regressed on Weather at Airport
Taxi Time Taxi Time Taxi Time Taxi Time Taxi Timeat LAX at SFO at ATL at ORD at JFK
Weather at LAX [1.3e-33]∗∗∗ [0.011]∗∗ [0.321] [0.594] [0.484]Weather at SFO [0.272] [7.2e-21]∗∗∗ [0.357] [0.113] [0.730]Weather at ATL [3.1e-04]∗∗∗ [7.1e-05]∗∗∗ [2.0e-09]∗∗∗ [0.002]∗∗∗ [0.338]Weather at ORD [0.538] [3.8e-04]∗∗∗ [7.9e-06]∗∗∗ [2.0e-25]∗∗∗ [0.275]Weather at JFK [0.123] [0.013]∗∗ [0.048]∗∗ [0.709] [5.5e-09]∗∗∗
Notes: Table gives p-values of the joint significance of the eight weather variables (a quadratic in minimum and
maximum temperature, precipitation and wind speed) used to explain taxi time at an airport. Each entry in the
Table is from a separate regression. The taxi time is from the airport given in the column heading while the weather
variables are from the airport given in the row heading. Regressions that include weather from another airport also
control for local weather measures (not included in joint p-value). P-values are obtained using robust standard errors.
All regressions include temporal controls (year, month, weekday, and holiday fixed effects). Significance levels are
indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A13
Table A9: Sickness Rates Regressed On Instrumented Pollution - Ages 5-64
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: Ages 5 - 19Model 1: CO -0.020 0.030 0.107 0.013 -0.000 -0.127 -0.002
(0.114) (0.289) (0.304) (0.033) (0.012) (0.143) (0.031)Model 2: CO -0.018 0.052 0.134 0.012 0.002 -0.091 -0.004
(0.105) (0.266) (0.282) (0.031) (0.011) (0.145) (0.030)Model 3: CO -0.010 -0.012 -0.032 0.001 -0.004 -0.066 0.019
(0.090) (0.199) (0.221) (0.028) (0.011) (0.083) (0.023)Model 1: NO2 -1.4 2.1 7.7 0.9 -0.0 -9.1 -0.2
(8.2) (20.9) (22.4) (2.4) (0.9) (11.0) (2.2)Model 2: NO2 -1.3 3.4 9.2 0.9 0.1 -7.2 -0.3
(7.7) (19.8) (21.3) (2.3) (0.8) (11.1) (2.2)Model 3: NO2 -5.1 -8.6 -10.6 -0.6 -0.8 -2.4 0.8
(7.7) (15.8) (17.1) (1.9) (0.8) (6.4) (1.7)
Panel B: Ages 20 - 64Model 1: CO 0.264∗∗∗ 0.279∗∗ 0.341∗∗ 0.069 0.001 -0.085∗ 0.007
(0.073) (0.120) (0.166) (0.088) (0.029) (0.049) (0.021)Model 2: CO 0.262∗∗∗ 0.275∗∗ 0.337∗∗ 0.062 -0.004 -0.086∗ 0.005
(0.070) (0.117) (0.164) (0.086) (0.028) (0.049) (0.020)Model 3: CO 0.127∗∗ 0.158∗ 0.181∗ 0.052 0.007 -0.065∗ 0.004
(0.050) (0.081) (0.109) (0.057) (0.023) (0.033) (0.012)Model 1: NO2 21.5∗∗∗ 22.8∗∗ 27.8∗∗ 5.7 0.1 -6.9 0.6
(6.1) (9.7) (13.1) (6.9) (2.3) (4.5) (1.7)Model 2: NO2 21.5∗∗∗ 22.6∗∗ 27.6∗∗ 5.3 -0.2 -7.0 0.5
(6.1) (9.5) (13.0) (6.8) (2.3) (4.6) (1.7)Model 3: NO2 10.2∗∗ 10.0 9.3 5.1 1.1 -3.4 0.4
(4.0) (7.0) (9.4) (4.8) (1.6) (2.3) (1.0)
Observations 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095
Notes: Table replicates Table 4 for the two remaining age groups: 5-19 and 20-64. Table regresses zip-code level
sickness rates (counts for primary and secondary diagnosis codes per 10 million people) on daily instrumented
pollution levels (ppb) in 2005-2007. Each entry is a separate regression. Pollution is instrumented on airport
congestion (taxi time) that is caused by network delays (taxi time at three major airports in the Eastern United
States). Model 1 assumes a uniform impact of congestion on pollution levels at all zip codes surrounding an airport,
while model 2 adds an interaction with the distance to the airport, and model 3 furthermore adds interactions
with wind direction and speed (columns (a)-(c) in Table 1). All regressions include weather controls (quadratic in
minimum and maximum temperature, precipitation and wind speed as well as controls for wind direction), temporal
controls (year, month, weekday, and holiday fixed effects), and zip code fixed effects. Regressions are weighted by the
total population in a zip code. Errors are two-way clustered by zip code and day. Significance levels are indicated
by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A14
Table A10: Illness Regressed On Instrumented Pollution - Hospital and Residence Zip Codes
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: Baseline - Residence within 10km of AirportModel 1: CO 0.311∗∗∗ 0.556∗∗∗ 0.761∗∗∗ 0.434∗∗∗ 0.057 -0.030 0.006
(0.065) (0.162) (0.207) (0.134) (0.039) (0.063) (0.015)Model 2: CO 0.307∗∗∗ 0.550∗∗∗ 0.755∗∗∗ 0.419∗∗∗ 0.050 -0.030 0.003
(0.062) (0.163) (0.210) (0.128) (0.038) (0.064) (0.015)Model 3: CO 0.194∗∗∗ 0.396∗∗∗ 0.515∗∗∗ 0.226∗∗∗ 0.020 -0.039 0.002
(0.047) (0.125) (0.165) (0.079) (0.030) (0.040) (0.011)Panel B1: Both Hospital and Residence within 10km of Airport
Model 1: CO 0.165∗∗∗ 0.521∗∗∗ 0.728∗∗∗ 0.147∗ -0.019 -0.035 -0.015∗∗
(0.046) (0.156) (0.212) (0.080) (0.024) (0.026) (0.007)Model 2: CO 0.165∗∗∗ 0.498∗∗∗ 0.698∗∗∗ 0.142∗ -0.023 -0.038 -0.016∗∗
(0.044) (0.158) (0.216) (0.077) (0.024) (0.027) (0.007)Model 3: CO 0.148∗∗∗ 0.463∗∗∗ 0.628∗∗∗ 0.114∗∗ -0.024 -0.033∗ -0.006
(0.036) (0.117) (0.160) (0.049) (0.019) (0.018) (0.006)Panel B2: Residence Within 10km of Airport / Hospital Outside 10km of Any Airport
Model 1: CO 0.131∗∗∗ 0.027 0.030 0.287∗∗∗ 0.077∗∗ -0.004 0.018(0.047) (0.098) (0.134) (0.096) (0.031) (0.057) (0.012)
Model 2: CO 0.127∗∗∗ 0.045 0.055 0.277∗∗∗ 0.074∗∗ -0.001 0.016(0.045) (0.099) (0.133) (0.093) (0.029) (0.058) (0.011)
Model 3: CO 0.037 -0.070 -0.107 0.113∗∗ 0.045∗∗ -0.014 0.009(0.034) (0.081) (0.107) (0.057) (0.023) (0.037) (0.008)
Panel B3: Residence Within 10km of Airport / Hospital Within 10km of Another AirportModel 1: CO 0.014 0.006 -0.000 -0.000 -0.000 0.009 0.003
(0.010) (0.018) (0.022) (0.012) (0.004) (0.007) (0.002)Model 2: CO 0.014 0.005 -0.001 -0.001 -0.001 0.009 0.003
(0.009) (0.017) (0.022) (0.011) (0.004) (0.007) (0.002)Model 3: CO 0.009 0.003 -0.006 -0.001 -0.001 0.007∗ -0.001
(0.006) (0.011) (0.013) (0.008) (0.003) (0.004) (0.002)Panel C1: Hospital Within 10km of Airport / Residence Outside 10km of Any Airport
Model 1: CO -0.035 0.078 0.157 -0.208∗∗ 0.005 -0.014 -0.013(0.082) (0.354) (0.452) (0.101) (0.036) (0.063) (0.008)
Model 2: CO -0.032 0.083 0.163 -0.192∗ 0.004 -0.024 -0.014∗
(0.086) (0.382) (0.486) (0.105) (0.034) (0.060) (0.008)Model 3: CO 0.001 0.120 0.214 -0.098∗ -0.003 -0.016 -0.003
(0.057) (0.284) (0.369) (0.056) (0.023) (0.034) (0.005)Panel C2: Hospital Within 10km of Airport / Residence Within 10km of Another Airport
Model 1: CO 0.003 0.015 0.017 0.009 0.001 -0.006 -0.001(0.006) (0.012) (0.013) (0.012) (0.003) (0.007) (0.001)
Model 2: CO 0.003 0.016 0.017 0.009 0.000 -0.006 -0.001(0.006) (0.012) (0.012) (0.012) (0.003) (0.006) (0.001)
Model 3: CO 0.003 0.011 0.013 0.007 0.002 -0.005 -0.001(0.004) (0.010) (0.011) (0.007) (0.002) (0.004) (0.001)
Notes: Table replicates the results for CO in Table 4 but further distinguishes between zip codes of the residence
and hospital. Each regression is run on a sample of 179580 observations, consisting of 164 unique zip codes and 1095
days. Panel A replicates the baseline where patients are assigned the zip code of their residence. Panels B1-B3 still
assign patients to zip codes based on their residence, but split the data: Panel B1 only includes patients where the
hospital was within the same 10km around an airport as the residence. Panel B2 only includes patients that went
to a hospital that was outside of all 10km circles around the 12 airports in the study. Finally, panel B3 includes
patients that went to a hospital that was within 10km of the other 11 airports. Panels C1-C2 assign patients based
on the zip code of the hospital. Panel C1 looks at patients that went to a hospital that was within 10km of an
airport but whose residence was outside all 10km circles around the 12 airports in the study. Panel C2 looks at
patients that went to a hospital that was within 10km of an airport but whose residence was within 10km of one of
the other 11 airports in the study. Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A15
Table A11: Sickness Rates Regressed On Instrumented Pollution - LIML
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: All AgesModel 2: CO 0.307∗∗∗ 0.550∗∗∗ 0.756∗∗∗ 0.420∗∗∗ 0.051 -0.030 0.003
(0.062) (0.163) (0.210) (0.128) (0.038) (0.064) (0.015)Model 3: CO 0.197∗∗∗ 0.404∗∗∗ 0.529∗∗∗ 0.229∗∗∗ 0.020 -0.039 0.002
(0.048) (0.128) (0.171) (0.081) (0.030) (0.041) (0.011)Model 2: NO2 24.3∗∗∗ 43.6∗∗∗ 59.8∗∗∗ 33.8∗∗∗ 4.2 -2.4 0.3
(6.1) (16.3) (20.8) (10.5) (3.1) (5.2) (1.2)Model 3: NO2 13.3∗∗∗ 21.3∗ 28.0 17.9∗∗ 0.8 -1.0 0.3
(4.4) (12.8) (17.0) (7.5) (2.2) (3.0) (0.9)
Panel B: Ages Below 5Model 2: CO 0.579∗∗ 1.930∗ 2.625∗ 0.127 0.020 0.064 -0.013
(0.235) (1.111) (1.357) (0.078) (0.023) (0.132) (0.034)Model 3: CO 0.674∗∗∗ 2.196∗∗∗ 2.529∗∗ 0.075 0.023 -0.012 -0.009
(0.172) (0.809) (0.995) (0.058) (0.015) (0.123) (0.022)Model 2: NO2 43.2∗∗ 144.1 196.1∗ 9.6 1.5 4.8 -0.9
(20.2) (94.3) (117.8) (6.3) (1.7) (9.7) (2.5)Model 3: NO2 45.1∗∗∗ 130.7∗ 151.3∗ 4.6 2.9∗∗ 3.8 0.6
(15.6) (73.8) (90.3) (4.8) (1.3) (9.5) (2.1)
Panel C: Ages 65 and OlderModel 2: CO 0.817∗∗∗ 1.416∗∗∗ 2.276∗∗∗ 3.533∗∗∗ 0.502∗ 0.409∗ 0.016
(0.288) (0.423) (0.638) (0.972) (0.302) (0.242) (0.028)Model 3: CO 0.497∗∗ 0.703∗∗ 1.432∗∗∗ 1.966∗∗∗ 0.199 0.189 -0.025
(0.206) (0.313) (0.514) (0.631) (0.251) (0.162) (0.026)Model 2: NO2 66.8∗∗∗ 115.6∗∗∗ 181.4∗∗∗ 282.5∗∗∗ 41.3∗ 34.6∗ 1.4
(22.9) (34.5) (52.3) (76.3) (24.3) (18.3) (2.2)Model 3: NO2 36.2∗∗ 40.5 77.9∗ 138.6∗∗∗ 3.7 12.4 -0.8
(14.6) (25.6) (42.7) (50.7) (16.8) (12.2) (1.7)
Observations 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095
Notes: Table replicates models 2 and 3 of Table 4 except that the IV regression is done using limited information
maximum likelihood instead of 2-stage least squares. Model 1 is dropped as it is exactly identified, in which case
LIML is identical to two-stage least squares. Table regresses zip-code level sickness rates (counts for primary and
secondary diagnosis codes per 10 million people) on daily instrumented pollution levels (ppb) in 2005-2007. Each
entry is a separate regression. Pollution is instrumented on airport congestion (taxi time) that is caused by network
delays (taxi time at three major airports in the Eastern United States). Model 1 assumes a uniform impact of
congestion on pollution levels at all zip codes surrounding an airport, while model 2 adds an interaction with the
distance to the airport, and model 3 furthermore adds interactions with wind direction and speed (columns (a)-(c) in
Table 1). All regressions include weather controls (quadratic in minimum and maximum temperature, precipitation
and wind speed as well as controls for wind direction), temporal controls (year, month, weekday, and holiday fixed
effects), and zip code fixed effects. Regressions are weighted by the total population in a zip code. Errors are
two-way clustered by zip code and day. Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A16
Table A12: Sickness Rates (Primary Diagnosis Code) Regressed On Instrumented Pollution
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: All AgesModel 1: CO 0.039 0.245∗ 0.433∗∗ 0.078 0.006 -0.070 0.007
(0.039) (0.135) (0.185) (0.052) (0.023) (0.058) (0.015)Model 2: CO 0.046 0.254∗ 0.445∗∗ 0.075 0.002 -0.069 0.005
(0.037) (0.133) (0.184) (0.054) (0.022) (0.059) (0.015)Model 3: CO 0.041 0.197∗ 0.327∗∗ 0.047 0.009 -0.059 0.003
(0.030) (0.105) (0.142) (0.033) (0.017) (0.036) (0.010)Model 1: NO2 3.1 19.3 34.1∗∗ 6.1 0.5 -5.5 0.6
(3.2) (11.9) (17.0) (4.2) (1.8) (5.0) (1.2)Model 2: NO2 3.5 19.8∗ 34.8∗∗ 6.0 0.3 -5.5 0.4
(3.2) (11.8) (17.0) (4.3) (1.8) (5.1) (1.2)Model 3: NO2 0.3 4.2 10.4 6.5∗∗ 0.8 -2.4 0.2
(2.4) (8.2) (11.5) (3.0) (1.3) (2.9) (0.9)Panel B: Ages Below 5
Model 1: CO 0.370∗∗ 2.087∗∗ 2.669∗∗ 0.001 0.005 -0.009 0.001(0.159) (0.867) (1.141) (0.037) (0.012) (0.135) (0.030)
Model 2: CO 0.400∗∗∗ 2.139∗∗ 2.673∗∗ -0.001 0.007 0.006 -0.004(0.153) (0.853) (1.139) (0.037) (0.013) (0.130) (0.032)
Model 3: CO 0.359∗∗∗ 1.812∗∗∗ 2.113∗∗ -0.022 0.003 -0.043 0.001(0.107) (0.627) (0.852) (0.027) (0.009) (0.117) (0.021)
Model 1: NO2 27.6∗∗ 155.7∗∗ 199.1∗∗ 0.1 0.3 -0.7 0.1(13.0) (73.6) (99.2) (2.8) (0.9) (10.1) (2.3)
Model 2: NO2 29.9∗∗ 159.7∗∗ 199.6∗∗ -0.1 0.5 0.4 -0.3(12.6) (72.2) (99.1) (2.7) (1.0) (9.7) (2.4)
Model 3: NO2 19.7∗∗ 100.7∗ 118.8∗ -2.3 0.4 0.7 1.2(7.8) (52.9) (71.9) (2.2) (0.9) (8.9) (2.0)
Panel C: Ages 65 and OlderModel 1: CO 0.176∗ 0.459∗∗ 1.241∗∗∗ 0.995∗∗∗ 0.103 0.253 0.005
(0.098) (0.232) (0.383) (0.367) (0.184) (0.228) (0.027)Model 2: CO 0.173∗ 0.445∗∗ 1.251∗∗∗ 0.998∗∗∗ 0.069 0.226 0.001
(0.097) (0.224) (0.375) (0.370) (0.184) (0.226) (0.026)Model 3: CO 0.109∗ 0.309∗∗ 0.939∗∗∗ 0.503∗ 0.115 0.092 -0.024
(0.061) (0.139) (0.284) (0.272) (0.139) (0.148) (0.022)Model 1: NO2 13.7∗ 35.9∗∗ 97.1∗∗∗ 77.9∗∗ 8.1 19.8 0.4
(7.4) (17.3) (32.1) (30.5) (14.6) (17.3) (2.1)Model 2: NO2 13.7∗ 35.9∗∗ 97.2∗∗∗ 77.9∗∗ 8.1 19.8 0.4
(7.4) (17.3) (32.1) (30.5) (14.6) (17.3) (2.1)Model 3: NO2 8.5∗∗ 20.2∗∗ 53.5∗∗ 55.5∗∗∗ 5.2 3.7 -0.6
(4.2) (10.1) (22.5) (20.5) (9.8) (11.2) (1.5)
Observations 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095
Notes: Table replicates Table 4 except that sickness counts are based on primary diagnosis codes only. Table
regresses zip-code level sickness rates (counts for primary and secondary diagnosis codes per 10 million people) on
daily instrumented pollution levels (ppb) in 2005-2007. Each entry is a separate regression. Pollution is instrumented
on airport congestion (taxi time) that is caused by network delays (taxi time at three major airports in the Eastern
United States). Model 1 assumes a uniform impact of congestion on pollution levels at all zip codes surrounding
an airport, while model 2 adds an interaction with the distance to the airport, and model 3 furthermore adds
interactions with wind direction and speed (columns (a)-(c) in Table 1). All regressions include weather controls
(quadratic in minimum and maximum temperature, precipitation and wind speed as well as controls for wind
direction), temporal controls (year, month, weekday, and holiday fixed effects), and zip code fixed effects. Regressions
are weighted by the total population in a zip code. Errors are two-way clustered by zip code and day. Significance
levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A17
Table A13: Sickness Rates of All Ages Regressed On Instrumented Pollution - Sensitivity of IV
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: Baseline: Tax Time at Eastern AirportsModel 1: CO 0.311∗∗∗ 0.556∗∗∗ 0.761∗∗∗ 0.434∗∗∗ 0.057 -0.030 0.006
(0.065) (0.162) (0.207) (0.134) (0.039) (0.063) (0.015)Model 2: CO 0.307∗∗∗ 0.550∗∗∗ 0.755∗∗∗ 0.419∗∗∗ 0.050 -0.030 0.003
(0.062) (0.163) (0.210) (0.128) (0.038) (0.064) (0.015)Model 3: CO 0.194∗∗∗ 0.396∗∗∗ 0.515∗∗∗ 0.226∗∗∗ 0.020 -0.039 0.002
(0.047) (0.125) (0.165) (0.079) (0.030) (0.040) (0.011)Model 1: NO2 24.5∗∗∗ 43.8∗∗∗ 59.9∗∗∗ 34.2∗∗∗ 4.5 -2.4 0.5
(6.2) (16.2) (20.5) (10.5) (3.1) (5.2) (1.2)Model 2: NO2 24.3∗∗∗ 43.6∗∗∗ 59.8∗∗∗ 33.5∗∗∗ 4.2 -2.4 0.3
(6.1) (16.3) (20.8) (10.4) (3.1) (5.2) (1.2)Model 3: NO2 12.4∗∗∗ 18.9∗ 24.2∗ 17.1∗∗ 0.7 -1.0 0.3
(4.0) (11.0) (14.2) (7.1) (2.2) (3.0) (0.9)Panel B: Tax Time 5am to noon at Eastern Airports
Model 1: CO 0.358∗∗∗ 0.532∗∗∗ 0.656∗∗∗ 0.491∗∗∗ 0.044 -0.052 0.013(0.097) (0.165) (0.210) (0.166) (0.043) (0.064) (0.015)
Model 2: CO 0.346∗∗∗ 0.514∗∗∗ 0.634∗∗∗ 0.474∗∗∗ 0.039 -0.056 0.011(0.092) (0.162) (0.209) (0.158) (0.043) (0.065) (0.015)
Model 3: CO 0.200∗∗∗ 0.372∗∗∗ 0.450∗∗∗ 0.225∗∗∗ 0.006 -0.050 0.004(0.054) (0.116) (0.149) (0.082) (0.029) (0.036) (0.010)
Model 1: NO2 27.3∗∗∗ 40.6∗∗∗ 50.1∗∗∗ 37.5∗∗∗ 3.4 -4.0 1.0(7.6) (13.8) (17.2) (12.9) (3.3) (5.1) (1.2)
Model 2: NO2 27.0∗∗∗ 40.1∗∗∗ 49.4∗∗∗ 37.0∗∗∗ 3.2 -4.1 0.9(7.5) (13.8) (17.3) (12.7) (3.3) (5.2) (1.2)
Model 3: NO2 13.4∗∗∗ 18.5∗ 20.9 17.2∗∗ -0.2 -2.0 0.4(4.6) (10.2) (12.7) (7.4) (2.2) (2.8) (0.8)
Panel C: Weather at Eastern AirportsModel 1: CO 0.346∗ 1.009∗ 1.362∗ 0.530∗ 0.173∗ 0.265∗ -0.002
(0.181) (0.580) (0.755) (0.313) (0.095) (0.154) (0.029)Model 2: CO 0.396∗∗ 1.096∗ 1.501∗ 0.541∗ 0.134 0.232 -0.016
(0.192) (0.606) (0.800) (0.323) (0.089) (0.145) (0.029)Model 1: NO2 24.9∗ 72.9 98.3 38.3∗ 12.5 19.2∗ -0.1
(14.5) (48.0) (62.0) (23.2) (7.7) (11.1) (2.1)Model 2: NO2 29.4∗ 78.3∗ 108.4∗ 35.7 6.2 12.6 -2.0
(15.6) (47.0) (61.8) (22.0) (5.6) (8.4) (2.1)
Observations 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095
Notes: Table lists the results for all ages from Table 4 in Panel A. Panel B instruments taxi time at California
airports on the taxi time between 5am and noon of each day at the three Eastern Airports. Panel C uses the
weather at each of the three Eastern airports as instrument (quadratic in minimum and maximum temperature,
precipitation and wind speed). We do not estimate model 3 in panel C as it would include 3456 instruments.
Table regresses zip-code level sickness rates (counts for primary and secondary diagnosis codes per 10 million
people) on daily instrumented pollution levels (ppb) in 2005-2007. Each entry is a separate regression. Pollution is
instrumented on airport congestion (taxi time) that is caused by network delays (taxi time at three major airports
in the Eastern United States). Model 1 assumes a uniform impact of congestion on pollution levels at all zip codes
surrounding an airport, while model 2 adds an interaction with the distance to the airport, and model 3 furthermore
adds interactions with wind direction and speed (columns (a)-(c) in Table 1). All regressions include weather
controls (quadratic in minimum and maximum temperature, precipitation and wind speed as well as controls for
wind direction), temporal controls (year, month, weekday, and holiday fixed effects), and zip code fixed effects.
Regressions are weighted by the total population in a zip code. Errors are two-way clustered by zip code and day.
Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A18
Table A14: Sickness Rates Regressed On Instrumented CO Pollution - by Season
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: All AgesM1: CO Summer 0.273∗∗ 0.076 0.310 0.402∗∗ 0.092 0.060 0.001
(0.116) (0.231) (0.284) (0.168) (0.070) (0.123) (0.029)M1: CO Winter 0.355∗∗∗ 0.660∗∗∗ 0.763∗∗∗ 0.527∗∗∗ 0.040 -0.139∗∗ 0.021
(0.106) (0.181) (0.223) (0.201) (0.043) (0.059) (0.019)PSummer=Winter 0.548 0.028∗∗ 0.127 0.516 0.449 0.063∗ 0.529
M2: CO Summer 0.276∗∗ 0.077 0.317 0.394∗∗ 0.090 0.064 -0.000(0.114) (0.231) (0.287) (0.169) (0.069) (0.125) (0.029)
M2: CO Winter 0.328∗∗∗ 0.622∗∗∗ 0.714∗∗∗ 0.500∗∗∗ 0.031 -0.139∗∗ 0.016(0.096) (0.172) (0.212) (0.179) (0.042) (0.059) (0.019)
PSummer=Winter 0.688 0.033∗∗ 0.158 0.557 0.394 0.062∗ 0.618
M3: CO Summer 0.247∗∗ -0.003 0.151 0.302∗ 0.091 0.049 0.008(0.102) (0.200) (0.257) (0.163) (0.065) (0.108) (0.028)
M3: CO Winter 0.154∗∗∗ 0.254∗∗ 0.247∗ 0.210∗∗ -0.009 -0.082∗∗ 0.004(0.059) (0.101) (0.141) (0.098) (0.034) (0.036) (0.012)
PSummer=Winter 0.376 0.154 0.665 0.574 0.168 0.193 0.872
Panel B: Ages Below 5M1: CO Summer 0.137 -2.182 -1.617 0.325∗ 0.051 0.536 0.003
(0.348) (1.336) (1.599) (0.182) (0.054) (0.331) (0.056)M1: CO Winter 0.782∗∗ 2.621∗∗ 3.184∗∗ 0.085 -0.013 -0.239 -0.014
(0.334) (1.168) (1.403) (0.093) (0.025) (0.157) (0.044)PSummer=Winter 0.162 0.001∗∗∗ 0.003∗∗∗ 0.215 0.344 0.033∗∗ 0.819
M2: CO Summer 0.160 -2.063 -1.473 0.279 0.053 0.547 0.002(0.331) (1.341) (1.640) (0.179) (0.058) (0.334) (0.057)
M2: CO Winter 0.732∗∗ 2.438∗∗ 2.919∗∗ 0.061 -0.011 -0.204 -0.023(0.317) (1.140) (1.371) (0.092) (0.026) (0.157) (0.045)
PSummer=Winter 0.172 0.001∗∗∗ 0.006∗∗∗ 0.263 0.370 0.047∗∗ 0.725
M3: CO Summer 0.101 -2.195∗ -1.923 0.299∗ 0.072 0.369 -0.012(0.293) (1.216) (1.500) (0.166) (0.056) (0.287) (0.055)
M3: CO Winter 0.568∗∗ 1.457∗∗ 1.530∗ -0.004 -0.009 -0.086 -0.015(0.244) (0.723) (0.850) (0.068) (0.016) (0.156) (0.029)
PSummer=Winter 0.211 0.002∗∗∗ 0.013∗∗ 0.091∗ 0.218 0.129 0.969
Observations 179580 179580 179580 179580 179580 179580 179580Zips x Season 328 328 328 328 328 328 328Days 549 549 549 549 549 549 549
Notes: Table regresses zip-code level sickness rates (counts for primary and secondary diagnosis codes per 10 million
people) on daily instrumented CO pollution levels (ppb) in 2005-2007. Regressions include separate coefficients for
summer (April-September) and winter (October-March), where all controls are also allowed to differ by season.
Daily mean pollution levels for CO in the summer months is 412.6 ppb, and the daily mean pollution level for CO
in winter months is 760.4 ppb. The p-value for the test whether the coefficients on pollution are the same for both
seasons is added below each regression. Pollution is instrumented on airport congestion (taxi time) that is caused
by network delays (taxi time at three major airports in the Eastern United States). Model 1 assumes a uniform
impact of congestion on pollution levels at all zip codes surrounding an airport, while model 2 adds an interaction
with the distance to the airport, and model 3 furthermore adds interactions with wind direction and speed (columns
(a)-(c) in Table 1). All regressions include weather controls (quadratic in minimum and maximum temperature,
precipitation and wind speed as well as controls for wind direction), temporal controls (year, month, weekday, and
holiday fixed effects), and zip code fixed effects. Regressions are weighted by the total population in a zip code.
Errors are two-way clustered by zip code and day. Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.A19
Table A15: Sickness Rates Regressed On Instrumented Pollution (Controlling for Ozone, All Ages)
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A: All AgesModel 1: CO 0.311∗∗∗ 0.556∗∗∗ 0.761∗∗∗ 0.435∗∗∗ 0.057 -0.030 0.006
(0.064) (0.163) (0.207) (0.133) (0.039) (0.063) (0.015)Model 2: CO 0.306∗∗∗ 0.550∗∗∗ 0.755∗∗∗ 0.418∗∗∗ 0.050 -0.030 0.002
(0.061) (0.163) (0.211) (0.126) (0.038) (0.064) (0.015)Model 3: CO 0.198∗∗∗ 0.412∗∗∗ 0.538∗∗∗ 0.235∗∗∗ 0.021 -0.037 0.002
(0.048) (0.128) (0.169) (0.080) (0.030) (0.041) (0.011)Model 1: NO2 24.5∗∗∗ 43.9∗∗∗ 60.0∗∗∗ 34.3∗∗∗ 4.5 -2.4 0.5
(6.3) (16.5) (20.9) (10.5) (3.2) (5.2) (1.2)Model 2: NO2 24.3∗∗∗ 43.6∗∗∗ 59.8∗∗∗ 33.3∗∗∗ 4.1 -2.4 0.2
(6.1) (16.6) (21.2) (10.3) (3.1) (5.3) (1.2)Model 3: NO2 13.0∗∗∗ 21.1∗ 27.5∗ 18.7∗∗∗ 0.9 -0.5 0.3
(4.3) (11.8) (15.1) (7.1) (2.2) (3.1) (0.9)
Observations 179580 179580 179580 179580 179580 179580 179580Zip Codes 164 164 164 164 164 164 164Days 1095 1095 1095 1095 1095 1095 1095
Notes: Table replicates Panel A of Table 4 except that it also controls for ozone. It regresses zip-code level sickness
rates (counts for primary and secondary diagnosis codes per 10 million people) on daily instrumented pollution levels
(ppb) in 2005-2007. Each entry is a separate regression. Pollution is instrumented on airport congestion (taxi time)
that is caused by network delays (taxi time at three major airports in the Eastern United States). Model 1 assumes
a uniform impact of congestion on pollution levels at all zip codes surrounding an airport, while model 2 adds an
interaction with the distance to the airport, and model 3 furthermore adds interactions with wind direction and
speed (columns (a)-(c) in Table 1). All regressions include weather controls (quadratic in minimum and maximum
temperature, precipitation and wind speed as well as controls for wind direction), temporal controls (year, month,
weekday, and holiday fixed effects), and zip code fixed effects. Regressions are weighted by the total population in
a zip code. Errors are two-way clustered by zip code and day. Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗
10%.
A20
Table A16: Sickness Rates Regressed On Instrumented Particulate Pollution, All Ages - LIML
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Model 1: PM2.5 20.1 -76.3 -64.5 34.9 -1.6 -14.8 -1.6(34.5) (126.6) (138.1) (42.9) (13.2) (18.6) (4.1)
FFirst Stage 0.89 0.89 0.89 0.89 0.89 0.89 0.89
Model 2: PM2.5 26.4 -81.9 50.2 65.5 -1.0 -9.5 0.6(18.4) (2086.4) (276.3) (46.5) (9.9) (10.7) (3.6)
FFirst Stage 3.49 3.49 3.49 3.49 3.49 3.49 3.49
Model 3: PM2.5 -4.9 -5.3 -10.7 15.5 3.3 1.3 0.7(3.2) (15.5) (17.7) (11.2) (3.4) (4.9) (1.5)
FFirst Stage 4.90 4.90 4.90 4.90 4.90 4.90 4.90
Observations 41926 41926 41926 41926 41926 41926 41926Zip Codes 53 53 53 53 53 53 53Days 1091 1091 1091 1091 1091 1091 1091
Notes: Table regresses zip-code level sickness rates (counts for primary and secondary diagnosis codes per 10 million
people) on daily instrumented pollution levels (ppb) in 2005-2007. Each entry is a separate regression. Pollution is
instrumented on airport congestion (taxi time) that is caused by network delays (taxi time at three major airports
in the Eastern United States). Model 1 assumes a uniform impact of congestion on pollution levels at all zip codes
surrounding an airport, while model 2 adds an interaction with the distance to the airport, and model 3 furthermore
adds interactions with wind direction and speed. All regressions include weather controls (quadratic in minimum
and maximum temperature, precipitation, precipitation and wind speed as well as controls for wind direction),
temporal controls (year, month, weekday, and holiday fixed effects), and zip code fixed effects. Regressions are
weighted by the total population in a zip code. Errors are two-way clustered by zip code and day. Significance levels
are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A21
Table A17: Sickness Rates Regressed On Instrumented Pollution - Inpatient vs Outpatient Data
Acute All Heart Bone Appen-Asthma Respiratory Respiratory Problems Stroke Fractures dicitis
(1a) (1b) (1c) (2) (3) (4) (5)
Panel A1: All Ages - Inpatient DataModel 1: CO 0.041 0.116∗ 0.122 0.163∗ 0.021 0.040 0.007
(0.041) (0.060) (0.089) (0.092) (0.030) (0.029) (0.015)Model 2: CO 0.039 0.112∗ 0.119 0.154∗ 0.016 0.040 0.004
(0.039) (0.059) (0.090) (0.092) (0.030) (0.030) (0.015)Model 3: CO 0.020 0.069 0.059 0.074 -0.008 0.001 0.003
(0.027) (0.042) (0.069) (0.055) (0.025) (0.022) (0.010)Panel B1: Ages Below 5 - Inpatient Data
Model 1: CO 0.013 0.224 0.082 0.093 0.013 -0.028 -0.012(0.146) (0.285) (0.366) (0.069) (0.019) (0.053) (0.024)
Model 2: CO 0.020 0.224 0.069 0.071 0.014 -0.017 -0.016(0.140) (0.277) (0.361) (0.064) (0.021) (0.053) (0.025)
Model 3: CO 0.112 0.326∗ 0.172 0.062 0.017 -0.019 -0.005(0.094) (0.188) (0.265) (0.046) (0.014) (0.041) (0.017)
Panel C1: Ages 65 and Older - Inpatient DataModel 1: CO 0.324 0.771∗∗ 0.903∗ 1.873∗∗ 0.248 0.240 0.015
(0.211) (0.349) (0.498) (0.764) (0.253) (0.156) (0.028)Model 2: CO 0.299 0.738∗∗ 0.912∗ 1.848∗∗ 0.225 0.224 0.013
(0.197) (0.349) (0.509) (0.763) (0.256) (0.155) (0.027)Model 3: CO 0.150 0.234 0.474 0.953∗ 0.010 -0.006 -0.029
(0.138) (0.250) (0.400) (0.490) (0.209) (0.119) (0.026)Panel A2: All Ages - Outpatient Data
Model 1: CO 0.269∗∗∗ 0.440∗∗∗ 0.639∗∗∗ 0.271∗∗∗ 0.037∗∗ -0.070 -0.002(0.053) (0.135) (0.167) (0.069) (0.015) (0.046) (0.004)
Model 2: CO 0.268∗∗∗ 0.438∗∗∗ 0.637∗∗∗ 0.264∗∗∗ 0.035∗∗ -0.070 -0.002(0.053) (0.135) (0.168) (0.064) (0.014) (0.046) (0.004)
Model 3: CO 0.174∗∗∗ 0.327∗∗∗ 0.456∗∗∗ 0.153∗∗∗ 0.028∗∗∗ -0.040 -0.001(0.041) (0.108) (0.135) (0.047) (0.010) (0.027) (0.002)
Panel B2: Ages Below 5 - Outpatient DataModel 1: CO 0.552∗∗∗ 1.724∗ 2.601∗∗ 0.060∗ 0.006 0.078 0.006
(0.189) (0.957) (1.115) (0.034) (0.006) (0.120) (0.017)Model 2: CO 0.559∗∗∗ 1.706∗ 2.555∗∗ 0.055∗ 0.006 0.081 0.004
(0.189) (0.953) (1.123) (0.033) (0.005) (0.116) (0.018)Model 3: CO 0.557∗∗∗ 1.840∗∗∗ 2.321∗∗∗ 0.013 0.006∗ 0.006 -0.004
(0.159) (0.692) (0.814) (0.024) (0.003) (0.107) (0.011)Panel C2: Ages 65 and Older - Outpatient Data
Model 1: CO 0.525∗∗∗ 0.698∗∗∗ 1.411∗∗∗ 1.731∗∗∗ 0.278∗∗ 0.200 0.002(0.148) (0.225) (0.321) (0.403) (0.108) (0.144) (0.004)
Model 2: CO 0.517∗∗∗ 0.675∗∗∗ 1.364∗∗∗ 1.681∗∗∗ 0.277∗∗ 0.185 0.002(0.140) (0.216) (0.308) (0.381) (0.110) (0.143) (0.004)
Model 3: CO 0.343∗∗∗ 0.462∗∗∗ 0.950∗∗∗ 0.984∗∗∗ 0.188∗∗ 0.193∗∗ 0.004(0.106) (0.151) (0.222) (0.295) (0.087) (0.088) (0.004)
Notes: Table replicates Table 4 except that sickness counts only use Inpatient Data (patients stay overnight) or
Outpatient Data (patients do not stay overnight). Table regresses zip-code level sickness rates (counts for primary
and secondary diagnosis codes per 10 million people) on daily instrumented pollution levels (ppb) in 2005-2007.
Each entry is a separate regression. Pollution is instrumented on airport congestion (taxi time) that is caused by
network delays (taxi time at three major airports in the Eastern United States). Model 1 assumes a uniform impact
of congestion on pollution levels at all zip codes surrounding an airport, while model 2 adds an interaction with
the distance to the airport, and model 3 furthermore adds interactions with wind direction and speed (columns
(a)-(c) in Table 1). All regressions include weather controls (quadratic in minimum and maximum temperature,
precipitation and wind speed as well as controls for wind direction), temporal controls (year, month, weekday, and
holiday fixed effects), and zip code fixed effects. Regressions are weighted by the total population in a zip code.
Errors are two-way clustered by zip code and day. Significance levels are indicated by ∗∗∗ 1%, ∗∗ 5%, ∗ 10%.
A22