Thesis Trickey Kevin draft 3

THE UNIVERSITY OF CHICAGO

HYDRAULIC FRACTURING-ASSOCIATED HEALTH OUTCOMES AMONG

THE ELDERLY IN PENNSYLVANIA AND NEW YORK, 1999–2015

AN HONORS THESIS SUBMITTED

TO THE FACULTY OF THE BIOLOGICAL SCIENCES COLLEGIATE

DIVISION IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR A BACHELOR OF SCIENCE

DEGREE IN BIOLOGICAL SCIENCES WITH RESEARCH HONORS

BY

KEVIN TRICKEY

CHICAGO, IL

MAY 2020

Table of Contents

1 Acknowledgments
2 Abstract
3 Background
  3.1 Overview
  3.2 Development of Well Drilling and Stimulation Technologies
    3.2.1 Early Drilling: A Story of Salt
    3.2.2 The Ruffner Brothers and Early American Brine Technologies
    3.2.3 Oil Distillation and the Drake Well
    3.2.4 Colonel Roberts’ Explosive Torpedoes
    3.2.5 Well Casings and Perforations
    3.2.6 Recent Technologies Enabling the Fracking Boom
  3.3 Stages of a Modern Fracking Operation
    3.3.1 Well Pad Construction and Drilling Rig Setup
    3.3.2 Drilling
    3.3.3 Perforation and Fracking
    3.3.4 Flowback, Production, and Waste Disposal
    3.3.5 Reclamation
  3.4 Community Health with Fracking: Concerns and Evidence
    3.4.1 Ground and Surface Water Contamination
    3.4.2 Air Pollution
    3.4.3 All-Cause Hospitalization Rates
    3.4.4 Traffic Injury
    3.4.5 Other Impacts
4 Methods
  4.1 Data Sources and Rationale
    4.1.1 Hydraulic Fracturing Activity Data
    4.1.2 Hospitalization Data
  4.2 Model Specifications
    4.2.1 Study Regions
    4.2.2 Study Population and Health Outcomes
    4.2.3 Panel Data and Model Specifications
5 Results
  5.1 Characteristics of Study Regions
  5.2 Model Estimates
  5.3 Sensitivity Analyses
    5.3.1 Different Exposure Specifications
    5.3.2 Alternative Control Regions
    5.3.3 Exclusion of Outlier Zip Codes
    5.3.4 Exclusion of Populous and Sparse Zip Codes
6 Discussion
  6.1 Summary of Evidence
  6.2 Strengths, Limitations, and Future Directions
    6.2.1 Strengths and Limitations
    6.2.2 Procedural Remarks
    6.2.3 The Ideal Analysis
  6.3 Conclusion
7 References
8 Supplementary Material

1 Acknowledgments

Beyond the results presented here, this thesis is the culmination of a two-and-a-half-year

journey of development for me with my mentor, Prachi Sanghavi. So much of my learning, my

success, and my view of science has been informed by Prachi’s training and personal dedication.

As an advisor she has always offered me complete support as well as the latitude to pursue my

ambitions and achieve them, and I feel extremely grateful for her backing, advice, and

friendship.

I would also like to thank the other former and current members of the Sanghavi Lab, in

particular Sibyl Pan, Sameep Shah, and Jessy Nguyen, for their help in dealing with all types of

scientific and technical challenges. I also owe thanks to the Center for Research Informatics and

the Biostatistics Lab at the University of Chicago for providing the computational resources and

support necessary to complete this work. In particular, I thank Anup Patel for his help in

managing the Medicare claims data for this study.

I am grateful to David Pincus, Steve Kron, and the members of the 2020 BSCD Research

Honors Program for providing helpful feedback, encouraging my best work, teaching me to be a

skeptical student of science, and providing lunch. I also thank Dan Nicolae, Kavi Bhalla, and

Brandon Pierce for taking the time to review my thesis and participate in my thesis committee.

Their support of undergraduate research is inspiring and very appreciated.

Finally, I could not have achieved this without the unconditional support from my family

and friends. This thesis is theirs as much as mine—the good parts of it, at least.

2 Abstract

Hydraulic fracturing has become a major US industry over the last decade, with many

communities in northern Pennsylvania experiencing rapid and dense unconventional well

development. However, there are many unresolved questions over the impact of hydraulic

fracturing activity on public health via pollution and other mechanisms. Broad studies of all

hospitalization data have suggested general associations between hydraulic fracturing

development and increased cardiovascular, respiratory, and genitourinary outcomes, but further

work is needed to characterize the nature of the associations and any particular outcomes that

may drive them.

In these analyses I focus on specific outcomes identified through the literature on air and

water pollution, rather than general hypothesis testing across diagnosis groups. I model inpatient

hospitalization rates among the Medicare population based on exposure to hydraulic fracturing

activity in the Marcellus Shale region of Pennsylvania/New York. Using zip code-level panel

data from 1999–2015, I find significant associations between cumulative hydraulic fracturing

development and hospitalizations for arrhythmias, thrombocytic causes, and heart failures, robust

to various sensitivity analyses. I additionally find some evidence for associations with

cerebrovascular disease, kidney disease, and chronic obstructive pulmonary disease. I do not find

evidence for associations with pneumonia, certain genitourinary outcomes, or motor vehicle

accidents. These analyses present new evidence that exposure to local hydraulic fracturing,

through air pollution or other pathways, has strong adverse effects on cardiovascular health and

impacts general morbidity among the elderly.

3 Background

3.1 Overview

Hydraulic fracturing (“fracking”) is a complex process at the intersection of many

industries, many of which have potential to impact human health. Construction, transportation,

chemical and environmental engineering, and waste disposal are among those that may affect

local health directly, while the social and economic effects on fracking communities may also

have less direct influences. Nationally, fracking has transformed the energy landscape of the

United States, elevating the US to both the world’s largest crude oil producer since 2018 (1) and

the world’s largest natural gas producer since 2012 (2).

Because of the complexity and various components of fracking operations, elucidating their

relevance to human health requires a holistic understanding of the entire fracturing process, from

initial construction to final reclamation. To that end, this introduction first describes the major

milestones in drilling and fracturing that led to our modern understanding of fracking. Second, I

characterize the processes involved in a modern fracking operation, with particular focus on the

stages of relevance to this study. Finally, I review the existing literature on health outcomes

associated with fracking activity. Readers wishing to skip a review of fracking’s history may

jump to Section 3.3 or 3.4 as desired.

3.2 Development of Well Drilling and Stimulation Technologies

3.2.1 Early Drilling: A Story of Salt

Modern oil and natural gas development is founded upon a long history of ground fluid

extraction, with some technologies dating back over two thousand years. Although petroleum

and gas are the most common fluids drilled for today (excluding freshwater, which is

comparatively shallow and easily accessible), the development of industrial well-boring

technologies was almost exclusively motivated by saltwater brine mining until the 19th century.

Saltwater was boiled off to produce salt, a highly valuable commodity for food supplementation,

preservation, and trade.

The earliest evidence of brine extraction wells comes from the Sichuan region, China.

Around 2,250 years ago, inhabitants were boring wells to extract groundwater with a salinity of

around 50 grams per liter, compared to the 35 grams per liter in seawater (3). These deeper wells

were bored using a percussion-rod technique: heavy, metal chisels were bound to the end of long

bamboo poles (bamboo is resistant to salt corrosion), which were repeatedly bashed into a hole in

the ground to grind away rock. Around 1050 C.E., this gave way to the similar cable-tool drilling

technique, where the chisel was suspended instead by a much thinner, flexible bamboo cable,

hanging from a derrick resembling a hangman’s gallows,1 and dropped into the well before being

hoisted up again. With its lighter total weight, deeper range, and greater power, the cable-tool

approach enabled drilling to depths of hundreds of meters, and remained the dominant well-

boring method until nearly 1900.

Although the cable-tool process is rarely seen today in industrial nations, the extent of its

technological development was remarkable. Specialized drill bits were crafted for every depth

and rock formation. Cave-ins, lost tools, and off-course drilling were frequent issues encountered

by drillers, and myriad instruments akin to fishing hooks and vascular balloon stents were

developed to solve them. Even diagnosing a problem could be challenging when it was hundreds

of feet below the surface. Of particular relevance to future well stimulation techniques was the

usage of well casings, or hollowed logs inserted at the surface of the well so that fluids would

pass through them on their way to the surface. Instead of preventing fluid loss during extraction,

early casings were generally deployed to keep shallow freshwater out of the well to ensure

maximal saltiness of produced brines.

1 In fact, this resemblance gave rise to the word “derrick” itself, after famous English executioner Thomas

Derrick, who lived c. 1600.

The Sichuan brine industry continued to develop for centuries and, although it became

practiced in other parts of the world, including the United States, brine extraction remained a

dominant economic and technologically innovative force in the region well into the twentieth

century. In 1835, the Shenhai salt well became the first shaft to surpass 1,000 meters in depth.

The region also emerged as one of the first to use ground-recovered petroleum and natural gas

for industrial purposes. Although people had long known that flammable oil and gas could be

extracted from the ground, often as unwanted by-products of brine drilling, in the 16th century

salt miners realized that these by-products could be burned below the produced brines as part of

the boiling process. This development fueled the large-scale industrialization of the Sichuan salt

mining industry and primed the world for an oil and gas drilling explosion.

3.2.2 The Ruffner Brothers and Early American Brine Technologies

In the late 1700s, Caucasian settlers in the Kanawha Valley, modern day West Virginia,

recorded the presence of salty springs in the region. The ready salt supply attracted waves of new

settlers, and those who survived gruesome battles with indigenous tribes began sinking hollowed

logs into the salty quicksand to amplify production. Word of the abundant, red-tinged saltwater

spread quickly such that Joseph Ruffner, a farmer who had heard of the impressive Kanawha salt

springs, purchased 502 acres without even seeing them. In 1803 his sons David and Joseph inherited

the land and set in motion plans for a large brine mining operation.

Another regional entrepreneur, Elisha Brooks, had been producing 150 pounds of salt per

day from shallow wells and salt furnaces (4), but in 1806 the Ruffner brothers began preparation

for a more productive supply. For the well, called “The Great Buffalo Lick,”2 the idea was to insert a

hollowed sycamore tree (called a “gum”), with an internal diameter of 4 feet, to the bottom of the

quicksand where the source of the saltwater was purported to be. A platform was fixed at the

surface where two men would empty bucketloads of saltwater; two other men would operate a

pulley contraption stand, lowering a large bucket (made from a whiskey barrel) to a fifth man

inside the gum. The Ruffner brothers managed to sink the gum 13 feet before encountering rock.

They relocated to 100 yards further from the nearby river, this time sinking the gum through 45

feet of quicksand before encountering the same sediments that were near the river. They bored

through this percussively with a 20-foot oak tube, sharpened and iron-tipped, but found the brine

beneath to be less salty than it had been by the river.

Giving up once more, they returned to their original location, this time sealing the gum

against the bottom rock with closely engineered refinement of the tube’s base and small wedges

inserted into the open spaces. Discovering that the water seeping up through the rock was highly

salty, the Ruffner brothers slowly and painstakingly bored through 40 feet of rock before

encountering a strong flow of brine on January 15, 1808. The famous final construction was a

whittled and joined wooden tube, 2½ inches in diameter, that was inserted as a watertight well

casing to prevent dilution of the brine on its journey upward through the rock. After 18 months

of work and uncertainty, in February of 1808 the first batch of brine was boiled in the Ruffner

Bros.’ salt furnace.

2 The name was derived from the vast herds of buffalo, deer, and other animals that would pass through the

valley to lick the salt deposits. Large game was so plentiful that the famous hunter, pioneer, and warrior Daniel

Boone settled just across the river from the Ruffners’ salt well.

Although they built on technology from overseas and undoubtedly relied on prior,

undocumented experiences, the Ruffners’ operation was in many ways a pioneering venture,

predicted to fail by many yet closely followed by their neighbors in case of success. The Great

Buffalo Lick is now recognized as the first documented bored well in the Western Hemisphere

(5). In particular, the geological conditions in the Appalachian region required innovation of

advanced well casing technologies to seal the shaft from dilution. Upon their success, drilling

operations commenced all over the valley by various entrepreneurs, and by 1817 at least 15

regional wells produced over 600,000 bushels of salt (6). Eventually, a great 1861 flood and the

destruction of the Civil War would all but extinguish the Kanawha Valley brine industry, but by

that point drilling technology had pivoted its epicenter to other regions and begun to center on oil

production.

3.2.3 Oil Distillation and the Drake Well

Natural gas and petroleum have been known as combustible fuels for as long as historical

records have been kept, but for widespread consumption these fossil fuels were disfavored relative to

other materials like wood, coal, and animal fats until the 19th century in America. Although

natural wells and spring-like effluents of natural gas and petroleum exist in certain locations, the

fluids were difficult to transport from the collection site and, for petroleum, highly smoky and

unappealing to burn.

William Hart is credited with the first natural gas well in the United States. In 1821, he

reportedly trapped combustible natural gas bubbling up in a creek with his wife’s washtub, and

subsequently dug a well in Fredonia, New York (7).3 Hart, a gunsmith by trade, dug the well

himself, although the exact circumstances and manner are unknown. By contrast, the first

American oil well on record was entirely accidental. On March 11, 1829, brine drillers in

Kentucky struck an oil deposit and observed an enormous geyser of petroleum (8).4

Globally, the first intentionally drilled oil well was dug by percussive tool in 1846 in

Baku, by Russian engineers Semyenov and Alekseev (9). The depth of their well was 69 feet;

over a decade later, George Bissell and Edwin Drake famously also drilled 69 feet to encounter

oil in Pennsylvania.

Aside from incremental improvement in drilling technologies, the drivers of such a

radical explosion in oil production largely had to do with the ability to refine crude petroleum

(known as “rock oil”). Samuel Martin Kier, who in 1849 noticed that the oil contaminating his

saltwater wells appeared the same as the medicinal oil purchased by his wife, teamed up with

John Kirkpatrick to construct the first oil refinery (10). Various attempts at distillation provided

measurable but ultimately economically insufficient success. In 1851, Kier first sold a crude oil

distillate as lamp fuel, but even this produced too much smoke to be a practical indoor lighting

solution (11).

3 He began to supply fuel to 5 nearby buildings, launching the precursor company to modern billion-dollar

corporation National Fuel.

4 The veracity of this report as the first oil well is ambiguous and contested, but is supported by a 1934

on-site commemorative tablet commissioned by the Kentucky General Assembly.

Three years later, Benjamin Silliman, Jr., a Yale chemist commissioned by lawyer

George Bissell, applied the technique of fractional distillation (invented by Silliman Sr.) to crude

oil. Suddenly, the resulting distillate burned even more brightly and cleanly than whale oil and

ethanol, the standard lamp fuels of the day. The development of the distillation process from

crude petroleum into kerosene sparked an economic demand for petroleum that still drives

drilling operations today.5 Bissell, elated at the prospect, founded the Pennsylvania Rock Oil

Company, precursor to the Seneca Oil Company (12). He hired an unemployed railroad

conductor, Edwin Drake, seemingly based only upon Drake’s free railroad pass to Titusville,

Pennsylvania. In Titusville, Drake and his assistant Billy Smith dug for oil unsuccessfully until

the Seneca Oil Company retracted its funding and accepted its losses (13–15). Drake obtained a

personal credit line to continue drilling, and the famous “Drake Well” struck oil just before his

funds ran out. The well is often considered the first of a great boom in American oil drilling; by

the following year, 75 wells in the same valley had produced over 500,000 barrels of oil. Despite

the explosion of the industry, Drake died a poor man: he had only purchased a small plot of land,

never patented his drilling method, and lost what he did earn on Wall Street (16).

3.2.4 Colonel Roberts’ Explosive Torpedoes

While Edwin Drake was drilling in Titusville, American politics were brewing to a

breaking point. Two years after Drake’s well struck oil, the Civil War erupted. Although the oil

rush was never fully quenched, America’s focus was squarely on the battlefield.

5 Today, electric lights have made kerosene lighting obsolete. However, in an impressively prescient

note in his experimental journal, Silliman, Jr. recognized the potential of his distilled product to act as a lubricant.

Lubricants for machinery would become a primary source of oil demand during the industrial revolution.

It was during the particularly bloody Battle of Fredericksburg, often remembered as the

heaviest Union defeat of the war (17), that Union Colonel Edward Roberts noticed Confederate

explosives being dropped into a millrace (the canal used to power a water mill) to clear it from

the battlefield (18, 19). After discharge from the Union Army in 1863, reportedly inspired by this

observation, he set to experimentation in “superincumbent fluid tamping,” where various

explosives were dropped into a well subsequently filled with water to fracture the surrounding

rock and stimulate flow. In November of 1866, Roberts was granted the first patent for a

fracturing technique (20). It described an iron case containing up to 20 pounds of explosive

powder, along with a torpedo cap that could be ignited through a long wire (Figure 1). The case

was lowered into an oil well and detonated, breaking apart surrounding rock. Roberts’ most

valuable and novel contribution was the use of water to fill the borehole before detonation,

thereby “tamping” the explosion and increasing its underground effectiveness (18).

Figure 1: U.S. Patent 59,936, the first for a fracturing device to stimulate oil well production. Developed by Col. Edward Roberts in 1866.

Roberts founded the Roberts Petroleum Torpedo Company upon this invention, selling

each torpedo for $100 to $200 ($1,800–3,600 in 2018 dollars), plus a royalty of one-fifteenth of the

additional oil flow. His steep fees and notable success—some wells reported as much as a

twelve-fold increase in oil production within a week—inspired entrepreneurs to develop all types

of similar devices, often operating at night. Roberts spent over $250,000 (approximately $4.5

million in 2018 dollars) on civil litigation against these “moonlighters,” and may hold the record

for money spent by a single person to protect a U.S. patent (21).

Roberts’ invention left a lasting impact on the oil and gas industry. His explosive of

choice, nitroglycerin, continued to be used to stimulate oil wells long after Roberts’ death in

1881. Nitroglycerin was manufactured in plants owned by the Otto Cupler Torpedo Company, a

derivative of Roberts’ original Torpedo Company, until a series of accidental explosions

destroyed the final plants by 1990.

Today, Roberts receives the credit for the invention of explosive well stimulation, with

most available evidence corroborating this perspective. One 1902 book by John James

McLaurin, however, mentions some experimentation between 1860 and 1865 that involved

lowering bottles of explosives into wells and igniting them through fuses, pistol shots, or red-hot

irons dropped through the bottle (22). Although most were reportedly unsuccessful in stimulating

increased oil flow and are no longer identifiable by place or person, the trials mentioned by

McLaurin suggest a background of ongoing invention and development that complicates Col.

Roberts’ single-inventor narrative.

3.2.5 Well Casings and Perforations

Industrialization around the turn of the twentieth century swelled the limits of well

access, with steel bits and powerful energy supplies enabling drilling depths of thousands of feet,

deeper than necessary for most oil and gas reserves. Still, the new wave of industrial drillers

faced the age-old problem of contamination by other fluids and particles from the ground.

Purification was an additional step that necessitated extra cost and more waste to deal with.

Casing the wells, or inserting a tubular structure to protect the upward flow from surrounding

rock, was the natural solution, but effective casings were an immense technological challenge at

thousand-foot depths. The Ruffner brothers had cased their 58-foot brine well with whittled

wood, but inserting and joining watertight tubes thousands of feet below surface was impractical.

Moreover, most tubing was either subject to collapse at high subterranean pressures or too heavy

to be manufactured, transported, and threaded downhole.

In 1919, a 27-year-old Erle Halliburton founded the New Method Oil Well Cementing

Company on a solution to this problem (23). His novel method involved a steel pipe, smaller in

diameter than the drilled well, inserted into the well; then, cement was pumped down into the

inner pipe. A plug was used to push the cement to the bottom of the inner pipe and then back up

around the sides of the well, in the inter-pipe space, where it hardened to become the well casing

(24). Halliburton’s company profited remarkably from this patent and would eventually take his

name to become today’s multi-billion-dollar drilling firm. Current industrial well casing methods

are still founded primarily on Halliburton’s original design.

Finally, once oil wells were effectively cased, they needed to be perforated at the right

locations to allow oil extraction. Patents for well perforating devices can be found dating back to

1902, with expanding lever or scissor-like mechanisms constructed to rupture the well casing

(25). In 1930, Bill Lane and Walt Wells conceived the idea of using guns as perforating devices

and established the Lane-Wells company for this purpose in 1932 (26). Lane-Wells developed a

perforating gun, detailed by a 1938 Popular Science issue (Figure 2). By the end of the 1930s,

steel bullets were the most popular method of perforating cement casings; perforating guns

remain the method used industrially today.

Figure 2: A 1938 Popular Science diagram illustrating Lane-Wells' perforating gun.

3.2.6 Recent Technologies Enabling the Fracking Boom

Despite some initial success with the explosive well stimulation techniques pioneered in

Col. Roberts’ day, oil and gas production was primarily driven by conventional extraction

methods (i.e. non-fracturing techniques). Explosives like nitroglycerin could be dangerous and

unpredictable, and finding areas that would produce oil without stimulation was mostly easier

and cheaper.

Still, several 20th-century developments bear significance to the hydraulic fracturing

story. In 1935, Grebe and Stoesser described the first injection of a non-explosive fluid that

stimulated oil production of a well (27). They used acid to “etch” the rock, providing more

spaces and channels for oil to flow through. It was not terribly productive at the time, but the

technique foreshadowed the important role that acids would come to have in modern fracturing

fluids.

Many in the industry consider the first true fracturing operation to have occurred in the

Hugoton field, Kansas, in 1947. It was a simple two-wing fracture, with one on either side of a

vertical well (28). In 1949, hydraulic fracturing (using water rather than acid or explosives) was

first applied commercially by Halliburton Exploration and Stanolind Oil and Gas Corporation

(29). The technique included all of the major elements of a hydraulic fracturing operation from

today: building on two years of commercial experimentation, water was injected at intense

pressure into a well near Duncan, Oklahoma to fracture rock. An uptick in hydraulic fracturing

followed, but total production from hydraulically fractured wells remained slight in comparison to that from wells

using conventional extraction. Because of the heavy expense associated with the necessary

infrastructure—pumps, water supply pipes, artificial ponds to hold the fluid—as well as the cost

of water and waste disposal, hydraulic fracturing was simply not profitable.

It remained unprofitable for several decades, until the arrival of a series of key inventions involving

horizontal drilling. Horizontal drilling involves first drilling downward under the surface, then

drilling laterally through a specific geological layer of interest. The technology had been in use

since the 1960s, primarily as a method of river crossings for pipes and utility lines (30), but oil

and gas well operators soon realized its potential to vastly expand the limits of accessible

reserves. In particular, less permeable shale formations, where fracking techniques had been

performed in vertical wells, could be drilled horizontally to reduce the substantial cost of

repeatedly drilling downward to access different locations in the shale. The primary

technological challenge involved in horizontal drilling is accurately controlling and monitoring a

steerable drill bit, a daunting task at depths of thousands of feet. Most modern horizontal drill bits

employed by the fracking industry achieve directional drilling through rotational bits that apply

pressure toward one side, redirecting the bit, although myriad technologies have been developed

over the last three decades to perform the same function (31, 32). Similarly, most drill bits can be

tracked using an artificial magnetic field generated as a reference on the surface, but there are

many other recently developed technologies for accurately tracking drilling progress that are

beyond the scope of this thesis (33–35).

3.3 Stages of a Modern Fracking Operation

Hydraulic fracturing is a long, complex, and expensive process with variations dependent

on particular regions, geologies, economies, and more. However, fracking operations can

generally be standardized into several distinct phases, which are also relevant to discussions of

community health implications.

3.3.1 Well Pad Construction and Drilling Rig Setup

Before drilling can commence, a “well pad” must be bulldozed and the necessary

infrastructure set up. Well pads are generally rectangular clearings of three to four acres where

all the components of the operation are housed: the drilling rig, construction vehicles, loading

and unloading space for trucks, large tanks for fluid storage, and portable offices (Figure 3). In

addition, an access road must be constructed, another costly undertaking especially in remote

areas.

Once the pad has been cleared, the drilling rig is transported to the site and assembled

over the well head. The rig is a tall structure that provides the necessary height and energy to

feed the drilling apparatus into the hole. Importantly, a single well pad can often contain multiple

wells drilled in different directions.

Transport of the rig components and construction of the well pad require many trips to

and from the pad by construction and oversize vehicles. Pad clearing and construction can last

between zero and four weeks, depending on location and terrain, while the drilling rig takes

seven to ten days to assemble (36).

Figure 3: A fracking well pad in Southeast Ohio with a drilling rig set up. Photo courtesy of Ted Auch (37).

3.3.2 Drilling

Once the rig is in place, the well is drilled vertically to a point below any freshwater

aquifers in the region. A steel pipe of slightly smaller diameter than the drilled well is inserted,

and cement is pumped down and up around the steel pipe, hardening to form a protective casing

between the well and the surrounding rock. The casing is generally 16 to 20 inches in diameter (38). The process continues

with vertical drilling to the depth of the shale reserve, which is usually no more than 5,000 feet

below the surface, before the drill bit provides lateral pressure and bends into a horizontal

position. The well can then be bored directionally for up to two miles.

During every drilling process, the rock cuttings and “drilling mud” that are bored from

the earth return to the surface through the center of the drilling mechanism. They are collected at

the surface and must be transported away and disposed of, generally in landfills or other pit

burials. Vertical and horizontal drilling may last up to two and six weeks, respectively, but in

general this phase is relatively continuous once started and can be accomplished in a matter of

days. Deeper shales, like the Marcellus formation of the Northeastern states, as well as longer

horizontal wells will require more time to complete drilling.

As in the well pad construction phase, a predominant community health risk in this stage

stems from air pollution, with combustion emissions from the drill rig and heavy truck traffic

among the chief concerns. Transportation of the rock cuttings to a disposal facility necessitates a

variable but highly acute increase in truck traffic; a time period of seven to ten days could see an

increase on the order of 100 truck round trips to dispose of cuttings from the drilling phase.

Once drilling is complete, the rig is disassembled and fracking is scheduled. Arrival of

the fracking crew could take days to weeks.

3.3.3 Perforation and Fracking

Charges are dropped into the well and detonated to perforate the casing at several points

along the horizontal portion. This blows holes in the well casing and the immediately

surrounding shale, creating a route for fluid from the shale reserve into the well.

The next stage of the operation is the fracking phase, which stimulates the flow of oil or

gas. One to ten million gallons of fracking fluid are injected into the well at up to 9,000 pounds

per square inch (psi); this high pressure creates fractures in the surrounding shale. Like most

other components of the fracking system, the fracking fluid used in the operation has several

standard ingredients but is subject to frequent modification and engineering, much of which is

kept proprietary by individual firms. Understandably, the injection of fluids containing unknown

chemical additives into the ground has raised some concern over drinking water safety and other

types of contamination; this discussion is reviewed in Section 3.4.1, below. Most fracking fluids

are composed of approximately 90% base fluid (water), 10% proppant (sand) to lodge in the

fractures and prevent them from collapsing back down, and 1% other chemical additives,

including acids, biocides, friction reducers, surfactants, scale inhibitors, and more (39). Fracking

fluids are often called “slickwater” due to the inclusion of friction reducers.

The actual fracturing occurs through about 20 “fracs,” or high-intensity injections,

each lasting around one hour; the full sequence usually takes less than a week (36). Fracs are accomplished

by lining a dozen or so specialized pumping trucks above the well, then pumping the fluid into

the well simultaneously (Figure 4). Base fluid, proppant, and chemical additives are generally

delivered separately and mixed on-site before injection.

Figure 4: A pad with a well in the fracturing process. Fracking trucks are lined up on both sides of the well to provide the injection pressure. Photo courtesy of Ted Auch (40).

Finally, because each well requires millions of gallons of fluids, and individual pads may

contain multiple wells, delivery and storage of this fluid is itself a nontrivial task. “Frac ponds”

or “swimming pools” are often constructed on-site, and hundreds of tanker trucks are employed

to deliver water, sand, and chemical additives. Because the tanker trucks must usually be cleared

from the pad to enable fracking set-up, delivery of the base fluid generally occurs 15 to 25 days

before the fracture date. Much like removal of drill cuttings, water and sand delivery poses a

possible public health burden due to the acute influx of up to 600 trucks per well head in a time

span of days. In a minority of cases, water may be delivered by pipeline rather than tanker trucks,

requiring pipeline construction but eliminating the need for trucks; this decision is often based on

the site’s proximity to a usable body of water and other economic factors.

3.3.4 Flowback, Production, and Waste Disposal

Immediately after fracking is completed, the great pressure on the injected fluids forces

much of the fluid back to the surface, now mixed with native subterranean fluids and particles,

including some oil or gas. This initial period of rapid return is termed the “flowback” phase, and

lasts around two but sometimes up to eight weeks. Once most of the pressure has been relieved

through flowback, a gentler supply of “produced waters” follows, and the well enters its final

“production” stage. Production may last years to decades, with twenty-year lifespans not

uncommon; sometimes, wells are “re-fracked” in order to prolong production.

Flowback and produced waters are a mixture of (salt) water, sand, oil and gas, and other

native brines or particles. Separation of these components is typically performed on-site in large

settling tanks, where sand falls to the bottom and hydrocarbons rise to the top. Large centrifuges

may also be employed, particularly to separate sand during initial flowback. Once settling is

complete, the oil or gas is transported away for further processing, often by a constructed

pipeline.

Disposal of the separated brine is another great challenge faced by the fracking industry.

The current gold standard for brine disposal is deep-well injection, where tankers transport the

brine to specified sites to be injected into the ground far below any aquifers (e.g. 10,000 feet).

Other options for used brines include recycling to subsequent fracking operations, treatment and

release, or use as road salt for ice control. Because brine disposal is a concern across the entire

production lifespan of a well, brine hauling trucks offer a smaller, steadier, and lengthier

presence than the acutely intense trucking involved in fluid delivery. However, in communities

with hundreds of simultaneously producing wells, brine disposal trucks aggregated across all

producing wells may present a significant community health concern.

3.3.5 Reclamation

Once oil or gas is no longer recoverable from the well, many states require full or partial

reclamation of the well pad area. The reclamation process involves plugging the well, removing

equipment, planting vegetation, and otherwise returning the land to its original form. The exact

steps required differ widely by state.

3.4 Community Health with Fracking: Concerns and Evidence

Because of its complexity, the fracking process has potential to affect human health along

many dimensions. This section is divided into putative transmission routes and discusses the

epidemiological and environmental evidence for health risks, as well as the rationale for

choosing specific diagnoses to study for each.

3.4.1 Ground and Surface Water Contamination

Among the health concerns listed in this section, water contamination is perhaps the one

most frequently covered in the media (41). Academic research on fracking-related water

contamination is sparser, in part because it is very difficult to link health outcomes to water

contamination from a particular source. Instead, much of the literature has focused on

establishing a potential link between water contaminants and fracking sources by using chemical

methods such as isotopic signatures. The extent to which fracking creates fissures that might

enable migration from a shale play into a groundwater aquifer is the subject of much debate (42).

One of the potential water contaminants that isotopic fingerprinting could putatively

reveal is natural gas itself, primarily methane. Methane occurs naturally in groundwater from

microbial sources, as well as through natural migration from thermogenic layers (43). However,

a demonstrable increase in methane concentrations related to fracking activity would be

significant as it would imply the possibility of other chemical contamination. Moreover, excess

concentrations of dissolved methane, although not considered a hazard for drinking, may change

the microbial community makeup, resulting in oxygen depletion, increased solubility of heavy

metals, and sulfate reduction; excess dissolved methane may also be associated with risk of

explosion (44). Several studies have reported increased dissolved methane concentrations in

water samples closer to high fracking activity (45, 46), while at least one has verified an isotopic

match to a fracking well source (47). However, the credibility of this study has been challenged

on the basis of wide variability in isotope measurements (48, 49), and even a conclusive link

between groundwater contamination and a deep thermogenic source would not preclude other

contamination routes besides fracking, such as abandoned well shafts or natural seepage (50).

Unfortunately, fracking threatens more sinister water contaminants than dissolved

methane, since flowback fluids and produced waters often contain chemical levels that exceed

federal Safe Drinking Water Act standards (51). Native substances in underground brines, such

as radioactive or heavy metals, may pose more contamination risk to groundwater than injected

fracking fluids (52). Aside from methane, some organic compounds detected in groundwater

have been attributed to fracking operations in the Marcellus shale region (53), including the

aromatic compounds benzene, toluene, ethylbenzene, and xylene, known as BTEX (54, 55).

Notably, the exact sources and routes by which such chemicals might enter the water supply are

unclear, but may include accidental spills and leakage, improper waste disposal, natural migration,

and dissolution from the air into surface water (56). In some cases, particular spills have resulted

in nearby or downstream increases in fracking-related chemicals, including 2-n-Butoxyethanol

(57). Sampling from water bodies downstream of the release of treated effluents of fracking

waste fluids reveals elevated concentrations of halide ions, radioactive metals, and brominated

products, even post-processing (58, 59). At least one study that measured samples from

community water supplies over the Marcellus region argues that chemical signatures of fracking

are detectable even at the drinking water level within 1 km of fracked wells (60).

At the time of writing, there have been no studies conclusively linking water

contaminants from fracking to observed human health outcomes. However, there have been

several case reports of domestic animals, including horses, cows, and dogs, suffering poisoning

or death after ingesting waters downstream from fracking-related spills (61). When a cause was identified,

these cases were primarily attributed to heavy metal ingestion. In my analyses, I attempt

to capture the effects of water pollution by studying kidney disease, since the filtration functions

of the kidney can result in higher susceptibility to environmental contaminants, including heavy

metals (62).

3.4.2 Air Pollution

The other prominent route through which chemical contaminants might affect human

health is through the air (63). Diesel and other exhaust products are emitted through operation of

tanker trucks, pumps, and rotary drills. In addition, the exposure of underground fluids to the

surface can release various other hydrocarbons as gases into the air; sometimes, excess methane

is burned (flared) on the spot if there is no capacity to store it.

Exploratory studies have measured air samples from well pads of active wells, finding

potentially hazardous levels of airborne non-methane hydrocarbons (NMHCs), methylene

chloride, polycyclic aromatic hydrocarbons, hydrogen sulfide, formaldehyde, and BTEX

compounds (64–68). Theoretical models have estimated 30% increases in carbon dioxide,

nitrogen oxides, and particulate matter from fracking-related truck traffic on peak activity days

(69), while other models have estimated cancer and non-cancer hazard indices attributed to local

fracking-related emissions (70).

Many of the environmental pollutants observed above have shown associations with

health outcomes, including pregnancy outcomes such as low birth weight (71). Birth outcomes

represent a frequent study case due to the availability of large, complete, and reliable datasets.

Among investigations of the association between birth outcomes and proximity to active fracking

wells, several have identified significant increases in preterm birth but not in low birth weight

(72–74), while several others have identified the reverse, more frequent low birth weight but no

significant change in preterm birth incidence (75–77). Increased risks of other outcomes,

including fetal death and congenital heart disorders, have also been reported (73, 78).

Finally, in Pennsylvania, hospitalizations for asthma exacerbations have been

reported to increase with greater exposure to hydraulic fracturing activity, pointing to an airborne

irritant (79, 80).

Based on this evidence, I include specific cardiovascular, cerebrovascular, and

respiratory outcomes for this study, including asthma hospitalizations, arrhythmias, heart failure,

and thrombocytic events, which have known associations with air pollution (81–85).

3.4.3 All-Cause Hospitalization Rates

Some epidemiological work has approached the association of health outcomes with

fracking proximity by comparing hospitalization rates across different health categories under

different exposure conditions, without assuming a particular disease. Jemielita et al. found a

significant association between well density and cardiology inpatient rates among zip codes in

Pennsylvania (86). Expansions on this work have found additional associations with

genitourinary hospitalizations (87) and hospitalizations for pneumonia among the elderly (88).

Not much evidence has been published from hospitalization data sourced outside of

Pennsylvania (89).

Based on these indications, I investigated the cardiovascular outcomes outlined in the

previous section. I also studied pneumonia and the specific genitourinary outcomes that drive the

association with hydraulic fracturing in previous literature (Section 4.2.2).

3.4.4 Traffic Injury

Despite the acute increase in heavy trucking through local roads required by fracking

operations, with around 1,000 additional truck visits per well (36), there is a paucity of published

work studying its effect on traffic injury rates. Oil and gas workers have one of the highest traffic

fatality rates of any industry (90), but little work has differentiated conventional and

unconventional drilling operations. Graham et al. find a modest but statistically significant

increase in the number of crashes per million vehicle-miles per month in fracking counties vs.

matched non-fracking counties in Pennsylvania (91). Notably, increased truck traffic is one of

the most frequently perceived problems associated with increased shale drilling to residents in

affected communities, according to surveys (92, 93), but beyond some preliminary work (94)

there has been little research quantifying the traffic injury burden imposed by fracking. To try to

elucidate whether this traffic is associated with accidents and injury, I include motor vehicle

accidents as an outcome category in my analyses.

3.4.5 Other Impacts

A variety of groups have published research on health outcomes occurring through other

potentially fracking-related mechanisms. Increases in rates of gonorrhea and other sexually

transmitted infections, as well as prostitution-related arrests, have been attributed to the social

and demographic changes associated with the influx of a predominantly young, male workforce

in new fracking communities (95, 96). Fracking may also have impacts on local housing markets

(97), birth rates (98), unemployment levels (99), crime rates (100), educational attainment (101),

and many other social or economic factors with direct or indirect knock-on effects for

community health.

Some community health surveys have reported mixed results with regard to self-reported

health symptoms after the initiation of local shale gas development, with symptoms of

stress, skin conditions, and upper respiratory conditions among those most commonly identified

by community residents (102–104). In my analyses here, I did not include any additional health

outcomes based on this research.

4 Methods

4.1 Data Sources and Rationale

4.1.1 Hydraulic Fracturing Activity Data

I used spud data from the Pennsylvania Department of Environmental Protection

(PADEP), encompassing all wells drilled from 2002–2019 (105). The data include each well’s

location and spud date (the first date of drilling), but not the fracture date. The same PADEP

dataset has been used to create fracking exposure metrics to study a variety of outcomes, and is

considered a reliable and complete data source (72, 75, 76, 80, 86, 88, 91).

Pennsylvania wells are the focus in this investigation for three primary reasons: (a) the

state lies above the rich Marcellus shale formation but saw almost no fracking activity prior to

the last decade, permitting before–after studies; (b) it shares a long border with New York, a

non-fracking state, enabling salient interstate comparisons while minimizing confounders; and

(c) the state publishes a comprehensive and reliable record of all hydraulic fracturing activity.

4.1.2 Hospitalization Data

Medicare insurance claims were obtained from the Centers for Medicare and Medicaid

Services (CMS). Available data encompassed 100% of Medicare beneficiaries in the U.S. and

included all inpatient claims in the MedPAR file. For this analysis I linked hospitalization data to

demographic and population data in the Master Beneficiary Summary File, for the years 1999–

2015. I required beneficiaries to be enrolled in Medicare Parts A and B (fee-for-service) in the

month that they were admitted for their claim. Inpatient hospitalizations were studied rather than

outpatient and physician visits in order to capture the most severe and acute outcomes (more in

Section 6.2.1).

The Medicare population represents a good study population for several key reasons.

Older adults may represent a particularly susceptible population to some pollutants (106).

Furthermore, Medicare is a comprehensive, national program that enables comparisons across

states and health systems and, more importantly, includes nearly all residents aged 65 or older,

minimizing sampling bias. Across the Pennsylvania and New York zip codes in this study, over

94,000 beneficiaries were enrolled in Medicare in any given year, approximately 18% of the total

population. To date, there have been no published studies specifically evaluating fracking-related

health outcomes among the Medicare population. Other studies have analyzed fracking-related

outcomes using data sources like Geisinger Health System records in Pennsylvania (72, 79),

birth certificate data from state health statistics bureaus (75–77), and the Pennsylvania Health

Care Cost Containment Council (80, 87, 88).

This study and its use of protected health information were approved via a Data Use

Agreement with the Centers for Medicare & Medicaid Services and by the University of Chicago

Institutional Review Board. Data cleaning, preparation, and analysis were performed in Python

3.7.6 and R 3.5.0, using the computational resources of the Center for Research Informatics’

Gardner High Performance Computing Cluster at the University of Chicago (107).

4.2 Model Specifications

4.2.1 Study Regions

The Southern Tier of New York and the northern part of Pennsylvania lie atop the

Marcellus Shale, one of the richest shales in the U.S. While Pennsylvania has experienced

aggressive development of the hydraulic fracturing industry since 2009, New York did not

develop fracking activity due to its high outlay costs and the threat of a statewide ban, which

materialized after a 2014 report by the New York Department of Environmental Conservation

outlined potential health and environmental risks (36). Consequently, over 100 miles of border

starkly divides fracking territory from non-fracking territory, in a region with otherwise similar

population densities, demographics, and rural economies. The border presents an opportunity for

identifying fracking-related health outcomes, but to date no public health studies have explicitly

studied this border, likely due to the difficulty of finding consistent health data between states.

I identified 51 Pennsylvania zip codes across Susquehanna, Bradford, and Tioga counties

as the exposure region based on the presence of unconventional wells in the PADEP dataset

(Figure 5). I refer to these exposed zip codes as the “PA Border” group. As unexposed controls, I

selected 68 New York zip codes, the “NY Border” group, from Broome, Tioga, Chemung, and

Steuben counties, which approximately mirror the “PA Border” zip codes across the state border.

Two other, more distant alternative control regions, “NY Northeastern” and “PA Southern,”

were also identified and used in sensitivity analysis in Section 5.3.2. Importantly, the main cross-

border comparison is based on administrative/political boundaries, which environmental

pollutants may not observe. Some exposures, particularly airborne pollutants, may affect both the

PA and NY border groups due to their geographical proximity.

Figure 5: Pennsylvania and New York well locations and zip codes used in this study. Red dots indicate locations of drilled unconventional wells in the PADEP dataset. Four study groups of zip codes were identified based on similar population densities and demographics. The “PA Border” group, in blue, is the main exposure group; these zip codes were identified by their high fracking density. The “NY Border” group, in red, represents the main control group. The other two groups were selected as additional controls, chosen for their geographical distance from dense fracking areas in order to observe any distance-dependent effects, such as those of air pollution. Gray/black outlines mark counties.

4.2.2 Study Population and Health Outcomes

Medicare claims through September 30, 2015 consist of an admitting diagnosis, a

primary diagnosis, and up to twenty-four secondary diagnoses, each recorded as an International

Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) code. I identified

health outcomes in Medicare claims using the admitting diagnosis, primary diagnosis, and first

secondary diagnosis in each claim. While inspection revealed that the counts of particular

diagnoses (e.g. all cardiovascular outcomes) increased with the number of secondary diagnosis

columns used (Supplementary Figure 1), diagnosis columns 3 through 25 were ignored to avoid

obscuring acute outcomes with preexisting conditions, which may be listed as later secondary

diagnoses.

I chose to study health conditions with existing literature that satisfied at least one of

three criteria: (a) it suggested a direct association with fracking, (b) it showed association with

certain airborne pollutants expected from fracking operations, including diesel exhaust, or (c) it

suggested association with ingestion of fracking-related water contaminants, including heavy

metals (Table 1). I also investigated motor vehicle accidents, since fracking is associated with a

large trucking presence (Section 3.4.4). I selected ICD-9-CM codes for these health conditions in

keeping with the existing literature on the relevant outcomes.

Major category (putative contaminant pathway) | Outcome name | ICD-9-CM code(s) | Reference
Cardiovascular (air pollution) | Ischemic heart diseases | 410, 411, 414 | (83)
Cardiovascular (air pollution) | Thrombocytic causes | 415.1, 433, 434, 444, 452, 453 | (81)
Cardiovascular (air pollution) | Heart failure | 428 | (81)
Cardiovascular (air pollution) | Arrhythmia | 427 | (81)
Cerebrovascular (air pollution) | Cerebrovascular | 430–437 | (82, 85)
Cerebrovascular (air pollution) | Stroke | 434 | (82, 85)
Respiratory (air pollution) | Asthma | 493 | (79, 80)
Respiratory (air pollution) | Chronic obstructive pulmonary disease | 490–496 | (84)
Respiratory (air pollution) | Pneumonia | 480–488 | (88)
Genitourinary (water contamination) | Kidney disease | 580–589 | (62)
Genitourinary (water contamination) | Infections of kidney, calculus of ureter, urinary tract infection | 590, 592.1, 599.0 | (87)
Genitourinary (water contamination) | Bladder cancer | 188 | (108)
Motor vehicle accidents (trucking) | Motor vehicle accidents | E810–E819 (External cause-of-injury codes) | (91)

Table 1: ICD-9-CM codes and their categorizations for this study. References serve as rationale and/or precedent for investigating these particular ICD-9-CM codes.

The study by Denham and colleagues, who studied Pennsylvania Health Care Cost

Containment Council data, identified three ICD-9-CM codes—590, 592.1, and 599.0,

representing infection of the kidney, calculus of the ureter, and urinary tract infection,

respectively—as the drivers of a significant association between fracking and genitourinary

hospitalizations in Pennsylvania counties (87). An earlier study by Jemielita et al. found

significant associations between fracking wells and cardiology procedures, but did not identify

particular ICD-9-CM codes in detail (86). Section 6.1 further discusses this work’s relevance to

those studies.

To calculate yearly incidence rates for a particular outcome, I counted the number of

MedPAR inpatient claims that listed the relevant ICD-9-CM code in the admitting, primary, or

first secondary diagnosis columns. For instance, if a beneficiary had a primary diagnosis of ICD-9-CM

434.91, or “Cerebral artery occlusion, unspecified with cerebral infarction,” that person would

contribute to their zip code’s incidence of stroke (ICD-9-CM 434) as well as cerebrovascular

disease (ICD-9-CM 430–437).
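
The counting rule described above can be sketched in Python as follows. This is an illustration only, not the code used for the analysis: the outcome dictionary covers a subset of Table 1, the function names are hypothetical, and codes are assumed to be dotted strings such as "434.91" (claims files may store them differently).

```python
# Illustrative subset of Table 1; not the full outcome list.
OUTCOMES = {
    "stroke": ["434"],
    "cerebrovascular": [str(c) for c in range(430, 438)],  # 430-437
    "heart_failure": ["428"],
    "genitourinary_infection": ["590", "592.1", "599.0"],
}


def matches(code, prefix):
    """Return True if a dotted ICD-9-CM code falls under `prefix`."""
    code = code.strip()
    if "." in prefix:
        # A subcode prefix also covers any finer subdivisions of that subcode.
        return code.startswith(prefix)
    # A three-digit category such as 434 covers every 434.xx code.
    return code.split(".")[0] == prefix


def outcomes_for_claim(admitting, primary, first_secondary):
    """Return the set of outcome names that one claim contributes to.

    Only the three diagnosis columns used in this study are examined, and a
    claim counts at most once per outcome.
    """
    codes = [c for c in (admitting, primary, first_secondary) if c]
    return {
        name
        for name, prefixes in OUTCOMES.items()
        if any(matches(c, p) for c in codes for p in prefixes)
    }
```

Under these assumptions, a claim with primary diagnosis 434.91 contributes to both the stroke and the cerebrovascular counts, matching the example in the text.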

The total study population in each zip code and year was calculated as the number of

beneficiaries listed in the MBSF who were enrolled in Medicare Parts A and B for every

month they were alive in that year.
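
A minimal sketch of this denominator calculation is shown below. The record layout and field names (`months_alive`, `months_part_a`, `months_part_b`) are hypothetical stand-ins, not actual MBSF variable names, and the zip code in the usage note is a placeholder.

```python
from collections import Counter


def count_population(beneficiaries):
    """Count eligible beneficiaries per (zip_code, year).

    A beneficiary is eligible when months of Part A and Part B coverage both
    equal the number of months alive in that year (field names are assumed).
    """
    counts = Counter()
    for b in beneficiaries:
        fully_enrolled = (
            b["months_part_a"] == b["months_alive"]
            and b["months_part_b"] == b["months_alive"]
        )
        if fully_enrolled:
            counts[(b["zip_code"], b["year"])] += 1
    return counts
```

Yearly incidence per beneficiary is then the claim count for an outcome divided by this population count for the same zip code and year. For example, a beneficiary who was alive for five months and covered by both parts for those five months is counted, while one covered by Part B for only six of twelve months is not.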

4.2.3 Panel Data and Model Specifications

For each ICD-9-CM outcome group, a panel dataset was generated with one observation

per zip code per year, from 1999–2015. Each observation included the zip code’s yearly

incidence of the outcome, the numbers of new and cumulative spuds drilled, and various

demographic features.
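
The panel structure can be illustrated with the sketch below. It assumes that spud counts, claim counts, and population counts are already available as dictionaries keyed by (zip code, year); these inputs and the function name are hypothetical, demographic features are omitted, and every zip code-year is assumed to have a positive population.

```python
def build_panel(zip_codes, years, new_spuds, claims, population):
    """Build one row per (zip code, year); missing counts are treated as zero."""
    panel = []
    for z in zip_codes:
        cumulative = 0
        for t in sorted(years):
            new = new_spuds.get((z, t), 0)
            cumulative += new  # running total of spuds through year t
            n = population[(z, t)]  # assumed to be positive
            panel.append({
                "zip_code": z,
                "year": t,
                "new_spuds": new,
                "cumulative_spuds": cumulative,
                "incidence": claims.get((z, t), 0) / n,
                "spuds_per_beneficiary": cumulative / n,
            })
    return panel
```

Because years without a recorded spud default to zero, zip code-years before the first well in the spud data simply carry a cumulative count of zero.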

For the main analysis, I estimate the following specification separately for each outcome:

$$y_{zt} = \alpha + \zeta_z + \tau_t + \beta S_{zt} + u_{zt}, \tag{1}$$

where $y_{zt}$ is the incidence per beneficiary for year $t$ in zip code $z$, $\alpha$ is the baseline intercept, $\zeta_z$

and $\tau_t$ are respectively zip code and year fixed effects, $S_{zt}$ is the cumulative number of spuds

drilled per beneficiary in zip code $z$ through year $t$, and $u_{zt}$ is a random error term assumed to be

uncorrelated with the regressors. The zip code fixed effects $\zeta_z$ control for unobserved time-invariant

heterogeneity between zip codes, while the year fixed effects $\tau_t$ control for common

shocks affecting outcomes in all zip codes. Standard errors were clustered at the state level to

account for within-state correlations in the outcomes. Other time-varying covariates, such as zip

code poverty rates and demographic distributions, were intentionally omitted to avoid controlling

for effects of well development, since well development often involves socioeconomic and

demographic shifts; this is discussed further in Section 6.2.2. The quantity of interest was 𝛽,

interpreted as the linear effect of an additional spud on hospitalization numbers, per capita.
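To make the role of the two sets of fixed effects concrete, the point estimate of the slope in specification (1) can be sketched with the within (double-demeaning) transformation. This is an illustrative sketch, not the estimation code behind the reported results: it assumes a balanced panel (one observation per zip code per year, as described above), it returns only the slope, and it omits standard errors and state-level clustering entirely. The function names are hypothetical.

```python
from collections import defaultdict

def _double_demean(values, zips, years):
    """Subtract zip-code means and year means, then add back the grand mean.

    In a balanced panel this removes the intercept, zip code effects, and year effects.
    """
    zip_sum, zip_n = defaultdict(float), defaultdict(int)
    year_sum, year_n = defaultdict(float), defaultdict(int)
    for v, z, t in zip(values, zips, years):
        zip_sum[z] += v
        zip_n[z] += 1
        year_sum[t] += v
        year_n[t] += 1
    grand_mean = sum(values) / len(values)
    return [
        v - zip_sum[z] / zip_n[z] - year_sum[t] / year_n[t] + grand_mean
        for v, z, t in zip(values, zips, years)
    ]

def two_way_fe_beta(y, s, zips, years):
    """Slope from regressing double-demeaned incidence on double-demeaned exposure."""
    y_w = _double_demean(y, zips, years)
    s_w = _double_demean(s, zips, years)
    return sum(a * b for a, b in zip(s_w, y_w)) / sum(b * b for b in s_w)
```

Because the transformation removes anything constant within a zip code or within a year, only exposure that varies across both dimensions identifies the slope.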

I conducted various sensitivity analyses using different treatment metrics, control regions,

and model specifications (Section 5.3). Despite testing several different outcomes separately, I

present results without any post-hoc adjustment for multiple hypotheses; a fuller discussion of

multiple testing is in Section 6.2.2.


5 Results

5.1 Characteristics of Study Regions

The first unconventional spud in the PA Border region (zip codes mapped in Figure 5)

appeared in Bradford County in June 2005. Well development continued at a slow pace until

2008–2009, when the number of new spuds began to accelerate (Figure 6). Peak well

development occurred in 2010–2011, although new spuds continued to be drilled through 2015.

Supplementary Figure 2 shows a map of Pennsylvania zip codes shaded by spud density.

Figure 6: Exposure to fracking activity among zip codes in PA Border region, 1999–2015. One grey line is drawn for each zip code; red lines are aggregated over all zip codes in the region. Left panel shows the cumulative number

of spuds. Right panel shows the cumulative spuds per beneficiary, the exposure metric in the main regression.

The main model (1) is robust to differences between states or zip codes as long as those

differences remain constant over time; in other words, it assumes that all exogenous factors that

influence zip code incidence rates have parallel time-varying trends across zip codes. Figure 7

shows time series data of aggregate demographic statistics across fracking zip codes (PA Border)

and non-fracking zip codes (NY Border zip codes are used in the main analysis, and secondary


control regions NY Northeastern and PA Southern are used in sensitivity analyses). Overall, the

trends of observed demographics appear relatively parallel, although the NY Border control

region has around four times the population of the exposed PA Border region. In Section 5.3.2, I

evaluate the different control regions to see if my results depend on the particular region used

(NY Border, NY Northeastern, or PA Southern).

Figure 7: Time-series plots of demographic trends in the study population between the exposed region (PA Border, blue), the main control region (NY Border, red), and the two additional control regions (NY Northeastern and PA

Southern, green and violet). Plots are shown for total beneficiary count, sex distribution, average zip code age distribution, and the percent identified as white. The main model results presented in Section 5.2 assume that, to the

extent these demographics influence outcomes, they have parallel trends throughout the study period.

[Figure 7 panels, each plotted against Year (2000–2015): Total Number of Beneficiaries; Percent Male Among Beneficiaries; Age Distributions (Weighted Average), showing median (solid) and IQR (dashed); Percent White Among Beneficiaries. Legend (Study Group): NY Border, NY Northeastern, PA Border, PA Southern.]

5.2 Model Estimates

Overall, I find consistent evidence that, within a zip code, increased fracking over time is

positively associated with cardiovascular and cerebrovascular outcomes, including arrhythmias,

heart failures, and thrombocytic causes like heart attacks and strokes (Table 2). In particular, the

model estimates that, among a zip code’s Medicare population, drilling an additional 100

unconventional wells is associated with approximately 4.4 additional cardiovascular

hospitalizations per year. There is also evidence for associations with kidney disease and possibly with

chronic obstructive pulmonary disease (COPD) events. The other outcomes, namely pneumonia,

bladder cancer, and motor vehicle accidents, do not show any association with fracking activity.
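The figure of 4.4 hospitalizations per 100 wells follows from multiplying both sides of the main specification by the Medicare population. As a worked sketch, write $N$ for the number of beneficiaries in a zip code, $H$ for its yearly cardiovascular hospitalizations, and $C$ for its cumulative spud count (these three symbols are introduced here only for illustration), and hold $N$ fixed:

```latex
\begin{align*}
y = \frac{H}{N}, \quad S = \frac{C}{N}
  &\;\Longrightarrow\; \Delta H = N\,\Delta y = N\,\beta\,\Delta S = \beta\,\Delta C, \\
\Delta C = 100,\ \hat{\beta} = 0.0441
  &\;\Longrightarrow\; \Delta H = 0.0441 \times 100 = 4.41 .
\end{align*}
```

The population cancels, which is why the per-capita coefficient can be read directly as hospitalizations per additional spud in a zip code of constant population.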

The model’s full results are displayed in Table 2, including the outcomes tested and the

estimated effect of the cumulative spuds per capita on the number of cases per capita.

Outcome                                                             Estimated Effect   Standard Error   P-value
Cardiovascular                                                      0.0441             0.0028           <0.0001
Arrhythmia                                                          0.0308             0.0023           <0.0001
Heart failure                                                       0.0198             0.0017           <0.0001
Thrombocytic causes                                                 0.0045             0.00046          <0.0001
Cerebrovascular                                                     0.0060             0.00019          <0.0001
Stroke                                                              0.0049             0.00039          <0.0001
Respiratory                                                         0.0250             0.0126           0.047
COPD, incl. asthma                                                  0.0209             0.0076           0.0061
Asthma                                                              0.0001             0.0002           0.42
Pneumonia                                                           –0.0023            0.0069           0.74
Kidney disease                                                      0.0095             0.0028           0.0007
Infections of kidney, calculus of ureter, urinary tract infection   –0.0027            0.0033           0.41
Malignant bladder cancer                                            –0.0005            0.0005           0.28
Motor vehicle accidents                                             0.0002             0.0005           0.72

Table 2: Model estimates and standard errors for the effect of cumulative wells per beneficiary on number of cases per beneficiary. Standard errors are robust to state-level clustering. P-values are unadjusted.

Visualizing all the data described by these models at the zip code level, as time series or

other plots, would likely be difficult to interpret, so I aggregated hospitalization rates across all

the fracking zip codes (PA Border) and non-fracking zip codes (NY Border) and plotted


aggregate trends as time series. Figure 8 shows these time series data for the three groups of

outcomes showing the most evidence of an association with fracking exposure: cardiovascular

outcomes, cerebrovascular outcomes, and kidney disease outcomes. Because these plots are

aggregated by states, they obscure zip code-level variation in both the exposure and response,

which may drive model estimates; however, for cardiovascular and cerebrovascular outcomes in

particular, there does appear to be a relative increase in hospitalization rates in the PA Border

region coinciding with years of peak well development, 2010–2012. The time series figures also

allow visualization of the hospitalization trends prior to years of peak spud development, at least

at a state-aggregated level. Time series plots are presented for all outcomes in Supplementary

Figure 3.


Figure 8: Time series plots of hospitalization rates for cardiovascular outcomes, cerebrovascular outcomes, and kidney disease outcomes, as determined through admitting, primary, and second diagnoses of Medicare claims. Red

lines indicate hospitalization rates among New York (non-fracking) zip codes, while blue lines indicate hospitalization rates among Pennsylvania (fracking) zip codes. Dashed black lines trace the trajectory of new spuds

per year, scaled to no particular axis, but included as a reference to identify periods of heavy well development.

[Figure 8 panels, each showing Hospitalizations per Hundred Beneficiaries against Year (2000–2015): Cardiovascular Outcomes (ICD-9-CM 415.1, 427–8, 433–4, 444, 452–3); Cerebrovascular Outcomes (ICD-9-CM 430–437); Kidney Disease Outcomes (ICD-9-CM 580–589). Legend (Region): NY Border, PA Border.]

5.3 Sensitivity Analyses

The estimates given by the main model specification are rather striking in both magnitude

and significance, so I conducted a host of sensitivity analyses to determine whether the results

are replicable under different specifications. The following subsections present sensitivity

analyses for cardiovascular, cerebrovascular, and kidney disease outcomes, since the main model

suggests the strongest evidence for associations with these three outcome groups. Full data from

the same sensitivity analyses on sub-categories (e.g. arrhythmia) and other outcomes (e.g.

malignant bladder cancer) are included in the Supplementary Material.

5.3.1 Different Exposure Specifications

The main model regresses the number of new cases per beneficiary on the number of

cumulative spuds per beneficiary. I used cumulative spuds rather than the number of new

spuds each year because fracking wells may affect health outcomes throughout the duration of

their production lifespan, through diesel truck traffic, routine well operations, and occasional re-

fracturing. Furthermore, the exposure itself (e.g. ingestion of contaminated water) or the

presentation of resultant health outcomes (e.g. kidney disease) may not occur until an undefined

period after the well’s drilling or fracturing. However, some outcomes may only be impacted

during the drilling and fracking stages, so it is useful to analyze the alternative exposure

specification of new spuds per year rather than cumulative spuds.

Secondly, in the main model I divided the number of spuds by the Medicare population in

each zip code to assess per-capita exposure and improve interpretability of the regression output

(since incidence is also a per-capita measurement). However, alternative exposure specifications

could include the raw count of spuds or the density of spuds per unit area, each with slightly

varying interpretations.
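The six exposure metrics discussed above (new or cumulative spuds, each as a raw count, per beneficiary, or per unit area) can all be derived from a zip code's yearly spud counts. The sketch below is illustrative only: the function name, argument names, and dictionary keys are hypothetical, and it assumes yearly inputs are supplied in chronological order with a nonzero beneficiary count in every year.

```python
from itertools import accumulate

def exposure_metrics(new_spuds, beneficiaries, area_m2):
    """Build one row of exposure metrics per year for a single zip code.

    new_spuds and beneficiaries are per-year lists in chronological order;
    area_m2 is the zip code's area in square meters.
    """
    cumulative = list(accumulate(new_spuds))  # running total of spuds drilled to date
    rows = []
    for new, cum, pop in zip(new_spuds, cumulative, beneficiaries):
        rows.append({
            "new_count": new,
            "cum_count": cum,
            "new_per_beneficiary": new / pop,
            "cum_per_beneficiary": cum / pop,
            "new_per_m2": new / area_m2,
            "cum_per_m2": cum / area_m2,
        })
    return rows
```

Note that the per-beneficiary metrics can change from year to year even when no wells are drilled, because the Medicare population in the denominator changes.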


To assess whether significance depended on the choice of exposure specification,

I repeated the main regression (1) several times, replacing the independent variable $S_{zt}$ with the

spud count, the spud density, and the spuds per beneficiary, repeating each with new spuds and

cumulative spuds. Results comparing these metrics are displayed for the outcome categories

estimated by the main model with greatest significance: cardiovascular outcomes (Table 3),

cerebrovascular outcomes (Table 4), and kidney disease (Table 5). For all three of these

outcomes, most of these model specifications yielded comparable, strongly significant estimates

for the effect; the exceptions are the “new spud count” specification for cardiovascular outcomes

and the “new spuds per square meter” specification for cerebrovascular outcomes. Similar

patterns are evident in the sub-categories studied (arrhythmia, heart failure, thrombocytic causes,

stroke); results of the same sensitivity analysis for each outcome can be found in Supplementary

Table 1.

Exposure Specification                          Estimated Effect   Standard Error   P-value
Cumulative spuds per beneficiary (main model)   0.0441             0.0028           <0.0001
Cumulative spuds per square meter               12960              840              <0.0001
Cumulative spud count                           4.33e–5            1.73e–5          0.012
New spuds per beneficiary                       0.1708             0.0249           <0.0001
New spuds per square meter                      33730              9112             0.0002
New spud count                                  1.32e–4            9.37e–5          0.16

Table 3: Estimated effects of fracking exposure on cardiovascular hospitalizations (ICD-9-CM 415.1, 427, 428, 433, 434, 444, 452, 453) from models with different exposure specifications.

Exposure Specification                          Estimated Effect   Standard Error   P-value
Cumulative spuds per beneficiary (main model)   0.0060             0.00019          <0.0001
Cumulative spuds per square meter               723                205              0.0004
Cumulative spud count                           1.21e–5            5.86e–8          <0.0001
New spuds per beneficiary                       0.0160             0.0063           0.011
New spuds per square meter                      –4043              2656             0.13
New spud count                                  4.56e–5            1.37e–5          0.0009

Table 4: Estimated effects of fracking exposure on cerebrovascular hospitalizations (ICD-9-CM 430–437) from models with different exposure specifications.

Exposure Specification                          Estimated Effect   Standard Error   P-value
Cumulative spuds per beneficiary (main model)   0.0095             0.0028           0.0007
Cumulative spuds per square meter               1992               671              0.0030
Cumulative spud count                           1.84e–5            4.40e–6          <0.0001
New spuds per beneficiary                       0.0324             0.0066           <0.0001
New spuds per square meter                      6603               1416             <0.0001
New spud count                                  7.15e–5            1.72e–5          <0.0001

Table 5: Estimated effects of fracking exposure on kidney disease hospitalizations (ICD-9-CM 580–589) from models with different exposure specifications.

5.3.2 Alternative Control Regions

As further analyses, I replaced the “control” zip codes on the New York border with

groups of fracking-free zip codes in northeastern New York and southern Pennsylvania,

separately (Table 6). Notably, both alternative control regions gave similar results, with the

effect of fracking on cardiovascular outcomes estimated as at least 0.0615, compared to 0.0441

in the original model. The magnitude of the effect on cerebrovascular outcomes was similarly

higher using alternative control regions: 0.01 compared to 0.006. In contrast, neither alternative

control region generates significant results for kidney disease outcomes, and the estimated

effect magnitude is smaller than in the original model. Table 6 is extended to all outcomes of

interest in Supplementary Table 2.

Outcome                    Using NY Northeastern Zips   Using PA Southern Zips
Cardiovascular outcomes    0.0615 (p<0.0001)            0.0629 (p=0.0009)
Cerebrovascular outcomes   0.0100 (p<0.0001)            0.0111 (p=0.0018)
Kidney disease             0.0074 (p=0.14)              0.0031 (p=0.75)

Table 6: Effect estimates from the main model (1) after replacing NY Border zip codes with NY Northeastern zip codes or PA Southern zip codes (Figure 5). Coefficients are given along with p-values in parentheses.


5.3.3 Exclusion of Outlier Zip Codes

Since individual outliers can unduly influence linear regression, I removed any points

deemed as outliers in the incidence rate of each outcome as well as upper outliers in the exposure

metric (cumulative number of spuds per beneficiary).6 Model estimates after these exclusions are

presented in Table 7 and Supplementary Table 3. Because of the exclusion of outliers in

incidence as well as exposure, the standard errors presented are smaller, leading to deceptively

significant results; however, the estimated associations are generally more positive across all

outcomes than in the original model.
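The outlier rule given in footnote 6 (beyond 1.5 IQR from the quartiles, with zero-exposure zip codes set aside when computing exposure percentiles) can be sketched as follows. The thesis does not state which percentile interpolation method was used, so the choice of Python's `statistics.quantiles` with `method="inclusive"` is an assumption, and the function names are hypothetical.

```python
from statistics import quantiles

def iqr_fences(values):
    """Return (lower, upper) fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def exposure_upper_outliers(exposures):
    """Flag upper outliers in exposure.

    Fences are computed from nonzero exposures only, but every observation
    (including zeros) is returned with a flag, so unexposed zip codes stay in the data.
    """
    nonzero = [x for x in exposures if x > 0]
    _, upper = iqr_fences(nonzero)
    return [x > upper for x in exposures]
```

Observations flagged `True` would be dropped before re-estimating the model; incidence outliers would be handled the same way with both fences applied.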

Outcome                    Estimated Effect   Standard Error   P-value
Cardiovascular outcomes    0.0537             0.0052           <0.0001
Cerebrovascular outcomes   0.0137             0.0012           <0.0001
Kidney disease             0.0026             0.0010           0.0099

Table 7: Model estimates using data without outlier zip code-years, both for incidence and exposure metrics.

5.3.4 Exclusion of Populous and Sparse Zip Codes

As discussed in Section 6.2.1, a limitation of this study is that it treats all zip codes

identically; i.e. it does not account for greater uncertainty in incidence rates from less populous

zip codes. To evaluate whether this affects the model substantially, I restricted the zip codes to

the middle two quartiles of population, or those zip codes between the 25th and 75th percentile of

population. Table 8 and Supplementary Table 4 display the estimates of the main model on this

restricted dataset. Notably, the effect of fracking on cardiovascular hospitalizations is reduced to

0.0221, though still significantly positive. Cerebrovascular outcomes are no longer significantly

associated with fracking exposure, although strokes in particular (ICD-9-CM 434) do show

evidence of a positive correlation (Supplementary Table 4). The effect on kidney disease

hospitalizations appears similar to or slightly more pronounced than the main model estimates

(Table 2).

6 Outliers were defined as greater than the 75th percentile + 1.5(IQR) or less than the 25th percentile –

1.5(IQR), where IQR=interquartile range. When determining outliers in the exposure, zip codes with zero well

development were excluded before determining percentiles and IQR (but included in the regression).

Outcome                    Estimated Effect   Standard Error   P-value
Cardiovascular outcomes    0.0221             0.0003           <0.0001
Cerebrovascular outcomes   –0.0011            0.0013           0.43
Kidney disease             0.0152             0.0019           <0.0001

Table 8: Model estimates using data without first and fourth quartiles of zip code-years by population.


6 Discussion

6.1 Summary of Evidence

In these analyses I find that, among zip codes of a constant Medicare population,

increased spud development is associated with greater cardiovascular hospitalizations, consistent

across a variety of robustness checks. Among cardiovascular outcomes, cardiac arrhythmias

(ICD-9-CM 427) are most strongly associated with cumulative spud development, with an

estimated 0.031 additional hospitalizations predicted by each additional spud in a zip code of

constant population. Heart failure and thrombocytic causes also contribute significant

associations with spud development, although their significance may depend on the model

specification. There is also evidence for an association between spud development and

hospitalizations for cerebrovascular outcomes, kidney disease, and COPD, although these are

dependent on the particular zip codes that are chosen for modeling. I do not find evidence for an

effect on pneumonia, bladder cancer, or motor vehicle accidents.

Although a growing body of literature has investigated hydraulic fracturing-associated

health outcomes using hospitalization records, no studies have specifically investigated outcomes

among the Medicare population or across the Pennsylvania–New York border, which represents

a demographically and economically similar region except for the presence of fracking on the

Pennsylvania side. The association I observed between spud development and cardiovascular

outcomes does support evidence from Jemielita and colleagues, who studied inpatient discharge

records from 2007–2011 in Pennsylvania zip codes from Bradford, Susquehanna, and Wayne

counties and found significant positive associations between cardiology procedures and well

number/density (86). My analysis extends and refines their study in several ways. First, I

incorporate panel data from 1999–2015, which is particularly valuable since peak spud


development was still occurring in 2011. Second, I include fracking zip codes from Tioga

County as well as Bradford and Susquehanna, and instead of using Wayne County within

Pennsylvania as an unexposed control, I use four New York counties that border the exposed

Pennsylvania counties. Finally, while Jemielita et al. grouped all cardiology procedures with the

same Medicare Severity Diagnosis Related Group (MS-DRG), I selected particular

cardiovascular outcomes plausibly related to fracking pollutants (Section 4.2.2) in order to

prevent dilution of an effect with background hospitalizations unaffected by fracking. This work

helps to refine the general cardiology association to specific diagnosis codes and outcomes, like

arrhythmia, in order to direct future research on causes and mechanisms.

In contrast, my results do not support findings of associations with other outcomes from

other studies on fracking. Peng et al.’s study of Pennsylvania inpatient panel data from 2001–

2013 found significant positive associations between fracking development and pneumonia

hospitalizations among the elderly (88). Notably, my data present a nonsignificant, negative

association for pneumonia hospitalization events. This negative association becomes statistically

significant when I replace the NY Border control zip codes with zip codes further away, either

from the NY Northeastern or PA Southern groups (Supplementary Table 2). Furthermore, the

nonsignificant negative association exists when regressed against spuds per capita or spuds per

square meter, but it becomes a highly significant positive association when the raw number of

spuds—either new or cumulative—is used as the exposure instead (Supplementary Table 1). For

their significant estimates, Peng et al. used a combination of an indicator variable describing

whether any active well existed in the county and the log of unconventional natural gas output.

Taken together, these data suggest that the association with pneumonia may be particularly


sensitive to model specification, and that interpreters should be wary of possibly conflicting

evidence.

Second, Denham and colleagues published findings of significant associations between

cumulative fracking development and countywide genitourinary hospitalizations, using

Pennsylvania inpatient panel data from 2003–2014 (87). They further identified this association

to be driven by non-elderly females with three particular diagnoses: infections of the kidney,

calculus of the ureter, and urinary tract infection. Their study proposed two plausible

mechanisms for this association: one based on exposure to fracking-related contaminants, and a

second based on transient demographic changes, originally advanced by Beleche and Cintina,

who found higher rates of prostitution-related arrests and certain sexually transmitted infections,

suggesting demographic changes involving the influx of a young, impermanent workforce

involved in fracking operations (96). I used the same three ICD-9-CM codes identified by

Denham et al., but did not find any evidence of an association with well development among the

elderly. As such, my results would support the latter interpretation of Denham et al.’s results

(Beleche and Cintina’s mechanism), although it is possible that the first mechanism might still be

true if the younger demographic were somehow more exposed than the rest of the population.

Third, some evidence has suggested associations between Pennsylvania fracking activity

and asthma exacerbations (79). I do not find particularly strong or significant associations to

support this research, and my results appear rather dependent on model specification in the

sensitivity analyses. It is possible that my analysis is simply underpowered to detect an effect,

since only around 0.3% of the Medicare population experienced an asthma-related

hospitalization during any given year. Pediatric asthma, which can be more prevalent and more

sensitive to air pollution, may present a stronger research opportunity (80).


In most of the outcomes discussed above, the predominant mechanism for any causal

effect would likely be air pollution. The drilling and fracking stages typically involve emissions

of nitrogen oxides (NOx), particulate matter (PM), and volatile organic compounds (VOCs) from

the drill rig, compressor, generator engines, heaters, pumps, and truck traffic. During flowback

and production, many VOCs and other hazardous air pollutants are released into the air from

underground (109). Air sampling near fracking wells has discovered potentially dangerous levels

of hydrogen sulfide, formaldehyde (65), methylene chloride, polycyclic aromatic hydrocarbons

(64), BTEX (67), alkanes (66), and other pollutants, all attributed to fracking activity. Non-

fracking studies have demonstrated concerning effects of these pollutants, including elevated risk

of ischemic stroke (82, 85), arrhythmias, thrombosis, and heart failures (81), and other outcomes

(84). My results support these suggestions. Notably, when I include unexposed zip codes further

away from the exposed region than the NY Border zip codes, the effect sizes for cardiovascular

and cerebrovascular outcomes are substantially higher. While further analyses would be needed

to establish a distance-response behavior, it is possible that the cardiovascular and

cerebrovascular effects in the main model are underestimated because some NY Border zip

codes are also affected by air pollutants from Pennsylvania wells. In this case, the other control

regions would yield more accurate results, suggesting the association is closer to 0.06 cases per

spud than 0.04.

Besides air pollution, water contamination is another possible mechanism for health

outcomes. Because of its filtration functions, the kidney is particularly sensitive to toxic water

contaminants (62). Nephrotoxicity may result from contamination of drinking or bathing water

by arsenic and other heavy metals, or other chemicals. Although evidence that fracking leads to

water contamination is more contentious than corresponding evidence for air pollution,


contaminants have been observed in some cases, particularly after spills or leakages (53, 54, 58, 110,

111), and the extent of contamination from routine operations is still unknown (56). While my

results do not suggest a particular contaminant pathway, the significant association observed

between fracking and kidney disease does indicate that further research is needed to elucidate the root

of the correlation. Given that the estimate’s significance diminishes when I replace NY Border

zip codes with NY Northeastern or PA Southern zip codes, it is possible that the observed

association is mainly a feature of uniquely low kidney disease rates in NY Border zip codes.

Finally, despite sound a priori evidence for fracking-related increases in truck traffic

(Section 3.4.4), my analysis of motor vehicle accidents did not produce any notable results.

Moreover, several follow-up panel data investigations of traffic accidents in the general

population, using individual crash data from Pennsylvania and Texas state departments of

transportation, have not revealed any strong evidence for increased accidents in counties or

months with high fracking volume (data not shown). Absent new evidence, and despite limited

indications to the contrary (91), any effect of fracking on traffic accidents appears negligible.

6.2 Strengths, Limitations, and Future Directions

6.2.1 Strengths and Limitations

This study’s main strengths are its data sources and study regions. Both Medicare claims

and PADEP records are reliable and complete, and my panel data from 1999–2015 includes

more years than other studies. In addition, the model structure is robust to differences across zip

codes as long as those differences do not vary with time. The study regions, namely the zip codes

in the NY and PA Border groups, were precisely selected to match areas exposed to fracking

development, and limited to regions with high development and no development (e.g. as opposed

to other studies that incorporate all Pennsylvania counties). Their geographical and demographic


proximity lends credibility to the assumption of minimal time-varying differences except as

related to fracking development. Finally, where other studies frequently use county-level

hospitalization and exposure measures from across Pennsylvania, my analyses have more precise

zip code-level measurements and exclude urban areas (e.g. Pittsburgh), where excess background

air pollution may obscure results.

A primary limitation of this study is that it treats each zip code/year unit in the panel data

as an equivalent point; i.e., two zip codes with the same exposure and hospitalization rate

contribute equally to the model, even if one has a greater population and therefore more certainty

in its estimate. It may be possible to incorporate this zip code-level uncertainty using a two-stage

model. However, without individual-level data on exposure (e.g. patient addresses), estimates of

fracking exposure would still carry some undefinable uncertainty. The sensitivity analyses in

Section 5.3.4, where the 25% most populous and least populous zip codes were excluded, were

included to help address this limitation.

A second limitation is that despite clustering of standard errors by state, the outcomes

possess serial autocorrelation within zip codes over years. Such autocorrelation may result in

downward-biased standard errors and an overstatement of significance. While this concern is not

fully addressed in my models, the zip code fixed effects and the relatively few time points per zip

code (<20 years, with fracking activity <10 years) provide reasonable trust in the estimates and

their standard errors. In addition, outcomes that demonstrate consistently large, positive effect

sizes (0.01–0.06) and consistently significant correlations (p << 0.0001) across all sensitivity

analyses present strong evidence for association.

Furthermore, this model does not capture the spatial or temporal correlation in data

points; for instance, if a zip code with high fracking activity spreads environmental pollutants


into neighboring zip codes as well, the impact on surrounding areas is not captured by the model.

This concern is amplified by the sensitivity analyses using alternative control regions, which

demonstrate preliminary evidence that hospitalization rates may be affected even in non-fracking

zip codes that neighbor exposed areas. This omission may result in misestimation of standard

errors. To address the temporal component, further analyses may potentially include terms

representing outcomes in the previous year(s) in the regression model, or other similar features.

Accounting for the spatial component of this autocorrelation is more difficult, but perhaps could

be approached using specifications that incorporate hospitalization rates from neighboring zip

codes as additional predictors. Based on the other evidence presented in my analyses, it is

unlikely that these adjustments would substantially alter the effect size or direction that is

observed, but they may result in different standard errors and more accurate p-values.

Although the dataset is one of the study’s strengths, I was only able to use inpatient

claims because of data availability. As a result, only conditions that required inpatient

hospitalization were detected. This is both a strength and a limitation. Any effects observed

among inpatient cases offer fairly strong evidence for a correlation, since many additional cases

will likely be resolved through physician or outpatient visits, or without professional medical

care at all. However, without non-inpatient records there is no opportunity to detect less

immediately serious outcomes that may be associated with well development. In sum, because of

the inpatient limitation, the effects presented here are likely underestimates of true morbidity

associated with fracking.

6.2.2 Procedural Remarks

Similar studies often include other time-varying covariates in the model, such as poverty

estimates, median income, or production metrics from alternative energy sources, but in this


panel data structure such covariates have high likelihoods of being impacted by fracking

development themselves. Consequently, they were omitted in order to obtain a holistic view of

the effects of fracking development, both positive and negative. Further studies may find it

worthwhile to observe how including certain covariates changes model estimates, thereby

determining which covariates, such as socioeconomic measurements, may influence the causal

pathway.

Among the various outcomes tested and the many specifications in the sensitivity

analyses, there are likely some with overestimated significance, purely by chance of multiple

hypotheses. Although multiple testing was a consideration throughout the analyses, no formal

correction procedure was implemented for a few reasons. First, model results, particularly among

sensitivity analyses on the same outcome, were observed as a holistic group providing a unified

body of evidence, rather than a pool from which to select “significant” results. Second, many of

these tests were expected to have correlated significance, even between outcomes; for instance,

zip codes with more arrhythmia hospitalizations may also have more heart failure claims, and

may even have more claims for kidney disease, depending on the reason for the correlation.

Although some multiple testing correction methods are more robust to such correlation issues in

practice, many either assume independent tests or lose power when tests are correlated. Finally, the estimates I have

identified as significant in these investigations do not come from an arbitrary significance cutoff,

but rather display convincing evidence across multiple sensitivity analyses and would do so even

under an overconservative Bonferroni correction.
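As a rough illustration of what a formal correction would look like, the sketch below applies Bonferroni and Holm adjustments to the fourteen main-model p-values listed in Supplementary Table 1. It is not the procedure used in this thesis, and it makes two simplifying assumptions: p-values reported as "<0.0001" are entered as 0.0001 (an upper bound), and only the main-model tests are counted, although the full set of sensitivity analyses is larger. The function names are hypothetical.

```python
# Illustrative only: main-model p-values transcribed from Supplementary Table 1.
# Assumption: entries reported as "<0.0001" are entered as the upper bound 0.0001.
MAIN_MODEL_P = {
    "Cardiovascular": 0.0001,
    "Arrhythmia": 0.0001,
    "Heart failure": 0.0001,
    "Thrombocytic causes": 0.0001,
    "Cerebrovascular": 0.0001,
    "Stroke": 0.0001,
    "Respiratory": 0.047,
    "COPD": 0.0061,
    "Asthma": 0.42,
    "Pneumonia": 0.74,
    "Kidney disease": 0.0007,
    "Genitourinary infections": 0.41,
    "Bladder cancer": 0.28,
    "Motor vehicle accidents": 0.72,
}


def bonferroni(p_values, alpha=0.05):
    """Return the outcomes whose p-value is at most alpha / m."""
    m = len(p_values)
    return {name for name, p in p_values.items() if p <= alpha / m}


def holm(p_values, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value to alpha / (m - k + 1)."""
    m = len(p_values)
    rejected = set()
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    for k, (name, p) in enumerate(ordered):  # k is 0-based, so the divisor is m - k
        if p > alpha / (m - k):
            break  # stop at the first p-value that fails its threshold
        rejected.add(name)
    return rejected


if __name__ == "__main__":
    print("Bonferroni:", sorted(bonferroni(MAIN_MODEL_P)))
    print("Holm:      ", sorted(holm(MAIN_MODEL_P)))
```

Holm never rejects fewer hypotheses than Bonferroni at the same level, so comparing the two gives a sense of how much the stricter threshold costs under these assumptions.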

6.2.3 The Ideal Analysis

The most convincing analysis possible would use patient-level exposure data and

outcome data. One possible specification, for instance, would be a logistic model regressing a

person’s status (hospitalized/healthy) on their exposure to fracking (e.g. the inverse distance to

the nearest well, the number of wells within 5 miles, etc.). Without individual-level address data,

this analysis is impossible on the whole Medicare population. However, one opportunity for this

patient-level analysis is with a subset of Medicare beneficiaries living in nursing homes, since

the locations of nursing homes are known. This possibility represents a salient follow-up

opportunity I intend to pursue to achieve stronger and more accurate estimates of fracking effects

on health outcomes.
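A minimal sketch of such a patient-level logistic specification follows. It runs on synthetic data only: the exposure variable (number of wells within 5 miles), the sample size, and the "true" coefficients of -2.0 and 0.2 are arbitrary assumptions chosen for illustration, and the function names are hypothetical. The model is fit by Newton-Raphson in plain Python to keep the example self-contained; a real analysis would more likely use a statistical package and adjust for patient characteristics.

```python
import math
import random


def sigmoid(z):
    """Logistic function, clamped to avoid overflow in math.exp."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, intercept, slope, seed=0):
    """Generate synthetic (not real) patients as (wells_within_5mi, hospitalized)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        wells = rng.randint(0, 10)  # hypothetical exposure: wells within 5 miles
        p = sigmoid(intercept + slope * wells)
        hospitalized = 1 if rng.random() < p else 0
        data.append((wells, hospitalized))
    return data


def fit_logistic(data, iterations=25):
    """Fit logit P(hospitalized) = b0 + b1 * wells by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in data:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p          # gradient with respect to b0
            g1 += (y - p) * x    # gradient with respect to b1
            h00 += w             # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


if __name__ == "__main__":
    patients = simulate(n=2000, intercept=-2.0, slope=0.2)
    b0, b1 = fit_logistic(patients)
    print("intercept:", round(b0, 3))
    print("log-odds change per additional well:", round(b1, 3))
    print("odds ratio per additional well:", round(math.exp(b1), 3))
```

The fitted slope is the change in log-odds of hospitalization per additional nearby well, so its exponential is the corresponding odds ratio. An inverse-distance exposure could be substituted for the well count without changing the structure of the fit.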

6.3 Conclusion

Overall, I present novel, consistent evidence for associations between unconventional

well development and cardiovascular hospitalizations, particularly for arrhythmias, among the

elderly in Northern Pennsylvania. These associations are robust to a variety of treatment and

model specifications. To a lesser extent, there are also associations with hospitalizations for

cerebrovascular, kidney disease, and possibly COPD events. My analyses invite further research

into mechanisms and morbidity caused by fracking activity, and suggest that the effect of well

development on local community health should become a broader part of the fracking

conversation in academic and public circles.

7 References

1. The United States is now the largest global crude oil producer - Today in Energy - U.S. Energy Information Administration (EIA) (August 8, 2019).

2. Global Natural Gas Production | World gas natural statistics | Enerdata (August 8, 2019).

3. O. Kuhn, Ancient Chinese Drilling: Salt Drilling in Ancient China and Other Stories.

4. WVGES Geology: History of West Virginia Salt Industry (October 15, 2019).

5. J. E. Brantly, History of Oil Well Drilling (Gulf Pub Co, 1971).

6. C. A. Whiteshot, The Oil-well Driller: A History of the World’s Greatest Enterprise, the Oil Industry (C. A. Whiteshot, 1905).

7. Nation’s first gas well was dug in Western New York. Buffalo News (2016) (October 17, 2019).

8. History - Dale Hollow - Resort Parks - Kentucky State Parks (October 17, 2019).

9. Visions of Azerbaijan Magazine ::: A BRIEF HISTORY OF OIL AND GAS WELL DRILLING. Vis. Azerbaijan Mag. (October 17, 2019).

10. Development of the Pennsylvania Oil Industry - National Historic Chem…. archive.is (2013) (December 9, 2019).

11. Samuel Kier - Medicine Man & Refiner | Oil150.com (2012) (December 9, 2019).

12. B. Black, M. Ladson, Oil at 150: Energy Past and Future in Pennsylvania. Pa. Legacies 10, 6–13 (2010).

13. Titusville, Pennsylvania, 1896 (1896) (December 10, 2019).

14. Drake’s Well (December 10, 2019).

15. Early Oil in Pennsylvania (December 10, 2019).

16. Who Made America? | Innovators | Edwin Drake (December 10, 2019).

17. History.com Editors, Battle of Fredericksburg. HISTORY (December 17, 2019).

18. Shooters - A “Fracking” History. Am. Oil Gas Hist. Soc. (2019) (December 17, 2019).

19. J. Manfreda, The origin of fracking actually dates back to the Civil War. Bus. Insid. (December 17, 2019).

20. Improvement in method of increasing capacity of oil-wells (1866) (December 17, 2019).

21. American Oil & Gas Historical Society, The petroleum age. Pet. Age (2004).

22. J. J. McLaurin, Sketches in Crude-oil: Some Accidents and Incidents of the Petroleum Development in All Parts of the Globe ... (The author, 1902).

23. Halliburton cements Wells. Am. Oil Gas Hist. Soc. (2019) (December 19, 2019).

24. E. P. Halliburton, Method and means for cementing oil-wells (1921) (December 19, 2019).

25. T. E. Clark, Device for perforating well-casings (1902) (December 19, 2019).

26. Lane-Wells 100,000th Perforation. Am. Oil Gas Hist. Soc. (2019) (December 19, 2019).

27. J. Grebe, M. Stoesser, Increasing crude production from established fields. World Pet. 6, 473–482 (1935).

28. T. T. Palisch, M. Vincent, P. J. Handren, Slickwater Fracturing: Food for Thought. SPE Prod. Oper. 25, 327–344 (2010).

29. C. T. Montgomery, M. B. Smith, Hydraulic Fracturing: History of an Enduring Technology. J. Pet. Technol. 62, 26–40 (2010).

30. The Birth and Development of Horizontal Directional Drilling. Trenchless Technol. (2012) (December 30, 2019).

31. A. G. E. Kosmala, A. C. Pisoni, D. K. Pirovolou, S. J. Kotsonis, Actively controlled rotary steerable system and method for drilling wells (2000) (January 3, 2020).

32. Resource Library | SLB (January 3, 2020).

33. D. B. Gaudin, J. C. Beasley, A Comparison of MWD and Wireline, Steering Tool Guidance Systems in Horizontal Drilling in (Society of Petroleum Engineers, 1991) https://doi.org/10.2118/22536-MS (January 3, 2020).

34. Z. Wang, M. Poscente, D. Filip, M. Dimanchev, M. P. Mintchev, Rotary in-drilling alignment using an autonomous MEMS-based inertial measurement unit for measurement-while-drilling processes. IEEE Instrum. Meas. Mag. 16, 26–34 (2013).

35. J. Včelák, P. Ripka, A. Zikmund, Precise Magnetic Sensors for Navigation and Prospection. J. Supercond. Nov. Magn. 28, 1077–1080 (2015).

36. High-Volume Hydraulic Fracturing in NYS - NYS Dept. of Environmental Conservation (July 8, 2019).

37. T. Auch, TAuch_Infrastructure-Fracking-RiceEnergy-Richland_OH_July2019 (2019) with aerial assistance by LightHawk. Provided by FracTracker Alliance, fracktracker.org/photos.

38. Well Completion. NaturalGas.org (January 4, 2020).

39. “Hydraulic Fracturing for Oil and Gas: Impacts from the Hydraulic Fracturing Water Cycle on Drinking Water Resources in the United States” (U.S. Environmental Protection Agency, 2016).

40. T. Auch, TAuch_Infrastructure-FrackPad_Flaring-EclipseResources-Fairview_OH_July2018 (2018) with aerial assistance by LightHawk. Provided by FracTracker Alliance, fracktracker.org/photos.

41. C. Davenport, Reversing Course, E.P.A. Says Fracking Can Contaminate Drinking Water. N. Y. Times (2017) (March 11, 2019).

42. N. R. Warner, et al., Geochemical evidence for possible natural migration of Marcellus Formation brine to shallow aquifers in Pennsylvania. Proc. Natl. Acad. Sci. 109, 11961–11966 (2012).

43. J.-P. Nicot, et al., Controls on Methane Occurrences in Shallow Aquifers Overlying the Haynesville Shale Gas Field, East Texas. Groundwater 55, 443–454 (2017).

44. R. D. Vidic, S. L. Brantley, J. M. Vandenbossche, D. Yoxtheimer, J. D. Abad, Impact of shale gas development on regional water quality. Science 340, 1235009 (2013).

45. S. G. Osborn, A. Vengosh, N. R. Warner, R. B. Jackson, Methane contamination of drinking water accompanying gas-well drilling and hydraulic fracturing. Proc. Natl. Acad. Sci. 108, 8172–8176 (2011).

46. T. Wen, et al., Exploring How to Use Groundwater Chemistry to Identify Migration of Methane near Shale Gas Wells in the Appalachian Basin. Environ. Sci. Technol. 53, 9317–9327 (2019).

47. R. B. Jackson, et al., Increased stray gas abundance in a subset of drinking water wells near Marcellus shale gas extraction. Proc. Natl. Acad. Sci. 110, 11250–11255 (2013).

48. T. Saba, M. Orzechowski, Lack of data to support a relationship between methane contamination of drinking water wells and hydraulic fracturing. Proc. Natl. Acad. Sci. 108, E663–E663 (2011).

49. S. C. Schon, Hydraulic fracturing not responsible for methane migration. Proc. Natl. Acad. Sci. 108, E664–E664 (2011).

50. R. J. Davies, Methane contamination of drinking water caused by hydraulic fracturing remains unproven. Proc. Natl. Acad. Sci. 108, E871–E871 (2011).

51. P. F. Ziemkiewicz, J. D. Quaranta, A. Darnell, R. Wise, Exposure pathways related to shale gas development and procedures for reducing environmental and public risk. J. Nat. Gas Sci. Eng. 16, 77–84 (2014).

52. A. Vengosh, R. B. Jackson, N. Warner, T. H. Darrah, A. Kondash, A Critical Review of the Risks to Water Resources from Unconventional Shale Gas Development and Hydraulic Fracturing in the United States. Environ. Sci. Technol. 48, 8334–8348 (2014).

53. B. D. Drollette, et al., Elevated levels of diesel range organic compounds in groundwater near Marcellus gas operations are derived from surface activities. Proc. Natl. Acad. Sci. 112, 13184–13189 (2015).

54. S. A. Gross, et al., Analysis of BTEX groundwater concentrations from surface spills associated with hydraulic fracturing operations. J. Air Waste Manag. Assoc. 63, 424–432 (2013).

55. Z. L. Hildenbrand, et al., A Comprehensive Analysis of Groundwater Quality in The Barnett Shale Region. Environ. Sci. Technol. 49, 8254–8262 (2015).

56. A. Vengosh, N. Warner, R. Jackson, T. Darrah, The Effects of Shale Gas Exploration and Hydraulic Fracturing on the Quality of Water Resources in the United States. Procedia Earth Planet. Sci. 7, 863–866 (2013).

57. G. T. Llewellyn, et al., Evaluating a groundwater supply contamination incident attributed to Marcellus Shale gas development. Proc. Natl. Acad. Sci. 112, 6325–6330 (2015).

58. J. S. Harkness, et al., Iodide, Bromide, and Ammonium in Hydraulic Fracturing and Oil and Gas Wastewaters: Environmental Implications. Environ. Sci. Technol. 49, 1955–1963 (2015).

59. M. L. Hladik, M. J. Focazio, M. Engle, Discharges of produced waters from oil and gas extraction via wastewater treatment plants are sources of disinfection by-products to receiving streams. Sci. Total Environ. 466–467, 1085–1093 (2014).

60. E. Hill, L. Ma, Shale Gas Development and Drinking Water Quality. Am. Econ. Rev. 107, 522–525 (2017).

61. M. Bamberger, R. E. Oswald, Impacts of Gas Drilling on Human and Animal Health. NEW Solut. J. Environ. Occup. Health Policy 22, 51–77 (2012).

62. X. Xu, S. Nie, H. Ding, F. F. Hou, Environmental pollution and kidney diseases. Nat. Rev. Nephrol. 14, 313–324 (2018).

63. A. A. Roy, P. J. Adams, A. L. Robinson, Air pollutant emissions from the development, production, and processing of Marcellus Shale natural gas. J. Air Waste Manag. Assoc. 64, 19–37 (2014).

64. T. Colborn, K. Schultz, L. Herrick, C. Kwiatkowski, An Exploratory Study of Air Quality Near Natural Gas Operations. Hum. Ecol. Risk Assess. Int. J. 20, 86–105 (2014).

65. G. P. Macey, et al., Air concentrations of volatile compounds near oil and gas production: a community-based exploratory study. Environ. Health 13, 82 (2014).

66. G. Pétron, et al., Hydrocarbon emissions characterization in the Colorado Front Range: A pilot study. J. Geophys. Res. Atmospheres 117 (2012).

67. A. L. Rich, H. T. Orimoloye, Elevated Atmospheric Levels of Benzene and Benzene-Related Compounds from Unconventional Shale Extraction and Processing: Human Health Concern for Residential Communities. Environ. Health Insights 10, EHI.S33314 (2016).

68. T. Vinciguerra, et al., Regional air quality impacts of hydraulic fracturing and shale natural gas activity: Evidence from ambient VOC observations. Atmos. Environ. 110, 144–150 (2015).

69. P. S. Goodman, et al., Investigating the traffic-related environmental impacts of hydraulic-fracturing (fracking) operations. Environ. Int. 89–90, 248–260 (2016).

70. L. M. McKenzie, R. Z. Witter, L. S. Newman, J. L. Adgate, Human health risk assessment of air emissions from development of unconventional natural gas resources. Sci. Total Environ. 424, 79–87 (2012).

71. K. P. Stillerman, D. R. Mattison, L. C. Giudice, T. J. Woodruff, Environmental Exposures and Adverse Pregnancy Outcomes: A Review of the Science. Reprod. Sci. 15, 631–650 (2008).

72. J. A. Casey, et al., Unconventional natural gas development and birth outcomes in Pennsylvania, USA. Epidemiol. Camb. Mass 27, 163–172 (2016).

73. K. W. Whitworth, A. K. Marshall, E. Symanski, Maternal residential proximity to unconventional gas development and perinatal outcomes among a diverse urban population in Texas. PLOS ONE 12, e0180966 (2017).

74. K. W. Whitworth, A. K. Marshall, E. Symanski, Drilling and Production Activity Related to Unconventional Gas Development and Severity of Preterm Birth. Environ. Health Perspect. 126, 037006 (2018).

75. J. Currie, M. Greenstone, K. Meckel, Hydraulic fracturing and infant health: New evidence from Pennsylvania. Sci. Adv. 3, e1603021 (2017).

76. S. L. Stacy, et al., Perinatal Outcomes and Unconventional Natural Gas Operations in Southwest Pennsylvania. PLOS ONE 10, e0126425 (2015).

77. E. L. Hill, Shale gas development and infant health: Evidence from Pennsylvania. J. Health Econ. 61, 134–150 (2018).

78. L. M. McKenzie, et al., Birth Outcomes and Maternal Residential Proximity to Natural Gas Development in Rural Colorado. Environ. Health Perspect. 122, 412–417 (2014).

79. S. G. Rasmussen, et al., Association Between Unconventional Natural Gas Development in the Marcellus Shale and Asthma Exacerbations. JAMA Intern. Med. 176, 1334–1343 (2016).

80. M. D. Willis, T. A. Jusko, J. S. Halterman, E. L. Hill, Unconventional natural gas development and pediatric asthma hospitalizations in Pennsylvania. Environ. Res. 166, 402–408 (2018).

81. G. Hoek, B. Brunekreef, P. Fischer, J. van Wijnen, The Association between Air Pollution and Heart Failure, Arrhythmia, Embolism, Thrombosis, and Other Cardiovascular Causes of Death in a Time Series Study. Epidemiology 12, 355–357 (2001).

82. G. A. Wellenius, et al., Ambient Air Pollution and the Risk of Acute Ischemic Stroke. Arch. Intern. Med. 172, 229–234 (2012).

83. D.-H. Tsai, J.-L. Wang, K.-J. Chuang, C.-C. Chan, Traffic-related air pollution and cardiovascular mortality in central Taiwan. Sci. Total Environ. 408, 1818–1823 (2010).

84. J. O. Anderson, J. G. Thundiyil, A. Stolbach, Clearing the Air: A Review of the Effects of Particulate Matter Air Pollution on Human Health. J. Med. Toxicol. 8, 166–175 (2012).

85. A. S. V. Shah, et al., Short term exposure to air pollution and stroke: systematic review and meta-analysis. BMJ 350 (2015).

86. T. Jemielita, et al., Unconventional Gas and Oil Drilling Is Associated with Increased Hospital Utilization Rates. PLoS ONE 10 (2015).

87. A. Denham, M. Willis, A. Zavez, E. Hill, Unconventional natural gas development and hospitalizations: evidence from Pennsylvania, United States, 2003–2014. Public Health 168, 17–25 (2019).

88. L. Peng, C. Meyerhoefer, S.-Y. Chou, The health implications of unconventional natural gas development in Pennsylvania. Health Econ. 27, 956–983 (2018).

89. A. K. Werner, et al., All-age hospitalization rates in coal seam gas areas in Queensland, Australia, 1995–2011. BMC Public Health 16, 125 (2016).

90. K. D. Retzer, R. D. Hill, S. G. Pratt, Motor vehicle fatalities among oil and gas extraction workers. Accid. Anal. Prev. 51, 168–174 (2013).

91. J. Graham, et al., Increased traffic accident rates associated with shale gas drilling in Pennsylvania. Accid. Anal. Prev. 74, 203–209 (2015).

92. G. Theodori, Paradoxical Perceptions of Problems Associated with Unconventional Natural Gas Development. South. Rural Sociol. 24, 97–117 (2009).

93. K. Brasier, et al., Residents’ Perceptions of Community and Environmental Impacts From Development of Natural Gas in the Marcellus Shale: A Comparison of Pennsylvania and New York Cases. J. Rural Soc. Sci. 26, 32–61 (2011).

94. A. B. Trueblood, G. Sansom, Are Fracking Sites Associated with Increased Motor Vehicle Crashes in Texas? Tex. Public Health J. 67, 15–17 (2015).

95. T. Komarek, A. Cseh, Fracking and public health: Evidence from gonorrhea incidence in the Marcellus Shale region. J. Public Health Policy 38, 464–481 (2017).

96. T. Beleche, I. Cintina, Fracking and risky behaviors: Evidence from Pennsylvania. Econ. Hum. Biol. 31, 69–82 (2018).

97. L. Muehlenbachs, E. Spiller, C. Timmins, The Housing Market Impacts of Shale Gas Development. Am. Econ. Rev. 105, 3633–3659 (2015).

98. M. S. Kearney, R. Wilson, “The Family Formation Response to a Localized Economic Shock: Evidence from the Fracking Boom” (Social Science Research Network, 2016) (July 16, 2019).

99. K. Hastings, L. R. Heller, E. F. Stephenson, Fracking and Labor Market Conditions: A Comparison of Pennsylvania and New York Border Counties. East. Econ. J. 43, 649–659 (2017).

100. A. James, B. Smith, There will be blood: Crime rates in shale-rich U.S. counties. J. Environ. Econ. Manag. 84, 125–152 (2017).

101. D. S. Rickman, H. Wang, J. V. Winters, Is shale development drilling holes in the human capital pipeline? Energy Econ. 62, 283–290 (2017).

102. P. M. Rabinowitz, et al., Proximity to natural gas wells and reported health status: Results of a household survey in Washington County, Pennsylvania. Environ. Health Perspect. 123, 21–26 (2015).

103. K. J. Ferrar, et al., Assessment and longitudinal analysis of health impacts and stressors perceived to result from unconventional shale gas development in the Marcellus Shale region. Int. J. Occup. Environ. Health 19, 104–112 (2013).

104. N. Steinzor, W. Subra, L. Sumi, Investigating Links between Shale Gas Development and Health Impacts through a Community Survey Project in Pennsylvania. NEW Solut. J. Environ. Occup. Health Policy 23, 55–83 (2013).

105. Oil and Gas Reports. Dep. Environ. Prot. (July 22, 2019).

106. S. Cakmak, R. E. Dales, C. B. Vidal, Air Pollution and Mortality in Chile: Susceptibility among the Elderly. Environ. Health Perspect. 115, 524–527 (2007).

107. High Performance Computing. Cent. Res. Inform. (April 21, 2020).

108. S. D. Richardson, M. J. Plewa, E. D. Wagner, R. Schoeny, D. M. DeMarini, Occurrence, genotoxicity, and carcinogenicity of regulated and emerging disinfection by-products in drinking water: A review and roadmap for research. Mutat. Res. Mutat. Res. 636, 178–242 (2007).

109. R. A. Field, J. Soltis, S. Murphy, Air quality concerns of unconventional oil and natural gas production. Environ. Sci. Process. Impacts 16, 954–969 (2014).

110. N. Abualfaraj, P. L. Gurian, M. S. Olson, Assessing Residential Exposure Risk from Spills of Flowback Water from Marcellus Shale Hydraulic Fracturing Activity. Int. J. Environ. Res. Public. Health 15, 727 (2018).

111. J. Wisen, et al., Assessing the potential of cross-contamination from oil and gas hydraulic fracturing: A case study in northeastern British Columbia, Canada. J. Environ. Manage. 246, 275–282 (2019).

8 Supplementary Material

Supplementary Figure 1: Solid lines show yearly incidences of all studied cardiovascular outcomes, as percentages, identifying outcomes only through the admitting and primary diagnoses. Dashed lines show yearly incidences using admitting, primary, and first secondary diagnoses. Dotted lines show yearly incidences using admitting, primary, and all secondary diagnoses. Cardiovascular outcomes are shown here, but other outcomes show similar relative differences over the number of diagnosis columns included.

[Figure. Plot title: Raw incidence per 100: air-cardiovascular-all (ICD Codes: 428, 427, 415.1, 433, 434, 444, 452, 453). X-axis: Year (2000–2015). Y-axis: Cases Per Hundred. Study Group: NY Border, NY Northeastern, PA Border, PA Southern.]

Supplementary Figure 2: Pennsylvania zip codes shaded by the density of spuds in the PADEP dataset. Lighter zip codes have a higher density of spuds, calculated as spuds per square meter. Red lines outline county boundaries.

Supplementary Figure 3: Hospitalization time series plots for different outcomes. Dashed black lines indicate trajectory of new spuds drilled per year, scaled to no particular axis. Y-axis is hospitalizations per year.

[Figure. One panel per outcome, x-axis 2000–2015, regions NY Border and PA Border. Panels: Cardiovascular (ICD-9-CM 428, 427, 415.1, 433–4, 444, 452–3); Arrhythmia (ICD-9-CM 427); Heart Failure (ICD-9-CM 428); Thrombocytic (ICD-9-CM 415.1, 433–4, 444, 452–3); Cerebrovascular (ICD-9-CM 430–7); Stroke (ICD-9-CM 434); Respiratory (ICD-9-CM 490–6, 480–8); COPD (ICD-9-CM 490–6); Asthma (ICD-9-CM 493); Pneumonia (ICD-9-CM 480–8); Kidney Disease (ICD-9-CM 580–9); Genitourinary (ICD-9-CM 590, 592.1, 599); Bladder Cancer (ICD-9-CM 188).]

Supplementary Table 1: Data from sensitivity analyses on the exposure specification for each outcome. See Section 5.3.1 for context.

Exposure Specification                           Estimated Effect  Standard Error  P-value

All cardiovascular outcomes: ICD-9-CM 415.1, 427, 428, 433, 434, 444, 452, 453
  Cumulative spuds per beneficiary (main model)  0.0441            0.0028          <0.0001
  Cumulative spuds per square meter              12960             840             <0.0001
  Cumulative spud count                          4.33e–5           1.73e–5         0.012
  New spuds per beneficiary                      0.1708            0.0249          <0.0001
  New spuds per square meter                     33730             9112            0.0002
  New spud count                                 1.32e–4           9.37e–5         0.16

Arrhythmia: ICD-9-CM 427
  Cumulative spuds per beneficiary (main model)  0.0308            0.0023          <0.0001
  Cumulative spuds per square meter              10095             410             <0.0001
  Cumulative spud count                          2.41e–5           1.40e–5         0.085
  New spuds per beneficiary                      0.0950            0.0139          <0.0001
  New spuds per square meter                     28059             4164            <0.0001
  New spud count                                 7.21e–5           5.18e–5         0.16

Heart failure: ICD-9-CM 428
  Cumulative spuds per beneficiary (main model)  0.0198            0.0017          <0.0001
  Cumulative spuds per square meter              6115              470             <0.0001
  Cumulative spud count                          1.99e–5           8.52e–6         0.020
  New spuds per beneficiary                      0.0944            0.0107          <0.0001
  New spuds per square meter                     17896             4433            <0.0001
  New spud count                                 6.42e–5           4.65e–5         0.17

Thrombocytic causes: ICD-9-CM 415.1, 433, 434, 444, 452, 453
  Cumulative spuds per beneficiary (main model)  0.0045            0.0005          <0.0001
  Cumulative spuds per square meter              822               230             0.0003
  Cumulative spud count                          1.11e–5           5.41e–7         <0.0001
  New spuds per beneficiary                      0.0213            0.0068          0.0018
  New spuds per square meter                     449               2379            0.85
  New spud count                                 4.84e–5           1.61e–5         0.0026

All cerebrovascular outcomes: ICD-9-CM 430–437
  Cumulative spuds per beneficiary (main model)  0.0060            0.00019         <0.0001
  Cumulative spuds per square meter              723               205             0.0004
  Cumulative spud count                          1.21e–5           5.86e–8         <0.0001
  New spuds per beneficiary                      0.0160            0.0063          0.011
  New spuds per square meter                     –4043             2656            0.13
  New spud count                                 4.56e–5           1.37e–5         0.0009

Stroke: ICD-9-CM 434
  Cumulative spuds per beneficiary (main model)  0.0049            0.0004          <0.0001
  Cumulative spuds per square meter              1161              182             <0.0001
  Cumulative spud count                          9.62e–6           9.68e–7         <0.0001
  New spuds per beneficiary                      0.0190            0.0036          <0.0001
  New spuds per square meter                     2223              1341            0.097
  New spud count                                 4.16e–5           8.43e–6         <0.0001

All respiratory outcomes: ICD-9-CM 480–488, 490–496
  Cumulative spuds per beneficiary (main model)  0.0250            0.0126          0.047
  Cumulative spuds per square meter              7223              4374            0.099
  Cumulative spud count                          5.66e–5           2.19e–5         0.0097
  New spuds per beneficiary                      0.0553            0.0224          0.014
  New spuds per square meter                     4954              8498            0.56
  New spud count                                 7.35e–5           5.49e–5         0.18

Chronic Obstructive Pulmonary Disease: ICD-9-CM 490–496
  Cumulative spuds per beneficiary (main model)  0.0209            0.0076          0.0061
  Cumulative spuds per square meter              7581              2104            0.0003
  Cumulative spud count                          4.09e–5           1.46e–5         0.0051
  New spuds per beneficiary                      0.0611            0.0173          0.0004
  New spuds per square meter                     8740              6388            0.17
  New spud count                                 7.85e–5           4.61e–5         0.089

Asthma: ICD-9-CM 493
  Cumulative spuds per beneficiary (main model)  0.0001            0.0002          0.42
  Cumulative spuds per square meter              –320              170             0.060
  Cumulative spud count                          4.57e–6           6.54e–7         <0.0001
  New spuds per beneficiary                      –0.0007           0.0003          0.0064
  New spuds per square meter                     –915              201             <0.0001
  New spud count                                 1.39e–5           1.17e–6         <0.0001

Pneumonia: ICD-9-CM 480–488
  Cumulative spuds per beneficiary (main model)  –0.0023           0.0069          0.74
  Cumulative spuds per square meter              –1380             2609            0.60
  Cumulative spud count                          2.73e–5           4.91e–6         <0.0001
  New spuds per beneficiary                      –0.0120           0.0086          0.16
  New spuds per square meter                     –3024             2695            0.26
  New spud count                                 3.63e–5           9.72e–6         0.0002

Kidney disease: ICD-9-CM 580–589
  Cumulative spuds per beneficiary (main model)  0.0095            0.0028          0.0007
  Cumulative spuds per square meter              1992              671             0.0030
  Cumulative spud count                          1.84e–5           4.40e–6         <0.0001
  New spuds per beneficiary                      0.0324            0.0066          <0.0001
  New spuds per square meter                     6603              1416            <0.0001
  New spud count                                 7.15e–5           1.72e–5         <0.0001

Infections of kidney, calculus of ureter, urinary tract infection: ICD-9-CM 590, 592.1, 599
  Cumulative spuds per beneficiary (main model)  –0.0027           0.0033          0.41
  Cumulative spuds per square meter              –542              1165            0.64
  Cumulative spud count                          3.12e–6           7.84e–6         0.69
  New spuds per beneficiary                      –0.0133           0.0092          0.15
  New spuds per square meter                     –5872             2424            0.016
  New spud count                                 6.44e–6           2.65e–5         0.81

Malignant bladder cancer: ICD-9-CM 188
  Cumulative spuds per beneficiary (main model)  –0.0005           0.0005          0.28
  Cumulative spuds per square meter              –237              126             0.060
  Cumulative spud count                          –1.53e–6          7.26e–7         0.036
  New spuds per beneficiary                      –0.0002           0.0014          0.91
  New spuds per square meter                     –68               393             0.86
  New spud count                                 –2.53e–6          2.82e–6         0.36

Motor vehicle accidents: ICD-9-CM
  Cumulative spuds per beneficiary (main model)  0.0002            0.0005          0.72
  Cumulative spuds per square meter              181               121             0.13
  Cumulative spud count                          1.32e–6           6.42e–7         0.040
  New spuds per beneficiary                      0.0005            0.0009          0.58
  New spuds per square meter                     21.3              295             0.94
  New spud count                                 5.00e–6           1.52e–6         0.0010

Supplementary Table 2: Estimates for 𝛽 in the main model (1) after replacing the NY Border zip codes with NY Northeastern zip codes (middle column) and PA Southern zip codes (right column). Estimates and their corresponding p-values are given. For a map of the different regions see Figure 5.

Outcome                   Using NY Northeastern Zips  Using PA Southern Zips
Cardiovascular            0.0615 (p<0.0001)           0.0629 (p=0.0009)
Arrhythmia                0.0411 (p<0.0001)           0.0369 (p<0.0001)
Heart failure             0.0188 (p<0.0001)           0.0333 (p=0.011)
Thrombocytic causes       0.0098 (p=0.011)            0.0090 (p=0.033)
Cerebrovascular           0.0100 (p<0.0001)           0.0111 (p=0.0018)
Stroke                    0.0064 (p<0.0001)           0.0063 (p=0.0003)
Respiratory               0.0195 (p=0.16)             0.0014 (p=0.70)
COPD, incl. asthma        0.0219 (p=0.048)            0.0065 (p=0.0018)
Asthma                    0.0007 (p=0.23)             3.4e–5 (p=0.87)
Pneumonia                 –0.0107 (p=0.016)           –0.0167 (p<0.0001)
Kidney disease            0.0074 (p=0.14)             0.0031 (p=0.75)
Infections of kidney, calculus of ureter, urinary tract infection
                          0.0031 (p=0.0019)           0.0039 (p<0.0001)
Malignant bladder cancer  0.0003 (p=0.034)            0.0002 (p=0.59)
Motor vehicle accidents   –0.0003 (p=0.51)            –8.7e–4 (p<0.0001)

Supplementary Table 3: Model estimates after excluding outlier zip code-years, either in incidence or exposure metrics, for all outcomes.

Outcome                   Estimated Effect  Standard Error  P-value
Cardiovascular            0.0537            0.0052          <0.0001
Arrhythmia                0.0272            0.0050          <0.0001
Heart failure             0.0216            0.0060          0.0003
Thrombocytic causes       0.0105            0.0017          <0.0001
Cerebrovascular           0.0137            0.0012          <0.0001
Stroke                    0.0091            0.0002          <0.0001
Respiratory               0.0257            0.0113          0.0230
COPD, incl. asthma        0.0427            0.0204          0.0367
Asthma                    0.0036            3.2e–5          <0.0001
Pneumonia                 0.0083            0.0098          0.3955
Kidney disease            0.0026            0.0010          0.0099
Infections of kidney, calculus of ureter, urinary tract infection
                          –0.0019           0.0002          <0.0001
Malignant bladder cancer  0.0003            3.1e–5          <0.0001
Motor vehicle accidents   0                 0               NA

Supplementary Table 4: Model estimates after excluding zip code-years in first and fourth quartiles of population.

Outcome                   Estimated Effect  Standard Error  P-value
Cardiovascular            0.0221            0.0003          <0.0001
Arrhythmia                0.0187            0.0005          <0.0001
Heart failure             0.0018            0.0011          0.11
Thrombocytic causes       0.0035            0.0008          <0.0001
Cerebrovascular           –0.0011           0.0013          0.43
Stroke                    0.0014            0.0006          0.022
Respiratory               0.0282            0.0184          0.13
COPD, incl. asthma        0.0270            0.0112          0.016
Asthma                    0.0030            0.0003          <0.0001
Pneumonia                 0.0055            0.0099          0.58
Kidney disease            0.0152            0.0019          <0.0001
Infections of kidney, calculus of ureter, urinary tract infection
                          0.0055            0.0016          0.0005
Malignant bladder cancer  –0.0010           0.0008          0.22
Motor vehicle accidents   0.0003            0.0002          0.12