Ambient Geographic Information and Biosurveillance (Todd Barr)

Ambient Geographic Information and Biosurveillance Capstone Presentation Todd Barr March 20, 2013


Page 1

Ambient Geographic Information and Biosurveillance

Capstone Presentation

Todd Barr

March 20, 2013

Page 2

“Classic” Biosurveillance

• Reports Only the Cases that are handled by Medical Professionals

• Data is sent to the Centers for Disease Control and Prevention

• Data is Aggregated to the State level

• Standard Turnaround time is 7 to 10 days, depending on the data and the level of the crisis

Page 3

Ambient Geographic Information

• Ambient Geographic Information (AGI) differs from Volunteered Geographic Information (VGI)

• Most Commonly Captured from Twitter, Facebook and Foursquare

• Can be used to trace vectors through Social Networks

• Can Determine “Hot Spots” of activity via Hashtags, keywords and modifiers

• Starting to be used in Biosurveillance, but still does not have buy-in from the “establishment”

Page 4

Risk Terrain Modeling

• Originally Used to Predict Crime

• Core Concept is that Certain activities are related to Geographic Features (Assaults tend to occur near certain Liquor Stores, Bars or Entertainment Venues)

• Leads to a Spatial Understanding for Strategic Decision Making

• Allows Decision Makers to make the best use of their Resources
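The core concept above can be sketched as a grid overlay in which each cell's risk score is the sum of the weights of the features found near it. This is only an illustration of the idea: the feature locations, the weights, and the search radius are all hypothetical and do not come from the presentation.

```python
from math import hypot

# Hypothetical risk features as (x, y, kind), in arbitrary grid units.
FEATURES = [(1.0, 1.0, "bar"), (1.5, 1.2, "liquor_store"), (8.0, 8.0, "venue")]

# Assumed weight per feature kind; real weights would be fitted to incident data.
WEIGHTS = {"bar": 2.0, "liquor_store": 1.5, "venue": 1.0}

# Assumed distance within which a feature influences a cell.
RADIUS = 1.0


def cell_risk(cx, cy, features=FEATURES, weights=WEIGHTS, radius=RADIUS):
    """Sum the weights of every feature within `radius` of the cell centre."""
    return sum(
        weights[kind]
        for x, y, kind in features
        if hypot(x - cx, y - cy) <= radius
    )


def risk_surface(width, height):
    """Score every cell of a width x height grid, keyed by (column, row)."""
    return {
        (col, row): cell_risk(col + 0.5, row + 0.5)
        for col in range(width)
        for row in range(height)
    }


surface = risk_surface(10, 10)
```

Cells with the highest scores are where resources would be directed first; cells far from every feature score zero.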

Page 5

AGI and RTM Enhancing Biosurveillance

• AGI

  • Allows Real Time Disease Information to be consumed and Analyzed both Spatially and as Text

  • No turnaround time

  • Not Aggregated to a State level

• RTM

  • Generation of an RTM Map for Public Health by County

  • People in the lesser served areas are less likely to seek medical attention and less likely to have symptoms/ailments reported

Page 6

Data Collection - RTM

• Used the Criteria from the Publication “County Health Rankings and Roadmaps: a Healthier Nation County by County”

• 32 influencers on health and health care quality

• Examples

  • Number of Medical Doctors in County

  • Proximity to Medical Care

  • Percentage of Population with Health Insurance

• Divided Counties into Quartiles

• 152 counties had no Data
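A minimal sketch of the quartile step, assuming each county has already been reduced to a single composite score (the slides do not say how the 32 influencers were combined). Counties without a score go into a separate "No Data" group. The FIPS codes and scores in the usage example are made up, and treating quartile 1 as the lowest-scoring group is an assumption.

```python
def assign_quartiles(scores):
    """Map county -> quartile (1-4) by rank, or "No Data" if the score is None.

    `scores` is a dict of county identifier -> composite score (hypothetical).
    Quartile 1 holds the lowest scores; this direction is an assumption.
    """
    ranked = sorted(
        (county for county, score in scores.items() if score is not None),
        key=lambda county: scores[county],
    )
    result = {county: "No Data" for county, score in scores.items() if score is None}
    for position, county in enumerate(ranked):
        # Integer arithmetic splits the ranked list into four near-equal groups.
        result[county] = position * 4 // len(ranked) + 1
    return result


# Made-up example: eight scored counties and one with no data.
example_scores = {
    "01001": 0.91, "01003": 0.12, "01005": 0.55, "01007": 0.33,
    "01009": 0.78, "01011": 0.20, "01013": 0.64, "01015": 0.47,
    "01017": None,
}
quartiles = assign_quartiles(example_scores)
```

Ranking by position rather than by fixed score cut-offs keeps the four groups the same size even when scores cluster.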

Page 7

Data Collection - AGI

• Used a Python Script To Collect Tweets within the US to populate a spreadsheet

• Collected an average of 40,000 tweets a night

• Roughly 5% of those Tweets had location data

• Used Hashtags, Keywords and Modifiers to determine if they were talking about the Flu, or getting a Flu shot
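The keyword-and-modifier filtering could look something like the sketch below. It is not the original collection script: the keyword set mirrors the key words charted later in the deck, while the two modifier sets and the category names are assumptions chosen for illustration.

```python
import re

# Key words charted later in the deck (matched case-insensitively, with or without "#").
KEYWORDS = {"flu", "influenza", "h1n1", "h3n2", "h5n1", "adenovirus"}

# Assumed modifiers; the presentation does not list the ones actually used.
SHOT_MODIFIERS = {"shot", "vaccine", "vaccinated"}
SICK_MODIFIERS = {"sick", "fever", "caught", "got", "have"}


def classify(text):
    """Return a category for a tweet, or None if it has no flu key word."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    if not tokens & KEYWORDS:
        return None
    if tokens & SHOT_MODIFIERS:
        return "flu_shot"
    if tokens & SICK_MODIFIERS:
        return "flu_illness"
    return "flu_mention"
```

For example, `classify("Getting my #flu shot today")` falls into the flu-shot category, while a tweet with no key word is dropped. Checking the shot modifiers first is a design choice so that "got my flu shot" is not counted as illness.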

Page 8

The Study

• Collected Flu Related Geolocated Tweets within the United States from the week of January 5 to the week ending February 2

• Determined how many of those Tweets were in each Quartile

• Compared the Results to the CDC Data from the same timeframe
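The per-quartile counting can be sketched as a simple tally keyed by week and quartile. The tweet records and the county-to-quartile lookup are hypothetical inputs. Weeks are assumed to end on Saturday, which matches the map dates in the deck (January 5, 12, 19, 26 and February 2, 2013 all fall on a Saturday).

```python
from collections import Counter
from datetime import date, timedelta


def week_ending(day):
    """Return the Saturday that closes the week containing `day`."""
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    return day + timedelta(days=(5 - day.weekday()) % 7)


def tally(tweets, quartile_of):
    """Count tweets per (week-ending date, quartile).

    `tweets` is an iterable of (date, county) pairs and `quartile_of` maps
    county -> quartile; both are hypothetical structures for this sketch.
    """
    counts = Counter()
    for day, county in tweets:
        counts[(week_ending(day), quartile_of.get(county, "No Data"))] += 1
    return counts


# Made-up example records.
sample_quartiles = {"24031": 1, "51059": 4}
sample_tweets = [
    (date(2013, 1, 3), "24031"),
    (date(2013, 1, 5), "24031"),
    (date(2013, 1, 6), "51059"),
    (date(2013, 1, 8), "99999"),
]
weekly = tally(sample_tweets, sample_quartiles)
```

The resulting weekly counts can then be set beside the CDC figures for the same week-ending dates.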

Page 9

Data Cleaning - AGI

• Total Usable Tweets: 25,000

• Geocoding Issues

  • Most had City and State

  • Some just had State

  • Others had full State Names which did not Geocode

  • Others had Clinics for Cities and Cities for States

• Used both ESRI Online Geocoding and CartoDB

  • ESRI Online Geolocated 75% of the total tweets

  • CartoDB Geolocated 90% of the total tweets
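One way to handle the state-name problem before geocoding is to normalize the free-text location field first. The sketch below is an assumption about how that could be done, not the cleaning that was actually performed; the lookup table is deliberately truncated and the function name is hypothetical.

```python
# Truncated lookup for illustration; a full table would cover every state.
STATE_ABBREVIATIONS = {
    "maryland": "MD",
    "virginia": "VA",
    "new york": "NY",
    "california": "CA",
}


def normalize_location(raw):
    """Turn "City, State" or "State" text into a (city, state_abbr) pair.

    Returns None when no state can be recognised, so the record can be
    set aside for manual review instead of being geocoded incorrectly.
    """
    parts = [part.strip() for part in raw.split(",") if part.strip()]
    if not parts:
        return None
    state = parts[-1]
    abbreviation = STATE_ABBREVIATIONS.get(state.lower())
    if abbreviation is None and len(state) == 2 and state.isalpha():
        # Already looks like a two-letter code; only the case is fixed.
        abbreviation = state.upper()
    if abbreviation is None:
        return None
    city = parts[0] if len(parts) > 1 else None
    return (city, abbreviation)
```

A record such as "Baltimore, Maryland" becomes `("Baltimore", "MD")`, and a state-only record keeps `None` for the city. Entries where a clinic or city sits in the wrong field would still need a separate check.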

Page 10

Data Metrics – Key Words

[Bar chart: tweet counts by Key Word and Hashtag (flu, Influenza, h1n1, H3N2, H5N1, Adenovirus); count axis from 0 to 30,000]

Page 11

Data Metrics - Modifiers

[Bar chart: tweet counts by Tweet Modifier; count axis from 0 to 3,000]

Page 12

Data Metrics – by State

[Bar chart: tweet counts by state, labeled with two-letter state abbreviations; count axis from 0 to 3,000]

Page 13

Data Metrics – by Quartile

[Chart: Total Tweets By Quartile; categories: Quartile 1, Quartile 2, Quartile 3, Quartile 4, No Data]

Page 14

Maps – All Tweets

Page 15

Map – Tweets January 5

Page 16

Map – CDC ILI January 5

Page 17

Map – Tweets January 12

Page 18

Map – CDC ILI January 12

Page 19

Map – Tweets January 19

Page 20

Map – CDC ILI January 19

Page 21

Map – Tweets January 26

Page 22

Map – CDC ILI January 26

Page 23

Map – Tweets February 2

Page 24

Map – CDC ILI February 2

Page 25

Conclusions

• Social Media can be used as a new tool in the Biosurveillance Toolkit

• Tweets are nearly evenly distributed between the Risk Quartiles

• Social Media shows trends that are reflected in the CDC Data

Page 26

Contact

Todd Barr

[email protected]