Reverse geocoding and implicaons for geospaal...

Post on 29-Jul-2020

9 views 0 download

Transcript of Reverse geocoding and implicaons for geospaal...

Reversegeocodingandimplica1onsforgeospa1alprivacy

PaulZandbergenDepartmentofGeographyUniversityofNewMexico

Outline

•  Geospatial privacy •  Geocoding / reverse geocoding •  Experimental design •  Results and Conclusion

Geospa1alPrivacy

•  Underincreasingthreatfromnewtechnologies,highresolu1ondataandeasy‐to‐usetools

How easy is it to hack this map? 

Geocoding

•  Widelyemployed

•  Wellunderstood•  Substan1alerrors

Toolofahacker:Reversegeocoding

•  Geocodinginreverse•  Rela1veeasy,rela1velynew•  Keytoolfor“hacking”publishedmaps•  Notwellunderstood

ID Address ZIP 101 123 Main St 12345 102 456 Central Ave 12346 … … …

Howtoprotectspa1alconfiden1ality?

•  Geographicmasking

•  Buthowtodothismosteffec1vely?•  NeedbeRerunderstandingofreversegeocoding

randompointwithincircle

randompointoncircle

randomdistanceanddirec1on

donutmasking

donutmaskingwithexclusion

ReviewoftheStateoftheArt

Panel of confidentiality issues arising from the integration of remotely sensed and self-identifying data

“No known technical strategy [….] for managing linked spatial-social data adequately resolves conflicts among the objectives of data linkage, open

access, data quality, and confidentiality protection across datasets and across

uses.” (Conclusion 3)

National Research Council, 2007

ResearchQues1ons

What are the capabili5es of reverse geocoding to iden5fy individuals from published loca5ons? 

How does this vary with the methods employed for geocoding and reverse geocoding? 

How does this vary with popula5on density? 

ExperimentalDesign

•  Actualbuildingloca1onswithknownaddresses–  TravisCounty,TX(Aus1n)–  Sampleof2,500residen1alloca1ons

–  Stra1fiedacross5popula1ondensityclasses•  Geocodeusing5differentgeocoders•  Reversegeocodeusing3differenttechniques•  Determineaccuracyofreversegeocoding

AddressPoints

Geocoders

•  TeleAtlasAddressPoints(commercial)•  TeleAtlasStreets(commercial)

•  GoogleMaps(freeAPI)

•  StreetMapUSAPro2007inArcGIS

•  Geoly1cs2007(usingTIGER2007data)

StudyArea–TravisCounty,TX

Popula1onDensityZones Sampleofresiden1aladdresspoints

ReverseGeocoding&Accuracy

•  Originalresiden1aladdresspoints–  Snaptonearestresiden1albuilding

•  TeleAtlasreversestreetgeocoding–  Submitforcommercialprocessing

•  GoogleMapsreversegeocoding–  FreeAPI

•  Accuracyofreversematches1.  Perfectmatch(streetnameandnumber)

2.  Closematch(numberwithin10)

3.  Samestreetonly

Results–MatchRates(%)

GeocodingTechnique Popula1ondensity(people/km2)

<50 50to250 250to1000 1000to2500 >2500 Total

TeleAtlasAP 43.6 70.2 91.8 92.8 93.6 78.4

TeleAtlasStreet 93.2 92.8 98.0 99.8 96.8 96.1

GoogleMaps 92.2 95.6 98.2 99.0 96.2 96.2

StreetMapPro 81.2 83.6 95.8 99.0 95.8 91.1

Geoly1cs 77.0 80.2 92.4 96.2 90.2 87.2

Combined 31.4 59.0 86.0 89.6 86.8 70.6

Results–SameStreetMatches

ReverseGeocodingTechnique

Aus1nAP GoogleMaps TeleAtlasStreet

GeocodingTechnique

Aus1nAP 100.0 96.7 90.1

TeleAtlasAP 99.5 92.0 89.6

GoogleMaps 99.2 32.0 90.5

TeleAtlasStreet 88.2 92.5 99.5

StreetMapPro 89.1 76.8 82.5

Geoly1cs 54.5 54.7 56.9

Results–CloseReverseMatches

ReverseGeocodingTechnique

Aus1nAP GoogleMaps TeleAtlasStreet

GeocodingTechnique

Aus1nAP 100.0 95.5 23.1

TeleAtlasAP 99.1 91.8 24.3

GoogleMaps 98.8 28.6 24.3

TeleAtlasStreet 69.8 63.2 92.5

StreetMapPro 60.8 42.9 44.0

Geoly1cs 33.8 27.4 29.3

Results–PerfectReverseMatches

ReverseGeocodingTechnique

Aus1nAP GoogleMaps TeleAtlasStreet

GeocodingTechnique

Aus1nAP 100.0 94.2 9.1

TeleAtlasAP 97.9 91.8 9.0

GoogleMaps 97.2 27.8 9.1

TeleAtlasStreet 18.7 9.4 55.0

StreetMapPro 17.0 7.3 8.2

Geoly1cs 7.1 2.3 2.6

EffectofPopula1onDensity–PercentPerfectMatches

Geocoding Reverse Popula1ondensity(people/km2)

<50 50to250 250to1000 1000to2500 >2500

StreetMapPro Aus1nAP 22.9 16.9 15.3 16.1 17.3

Geoly1cs Aus1nAP 14.0 7.1 7.0 5.4 6.5

Aus1nAP TeleAtlasStreet 2.5 6.8 11.6 11.4 8.3

TeleAtlasStreet TeleAtlasStreet 51.6 63.7 56.7 53.8 49.8

GoogleMaps TeleAtlasStreet 4.5 6.4 12.3 11.2 7.4

TeleAtlasStreet GoogleMaps 9.6 9.8 8.8 9.6 9.4

Geoly1cs GoogleMaps 1.9 2.0 2.6 2.5 2.3

ResultsSummary

•  Accuracyofreversegeocodingvariesgreatly•  Buildinglevel(reverse)geocodingistypicallymostaccurate•  Streetgeocodingisquitenoisy

–  Easytogettherightstreet–  Veryfewperfectmatches

•  Accuracyissubstan1allyimprovedifknowledgeoftheoriginalgeocodingtechniqueisavailable!

•  NoclearpaRernwithpopula1ondensity

RoopopGeocodinginGoogleMapsandVirtualEarth

Commercial Address Points - TeleAtlas

Source: TeleAtlas, 2009

Licensed to Google, Virtual Earth, ArcGIS Business Analyst, Pitney Bowes / Group 1 / MapInfo

51 million address points in the US

ReverseGeocoding

ReversegeocodingnowsupportedinGoogleMapsandMicrosopVirtualEarth

AlsosupportedinlatestversionofArcGIS–requiressomecustomiza1onorArcWebservices

Numerousfreeeasy‐to‐useonlineu1li1es

Conclusions

•  Accuracyofreversegeocoding–  Variesgreatlywithgeocoder/reversegeocodercombina1on

–  Between2%and98%perfectreversematches

•  Knowledgeoforiginalgeocodingmethodiscri1cal–  noisyresultsfromstreetgeocodingcanbereversecoded

•  Trends:–  Addresspointsarethenewstandardingeocoding–  Reversegeocodingisrela1velyeasy

•  Techniquestoprotectprivacymayneedtoassumeaworst‐casescenario:veryhighresolu1onaddressdata

FutureResearch

•  Replicateinotherstudyareas•  Examineurban/ruralgradientsmoreclosely•  Experimentwithdifferentmaskingtechniques•  Developaframeworkforspa1alκ‐anonymity

Acknowledgements

•  Na1onalScienceFounda1on•  UNMResearchAlloca1onCommiRee•  AmericanCivilLiber1esUnion