Understanding business location patterns through co-location pattern mining



Methods and Analysis Techniques

1. Candidate Generation Through Co-Location Pattern Mining:
● Used the concept of the participation index (pi) to find interesting co-locations
● Participation Index: Efficient, but many false positives
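The participation index used for candidate generation can be illustrated for a pattern of two features. The sketch below follows the general definition in Huang, Shekhar, & Xiong (2004): the participation ratio of a feature is the share of its instances that have at least one neighbor of the other feature, and the index is the smaller of the two ratios. The function name `participation_index`, the planar (x, y) coordinates, the neighbor radius, and the toy data are all assumptions made for illustration, not details taken from the poster.

```python
from math import hypot


def participation_index(points_a, points_b, radius):
    """Participation index of the two-feature pattern {A, B}.

    points_a, points_b: lists of (x, y) tuples in planar coordinates.
    radius: neighbor distance threshold, in the same units (assumed value).
    """
    if not points_a or not points_b:
        return 0.0

    def near(p, q):
        # Two instances are neighbors when they lie within `radius`.
        return hypot(p[0] - q[0], p[1] - q[1]) <= radius

    # Participation ratio: share of instances of one feature that have
    # at least one neighbor of the other feature.
    ratio_a = sum(any(near(a, b) for b in points_b) for a in points_a) / len(points_a)
    ratio_b = sum(any(near(b, a) for a in points_a) for b in points_b) / len(points_b)

    # The participation index is the minimum of the participation ratios.
    return min(ratio_a, ratio_b)


# Hypothetical toy coordinates: two of three A sites have a B neighbor,
# and both B sites have an A neighbor.
stores_a = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
stores_b = [(0.0, 0.5), (1.0, 0.5)]
pi = participation_index(stores_a, stores_b, radius=1.0)
```

A pair with a high index is only a candidate. Because the index produces many false positives, candidates are then checked with the statistical tests in step 2.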

2. Statistical Tests of Co-Location and Clustering Patterns:
● Ripley's K and Cross-K against Poisson Complete Spatial Randomness
● Ripley's K: $\hat{K}(d) = \lambda^{-1} \sum_{i \neq j} \frac{I(d_{ij} < d)}{n}$
● Cross-K: $\hat{K}(d) = \lambda_j^{-1} \sum_{i \neq j} \frac{I(d_{ij} < d)}{n_i}$
● K Functions: Computationally expensive, very accurate
● Autocorrelation vs. Business Decision

3. Monte-Carlo Simulation:

● Shuffle businesses among their location domain, calculating the K-function value each time
● Repeat 999 times; if the observed value is above the 95% confidence level, reject the null hypothesis and claim intentional clustering
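The Cross-K estimate and the Monte-Carlo test can be sketched together. The code below is a minimal illustration under stated assumptions, not the project's implementation: it uses the Cross-K form without edge correction, treats the "location domain" as the pooled set of observed sites of both types (random relabeling), and summarizes the 999 shuffles as a one-sided pseudo p-value. The function names, the study-area value, and the toy coordinates are hypothetical.

```python
import random
from math import hypot


def cross_k(points_i, points_j, d, area):
    """Cross-K estimate without edge correction:
    (1 / lambda_j) * sum over cross-type pairs of I(d_ij < d) / n_i."""
    lambda_j = len(points_j) / area  # intensity of type j (points per unit area)
    pairs = sum(
        1
        for (xi, yi) in points_i
        for (xj, yj) in points_j
        if hypot(xi - xj, yi - yj) < d
    )
    return pairs / (lambda_j * len(points_i))


def monte_carlo_cross_k(points_i, points_j, d, area, n_sim=999, seed=0):
    """Shuffle type labels over the pooled sites and compare the observed
    Cross-K value with the simulated ones (assumed null model)."""
    rng = random.Random(seed)
    observed = cross_k(points_i, points_j, d, area)
    pooled = list(points_i) + list(points_j)
    n_i = len(points_i)
    at_least = 0
    for _ in range(n_sim):
        rng.shuffle(pooled)  # reassign the two labels at random
        simulated = cross_k(pooled[:n_i], pooled[n_i:], d, area)
        if simulated >= observed:
            at_least += 1
    # One-sided pseudo p-value; the observed pattern counts as one sample.
    p_value = (1 + at_least) / (n_sim + 1)
    return observed, p_value


# Hypothetical toy data: 20 sites of type i on a coarse grid, each with a
# type-j site 0.5 units away, inside an assumed 50 x 40 study area.
sites_i = [(10.0 * (k % 5), 10.0 * (k // 5)) for k in range(20)]
sites_j = [(x + 0.5, y) for (x, y) in sites_i]
observed, p_value = monte_carlo_cross_k(sites_i, sites_j, d=1.0, area=2000.0)

# Reject the null hypothesis at the 95% level when p_value <= 0.05.
clustered = p_value <= 0.05
```

With 999 shuffles the smallest attainable pseudo p-value is 1/1000, so the 0.05 threshold corresponds to the observed value exceeding roughly 95% of the simulated values. The single-type Ripley's K test follows the same pattern with one set of points in place of two.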

References

Dixon, P. M. (2014). Ripley's K function. Wiley StatsRef: Statistics Reference Online.

Huang, Y., Shekhar, S., & Xiong, H. (2004). Discovering colocation patterns from spatial data sets: A general approach. IEEE Transactions on Knowledge and Data Engineering, 16(12), 1472-1485.

Metropolis, N., & Ulam, S. (1949). The Monte Carlo method. Journal of the American Statistical Association, 44(247), 335-341.

Discussion

1. Gas stations in general tend to be clustered or inconclusive. Major gas stations tend to be inconclusive or declustered. Small gas businesses lack access to extensive land choice and/or data.

2. Food locations tend to cluster due to competition or complementarity.

3. Food and bank locations may be clustered due to urban commercial region planning by cities.

Introduction and Background

Co-Location Pattern: A co-location is a set of spatial features that frequently occur together (e.g. Walmart and Subway).

Figure 1: Walmart and Subway Co-Location. Retrieved from http://www.kptv.com/story/33335477/affidavit-man-pointed-gun-at-subway-manager-inside-walmart-shot-at-pumpkins

Objectives and Significance: Discover previously unknown business location patterns among:

1. Specific brands within industries
2. Specific brands across industries
3. Different industries
4. Same industries

Help small businesses choose locations by revealing the location patterns of large brands.

Data Set: Data for the three largest cities in the US, obtained by querying the Google Places API. Heat maps of locations are shown in Figures 2, 3, and 4. Brightness indicates density.

Figure 2: New York City

Figure 3: Los Angeles

Figure 4: Chicago

Understanding business location patterns through co-location pattern mining

Jeffrey Chiu¹, Amin Vahedian Khezerlou², Xun Zhou²

¹Irvington High School; ²Department of Management Sciences, The University of Iowa

Figure 5: Example Monte-Carlo K Function. Retrieved from resources.esri.com/help/9.3/arcgisdesktop/com/gp_toolref/spatial_statistics_tools/

Figure 6: Flow of Analysis in this project

Results

Figure 7: Ripley's K Function for All Gas Stations and Major Gas Stations Only (a) NYC All (b) NYC Major (c) LA All (d) LA Major (f) Chicago Major (f) Chicago All

Figure 8: Cross-K Functions within the Food Industry (a) Chipotle/Le Pain Quotidien NYC (b) Juice Press/Le Pain Quotidien NYC (c) McDonald's/BR NYC (d) Dunkin/Subway NYC (d) Dunkin/Subway Chicago (e) Starbucks/Subway LA

Figure 9: Cross-K Function across Industries (a) Chipotle/HSBC Bank NYC (b) Jimmy John's/Chase Chicago (c) Starbucks/Wells Fargo LA

Figure 10: Cross-K Function between the Food and Bank Business Types (a) New York City (b) Los Angeles (c) Chicago

Further Directions

1. Find interesting co-location patterns in other cities and identify overarching patterns across cities

2. Work with economics researchers to develop economic theories behind co-location patterns

3. Submit for publication to an academic journal after further results and analysis

Acknowledgements

Special thanks to Amin and Dr. Zhou for their guidance on this project. Thanks to SSTP for allowing me to have such a wonderful opportunity.