Mobile Network Big Data for Smarter Urban & Transportaon...
Transcript of Mobile Network Big Data for Smarter Urban & Transportaon...
MobileNetworkBigDataforSmarterUrban&Transporta:onPlanning
ThisworkwascarriedoutwiththeaidofagrantfromtheInterna5onalDevelopmentResearchCentre,CanadaandtheDepartmentforInterna5onalDevelopmentUK..
SriganeshLokanathan
DataforDevelopmentParliamentHouse,Ulaanbaatar,Mongolia
24February2016
Catalyzing policy change through research to improve people’s lives in the emerging Asia Pacific by facilitating their use of hard and soft infrastructures through the use of knowledge, information and technology.
Ourmission
Wherewework
BigdataworkonlyinSriLankain2012-15HopetoextendtoBangladeshin2016-17
“Smartci5es,”thenewbuzzphrase• IBMhasbeenpromo5ngsmartci5esand big data
since the2000s
• But two visions – New smart cities created on green fields, like South
Korea’s Songdo, OR – Improving the functioning of existing cities
• We believe we canmakeourexis5ngci5essmart,onthecheapbyleveragingopendataandbigdata
4
Bigdata• Anall-encompassingtermforanycollec5onofdatasetssolargeand/orcomplexthatitbecomesdifficulttoprocessusingtradi5onaldataprocessingapplica5ons.
• Challengesinclude:analysis,capture,cura5on,search,sharing,storage,transfer,visualiza5on,andprivacyviola5ons.
• Examples:– 100millionCallDetailRecordsperdaygeneratedbySriLankacompanies
– 45TerabytesofdatafromHubbleTelescope
5
Ifwewantcomprehensivecoverageofthepopula5on,whatarethesourcesofbig
dataindevelopingeconomies?• Administra5vedata
– E.g.,digi5zedmedicalrecords,insurancerecords,taxrecords
• Commercialtransac5ons(transac5on-generateddata)– E.g.,Stockexchangedata,banktransac5ons,creditcardrecords,
supermarkettransac5onsconnectedbyloyaltycardnumber
• Sensorsandtrackingdevices– E.g.,roadandtrafficsensors,climatesensors,equipment&
infrastructuresensors,mobilephonescommunica5ngwithbasesta5ons,satellite/GPSdevices
• Onlineac5vi5es/socialmedia– E.g.,onlinesearchac5vity,onlinepageviews,blogs/FB/twiberposts
6
MobileNetworkBigDataisonlyop5onatthis5me
7
MobileSIMs/100 Internetusers/100 Facebookusers/100
Myanmar 50 2 12
Bangladesh 76 10 9
Pakistan 73 14 11
India 73 18 9
SriLanka 107 26 16
Philippines 112 40 41
Indonesia 125 17 25
Thailand 143 35 49
Mongolia 124 27 37
Source: ITU Measuring Information Society 2014; Facebook advertising portal
Mobilenetworkbigdata+otherdataèrich,5melyinsightsthatserveprivateaswellaspublicpurposes
8
Construct Behavioral Variables
1. Mobility variables 2. Social variables 3. Consumption variables
Other Data Sources
1. Data from Dept. of Census & Statistics
2. Transportation data 3. Health data 4. Financial data 5. Etc.
Dual purpose insights
Private purposes
1. Mobility & location based services
2. Financial services 3. Richer customer
profiles 4. Targeted
marketing 5. New VAS
Public purposes
1. Transportation & Urban planning
2. Crises response + DRR
3. Health services 4. Poverty mapping 5. Financial inclusion
Mobile network big data (CDRs, Internet access usage, airtime recharge
records)
Datausedintheresearch• Mul5plemobileoperatorsinSriLankahaveprovidedfourdifferenttypes
ofmeta-data– CallDetailRecords(CDRs)
• Recordsofcalls• SMS• Internetaccess
– Air5merechargerecords• DatasetsdonotincludeanyPersonallyIden5fiableInforma5on
– Allphonenumbersarepseudonymized– LIRNEasiadoesnotmaintainanymappingsofiden5fierstooriginalphone
numbers
• Cover50-60%ofusers;veryhighcoverageinWestern(whereColombothecapitalcityinlocated)&Northern(mostaffectedbycivilconflict)Provinces,basedoncorrela5onwithcensusdata
9
Howmobilenetworkoperatorsseetheworld
• CallDetailRecord(CDR)– Recordsofallcallsmadeandreceivedbyapersoncreatedmainlyfor
thepurposesofbilling– SimilarrecordsexistforallSMS-essentandreceivedaswellasforall
Internetsessions
– TheCellIDinturnhasalat-longposi5onassociatedwithit.
10
• Air:mereloadrecords– Recordsofallair5mereloadsperformedbyprepaidSIMs– Eachrowcorrespondstoarecordofoneperson’sac5vity:
11
Ananecdote:landusemapping
12
Hourlyloadingofmobilephonebasesta5onsrevealdis5nctpaberns
• Wecanusethisinsighttogroupbasesta5onsintodifferentgroups,usingunsupervisedmachinelearningtechniques
13
TypeY:?TypeX:?
Methodology
• The5meseriesofusersconnectedatabasesta5oncontainsvaria5ons,thatcanbegroupedbysimilarcharacteris5cs
• Amonthofdataiscollapsedintoanindica5veweek(SundaytoSaturday),withthe5meseriesnormalizedbythez-score
• PrincipalComponentAnalysis(PCA)isusedtoiden5fythediscriminantpabernsfromnoisy5meseriesdata
• Eachbasesta5on’spabernisfilteredinto15principalcomponents(covering95%ofthedataforthatbasesta5on)
• Usingthe15principalcomponents,weclusterallthebasesta5onsinto3clustersinanunsupervisedmannerusingk-meansalgorithm
14
Basedontheloadingpabernsofbasesta5onswecaninfer3differenttypesofspa5alclusters
forColomboDistrict
15
• Cluster-1exhibitspaNernsconsistentwithcommercialarea
• Cluster-3exhibitspaNernsconsistentwithresiden:alarea
• Cluster-2exhibitspaNernsmoreconsistentwithmixed-use
OurresultsshowCentralBusinessDistrict(CBD)inColombocityhasexpanded
16
SmallareainNEcornerofColomboDistrictclassifiedasbelongingtoCluster1?
17
Seethawaka Export Processing Zone
Photo ©Senanayaka Bandara - Panoramio
Internalvaria5onsinmixeduseregions:Morecommercialormoreresiden5al?
18 Bluedots:moreresiden5althancommercialReddots:morecommercialthanresiden5al
Understandingthecommercialtoresiden5alspectrum
19
Highly commercial
Highly residential
Landuseclustering-findingthenumberofclusters
• Numberoflandusepabernsisnotobvious– Someareasmayhaveuniquelandusepaberns
• Weareworkingonunpackingtheresiden5altocommercialspectrumfurther
• Wecaniden5fysignalsuniquetospecifictypesofac5vi5ese.g.sports
20
Calibra5ngMNBDbasedfindingswithground-truthlandusesurveydatacangreatlyimprove
theoverallanalyses• Supervisedlearningmethodscanbeusedtoinves5gatetherela5onshipbetweenzonallandusedataandmobileac5vitydata• Amodeltrainedonareaswhichhavelandusedataatahigherspa5al
resolu5oncanbeusedtopredictthelanduseofotherregions• Incorrectlylabeledcellscanrevealhowtheirzonallandusecategory
differfromhowtheyarebeingused
21
Available‘ground-truth’datasources• LandusemapscompiledbytheSurveyDepartment
– Lastupdatedin2010;2014/15updatenotyetready– Lackofdetailonurbanlandusecategories:
• Urbancommercialareasallcomeunderthecategoryof“built-upareas”• Urban,semi-urbanresiden5alareasallgroupedunderthecategory“homegardens”
• LandusemapscompiledbyUrbanDevelopmentAuthority(UDA)– Lastupdatedin2000;2005updateforDehiwala-MtLaviniaMC;
2014/15updatenotyetcomplete– Hasmorerelevantclassifica5onsforurbanareas,buts5lldon’tmatch
upwithplannedzoningmaps
22
23
2013MNBDanalyses 2020UDAplan 2000UDAlandusesurvey
2010SurveyDept.LandUseMap
Sohowareweproceedingnow?• WorkingwithUDAtocalibrateMNBDbasedresults
• Collabora5ngwithresearchersfromUniversityofMoratuwa’sTown&CountryPlanningDepartment:– Buildingimprovedpredic5vemodelsforspa5alchangesusingMNBD,
UDAdata,&SurveyDept.data
• Exploringtheuseofloca5onbasedsocialnetworkdata(eg:Foursquare)– biasedtowardsthepreferencesoftheusers(eg:more
restaurants/cafesthanofficebuildings)
24
Implica5onsforurbanpolicy• Almostreal-5memonitoringofurbanlanduse• Cancomplementinfrequentsurveys&alignmasterplantoreality
25
Anotherexample:transporta5oninsights
26
TheadvantageofMNBDisitshowsmobilityofpeopleacross5meandspace
27
Popula:ondensitychangesinColomboregion:weekday/weekendPicturesdepictthechangeinpopula:ondensityatapar:cular:merela:vetomidnight
28
Wee
kday
Su
nday
Decrease in Density Increase in Density
Time 18:30 Time 12:30 Time 06:30
46.9%ofColomboCity’sday5mepopula5oncomesfromsurroundingareas,butsomesurprises...
29
HomeDSD %ageofColombo’sday:mepopula:on
Colombocity 53.1
1. Maharagama 3.7
2. Kolonnawa 3.5
3. Kaduwela 3.34. SriJayawardanapura
Kobe 2.9
5. Dehiwala 2.6
6. Kesbewa 2.5
7. Wabala 2.5
8. Kelaniya 2.1
9. Ratmalana 2.0
10. Moratuwa 1.8
Colombo city is made up of Colombo and Thimbirigasyaya DSDs
Implica5onsforpublicpolicy
• Popula5onmapsandmobility/migra5onpabernsareessen5alpolicytools– Includedinmostcensusandhouseholdsurveys
• MNBDallowsustoimprove&extendmeasurement– Largesamplesizes– Veryfrequentmeasurement
• Moreprecisemeasures– Commu5ng/seasonal/long-termmigra5on
30
Implica5onsforpublicpolicy
• Improveplanningfor“spiky”eventsthatcreatepressureongovernmentandprivatelysuppliedservices– Humanitarianresponsetodisasters– Specialevents/holidays
• Urban&transporta5onplanning– Caniden5fyhighvolumetransportcorridorstopriori5zeforprovision
ofmasstransit– Mapdefactomunicipalboundaries
• Healthpolicy– Mobilitypabernsthatcanhelprespondtospreadofinfec5ous
diseases(e.g.,dengue,malaria,zika,etc.)
31
HowareourmobilityinsightscurrentlybeingusedbytheSriLankangovernment?
• InsightsdevelopedbyLIRNEasiausedbythegovernmentinformula5ngtheWesternRegionMegapolisPlan(WRMP)– LIRNEasiainsightsusedtosupportcaseforfurtherdevelopmentof
specificperipheralregionsintheprovince– Argumentsfortheneedtoalleviatedailyconges5oninColombocity
againuseLIRNEasiainsights
32
Whataresomeofthechallenges?
33
Challenge Solu:on(s)Nego5a5ngaccesstodata • Win-win;insights/techniquesforpublicpolicy
outputscanbeleveragedforprivatebusinessinterestsaswell
• Pro-ac5veac5onbyoperator(s)ratherthanreac5vetogrowinggovernmentinterestinusingsuchdata
Minimizingharmsfromdatasharing
• Developmentofself-regulatoryguidelinesforcompe55veindustries
• Datafrommonopolieshavelesscompe55veimplica5ons
• Useoflegalinstrumentsandtechnicalsolu5onsSkills • Assembleinterdisciplinaryteamsthataresuperiorto
whatconsultantscanofferResearchèpolicy • Policyenlightenmentasstep1
Advantagesofusingmobilenetworkbigdataindevelopmentalpolicy?
34
FeatureofMNBD BenefitHighfrequency • Cancomplementsurveysandcensuseswith
morefrequentmeasurements• “New”findings(e.g.hourlypopula5ondensity)
Nearreal5me • Createnearreal-5meproxyindicators(e.g.,economicac5vity,socio-economiclevels)atfinegeographicandtemporalresolu5on
• Respondtoreal-5meevents(Almost)universalcoverage
• Allowextrapola5onofsurveys• Captureinformal(economic)ac5vi5es
Thankyou.
Moreinfoat:hbp://lirneasia.net/projects/bd4d/
35