Knowledge Discovery from Mobile Phone Communication Activity Data Streams
Fergal Walsh
Data Stream
Research presented in this poster was funded by a Strategic Research Cluster Grant (07/SRC/I1168) by Science Foundation Ireland under the National Development Plan. The authors gratefully acknowledge this support.
Data Exploration
Processing pipeline: Raw CDR Data -> Stream Processor -> Indexed Database -> Exploratory Query Tool
Raw CDR Data: 1 week of data (> 200 million records)
Stream Processor: data stream processor for pre-processing each record and computing aggregates
Indexed Database: spatial, temporal and user indices for efficient querying
Exploratory Query Tool: web-based tool for ad hoc spatio-temporal queries
Figure: Communication event counts per cell per hour (weekday average), shown at 00:00, 08:00, 12:00 and 18:00
Figure: Trajectories of 2 sample users
Figure: Location of caller and callee for 2 sample users
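The per-cell hourly counts shown in the first figure could be computed in a single pass over the record stream. The following is a minimal sketch only: the record layout (a timestamp paired with a cell identifier) and the function name `weekday_average` are assumptions made for illustration, not the actual CDR schema or the stream processor used in this work.

```python
from collections import defaultdict
from datetime import datetime


def weekday_average(records):
    """Average number of events per (cell, hour of day) over the weekdays seen.

    `records` is any iterable of (timestamp, cell_id) pairs. This layout is
    hypothetical; the real CDR fields are not described on the poster.
    """
    totals = defaultdict(int)   # (cell_id, hour) -> event count
    weekdays_seen = set()       # distinct weekday dates present in the stream
    for timestamp, cell_id in records:
        if timestamp.weekday() < 5:  # Monday=0 ... Friday=4
            totals[(cell_id, timestamp.hour)] += 1
            weekdays_seen.add(timestamp.date())
    n_days = len(weekdays_seen) or 1  # avoid dividing by zero on an empty stream
    return {key: count / n_days for key, count in totals.items()}


# Toy stream, ordered by time as the real records are.
sample = [
    (datetime(2010, 3, 1, 8, 5), "cell_A"),    # Monday
    (datetime(2010, 3, 1, 8, 40), "cell_A"),   # Monday
    (datetime(2010, 3, 2, 8, 15), "cell_A"),   # Tuesday
    (datetime(2010, 3, 2, 18, 30), "cell_B"),  # Tuesday
    (datetime(2010, 3, 6, 8, 0), "cell_A"),    # Saturday, ignored
]
averages = weekday_average(sample)
```

Because each record updates only one counter, the aggregate never needs more than one record in memory at a time, which is what makes the time-ordered, independent records convenient for this style of processing.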
Anonymised Customer Data Records (CDR) from Meteor, Ireland’s 3rd largest mobile phone network
More than 1 million customers
One record per call/SMS sent or received
About 40 million records per day
Information retrieval using stream data mining and machine learning techniques
• Find users similar to some example users (classification using Support Vector Machines):
  • Users who travel from Maynooth to Dublin daily
  • Users who travel to Dublin from rural areas daily (using semantics of spatial areas)
  • Groups of users who are planning a meet-up (using communication motifs)
• Find areas with similar phone usage activity profiles (clustering):
  • Nightlife, business, residential, rural
• Find clusters of users with similar activity profiles (clustering)
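The poster does not say which clustering method is used. As one possible illustration, the sketch below groups 24-hour activity profiles with plain k-means, using a deterministic farthest-first initialisation. All function names and the toy profiles are hypothetical; normalising each profile to sum to 1 is an assumption made so that cells are compared by the shape of their daily activity rather than by volume.

```python
def normalise(profile):
    """Scale a profile so its values sum to 1 (compare shape, not volume)."""
    total = sum(profile)
    return [v / total for v in profile] if total else list(profile)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(profiles, k):
    """Farthest-first initialisation: deterministic, starts from the first profile."""
    centroids = [profiles[0]]
    while len(centroids) < k:
        farthest = max(
            profiles,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(farthest)
    return centroids


def kmeans(profiles, k, iterations=20):
    """Return one cluster label per profile."""
    centroids = init_centroids(profiles, k)
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy hourly event counts (hours 0..23) for four cells.
business_1 = [1] * 8 + [9] * 10 + [1] * 6
business_2 = [1] * 8 + [8] * 10 + [2] * 6
nightlife_1 = [6] * 4 + [1] * 16 + [7] * 4
nightlife_2 = [5] * 4 + [1] * 16 + [8] * 4

profiles = [normalise(p) for p in (business_1, business_2, nightlife_1, nightlife_2)]
labels = kmeans(profiles, k=2)
```

In this toy example the two daytime-heavy cells end up in one cluster and the two night-heavy cells in the other. The same routine would apply to per-user profiles, with the caveat that the number of clusters has to be chosen in advance.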
Development of i2maps (ncg.nuim.ie/i2maps/)
Future Work
Learn activity chains (probabilistic models) of each user's communication and movement events. These will use semantic labels rather than raw spatial locations.
Predict movement and communication events from learned models.
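The form of the probabilistic model is left open above. One simple candidate, shown here purely as an assumption, is a first-order Markov chain over semantic labels: transition probabilities are estimated from a user's past sequence of events, and the most probable successor is used as the prediction. The labels and function names below are hypothetical.

```python
from collections import Counter, defaultdict


def learn_chain(sequence):
    """Estimate first-order transition probabilities from one event sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


def predict_next(chain, state):
    """Return the most probable next label, or None if the state was never seen."""
    nexts = chain.get(state)
    if not nexts:
        return None
    return max(nexts, key=nexts.get)


# Toy sequence of semantic labels for one user (illustrative only).
events = ["home", "commute", "work", "commute", "home",
          "commute", "work", "nightlife", "home"]
chain = learn_chain(events)
prediction = predict_next(chain, "commute")
```

A first-order chain only looks one step back, so it cannot capture longer routines such as time-of-day effects; it is meant as the smallest example of learning and predicting from activity chains, not as the intended model.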
Current Work
About 7000 cells (spatial areas)
Cell areas range from <1 km² to ~50 km²
Publications
Pozdnoukhov A., Walsh F., Exploratory Novelty Identification in Human Activity Data Streams, ACM SIGSPATIAL International Workshop on GeoStreaming at 18th ACM SIGSPATIAL GIS, 2010.
Pozdnoukhov A., Walsh F., Kaiser F., Statistical Machine Learning from VGI, Position paper at Role of Volunteered Geographic Information in Advancing Science Workshop at GIScience'10, 2010.
Kaiser C., Walsh F., Farmer C. and Pozdnoukhov A., User-centric time-distance representation of road networks. In Springer LNCS proc. of the GIScience'10 (full paper). 2010.
Records are ordered by time and independent of each other, making this data ideally suited to stream processing.
Acknowledgements
The authors gratefully acknowledge the support of Meteor for providing the data used in this poster, in particular Mr. John Bathe and Mr. Adrian Whitwham.
Thanks to Ronan Farrell (IMWS) for obtaining the data from Meteor for StratAG.
Thanks to John Doyle for providing the cell tessellation used in the examples above.