Network Complexity and Spatio-Temporal Data Mining (STDM)
Dr Tao Cheng + STANDARD team
Senior Lecturer in GeoInformatics
Department of Civil, Environmental and Geomatic Engineering (CEGE)
University College London
Outline
• Nature of network complexity
• Its challenges for STDM
• Case studies from the STANDARD project
• Future directions for NC and STDM
Challenges - Network Complexity
1) Heterogeneity (structure & performance)
- nonlinearity
- nonstationarity (MAUP problem in GIS)
Great progress has been made in describing the structure (e.g. power laws) of ‘what is’, but how do we model and predict nonlinear and nonstationary performance?
Challenges - Network Complexity
2) Dynamics
- changes in physical structure (nodes & links)
  - implications for supply/capacity changes
- changes in movement patterns on the network (density/flow/speed; behaviour)
  - leads to changes in demand
Much progress in modelling supply-demand interactions at the macroscopic level, but
- lack of clarity about implications for individual behaviours and their collective effects;
- no readily available tools to demonstrate or capture the transition from free flow to congestion
Challenges - Network Complexity
3) Interactions & Associations
- spatial (upstream/downstream)
- temporal (past/present/future)
- spatio-temporal
- multiple factors (incidents, weather, big events, ...)
- multiple networks
We accommodate spatial or temporal associations (autocorrelations), but
- fail to treat spatial and temporal autocorrelation simultaneously
- fail to consider multiple networks
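A minimal sketch of what a joint upstream/downstream and past/present association can look like in practice: the correlation between an upstream link at time t and a downstream link at time t + lag. The function name, the two series and the lag range are illustrative assumptions, not part of the STANDARD project's tooling.

```python
from math import sqrt

def lagged_cross_correlation(upstream, downstream, lag):
    """Pearson correlation between upstream[t] and downstream[t + lag]."""
    length = min(len(upstream), len(downstream))
    if lag < 0 or lag >= length:
        raise ValueError("lag must satisfy 0 <= lag < series length")
    n = length - lag
    x = upstream[:n]
    y = downstream[lag:lag + n]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no association
    return cov / sqrt(var_x * var_y)

# Hypothetical travel times (seconds): the downstream link repeats the
# upstream pattern one interval later.
upstream = [60, 62, 90, 120, 95, 70, 61, 60]
downstream = [55, 56, 58, 86, 116, 91, 66, 57]

# The lag with the strongest correlation suggests how long a disturbance
# takes to travel between the two links.
best_lag = max(range(4),
               key=lambda k: lagged_cross_correlation(upstream, downstream, k))
```

Scanning over lags treats the spatial pairing and the temporal offset together, which is the simplest form of the spatio-temporal association listed on this slide.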
Research Frontiers in Network Complexity
1) Forecasting and prediction
- nonlinearity & nonstationarity
2) Tools to capture/illustrate the processes
- emergence and tipping points
- simulating behaviour (macroscopic properties alter because of accumulated microscopic changes)
3) Spatio-temporal dependence and interactions
- impact of activities on the network
- interactions between networks
Big Data – empirical theory and testing
• Short-term and long-term journey time prediction – STARIMA; ANN; kernel-based approach
• Early detection of traffic congestion – clustering: STC; STSS
• Interactive visualization of journey time reliability and traffic congestion – 2D (hotspot); 3D (wall map; isosurface)
• Simulation of non-recurrent congestion – agent-based simulation
• Intervention analysis (weather, tube strike, road works) – regression
STANDARD – Spatio-Temporal Analysis of Network Data and Route Dynamics: understanding traffic congestion in space-time
Space-time prediction & forecasting
The challenge lies in the nonstationarity (heterogeneity) and nonlinearity of space-time data.
Statistical Approaches
• STARIMA models
• space-time geostatistical models
• spatial panel data models
• space-time GWR
Calibrating the spatio-temporal autocorrelations is the bottleneck.
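To make the calibration problem concrete, here is a hedged sketch of the simplest space-time autoregressive form, z_i(t) = phi * z_i(t-1) + psi * (W z(t-1))_i, fitted by ordinary least squares. It is a deliberate simplification: it has no differencing and no moving-average terms, so it is not a full STARIMA model, and the weight matrix, function names and simulated data are assumptions made for illustration.

```python
def spatial_lag(weights, values):
    """Weighted sum of neighbouring values for each link (the product W z)."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]

def fit_star11(series, weights):
    """Least-squares estimates (phi, psi) for
    z_i(t) = phi * z_i(t-1) + psi * (W z(t-1))_i.

    series: list of time steps, each a list of mean-centred values per link.
    """
    sxx = sxy = syy = sxz = syz = 0.0
    for prev, curr in zip(series, series[1:]):
        lagged = spatial_lag(weights, prev)
        for own, nb, target in zip(prev, lagged, curr):
            sxx += own * own
            sxy += own * nb
            syy += nb * nb
            sxz += own * target
            syz += nb * target
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("regressors are collinear; phi and psi not identifiable")
    # Solve the 2x2 normal equations by Cramer's rule.
    phi = (sxz * syy - syz * sxy) / det
    psi = (syz * sxx - sxz * sxy) / det
    return phi, psi

def simulate(z0, weights, phi, psi, steps):
    """Noise-free simulation, used here only to check the fit."""
    series = [list(z0)]
    for _ in range(steps):
        prev = series[-1]
        lagged = spatial_lag(weights, prev)
        series.append([phi * o + psi * n for o, n in zip(prev, lagged)])
    return series

# Hypothetical three-link chain with row-normalised adjacency weights.
weights = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
series = simulate([1.0, -2.0, 0.5], weights, phi=0.5, psi=0.3, steps=6)
phi_hat, psi_hat = fit_star11(series, weights)
```

Even in this toy form the estimates depend entirely on the choice of W, which is one way of seeing why calibrating the spatio-temporal autocorrelation structure is the hard part.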
Machine Learning Approaches
• artificial neural networks (ANNs)
• self-organizing maps
• genetic algorithms
• support vector machines (SVMs)
• kernel-based approach
The interpretability of machine learning models is low.
Real-time traffic forecasting
James Haworth
Results – Root mean squared error (seconds/kilometre)

Interval     Naïve   ARIMA   STARIMA   LSTARIMA
5 minutes    49.4    47.4    55.9      46
15 minutes   74.7    68.7    89.1      67.3
30 minutes   93.2    82.1    109       80
James Haworth & Jiaqiu Wang: Space-Time Modelling and Prediction
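For readers unfamiliar with the error measure in the table, the sketch below computes a root mean squared error for a naive forecast. The slides do not define the naive benchmark, so "repeat the last observation" is an assumption here, and the journey times are made-up values rather than project data.

```python
from math import sqrt

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical unit journey times in seconds/kilometre.
journey_times = [100.0, 110.0, 130.0, 125.0, 105.0]

# Assumed naive forecast: the next value equals the last observed value.
naive_forecast = journey_times[:-1]
actual = journey_times[1:]
error = rmse(actual, naive_forecast)
```

Because the errors are squared before averaging, a few large misses (for example at the onset of congestion) dominate the score, which is worth keeping in mind when comparing models across forecast intervals.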
Space-time clustering
• To extract meaningful patterns (clusters)
• To detect outliers or emerging phenomena (epidemic outbreaks or traffic congestion)
• The most difficult challenge in spatio-temporal clustering is to consider the spatial, temporal and thematic attributes seamlessly and simultaneously, together with the dynamics in the data
• Spatio-temporal scan statistics (STSS) shed light on this aspect
• Efforts are needed to improve the computational efficiency and to reduce the false alarm rate of STSS
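As a rough illustration of treating space and time together, the sketch below groups congested (link, time) cells into clusters by connectivity: two cells join the same cluster if they are the same link at consecutive times, or adjacent links at the same time. This is a simple connected-components stand-in, not the scan statistic (STSS) named above, and the adjacency map and congestion flags are hypothetical.

```python
from collections import deque

def space_time_clusters(congested, adjacency):
    """Group congested (link, time) cells into connected space-time clusters.

    congested: set of (link, t) tuples already flagged as congested.
    adjacency: dict mapping each link to the set of its neighbouring links.
    """
    unvisited = set(congested)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster = {seed}
        queue = deque([seed])
        while queue:
            link, t = queue.popleft()
            # Temporal neighbours on the same link, spatial neighbours at time t.
            candidates = [(link, t - 1), (link, t + 1)]
            candidates += [(nb, t) for nb in adjacency.get(link, ())]
            for cell in candidates:
                if cell in unvisited:
                    unvisited.remove(cell)
                    cluster.add(cell)
                    queue.append(cell)
        clusters.append(cluster)
    return clusters

# Hypothetical road links A-B-C in a chain, plus an isolated link D.
adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
congested = {("A", 1), ("A", 2), ("B", 2), ("C", 2), ("D", 7)}

clusters = sorted(space_time_clusters(congested, adjacency),
                  key=len, reverse=True)
```

A single isolated cell such as ("D", 7) is the kind of result that inflates false alarms; a minimum cluster size or duration is one plausible filter, though the threshold would need calibrating on real data.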
Clusters of Congestion 25 May 2010 – State Opening of Parliament
Berk Anbaroglu - STSS for early detection of non-recurrent traffic congestion
Space-time visualisation
Explores the patterns hidden in large data sets
• using advanced (analytical) visualization and animation
  – static 2D maps
  – 3D wall maps and isosurfaces (hotspots in space-time)
• Tools: “Visual Analytics” and “Geovisual Analytics”
• Real-time visualization of dynamic processes is still very challenging due to the large volume and high dimensionality of the data.
• Methods are needed to show evolution and dissipation in space and time simultaneously (e.g. crime or traffic congestion)
Space-Time Visualisation: data -> process, story
Traffic congestion in space-time (1)
Cheng, Emmonds, Tanaksaranond, Sonoiki (2010), Multi-Scale Visualisation of Inbound and Outbound Traffic Delays in London, The Cartographic Journal, 47: 323–329.
Visualization of traffic congestion in space-time (2)
[Figure: 3D wall maps of inbound roads on 6th – 7th September 2010; panels: top view, side view, isosurface]
Visualising Congestion Build-up in London: 3D Wall Map Travel Time Interactive Visualization Tool
Garavig Tanaksaranond – Space-Time Visualisation of Traffic Congestion
• Understanding formation of congestion through the behaviour of individual drivers
• How do drivers react when faced with road closure?
• Depends on the urban environment, individual knowledge of the network and conditions, and behaviour of others
• Behaviour of individuals (microscopic behaviour) influences the formation and movement of congestion (macroscopic phenomena)
(Manley & Cheng, 2010)
Space-Time Multi-Agent Simulation
[Figure: Spread of congestion; map labels: Regent’s Park, Hyde Park; legend: link saturation in classes from 0 – 0.2 up to > 1.5]
Ed Manley – Agent-based Simulation
Machine Learning
Location Information
GPS
Mode of Transport & Stops
http://www.homepages.ucl.ac.uk/~ucesadb/video.html
GPS testing data: 110 participants, 2 months per participant, 20-second collection rate. All participants based in Greater London.
Adel Bolbol Fernandez - Understanding Travel Behaviours from GPS Data Logs
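A small sketch of the kind of feature such GPS logs yield: speed between consecutive fixes at the 20-second collection rate, followed by a crude stop/move label. The slides point to machine learning for classifying transport mode and stops; the fixed speed threshold below is only a placeholder for that classifier, and the coordinates, threshold and function names are assumptions.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

def segment_speeds(fixes, interval_s=20.0):
    """Speed in m/s between consecutive fixes sampled every interval_s seconds."""
    return [haversine_m(*a, *b) / interval_s for a, b in zip(fixes, fixes[1:])]

def label_segments(speeds, stop_threshold=0.5):
    """Label each segment 'stop' or 'move' using an assumed speed threshold (m/s)."""
    return ["stop" if v < stop_threshold else "move" for v in speeds]

# Hypothetical fixes in central London: stationary, then walking pace,
# then roughly 10 m/s northwards.
fixes = [(51.5246, -0.1340), (51.5246, -0.1340),
         (51.5250, -0.1340), (51.5268, -0.1340)]
speeds = segment_speeds(fixes)
labels = label_segments(speeds)
```

Real logs would also need handling for GPS jitter and signal loss (for example underground), which a single threshold cannot distinguish from a genuine stop.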
Future Directions of STDM/NC (1)
• New methods and theory are needed for mining crowd-sourced data contributed by citizens and volunteers, including social media data
  – often extremely noisy, biased and nonstationary, e.g. trajectory data
  – methods are needed to combine text mining with STDM
  – this area is relevant to the recent development of citizen science, and VGI in particular
• Theory and methods need to be developed to extract meaningful patterns from those individual sensors and place them within the framework of networks and network complexity, such as the transport and social networks made up of those individuals.
• Within a network, interactions and dynamic flows should be considered when mining spatio-temporal patterns.
• This aspect is relevant to complexity theory, and network dynamics in particular.
Future Directions (cont.)
• STDM for emergence and tipping points, i.e. how to generate actionable knowledge by finding the emergent patterns and tipping points of economics and epidemics?
• It is important to find outliers, but it is more important to find the critical points before the system breaks down, so that mitigating action can be taken to avoid worst-case scenarios such as traffic congestion and epidemic transmission.
• Another challenge of STDM is how to calibrate, explain and validate the knowledge extracted.
• A good example of this is the calibration of spatial (or spatio-temporal) autocorrelation. Higher-order spatial autocorrelation models have been developed, but pitfalls have also been found (LeSage and Pace 2011).
• This makes machine learning more promising in future STDM.
Future Directions (cont.)
• Grid computation and cloud computation
  – key for scaling the algorithms to large networks
• Open sources (data + software + algorithms)
• Online computation
• Real-time computation
• More systematic applications
  – CPC
• …
Acknowledgements
http://standard.cege.ucl.ac.uk
+ Dr Andy Chow + Colleagues in TfL