Understanding User Mobility Based on GPS Data
Yu Zheng, Microsoft Research Asia
Outline
• Introduction
• Architecture
• Walk-Based Segmentation
• Feature Extraction
• Graph-Based Post-Processing
• Experiments
• Conclusion
Introduction (1)
• Goal & Results: Inferring transportation modes from raw GPS data
– Differentiate driving, riding a bike, taking a bus, and walking
– Achieve an inference accuracy of 0.75 (independent of other sensor data)
[Figure: users, GPS log, inference model]
Introduction (2)
• Motivation
– For users:
  • Reflect on past events and understand their own life patterns
  • Obtain more reference knowledge from others’ experiences
– For service providers:
  • Classify trajectories of different transportation modes
  • Enable smart-route design and recommendation
• Difficulty
– Velocity-based methods cannot handle this problem well (<0.5 accuracy)
– People usually change transportation modes within a trip
– The observation of a mode is vulnerable to traffic conditions and weather
Introduction (3)
• Contributions and insights
– A change point-based segmentation method
  • Walking is a transition between different transportation modes
  • Handles congestion to some extent
– A set of sophisticated features
  • Robust to traffic conditions
  • Fed into a supervised learning-based inference model
– Graph-based post-processing
  • Considers typical user behavior
  • Employs location constraints of the real world
• WWW 2008 (first version)
Architecture
[Figure: system architecture with two parts.
Offline learning: training data, segmentation, feature extraction, model training, inference model; change point clustering, graph building, knowledge extraction, spatial indexing, spatially indexed knowledge.
Online inference: test data, segmentation, feature extraction, inference, post-processing with spatial knowledge, transportation modes.]
Walk-Based Segmentation
• Commonsense knowledge from the real world
– Typically, people need to walk before transferring between transportation modes
– Typically, people need to stop and then go when transferring between modes
Walk-Based Segmentation
• Change point-based segmentation algorithm
– Step 1: Classify every GPS point as a possible Walk Point or a non-Walk Point.
– Step 2: Merge short segments composed of consecutive Walk Points or non-Walk Points.
– Step 3: Merge consecutive Uncertain Segments into a non-Walk Segment.
– Step 4: The end points of each Walk Segment are potential change points.
[Figure: segmentation example on a trajectory with Car, Walk, and Bus parts, in panels (a), (b), (c), showing backward and forward merging, Certain Segments, and 3 Uncertain Segments.]
Legend:
– Non-walk point: P.V > Vt or P.a > at
– Possible walk point: P.V < Vt and P.a < at
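The steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the threshold values `v_t` and `a_t`, the point format (dicts with `v` and `a`), and measuring "short" by point count (`min_len`) are all assumptions, and Step 3 (merging Uncertain Segments) is left out for brevity.

```python
from itertools import groupby

def label_points(points, v_t=1.8, a_t=0.6):
    """Step 1: True = possible walk point (v < Vt and a < at), False = non-walk point.
    The default thresholds are hypothetical values, not taken from the slides."""
    return [p["v"] < v_t and p["a"] < a_t for p in points]

def runs(labels):
    """Collapse consecutive equal labels into [label, length] runs."""
    return [[lab, len(list(group))] for lab, group in groupby(labels)]

def merge_short_runs(segments, min_len=3):
    """Step 2 (simplified): fold any run shorter than min_len points into the
    previous run, then join neighbouring runs that end up with the same label."""
    merged = []
    for lab, n in segments:
        if merged and n < min_len:
            merged[-1][1] += n          # too short: absorb into previous run
        elif merged and merged[-1][0] == lab:
            merged[-1][1] += n          # same label as previous run: join
        else:
            merged.append([lab, n])
    return merged

def change_points(segments):
    """Step 4: first and last point index of every walk segment."""
    cps, start = [], 0
    for is_walk, n in segments:
        if is_walk:
            cps.extend([start, start + n - 1])
        start += n
    return cps

# Toy trajectory: driving, one slow point (e.g. congestion), driving, walking, cycling.
pts = ([{"v": 10.0, "a": 0.1}] * 5 + [{"v": 1.0, "a": 0.1}]
       + [{"v": 10.0, "a": 0.1}] * 4 + [{"v": 1.2, "a": 0.1}] * 6
       + [{"v": 4.0, "a": 0.2}] * 5)
segments = merge_short_runs(runs(label_points(pts)))
print(segments, change_points(segments))
```

In this toy run the single slow point inside the driving part is absorbed into the surrounding non-walk segment, so only the real walking part yields change points.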
Feature Extraction (1)
• Features
Category           Feature  Significance
Basic Features     Dist     Distance of a segment
                   MaxVi    The ith maximal velocity of a segment
                   MaxAi    The ith maximal acceleration of a segment
                   AV       Average velocity of a segment
                   EV       Expectation of velocity of GPS points in a segment
                   DV       Variance of velocity of GPS points in a segment
Advanced Features  HCR      Heading Change Rate
                   SR       Stop Rate
                   VCR      Velocity Change Rate
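As a rough illustration of the three advanced features, the sketch below assumes each rate is the number of points exceeding a threshold divided by the segment distance. That definition, the thresholds `h_c`, `v_s`, `v_r`, and the point fields (`v`, `head`, `d`) are assumptions made for this example; the slides only name the features.

```python
def advanced_features(points, h_c=19.0, v_s=3.4, v_r=0.26):
    """Compute HCR, SR and VCR for one segment.

    points: list of dicts with
      'v'    velocity (m/s),
      'head' heading (degrees),
      'd'    distance from the previous point (metres).
    All thresholds are hypothetical defaults.
    """
    distance = sum(p["d"] for p in points[1:])
    if distance == 0:
        return {"HCR": 0.0, "SR": 0.0, "VCR": 0.0}

    heading_changes = stops = velocity_changes = 0
    for prev, cur in zip(points, points[1:]):
        # Smallest angle between the two headings, in [0, 180].
        turn = abs(cur["head"] - prev["head"]) % 360
        turn = min(turn, 360 - turn)
        if turn > h_c:
            heading_changes += 1
        if cur["v"] < v_s:
            stops += 1
        if prev["v"] > 0 and abs(cur["v"] - prev["v"]) / prev["v"] > v_r:
            velocity_changes += 1

    return {
        "HCR": heading_changes / distance,
        "SR": stops / distance,
        "VCR": velocity_changes / distance,
    }

segment = [
    {"v": 5.0, "head": 0.0, "d": 0.0},
    {"v": 5.0, "head": 90.0, "d": 10.0},   # sharp turn
    {"v": 1.0, "head": 90.0, "d": 10.0},   # slows down to a near stop
]
features = advanced_features(segment)
print(features)
```

Normalizing by distance rather than by point count keeps the features comparable across segments sampled at different rates, which is one plausible reason for this design.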
Feature Extraction (2)
• Our features are more discriminative than velocity
– Heading Change Rate (HCR)
– Stop Rate (SR)
– Velocity Change Rate (VCR)
– >65 accuracy
[Figure: heading change H1 across points p1, p2, p3 (p1.head, p2.head); velocity change between p1.V1 and p2.V2 over distance L1 and time T1; velocity versus distance for (a) Driving, (b) Bus, (c) Walking, each with the threshold Vs.]
Graph-Based Post-Processing (1)
• Can location constraints improve the inference performance?
[Figure: example location constraints: bus stop, traffic light, crossroad.]
Graph-Based Post-Processing (2)
• Transition probability between different transportation modes
– P(Bike|Walk) and P(Bike|Driving)

Segment       Ground truth  Inference result
Segment[i-1]  Driving       P(Driving): 75%, P(Bus): 10%, P(Bike): 8%, P(Walk): 7%
Segment[i]    Walk          P(Bike): 62%, P(Walk): 24%, P(Bus): 8%, P(Driving): 6%
Segment[i+1]  Bike          P(Bike): 40%, P(Walk): 30%, P(Bus): 20%, P(Driving): 10%

Transitions: P(Walk|Driving) from Segment[i-1] to Segment[i]; P(Bike|Walk) from Segment[i] to Segment[i+1]

Segment[i].P(Bike) = Segment[i].P(Bike) * P(Bike|Driving)
Segment[i].P(Walk) = Segment[i].P(Walk) * P(Walk|Driving)
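The update above can be tried on the Segment[i] numbers from this slide. In the sketch below the transition probabilities in `TRANSITION` are made-up values for illustration only; the slides do not give them.

```python
def apply_transition(posterior, prev_mode, transition):
    """Multiply each P(mode) of a segment by P(mode | prev_mode)."""
    return {mode: p * transition[prev_mode][mode] for mode, p in posterior.items()}

# Hypothetical transition probabilities P(mode | Driving); not from the slides.
TRANSITION = {
    "Driving": {"Walk": 0.85, "Bus": 0.05, "Bike": 0.05, "Driving": 0.05},
}

# Inference result for Segment[i] as shown on the slide.
segment_i = {"Bike": 0.62, "Walk": 0.24, "Bus": 0.08, "Driving": 0.06}

# Segment[i-1] was confidently inferred as Driving, so condition on it.
adjusted = apply_transition(segment_i, "Driving", TRANSITION)
best = max(adjusted, key=adjusted.get)
print(best, adjusted)
```

With these assumed transitions the adjusted scores favor Walk over Bike, which matches the ground truth for Segment[i]. The scores are left unnormalized, as in the slide's formulas.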
Graph-Based Post-Processing (3)
M = {Driving, Walk, Bike, Bus}; e.g., P(M0) = P(Driving) and P(M3|M1) = P(Bus|Walk)
[Figure: building the graph in four steps over nodes N1 to N8.
(1) Change points and start/end points (legend: a start or end point; a change point)
(2) Building graph
(3) Spatial indexing
(4) Probability calculation, e.g., edge probabilities P18(Mi), P85(Mi), P54(Mi) and transition probabilities P185(Mi|Mj), P854(Mi|Mj), P581(Mi|Mj), P458(Mi|Mj)]
• Mine an implied road network from users’ GPS logs
– Use the location constraints and typical user behaviors as probabilistic cues
– Independent of map information
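One way to picture the probability calculation on the mined graph is simple counting over labeled segments. The sketch below assumes that change points have already been clustered into nodes, that edges are directed, and that every labeled segment is given as a hypothetical `(start_node, end_node, mode)` tuple; none of these details are specified in the slides.

```python
from collections import Counter, defaultdict

def normalize(counter):
    """Turn a Counter of mode counts into a probability dict."""
    total = sum(counter.values())
    return {mode: count / total for mode, count in counter.items()}

def edge_mode_probabilities(trajectories):
    """P_ij(M): distribution of modes observed on each edge (i, j)."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for i, j, mode in traj:
            counts[(i, j)][mode] += 1
    return {edge: normalize(c) for edge, c in counts.items()}

def transition_probabilities(trajectories):
    """P_ijk(M_next | M_prev): mode on edge (j, k) given the mode on edge (i, j)."""
    counts = defaultdict(lambda: defaultdict(Counter))
    for traj in trajectories:
        for (i, j, m_prev), (j2, k, m_next) in zip(traj, traj[1:]):
            if j == j2:  # only count segments that really meet at node j
                counts[(i, j, k)][m_prev][m_next] += 1
    return {
        path: {m_prev: normalize(c) for m_prev, c in by_prev.items()}
        for path, by_prev in counts.items()
    }

# Toy labeled trajectories passing N1 -> N8 -> N5.
trajectories = [
    [("N1", "N8", "Driving"), ("N8", "N5", "Walk")],
    [("N1", "N8", "Driving"), ("N8", "N5", "Walk")],
    [("N1", "N8", "Bus"), ("N8", "N5", "Walk")],
    [("N1", "N8", "Driving"), ("N8", "N5", "Bike")],
]
edge_probs = edge_mode_probabilities(trajectories)
trans_probs = transition_probabilities(trajectories)
print(edge_probs[("N1", "N8")])
print(trans_probs[("N1", "N8", "N5")])
```

Because the counts come only from users' logs, no map data is needed, which is the point of the last bullet above.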
Graph-Based Post-Processing (4)
Inference model: features X give the posterior probability P(mi | X)
Knowledge mining: labeled data give the posterior probability P(mi | Eij) and the prior probability P(mi)
Final result: P(mi | X, Eij) = P(mi | X) P(mi | Eij) / P(mi)
[Figure: post-processing flowchart with the following steps and decisions: segments of a GPS trajectory; search spatial index; found in graph?; normal post-processing; is maxProb > T1?; transition probability-based enhancement; is maxProb < T2?; prior probability-based enhancement; output the mode with maxProb as the result; end.]
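The final-result formula P(mi | X, Eij) = P(mi | X) P(mi | Eij) / P(mi) can be illustrated numerically. In the sketch below the edge posterior and the prior are invented values, all priors are assumed to be positive, and the renormalization step is an addition for readability rather than part of the slide's formula.

```python
def combine(p_given_x, p_given_edge, prior):
    """Score each mode as P(m|X) * P(m|E) / P(m), then renormalize to sum to 1.
    Assumes every prior is strictly positive."""
    scores = {m: p_given_x[m] * p_given_edge[m] / prior[m] for m in p_given_x}
    total = sum(scores.values())
    return {m: s / total for m, s in scores.items()}

# Posterior from the inference model (Segment[i] values from the earlier slide).
p_given_x = {"Bike": 0.62, "Walk": 0.24, "Bus": 0.08, "Driving": 0.06}
# Hypothetical knowledge mined for the edge this segment falls on.
p_given_edge = {"Bike": 0.10, "Walk": 0.70, "Bus": 0.10, "Driving": 0.10}
# Hypothetical uniform prior over the four modes.
prior = {"Bike": 0.25, "Walk": 0.25, "Bus": 0.25, "Driving": 0.25}

final = combine(p_given_x, p_given_edge, prior)
best_mode = max(final, key=final.get)
print(best_mode, final)
```

Dividing by the prior prevents a mode that is simply common overall from being counted twice, once in each posterior.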
Experiments (1)
• Framework of Experiments
[Figure: experiment framework with the following components: data and devices; inference model based on decision tree; basic features and new features; change point-based segmentation method; normal post-processing and graph-based post-processing.]
Experiments (2)
• Single Feature Exploration

Rank  Feature  AS     AD
1     HCR      0.345  0.561
2     SR       0.335  0.561
3     AV       0.382  0.547
4     VCR      0.336  0.526
5     EV       0.375  0.523
6     Dist     0.302  0.499
7     MaxV3    0.334  0.365
8     DV       0.269  0.357
9     MaxV2    0.322  0.344
10    MaxV1    0.294  0.257
11    MaxA2    0.239  0.217
12    MaxA1    0.259  0.208
13    MaxA3    0.256  0.197
• Feature Combinations

                                    Transportation mode  Change point
Feature combination                 AS      AD           Precision  Recall
MaxA1 + MaxA2 + MaxA3               0.297   0.283        0.118      0.584
MaxV1 + MaxV2 + MaxV3               0.480   0.526        0.142      0.687
Distance + EV + AV                  0.480   0.550        0.227      0.582
Distance + EV + MaxV1               0.548   0.597        0.217      0.55
AV + EV + MaxV1                     0.558   0.621        0.253      0.603
MaxV3 + MaxA3 + AV                  0.511   0.632        0.138      0.669
SR + HCR + VCR                      0.575   0.644        0.286      0.643
Basic Features                      0.618   0.673        0.284      0.681
Basic Features + Advanced Features  0.635   0.715        0.373      0.724
Method                            AD     CP/P   CP/R
Enhanced Features (EF)            0.728  0.491  0.817
EF + normal post-processing       0.741  0.508  0.818
EF + graph-based post-processing  0.762  0.516  0.818
Confusion matrix (rows: ground truth; columns: predicted results, in KM)

Ground truth  Walk    Driving  Bus     Bike    Recall
Walk          1026.4  122.1    386.5   357.3   0.543
Driving       42.6    2477.3   458.5   235.1   0.771
Bus           34.8    164.7    1752.4  46.2    0.877
Bike          49.3    113.5    31.9    1234.3  0.864
Precision     0.891   0.861    0.666   0.659   0.762
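The recall, precision, and overall figures can be recomputed from the distances in the matrix. The sketch below assumes rows are ground truth and columns are predictions; under that reading the recomputed values agree with the reported ones to within about 0.002.

```python
MODES = ["Walk", "Driving", "Bus", "Bike"]

# Distances in KM; rows = ground truth, columns = predicted mode.
MATRIX = [
    [1026.4, 122.1, 386.5, 357.3],
    [42.6, 2477.3, 458.5, 235.1],
    [34.8, 164.7, 1752.4, 46.2],
    [49.3, 113.5, 31.9, 1234.3],
]

def recall(matrix):
    """Correctly predicted distance divided by the ground-truth total of each row."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

def precision(matrix):
    """Correctly predicted distance divided by the predicted total of each column."""
    return [matrix[i][i] / sum(row[i] for row in matrix) for i in range(len(matrix))]

def accuracy(matrix):
    """Diagonal total divided by the grand total."""
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / sum(sum(row) for row in matrix)

rec = recall(MATRIX)
prec = precision(MATRIX)
acc = accuracy(MATRIX)
for mode, r, p in zip(MODES, rec, prec):
    print(f"{mode:8s} recall={r:.3f} precision={p:.3f}")
print(f"overall accuracy={acc:.3f}")
```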