Chris Andrews Georgia Institute of Technology B.S. Computer Science 5 th Year Undergraduate...

Post on 31-Dec-2015


Chris Andrews

Georgia Institute of Technology

B.S. Computer Science

5th Year Undergraduate

Trajectory Pattern Mining

Fosca Giannotti, Mirco Nanni, Dino Pedreschi, Fabio Pinelli

Concepts

Analyze the trajectories of moving objects

Example: A -(3 min)-> B -(5 min)-> C -(10 min)-> D

Trajectory Patterns: descriptions of frequent behavior relating to both space and time

Frequent Sequence Pattern (FSP)

Determine whether a trajectory sequence matches any trajectory pattern in a given set

Study different methods of preparing a Temporally Annotated Sequence (TAS) for data mining

Trajectory Patterns (T-Patterns)

Trajectory: sequence of time-stamped locations

S = { (x0, y0, t0), …, (xn, yn, tn) }

Temporal Annotation: set of transition times between consecutive locations

A = { a1, a2, …, an }

Temporally Annotated Sequence

(S, A) = (x0, y0) a1 (x1, y1) a2 … an (xn, yn)
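A minimal Python sketch of how a time-stamped trajectory might be split into a location sequence and its annotations. It assumes each annotation is the time elapsed between two consecutive locations; the function name `to_tas` is hypothetical and does not come from the slides.

```python
def to_tas(trajectory):
    """Split a list of (x, y, t) triples into (locations, annotations).

    Assumption: annotation a_i is the elapsed time between
    location i-1 and location i.
    """
    locations = [(x, y) for x, y, _ in trajectory]
    annotations = [
        t_next - t_prev
        for (_, _, t_prev), (_, _, t_next) in zip(trajectory, trajectory[1:])
    ]
    return locations, annotations


# Four stops with 3, 5 and 10 minutes between them.
locations, annotations = to_tas([(0, 0, 0), (1, 0, 3), (2, 0, 8), (3, 0, 18)])
```

A trajectory with n + 1 points yields n annotations, matching the shape (x0, y0) a1 (x1, y1) … an (xn, yn).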

Neighborhood Function

N : R² → P(R²)

Calculates spatial containment of regions

Input: a point; output: its enclosing Region of Interest

Defines the proximity necessary to fall into a region

Parameters:

e: radius, or necessary proximity, of points
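One possible reading of the neighborhood function, sketched in Python. It assumes a circular neighborhood under Euclidean distance, which the slides do not specify, and the helper name `make_neighborhood` is hypothetical.

```python
import math


def make_neighborhood(e):
    """Return a membership test for the neighborhood of radius e.

    Assumption: the neighborhood of a point is the disc of radius e
    around it (Euclidean distance).
    """
    def contains(center, point):
        return math.dist(center, point) <= e
    return contains


in_range = make_neighborhood(1.0)
near = in_range((0.0, 0.0), (0.5, 0.5))   # inside the disc
far = in_range((0.0, 0.0), (1.0, 1.0))    # outside the disc
```

Returning a membership test avoids materializing the (infinite) set of points that N maps to.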

Regions of Interest (RoI)

Performing these comparisons on individual points is costly

A simple preprocessing step can alleviate this:

Utilize the Neighborhood Function NR()

Translate each set of points into regions

Select the timestamp at which the trajectory first entered the region

Then compare sequences of regions and timestamps using the TAS mining algorithm presented in [2].

Static RoI

Neighborhood Function NR() initially receives a set R of disjoint spatial regions

The regions in R are predefined based on prior knowledge

Each represents a relevant place for processing

Static NR() simplifies the problem of mining patterns

Sequences of points become grouped

Result: a sequence of regions

(x, y) a1 (x', y') becomes X a1 Y
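The translation from points to regions can be sketched as follows. This is an illustration only: it assumes regions are axis-aligned rectangles given as (xmin, ymin, xmax, ymax), that points outside every region are dropped, and that the timestamp kept is the one at which the trajectory first entered the region. The function name `translate` is hypothetical.

```python
def translate(trajectory, regions):
    """Map a list of (x, y, t) points to a list of (region_name, entry_time).

    regions: dict of name -> (xmin, ymin, xmax, ymax), assumed disjoint.
    Consecutive points in the same region collapse into one visit.
    """
    visits = []
    current = None  # region containing the previous point, if any
    for x, y, t in trajectory:
        name = next(
            (n for n, (x0, y0, x1, y1) in regions.items()
             if x0 <= x <= x1 and y0 <= y <= y1),
            None,
        )
        if name is not None and name != current:
            visits.append((name, t))  # first point inside a new region
        current = name
    return visits


regions = {"X": (0, 0, 1, 1), "Y": (5, 5, 6, 6)}
trajectory = [(0.2, 0.2, 0), (0.8, 0.5, 2), (3, 3, 4), (5.5, 5.5, 7)]
visits = translate(trajectory, regions)  # [("X", 0), ("Y", 7)]
```

The transition time between X and Y is then the difference of the two entry times, here 7.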

Dynamic RoI

Data sets often do not possess predetermined regions

Instead, regions must be formulated based on the density of the trajectories

Preprocessing must now determine the set R of popular regions from the data set

R is then the set of Regions of Interest used by the Neighborhood Function NR() to translate points into regions

Popular Regions

Given:

Grid G of n x m cells

Density threshold d

Density G(i,j) of each cell

Find the set R of popular regions such that:

Each region in R is rectangular

Regions in R are pairwise disjoint

Every dense cell is contained in some region in R

Every region in R has average density above d

No region in R can be expanded without its average density decreasing below d

Grid Density Preparation

Split the space into an n x m grid with small cells

Increment the cells that a trajectory passes through

Neighborhood Function NR() determines which surrounding cells to increment

Regression: increment continuously along the trajectory
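A rough Python sketch of the density computation. Several choices here are assumptions rather than details from the slides: linear interpolation between consecutive points stands in for the "regression" step, each trajectory increments a given cell at most once, square cells of side `cell_size` start at the origin, and the neighborhood radius is ignored for brevity. The name `density_grid` is hypothetical.

```python
def density_grid(trajectories, n, m, cell_size, samples=10):
    """Count, for each cell of an n x m grid, how many trajectories cross it.

    trajectories: list of lists of (x, y) points.
    Segments are sampled at samples + 1 evenly spaced positions, so very
    long segments relative to cell_size may skip cells.
    """
    grid = [[0] * m for _ in range(n)]
    for traj in trajectories:
        touched = set()
        for (x0, y0), (x1, y1) in zip(traj, traj[1:]):
            for k in range(samples + 1):
                f = k / samples
                x = x0 + f * (x1 - x0)
                y = y0 + f * (y1 - y0)
                i, j = int(x // cell_size), int(y // cell_size)
                if 0 <= i < n and 0 <= j < m:
                    touched.add((i, j))
        for i, j in touched:
            grid[i][j] += 1  # one increment per trajectory per cell
    return grid


trajectories = [
    [(0.5, 0.5), (2.5, 0.5)],  # moves along x through three cells
    [(0.5, 0.5), (0.5, 1.5)],  # moves along y through two cells
]
grid = density_grid(trajectories, n=3, m=3, cell_size=1)
```

In this example only cell (0, 0) is crossed by both trajectories, so it is the only cell with density 2.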

Popular Regions Algorithm

Algorithm: PopularRegions(G, d)

Complexity: O(|G| log |G|)

Iteratively consider each dense cell. For each:

Expand in all four directions

Select the expansion that maximizes density

Repeat until expansion would decrease the average density below the density threshold
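The greedy expansion described above might look like the following Python sketch. It is a simplified illustration, not the authors' implementation: it assumes a cell is dense when its density is at least d, visits dense cells from densest to least dense, forbids overlap with regions already produced, and makes no attempt to reach the stated complexity bound. The name `popular_regions` is hypothetical.

```python
def popular_regions(grid, d):
    """Return rectangles (i0, i1, j0, j1), inclusive bounds, grown greedily."""
    n, m = len(grid), len(grid[0])
    used = [[False] * m for _ in range(n)]
    regions = []
    dense = sorted(
        ((grid[i][j], i, j) for i in range(n) for j in range(m) if grid[i][j] >= d),
        reverse=True,
    )
    for _, i, j in dense:
        if used[i][j]:
            continue  # already absorbed by an earlier region
        i0, i1, j0, j1 = i, i, j, j
        total, area = grid[i][j], 1
        while True:
            # Each candidate adds one row or column on one side.
            candidates = [
                ((i0 - 1, i1, j0, j1), [(i0 - 1, c) for c in range(j0, j1 + 1)]),
                ((i0, i1 + 1, j0, j1), [(i1 + 1, c) for c in range(j0, j1 + 1)]),
                ((i0, i1, j0 - 1, j1), [(r, j0 - 1) for r in range(i0, i1 + 1)]),
                ((i0, i1, j0, j1 + 1), [(r, j1 + 1) for r in range(i0, i1 + 1)]),
            ]
            best = None
            for bounds, strip in candidates:
                if any(not (0 <= r < n and 0 <= c < m) or used[r][c] for r, c in strip):
                    continue  # leaves the grid or overlaps another region
                new_total = total + sum(grid[r][c] for r, c in strip)
                new_area = area + len(strip)
                avg = new_total / new_area
                if avg >= d and (best is None or avg > best[0]):
                    best = (avg, bounds, new_total, new_area)
            if best is None:
                break  # every expansion would drop the average below d
            _, (i0, i1, j0, j1), total, area = best
        for r in range(i0, i1 + 1):
            for c in range(j0, j1 + 1):
                used[r][c] = True
        regions.append((i0, i1, j0, j1))
    return regions


result = popular_regions([[5, 5, 0], [0, 0, 0], [0, 0, 4]], d=3)
# result == [(0, 0, 0, 2), (2, 2, 2, 2)]
```

In the example the first region absorbs the empty cell (0, 2) because the average density of the three cells, 10 / 3, is still above the threshold.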


Evaluating the T-Patterns

Compute the density of each cell of the grid

Compute the set of RoIs by determining the Popular Regions

Translate the input trajectories into sequences of RoIs and timestamps for the transitions

Input the trajectories and times into the TAS mining algorithm [2]

Experiments

GPS Data

Fleet of 273 trucks in Athens, Greece

112,203 total points recorded

Ran both static and dynamic pattern algorithms

Various parameter settings

Performance Analysis

Synthetic data by CENTRE synthesizer

50% random and 50% predetermined

Pattern Mining Results

Static found: A t1 B t2 B

Dynamic found: A t1 B' t2 B''

Execution Time Results

• Increase linearly with the number of input trajectories (both algorithms)

• Grow when the density threshold decreases

• Static performs better with extreme thresholds

• Static does not perform well with middle thresholds

Additional Results

Increasing the radius of the spatial neighborhood yields irregular performance, and large values lead to poor execution times

Changing the time tolerance (t) yields results similar to those of TAS

Increasing the number of points in each trajectory causes linear growth of execution times

Works Cited

[1] F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi. Trajectory Pattern Mining. In Proc. 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). ACM, 2007.

[2] F. Giannotti, M. Nanni, and D. Pedreschi. Efficient Mining of Sequences with Temporal Annotations. In Proc. SIAM Conference on Data Mining, pages 346–357. SIAM, 2006.