Computational Geometry and Spatial Data Mining
![Page 1: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/1.jpg)
Computational Geometry and Spatial Data Mining
Marc van Kreveld
Department of Information and Computing Sciences
Utrecht University
![Page 2: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/2.jpg)
![Page 3: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/3.jpg)
![Page 4: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/4.jpg)
Clustering?
• Are the people clustered in this room? How do we define a cluster?
• In spatial data mining we have objects/entities with a location given by coordinates
• Cluster definitions involve distance between locations
![Page 5: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/5.jpg)
Clustering - options
• Determine whether clustering occurs
• Determine the degree of clustering
• Determine the clusters
• Determine the largest cluster
• Determine the outliers
![Page 6: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/6.jpg)
![Page 7: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/7.jpg)
![Page 8: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/8.jpg)
Co-location
• Are the men clustered?
• Are the women clustered?
• Is there a co-location of men and women?
![Page 9: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/9.jpg)
![Page 10: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/10.jpg)
Co-location
• Like before, we may be interested in
– is there co-location?
– the degree of co-location
– the largest co-location
– the co-locations themselves
– the objects not involved in co-location
![Page 11: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/11.jpg)
![Page 12: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/12.jpg)
Spatio-temporal data
• Locations have a time stamp
• Interesting patterns involve space and time
![Page 13: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/13.jpg)
![Page 14: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/14.jpg)
Trajectory data
• Entities with a trajectory (time-stamped motion path)
• Interesting patterns involve subgroups with similar heading, expected arrival, joint motion, ...
• n entities = trajectories; n = 10 – 100,000
• t time steps; t = 10 – 100,000
• Input size is nt
• m = size of subgroup (unknown); m = 10 – 100,000
![Page 15: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/15.jpg)
Examples of trajectory data
• Tracked animals (buffalo, birds, ...)
• Tracked people (potential terrorists)
• Tracked GSMs (e.g. for traffic purposes)
• Trajectories of tornadoes
• Sports scene analysis (players on a soccer field)
![Page 16: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/16.jpg)
Example pattern in trajectories
• What is the location visited by most entities?
location = circular region of specified radius
![Page 17: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/17.jpg)
Example pattern in trajectories
• What is the location visited by most entities?
location = circular region of specified radius
4 entities
![Page 18: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/18.jpg)
Example pattern in trajectories
• What is the location visited by most entities?
location = circular region of specified radius
3 entities
![Page 19: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/19.jpg)
Example pattern in trajectories
• Compute buffer of each trajectory
![Page 20: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/20.jpg)
Example pattern in trajectories
• Compute buffer of each trajectory
• Compute the arrangement of the buffers and the cover count of each cell
[Figure: arrangement of trajectory buffers; cells labeled with cover counts 0, 1, and 2]
![Page 21: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/21.jpg)
Example pattern in trajectories
• One trajectory has t time stamps; its buffer can be computed in O(t log t) time
• All buffers can be computed in O(nt log t) time
• The arrangement can be computed in O(nt log (nt) + k) time, where k = O((nt)²) is the complexity of the arrangement
• Cell cover counts are determined in O(k) time
![Page 22: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/22.jpg)
Example pattern in trajectories
• Total: O(nt log (nt) + k) time
• If the most visited location is visited by m entities, this is O(nt log (nt) + ntm)
• Note: input size is nt; n entities, each with a location at t moments
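A minimal brute-force sketch of the "most visited location" question follows. It does not build the arrangement of buffers described on these slides. Instead it assumes a hypothetical, user-supplied list of candidate centers and counts, for each candidate, how many trajectories come within distance r of it (equivalently, how many buffers contain it). The function names and the candidate grid are illustrative assumptions, not part of the talk.

```python
import math

def dist_point_segment(p, a, b):
    """Euclidean distance from point p to the segment ab."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:                      # degenerate segment (a == b)
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection of p, clamped to the segment.
    u = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    u = max(0.0, min(1.0, u))
    return math.hypot(px - (ax + u * dx), py - (ay + u * dy))

def visits(center, trajectory, r):
    """True if the polyline comes within distance r of center,
    i.e. center lies in the radius-r buffer of the trajectory.
    Assumes the trajectory has at least two vertices."""
    return any(dist_point_segment(center, a, b) <= r
               for a, b in zip(trajectory, trajectory[1:]))

def most_visited(trajectories, r, candidates):
    """Return (best candidate, number of trajectories visiting it).
    Only the given candidates are tested, so the answer is exact
    only with respect to that (assumed) candidate set."""
    def count(c):
        return sum(visits(c, tr, r) for tr in trajectories)
    best = max(candidates, key=count)
    return best, count(best)

# Usage: three trajectories, integer grid of candidate centers.
trajectories = [
    [(0, 0), (4, 0)],
    [(2, -3), (2, 3)],
    [(10, 10), (11, 11)],
]
grid = [(x, y) for x in range(0, 5) for y in range(-3, 4)]
result = most_visited(trajectories, 0.5, grid)
```

With n trajectories of t vertices and c candidates this takes O(c · nt) time; the arrangement-based method on the slides avoids the need to guess candidates.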
![Page 23: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/23.jpg)
Patterns in entity data
Spatial data
• n points (locations)
• Distance is important
– clustering pattern
• Presence of attributes (e.g. man/woman):
– co-location patterns

Spatio-temporal data
• n trajectories, each has t time steps
• Distance is time-dependent
– flock pattern
– meet pattern
• Heading and speed are important and are also time-dependent
![Page 24: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/24.jpg)
Entities in subdivisions
• Also co-location pattern
• Discovered simply by overlay
E.g., occurrences of oaks on different soil types
![Page 25: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/25.jpg)
Clustering entities in subdivisions
• What if it is known that the entities only occur in regions of a certain type?
[Figure: bird nests, with the radius of a cluster indicated; situation without subdivision]
![Page 26: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/26.jpg)
Clustering entities in subdivisions
• What if it is known that the entities only occur in regions of a certain type?
[Figure: bird nests, with the radius of a cluster indicated; situation with land-water subdivision]
![Page 27: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/27.jpg)
Clustering entities in subdivisions
[Figure labels: burglary; house; car]
![Page 28: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/28.jpg)
Region-restricted clustering
• Determine clusters in point sets that are sensitive to the geographic context (at least, for the relevant aspects)
Assume we are given a set of regions, and points can only occur inside these regions. How should we define clusters?
Joint research with Joachim Gudmundsson (NICTA, Sydney) and Giri Narasimhan (U of F, Miami), 2006
![Page 29: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/29.jpg)
Region-restricted clustering
• Given a set P of points, a set F of regions, a radius r and a subset size m, a region-restricted cluster is a subset P’ ⊆ P inside a circle C where
– P’ has size at least m
– C has radius at most 2r
– C contains at most πr² area of regions of F
[Figure: circle of radius ≤ 2r in which the summed area of regions of F is ≤ πr², the area of a circle of radius r]
![Page 30: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/30.jpg)
Region-restricted clustering
• Given a set P of n points, a set F of polygons with n_f edges in total, and values for r and m, report all region-restricted clusters of exactly m points
• Exactly m points?
• “Real” clustering (partition)?
• Outliers?
![Page 31: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/31.jpg)
Region-restricted clustering
• Exactly m points? Every cluster with > m points consists of clusters with m points with smaller circles
• “Real” clustering (partition)?
• Outliers?
[Figure: example with m = 5]
![Page 33: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/33.jpg)
Region-restricted clustering
1. Determine all smallest circles with m points of P inside
2. Test if the radius is ≤ r (report) or > 2r (discard)
3. If the radius is in between, determine the area of regions of F inside
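The three steps above amount to a simple decision per candidate circle, sketched here under stated assumptions. The area bound is passed in as a parameter `max_area` rather than hard-coded, and `area_inside` is a hypothetical callable standing in for step 3 (computing the area of regions of F inside the circle), so that the expensive computation only runs when the radius test is inconclusive.

```python
def classify_circle(radius, r, max_area, area_inside):
    """Decide what to do with one smallest circle around m points.

    radius      -- radius of the candidate circle
    r           -- the cluster radius parameter
    max_area    -- allowed area of regions of F inside the circle (assumed given)
    area_inside -- zero-argument callable returning that area (hypothetical helper)
    """
    if radius <= r:
        return "report"      # step 2: small enough, no area test needed
    if radius > 2 * r:
        return "discard"     # step 2: too large for any cluster
    # Step 3: radius in (r, 2r], so the area of regions inside decides.
    return "report" if area_inside() <= max_area else "discard"

# Usage with made-up numbers:
decisions = [
    classify_circle(0.5, 1.0, 3.0, lambda: 99.0),  # area never evaluated
    classify_circle(2.5, 1.0, 3.0, lambda: 0.0),   # area never evaluated
    classify_circle(1.5, 1.0, 3.0, lambda: 2.0),
    classify_circle(1.5, 1.0, 3.0, lambda: 4.0),
]
```

Passing the area computation as a callable reflects the cost structure on the following slides: the radius test is O(1) per circle, while the area test dominates the running time.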
![Page 34: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/34.jpg)
Region-restricted clustering
1. Determine all smallest circles with m points of P inside
• Use (m-2)-th order Voronoi diagram: cells where the same (m-2) points are closest
• Its vertices are centers of smallest circles around exactly m points
![Page 35: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/35.jpg)
ordinary = order-1 VD
![Page 36: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/36.jpg)
order-2 VD
![Page 37: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/37.jpg)
order-3 VD
![Page 38: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/38.jpg)
Region-restricted clustering
• The m-th order Voronoi diagram (or (m-2)) has O(nm) cells, edges, and vertices
• It can be constructed in O(nm log n) time
So we get O(nm) smallest circles with m points inside; for each we also know the radius
![Page 39: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/39.jpg)
Region-restricted clustering
2. Test if the radius is ≤ r (report) or > 2r (discard)
Trivial in O(1) time per circle, so in O(nm) time overall
![Page 40: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/40.jpg)
Region-restricted clustering
3. Determine the area of regions of F inside
Brute force: O(n_f) time per circle, so O(n m n_f) time overall
![Page 41: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/41.jpg)
Region-restricted clustering
• Complication: This need not give all region-restricted clusters!
– Need to compute area of F inside a circle with moving center
– Requires solving high-degree polynomials
![Page 42: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/42.jpg)
Region-restricted clusters
• The anti-climax: we cannot give an exact algorithm!
• If we take squares instead of circles, we can deal with the problem ...
![Page 43: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/43.jpg)
Region-restricted clustering
3. Determine the area of regions of F inside
Brute force: O(n_f) time per square, so O(n m n_f) time overall
The total time for steps 1, 2, and 3 is O(nm log n) + O(nm) + O(n m n_f) = O(nm log n + n m n_f) time
![Page 44: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/44.jpg)
Region-restricted clustering
3. Determine the area of regions of F inside
Using a suitable data structure (only possible for squares): O(log² n_f) time per square, so O(nm log² n_f) time overall
The total time becomes O(nm log n + n_f log² n_f + nm log² n_f), where
– O(nm log n): order-(m-2) VD construction
– O(n_f log² n_f): preprocessing of data structure
– O(nm log² n_f): total query time in data structure
![Page 45: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/45.jpg)
Region-restricted clustering
• The squares solution generalizes to regular polygons (e.g. 20-gons)
• An approximation of the radius within (1+ε)r gives an O(n/ε² + n_f log² n_f + n log n_f / (mε²)) time algorithm
[Figure: 16-gon]
![Page 46: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/46.jpg)
Region-restricted clustering
• Open problems:
– Develop a region-restricted version of k-means clustering, single link clustering, ...
– Region-restricted co-location?
– Replace region-restricted by gradual model
[Figure: regions with typical densities of 0, 2, 5, and 8 per unit, and the resulting clusters]
![Page 47: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/47.jpg)
Patterns in trajectories
• n trajectories, each with t time steps ⇒ n polygonal lines with t vertices
• Already looked at most visited location
![Page 48: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/48.jpg)
Patterns in trajectories
• Flock: near positions of (sub)trajectories for some subset of the entities during some time
• Convergence: same destination region for some subset of the entities
• Encounter: same destination region with same arrival time for some subset of the entities
• Similarity of trajectories
• Same direction of movement, leadership, ...
[Figure: flock; convergence]
![Page 49: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/49.jpg)
Patterns in trajectories
• Flocking, convergence, encounter patterns
– Laube, van Kreveld, Imfeld (SDH 2004)
– Gudmundsson, van Kreveld, Speckmann (ACM GIS 2004)
– Benkert, Gudmundsson, Huebner, Wolle (ESA 2006)
– ...
• Similarity of trajectories
– Vlachos, Kollios, Gunopulos (ICDE 2002)
– Shim, Chang (WAIM 2003)
– ...
• Lifelines, motion mining, modeling motion
– Mountain, Raper (GeoComputation 2001)
– Kollios, Sclaroff, Betke (DM&KD 2001)
– Frank (GISDATA 8, 2001)
– ...
![Page 50: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/50.jpg)
Patterns in trajectories
• Flock: near positions of (sub)trajectories for some subset of the entities during some time
– clustering-type pattern
– different definitions are used
• Given: radius r, subset size m, and duration T, a flock is a subset of size m that is inside a (moving) circle of radius r for a duration T
![Page 51: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/51.jpg)
![Page 52: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/52.jpg)
Patterns in trajectories
• Longest flock: given a radius r and subset size m, determine the longest time interval for which m entities were within each other’s proximity (circle radius r)
[Figure: trajectories over time steps 0 to 8; for m = 3 the longest flock is in [1.8, 6.4]]
![Page 53: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/53.jpg)
Patterns in trajectories
• Meet: near some position of (sub)trajectories for some subset of the entities
– clustering-type pattern
• Given: radius r, subset size m, and duration T, a meet is a subset of size m that is inside a (stationary) circle of radius r for a duration T
(The circle was “moving” for a flock.)
![Page 54: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/54.jpg)
![Page 55: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/55.jpg)
Patterns in trajectories
• The same subset required for a flock or meet?
Example: meet with m = 4; duration is 3+ time steps or 4+ time steps?
![Page 56: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/56.jpg)
Patterns in trajectories
[Figure: examples for m = 3 of a flock and a meet, each with a fixed subset and with a variable subset]
![Page 57: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/57.jpg)
Patterns in trajectories
Exact results (input size is nτ)

| | fixed subset | variable subset |
|---|---|---|
| flock | NP-hard | O(n³τ log n) |
| meet | O(n⁴τ² log n + n²τ³) | O(n⁴τ² log n + n²τ³) |
![Page 58: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/58.jpg)
Patterns in trajectories
• A radius-2 approximation of the longest flock can be computed in time O(n²τ log n)
... meaning: if the longest flock of size m for radius r has duration T, then we surely find a flock of size m and duration at least T for radius 2r
[Figure: longest flock for r; at least as long a flock for 2r]
![Page 59: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/59.jpg)
Patterns in trajectories
Approximate radius results (input size is nτ)

| | | fixed subset | variable subset |
|---|---|---|---|
| flock | approx. | O(n²τ log n), factor 2 | O((n²τ log n) / ε²), factor 2+ε |
| flock | exact | NP-hard | O(n³τ log n) |
| meet | approx. | O((n²τ log n) / (mε²)), factor 1+ε | O((n²τ log n) / (mε²)), factor 1+ε |
| meet | exact | O(n⁴τ² log n + n²τ³) | O(n⁴τ² log n + n²τ³) |
![Page 60: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/60.jpg)
Fixed subset flock
• It is NP-complete to decide if a graph has a subgraph with m nodes that is a clique
• For every node of the graph, make an entity with a trajectory
[Figure: graph with nodes v1, ..., v7 and the constructed trajectories, with distance r marked. Labels: “all nodes not adjacent to v1 go here”; “v1 is not adjacent to v4, v5, and v7”]
![Page 61: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/61.jpg)
Fixed subset flock
[Figure: graph with nodes v1, ..., v7 and the constructed trajectories. Labels: “v4 not in flock”; “v4 in flock”]
![Page 62: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/62.jpg)
Fixed subset flock
• The trajectories have a fixed flock of size m and full duration if and only if the graph has a clique of size m
[Figure: graph with nodes v1, ..., v7 and the constructed trajectories; flock {v4,v5,v7} of (full) duration 23 (3·7+2) and size 3]
![Page 63: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/63.jpg)
Fixed subset flock
• Longest fixed flock is NP-hard
• Max clique is hard to approximate ⇒ cannot approximate duration, nor flock size
• The reduction applies for all radii < 2r
[Figure: trajectories for v1, ..., v7. Labels: “v4 not in flock”; “v4 in flock”]
![Page 64: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/64.jpg)
Flock and meet algorithms
• Go into 3D (space-time) for algorithms
[Figure: a flock and a meet drawn in space-time, with a vertical time axis from 0 to 4]
![Page 65: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/65.jpg)
Fixed subset flock, approximation
• An efficient radius-2 approximation algorithm of longest fixed flock exists
• Idea: if some vi is in the longest flock, then all other entities of that flock are within distance 2r from vi
[Figure: flock with vi, inside a column of radius 2r centered at vi]
![Page 66: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/66.jpg)
Fixed subset flock, approximation
• For each vj, we can determine the O(τ) time intervals where vj is in the column of vi
• Maintain the intersections for all entities in an augmented tree in O(nτ log n) time
• Do this for all columns (role of vi) and report longest overall pattern
Total: O(n²τ log n) time
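The column idea can be illustrated with a simplified, discrete-time sketch. It is not the augmented-tree algorithm from the slide: it assumes positions are only checked at the sampled time steps (ignoring motion between them), and it uses a plain sort per start time, so it is slower than the stated bound. The function name and the return format are illustrative assumptions.

```python
import math

def longest_fixed_flock_2r(trajs, r, m):
    """Discrete-time sketch of the radius-2r column idea.

    trajs -- list of n trajectories, each a list of t (x, y) positions
    Returns (steps, start, members): the largest number of consecutive
    time steps during which some fixed set of m entities stays within
    distance 2r of one member vi, the first such time step, and the set.
    """
    n = len(trajs)
    t = len(trajs[0]) if n else 0
    best = (0, None, [])
    if m > n or t == 0:
        return best
    for i in range(n):                        # vi plays the role of column center
        runs = []
        for j in range(n):
            # run[s] = number of consecutive steps from s on for which
            # entity j is inside the column of radius 2r around vi.
            run = [0] * (t + 1)
            for s in range(t - 1, -1, -1):
                if math.dist(trajs[i][s], trajs[j][s]) <= 2 * r:
                    run[s] = run[s + 1] + 1
            runs.append(run)
        for s in range(t):
            # The m entities that stay longest, starting at step s.
            ranked = sorted(((runs[j][s], j) for j in range(n)), reverse=True)
            length = ranked[m - 1][0]
            if length > best[0]:
                best = (length, s, sorted(j for _, j in ranked[:m]))
    return best

# Usage: entities 0 and 1 travel together for three steps, entity 2 joins late.
trajs = [
    [(0, 0), (1, 0), (2, 0), (3, 0)],
    [(0, 1), (1, 1), (2, 1), (3, 5)],
    [(0, 10), (1, 10), (2, 0.5), (3, 0.5)],
]
result = longest_fixed_flock_2r(trajs, r=1.0, m=2)
```

Because every member of a radius-r flock containing vi lies within 2r of vi, the duration found this way is at least that of the longest radius-r fixed flock at the sampled steps, at the price of doubling the radius.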
![Page 67: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/67.jpg)
Variable subset flock, exact
• The subset that forms the flock may change entities, but must stay of size m
• Any flock subset at any instant has a disk D of radius r with at least 2 entities on the boundary: the defining entities
[Figure: disk of radius r with its defining entities on the boundary]
![Page 68: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/68.jpg)
Variable subset flock, exact
• Two entities define two cylinders through time by tracing the two possible radius r disks
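At a single instant, the two possible radius-r disks through two entities can be computed directly. The sketch below shows this, plus a brute-force count of how many entities fit in one radius-r disk at that instant, using the defining-entities observation. It handles one time instant only, not the cylinders through time, and the function names and the tolerance `tol` are illustrative assumptions.

```python
import math

def disks_through(p, q, r):
    """Centers of the (at most two) radius-r disks with p and q on the boundary.
    Returns [] if the points coincide or are more than 2r apart."""
    d = math.dist(p, q)
    if d == 0.0 or d > 2.0 * r:
        return []
    mx, my = (p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0
    # Distance from the midpoint of pq to each center, by Pythagoras.
    h = math.sqrt(max(0.0, r * r - (d / 2.0) ** 2))
    # Unit vector perpendicular to pq.
    ux, uy = -(q[1] - p[1]) / d, (q[0] - p[0]) / d
    return [(mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)]

def max_entities_in_disk(points, r, tol=1e-9):
    """Largest number of points coverable by one closed disk of radius r.
    Candidates: disks through pairs of points, plus disks centered at a
    single point (covers the case where only one point fits)."""
    candidates = list(points)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            candidates.extend(disks_through(points[i], points[j], r))
    return max((sum(math.dist(c, p) <= r + tol for p in points)
                for c in candidates), default=0)

# Usage
centers = disks_through((0.0, 0.0), (6.0, 0.0), 5.0)
count = max_entities_in_disk([(0, 0), (2, 0), (1, 0.5), (10, 10)], 1.25)
```

A subset of size m forms a flock at this instant exactly when `max_entities_in_disk` is at least m; tracing the two centers over time gives the two cylinders mentioned above.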
![Page 79: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/79.jpg)
Variable subset flock, exact
• A critical moment is where another entity is on the boundary of the disk; it may go outside or inside
![Page 80: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/80.jpg)
Variable subset flock, exact
• At a critical moment:
– a variable subset flock may start (m entities)
– a variable subset flock may stop (< m entities)
– Three pairs of defining entities have disks that coincide
• There are also critical moments when two entities are at distance exactly 2r
• Between two time steps ti and ti+1 there are O(n³) critical moments ⇒ in total there are O(n³τ) critical moments
[Figure: two entities at distance exactly 2r]
![Page 81: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/81.jpg)
Variable subset flock, exact
• Let the O(n³τ) critical moments be the nodes in a directed acyclic graph G
• Edges of G are between two consecutive critical moments of the same two defining entities
– directed from earlier to later
– weight is time between critical moments
– only if at least m entities are inside the disk
A longest variable subset flock is a maximum weight path in G
[Figure: the graph G drawn along the time axis]
![Page 82: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/82.jpg)
Variable subset flock, exact
• The graph G can be built in O(n³τ log n) time
• A maximum weight path can be found in O(n³τ log n) time
[Figure: the graph G drawn along the time axis]
A longest variable subset flock is a maximum weight path in G
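The final step, a maximum weight path in a directed acyclic graph, can be sketched generically. The sketch assumes the nodes (critical moments) are already numbered in time order, so every edge goes from a lower to a higher index; building G from the trajectories is not shown. The function name and edge format are illustrative assumptions.

```python
def max_weight_path(num_nodes, edges):
    """Maximum total weight of a path in a DAG.

    num_nodes -- nodes are 0 .. num_nodes-1, numbered in time order
    edges     -- list of (u, v, weight) with u < v and weight >= 0
    """
    # best[v] = maximum weight of a path ending at node v.
    best = [0.0] * num_nodes
    # Sorting by source index guarantees that best[u] is final before any
    # edge leaving u is relaxed, since all edges into u come from
    # smaller indices.
    for u, v, w in sorted(edges):
        if not u < v:
            raise ValueError("edges must go forward in time order")
        best[v] = max(best[v], best[u] + w)
    return max(best, default=0.0)

# Usage: weights are durations between consecutive critical moments.
edges = [(0, 1, 1.5), (1, 2, 2.0), (0, 2, 1.0), (3, 4, 3.0)]
longest = max_weight_path(5, edges)
```

The dynamic program itself is linear in the number of edges; the sort is only there to fix a valid processing order when the edge list arrives unordered.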
![Page 83: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/83.jpg)
Patterns in trajectories, summary
• Flock and meet patterns require algorithms in 3-dimensional space (space-time)
• Exact algorithms are inefficient ⇒ only suitable for smaller data sets
• Approximation can reduce running time by one or two orders of magnitude
![Page 84: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/84.jpg)
Patterns in trajectories, summary
| | | fixed subset | variable subset |
|---|---|---|---|
| flock | approx. | O(n²τ log n), factor 2 | O((n²τ log n) / ε²), factor 2+ε |
| flock | exact | NP-hard | O(n³τ log n) |
| meet | approx. | O((n²τ log n) / (mε²)), factor 1+ε | O((n²τ log n) / (mε²)), factor 1+ε |
| meet | exact | O(n⁴τ² log n + n²τ³) | O(n⁴τ² log n + n²τ³) |
![Page 85: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/85.jpg)
Future research on longest trajectories
• Faster exact and approximation algorithms
• Better approximation factors
• Remove restriction of fixed shape of flocking region (compact or elongated both possible during same flock)
• Longest duration convergence
[Figure: longest convergence]
![Page 86: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/86.jpg)
Patterns in trajectories
• Flock and meet patterns require algorithms in 3-dimensional space (space-time)
• Exact algorithms are inefficient ⇒ only suitable for smaller data sets
• Approximation can reduce running time by an order of magnitude
![Page 87: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/87.jpg)
To conclude
• With an exact definition of a spatial or spatio-temporal pattern, geometric algorithms can be used to compute all patterns
• Many known structures from computational geometry are useful (Voronoi diagrams, arrangements, ...)
• Since the (exact) algorithms may be inefficient, approximation may be a solution
![Page 88: Computational Geometry and Spatial Data Mining](https://reader036.fdocuments.us/reader036/viewer/2022062521/56816864550346895ddeb9c5/html5/thumbnails/88.jpg)
To discuss
• What patterns must be detected in practice (both spatial and spatio-temporal)?
• What is the most appropriate definition (formalization) of these?
• Spatial association rules, auto-correlation, irregularities, classification, ... and other computable things in spatial/spatio-temporal data mining