STAR-Tree Spatio-Temporal Self Adjusting R-Tree John Tran Duke University Department of Computer...

Post on 11-Jan-2016


STAR-Tree Spatio-Temporal Self Adjusting R-Tree

John Tran

Duke University

Department of Computer Science

Adviser: Pankaj K. Agarwal

Problem

Large moving data sets

Many static data structures exist, but few account for motion, even though real-world data moves

Examples of Use

Geographic Information Systems

Air-Traffic Control

Protein Interactions

Traffic Patterns

Defining the data

Data can be represented as points in $\mathbb{R}^d$

For our problem:

Set of data points in $\mathbb{R}^2$: $S = \{p_1, p_2, \ldots, p_n\}$

Each point can be parameterized by time: $p_i = (x_i(t), y_i(t))$

Piecewise differentiable velocities

Bounding boxes can be represented by 2 points (opposite corners)
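A minimal sketch of this data model, assuming each point moves linearly (one piece of a piecewise motion). The names `MovingPoint` and `bounding_box` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MovingPoint:
    """A point p_i = (x_i(t), y_i(t)) with linear motion (an assumption)."""
    x0: float
    y0: float
    vx: float
    vy: float

    def at(self, t):
        # Position at time t under constant velocity.
        return (self.x0 + self.vx * t, self.y0 + self.vy * t)


def bounding_box(points, t):
    """Bounding box at time t, stored as two corner points (min, max)."""
    xs, ys = zip(*(p.at(t) for p in points))
    return (min(xs), min(ys)), (max(xs), max(ys))
```

Because the points move, the box returned for one value of `t` is generally not valid for another, which is what motivates a tree that accounts for motion.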

Queries

Query 1 – Report all points of S that lie inside rectangle R at time t
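As a baseline for Query 1, a brute-force scan might look like the sketch below. It assumes linear motion and represents each point as a hypothetical tuple `(x0, y0, vx, vy)`; an index such as the STAR-Tree aims to avoid checking every point.

```python
def position(p, t):
    """Position at time t of a point given as (x0, y0, vx, vy)."""
    x0, y0, vx, vy = p
    return (x0 + vx * t, y0 + vy * t)


def time_slice_query(points, rect, t):
    """Report all points inside rect = ((xmin, ymin), (xmax, ymax)) at time t."""
    (xmin, ymin), (xmax, ymax) = rect
    hits = []
    for p in points:
        x, y = position(p, t)
        if xmin <= x <= xmax and ymin <= y <= ymax:
            hits.append(p)
    return hits
```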

Queries

Query 2 – Report all points of S that lie inside rectangle R at any time between t1 and t2
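For Query 2, a point with linear motion is inside R during a single time interval, so the test reduces to intersecting intervals. The sketch below is a brute-force illustration under that linear-motion assumption; the function names and the `(x0, y0, vx, vy)` tuple layout are hypothetical.

```python
def axis_interval(c0, v, lo, hi, t1, t2):
    """Sub-interval of [t1, t2] on which lo <= c0 + v*t <= hi, or None."""
    if v == 0:
        return (t1, t2) if lo <= c0 <= hi else None
    a, b = (lo - c0) / v, (hi - c0) / v
    start, end = max(t1, min(a, b)), min(t2, max(a, b))
    return (start, end) if start <= end else None


def window_query(points, rect, t1, t2):
    """Report points that lie inside rect at some time in [t1, t2]."""
    (xmin, ymin), (xmax, ymax) = rect
    hits = []
    for p in points:
        x0, y0, vx, vy = p
        ix = axis_interval(x0, vx, xmin, xmax, t1, t2)
        iy = axis_interval(y0, vy, ymin, ymax, t1, t2)
        # The point is inside R only while it is inside on both axes at once.
        if ix is not None and iy is not None and max(ix[0], iy[0]) <= min(ix[1], iy[1]):
            hits.append(p)
    return hits
```

Checking the two axes separately is not enough: a point can cross the x-range and the y-range of R at different times without ever being inside R.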

Queries

Query 3 – Report the nearest neighbor in S of a query point
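A brute-force sketch of Query 3 follows. The slide does not say at which time the neighbor is sought, so the query time `t` is an assumption here, as are linear motion and the `(x0, y0, vx, vy)` tuple layout.

```python
import math


def nearest_neighbor(points, q, t):
    """Return the point of S closest to query point q at time t (assumed)."""
    def pos(p):
        x0, y0, vx, vy = p
        return (x0 + vx * t, y0 + vy * t)

    # Linear scan; a spatial index would prune most candidates instead.
    return min(points, key=lambda p: math.dist(pos(p), q))
```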

R-Tree

Bounding box hierarchy

All child nodes are bounded by their parent's bounding box

Points are stored in leaf nodes
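The hierarchy can be sketched as a small static structure in which a search skips any subtree whose bounding box misses the query rectangle. This is an illustrative toy with hypothetical names (`Node`, `search`), not the implementation behind the slides, and it omits insertion and node splitting.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    box: tuple                                     # ((xmin, ymin), (xmax, ymax))
    children: list = field(default_factory=list)   # internal nodes
    points: list = field(default_factory=list)     # leaf nodes only


def intersects(a, b):
    """True if rectangles a and b share at least one point."""
    (ax0, ay0), (ax1, ay1) = a
    (bx0, by0), (bx1, by1) = b
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def contains(box, pt):
    (x0, y0), (x1, y1) = box
    return x0 <= pt[0] <= x1 and y0 <= pt[1] <= y1


def search(node, query):
    """Report all stored points inside the query rectangle."""
    if not intersects(node.box, query):
        return []                                  # prune the whole subtree
    found = [p for p in node.points if contains(query, p)]
    for child in node.children:
        found.extend(search(child, query))
    return found
```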

STAR-Tree

Same concept as the R-Tree

Incorporates movement into the tree structure

Conflicts

As bounding boxes change, overlap occurs

Need to adjust for these overlap conflicts
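The slides do not say how conflicts are detected or resolved. As one possible illustration, the sketch below computes the time interval during which two moving boxes overlap, assuming every box side moves linearly. Each box is a hypothetical tuple `(xlo, ylo, xhi, yhi)` whose entries are `(start, velocity)` pairs.

```python
def leq_interval(c1, v1, c2, v2, t1, t2):
    """Times in [t1, t2] with c1 + v1*t <= c2 + v2*t, as (start, end) or None."""
    dc, dv = c2 - c1, v1 - v2          # the condition becomes dv * t <= dc
    if dv == 0:
        return (t1, t2) if dc >= 0 else None
    bound = dc / dv
    start, end = (t1, min(t2, bound)) if dv > 0 else (max(t1, bound), t2)
    return (start, end) if start <= end else None


def overlap_interval(a, b, t1, t2):
    """Interval within [t1, t2] during which boxes a and b overlap, or None."""
    start, end = t1, t2
    for axis in (0, 1):                # 0 = x, 1 = y
        lo_a, hi_a = a[axis], a[axis + 2]
        lo_b, hi_b = b[axis], b[axis + 2]
        # Overlap on an axis needs lo_a <= hi_b and lo_b <= hi_a.
        for (c1, v1), (c2, v2) in ((lo_a, hi_b), (lo_b, hi_a)):
            iv = leq_interval(c1, v1, c2, v2, start, end)
            if iv is None:
                return None
            start, end = iv
    return (start, end)
```

A structure that knows when the next overlap begins could schedule an adjustment at that time instead of rechecking all boxes continuously; whether the STAR-Tree does this is not stated in the slides.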

QT Implementation

OpenGL Implementation

Road Simplification

Road data from the US Bureau of the Census (TIGER)

Paths are determined using Dijkstra’s shortest path algorithm

Shapes of these paths are typically simple but include many vertices

Simplify each path using the Douglas-Peucker heuristic (5 vertices max)
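The classic Douglas-Peucker heuristic stops at a distance tolerance, and the slides do not say how the 5-vertex cap is enforced. The sketch below is therefore an assumed vertex-budget variant: it keeps the endpoints and repeatedly adds the vertex farthest from the current simplified path until the budget is reached. Function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to segment ab."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(path, max_vertices=5):
    """Simplify a polygonal path to at most max_vertices (>= 2) vertices."""
    if len(path) <= max_vertices:
        return list(path)
    keep = [0, len(path) - 1]          # sorted indices of retained vertices
    while len(keep) < max_vertices:
        best_dist, best_idx = 0.0, None
        for lo, hi in zip(keep, keep[1:]):
            for i in range(lo + 1, hi):
                d = point_segment_distance(path[i], path[lo], path[hi])
                if d > best_dist:
                    best_dist, best_idx = d, i
        if best_idx is None:           # everything left lies on kept segments
            break
        keep.append(best_idx)
        keep.sort()
    return [path[i] for i in keep]
```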

Road Simplification

Simplify road network

TIGER data is not perfect

Polygonal chains with vertex lists

Sometimes does not match roads that should be matched

Analysis of RDU Roads

[Plot: vertices with n streets (y-axis) versus n streets (x-axis)]

Analysis of RDU Roads

[Plot: streets with n vertices (y-axis) versus n vertices (x-axis)]

Road Simplification

Protein Shape Matching

Problem

Match two proteins based on similarity or dissimilarity using intramolecular distance comparison

Data

Start from PDB files

Parse to get vertex list

Calculating Distance Matrix

Given a vertex list
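A minimal sketch of the intramolecular distance matrix, assuming the vertex list is a sequence of 3D coordinates and that "distance" means Euclidean distance. The function name is hypothetical.

```python
import math


def distance_matrix(vertices):
    """Pairwise Euclidean distances between all vertices of one molecule."""
    n = len(vertices)
    return [[math.dist(vertices[i], vertices[j]) for j in range(n)]
            for i in range(n)]
```

The matrix is symmetric with a zero diagonal and does not change under rotation or translation of the molecule, which is why it is a convenient basis for shape comparison.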


Defining cost

-GCTGATACTAGCT
 | ||||  |||||
GGGTGAT-GTAGCT

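Let $g(k) = \alpha + \beta(k-1)$ be the cost of an indel gap of length $k$, where $\alpha$ is the cost of starting a new indel gap and $\beta$ is the cost of continuing a gap

Cost Function

$E(i,j) = \min\{D(i,j-1) + \alpha,\ E(i,j-1) + \beta\}$

$F(i,j) = \min\{D(i-1,j) + \alpha,\ F(i-1,j) + \beta\}$

$D(i,j) = \min\{D(i-1,j-1) + \delta(i,j),\ E(i,j),\ F(i,j)\}$

where $\delta(i,j)$ is the normalized sum of the differences between the distances from $A_i$ to all the matched vertices and the distances from $B_j$ to the corresponding matched vertices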
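The recurrences can be sketched as an affine-gap dynamic program. Here `alpha` is the cost of starting a gap and `beta` the cost of continuing one. The boundary conditions (a leading gap of length k costs `alpha + beta*(k-1)`) and the caller-supplied match cost `delta` are assumptions: the slides define the match cost through matched-vertex distances, which the 0/1 mismatch cost used in the usage lines only stands in for.

```python
import math


def alignment_cost(a, b, delta, alpha, beta):
    """Minimum alignment cost of sequences a and b with affine gap costs."""
    n, m = len(a), len(b)
    inf = math.inf
    D = [[inf] * (m + 1) for _ in range(n + 1)]   # best cost overall
    E = [[inf] * (m + 1) for _ in range(n + 1)]   # best cost ending in a gap in a
    F = [[inf] * (m + 1) for _ in range(n + 1)]   # best cost ending in a gap in b
    D[0][0] = 0.0
    for i in range(1, n + 1):                     # assumed boundary: one leading gap
        D[i][0] = alpha + beta * (i - 1)
    for j in range(1, m + 1):
        D[0][j] = alpha + beta * (j - 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = min(D[i][j - 1] + alpha, E[i][j - 1] + beta)
            F[i][j] = min(D[i - 1][j] + alpha, F[i - 1][j] + beta)
            D[i][j] = min(D[i - 1][j - 1] + delta(a[i - 1], b[j - 1]),
                          E[i][j], F[i][j])
    return D[n][m]


def mismatch(x, y):
    """Stand-in match cost: 0 for equal symbols, 1 otherwise."""
    return 0 if x == y else 1


cost_same = alignment_cost("GATTACA", "GATTACA", mismatch, 2, 1)
cost_gap = alignment_cost("GAAAT", "GT", mismatch, 2, 1)
```

With these stand-in costs, aligning a sequence with itself costs nothing, and deleting three consecutive symbols is charged as one gap of length three rather than three separate gaps.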

Comparing identical Proteins

Test Cases