1
Taking Constraints out of Constraint Databases
Dina Goldin
University of Connecticut
Applications of Constraint Databases
Paris, France, June 2004
2
Relational Databases
Codd [70] provided an additional level of abstraction between physical data and queries
[Figure: queries over a customized data layout for each application, versus queries over a table-based Logical Layer above the Physical Layer]
3
Advantages of Relational Model
• Data model: Uniform table-based representation for all data at logical level
• Data independence: Can modify physical layer without affecting queries
• Simple set-of-points semantics, RA=RC
• Efficient indexing methods
A commercial success in the 1980s!
4
Object-Relational Databases
• Disadvantages of RDBs:
– only good for traditional, “administrative” data
• OO technology corrects this:
– encapsulate non-administrative data
– provide methods to access it
• Object-relational databases provide this technology within a relational framework.
They are the latest commercial success.
5
Outline
• Introduction
– relational, OR data models
• GIS systems:
– CDB technology to the rescue
• Constraint Databases:
– it’s not just about constraints
– one more level of abstraction
• Constraint-backed databases:
– practical considerations
– getting constraint-backed technology right
6
Geographic Information Systems
• Until recently, the leading commercial systems for spatial data
• Not database systems per se
– cannot manage non-geographic data
– no ad-hoc querying (users perform built-in operations or execute predefined queries)
– single-layered architecture (no data independence when writing queries)
– in-memory (no index structures)
7
Newer Approaches to Managing Spatial Data
• Marrying GIS and object-relational databases
– Example: Oracle Spatial Data Option
– Full power of a relational DB plus…
• Spatial data
– encapsulated as new data types within the OR framework
– same data types as in ARC/Info (leading GIS system)
• Spatial operations
– as methods over the new data types
– based on GIS operations
• Spatial data access structures
– based on bounding boxes
8
Data Separation in OR/GIS Databases
• Spatial data stored in spatial relations
– predefined set of spatial data types (point, region, etc.)
– each relation is a set of spatial objects of one type, with a key
– predefined set of operations over spatial objects
• “Traditional” data stored in regular relations
– including thematic/descriptive data pertaining to spatial objects
• Spatial & administrative data are logically separate
– only keys of spatial objects to correlate between them
– spatial data processing limited to predefined types and operators
• Separation applies to query output as well
– limited query expressiveness
Can constraint databases offer a better solution?
9
Constraint Databases
• Contribution of KKR [90,95]
• Key idea: Allow relations that include infinitely many points
– “Finite relations are generalized to finitely representable relations” [GK96]
• Generalized: original term for tuples and relations with infinite semantics
– We now prefer the term constraint for such tuples and relations
Goal: next commercial success (for GIS applications)
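To make "finitely representable" concrete, here is a minimal Python sketch, not taken from the talk. It assumes constraints are linear inequalities over two variables; the class names `ConstraintTuple` and `ConstraintRelation` are hypothetical. A finite list of inequalities stands for an infinite set of points, and membership is checked against the representation rather than by enumerating points.

```python
from dataclasses import dataclass
from typing import Tuple

# One linear constraint a*x + b*y <= c, stored as (a, b, c).
LinearConstraint = Tuple[float, float, float]


@dataclass(frozen=True)
class ConstraintTuple:
    """Hypothetical constraint tuple: a conjunction of linear inequalities."""
    constraints: Tuple[LinearConstraint, ...]

    def contains(self, x: float, y: float) -> bool:
        # A point belongs to the tuple if it satisfies every inequality.
        return all(a * x + b * y <= c for a, b, c in self.constraints)


@dataclass(frozen=True)
class ConstraintRelation:
    """Finite set of constraint tuples; its meaning is the union of their point sets."""
    tuples: Tuple[ConstraintTuple, ...]

    def contains(self, x: float, y: float) -> bool:
        return any(t.contains(x, y) for t in self.tuples)


# The unit square 0 <= x <= 1, 0 <= y <= 1: four inequalities, infinitely many points.
unit_square = ConstraintTuple(((-1, 0, 0), (1, 0, 1), (0, -1, 0), (0, 1, 1)))
relation = ConstraintRelation((unit_square,))

print(relation.contains(0.5, 0.5))
print(relation.contains(2.0, 0.5))
```

The relation holds one tuple with four inequalities, yet it answers membership for any real-valued point, which is the sense in which a finite representation carries infinite semantics.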
10
Revisiting the Logical Layer
• Components of the logical database layer:
– set-of-tuples data semantics
– implementation-independent (logical) data representation
• Relational databases
– finite semantics
– trivial one-to-one correspondence between the two components
• Constraint databases:
– infinite semantics
– correspondence between data semantics and data representation no longer trivial
The infinite semantics of finitely representable data implies an additional level of abstraction; we need to separate the logical layer into two
[Figure: queries over the table-based Logical Layer, above the Physical Layer]
11
Additional Level of Abstraction
• Relational databases (two layers):
– Logical Layer (queries defined over this layer): finite set-of-point semantics; table-based representation; implementation-independent
– Physical Layer: file-based data storage; indexing structures, data access methods; implementation-dependent
• Constraint databases (the logical layer splits in two):
– Abstract Logical Layer (queries defined over this layer): infinite set-of-point semantics
– Concrete Logical Layer: finite data representation; implementation-independent
– Physical Layer: as above
RDB to CDB: from two layers to three
12
Outline
• Introduction
– relational, OR data models
• GIS systems:
– CDB technology to the rescue
• Constraint Databases:
– it’s not just about constraints
– one more level of abstraction
• Constraint-backed databases:
– practical considerations
– getting constraint-backed technology right
13
Concrete Data Model in CDBs
• Requirements for the concrete layer
– clean set-of-point semantics
– efficient (index-based) data access methods
– not required to use constraints (queries are over the abstract layer, so actual choice of representation is transparent to user)
• Pure Constraint Databases
– concrete layer is constraint-based
– examples: CDB/CQA (query algebra), MLPQ (logic programming)
• Constraint-backed databases
– concrete layer is not purely constraints
– data may be represented geometrically
14
Practical Considerations of GIS Applications
• Data input/output is not based on constraints
– data often obtained by digitization (generates points and segments)
– geometrical, visual, some standard spatial format…
– in pure CDBs, converted to constraints
• Spatial features are never straight lines or convex polytopes
– many short segments
– frequent local change of direction
– broken up into many constraint tuples (convex cells) per spatial object
• Continuous (real time) data visualization
– most users do NOT want to see constraints, but a GUI
– visualization requires spatial outline (boundary points)
– constraints need to be converted back to geometrical representation
– conversions carry heavy performance penalty (not real-time)
• Experience shows that practical systems are not pure
– E.g. Dedale uses geometrical representations, explicitly translating to the constraint representation for the constraint engine [GSSG03]
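The conversion from digitized points to constraints can be illustrated for a single convex cell. The Python sketch below is an assumption-laden example, not code from any of the systems named here: it assumes the vertices form a convex polygon listed in counter-clockwise order, and the function names are hypothetical. A non-convex object would first have to be decomposed into convex cells, which is not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
HalfPlane = Tuple[float, float, float]  # (a, b, c) meaning a*x + b*y <= c


def polygon_to_constraints(vertices: Sequence[Point]) -> List[HalfPlane]:
    """One half-plane per edge of a convex polygon (vertices assumed counter-clockwise)."""
    constraints = []
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the outline
        # The interior lies to the left of the directed edge (x1,y1) -> (x2,y2),
        # which rearranges to a*x + b*y <= c with the coefficients below.
        a = y2 - y1
        b = -(x2 - x1)
        c = a * x1 + b * y1
        constraints.append((a, b, c))
    return constraints


def satisfies(constraints: Sequence[HalfPlane], x: float, y: float) -> bool:
    """True if the point satisfies every half-plane constraint."""
    return all(a * x + b * y <= c for a, b, c in constraints)


# Four digitized corner points of a unit square, counter-clockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_constraints = polygon_to_constraints(square)
print(square_constraints)
```

Each spatial object with many short segments would produce many such cells, and visualization needs the reverse conversion back to boundary points, which is where the performance penalty noted above comes from.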
15
Geometric Data Representation
• In the physical layer, need for geometry-based representations recognized early on
– [KKR90] suggested computational geometry algorithms as evaluation primitives
• Examples of geometric representations:
– Points
– Polylines: for trajectories, regions
– Triangulated Irregular Networks (TINs): for terrains (2.5 dimensional)
• Efficient visualization
• Efficient query evaluation
– If region R(x,y) is stored as a sequence of points that outline it, the projection π_X(R) can be obtained by finding extrema of X-coordinates for these points.
– Bounding boxes equally easy to compute.
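The projection and bounding-box computations mentioned above reduce to a scan over the outline points. The following Python sketch assumes the region is given by a single closed outline, so that its projection onto X is one interval; the function names and the sample L-shaped region are illustrative only.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def project_x(outline: Sequence[Point]) -> Tuple[float, float]:
    """Projection of the outlined region onto X, as the interval (min_x, max_x)."""
    xs = [x for x, _ in outline]
    return (min(xs), max(xs))


def bounding_box(outline: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return (min(xs), min(ys), max(xs), max(ys))


# A non-convex, L-shaped region stored as its boundary points.
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]

print(project_x(l_shape))
print(bounding_box(l_shape))
```

Both functions take one pass over the stored points and need no decomposition into convex cells, even though the region itself is not convex.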
16
Role of Constraints in Constraint-Backed Databases
• Define query semantics (abstract level)
– for proving query correctness
– to spare users from ad-hoc operators with arbitrary restrictions
• Provide default data model (concrete level)
– one of the available data representations
– e.g. when data is truly multidimensional
• For data integration
– as intermediate representation between non-compatible systems
17
DEDALE
• Not a pure constraint database
• Nesting takes place at abstract level
LandUse(lname,geom[x,y])
Flight(fname,traj[t,x,y,a])
Country(cname,geom[x,y,h])
– Queries use nest and unnest operations explicitly
• Geometric representation in the concrete layer
– geom in Country is represented as a TIN
– traj in Flight is represented as a set of sample points along the flight path
• Data model does not separate spatial and administrative data
18
DEDALE vs. CQA/CDB
• Schemas
– DEDALE (nested): LandUse(lname,geom[x,y]), Flight(fname,traj[t,x,y,a]), Country(cname,geom[x,y,h])
– CQA/CDB (flat): LandUse(lname,x,y), Flight(fname,t,x,y,a), Country(cname,x,y,h)
• Over which location were the airplanes flying at time t1?
– DEDALE: MAP X [X.fname, π_{x,y}(σ_{t=t1}(X.traj))] (Flight)
– CQA/CDB: R0 := SELECT t=t1 from Flight; R1 := PROJECT R0 on fname,x,y
• Return the part of the parcels contained in rectangle Rect(x,y)
– DEDALE: MAP X [X.lname, X.geom ∩ Rect] (LandUse)
– CQA/CDB: R0 := JOIN LandUse and Rect
• Return all land parcels that have a point in Rect(x,y)
– DEDALE: π_{lname,geom}(MAP X [X.lname, X.geom, σ_{(x,y) in Rect}(X.geom)] (LandUse))
– CQA/CDB: R0 := JOIN LandUse and Rect; R1 := PROJECT R0 on lname; R2 := JOIN R1 and LandUse
DEDALE: output limited to 2 spatiotemporal dimensions (3 in case of interpolated attributes)
CQA/CDB: pure constraint DB not practical
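The first query can be sketched in both styles to show how the nested and flat formulations relate. This Python example is a simplification and not either system's implementation: it treats Flight as a finite set of sample tuples with no interpolation between samples, and the flight names, sample values, and function names are all made up for illustration.

```python
from typing import Dict, List, Set, Tuple

Sample = Tuple[float, float, float, float]  # (t, x, y, a)

# Nested form, in the spirit of Flight(fname, traj[t,x,y,a]); hypothetical data.
nested_flight: Dict[str, List[Sample]] = {
    "AF100": [(0, 0.0, 0.0, 9000), (1, 1.0, 2.0, 9500)],
    "UA200": [(1, 5.0, 5.0, 8000), (2, 6.0, 5.5, 8000)],
}


def unnest(nested: Dict[str, List[Sample]]) -> Set[tuple]:
    """Flatten to Flight(fname, t, x, y, a)."""
    return {(fname, *s) for fname, traj in nested.items() for s in traj}


def nested_query(nested: Dict[str, List[Sample]], t1: float) -> Dict[str, Set[tuple]]:
    """Nested style: for each flight, select t = t1 inside traj, then keep (x, y)."""
    return {
        fname: {(x, y) for (t, x, y, _a) in traj if t == t1}
        for fname, traj in nested.items()
    }


def flat_query(flat: Set[tuple], t1: float) -> Set[tuple]:
    """Flat style: R0 := SELECT t=t1 from Flight; R1 := PROJECT R0 on fname,x,y."""
    r0 = {row for row in flat if row[1] == t1}
    return {(fname, x, y) for (fname, _t, x, y, _a) in r0}


print(nested_query(nested_flight, 1))
print(flat_query(unnest(nested_flight), 1))
```

One visible difference: in the nested style a flight with no sample at t1 still appears, paired with an empty set, while the flat style simply has no row for it.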
19
Getting Constraint-Backed Systems Right
• Clean semantics and full expressiveness of constraint databases
• Geometrical representation issues not a user concern
– though expert users may want to take more control
• System support for three-tier architecture
– More sophisticated than for pure constraint databases, or for current spatial databases
• Query processing engine must
– choose the best concrete representation for output queries, among those supported by system
– select query evaluation strategies in the presence of a wider mix of possible representations and techniques
– take into account storage and visualization
– perhaps maintain multiple representations for the same data?
20
Questions?