Taking Constraints out of Constraint Databases

Dina Goldin, University of Connecticut. Applications of Constraint Databases, Paris, France, June 2004.


Page 1: Taking Constraints out of Constraint Databases

Taking Constraints out of Constraint Databases

Dina Goldin
University of Connecticut

Applications of Constraint Databases
Paris, France, June 2004

Page 2: Taking Constraints out of Constraint Databases

Relational Databases

Codd[70] provided an additional level of abstraction between physical data and queries.

[Diagram: queries over a customized data layout for each application, contrasted with queries over a table-based logical layer that sits above the physical layer]

Page 3: Taking Constraints out of Constraint Databases


Advantages of Relational Model

• Data model: Uniform table-based representation for all data at logical level

• Data independence: Can modify physical layer without affecting queries

• Simple set-of-points semantics, RA=RC

• Efficient indexing methods

A commercial success in the 1980s!

Page 4: Taking Constraints out of Constraint Databases

Object-Relational Databases

• Disadvantages of RDBs:
– only good for traditional, “administrative” data
• OO technology corrects this:
– encapsulate non-administrative data
– provide methods to access it

• Object-relational databases provide this technology within a relational framework.

They are the latest commercial success.

Page 5: Taking Constraints out of Constraint Databases

Outline

• Introduction
– relational, OR data models
• GIS systems:
– CDB technology to the rescue
• Constraint Databases:
– it’s not just about constraints
– one more level of abstraction
• Constraint-backed databases:
– practical considerations
– getting constraint-backed technology right

Page 6: Taking Constraints out of Constraint Databases

Geographic Information Systems

• Until recently, leading commercial systems for spatial data
• Not database systems per se
– cannot manage non-geographic data
– no ad-hoc querying (users perform built-in operations or execute predefined queries)
– single-layered architecture (no data independence when writing queries)
– in-memory (no index structures)

Page 7: Taking Constraints out of Constraint Databases

Newer Approaches to Managing Spatial Data

• Marrying GIS and object-relational databases
– Example: Oracle Spatial Data Option
– Full power of a relational DB plus…
• Spatial data
– encapsulated as new data types within the OR framework
– same data types as in ARC/Info (leading GIS system)
• Spatial operations
– as methods over the new data types
– based on GIS operations
• Spatial data access structures
– based on bounding boxes

Page 8: Taking Constraints out of Constraint Databases

Data Separation in OR/GIS Databases

• Spatial data stored in spatial relations
– predefined set of spatial data types (point, region, etc.)
– each relation is a set of spatial objects of one type, with a key
– predefined set of operations over spatial objects
• “Traditional” data stored in regular relations
– including thematic/descriptive data pertaining to spatial objects
• Spatial & administrative data are logically separate
– only the keys of spatial objects are available to correlate the two
– spatial data processing limited to predefined types and operators
• Separation applies to query output as well
– limited query expressiveness

Can constraint databases offer a better solution?

Page 9: Taking Constraints out of Constraint Databases

Constraint Databases

• Contribution of KKR[90,95]
• Key idea: Allow relations that include infinitely many points
– “Finite relations are generalized to finitely representable relations” [GK96]
• Generalized: original term for tuples and relations with infinite semantics
– We now prefer the term constraint for such tuples and relations

Goal: next commercial success (for GIS applications)
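The key idea of a finitely representable relation with infinitely many points can be illustrated with a small sketch. It assumes two-dimensional data and linear constraints of the form a*x + b*y <= c; the class names `ConstraintTuple` and `ConstraintRelation` are hypothetical and are not taken from any of the systems discussed in the talk.

```python
from dataclasses import dataclass
from typing import Tuple

# One linear constraint a*x + b*y <= c, stored as the triple (a, b, c).
LinearConstraint = Tuple[float, float, float]


@dataclass(frozen=True)
class ConstraintTuple:
    """Hypothetical constraint tuple: a conjunction of linear inequalities."""
    constraints: Tuple[LinearConstraint, ...]

    def contains(self, x: float, y: float) -> bool:
        # A point belongs to the tuple if it satisfies every inequality.
        return all(a * x + b * y <= c for a, b, c in self.constraints)


@dataclass(frozen=True)
class ConstraintRelation:
    """Finite set of constraint tuples; its meaning is the union of their point sets."""
    tuples: Tuple[ConstraintTuple, ...]

    def contains(self, x: float, y: float) -> bool:
        return any(t.contains(x, y) for t in self.tuples)


# Unit square: 0 <= x <= 1 and 0 <= y <= 1 (infinitely many points, four constraints).
unit_square = ConstraintTuple(((-1, 0, 0), (1, 0, 1), (0, -1, 0), (0, 1, 1)))
# Half-plane x >= 5: an unbounded point set described by a single constraint.
right_half_plane = ConstraintTuple(((-1, 0, -5),))

relation = ConstraintRelation((unit_square, right_half_plane))
```

The relation is stored as two finite tuples, yet membership can be decided for any of the infinitely many points it denotes.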

Page 10: Taking Constraints out of Constraint Databases

Revisiting the Logical Layer

• Components of the logical database layer:
– set-of-tuples data semantics
– implementation-independent (logical) data representation
• Relational databases
– finite semantics
– trivial one-to-one correspondence between the two components
• Constraint databases:
– infinite semantics
– correspondence between data semantics and data representation no longer trivial

Infinite semantics of finitely representable data imply an additional level of abstraction; we need to separate logical layer into two

[Diagram: queries over the table-based logical layer, above the physical layer]

Page 11: Taking Constraints out of Constraint Databases

Additional Level of Abstraction

• Physical Layer: file-based data storage; indexing structures, data access methods; implementation-dependent
• Logical Layer (queries defined over this layer): finite set-of-point semantics; table-based representation; implementation-independent
• Abstract Logical Layer (queries defined over this layer): infinite set-of-point semantics
• Concrete Logical Layer: finite data representation; implementation-independent

RDB to CDB: from two layers to three

Page 12: Taking Constraints out of Constraint Databases

Outline

• Introduction
– relational, OR data models
• GIS systems:
– CDB technology to the rescue
• Constraint Databases:
– it’s not just about constraints
– one more level of abstraction
• Constraint-backed databases:
– practical considerations
– getting constraint-backed technology right

Page 13: Taking Constraints out of Constraint Databases

Concrete Data Model in CDBs

• Requirements for the concrete layer
– clean set-of-point semantics
– efficient (index-based) data access methods
– not required to use constraints (queries are over the abstract layer, so actual choice of representation is transparent to user)
• Pure Constraint Databases
– concrete layer is constraint-based
– examples: CDB/CQA (query algebra), MLPQ (logic programming)
• Constraint-backed databases
– concrete layer is not purely constraints
– data may be represented geometrically
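The point that queries are written over the abstract layer, so the concrete representation stays transparent, can be sketched as follows. This is only an illustration under stated assumptions: the `Region` protocol, both classes and `select_inside` are hypothetical names, and the point-in-polygon test is the standard ray-casting method rather than anything prescribed by the talk.

```python
from typing import Iterable, List, Protocol, Sequence, Tuple

Point = Tuple[float, float]


class Region(Protocol):
    """Abstract layer: a region is just a (possibly infinite) set of points."""
    def contains(self, x: float, y: float) -> bool: ...


class HalfPlaneRegion:
    """Constraint-based concrete representation: all a*x + b*y <= c must hold."""
    def __init__(self, constraints: Sequence[Tuple[float, float, float]]):
        self.constraints = list(constraints)

    def contains(self, x: float, y: float) -> bool:
        return all(a * x + b * y <= c for a, b, c in self.constraints)


class OutlineRegion:
    """Geometric concrete representation: the boundary as a vertex sequence."""
    def __init__(self, vertices: Sequence[Point]):
        self.vertices = list(vertices)

    def contains(self, x: float, y: float) -> bool:
        # Ray casting: count boundary crossings of a ray going right from (x, y).
        inside = False
        n = len(self.vertices)
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


def select_inside(region: Region, points: Iterable[Point]) -> List[Point]:
    """A query written against the abstract layer only."""
    return [(x, y) for x, y in points if region.contains(x, y)]


# The same triangle (0,0), (4,0), (0,4) in both concrete representations.
triangle_as_constraints = HalfPlaneRegion([(-1, 0, 0), (0, -1, 0), (1, 1, 4)])
triangle_as_outline = OutlineRegion([(0, 0), (4, 0), (0, 4)])
probes = [(1, 1), (3, 3), (-1, 1), (1, 2)]
```

`select_inside` never inspects how the region is stored, so it gives the same answer for either representation. The probe points avoid the boundary, where ray casting needs extra care.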

Page 14: Taking Constraints out of Constraint Databases

Practical Considerations of GIS Applications

• Data input/output is not based on constraints
– data often obtained by digitization (generates points and segments)
– geometrical, visual, some standard spatial format…
– in pure CDBs, converted to constraints
• Spatial features are never straight lines or convex polytopes
– many short segments
– frequent local change of direction
– broken up into many constraint tuples (convex cells) per spatial object
• Continuous (real time) data visualization
– most users do NOT want to see constraints, but a GUI
– visualization requires spatial outline (boundary points)
– constraints need to be converted back to geometrical representation
– conversions carry heavy performance penalty (not real-time)
• Experience shows that practical systems are not pure
– E.g. Dedale uses geometrical representations, explicitly translating to the constraint representation for the constraint engine [GSSG03]
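The conversion from digitized outline points to constraints can be sketched for the simplest case. It assumes a single convex polygon with vertices listed counterclockwise; the function names are hypothetical. A non-convex feature would first have to be split into convex cells, as noted above, which this sketch does not attempt.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
HalfPlane = Tuple[float, float, float]  # (a, b, c) meaning a*x + b*y <= c


def convex_polygon_to_constraints(vertices: Sequence[Point]) -> List[HalfPlane]:
    """Turn a convex, counterclockwise outline into one half-plane per edge."""
    constraints = []
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # The interior lies to the left of the directed edge (x1,y1) -> (x2,y2):
        # (x2-x1)*(y-y1) - (y2-y1)*(x-x1) >= 0, rearranged to a*x + b*y <= c.
        a = y2 - y1
        b = x1 - x2
        c = a * x1 + b * y1
        constraints.append((a, b, c))
    return constraints


def satisfies(constraints: Sequence[HalfPlane], x: float, y: float) -> bool:
    """Check a point against the conjunction of all half-plane constraints."""
    return all(a * x + b * y <= c for a, b, c in constraints)


# Digitized unit square, counterclockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_constraints = convex_polygon_to_constraints(square)
```

Every boundary segment yields one inequality, so an outline made of many short segments produces correspondingly many constraints.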

Page 15: Taking Constraints out of Constraint Databases

Geometric Data Representation

• In the physical layer, need for geometry-based representations recognized early on
– KKR90 suggested computational geometry algorithms as evaluation primitives
• Examples of geometric representations:
– Points
– Polylines: for trajectories, regions
– Triangulated Irregular Networks (TINs): for terrains (2.5 dimensional)
• Efficient visualization
• Efficient query evaluation
– If region R(x,y) is stored as a sequence of points that outline it, the projection π_X(R) can be obtained by finding extrema of X-coordinates for these points.
– Bounding boxes equally easy to compute.
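The two evaluation shortcuts above can be sketched directly. The sketch assumes the region is a single connected polygon stored as its outline vertices, so that its projection on X is one interval; `project_x` and `bounding_box` are hypothetical helper names.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def project_x(outline: Sequence[Point]) -> Tuple[float, float]:
    """Projection of the region on the X axis, as the interval (min_x, max_x)."""
    xs = [x for x, _ in outline]
    return (min(xs), max(xs))


def bounding_box(outline: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return (min(xs), min(ys), max(xs), max(ys))


# Outline of a quadrilateral region R(x, y).
region_outline = [(2, 1), (5, 3), (4, 7), (1, 4)]
x_interval = project_x(region_outline)
box = bounding_box(region_outline)
```

Both results come from one pass over the stored points, with no constraint manipulation such as variable elimination.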

Page 16: Taking Constraints out of Constraint Databases

Role of Constraints in Constraint-Backed Databases

• Define query semantics (abstract level)
– for proving query correctness
– to spare users from ad-hoc operators with arbitrary restrictions
• Provide default data model (concrete level)
– one of the available data representations
– e.g. when data is truly multidimensional
• For data integration
– as intermediate representation between non-compatible systems

Page 17: Taking Constraints out of Constraint Databases

DEDALE

• Not a pure constraint database
• Nesting takes place at abstract level
LandUse(lname,geom[x,y])
Flight(fname,traj[t,x,y,a])
Country(cname,geom[x,y,h])
– Queries use nest and unnest operations explicitly
• Geometric representation in the concrete layer
– geom in Country is represented as a TIN
– traj in Flight is represented as a set of sample points along the flight path
• Data model does not separate spatial and administrative data

Page 18: Taking Constraints out of Constraint Databases

DEDALE vs. CQA/CDB

DEDALE schema (nested):
LandUse(lname,geom[x,y])
Flight(fname,traj[t,x,y,a])
Country(cname,geom[x,y,h])

CQA/CDB schema (flat):
LandUse(lname,x,y)
Flight(fname,t,x,y,a)
Country(cname,x,y,h)

• Over which location were the airplanes flying at time t1?
DEDALE:
MAP X [X.fname, π_{x,y}(σ_{t=t1}(X.traj))] (Flight)
CQA/CDB:
R0 := SELECT t=t1 from Flight
R1 := PROJECT R0 on fname,x,y

• Return the part of the parcels contained in rectangle Rect(x,y)
DEDALE:
MAP X [X.lname, X.geom ∩ Rect] (LandUse)
CQA/CDB:
R0 := JOIN LandUse and Rect

• Return all land parcels that have a point in Rect(x,y)
DEDALE:
π_{lname,geom}(MAP X [X.lname, X.geom, σ_{(x,y) in Rect}(X.geom)] (LandUse))
CQA/CDB:
R0 := JOIN LandUse and Rect
R1 := PROJECT R0 on lname
R2 := JOIN R1 and LandUse

Output limited to 2 spatiotemporal dimensions (3 in case of interpolated attributes)

Pure constraint DB not practical
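The first query, the location of each airplane at time t1, can be sketched over a sample-point representation of trajectories. This is an illustration only: linear interpolation between consecutive samples is an assumption made here, and `position_at`, `locations_at` and the flight data are hypothetical rather than DEDALE's actual evaluation strategy.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple

Sample = Tuple[float, float, float, float]  # (t, x, y, a), sorted by strictly increasing t


def position_at(traj: List[Sample], t1: float) -> Optional[Tuple[float, float]]:
    """Selection t = t1 followed by projection on (x, y) for one trajectory."""
    times = [s[0] for s in traj]
    if not traj or t1 < times[0] or t1 > times[-1]:
        return None  # the flight is not airborne at t1
    i = bisect_left(times, t1)
    if times[i] == t1:
        return (traj[i][1], traj[i][2])
    # Assumed: the plane moves linearly between the two surrounding samples.
    t0, x0, y0, _ = traj[i - 1]
    t2, x2, y2, _ = traj[i]
    w = (t1 - t0) / (t2 - t0)
    return (x0 + w * (x2 - x0), y0 + w * (y2 - y0))


def locations_at(flights: Dict[str, List[Sample]], t1: float) -> Dict[str, Tuple[float, float]]:
    """Plays the role of MAP over Flight: one (x, y) per flight that has a position at t1."""
    result = {}
    for fname, traj in flights.items():
        pos = position_at(traj, t1)
        if pos is not None:
            result[fname] = pos
    return result


# Hypothetical Flight(fname, traj[t,x,y,a]) relation with made-up sample points.
flights = {
    "AF1": [(0, 0.0, 0.0, 10.0), (10, 100.0, 50.0, 11.0)],
    "BA2": [(20, 5.0, 5.0, 9.0), (30, 15.0, 5.0, 9.0)],
}
```

The user-level query still reads as a selection and a projection; the interpolation over sample points stays hidden in the concrete layer.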

Page 19: Taking Constraints out of Constraint Databases

Getting Constraint-Backed Systems Right

• Clean semantics and full expressiveness of constraint databases
• Geometrical representation issues not a user concern
– though expert users may want to take more control
• System support for three-tier architecture
– More sophisticated than for pure constraint databases, or for current spatial databases
• Query processing engine must
– choose the best concrete representation for output queries, among those supported by system
– select query evaluation strategies in the presence of a wider mix of possible representations and techniques
– take into account storage and visualization
– perhaps maintain multiple representations for the same data?

Page 20: Taking Constraints out of Constraint Databases


Questions?