Copyright © C. J. Date 2005 page 1
SAMPLE QUERIES :
Query A: Get S#-FROM-TO triples for suppliers who have been able to supply at least one part during at least one interval of time, where FROM and TO together designate a maximal interval during which supplier S# was in fact able to supply at least one part
Query B: Get S#-FROM-TO triples for suppliers who have been unable to supply any parts at all during at least one interval of time, where FROM and TO together designate a maximal interval during which supplier S# was in fact unable to supply any part at all
You’ve got to be joking!
TO SUM UP :
"Temporal" constraints and queries can be expressed, but they quickly get very complicated indeed
We need some carefully thought out and well-designed shorthands …
… which typically don’t exist yet in today’s commercial DBMSs, of course
So let’s investigate!
2. LAYING THE FOUNDATIONS :
Time and the DB
What’s the problem?
Intervals
Interval operators
The EXPAND and COLLAPSE operators
The PACK and UNPACK operators
Relational operators
INTERVALS :
Crucial observation: Need to deal with intervals as such (i.e., as values in their own right), instead of as pairs of FROM-TO values
But what’s an interval?
Consider proposition: "Supplier S1 was able to supply part P1 from day 4 to day 10"
Does interval "from day 4 to day 10" include day 4? day 10?
Given some interval, we sometimes want to regard specified begin and end points as included and sometimes not
[d04:d10] -- closed-closed = d04 d05 d06 d07 d08 d09 d10
[d04:d11) -- closed-open   = d04 d05 d06 d07 d08 d09 d10
(d03:d10] -- open-closed   = d04 d05 d06 d07 d08 d09 d10
(d03:d11) -- open-open     = d04 d05 d06 d07 d08 d09 d10
Closed-open is convenient and most often used in practice:
E.g., split [d04:d11) immediately before, say, day 7 … result is [d04:d07) and [d07:d11)
But closed-closed is most intuitive and we’ll favor it throughout this presentation
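The equivalence of the four notations, and the convenience of closed-open for splitting, can be checked mechanically. The following Python sketch is illustrative only: it models days as plain integers (d04 becomes 4), which is an assumption, and the helper name `points` is hypothetical.

```python
def points(lo, hi, closed_lo=True, closed_hi=True):
    """Return the set of integer points denoted by an interval.

    A closed bound includes the named point; an open bound excludes it.
    """
    first = lo if closed_lo else lo + 1
    last = hi if closed_hi else hi - 1
    return set(range(first, last + 1))


closed_closed = points(4, 10)                                 # [d04:d10]
closed_open = points(4, 11, closed_hi=False)                  # [d04:d11)
open_closed = points(3, 10, closed_lo=False)                  # (d03:d10]
open_open = points(3, 11, closed_lo=False, closed_hi=False)   # (d03:d11)

# Splitting the closed-open form immediately before day 7 reuses the
# split point as-is on both sides, with no +1 / -1 adjustment.
left = points(4, 7, closed_hi=False)                          # [d04:d07)
right = points(7, 11, closed_hi=False)                        # [d07:d11)
```

All four variables denote the same seven days, and `left` and `right` partition them.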
FULLY TEMPORALIZING SUPPLIERS AND SHIPMENTS USING INTERVALS :
S_DURING
S#   DURING
S1   [d04:d10]
S2   [d02:d04]
S2   [d07:d10]
S3   [d03:d10]
S4   [d04:d10]
S5   [d02:d10]
SP_DURING
S#   P#   DURING
S1   P1   [d04:d10]
S1   P2   [d05:d10]
S1   P3   [d09:d10]
S1   P4   [d05:d10]
S1   P5   [d04:d10]
S1   P6   [d06:d10]
S2   P1   [d02:d04]
S2   P1   [d08:d10]
S2   P2   [d03:d03]
S2   P2   [d09:d10]
S3   P2   [d08:d10]
S4   P2   [d06:d09]
S4   P4   [d04:d08]
S4   P5   [d05:d10]
PREDICATES :
S_DURING:
From the begin point of DURING to the end point of DURING inclusive (and not immediately before the begin point of DURING or immediately after the end point of DURING), supplier S# was under contract
SP_DURING:
From the begin point of DURING to the end point of DURING inclusive (and not immediately before the begin point of DURING or immediately after the end point of DURING), supplier S# was able to supply part P#
SOME IMMEDIATE ADVANTAGES :
Constraints to prohibit FROM-TO pairs in which TO < FROM are now unnecessary ("FROM ≤ TO" is implicit)
Primary keys {S#,DURING} (for S_DURING), {S#,P#,DURING} (for SP_DURING); choice no longer arbitrary
Don’t need to worry about whether FROM-TO intervals in previous version of DB are open or closed wrt FROM and TO
[d04:d10], [d04:d11), (d03:d10], (d03:d11) are distinct "possreps" for the very same interval* —don’t need to know which, if any, is actual physical representation
* See The Third Manifesto
INTERVALS AREN’T NECESSARILY TEMPORAL :
Tax brackets are represented by taxable income ranges (intervals whose contained points are money values)
Machines operate within certain temperature and voltage ranges (intervals whose contained points are temperatures and voltages, respectively)
Animals vary in the range of frequencies of light and sound waves to which their eyes and ears are receptive
Various natural phenomena occur in ranges in depth of soil or sea or height above sea level
Etc., etc.
POINT TYPES AND INTERVAL TYPES :
Granularity of interval [d04:d10] = one day = granularity of type DATE
Assume DATE is a builtin type representing Gregorian dates: i.e., points on timeline accurate to one day (granularity thus one day by definition)
Exact type of interval value [d04:d10] = INTERVAL_DATE
POINT TYPES AND INTERVAL TYPES (cont.) :
INTERVAL is a type generator
With associated generic interval operators (see later) and constraints
DATE is the point type of this particular interval type
Determines specific set of interval values that make up this particular interval type
Namely, the set of all possible intervals of the form [di:dj], where di and dj are DATE values and di ≤ dj
FURTHER EXAMPLES :
INTERVAL_INTEGER
Values are intervals of the form [i:j], where i and j are INTEGER values and i ≤ j
Granularity = one (unity)
INTERVAL_TIMESTAMP
Values are intervals of the form [ti:tj], where ti and tj are TIMESTAMP values and ti ≤ tj
Granularity = one microsecond (assume TIMESTAMP values are accurate to the microsecond)
POINT TYPES :
Type T can be used as a point type if all of the following are defined for it:
A total ordering (">" etc. available for any pair of values of type T)
Niladic FIRST and LAST operators
Monadic NEXT and PRIOR operators (can fail)
NEXT = successor function
Successor function assumed unique
Thus, e.g., DATE is a valid point type
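As a rough illustration of these requirements, Python's standard `datetime.date` type can play the role of DATE: it is totally ordered, has smallest and largest values, and a one-day step serves as the successor function. The function names below (`first`, `last`, `next_point`, `prior_point`) are hypothetical stand-ins for FIRST, LAST, NEXT, and PRIOR, not part of any real language discussed here.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)  # granularity of the point type


def first():
    """Niladic FIRST: the earliest value of the point type."""
    return date.min


def last():
    """Niladic LAST: the latest value of the point type."""
    return date.max


def next_point(d):
    """Monadic NEXT (successor); fails with OverflowError at LAST."""
    return d + ONE_DAY


def prior_point(d):
    """Monadic PRIOR (predecessor); fails with OverflowError at FIRST."""
    return d - ONE_DAY
```

Note how NEXT and PRIOR "can fail": there is no successor of the last date and no predecessor of the first.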
Informal notation:
Successor of d = d+1
Predecessor of d = d-1
Successor function is what enables us to determine what points are contained in any given interval
E.g., if i = [d04:d10], contained points are exactly d04, d04+1, d04+2, …, d10
Successor function for type DATE = "next day"
INTERVALS :
Let T be a point type. Then an interval (value) i of type INTERVAL_T is a scalar value for which two monadic ops, BEGIN and END, and one dyadic op, ∈, are defined, such that:
BEGIN(i) and END(i) each return a value of type T
BEGIN(i) ≤ END(i)
If p is a value of type T, then p ∈ i is true if and only if BEGIN(i) ≤ p and p ≤ END(i) are both true
Note that intervals are always nonempty
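A minimal Python sketch of this definition, assuming an integer point type with successor +1 and the closed-closed reading used throughout. The class name `Interval` and its method names are illustrative choices, not anything prescribed by the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed-closed interval [begin:end] over an integer point type."""
    begin: int
    end: int

    def __post_init__(self):
        # Intervals are always nonempty, so BEGIN(i) may not exceed END(i).
        if self.begin > self.end:
            raise ValueError("BEGIN(i) must not exceed END(i)")

    def __contains__(self, p):
        # p is in i if and only if BEGIN(i) <= p and p <= END(i).
        return self.begin <= p <= self.end

    def points(self):
        # Apply the successor function (+1 here) from BEGIN(i) up to END(i).
        p = self.begin
        result = [p]
        while p < self.end:
            p += 1
            result.append(p)
        return result
```

For example, `Interval(4, 10).points()` lists days 4 through 10, and `Interval(3, 3)` is a legal unit interval.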
A MORE SEARCHING EXAMPLE : A RELATION WITH TWO INTERVAL ATTRIBUTES
S_PARTS_DURING
S#   PARTS     DURING
S1   [P1:P3]   [d01:d04]
S1   [P2:P4]   [d07:d08]
S1   [P5:P6]   [d09:d09]
S2   [P1:P1]   [d08:d09]
S2   [P1:P2]   [d08:d08]
S2   [P3:P4]   [d07:d08]
S3   [P2:P4]   [d01:d04]
S3   [P3:P5]   [d01:d04]
S3   [P2:P4]   [d05:d06]
S3   [P2:P4]   [d06:d09]
S4   [P3:P4]   [d05:d08]
Not meant to correspond in any particular way to sample SP_DURING value
Note problems: e.g., "S3 was able to supply P4 on days 1-4" appears twice
Will revisit this example later
2. LAYING THE FOUNDATIONS :
Time and the DB
What’s the problem?
Intervals
Interval operators
The EXPAND and COLLAPSE operators
The PACK and UNPACK operators
Relational operators
INFORMAL NOTATION :
Point type T, typical value p—use p+1, p+2, p-1, p-2, etc., as shorthands with obvious meanings (a real language would provide NEXT_T / PRIOR_T ops, also FIRST_T / LAST_T)
Interval type INTERVAL_T—use [p1:pn] to denote typical interval selector invocation (a real language would use more explicit syntax— e.g., INTERVAL_T ( [p1:pn] ) )
Let i be the interval [b:e]. Then:
BEGIN(i) and END(i) return b and e, resp.
p ∈ i ≡ b ≤ p AND p ≤ e
PRE(i) returns b-1 / POST(i) returns e+1 (can fail)
Let i = unit interval [p:p]; POINT FROM i returns p
ALLEN’S OPERATORS :
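In what follows, i1 = [b1:e1] and i2 = [b2:e2]
i1 = i2
(Diagram: i1 and i2 cover exactly the same points, i.e., b1 = b2 and e1 = e2)
i1 ⊆ i2 / i2 ⊇ i1 (also i1 ⊂ i2 / i2 ⊃ i1)
(Diagram: every point of i1 is also a point of i2, i.e., b2 ≤ b1 and e1 ≤ e2)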
ALLEN’S OPERATORS (cont.) :
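i1 BEFORE i2 / i2 AFTER i1
(Diagram: i1 lies entirely to the left of i2, i.e., e1 < b2)
i1 MEETS i2 / i2 MEETS i1
(Diagram: i2 begins immediately after i1 ends, i.e., b2 = e1+1; MEETS is symmetric, so b1 = e2+1 also qualifies)
i1 OVERLAPS i2 / i2 OVERLAPS i1
(Diagram: i1 and i2 have at least one point in common, i.e., b1 ≤ e2 and b2 ≤ e1)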
ALLEN’S OPERATORS (cont.) :
i1 MERGES i2 ≡ i1 OVERLAPS i2 OR i1 MEETS i2
(Diagram: i1 and i2 overlap)
Or:
(Diagram: i1 and i2 meet, i.e., b2 = e1+1)
ALLEN’S OPERATORS (cont.) :
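i1 BEGINS i2
(Diagram: i1 and i2 begin at the same point and i1 ends no later than i2, i.e., b1 = b2 and e1 ≤ e2)
i1 ENDS i2
(Diagram: i1 and i2 end at the same point and i1 begins no earlier than i2, i.e., e1 = e2 and b1 ≥ b2)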
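The operators above reduce to comparisons on begin and end points. The following Python sketch assumes closed-closed intervals over integers and uses the begin/end formulations commonly given for these operators; the function names and the `Interval` tuple are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    b: int  # begin point (inclusive)
    e: int  # end point (inclusive)


def includes(i1, i2):
    """i1 includes i2: every point of i2 is also a point of i1."""
    return i1.b <= i2.b and i2.e <= i1.e


def before(i1, i2):
    """i1 BEFORE i2: i1 ends before i2 begins."""
    return i1.e < i2.b


def after(i1, i2):
    """i1 AFTER i2: the converse of BEFORE."""
    return before(i2, i1)


def meets(i1, i2):
    """i1 MEETS i2: the intervals abut, in either order."""
    return i2.b == i1.e + 1 or i1.b == i2.e + 1


def overlaps(i1, i2):
    """i1 OVERLAPS i2: the intervals share at least one point."""
    return i1.b <= i2.e and i2.b <= i1.e


def merges(i1, i2):
    """i1 MERGES i2: the intervals overlap or meet."""
    return overlaps(i1, i2) or meets(i1, i2)


def begins(i1, i2):
    """i1 BEGINS i2: same begin point, i1 ends no later than i2."""
    return i1.b == i2.b and i1.e <= i2.e


def ends(i1, i2):
    """i1 ENDS i2: same end point, i1 begins no earlier than i2."""
    return i1.e == i2.e and i1.b >= i2.b
```

For instance, [d04:d06] and [d07:d10] meet (and therefore merge) but do not overlap.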
OTHER OPERATORS :
COUNT(i) /* aka DURATION(i) */ returns no. of points in i (i.e., cardinality)
i1 UNION i2 returns [MIN(b1,b2):MAX(e1,e2)] if i1 MERGES i2 and is otherwise undefined /* result is an interval */
(Diagram: i1 begins first and overlaps i2; i1 UNION i2 runs from b1 to e2)
OTHER OPERATORS (cont.) :
i1 INTERSECT i2 returns [MAX(b1,b2):MIN(e1,e2)] if i1 OVERLAPS i2 and is otherwise undefined /* result is an interval */
(Diagram: i1 begins first and overlaps i2; i1 INTERSECT i2 runs from b2 to e1)
OTHER OPERATORS (cont.) :
i1 MINUS i2 returns [b1:MIN(b2-1,e1)] if b1 < b2 and e1 < e2, [MAX(e2+1,b1):e1] if b1 > b2 and e1 > e2, and is otherwise undefined (i.e., undefined if i1 BEGINS i2 or i1 ENDS i2 or if either of i1 and i2 properly includes the other) /* result is an interval */
(Diagram: i1 begins first and overlaps i2; i1 MINUS i2 runs from b1 to b2-1)
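COUNT, UNION, INTERSECT, and MINUS can be sketched in Python as follows, again assuming closed-closed intervals over integers. "Undefined" is modeled here by raising `ValueError`, which is a choice made for this sketch; the MINUS conditions are transcribed as they appear in the text above. All names are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    b: int  # begin point (inclusive)
    e: int  # end point (inclusive)


def overlaps(i1, i2):
    return i1.b <= i2.e and i2.b <= i1.e


def merges(i1, i2):
    # Overlapping or abutting, in either order.
    return overlaps(i1, i2) or i2.b == i1.e + 1 or i1.b == i2.e + 1


def count(i):
    """COUNT(i): number of points in i."""
    return i.e - i.b + 1


def union(i1, i2):
    """i1 UNION i2: defined only if the operands merge."""
    if not merges(i1, i2):
        raise ValueError("UNION undefined: operands do not merge")
    return Interval(min(i1.b, i2.b), max(i1.e, i2.e))


def intersect(i1, i2):
    """i1 INTERSECT i2: defined only if the operands overlap."""
    if not overlaps(i1, i2):
        raise ValueError("INTERSECT undefined: operands do not overlap")
    return Interval(max(i1.b, i2.b), min(i1.e, i2.e))


def minus(i1, i2):
    """i1 MINUS i2: defined only when the result is a single interval."""
    if i1.b < i2.b and i1.e < i2.e:
        return Interval(i1.b, min(i2.b - 1, i1.e))
    if i1.b > i2.b and i1.e > i2.e:
        return Interval(max(i2.e + 1, i1.b), i1.e)
    raise ValueError("MINUS undefined for these operands")
```

With i1 = [d04:d08] and i2 = [d06:d10], the union is [d04:d10], the intersection is [d06:d08], and i1 MINUS i2 is [d04:d05].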
SAMPLE QUERIES :
Get supplier numbers for suppliers who were able to supply part P2 on day 8
( SP_DURING WHERE P# = P# ('P2')
            AND   d08 ∈ DURING ) { S# }
Get pairs of suppliers who were able to supply the same part at the same time
WITH SP_DURING RENAME ( S# AS X#, DURING AS XD ) AS T1 ,
     SP_DURING RENAME ( S# AS Y#, DURING AS YD ) AS T2 ,
     T1 JOIN T2 AS T3 ,
     ( T3 WHERE XD OVERLAPS YD ) AS T4 ,
     ( T4 WHERE X# < Y# ) AS T5 :
T5 { X#, Y# }
Note the use of WITH to introduce names for expressions
Get pairs of suppliers who were able to supply the same part at the same time, together with the parts and times in question
WITH SP_DURING RENAME ( S# AS X#, DURING AS XD ) AS T1 ,
     SP_DURING RENAME ( S# AS Y#, DURING AS YD ) AS T2 ,
     T1 JOIN T2 AS T3 ,
     ( T3 WHERE XD OVERLAPS YD ) AS T4 ,
     ( T4 WHERE X# < Y# ) AS T5 ,
     ( EXTEND T5 ADD ( XD INTERSECT YD AS DURING ) ) AS T6 :
T6 { X#, Y#, P#, DURING }
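To see what these queries return for the sample SP_DURING value, here is a rough Python equivalent. It is a sketch, not a translation of Tutorial D: relations are modeled as sets of tuples, days as integers, and intervals as (begin, end) pairs, all of which are assumptions made for illustration.

```python
# Sample SP_DURING value: (S#, P#, DURING), with DURING as (begin, end).
SP_DURING = {
    ("S1", "P1", (4, 10)), ("S1", "P2", (5, 10)), ("S1", "P3", (9, 10)),
    ("S1", "P4", (5, 10)), ("S1", "P5", (4, 10)), ("S1", "P6", (6, 10)),
    ("S2", "P1", (2, 4)), ("S2", "P1", (8, 10)), ("S2", "P2", (3, 3)),
    ("S2", "P2", (9, 10)), ("S3", "P2", (8, 10)), ("S4", "P2", (6, 9)),
    ("S4", "P4", (4, 8)), ("S4", "P5", (5, 10)),
}


def overlaps(i1, i2):
    return i1[0] <= i2[1] and i2[0] <= i1[1]


def intersect(i1, i2):
    # Only called for overlapping operands, so the result is nonempty.
    return (max(i1[0], i2[0]), min(i1[1], i2[1]))


# Suppliers who were able to supply part P2 on day 8.
p2_on_day8 = {s for (s, p, (b, e)) in SP_DURING if p == "P2" and b <= 8 <= e}

# Pairs of suppliers able to supply the same part at the same time,
# together with the part and the common interval (T6 in the text).
t6 = {
    (x, y, p, intersect(xd, yd))
    for (x, p, xd) in SP_DURING
    for (y, q, yd) in SP_DURING
    if p == q and x < y and overlaps(xd, yd)
}

# Projecting on the supplier numbers alone gives T5 { X#, Y# }.
pairs = {(x, y) for (x, y, _p, _during) in t6}
```

The condition `p == q` plays the role of the natural join on P#, and `x < y` keeps each pair once.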
2. LAYING THE FOUNDATIONS :
Time and the DB
What’s the problem?
Intervals
Interval operators
The EXPAND and COLLAPSE operators
The PACK and UNPACK operators
Relational operators
THE EXPAND AND COLLAPSE OPERATORS :
Work on sets of intervals, not individual intervals (or pairs of intervals) per se
Each takes a set of intervals all of the same type as its single operand and returns another such set as its result
{ […], […], …, […] }
{ […], […], …, […] }
Result in each case is a particular canonical form for the original set
ASIDE : "CANONICAL FORM"
Given (a) a set S of objects and
(b) a notion of equivalence among such objects
subset C of S is said to be a set of canonical forms for S (under the stated definition of equivalence) if and only if every object s in S is equivalent to just one object c in C
Object c is the canonical form for object s
All "interesting" properties that apply to s also apply to c; thus, we can study just the small set C, not the large set S, in order to obtain or prove a variety of "interesting" results
EXPANDED FORM :
The objects we wish to study are sets of intervals, where the intervals are all of the same type
Let X1 and X2 be two such sets. Define equivalence:
X1 and X2 are equivalent if and only if set of all points in intervals in X1 = set of all points in intervals in X2
E.g.:
X1 = { [d01:d01], [d03:d05], [d04:d06] }
X2 = { [d01:d01], [d03:d04], [d05:d05], [d05:d06] }
EXPANDED FORM (cont.) :
Corresp set of points = { d01, d03, d04, d05, d06 }
But we’re more interested in corresp set of unit intervals:
X3 = { [d01:d01], [d03:d03], [d04:d04], [d05:d05], [d06:d06] }
X3 is equivalent to both X1 and X2 (it’s the expanded form of both)
If X is a set of intervals all of the same type, then the expanded form of X is the set of all intervals of the form [p:p] where p is a point in some interval in X
EXPANDED FORM (cont.) :
Given any such set X, a corresponding expanded form always exists; expanded form is equivalent to X and is unique
Expanded form of X is one possible canonical form for X
Unique set equivalent to X such that every contained interval is of minimum possible duration (viz., one)
X1 ≡ X2 if and only if they have the same expanded form
Intuitively, the expanded form of X allows us to focus on the information content of X at an atomic level, without worrying about the many different ways that information might be bundled together into clumps
COLLAPSED FORM :
X1 = { [d01:d01], [d03:d05], [d04:d06] }
X2 = { [d01:d01], [d03:d04], [d05:d05], [d05:d06] }
X3 = { [d01:d01], [d03:d03], [d04:d04], [d05:d05], [d06:d06] }
Expanded form here (X3) has greatest cardinality: fluke!
X4 = { [d01:d01], [d03:d03], [d03:d04], [d03:d05], [d03:d06], [d04:d04], [d04:d05], [d04:d06] }
X4 has same expanded form but greater cardinality than X3
COLLAPSED FORM (cont.) :
X5 = { [d01:d01], [d03:d06] }
X5 has same expanded form but minimum possible cardinality (it’s the collapsed form of X1, X2, X3, X4)
If X is a set of intervals all of the same type, then the collapsed form of X is the set Y of intervals of the same type such that:
X and Y have the same expanded form, and
No two distinct intervals i1 and i2 in Y are such that i1 MERGES i2 is true
COLLAPSED FORM (cont.) :
Given any such set X, a corresponding collapsed form always exists; collapsed form is equivalent to X and is unique
Collapsed form of X is another possible canonical form for X
Unique set equivalent to X that has the minimum possible cardinality
X1 ≡ X2 if and only if they have the same collapsed form
Intuitively, the collapsed form of X allows us to focus on the information content of X in a compressed (clumped) form, without worrying about the possibility that clumps might overlap or abut
LET X BE A SET OF INTERVALS ALL OF THE SAME TYPE :
EXPAND ( X ) : returns expanded form of X
COLLAPSE ( X ) : returns collapsed form of X
(What happens if X has cardinality zero? Or one?)
Ops are not inverses of each other! In fact:
EXPAND ( COLLAPSE ( X ) ) ≡ EXPAND ( X )
COLLAPSE ( EXPAND ( X ) ) ≡ COLLAPSE ( X )
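EXPAND and COLLAPSE on sets of intervals can be sketched in a few lines of Python. This is one possible implementation under stated assumptions (intervals as (begin, end) pairs of integers, closed-closed); the slides do not prescribe an algorithm, and the sort-and-merge approach in `collapse` is simply a convenient choice.

```python
def expand(x):
    """Expanded form: one unit interval [p:p] per point covered by x."""
    return {(p, p) for (b, e) in x for p in range(b, e + 1)}


def collapse(x):
    """Collapsed form: the intervals of x merged into maximal intervals."""
    result = []
    for b, e in sorted(x):
        if result and b <= result[-1][1] + 1:
            # This interval overlaps or meets the last one kept: merge them.
            last_b, last_e = result.pop()
            result.append((last_b, max(last_e, e)))
        else:
            result.append((b, e))
    return set(result)


X1 = {(1, 1), (3, 5), (4, 6)}
X2 = {(1, 1), (3, 4), (5, 5), (5, 6)}
X3 = {(1, 1), (3, 3), (4, 4), (5, 5), (6, 6)}
X4 = {(1, 1), (3, 3), (3, 4), (3, 5), (3, 6), (4, 4), (4, 5), (4, 6)}
X5 = {(1, 1), (3, 6)}
```

Under this sketch, an empty input yields an empty result for both operators, and a one-interval input is returned unchanged by `collapse`.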
NOW I NEED TO CLEAN UP MY ACT !
Relational model doesn’t support general sets, it supports relations!
However, a set of values v1, v2, …, vn all of the same type can easily be converted into a unary relation:
RELATION { TUPLE { A v1 } ,
           TUPLE { A v2 } ,
           … ,
           TUPLE { A vn } }
(a relation selector invocation)
Returns the unary relation:
A
v1
v2
.
.
vn
So let us replace EXPAND and COLLAPSE as previously described by versions in which argument is specified as a unary relation
r            EXPAND(r)    COLLAPSE(r)
DURING       DURING       DURING
[d06:d09]    [d01:d01]    [d01:d01]
[d04:d08]    [d04:d04]    [d04:d10]
[d05:d10]    [d05:d05]
[d01:d01]    [d06:d06]
             [d07:d07]
             [d08:d08]
             [d09:d09]
             [d10:d10]
Extend definition of equivalence accordingly
EXPANDING / COLLAPSING NULLARY RELATIONS :
Nullary relation has no attributes at all …
There are exactly two such relations!
TABLE_DEE has just one tuple (the "0-tuple")
TABLE_DUM has no tuples at all
Highly desirable to define versions of EXPAND and COLLAPSE for nullary relations (see later)
Definition: Result = input in every case
2. LAYING THE FOUNDATIONS :
Time and the DB
What’s the problem?
Intervals
Interval operators
The EXPAND and COLLAPSE operators
The PACK and UNPACK operators
Relational operators
THE PACK AND UNPACK OPERATORS :
Preliminaries
The PACK operator
The UNPACK operator
Sample queries
Packing and unpacking on no attributes
Packing and unpacking on several attributes
PRELIMINARY EXAMPLE :
r                 UNPACK r ON DURING    PACK r ON DURING
S#   DURING       S#   DURING           S#   DURING
S2   [d02:d04]    S2   [d02:d02]        S2   [d02:d05]
S2   [d03:d05]    S2   [d03:d03]        S4   [d02:d06]
S4   [d02:d05]    S2   [d04:d04]        S4   [d09:d10]
S4   [d04:d06]    S2   [d05:d05]
S4   [d09:d10]    S4   [d02:d02]
                  S4   [d03:d03]
                  S4   [d04:d04]
                  S4   [d05:d05]
                  S4   [d06:d06]
                  S4   [d09:d09]
                  S4   [d10:d10]
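Judging from this example, UNPACK and PACK on DURING behave like EXPAND and COLLAPSE applied to the DURING values separately for each S# value. The Python sketch below reproduces the tables on that reading; it is an inference from the example rather than a definition, and the relation is modeled as a set of (S#, (begin, end)) tuples with integer days.

```python
from collections import defaultdict


def unpack(r):
    """UNPACK r ON DURING: one tuple per supplier per covered day."""
    return {(s, (p, p)) for (s, (b, e)) in r for p in range(b, e + 1)}


def pack(r):
    """PACK r ON DURING: per supplier, merge intervals that overlap or meet."""
    by_supplier = defaultdict(list)
    for s, during in r:
        by_supplier[s].append(during)

    result = set()
    for s, intervals in by_supplier.items():
        merged = []
        for b, e in sorted(intervals):
            if merged and b <= merged[-1][1] + 1:
                # Overlaps or abuts the previous interval: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], e))
            else:
                merged.append((b, e))
        result.update((s, interval) for interval in merged)
    return result


r = {("S2", (2, 4)), ("S2", (3, 5)),
     ("S4", (2, 5)), ("S4", (4, 6)), ("S4", (9, 10))}
```

Here `pack(r)` yields the three tuples in the right-hand table and `unpack(r)` the eleven tuples in the middle one.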
RECALL QUERY A :
Get S#-FROM-TO triples for suppliers who have been able to supply at least one part during at least one interval of time, where FROM and TO together designate a maximal interval during which supplier S# was in fact able to supply at least one part
Revised version:
Get S#-DURING pairs for suppliers who have been able to supply at least one part during at least one interval of time, where DURING designates a maximal interval during which supplier S# was in fact able to supply at least one part
We will build up our formulation one step at a time ...
SP_DURING (SAMPLE VALUE) :
S#   P#   DURING
S1   P1   [d04:d10]
S1   P2   [d05:d10]
S1   P3   [d09:d10]
S1   P4   [d05:d10]
S1   P5   [d04:d10]
S1   P6   [d06:d10]
S2   P1   [d02:d04]
S2   P1   [d08:d10]
S2   P2   [d03:d03]
S2   P2   [d09:d10]
S3   P2   [d08:d10]
S4   P2   [d06:d09]
S4   P4   [d04:d08]
S4   P5   [d05:d10]
PROJECT AWAY PART NUMBERS :
WITH SP_DURING { S#, DURING } AS T1 :
/* part numbers irrelevant */
T1
S#   DURING
S1   [d04:d10]
S1   [d05:d10]
S1   [d09:d10]
S1   [d06:d10]
S2   [d02:d04]
S2   [d08:d10]
S2   [d03:d03]
S2   [d09:d10]
S3   [d08:d10]
S4   [d06:d09]
S4   [d04:d08]
S4   [d05:d10]
Note the redundancy: e.g., "Supplier S1 was able to supply something on day 6" appears three times!
DESIRED RESULT (ELIMINATING REDUNDANCY) :
S#   DURING
S1   [d04:d10]
S2   [d02:d04]
S2   [d08:d10]
S3   [d08:d10]
S4   [d04:d10]
Packed form of T1 on DURING: "PACK T1 ON DURING"
Note: Given DURING value for given supplier in RESULT does not necessarily exist as an explicit DURING value for that supplier in T1 (see, e.g., S4)
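The two steps for Query A (project away part numbers, then pack on DURING) can be traced in Python as a sanity check on the desired result. In this sketch packing is done by way of the covered days, i.e., by collecting each supplier's days and regrouping them into maximal runs; that is one way to obtain the packed form, chosen here for illustration. Integer days and tuple-based relations are assumptions.

```python
from collections import defaultdict

# Sample SP_DURING value: (S#, P#, DURING), with DURING as (begin, end).
SP_DURING = [
    ("S1", "P1", (4, 10)), ("S1", "P2", (5, 10)), ("S1", "P3", (9, 10)),
    ("S1", "P4", (5, 10)), ("S1", "P5", (4, 10)), ("S1", "P6", (6, 10)),
    ("S2", "P1", (2, 4)), ("S2", "P1", (8, 10)), ("S2", "P2", (3, 3)),
    ("S2", "P2", (9, 10)), ("S3", "P2", (8, 10)), ("S4", "P2", (6, 9)),
    ("S4", "P4", (4, 8)), ("S4", "P5", (5, 10)),
]

# Step 1: T1 = SP_DURING { S#, DURING }; the set removes duplicate tuples.
t1 = {(s, during) for (s, _p, during) in SP_DURING}

# Step 2: PACK T1 ON DURING. First collect the days covered per supplier...
days = defaultdict(set)
for s, (b, e) in t1:
    days[s].update(range(b, e + 1))

# ...then regroup each supplier's days into maximal runs of consecutive days.
result = set()
for s, covered in days.items():
    run_start = prev = None
    for d in sorted(covered):
        if prev is not None and d == prev + 1:
            prev = d                      # still inside the current run
            continue
        if run_start is not None:
            result.add((s, (run_start, prev)))   # close the previous run
        run_start = prev = d              # open a new run
    result.add((s, (run_start, prev)))    # close the final run
```

For S4 the single packed interval [d04:d10] appears in `result` even though no tuple of `t1` carries that DURING value.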