PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

30
PostgreSQL & Temporal PostgreSQL & Temporal Data Data Christopher Browne Christopher Browne Afilias Canada Afilias Canada PGCon 2009 PGCon 2009

Transcript of PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Page 1: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

PostgreSQL & Temporal PostgreSQL & Temporal DataDataChristopher BrowneChristopher Browne

Afilias CanadaAfilias Canada

PGCon 2009PGCon 2009

Page 2: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

AgendaAgenda

What kind of temporal data do we need?What kind of temporal data do we need?

What data types does PostgreSQL offer?What data types does PostgreSQL offer?

Temporality RepresentationsTemporality Representations

Time Travel, Transaction Tables, Serial Time Travel, Transaction Tables, Serial NumbersNumbers

Page 3: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

What kind of temporal What kind of temporal data do we need?data do we need?

Databases store facts about objects and Databases store facts about objects and eventsevents

Interesting times includeInteresting times include

When an event took placeWhen an event took place

When the event was recordedWhen the event was recorded

When someone was charged for the eventWhen someone was charged for the event

Page 4: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

More Interesting TimesMore Interesting Times

When you start recognizing income on the When you start recognizing income on the eventevent

When you end recognizing income on the When you end recognizing income on the eventevent

When an object state beginsWhen an object state begins

When an object state endsWhen an object state ends

Page 5: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

PostgreSQL Data TypesPostgreSQL Data Types

DateDateProblem: Pre-assumes evaluation of cutoff Problem: Pre-assumes evaluation of cutoff between days!between days!

Time with/without timezoneTime with/without timezoneProblem: Comparisons of Date+Time turn into Problem: Comparisons of Date+Time turn into hideous SQLhideous SQL

TimestampTimestampCombines Date + TimeCombines Date + Time

Page 6: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

PostgreSQL Data TypesPostgreSQL Data Types

Timestamp with time zoneTimestamp with time zoneAllows collecting time in ‘local times’ and Allows collecting time in ‘local times’ and recognizing thatrecognizing that

IntervalIntervalDifference between two times/timestampsDifference between two times/timestampsVery useful for indicating duration of time Very useful for indicating duration of time rangesranges

Page 7: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

OperatorsOperators

time/timestamp/date +|- interval = time/timestamp/date +|- interval = time/timestamp/datetime/timestamp/date

timestamp - timestamp = intervaltimestamp - timestamp = interval(likewise for the others)(likewise for the others)

timestamp <, <=, >, >= timestamptimestamp <, <=, >, >= timestamp

A BETWEEN B AND CA BETWEEN B AND CA >= B and A <= CA >= B and A <= C

Page 8: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Variations on “when is Variations on “when is it???”it???”

NOW(), transaction_timestamp, NOW(), transaction_timestamp, current_timestampcurrent_timestampall providing all providing startstart of transaction of transaction

statement_timestampstatement_timestamp

clock_timestampclock_timestamp

transaction commit timestamp - not available!transaction commit timestamp - not available!

Page 9: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Commit TimestampCommit Timestamp

Useful representation: Tables record (serverID, ctid)Useful representation: Tables record (serverID, ctid)

At COMMIT time, if the transaction has used this, then insert (serverID, At COMMIT time, if the transaction has used this, then insert (serverID, ctid, clock_timestamp) into timestamp tablectid, clock_timestamp) into timestamp table

Eliminates Slony-I “SYNC” thread & simplifies queriesEliminates Slony-I “SYNC” thread & simplifies queries

Helpful for multimaster replication strategiesHelpful for multimaster replication strategies

Adds a table full of timestamps that needs cleansing :-(Adds a table full of timestamps that needs cleansing :-(

Page 10: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

PGTemporalPGTemporal

PgFoundry project implementing PgFoundry project implementing (timestamp,timestamp) type + all logical operations(timestamp,timestamp) type + all logical operations

First aspect: Supports inclusive & exclusive periodsFirst aspect: Supports inclusive & exclusive periods

[ From, To ], ( From, To ), [ From, To ), ( From, To ][ From, To ], ( From, To ), [ From, To ), ( From, To ]

[ and ] indicate “inclusive” periods beginning and [ and ] indicate “inclusive” periods beginning and ending at the specified momentending at the specified moment

( and ) indicate exclusive periods excluding ( and ) indicate exclusive periods excluding endpointsendpoints

Page 11: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Inclusion & ExclusionInclusion & Exclusion

Commonly, [From, To) is the ideal representationCommonly, [From, To) is the ideal representation

Today’s data easily characterized as [2009-05-Today’s data easily characterized as [2009-05-22,2009-05-23)22,2009-05-23)

This month’s period: [2009-05-01, 2009-06-01)This month’s period: [2009-05-01, 2009-06-01)

Successive periods Successive periods do not overlapdo not overlap[2009-04-01,2009-05-01),[2009-05-01,2009-06-01)[2009-04-01,2009-05-01),[2009-05-01,2009-06-01)

Note that SQL “BETWEEN” is equivalent to [From,To]Note that SQL “BETWEEN” is equivalent to [From,To]

Page 12: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

A Veritable Panoply of A Veritable Panoply of OperatorsOperators

length(p), first(p), last(p), prior(p), next(p)length(p), first(p), last(p), prior(p), next(p)

contains(p, t), contains(p1, p2), contained_by(t, contains(p, t), contains(p1, p2), contained_by(t, p), contained_by(p1,p2), overlaps(p1,p2), p), contained_by(p1,p2), overlaps(p1,p2), adjacent(p1,p2), overleft(p1,p2), adjacent(p1,p2), overleft(p1,p2), overright(p1,p2), is_empty(p), equals(p1,p2), overright(p1,p2), is_empty(p), equals(p1,p2), nequals(p1,p2), before(p1,p2), after(p1,p2)nequals(p1,p2), before(p1,p2), after(p1,p2)

period(t), period(t1,t2), empty_period()period(t), period(t1,t2), empty_period()

period_intersect(p1,p2), period_union(p1,p2), period_intersect(p1,p2), period_union(p1,p2), minus(p1,p2)minus(p1,p2)

Page 13: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Core????Core????

Should PGTemporal be in core?Should PGTemporal be in core?

What would be needed for it to head in?What would be needed for it to head in?

Page 14: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Classical SQL Classical SQL TemporalityTemporality

Developing Time-Oriented Database Developing Time-Oriented Database Applications in SQL - Richard Snodgrass, Applications in SQL - Richard Snodgrass, available freely as PDFavailable freely as PDF

Uses periods much as in PGTemporalUses periods much as in PGTemporal

Standard SQL does not support periods, alas!Standard SQL does not support periods, alas!

Considerable attention to handling insertion of Considerable attention to handling insertion of past/future historypast/future history

Page 15: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Foreign Key ChallengesForeign Key Challenges

Nontemporal tables: No temporality, No problem!Nontemporal tables: No temporality, No problem!

Referencing table is temporal, referenced table Referencing table is temporal, referenced table isn’t: No problem!isn’t: No problem!

Referenced table is temporal Troublesome! Referenced table is temporal Troublesome!

Referential integrity may be violated simply via Referential integrity may be violated simply via passage of timepassage of time

Referenced & referencing tables may vary Referenced & referencing tables may vary independently!independently!

Page 16: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

PostgreSQL Time TravelPostgreSQL Time Travel

Take a stateful tableTake a stateful table

Add triggers to capture (From,To) timestamps Add triggers to capture (From,To) timestamps on INSERT, UPDATE, DELETEon INSERT, UPDATE, DELETE

Sadly, this breaks if you require referential Sadly, this breaks if you require referential integrity constraints pointing to this table :-(integrity constraints pointing to this table :-(

Page 17: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Time Travel ActionsTime Travel Actions

On INSERTOn INSERT

(NEW.From, NEW.To) = (NOW(), NULL)(NEW.From, NEW.To) = (NOW(), NULL)

On DELETEOn DELETE

(OLD.From, OLD.To) = (PrevValue, NOW())(OLD.From, OLD.To) = (PrevValue, NOW())

On UPDATEOn UPDATE

Transforms into DELETE old, INSERT newTransforms into DELETE old, INSERT new

Page 18: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Pulling Specific StatePulling Specific State

Current state:Current state:select * from table where endtime is NULLselect * from table where endtime is NULL

State at a particular time: Set Returning State at a particular time: Set Returning FunctionFunctionselect * from table_at_time(ts)select * from table_at_time(ts)

Pulls tuples effective at that timePulls tuples effective at that time

starttime <= tsstarttime <= ts

endtime is null or endtime >= tsendtime is null or endtime >= ts

Page 19: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Explicit Temporal TablesExplicit Temporal Tables

Accept that it’s temporal to begin withAccept that it’s temporal to begin with

Not just a way to get “history for free”Not just a way to get “history for free”

Enables Science Fiction: Declaring future Enables Science Fiction: Declaring future state!state!

At 9am next Wednesday, state will changeAt 9am next Wednesday, state will change

Eliminates need for “batch jobs”Eliminates need for “batch jobs”

May need to pre-record future-dated events!May need to pre-record future-dated events!

Page 20: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Science Fiction....Science Fiction....

Page 21: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

ProblemsProblems

Foreign key references into temporal tables are Foreign key references into temporal tables are problematicproblematic

Overlap?Overlap?

Reference disappearing?Reference disappearing?

Fixing problems requires “fabricating a Fixing problems requires “fabricating a historical story” not just “fixing the state”historical story” not just “fixing the state”

Page 22: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Temporality via Tx Temporality via Tx ReferencesReferences

create table transactions (create table transactions ( tx_id integer primary key default tx_id integer primary key default nextval(‘tx_seq’),nextval(‘tx_seq’), whodunnit integer not null references whodunnit integer not null references users(user_id),users(user_id), and_when timestamptz not null default NOW()); and_when timestamptz not null default NOW());

create table slightly_temporal_object (create table slightly_temporal_object ( object_id serial primary key, object_id serial primary key, tx_id integer not null default currval(‘tx_seq’) tx_id integer not null default currval(‘tx_seq’) references transactions(tx_id)); references transactions(tx_id));

Page 23: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Getting More Temporal - Getting More Temporal - II

Add ON UPDATE trigger that updates tx_id to Add ON UPDATE trigger that updates tx_id to currval(‘tx_seq’)currval(‘tx_seq’)

Page 24: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

More Temporal: History!More Temporal: History!

Create a “past history” tableCreate a “past history” table

Similar schema, but drop all data validationSimilar schema, but drop all data validation

Add end_txAdd end_tx

UPDATE/DELETE throw obsolete tuples into UPDATE/DELETE throw obsolete tuples into the “past history table”the “past history table”

Data validation dropped because validation Data validation dropped because validation can change over timecan change over time

Page 25: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Serial Number Serial Number TemporalityTemporality

Used in DNSUsed in DNS

Sets of updates grouped together temporallySets of updates grouped together temporally

A “bump of serial number” indicates A “bump of serial number” indicates common publishing at a common point in common publishing at a common point in timetime

Page 26: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

ObjectObject ValueValue ZonZonee

FroFromm

ToTo

ns1.abc.orgns1.abc.org 10.2.3.110.2.3.1 orgorg 11 33

ns1.abc.orgns1.abc.org 10.2.3.210.2.3.2 orgorg 33

ns2.abc.orgns2.abc.org 10.2.2.110.2.2.1 orgorg 22

ns3.abc.orgns3.abc.org 10.9.1.210.9.1.2 orgorg 11 33

ns1.abc.orgns1.abc.org 10.2.3.110.2.3.1 infoinfo 1717 1919

ns2.abc.orgns2.abc.org 10.2.3.210.2.3.2 infoinfo 1414 1818

ns2.abc.orgns2.abc.org 10.9.1.210.9.1.2 infoinfo 1818

ns3.abc.orgns3.abc.org 141.2.3.141.2.3.44

infoinfo 1919

Page 27: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Zone Representation Zone Representation MeritsMerits

It’s fast. We extract multimillion record zones It’s fast. We extract multimillion record zones in minutesin minutes

Arbitrary ability to roll back...Arbitrary ability to roll back...

Nicely supports DNS AXFR/IXFR operationsNicely supports DNS AXFR/IXFR operations

Each serial # represents a sort of “Logical Each serial # represents a sort of “Logical Commit”Commit”

Page 28: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Further Merits of thisFurther Merits of this

Rename “zone” to “module” and this is nice for Rename “zone” to “module” and this is nice for configurationconfiguration

We already know it supports large amounts of We already know it supports large amounts of data efficientlydata efficiently

Configuration is smaller (we hope!)Configuration is smaller (we hope!)

Page 29: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

Demerits of zone-like Demerits of zone-like structurestructure

No way to specify a point of time in the futureNo way to specify a point of time in the future

Serial numbers are intended to just keep rolling Serial numbers are intended to just keep rolling alongalong

HOWEVER....HOWEVER....

With complex apps & configuration, fancier With complex apps & configuration, fancier temporality looks like a misfeaturetemporality looks like a misfeature

Page 30: PostgreSQL & Temporal Data Christopher Browne Afilias Canada PGCon 2009.

ConclusionsConclusions

3 ways to represent temporal information3 ways to represent temporal information

Timestamps, Transaction IDs, Serial numbersTimestamps, Transaction IDs, Serial numbers

PostgreSQL changes possiblePostgreSQL changes possible

Should PGtemporal be added to “core”?Should PGtemporal be added to “core”?

Should we try to have temporal foreign key Should we try to have temporal foreign key functionality in core?functionality in core?