Sándor Héman Marcin Zukowski Niels Nes Lefteris Sidirourgos Peter Boncz Positional Update Handling...

29
Sándor Héman Marcin Zukowski Niels Nes Lefteris Sidirourgos Peter Boncz Positional Update Handling in Column Stores

Transcript of Sándor Héman Marcin Zukowski Niels Nes Lefteris Sidirourgos Peter Boncz Positional Update Handling...

Sándor HémanMarcin ZukowskiNiels NesLefteris SidirourgosPeter Boncz

Positional Update Handlingin Column Stores

UPDATE IN PLACE:A Poison Apple?Jim Gray, 1981

“..for performance reasons, most disc-based systems have been seduced into updating the data in place.”

30 years of hardware improvements in sequential/throughput beating random/latency…. in-place less feasible every year.

alternative: differential approach.In column stores, in-place updating is by now clearly infeasible

Problem: Column Store Updates

• I/O proportional to number of attributes– I/O blocks large and compressed– Sometimes even replicated– Read-Optimized Update-Unfriendly

• Table often kept ordered on sort-key (SK) attributes– Uniform update load scattered write access

Solution: Differential Structure

• Maintain updates (INS/DEL/MOD) in a differential structure– Merge with base table during scan

Solution: Differential Structure

• Maintain updates (INS/DEL/MOD) in a differential structure– Merge with base table during scan

• Challenges:– Efficiently maintainable data-structure– Minimize Merge impact for read-only queries

Naïve Approach: Delta Tables

• For each table, maintain two update friendly row-store tables:– INS(C1..Cn)– DEL(SK1..SKm)– MOD = DEL + INS

store prod new qty

London stool N 10

London table N 20

Paris rug N 1

Paris stool N 5

Base table: inventorySort-Key (SK): [store, prod]

store prod new qty

Berlin chair Y 5

Berlin cloth Y 20

Inserts table: INS

store prod

Paris rug

Deletes table: DEL

Naïve Approach: Delta Tables

store prod new qty

London stool N 10

London table N 20

Paris rug N 1

Paris stool N 5

Base table: inventorySort-Key (SK): [store, prod]

store prod new qty

Berlin chair Y 5

Berlin cloth Y 20

Inserts table: INS

store prod

Paris rug

Deletes table: DEL

• Rewrite table scans:

MergeUnion[store,prod](Scan(INS), MergeDiff[store,prod]( Scan(Inventory), Scan(DEL)))

Naïve Approach: Delta Tables

• Rewrite table scans:

MergeUnion[store,prod](Scan(INS), MergeDiff[store,prod]( Scan(Inventory), Scan(DEL)))

for up-to-date image• Expensive!

– I/O to scan SK ‘merge’ columns; also if querydoes not need SK cols

– Each query pays CPU effort to locate the same change positions over and over again

store prod new qty

Berlin chair Y 5

Berlin cloth Y 20

London stool N 10

London table N 20

Paris stool N 5

Actual table: inventorySort-Key (SK): [store, prod]

The Idea: Positional Updates

• Remember the position of an update rather than its SK values– Merge once at write Read-Optimized approach– No need to scan SK columns– Scan can skip less CPU overhead

Notation:• TABLEx state of TABLE at time x• SID(t): StableID

– Position of tuple t in immutable base TABLE0 Stable

• RIDx(t): RowID– Position of visible tuple t at time x VOLATILE!– SID(t) = RID0(t)

SID STORE PROD NEW QTY RID

0 Berlin chair Y 5 0

0 Berlin cloth Y 20 1

0 Berlin table Y 10 2

0 London chair N 30 3

1 London stool N 10 4

2 London table N 20 5

3 Paris rug N 1 6

4 Paris stool N 5 7

SID/RID Example

INSERT INTO inventory VALUES(‘Berlin’, ‘table’, Y, 10)INSERT INTO inventory VALUES(‘Berlin’, ‘cloth’, Y, 20)INSERT INTO inventory VALUES(‘Berlin’, ‘chair’, Y, 5)

TABLE1

SID STORE PROD NEW QTY RID

0 London chair N 30 0

1 London stool N 10 1

2 London table N 20 2

3 Paris rug N 1 3

4 Paris stool N 5 4

TABLE0

SIDs and RIDs

• RID(t) = SID(t) + ∆(t)• ∆(t) = #inserts before t – #deletes before t

= RID(t) – SID(t)• SID and RID are monotonically increasing

– organize positional updates on SID in a counting B-Tree that keeps track cumulative deltas (∆)• Positional Delta Tree (PDT)

– SIDs are stable– Only need to maintain cumulative ∆ on path root leaf

PDT Example

STORE PROD NEW QTY

Berlin table Y 10

Berlin cloth Y 20

Berlin chair Y 5

INSERT INTO inventory VALUES(‘Berlin’, ‘table’, Y, 10)INSERT INTO inventory VALUES(‘Berlin’, ‘cloth’, Y, 20)INSERT INTO inventory VALUES(‘Berlin’, ‘chair’, Y, 5)

0

2 1

SID

0 0

ins ins

i2 i1

SID

type

value

0

ins

i0

SID

type

value

Insert Value Table

i0

i1

i2

SID STORE PROD NEW QTY RID

0 London chair N 30 0

1 London stool N 10 1

2 London table N 20 2

3 Paris rug N 1 3

4 Paris stool N 5 4

TABLE0

PDT Example

TABLE1

DELETE FROM inventory WHERE store = ‘Berlin’ AND prod = ‘table’DELETE FROM inventory WHERE store = ‘Paris’ AND prod = ‘rug’

SID STORE PROD NEW QTY RID

0 Berlin chair Y 5 0

0 Berlin cloth Y 20 1

0 Berlin table Y 10 2

0 London chair N 30 3

1 London stool N 10 4

2 London table N 20 5

3 Paris rug N 1 6

4 Paris stool N 5 7

STORE PROD NEW QTY

Berlin table Y 10

Berlin cloth Y 20

Berlin chair Y 5

0

2 -1

SID

0 0

ins ins

i2 i1

SID

type

value

3

del

d0

SID

type

value

Insert Value Table

i0

i1

i2

PDT Example

TABLE2

INSERT INTO inventory VALUES (‘Paris’, ‘rack’, Y, 4)

SID STORE PROD NEW QTY RID

0 Berlin chair Y 5 0

0 Berlin cloth Y 20 1

0 London chair N 30 2

1 London stool N 10 3

2 London table N 20 4

4 Paris stool N 5 5

Insert at RID = 5

STORE PROD NEW QTY

Berlin table Y 20

Berlin cloth Y 5

Berlin chair Y 10

0

2 -1

SID

0 0

ins ins

i2 i1

SID

type

value

3

del

d0

SID

type

value

Insert Value Table

i0

i1

i2

STORE PROD NEW QTY

Paris rack Y 4

Berlin cloth Y 20

Berlin chair Y 5

0

2 -1

SID

0 0

ins ins

i2 i1

SID

type

value

3 3

ins del

i0 d0

SID

type

value

Insert Value Table

i0

i1

i2

RID 5 > 0 + 2

PDT Example

0 0

ins ins

i2 i1

SID

type

value

0 1

0 1RID

0

ins

i4

SID

type

value

2

2RID

1

ins

i3

SID

type

value

3

4RID

3 3

ins del

i0 d0

SID

type

value

4 5

7 8RID

0

2 1

SID

RID

∆ 2

2

3

1 0

SID

RID

∆ 4

7

1

3 1

SID

RID

∆ 3

4

INSERT INTO inventory VALUES (‘London’, ‘rack’, Y, 4)INSERT INTO inventory VALUES (‘Berlin’, ‘rack’, Y, 4)

Separator SIDs

Subtree ∆

Separator RIDs

Running ∆

Stacking PDTs• Arbitrary number of layers: “deltas on deltas on ..”

– RID domain of child PDT = SID domain of parent PDT

generalization:• PDT contains all differences in time [lo,hi]

Table

PDT

PDT

PDT

lohi

PDTt1t2

PDTt0t1

consecutive t2=t1

PDT t2t3

PDT t0t1

vs are

Table

PDT

PDT

PDT

PDT

PDT

PDT PDTt2t3

Stacking PDTs• Arbitrary number of layers: “deltas on deltas on ..”

– RID domain of child PDT = SID domain of parent PDT

generalization:• PDT contains all differences in time [lo,hi]

Table

lohi

consecutive t2=t1

aligned t2=t0

“same base”

PDT t2t3

PDT t0t1

vs are

Table

PDT PDT

Stacking PDTs• Arbitrary number of layers: “deltas on deltas on ..”

– RID domain of child PDT = SID domain of parent PDT

generalization:• PDT contains all differences in time [lo,hi]

Table

lohi

consecutive t2=t1

aligned t2=t0

“same base”

overlapping [t2,t3] overlaps [t0,t1]“uncomparable” / “incompatible”

PDT t2t3

PDT t0t1

vs are

Table

PDT PDT

PDT

Stacking for Isolation

• ‘lock’ PDT down for further updates– Immutable read-PDT BIG: main memory resident

• ‘stack’ empty PDT on top– Updateable write-PDT SMALL: L2 cache resident– Note: PDTs are consecutive

• once in a while changes are propagated– Propagate() operation

• Requires consecutive PDTs

Stable Table

Read-PDT

Write-PDT

TABLEx

Propagate()

Read-PDT

Stable Table

Read-PDT

Write-PDT

TABLEx

Write-PDT

Trans PDT

CopyWrite-PDT

TransactionState

Snapshot Isolation

• Transaction creates snapshot copy of write-PDT

• Updates go into trans-PDT

• On commit, Propagate() trans-PDT into write-PDT

Pro

pag

ate

()

Optimistic Concurrency Control

Stable Table

Read-PDT

Write-PDT

Trans PDT

TABLEx

CopyWrite-PDT

TransA Trans

PDT

CopyWrite-PDT

TransB

• Two concurrent transactions

Optimistic Concurrency Control

Stable Table

Read-PDT

Trans PDT

TABLEx

CopyWrite-PDT

TransA Trans

PDT

CopyWrite-PDT

TransB

Propagate()

Write-PDT

• Two concurrent transactions• A commits before B

Optimistic Concurrency Control

Stable Table

Read-PDT

Write-PDT

Trans PDT

TABLEx

TransA Trans

PDT

CopyWrite-PDT

TransB

Pro

paga

te()

• Two concurrent transactions• A commits before B• Can not commit B into

modified write-PDT!– A changed RID enumeration

Optimistic Concurrency Control

Stable Table

Read-PDT

Write-PDT

Trans PDT

TABLEx

TransA

TransB

Serialize()Trans PDT

• Two concurrent transactions• A commits before B• Can not commit B into

modified write-PDT!– A changed RID enumeration

• Serialize(A, B)– Makes aligned PDTs consecutive– MAY FAIL!! trans abort

= succeeds if no conflict= write set intersection

Consecutive!Trans PDT

Optimistic Concurrency Control

Stable Table

Read-PDT

Trans PDT

TABLEx

TransA Trans

PDT

TransB

Write-PDT

Prop

agat

e()

• Two concurrent transactions• A commits before B• Can not commit B into

modified write-PDT!– A changed RID enumeration

• Serialize(A, B)– Makes aligned PDTs consecutive– MAY FAIL!! trans abort

= succeeds if no conflict= write set intersection

Extend to any number of concurrent transactions by serializing against all PDTs of transactions that committed during its lifetime

(a.k.a. backward looking OCC)

Serialize()

Concluding..

• PDTs speed-up differential update merging– Reduced I/O volume– Reduced CPU merge overhead

• Tree structure – logarithmic lookup & maintenance of volatile RIDs– main operations: Merge(), Propagate(), Serialize()

• PDTs are stackable, and capture Write-Set– Great structure for Snapshot Isolation

• Formal definitions, algorithms and benchmarks in paper

Thank you!

Microbenchmarks

TPCH-30