Download - Serializable Snapshot Isolation in PostgreSQL

Transcript
Page 1: Serializable Snapshot Isolation in PostgreSQL

Serializable Snapshot Isolation in PostgreSQL

Dan PortsUniversity of Washington

MIT

Kevin GrittnerWisconsin Supreme Court

Tuesday, August 28, 2012

Page 2: Serializable Snapshot Isolation in PostgreSQL

For years, PostgreSQL’s “SERIALIZABLE” mode did not provide true serializability

• instead: snapshot isolation – allows anomalies

PostgreSQL 9.1: Serializable Snapshot Isolation

• based on recent research [Cahill, SIGMOD ’08]

• first implementation in a production DB release & first in a purely-snapshot DB

Tuesday, August 28, 2012

Page 3: Serializable Snapshot Isolation in PostgreSQL

This talk....

• Motivation: Why serializability?   Why did we choose SSI?

• Review of snapshot isolation and SSI

• Implementation challenges & optimizations

• Performance

Tuesday, August 28, 2012

Page 4: Serializable Snapshot Isolation in PostgreSQL

Serializability vs. Performance

Two perspectives:

• Serializability is important for correctness

• simplifies development; don’t need to worry about race conditions

• Serializability is too expensive to use

• locking restricts concurrency; use weaker isolation levels instead

Tuesday, August 28, 2012

Page 5: Serializable Snapshot Isolation in PostgreSQL

Serializability vs. Performance(in PostgreSQL)

PostgreSQL offered snapshot isolation instead

• better performance than 2-phase locking“readers don’t block writers, writers don’t block readers”

• but doesn’t guarantee serializability!

Snapshot isolation isn’t enough for some users

• complex databases with strict integrity requirements,e.g. Wisconsin Court System

Tuesday, August 28, 2012

Page 6: Serializable Snapshot Isolation in PostgreSQL

Serializability vs. Performance(in PostgreSQL)

PostgreSQL offered snapshot isolation instead

• better performance than 2-phase locking“readers don’t block writers, writers don’t block readers”

• but doesn’t guarantee serializability!

Snapshot isolation isn’t enough for some users

• complex databases with strict integrity requirements,e.g. Wisconsin Court System

Serializable Snapshot Isolationoffered true serializability with

performance benefits of snapshot isolation!Tuesday, August 28, 2012

Page 7: Serializable Snapshot Isolation in PostgreSQL

Serializable Snapshot Isolation

SSI approach:

• run transactions using snapshot isolation

• detect conflicts between transactions at runtime;abort transactions to prevent anomalies

Appealing for performance reasons

• aborts less common than blocking under 2PL

• readers still don’t block writers!

[Cahill et al. Serializable Isolation for Snapshot Databases, SIGMOD ’08]Tuesday, August 28, 2012

Page 8: Serializable Snapshot Isolation in PostgreSQL

SSI in PostgreSQL

Available in PostgreSQL 9.1; first production implementation

Contributions: new implementation techniques

• Detecting conflicts in a purely-snapshot DB

• Limiting memory usage

• Read-only transaction optimizations

• Integration with other PostgreSQL features

Tuesday, August 28, 2012

Page 9: Serializable Snapshot Isolation in PostgreSQL

Outline

• Motivation

• Review of snapshot isolation and SSI

• Implementation challenges & optimizations

• Performance

• Conclusions

Tuesday, August 28, 2012

Page 10: Serializable Snapshot Isolation in PostgreSQL

Goal: ensure at least one guard always on-duty

guard on-duty?on-duty?

Alice y

Bob y

Tuesday, August 28, 2012

Page 11: Serializable Snapshot Isolation in PostgreSQL

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y

if > 1 { UPDATE guards SET on-duty = n WHERE guard = x}

COMMIT

Goal: ensure at least one guard always on-duty

guard on-duty?on-duty?

Alice y

Bob y

Tuesday, August 28, 2012

Page 12: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

Tuesday, August 28, 2012

Page 13: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

Tuesday, August 28, 2012

Page 14: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

Tuesday, August 28, 2012

Page 15: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

Tuesday, August 28, 2012

Page 16: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

Tuesday, August 28, 2012

Page 17: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

Tuesday, August 28, 2012

Page 18: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

n

Tuesday, August 28, 2012

Page 19: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

n

Tuesday, August 28, 2012

Page 20: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

n

rw-conflict:T1 didn’t seeT2’s UPDATE

Tuesday, August 28, 2012

Page 21: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

n

rw-conflict:T1 didn’t seeT2’s UPDATE

rw-conflict:T2 didn’t seeT1’s UPDATE

Tuesday, August 28, 2012

Page 22: Serializable Snapshot Isolation in PostgreSQL

SSI Approach

Detect these rw-conflicts and maintain a conflict graphSerializability theory: each anomaly involves two adjacent rw-conflict edges

• if found, abort some involved transaction

• note: can have false positives

Tuesday, August 28, 2012

Page 23: Serializable Snapshot Isolation in PostgreSQL

T2

rw-conflict:T1 didn’t seeT2’s UPDATE

T1

rw-conflict:T2 didn’t seeT1’s UPDATE

two adjacent edges: T1 -> T2 and T2 -> T1

Tuesday, August 28, 2012

Page 24: Serializable Snapshot Isolation in PostgreSQL

T2

rw-conflict:T1 didn’t seeT2’s UPDATE

T1

rw-conflict:T2 didn’t seeT1’s UPDATE

two adjacent edges: T1 -> T2 and T2 -> T1

XTuesday, August 28, 2012

Page 25: Serializable Snapshot Isolation in PostgreSQL

T2

rw-conflict:T1 didn’t seeT2’s UPDATE

T1

rw-conflict:T2 didn’t seeT1’s UPDATE

two adjacent edges: T1 -> T2 and T2 -> T1

XERROR: could not serialize access due to

read/write dependencies among transactionsHINT: The transaction might succeed if retried.

Tuesday, August 28, 2012

Page 26: Serializable Snapshot Isolation in PostgreSQL

Outline• Motivation

• Review of snapshot isolation and SSI

• Implementation challenges & optimizations

• Performance

• Conclusions

Tuesday, August 28, 2012

Page 27: Serializable Snapshot Isolation in PostgreSQL

SSI in PostgreSQL

Implementation challenges:

• Detecting conflicts in a purely-snapshot DB

• requires new lock manager

• Reining in potentially-unbounded memory usage

Tuesday, August 28, 2012

Page 28: Serializable Snapshot Isolation in PostgreSQL

Detecting ConflictsHow to detect when an update conflicts with a previous read?Previous SSI implementations:reuse read locks from existing lock mgrBut...

• PostgreSQL didn’t have read locks!

• ...let alone predicate locks

Tuesday, August 28, 2012

Page 29: Serializable Snapshot Isolation in PostgreSQL

SSI Lock Manager

Needed to build a new lock managerto track read dependencies

• Uses multigranularity locks, index-range locks

• Doesn’t block, just flags conflicts => no deadlocks

• Locks need to persist past transaction commit

Tuesday, August 28, 2012

Page 30: Serializable Snapshot Isolation in PostgreSQL

Memory UsageNeed to keep track of transaction readsets + conflict graph

• not just active transactions; alsocommitted ones that ran concurrently

• one long-running transaction can cause memory usage to grow without bound

Could exhaust shared memory space (esp. in PostgreSQL)

Tuesday, August 28, 2012

Page 31: Serializable Snapshot Isolation in PostgreSQL

Read-Only TransactionsMany long-running transactions are read-only; optimize for these

Safe snapshots: cases where r/o transactions can never be a part of an anomaly

• can then run using regular SI w/o SSI overhead

• but: can only detect once all concurrent r/w transactions complete

Deferrable transactions: delay execution to ensure safe snapshot

Tuesday, August 28, 2012

Page 32: Serializable Snapshot Isolation in PostgreSQL

Graceful DegradationWhat if we still run out of memory?

Don’t want to refuse to accept new transactions

Instead: keep less information (tradeoff: more false positives)

• keep less state about committed transactions

• deduplicate readsets: “read by some committed transaction”

Tuesday, August 28, 2012

Page 33: Serializable Snapshot Isolation in PostgreSQL

Outline

• Motivation

• Review of snapshot isolation and SSI

• Implementation challenges & optimizations

• Performance

• Conclusions

Tuesday, August 28, 2012

Page 34: Serializable Snapshot Isolation in PostgreSQL

Performance

TPC-C-derived benchmark;modified to have SI anomaliesVaried fraction of r/o and r/w transactionsCompared PostgreSQL 9.1’s SSI against SI, and an implementation of S2PL

Tuesday, August 28, 2012

Page 35: Serializable Snapshot Isolation in PostgreSQL

Performance (in-memory)

0.4x

0.5x

0.6x

0.7x

0.8x

0.9x

1.0x

0% 20% 40% 60% 80% 100%

SISSISSI (no r/o opt.)S2PL

25 warehouses (3 GB), tmpfs

trans. rate(normalized)

Fraction of read-only transactions

Tuesday, August 28, 2012

Page 36: Serializable Snapshot Isolation in PostgreSQL

Performance (disk)150 warehouses (19 GB)

trans. rate(normalized)

Fraction of read-only transactions

0.4x

0.5x

0.6x

0.7x

0.8x

0.9x

1.0x

0% 20% 40% 60% 80% 100%

SISSIS2PL

Tuesday, August 28, 2012

Page 37: Serializable Snapshot Isolation in PostgreSQL

ConclusionSSI available now in PostgreSQL 9.1

• true serializability without blocking

• new lock manager to track read dependencies

• optimizations for read-only transactions

Performance close to that of SI

• outperforms S2PL on read-heavy workloads

• makes serializable mode a more practical option for some users

Tuesday, August 28, 2012