Serializable Snapshot Isolation in PostgreSQL

37
Serializable Snapshot Isolation in PostgreSQL Dan Ports University of Washington MIT Kevin Grittner Wisconsin Supreme Court Tuesday, August 28, 2012

Transcript of Serializable Snapshot Isolation in PostgreSQL

Page 1: Serializable Snapshot Isolation in PostgreSQL

Serializable Snapshot Isolation in PostgreSQL

Dan PortsUniversity of Washington

MIT

Kevin GrittnerWisconsin Supreme Court

Tuesday, August 28, 2012

Page 2: Serializable Snapshot Isolation in PostgreSQL

For years, PostgreSQL’s “SERIALIZABLE” mode did not provide true serializability

• instead: snapshot isolation – allows anomalies

PostgreSQL 9.1: Serializable Snapshot Isolation

• based on recent research [Cahill, SIGMOD ’08]

• first implementation in a production DB release & first in a purely-snapshot DB

Tuesday, August 28, 2012

Page 3: Serializable Snapshot Isolation in PostgreSQL

This talk....

• Motivation: Why serializability?   Why did we choose SSI?

• Review of snapshot isolation and SSI

• Implementation challenges & optimizations

• Performance

Tuesday, August 28, 2012

Page 4: Serializable Snapshot Isolation in PostgreSQL

Serializability vs. Performance

Two perspectives:

• Serializability is important for correctness

• simplifies development; don’t need to worry about race conditions

• Serializability is too expensive to use

• locking restricts concurrency; use weaker isolation levels instead

Tuesday, August 28, 2012

Page 5: Serializable Snapshot Isolation in PostgreSQL

Serializability vs. Performance(in PostgreSQL)

PostgreSQL offered snapshot isolation instead

• better performance than 2-phase locking“readers don’t block writers, writers don’t block readers”

• but doesn’t guarantee serializability!

Snapshot isolation isn’t enough for some users

• complex databases with strict integrity requirements,e.g. Wisconsin Court System

Tuesday, August 28, 2012

Page 6: Serializable Snapshot Isolation in PostgreSQL

Serializability vs. Performance(in PostgreSQL)

PostgreSQL offered snapshot isolation instead

• better performance than 2-phase locking“readers don’t block writers, writers don’t block readers”

• but doesn’t guarantee serializability!

Snapshot isolation isn’t enough for some users

• complex databases with strict integrity requirements,e.g. Wisconsin Court System

Serializable Snapshot Isolationoffered true serializability with

performance benefits of snapshot isolation!Tuesday, August 28, 2012

Page 7: Serializable Snapshot Isolation in PostgreSQL

Serializable Snapshot Isolation

SSI approach:

• run transactions using snapshot isolation

• detect conflicts between transactions at runtime;abort transactions to prevent anomalies

Appealing for performance reasons

• aborts less common than blocking under 2PL

• readers still don’t block writers!

[Cahill et al. Serializable Isolation for Snapshot Databases, SIGMOD ’08]Tuesday, August 28, 2012

Page 8: Serializable Snapshot Isolation in PostgreSQL

SSI in PostgreSQL

Available in PostgreSQL 9.1; first production implementation

Contributions: new implementation techniques

• Detecting conflicts in a purely-snapshot DB

• Limiting memory usage

• Read-only transaction optimizations

• Integration with other PostgreSQL features

Tuesday, August 28, 2012

Page 9: Serializable Snapshot Isolation in PostgreSQL

Outline

• Motivation

• Review of snapshot isolation and SSI

• Implementation challenges & optimizations

• Performance

• Conclusions

Tuesday, August 28, 2012

Page 10: Serializable Snapshot Isolation in PostgreSQL

Goal: ensure at least one guard always on-duty

guard on-duty?on-duty?

Alice y

Bob y

Tuesday, August 28, 2012

Page 11: Serializable Snapshot Isolation in PostgreSQL

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y

if > 1 { UPDATE guards SET on-duty = n WHERE guard = x}

COMMIT

Goal: ensure at least one guard always on-duty

guard on-duty?on-duty?

Alice y

Bob y

Tuesday, August 28, 2012

Page 12: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

Tuesday, August 28, 2012

Page 13: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

Tuesday, August 28, 2012

Page 14: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

Tuesday, August 28, 2012

Page 15: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

Tuesday, August 28, 2012

Page 16: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

Tuesday, August 28, 2012

Page 17: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

Tuesday, August 28, 2012

Page 18: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

n

Tuesday, August 28, 2012

Page 19: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

n

Tuesday, August 28, 2012

Page 20: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

n

rw-conflict:T1 didn’t seeT2’s UPDATE

Tuesday, August 28, 2012

Page 21: Serializable Snapshot Isolation in PostgreSQL

guard on-duty?on-duty?

Alice y

Bob y

BEGIN

SELECT count(*) FROM guardWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guards = ‘Bob’}COMMIT

BEGIN

SELECT count(*) FROM guardsWHERE on-duty = y [result = 2]

if > 1 { UPDATE guards SET on-duty = n WHERE guard = ‘Alice’}COMMIT

n

n

rw-conflict:T1 didn’t seeT2’s UPDATE

rw-conflict:T2 didn’t seeT1’s UPDATE

Tuesday, August 28, 2012

Page 22: Serializable Snapshot Isolation in PostgreSQL

SSI Approach

Detect these rw-conflicts and maintain a conflict graphSerializability theory: each anomaly involves two adjacent rw-conflict edges

• if found, abort some involved transaction

• note: can have false positives

Tuesday, August 28, 2012

Page 23: Serializable Snapshot Isolation in PostgreSQL

T2

rw-conflict:T1 didn’t seeT2’s UPDATE

T1

rw-conflict:T2 didn’t seeT1’s UPDATE

two adjacent edges: T1 -> T2 and T2 -> T1

Tuesday, August 28, 2012

Page 24: Serializable Snapshot Isolation in PostgreSQL

T2

rw-conflict:T1 didn’t seeT2’s UPDATE

T1

rw-conflict:T2 didn’t seeT1’s UPDATE

two adjacent edges: T1 -> T2 and T2 -> T1

XTuesday, August 28, 2012

Page 25: Serializable Snapshot Isolation in PostgreSQL

T2

rw-conflict:T1 didn’t seeT2’s UPDATE

T1

rw-conflict:T2 didn’t seeT1’s UPDATE

two adjacent edges: T1 -> T2 and T2 -> T1

XERROR: could not serialize access due to

read/write dependencies among transactionsHINT: The transaction might succeed if retried.

Tuesday, August 28, 2012

Page 26: Serializable Snapshot Isolation in PostgreSQL

Outline• Motivation

• Review of snapshot isolation and SSI

• Implementation challenges & optimizations

• Performance

• Conclusions

Tuesday, August 28, 2012

Page 27: Serializable Snapshot Isolation in PostgreSQL

SSI in PostgreSQL

Implementation challenges:

• Detecting conflicts in a purely-snapshot DB

• requires new lock manager

• Reining in potentially-unbounded memory usage

Tuesday, August 28, 2012

Page 28: Serializable Snapshot Isolation in PostgreSQL

Detecting ConflictsHow to detect when an update conflicts with a previous read?Previous SSI implementations:reuse read locks from existing lock mgrBut...

• PostgreSQL didn’t have read locks!

• ...let alone predicate locks

Tuesday, August 28, 2012

Page 29: Serializable Snapshot Isolation in PostgreSQL

SSI Lock Manager

Needed to build a new lock managerto track read dependencies

• Uses multigranularity locks, index-range locks

• Doesn’t block, just flags conflicts => no deadlocks

• Locks need to persist past transaction commit

Tuesday, August 28, 2012

Page 30: Serializable Snapshot Isolation in PostgreSQL

Memory UsageNeed to keep track of transaction readsets + conflict graph

• not just active transactions; alsocommitted ones that ran concurrently

• one long-running transaction can cause memory usage to grow without bound

Could exhaust shared memory space (esp. in PostgreSQL)

Tuesday, August 28, 2012

Page 31: Serializable Snapshot Isolation in PostgreSQL

Read-Only TransactionsMany long-running transactions are read-only; optimize for these

Safe snapshots: cases where r/o transactions can never be a part of an anomaly

• can then run using regular SI w/o SSI overhead

• but: can only detect once all concurrent r/w transactions complete

Deferrable transactions: delay execution to ensure safe snapshot

Tuesday, August 28, 2012

Page 32: Serializable Snapshot Isolation in PostgreSQL

Graceful DegradationWhat if we still run out of memory?

Don’t want to refuse to accept new transactions

Instead: keep less information (tradeoff: more false positives)

• keep less state about committed transactions

• deduplicate readsets: “read by some committed transaction”

Tuesday, August 28, 2012

Page 33: Serializable Snapshot Isolation in PostgreSQL

Outline

• Motivation

• Review of snapshot isolation and SSI

• Implementation challenges & optimizations

• Performance

• Conclusions

Tuesday, August 28, 2012

Page 34: Serializable Snapshot Isolation in PostgreSQL

Performance

TPC-C-derived benchmark;modified to have SI anomaliesVaried fraction of r/o and r/w transactionsCompared PostgreSQL 9.1’s SSI against SI, and an implementation of S2PL

Tuesday, August 28, 2012

Page 35: Serializable Snapshot Isolation in PostgreSQL

Performance (in-memory)

0.4x

0.5x

0.6x

0.7x

0.8x

0.9x

1.0x

0% 20% 40% 60% 80% 100%

SISSISSI (no r/o opt.)S2PL

25 warehouses (3 GB), tmpfs

trans. rate(normalized)

Fraction of read-only transactions

Tuesday, August 28, 2012

Page 36: Serializable Snapshot Isolation in PostgreSQL

Performance (disk)150 warehouses (19 GB)

trans. rate(normalized)

Fraction of read-only transactions

0.4x

0.5x

0.6x

0.7x

0.8x

0.9x

1.0x

0% 20% 40% 60% 80% 100%

SISSIS2PL

Tuesday, August 28, 2012

Page 37: Serializable Snapshot Isolation in PostgreSQL

ConclusionSSI available now in PostgreSQL 9.1

• true serializability without blocking

• new lock manager to track read dependencies

• optimizations for read-only transactions

Performance close to that of SI

• outperforms S2PL on read-heavy workloads

• makes serializable mode a more practical option for some users

Tuesday, August 28, 2012