
Page 1: Lampson and Lomet’s Paper: A New Presumed Commit Optimization for Two Phase Commit

Doug Cha

COEN 317 – SCU Spring 05

Page 2: About the authors

Butler Lampson
- Currently at MSFT
- Formerly at Xerox PARC, DEC research, and a professor at MIT and Berkeley
- ACM Turing Award in 1992

David Lomet
- Also at MSFT, formerly at DEC research
- Key work on database systems
- One of the inventors of the transaction concept
- ACM Fellow

Page 3: Outline

- Review of 2PC
- More on 2PC / Optimizations
  - Presumed Nothing
  - Presumed Abort
  - Presumed Commit
- Recovery requirements
- The new PrC protocol
- Summary

Page 4: Review of 2PC

Distributed Atomic Commit problem (DC9 p2)
- How to get all members of a group to commit or abort together?

Two Phase Commit, Gray 1978 (DC9 p3):
- First phase is the voting phase
  - Coordinator sends all participants (cohorts) a vote request (PREPARE)
  - All participants (cohorts) respond COMMIT-VOTE or ABORT-VOTE
- Second phase: the coordinator decides commit or abort. If any participant voted ABORT, the decision must be abort. Otherwise, commit.
  - Coordinator sends all participants the decision (COMMIT or ABORT)
  - Participants (who have been waiting for the decision) commit or abort as instructed and ACK.
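The decision rule on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the names `Vote`, `Decision`, `decide`, and `run_two_phase_commit` are made up for this example, and messaging is replaced by a plain dictionary of votes.

```python
from enum import Enum


class Vote(Enum):
    COMMIT_VOTE = "COMMIT-VOTE"
    ABORT_VOTE = "ABORT-VOTE"


class Decision(Enum):
    COMMIT = "COMMIT"
    ABORT = "ABORT"


def decide(votes):
    """Second-phase rule: abort if any participant voted ABORT, otherwise commit."""
    if any(v is Vote.ABORT_VOTE for v in votes):
        return Decision.ABORT
    return Decision.COMMIT


def run_two_phase_commit(cohort_votes):
    """cohort_votes maps a (hypothetical) cohort name to its reply to PREPARE."""
    # Phase 1: PREPARE goes to every cohort; here the replies are already given.
    votes = list(cohort_votes.values())
    # Phase 2: decide, send the decision to every cohort, and collect one ACK each.
    decision = decide(votes)
    acks = {name: "ACK" for name in cohort_votes}
    return decision, acks
```

For example, `run_two_phase_commit({"a": Vote.COMMIT_VOTE, "b": Vote.ABORT_VOTE})` yields an ABORT decision, because a single ABORT-VOTE is enough to abort the whole group.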

Page 5: 2 Phase Commit

Message sequence between Coordinator and Cohort:
- Coordinator -> Cohort: PREPARE
- Cohort: make vote
- Cohort -> Coordinator: COMMIT-VOTE
- Coordinator: <<collect all votes>>
- Coordinator -> Cohort: COMMIT
- Cohort: execute commit
- Cohort -> Coordinator: ACK

Additional detail: a protocol database at the coordinator stores transaction states and cohort votes. This is used for error recovery.

Page 6: 2PC Variations

- Presumed Nothing (PrN)
- Presumed Abort (PrA)
- Presumed Commit (PrC)

The variations differ in how they handle recovery and in how recovery data is logged.

Page 7: Presumed Nothing (PrN)

Message sequence between Coordinator and Cohort:
- Coordinator -> Cohort: PREPARE
- Cohort: make vote
- Cohort -> Coordinator: COMMIT-VOTE or ABORT-VOTE
- Coordinator: <<collect all votes>>, forced record
- Coordinator -> Cohort: COMMIT or ABORT
- Cohort: execute commit
- Cohort -> Coordinator: ACK
- Coordinator: record ACK, <<collect all acks>>, remove record

Cost: 1 forced write, 1 lazy write, 2 messages to cohort

Page 8: PrN Failure Recovery

Message sequence between Coordinator and Cohort:
- Coordinator -> Cohort: PREPARE
- Cohort: make vote
- Cohort -> Coordinator: COMMIT-VOTE
- Coordinator: crash
- Cohort: timeout
- Cohort -> Coordinator: STATUS?
- Coordinator: no record
- Coordinator -> Cohort: ABORT

In PrN nothing is recorded until a COMMIT is sent, so a coordinator crash results in ABORT.

Page 9: PrA Optimization

Message sequence between Coordinator and Cohort:
- Coordinator -> Cohort: PREPARE
- Cohort: make vote
- Cohort -> Coordinator: ABORT-VOTE
- Coordinator: no record
- Coordinator -> Cohort: ABORT
- Cohort: crash, then recovery
- Cohort -> Coordinator: STATUS?
- Coordinator: no record
- Coordinator -> Cohort: ABORT

On an ABORT, there are no log records and no ACK. This works because we “presume an abort” if no record exists!
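The "presume an abort" rule amounts to a lookup with a default. The following is a minimal sketch, assuming the coordinator's protocol database can be modelled as a dictionary keyed by transaction id; the class `PrACoordinator` and its method names are hypothetical and not taken from the paper.

```python
class PrACoordinator:
    """Toy Presumed Abort coordinator: only commits leave a record behind."""

    def __init__(self):
        # tid -> "COMMIT"; an aborted transaction never gets an entry.
        self.protocol_db = {}

    def finish(self, tid, votes):
        """Decide the outcome once all votes for tid have been collected."""
        if all(v == "COMMIT-VOTE" for v in votes):
            # Stands in for the forced commit record.
            self.protocol_db[tid] = "COMMIT"
            return "COMMIT"
        # Abort path: nothing is logged and no ACK is expected.
        return "ABORT"

    def status(self, tid):
        """Answer a cohort's STATUS? inquiry; no record means ABORT."""
        return self.protocol_db.get(tid, "ABORT")
```

A cohort that recovers and asks about an aborted transaction finds no entry and is told ABORT, which is why the coordinator does not need to log anything on the abort path.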

Page 10: Presumed Commit (PrC) - COMMIT

Message sequence between Coordinator and Cohort:
- Coordinator: forced record
- Coordinator -> Cohort: PREPARE
- Cohort: make vote
- Cohort -> Coordinator: COMMIT-VOTE
- Coordinator: <<collect all votes>>, forced remove record
- Coordinator -> Cohort: COMMIT
- Cohort: crash, then recovery
- Cohort -> Coordinator: STATUS?
- Coordinator: no record
- Coordinator -> Cohort: COMMIT

Cost: 2 forced writes, 2 messages to cohort

Cohort doesn’t need to send ACK

Page 11: Presumed Commit (PrC) - ABORT

Message sequence between Coordinator and Cohort:
- Coordinator: forced record
- Coordinator -> Cohort: PREPARE
- Cohort: make vote
- Cohort -> Coordinator: ABORT-VOTE
- Coordinator -> Cohort: ABORT
- Cohort: execute abort
- Cohort -> Coordinator: ACK
- Coordinator: <<collect all acks>>, remove record

ACK only needed on ABORTs
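The two PrC slides (COMMIT and ABORT) can be put side by side in a small sketch. It assumes an in-memory dictionary in place of the forced log, and it treats any record that still exists as "answer ABORT", which is a simplification for a transaction that is still in progress. The class `PrCCoordinator` and its methods are illustrative names only.

```python
class PrCCoordinator:
    """Toy Presumed Commit coordinator: a missing record means COMMIT."""

    def __init__(self):
        self.records = {}       # tid -> cohorts whose ACK is still awaited
        self.forced_writes = 0  # counts the forced log writes in this sketch

    def prepare(self, tid, cohorts):
        # Forced record written before PREPARE is sent.
        self.records[tid] = set(cohorts)
        self.forced_writes += 1

    def commit(self, tid):
        # Forced removal of the record; cohorts do not need to ACK a commit.
        del self.records[tid]
        self.forced_writes += 1
        return "COMMIT"

    def abort(self, tid):
        # The record stays until every cohort has acknowledged the abort.
        return "ABORT"

    def ack(self, tid, cohort):
        pending = self.records[tid]
        pending.discard(cohort)
        if not pending:
            del self.records[tid]  # all ACKs collected: remove record

    def status(self, tid):
        # Record present: the transaction did not end with a commit.
        return "ABORT" if tid in self.records else "COMMIT"
```

The sketch makes the role of the ACK visible: the abort record can only be dropped once every cohort has acknowledged, because after that point a status inquiry would be answered with the presumed COMMIT.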

Page 12: Comparison For Now

PrN
- Coordinator: 2 log records, 1 forced log write, 2 messages to cohort
- Cohort: 2 log records, 2 forced log writes, 2 messages to coordinator

PrA
- Coordinator: 2 log records, 1 forced log write, 2 messages to cohort
- Cohort: 2 log records, 2 forced log writes, 2 messages to coordinator

PrC
- Coordinator: 2 log records, 2 forced log writes, 2 messages to cohort
- Cohort: 2 log records, 1 forced log write, 1 message to coordinator

Page 13: Improving PrC

- Messaging is already low, so try to reduce forced log writes.
- In PrC a forced write happens at PREPARE
  - Any transaction with a PREPARE but no transaction end is aborted
  - The absence of a transaction record is taken to mean commit
- To remove the forced PREPARE write, we need to:
  - Find another way to identify transactions that may have started before the crash but did not finish
  - Keep these transaction records around so we know to abort them (since we are still presuming commits)

Page 14: Improving PrC

Instead of recording transaction initiation, record timestamps:
- tid_l: lowest possible time of an undocumented transaction
- tid_h: most recent undocumented transaction
- tid_sta: most recent record of a transaction

So we have:
- REC = { tid | tid_l < tid < tid_h } = recent transactions
- COM = committed and stable transactions
- IN = REC – COM = transactions that may have been active during the crash

On recovery:
- Cohorts asking for the status of a transaction assume commit unless the transaction is in the IN set
- The IN set must be stored forever! (But the data size is small)

Figure: the transaction log along a time axis, marked with tid_l, tid_h, and tid_sta. Committed or aborted transactions come first, followed by the window of active/undocumented transactions (REC), followed by unused space.
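The set definitions on this slide translate directly into set arithmetic. The sketch below assumes transaction ids are integers, so that REC can be enumerated with `range`; the function names are made up for illustration.

```python
def recent(tid_l, tid_h):
    """REC: every tid strictly between tid_l and tid_h (integer tids assumed)."""
    return set(range(tid_l + 1, tid_h))


def in_set(tid_l, tid_h, committed):
    """IN = REC - COM: transactions that may have been active at the crash."""
    return recent(tid_l, tid_h) - set(committed)


def status_after_crash(tid, tid_l, tid_h, committed):
    """Presume COMMIT unless tid is in the IN set."""
    if tid in in_set(tid_l, tid_h, committed):
        return "ABORT"
    return "COMMIT"
```

With `tid_l = 10`, `tid_h = 15`, and `COM = {11, 13}`, the IN set is `{12, 14}`. An inquiry about transaction 12 is answered ABORT, while inquiries about 13 (committed) or 7 (outside the window) are answered COMMIT.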

Page 15: The New PrC Protocol ABORT

Message sequence between Coordinator and Cohort:
- Coordinator: the IN range of tids contains this transaction (tid_l < tid < tid_h)
- Coordinator -> Cohort: PREPARE
- Cohort: make vote
- Cohort -> Coordinator: ABORT-VOTE
- Coordinator -> Cohort: ABORT
- Cohort: abort
- Cohort -> Coordinator: ACK
- Coordinator: <<collect all acks>>
- Coordinator: increase the tid_l value past this transaction, so the IN set does not include it anymore

Page 16: The New PrC Protocol COMMIT

Message sequence between Coordinator and Cohort:
- Coordinator: the IN range of tids contains this transaction (tid_l < tid < tid_h)
- Coordinator -> Cohort: PREPARE
- Cohort: make vote
- Cohort -> Coordinator: COMMIT-VOTE
- Coordinator -> Cohort: COMMIT
- Cohort -> Coordinator: ACK
- Coordinator: <<collect all acks>>, move tid_l past this transaction
- Cohort: crash, then recovery
- Cohort -> Coordinator: STATUS?
- Coordinator: no transaction record in IN, so commit
- Coordinator -> Cohort: COMMIT

Page 17: The New PrC Protocol ABORT/CRASH

Message sequence between Coordinator and Cohort:
- Coordinator: the IN range of tids contains this transaction
- Coordinator -> Cohort: PREPARE
- Cohort: make vote
- Cohort -> Coordinator: ABORT-VOTE
- Coordinator -> Cohort: ABORT
- Cohort: abort, then crash
- Cohort: recovery
- Cohort -> Coordinator: STATUS?
- Coordinator: the transaction is still in the IN set, so we send abort
- Coordinator -> Cohort: ABORT
- Cohort -> Coordinator: ACK
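The three scenarios of the new protocol (ABORT, COMMIT, ABORT/CRASH) can be tied together in one toy model. It rests on several assumptions that are not in the slides: tids are integers, the log is an in-memory set, and tid_l only moves up across a contiguous run of finished transactions. `NewPrCCoordinator` and its methods are hypothetical names.

```python
class NewPrCCoordinator:
    """Toy model of the timestamp window used by the new PrC protocol."""

    def __init__(self, tid_l, tid_h):
        self.tid_l = tid_l
        self.tid_h = tid_h
        self.com = set()   # COM: committed and stable tids
        self.done = set()  # aborted tids whose ACKs have all arrived

    def commit(self, tid):
        # Stands in for the single forced commit record.
        self.com.add(tid)
        self._advance()

    def abort_acked(self, tid):
        # Called once every cohort has acknowledged the ABORT.
        self.done.add(tid)
        self._advance()

    def _advance(self):
        # Assumption of this sketch: tid_l moves up one finished tid at a time.
        finished = self.com | self.done
        while self.tid_l + 1 < self.tid_h and (self.tid_l + 1) in finished:
            self.tid_l += 1

    def status(self, tid):
        # IN = REC - COM; anything outside IN is presumed committed.
        in_window = self.tid_l < tid < self.tid_h
        return "ABORT" if in_window and tid not in self.com else "COMMIT"
```

In the ABORT/CRASH case the cohort's ACK never arrives, so `abort_acked` is not called, the transaction stays inside the window, and a later STATUS? inquiry is still answered ABORT.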

Page 18: Analysis of New PrC Protocol

We reduce the number of forced writes but require permanent storage of IN records.

PrC
- Coordinator: 2 log records, 2 forced log writes, 2 messages to cohort
- Cohort: 2 log records, 1 forced log write, 1 message to coordinator

New PrC
- Coordinator: 1 log record, 1 forced log write, 2 messages to cohort
- Cohort: 2 log records, 1 forced log write, 1 message to coordinator
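The saving can be read off the table with a few lines of arithmetic. The numbers below are copied from the table above; the dictionary layout and the helper names `total` and `savings` are just one possible way to arrange them.

```python
# Costs copied from the comparison table: coordinator and cohort side.
COSTS = {
    "PrC": {
        "coordinator": {"log_records": 2, "forced": 2, "messages": 2},
        "cohort": {"log_records": 2, "forced": 1, "messages": 1},
    },
    "New PrC": {
        "coordinator": {"log_records": 1, "forced": 1, "messages": 2},
        "cohort": {"log_records": 2, "forced": 1, "messages": 1},
    },
}


def total(variant, metric):
    """Sum one metric over the coordinator and cohort sides."""
    return sum(side[metric] for side in COSTS[variant].values())


def savings(metric, old="PrC", new="New PrC"):
    """How much of a metric the new variant saves relative to the old one."""
    return total(old, metric) - total(new, metric)
```

Here `savings("forced")` and `savings("log_records")` both come out to 1, while `savings("messages")` is 0: the new protocol drops one forced coordinator log write and leaves the message count unchanged.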

Page 19: Summary

- Two-Phase Commit
  - Presumed Nothing
  - Presumed Abort
  - Presumed Commit
- Requirements for logging/recovery
- New Presumed Commit

Page 20: References

- A New Presumed Commit Optimization for Two Phase Commit – Lampson and Lomet, 1993.
- Distributed Systems Concepts and Design – Coulouris, Dollimore, Kindberg
- Santa Clara Univ, COEN 317 class notes – Holliday