Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash...

17
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1 Crash Recovery Chapter 18 Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 2 Review: The ACID properties A tomicity: All actions in the Xact happen, or none happen. C onsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. I solation: Execution of one Xact is isolated from that of other Xacts. D urability: If a Xact commits, its effects persist. The Recovery Manager guarantees Atomicity & Durability.

Transcript of Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash...

Page 1: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1

Crash Recovery

Chapter 18

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 2

Review: The ACID properties

A tomicity: All actions in the Xact happen, or none happen.

C onsistency: If each Xact is consistent, and the DB startsconsistent, it ends up consistent.

I solation: Execution of one Xact is isolated from that ofother Xacts.

D urability: If a Xact commits, its effects persist.

The Recovery Manager guarantees Atomicity & Durability.

Page 2: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 3

Motivation

Atomicity: Transactions may abort (“Rollback”).

Durability: What if DBMS stops running? (System crashes or

media failures)

crash! Desired Behavior aftersystem restarts:– T1, T2 & T3 should be

durable.– T4 & T5 should be

aborted (effects not seen).

T1T2T3T4T5

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 4

Assumptions

Concurrency control is in effect. Strict 2PL, in particular.

Updates are happening “in place”. i.e. data is overwritten on (deleted from) the disk.

A simple scheme to guarantee Atomicity &Durability?

Page 3: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 5

ARIES

When the recovery manager is invoked aftera crash, restart proceeds in three phrases: Analysis: identifies dirty pages and active

transactions Redo: Repeats some actions necessary Undo: Undoes actions which did not commit.

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 6

Principles

Write-Ahead Logging Changes are written ahead in a stable storage.

Repeating History During Redo When restart, bring system back to the state when

crash happened. Logging Changes During Undo Changes during undo is also logged.

Page 4: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 7

Basic Idea: Logging

The log is a history of actions executed by theDBMS.

Physically, the log is stored in stable storage tosurvive crashes.

The most recent portion of the log, the log tail,is kept in memory and is periodically forced tostable storage.

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 8

Basic Idea: Logging

Record REDO and UNDO information,for every update, in a log. Sequential writes to log (put it on a separate

disk). Minimal info (diff) written to log, so multiple

updates fit in a single log page.

Page 5: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 9

Log Record

Every log record has a unique id – LogSequence Number (LSN) and LSN alwaysincreases. It could be the memory address of the log record.

For recovery, every page in the database hasthe LSN of the most recent log record thatdescribes a change to this page – pageLSN.

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 10

Log Records

Possible log record types: Update Commit Abort End Compensation Log

Records (CLRs) for UNDO actions

prevLSNXIDtype

LogRecord fields:

A linked list of one transaction

AdditionalFields

Page 6: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 11

Additional Fields of Log Record

Commit (Commit record) Transaction id is included and it force writes the log tail to a

stable storage.

Abort (Abort record) Transaction id is included. Undo is initiated for the

transaction.

End (End record) After additional actions after Abort/Commit. The transaction

id is included.

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 12

Update Log Records

pageID: modified page id. length: length of the change in bytes offset: offset of the change before-image: the value of the changed bytes

before the change (for undo) after-image: the value after the change (for

redo) Redo-only update vs. undo-only update

LSN:<prevLSN, transID, type, pageID, length, offset, before-image, after-image>

Page 7: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 13

Transaction Table

One entry for each active transaction. Each entry includes transaction id, the status (in progress, committed, or aborted), lastLSN (the LSN of the most recent log record for

this transaction).transID Status lastLSNT100 In progress 4T200 In progress 3

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 14

Dirty Page Table

Dirty page: pages with changes not reflectedon disk.

One entry for each dirty page in the system’sbuffer pool.

Each entry includes recLSN (the LSN of the first log record that caused

the page dirty, which is also the earliest log recordto be redone)

pageID recLSNP500 1P600 2P505 4

Page 8: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 15

LSN PrevLSN

transID Type pageID Length Offset Before-image

After-image

1 null T100 Update P500 3 21 ABC DEF

2 null T200 Update P600 3 41 HIJ KLM

3 2 T200 Update P500 3 20 GDE QRS

4 1 T100 Update P505 3 21 TUV WXY

pageID recLSNP500 1P600 2P505 4

Dirty Page Table

Transaction Table

LOGtransID Status lastLSNT100 In progress 4T200 In progress 3

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 16

Write-Ahead Logging (WAL)Protocol The Write-Ahead Logging Protocol:

1. Must write the log record for an update before thecorresponding data page gets to disk. (GuaranteeAtomicity)

2. Must write all log records to stable storage for aXact before commit. (Guarantee Durability)

WAL is the fundamental rule to ensure arecord of every change to the database isavailable while recovering a crash.

Page 9: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 17

The Big Picture:What’s Stored Where

DB

Data pageseachwith apageLSN

Master Record

Xact TablelastLSNstatus

Dirty Page TablerecLSN

flushedLSN System keeps track offlushedLSN, The maxLSN flushed so far.

RAM

prevLSNXIDtype

lengthpageID

offsetbefore-imageafter-image

LogRecords

LOG

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 18

TransactionCommits

System keeps track of flushedLSN. WAL: Before a page is written, pageLSN flushedLSN

Write commit record to log. All log records up to Xact’s lastLSN are

flushed. Guarantees that flushedLSN lastLSN. Note that log flushes are sequential, synchronous

writes to disk. Many log records per log page.

Write end record to log.

LSNs

DB

pageLSNs

RAM

flushedLSN

pageLSN

Log recordsflushed to disk

“Log tail”in RAM

Page 10: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 19

Simple Transaction Abort

For now, consider an explicit abort of a Xact. No crash involved.

We want to “play back” the log in reverseorder, UNDOing updates. Get lastLSN of Xact from Xact table. Can follow chain of log records backward via the

prevLSN field. Before starting UNDO, write an Abort log record.

• For recovering from crash during UNDO!

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 20

Abort, cont.

To perform UNDO, must have a lock on data! Before restoring old value of a page, write a CLR: You continue logging while you UNDO!! CLR has one extra field: undoNextLSN

• Points to the next LSN to undo (i.e. the prevLSN of the recordwe’re currently undoing).

CLRs never Undone (but they might be Redone whenrepeating history: guarantees Atomicity!)

At end of UNDO, write an “end” log record.

Page 11: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 21

Example of CLR

To undo the last Update, a CLR is written as

LSN PrevLSN

transID Type pageID

Length Offset Before-image

After-image

1 Null T100 Update P500 3 21 ABC DEF

2 Null T200 Update P600 3 41 HIJ KLM

3 2 T200 Update P500 3 20 GDE QRS

4 1 T100 Update P505 3 21 TUV WXY

LSN: <prevLSN, transID, type, pageID, length,5 : < 4 , T100 , CLR , P505 , 3 ,

offset, before-image, after-image, undoNextLSN>21 , WXY , TUV , 1 >

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 22

Checkpointing

Periodically, the DBMS creates a checkpoint, inorder to minimize the time taken to recover in theevent of a system crash. Write to log: begin_checkpoint record: Indicates when chkpt began. end_checkpoint record: Contains current Xact table and

dirty page table. This is a `fuzzy checkpoint’:• Other Xacts continue to run; so these tables accurate only as of

the time of the begin_checkpoint record.• No attempt to force dirty pages to disk; effectiveness of

checkpoint limited by oldest unwritten change to a dirty page.(So it’s a good idea to periodically flush dirty pages to disk!)

Store LSN of chkpt record in a safe place (master record).

Page 12: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 23

Crash Recovery: Big Picture

Start from a checkpoint (foundvia master record).

Three phases. Need to:– Figure out which Xacts

committed since checkpoint,which failed (Analysis).

– REDO all actions.• (repeat history)

– UNDO effects of failed Xacts.

Oldest logrec. of Xactactive at crash

SmallestrecLSN indirty pagetable afterAnalysis

Last chkpt

CRASH

A R U

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 24

Recovery: The Analysis Phase

Analysis phase identifies Where to start the Redo pass (the point in the log) What are dirty (pages in the buffer pool at the

time of crash). Who are active ( transactions should be undone).

Page 13: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 25

Recovery: The Analysis Phase

Reconstruct state at checkpoint. Initializing D.P.T. and Xact Table via

end_checkpoint record. Scan log forward from checkpoint. End record: Remove T from Xact table. Other records: Add T to Xact table, set

lastLSN=LSN, change T status tocommit/To be Undone. Update record: If P not in Dirty Page

Table, add P to D.P.T., set itsrecLSN=LSN.

Lastchkpt

CRASH

A R U

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 26

Recovery: The Analysis Phase

At the end, transaction table has a list of alltransactions with status U (they were activewhile crash).

At the end, the dirty page table has all dirtypages that were dirty during crash, and alsosome pages were written to disk.

Page 14: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 27

Example of Analysis

The dirty page table and transaction table islost in the crash. Two tables are empty nowand are recreated as:

LSN PrevLSN

transID Type pageID Length Offset Before-image

After-image

1 Null T100 Update P500 3 21 ABC DEF

2 Null T200 Update P600 3 41 HIJ KLM

3 2 T200 Update P500 3 20 GDE QRS

4 1 T100 Update P505 3 21 TUV WXY

5 3 T200 Commit

6 5 T200 End

pageID recLSNP500 1P600 2P505 4

transID Status lastLSNT100 To be Undone 4

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 28

Recovery: The REDO Phase Repeating History to reconstruct state at

crash by checking Dirty Page Table: Reapply all updates (even of aborted Xacts!),

redo CLRs. Scan forward from log rec containing smallest

recLSN in D.P.T. For each CLR or update log rec LSN, REDO the

action unless:• Affected page is not in the Dirty Page Table, or• Affected page is in D.P.T., but has recLSN > LSN, or• pageLSN (in DB) LSN.

To REDO an action: Reapply logged action. Set pageLSN to LSN. No additional logging!

SmallestrecLSNin dirtypagetableafterAnalysis

Lastchkpt

CRASH

A R U

Page 15: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 29

Example of RedoLSN Prev

LSNtransID Type pageID

1 Null T100 Update P500

2 Null T200 Update P600

3 2 T200 Update P500

4 1 T100 Update P505

5 3 T200 Commit

6 5 T200 End

pageID recLSNP500 1P600 2P505 4

1. Start from LSN No.1, check Log Rec which action should be redone.2. LSN No.1 should be redone. The PageLSN of P500 is set to 1.3. LSN No.2 should not be redone since the PageLSN of the affected page

in the database P600 is equal to LSN No.24. LSN No. 3 should be redone. The PageLSN of P500 is set to 3.5. LSN No.4 should be redone. The PageLSN of P505 is set to 4.

pageID

pageLSN

P600 2P500 3P505 4

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 30

Recovery: The UNDO Phase

Scan backward from the end of thelog rec to abort transactions inTransaction table (losertransactions).

Undo actions of loser transactionsin the reverse order.

Oldest logrec. of Xactactive atcrash

SmallestrecLSN indirty pagetable afterAnalysis

Last chkpt

CRASH

A R U

Page 16: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 31

Recovery: The UNDO PhaseToUndo={ l | l a lastLSN of a “loser” Xact}Repeat: Scan backward. Choose largest LSN among ToUndo. If this LSN is a CLR and undonextLSN==NULL

• Write an End record for this Xact. If this LSN is a CLR, and undonextLSN != NULL

• Add undonextLSN to ToUndo Else this LSN is an update. Undo the update, write a

CLR, add prevLSN to ToUndo.Until ToUndo is empty.

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 32

Example of Undo

1. Start from LSN No.4, the Update is undone, a CLRis written with undoNextLSN equal to LSN No.1.

2. LSN No.1, the Update is undone, a CLR is writtenwith undoNextLSN equal to Null.

3. The end log record is written for T100.

LSN PrevLSN

transID Type pageID

1 Null T100 Update P500

2 Null T200 Update P600

3 2 T200 Update P500

4 1 T100 Update P505

5 3 T200 Commit

6 5 T200 End

transID Status lastLSNT100 To be Undone 4

Page 17: Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash Recovery Chapter 18 ... a crash, restart proceeds in three phrases: Analysis: ... Minimal

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 33

Summary of Logging/Recovery

Recovery Manager guarantees Atomicity &Durability.

Use WAL to allow STEAL/NO-FORCE w/osacrificing correctness.

LSNs identify log records; linked intobackwards chains per transaction (viaprevLSN).

pageLSN allows comparison of data page andlog records.

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 34

Summary, Cont.

Checkpointing: A quick way to limit theamount of log to scan on recovery.

Recovery works in 3 phases: Analysis: Forward from checkpoint. Redo: Forward from oldest recLSN. Undo: Backward from end to first LSN of oldest

Xact alive at crash. Upon Undo, write CLRs. Redo “repeats history”: Simplifies the logic!