Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash...
-
Upload
truongtuong -
Category
Documents
-
view
218 -
download
3
Transcript of Crash Recovery - Fordhamstorm.cis.fordham.edu/~yli/documents/CISC6345Spring17/PDF6_Ch18...Crash...
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1
Crash Recovery
Chapter 18
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 2
Review: The ACID properties
A tomicity: All actions in the Xact happen, or none happen.
C onsistency: If each Xact is consistent, and the DB startsconsistent, it ends up consistent.
I solation: Execution of one Xact is isolated from that ofother Xacts.
D urability: If a Xact commits, its effects persist.
The Recovery Manager guarantees Atomicity & Durability.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 3
Motivation
Atomicity: Transactions may abort (“Rollback”).
Durability: What if DBMS stops running? (System crashes or
media failures)
crash! Desired Behavior aftersystem restarts:– T1, T2 & T3 should be
durable.– T4 & T5 should be
aborted (effects not seen).
T1T2T3T4T5
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 4
Assumptions
Concurrency control is in effect. Strict 2PL, in particular.
Updates are happening “in place”. i.e. data is overwritten on (deleted from) the disk.
A simple scheme to guarantee Atomicity &Durability?
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 5
ARIES
When the recovery manager is invoked aftera crash, restart proceeds in three phrases: Analysis: identifies dirty pages and active
transactions Redo: Repeats some actions necessary Undo: Undoes actions which did not commit.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 6
Principles
Write-Ahead Logging Changes are written ahead in a stable storage.
Repeating History During Redo When restart, bring system back to the state when
crash happened. Logging Changes During Undo Changes during undo is also logged.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 7
Basic Idea: Logging
The log is a history of actions executed by theDBMS.
Physically, the log is stored in stable storage tosurvive crashes.
The most recent portion of the log, the log tail,is kept in memory and is periodically forced tostable storage.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 8
Basic Idea: Logging
Record REDO and UNDO information,for every update, in a log. Sequential writes to log (put it on a separate
disk). Minimal info (diff) written to log, so multiple
updates fit in a single log page.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 9
Log Record
Every log record has a unique id – LogSequence Number (LSN) and LSN alwaysincreases. It could be the memory address of the log record.
For recovery, every page in the database hasthe LSN of the most recent log record thatdescribes a change to this page – pageLSN.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 10
Log Records
Possible log record types: Update Commit Abort End Compensation Log
Records (CLRs) for UNDO actions
prevLSNXIDtype
LogRecord fields:
A linked list of one transaction
AdditionalFields
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 11
Additional Fields of Log Record
Commit (Commit record) Transaction id is included and it force writes the log tail to a
stable storage.
Abort (Abort record) Transaction id is included. Undo is initiated for the
transaction.
End (End record) After additional actions after Abort/Commit. The transaction
id is included.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 12
Update Log Records
pageID: modified page id. length: length of the change in bytes offset: offset of the change before-image: the value of the changed bytes
before the change (for undo) after-image: the value after the change (for
redo) Redo-only update vs. undo-only update
LSN:<prevLSN, transID, type, pageID, length, offset, before-image, after-image>
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 13
Transaction Table
One entry for each active transaction. Each entry includes transaction id, the status (in progress, committed, or aborted), lastLSN (the LSN of the most recent log record for
this transaction).transID Status lastLSNT100 In progress 4T200 In progress 3
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 14
Dirty Page Table
Dirty page: pages with changes not reflectedon disk.
One entry for each dirty page in the system’sbuffer pool.
Each entry includes recLSN (the LSN of the first log record that caused
the page dirty, which is also the earliest log recordto be redone)
pageID recLSNP500 1P600 2P505 4
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 15
LSN PrevLSN
transID Type pageID Length Offset Before-image
After-image
1 null T100 Update P500 3 21 ABC DEF
2 null T200 Update P600 3 41 HIJ KLM
3 2 T200 Update P500 3 20 GDE QRS
4 1 T100 Update P505 3 21 TUV WXY
pageID recLSNP500 1P600 2P505 4
Dirty Page Table
Transaction Table
LOGtransID Status lastLSNT100 In progress 4T200 In progress 3
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 16
Write-Ahead Logging (WAL)Protocol The Write-Ahead Logging Protocol:
1. Must write the log record for an update before thecorresponding data page gets to disk. (GuaranteeAtomicity)
2. Must write all log records to stable storage for aXact before commit. (Guarantee Durability)
WAL is the fundamental rule to ensure arecord of every change to the database isavailable while recovering a crash.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 17
The Big Picture:What’s Stored Where
DB
Data pageseachwith apageLSN
Master Record
Xact TablelastLSNstatus
Dirty Page TablerecLSN
flushedLSN System keeps track offlushedLSN, The maxLSN flushed so far.
RAM
prevLSNXIDtype
lengthpageID
offsetbefore-imageafter-image
LogRecords
LOG
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 18
TransactionCommits
System keeps track of flushedLSN. WAL: Before a page is written, pageLSN flushedLSN
Write commit record to log. All log records up to Xact’s lastLSN are
flushed. Guarantees that flushedLSN lastLSN. Note that log flushes are sequential, synchronous
writes to disk. Many log records per log page.
Write end record to log.
LSNs
DB
pageLSNs
RAM
flushedLSN
pageLSN
Log recordsflushed to disk
“Log tail”in RAM
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 19
Simple Transaction Abort
For now, consider an explicit abort of a Xact. No crash involved.
We want to “play back” the log in reverseorder, UNDOing updates. Get lastLSN of Xact from Xact table. Can follow chain of log records backward via the
prevLSN field. Before starting UNDO, write an Abort log record.
• For recovering from crash during UNDO!
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 20
Abort, cont.
To perform UNDO, must have a lock on data! Before restoring old value of a page, write a CLR: You continue logging while you UNDO!! CLR has one extra field: undoNextLSN
• Points to the next LSN to undo (i.e. the prevLSN of the recordwe’re currently undoing).
CLRs never Undone (but they might be Redone whenrepeating history: guarantees Atomicity!)
At end of UNDO, write an “end” log record.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 21
Example of CLR
To undo the last Update, a CLR is written as
LSN PrevLSN
transID Type pageID
Length Offset Before-image
After-image
1 Null T100 Update P500 3 21 ABC DEF
2 Null T200 Update P600 3 41 HIJ KLM
3 2 T200 Update P500 3 20 GDE QRS
4 1 T100 Update P505 3 21 TUV WXY
LSN: <prevLSN, transID, type, pageID, length,5 : < 4 , T100 , CLR , P505 , 3 ,
offset, before-image, after-image, undoNextLSN>21 , WXY , TUV , 1 >
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 22
Checkpointing
Periodically, the DBMS creates a checkpoint, inorder to minimize the time taken to recover in theevent of a system crash. Write to log: begin_checkpoint record: Indicates when chkpt began. end_checkpoint record: Contains current Xact table and
dirty page table. This is a `fuzzy checkpoint’:• Other Xacts continue to run; so these tables accurate only as of
the time of the begin_checkpoint record.• No attempt to force dirty pages to disk; effectiveness of
checkpoint limited by oldest unwritten change to a dirty page.(So it’s a good idea to periodically flush dirty pages to disk!)
Store LSN of chkpt record in a safe place (master record).
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 23
Crash Recovery: Big Picture
Start from a checkpoint (foundvia master record).
Three phases. Need to:– Figure out which Xacts
committed since checkpoint,which failed (Analysis).
– REDO all actions.• (repeat history)
– UNDO effects of failed Xacts.
Oldest logrec. of Xactactive at crash
SmallestrecLSN indirty pagetable afterAnalysis
Last chkpt
CRASH
A R U
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 24
Recovery: The Analysis Phase
Analysis phase identifies Where to start the Redo pass (the point in the log) What are dirty (pages in the buffer pool at the
time of crash). Who are active ( transactions should be undone).
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 25
Recovery: The Analysis Phase
Reconstruct state at checkpoint. Initializing D.P.T. and Xact Table via
end_checkpoint record. Scan log forward from checkpoint. End record: Remove T from Xact table. Other records: Add T to Xact table, set
lastLSN=LSN, change T status tocommit/To be Undone. Update record: If P not in Dirty Page
Table, add P to D.P.T., set itsrecLSN=LSN.
Lastchkpt
CRASH
A R U
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 26
Recovery: The Analysis Phase
At the end, transaction table has a list of alltransactions with status U (they were activewhile crash).
At the end, the dirty page table has all dirtypages that were dirty during crash, and alsosome pages were written to disk.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 27
Example of Analysis
The dirty page table and transaction table islost in the crash. Two tables are empty nowand are recreated as:
LSN PrevLSN
transID Type pageID Length Offset Before-image
After-image
1 Null T100 Update P500 3 21 ABC DEF
2 Null T200 Update P600 3 41 HIJ KLM
3 2 T200 Update P500 3 20 GDE QRS
4 1 T100 Update P505 3 21 TUV WXY
5 3 T200 Commit
6 5 T200 End
pageID recLSNP500 1P600 2P505 4
transID Status lastLSNT100 To be Undone 4
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 28
Recovery: The REDO Phase Repeating History to reconstruct state at
crash by checking Dirty Page Table: Reapply all updates (even of aborted Xacts!),
redo CLRs. Scan forward from log rec containing smallest
recLSN in D.P.T. For each CLR or update log rec LSN, REDO the
action unless:• Affected page is not in the Dirty Page Table, or• Affected page is in D.P.T., but has recLSN > LSN, or• pageLSN (in DB) LSN.
To REDO an action: Reapply logged action. Set pageLSN to LSN. No additional logging!
SmallestrecLSNin dirtypagetableafterAnalysis
Lastchkpt
CRASH
A R U
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 29
Example of RedoLSN Prev
LSNtransID Type pageID
1 Null T100 Update P500
2 Null T200 Update P600
3 2 T200 Update P500
4 1 T100 Update P505
5 3 T200 Commit
6 5 T200 End
pageID recLSNP500 1P600 2P505 4
1. Start from LSN No.1, check Log Rec which action should be redone.2. LSN No.1 should be redone. The PageLSN of P500 is set to 1.3. LSN No.2 should not be redone since the PageLSN of the affected page
in the database P600 is equal to LSN No.24. LSN No. 3 should be redone. The PageLSN of P500 is set to 3.5. LSN No.4 should be redone. The PageLSN of P505 is set to 4.
pageID
pageLSN
P600 2P500 3P505 4
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 30
Recovery: The UNDO Phase
Scan backward from the end of thelog rec to abort transactions inTransaction table (losertransactions).
Undo actions of loser transactionsin the reverse order.
Oldest logrec. of Xactactive atcrash
SmallestrecLSN indirty pagetable afterAnalysis
Last chkpt
CRASH
A R U
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 31
Recovery: The UNDO PhaseToUndo={ l | l a lastLSN of a “loser” Xact}Repeat: Scan backward. Choose largest LSN among ToUndo. If this LSN is a CLR and undonextLSN==NULL
• Write an End record for this Xact. If this LSN is a CLR, and undonextLSN != NULL
• Add undonextLSN to ToUndo Else this LSN is an update. Undo the update, write a
CLR, add prevLSN to ToUndo.Until ToUndo is empty.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 32
Example of Undo
1. Start from LSN No.4, the Update is undone, a CLRis written with undoNextLSN equal to LSN No.1.
2. LSN No.1, the Update is undone, a CLR is writtenwith undoNextLSN equal to Null.
3. The end log record is written for T100.
LSN PrevLSN
transID Type pageID
1 Null T100 Update P500
2 Null T200 Update P600
3 2 T200 Update P500
4 1 T100 Update P505
5 3 T200 Commit
6 5 T200 End
transID Status lastLSNT100 To be Undone 4
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 33
Summary of Logging/Recovery
Recovery Manager guarantees Atomicity &Durability.
Use WAL to allow STEAL/NO-FORCE w/osacrificing correctness.
LSNs identify log records; linked intobackwards chains per transaction (viaprevLSN).
pageLSN allows comparison of data page andlog records.
Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 34
Summary, Cont.
Checkpointing: A quick way to limit theamount of log to scan on recovery.
Recovery works in 3 phases: Analysis: Forward from checkpoint. Redo: Forward from oldest recLSN. Undo: Backward from end to first LSN of oldest
Xact alive at crash. Upon Undo, write CLRs. Redo “repeats history”: Simplifies the logic!