Chapter 16: Recovery
Yonsei University, 1st Semester, 2015
Sanghyun Park
Outline
- Failure Classification
- Recovery and Atomicity
- Log-Based Recovery
- Shadow Paging
- Recovery with Concurrent Transactions
- Buffer Management
Failure Classification
- Transaction failure
  - Logical errors: the transaction cannot complete due to some logical error condition
  - System errors: the database system must terminate an active transaction due to an error condition (e.g., deadlock)
- System crash: a power failure or other hardware or software failure causes the system to crash
- Disk failure: a head crash or similar disk failure destroys all or part of disk storage
Recovery Algorithms
- Recovery algorithms are techniques to ensure database consistency, transaction atomicity, and durability despite failures
- Recovery algorithms have two parts
  - Actions taken during normal transaction processing to ensure enough information exists to recover from failures
  - Actions taken after a failure to recover the database contents to a state that ensures atomicity, consistency, and durability
Data Access (1/2)
- Physical blocks are those blocks residing on the disk
- Buffer blocks are those blocks residing temporarily in main memory
- Block movements between disk and main memory are initiated through input(B) and output(B) operations
- Each transaction Ti has its private work area in which local copies of all data items accessed and updated by Ti are kept
- We assume, for simplicity, that each data item fits in a single block
Data Access (2/2)
- A transaction transfers data items between buffer blocks and its private work area using read(X) and write(X) operations
- A transaction performs read(X) when accessing X for the first time; all subsequent accesses are to the local copy
- After the last access, the transaction executes write(X)
- output(BX) need not immediately follow write(X)
Example of Data Access
[Figure: buffer blocks A and B are moved between disk and main memory via input(A) and output(B); transactions T1 and T2 copy data items X and Y between the buffer blocks and their private work areas via read(X) and write(Y)]
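The data-access operations described above can be sketched as a small in-memory model (all names here, such as input_block, output_block, and work_area, are hypothetical illustrations, not a real DBMS API; input_block is named to avoid Python's built-in input):

```python
# Minimal model of the data-access operations: 'disk' holds physical
# blocks, 'buffer' holds buffer blocks in main memory, and each
# transaction keeps local copies of data items in its private work area.
disk = {"BX": {"X": 100}, "BY": {"Y": 200}}   # physical blocks
buffer = {}                                    # buffer blocks in memory
work_area = {}                                 # per-transaction local copies

def input_block(b):
    """Transfer physical block b from disk into the buffer."""
    buffer[b] = dict(disk[b])

def output_block(b):
    """Transfer buffer block b back to disk."""
    disk[b] = dict(buffer[b])

def read(tid, x, block):
    """Copy data item x from its buffer block into Ti's work area."""
    if block not in buffer:
        input_block(block)
    work_area.setdefault(tid, {})[x] = buffer[block][x]

def write(tid, x, block):
    """Copy Ti's local value of x back into the buffer block;
    output of the block itself may happen later."""
    if block not in buffer:
        input_block(block)
    buffer[block][x] = work_area[tid][x]

# T1 reads X, updates its local copy, then writes it back to the buffer
read("T1", "X", "BX")
work_area["T1"]["X"] -= 50
write("T1", "X", "BX")
assert buffer["BX"]["X"] == 50 and disk["BX"]["X"] == 100  # not yet output
output_block("BX")   # output(BX) need not immediately follow write(X)
assert disk["BX"]["X"] == 50
```

Note that the buffer holds the new value before the disk does; this gap between write(X) and output(BX) is exactly what makes recovery necessary.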
Recovery and Atomicity (1/2)
- Modifying the database without ensuring that the transaction will commit may leave the database in an inconsistent state
- Consider a transaction Ti that transfers $50 from account A to account B; the goal is either to perform all database modifications made by Ti or none at all
- Several output operations may be required for Ti. A failure may occur after some of these modifications have been made but before all of them are made
Recovery and Atomicity (2/2)
- To ensure atomicity despite failures, we first output information describing the modifications to stable storage, without modifying the database itself
- We study two approaches: log-based recovery and shadow paging
- We assume (initially) that transactions run serially, that is, one after the other
Log-Based Recovery (1/2)
- The log is a sequence of log records, recording all the update activities in the database
- When transaction Ti starts, it registers itself by writing a <Ti start> log record
- Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is the old value and V2 is the new value
Log-Based Recovery (2/2)
- When Ti finishes its last statement, the log record <Ti commit> is written
- We assume for now that log records are written directly to stable storage (that is, they are not buffered)
- Two approaches using logs: deferred or immediate database modification
Deferred Database Modification (1/4)
- The deferred database modification scheme records all database modifications in the log, but defers all the writes until the transaction partially commits
- A transaction starts by writing a <Ti start> record to the log
- A write(X) operation by Ti results in the writing of a new record to the log; the write is not performed on X at this time, but is deferred
Deferred Database Modification (2/4)
- When Ti partially commits, a record <Ti commit> is written to the log
- Finally, the log records are read and used to actually execute the previously deferred writes
- During recovery after a crash, a transaction needs to be redone if both <Ti start> and <Ti commit> are in the log
- Redoing a transaction Ti (redo(Ti)) sets the value of all data items updated by the transaction to the new values
Deferred Database Modification (3/4)
- Crashes can occur while
  - the transaction is executing the original updates, or
  - recovery action is being taken
- Example transactions T0 and T1 (T0 executes before T1):
  T0: read(A); A := A - 50; write(A); read(B); B := B + 50; write(B)
  T1: read(C); C := C - 100; write(C)
Deferred Database Modification (4/4)
- Below is the log as it appears at three time instances (figure omitted)
- If the log on stable storage at the time of crash is as in case:
  (a) No redo actions need to be taken
  (b) redo(T0) must be performed, since <T0 commit> is present
  (c) redo(T0) must be performed, followed by redo(T1), since both <T0 commit> and <T1 commit> are present
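The case analysis above can be sketched as a redo-only recovery pass (a simplified illustration: log records are plain tuples, update records carry only the new value as in the deferred scheme, and the function name recover_deferred is hypothetical; the concrete values follow the example transactions T0 and T1):

```python
# Crash recovery under deferred database modification: redo every
# transaction whose log contains both <Ti start> and <Ti commit>.
# Update records carry only the NEW value, since the database is
# never modified before the transaction partially commits.

def recover_deferred(log, db):
    committed = {t for kind, t, *_ in log if kind == "commit"}
    for kind, t, *rest in log:            # forward scan of the log
        if kind == "update" and t in committed:
            item, new_value = rest
            db[item] = new_value          # redo: install the new value
    return db

# Case (c): both <T0 commit> and <T1 commit> are present in the log
log = [
    ("start", "T0"), ("update", "T0", "A", 950), ("update", "T0", "B", 2050),
    ("commit", "T0"),
    ("start", "T1"), ("update", "T1", "C", 600), ("commit", "T1"),
]
db = {"A": 1000, "B": 2000, "C": 700}   # database state at the crash
print(recover_deferred(log, db))  # {'A': 950, 'B': 2050, 'C': 600}
```

Because redo only ever installs new values, running this pass twice (a crash during recovery) yields the same result, which is the idempotence property required of recovery operations.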
Immediate Database Modification (1/4)
- The immediate modification scheme allows database modifications to be output to the database while the transaction is still in the active state
- The update log record must be written before the database item is written
- Output of updated buffer blocks can take place at any time before or after transaction commit
- The order in which blocks are output can be different from the order in which they are written
Immediate Database Modification (2/4)
Example (BX denotes the block containing X):

  Log                    Write       Output
  <T0 start>
  <T0, A, 1000, 950>
                         A = 950
  <T0, B, 2000, 2050>
                         B = 2050
  <T0 commit>
  <T1 start>
  <T1, C, 700, 600>
                         C = 600
                                     BB, BC
  <T1 commit>
                                     BA
Immediate Database Modification (3/4)
- The recovery procedure has two operations instead of one:
  - undo(Ti) restores the value of all data items updated by Ti to their old values, going backward from the last log record for Ti
  - redo(Ti) sets the value of all data items updated by Ti to the new values, going forward from the first log record for Ti
- Both operations must be idempotent; that is, even if an operation is executed multiple times, the effect is the same as if it were executed once
Immediate Database Modification (4/4)
- When recovering after failure:
  - Transaction Ti needs to be undone if the log contains the record <Ti start> but does not contain the record <Ti commit>
  - Transaction Ti needs to be redone if the log contains both the record <Ti start> and the record <Ti commit>
- Undo operations are performed first, then redo operations
Immediate Database Modification: Recovery Example
- Below is the log as it appears at three time instances (figure omitted)
- Recovery actions in each case above are:
  (a) undo(T0): B is restored to 2000 and A to 1000
  (b) undo(T1) and redo(T0): C is restored to 700, and then A and B are set to 950 and 2050, respectively
  (c) redo(T0) and redo(T1): A and B are set to 950 and 2050, respectively, then C is set to 600
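Sketched in the same style, recovery under immediate modification performs undo first, then redo (a simplified illustration with hypothetical names; update records carry both the old and the new value, as the scheme requires):

```python
# Crash recovery under immediate database modification:
# undo transactions with <start> but no <commit>, then redo committed
# ones. Update records have the form ("update", Ti, X, V1, V2).

def recover_immediate(log, db):
    started = {t for kind, t, *_ in log if kind == "start"}
    committed = {t for kind, t, *_ in log if kind == "commit"}
    # undo first: backward scan, restoring OLD values
    for kind, t, *rest in reversed(log):
        if kind == "update" and t in started - committed:
            item, old, new = rest
            db[item] = old
    # then redo: forward scan, installing NEW values
    for kind, t, *rest in log:
        if kind == "update" and t in committed:
            item, old, new = rest
            db[item] = new
    return db

# Case (b): T0 committed before the crash, T1 did not
log = [
    ("start", "T0"), ("update", "T0", "A", 1000, 950),
    ("update", "T0", "B", 2000, 2050), ("commit", "T0"),
    ("start", "T1"), ("update", "T1", "C", 700, 600),
]
db = {"A": 950, "B": 2050, "C": 600}   # possible on-disk state at the crash
print(recover_immediate(log, db))  # {'A': 950, 'B': 2050, 'C': 700}
```

C is restored to 700 (undo of the incomplete T1), while A and B keep the values redo(T0) installs, matching case (b) above.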
Checkpoints (1/2)
- Problems in the recovery procedure discussed earlier:
  - Searching the entire log is time-consuming
  - Most of the transactions that need to be redone have already written their updates into the database
- To reduce these types of overhead, the system periodically performs checkpoints, which require the following sequence of actions:
  - Output onto stable storage all log records currently residing in main memory
  - Output to the disk all modified buffer blocks
  - Output onto stable storage a log record <checkpoint>
Checkpoints (2/2)
- During recovery we need to consider only the most recent transaction Ti that started before the checkpoint, and the transactions that started after Ti
  - Scan backward from the end of the log to find the most recent <checkpoint> record
  - Continue scanning backward until a record <Ti start> is found
  - Only the part of the log following this start record need be considered
- For all transactions starting from Ti or later with no <Ti commit>, execute undo(Ti)
- Scanning forward in the log, for all transactions starting from Ti or later with <Ti commit>, execute redo(Ti)
Example of Checkpoints
[Figure: timeline with a checkpoint at time Tc and a system failure at time Tf; transactions T1 through T4 overlap these two events]
- T1 can be ignored (its updates were already output to disk due to the checkpoint)
- T2 and T3 are redone
- T4 is undone
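The backward scan that locates the relevant part of the log can be sketched as follows (a simplified illustration with hypothetical names, assuming serial execution and at least one <checkpoint> record in the log):

```python
# Locate the suffix of the log that recovery must examine: scan backward
# to the most recent <checkpoint> record, then continue backward to the
# <Ti start> of the most recent transaction that began before it.

def relevant_suffix(log):
    # index of the most recent <checkpoint> record
    cp = max(i for i, rec in enumerate(log) if rec[0] == "checkpoint")
    # continue backward to the start record of the last transaction
    # that began before the checkpoint
    for i in range(cp - 1, -1, -1):
        if log[i][0] == "start":
            return log[i:]
    return log

log = [
    ("start", "T0"), ("update", "T0", "A", 100, 90), ("commit", "T0"),
    ("start", "T1"), ("update", "T1", "B", 200, 210),
    ("checkpoint",),
    ("update", "T1", "C", 300, 290), ("commit", "T1"),
    ("start", "T2"), ("update", "T2", "D", 400, 410),
]
suffix = relevant_suffix(log)
print(suffix[0])  # ('start', 'T1') -- T0's records can be ignored
```

T0 committed and was flushed by the checkpoint, so its records never need to be scanned again; only the suffix starting at <T1 start> matters.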
Page and Page Table
- The database is partitioned into some number of fixed-length blocks, which are referred to as pages
- The pages do not need to be stored in any particular order on disk
- We use a page table to find the ith page of the database for a given i
Sample Page Table
Shadow Paging (1/6)
- Shadow paging is an alternative to log-based recovery; this scheme is useful if transactions execute serially
- The key idea is to maintain two page tables during the lifetime of a transaction: the current page table and the shadow page table
- Both page tables are identical when the transaction starts
- The shadow page table is never changed over the duration of the transaction; the current page table may be changed when a transaction performs a write operation
- All input and output operations use the current page table to locate database pages on disk
Shadow Paging (2/6)
- Suppose that the transaction Tj performs a write(X), and that X resides on the ith page
- The system executes the write operation as follows:
  - If the ith page is not already in main memory, then the system issues input(X)
Shadow Paging (3/6)
- If this is the first write performed on the ith page by this transaction, then the system modifies the current page table as follows:
  - It finds an unused page on disk
  - It copies the contents of the ith page to the page found in the above step
  - It modifies the current page table so that the ith entry points to the page found in the above step
- It assigns the value of xj to X in the buffer page
Shadow Paging (4/6)
Example: shadow and current page tables after a write to page 4 (figure omitted)
Shadow Paging (5/6)
- To commit a transaction:
  - Flush all modified pages in main memory to disk
  - Output the current page table to disk
  - Make the current page table the new shadow page table
- Once the pointer to the shadow page table has been written, the transaction is committed
- No recovery is needed after a crash; new transactions can start right away, using the shadow page table
- Pages not pointed to from the current/shadow page table should be freed (garbage collected)
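The write and commit steps above can be sketched as follows (a heavily simplified illustration with hypothetical names; a real implementation installs the new shadow table with a single atomic pointer write on disk, represented here by one dictionary assignment):

```python
# Shadow paging sketch: the shadow page table is frozen at transaction
# start; the first write to a page copies it to a fresh disk page and
# redirects the current page table entry; commit swaps the tables.

disk_pages = {0: "page0-data", 1: "page1-data", 2: "page2-data"}
next_free = 3                            # next unused page on disk

shadow_table = {0: 0, 1: 1, 2: 2}        # never changed during the txn
current_table = dict(shadow_table)       # identical at transaction start

def write_page(i, new_data):
    """First write to page i copies it; later writes reuse the copy."""
    global next_free
    if current_table[i] == shadow_table[i]:       # first write to page i
        disk_pages[next_free] = disk_pages[shadow_table[i]]  # copy old page
        current_table[i] = next_free              # redirect current entry
        next_free += 1
    disk_pages[current_table[i]] = new_data

def commit():
    """Installing the current table as the new shadow commits the txn."""
    global shadow_table
    shadow_table = dict(current_table)   # in reality: one atomic pointer write

write_page(1, "page1-new")
assert disk_pages[shadow_table[1]] == "page1-data"   # old version intact
commit()
assert disk_pages[shadow_table[1]] == "page1-new"
```

If a crash occurs before commit, the shadow table still points at the untouched old pages, so no recovery work is needed; only the copied page (here, page 3) must eventually be garbage collected.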
Shadow Paging (6/6)
- Advantages of shadow paging over log-based schemes
  - No overhead of writing log records
  - Recovery is trivial
- Disadvantages
  - Copying the entire page table is very expensive
  - Commit overhead is high
  - Data gets fragmented
  - After every transaction completion, the database pages containing old versions of modified data need to be garbage collected
  - It is hard to extend the algorithm to allow transactions to run concurrently
Recovery with Concurrent Transactions (1/3)
- We modify the log-based recovery schemes to allow multiple transactions to execute concurrently; they share a single buffer and a single log
- Logging is done as described earlier
- The checkpoint technique and the actions taken on recovery have to be changed
Recovery with Concurrent Transactions (2/3)
- Checkpoints are performed as before, except that the checkpoint log record is now of the form <checkpoint L>, where L is the list of transactions active at the time of the checkpoint
- When the system recovers, it first does the following:
  - Initialize undo-list and redo-list to empty
  - Scan the log backward from the end, stopping when the first <checkpoint L> record is found
  - For each record found during the backward scan:
    - If the record is <Ti commit>, add Ti to redo-list
    - If the record is <Ti start> and Ti is not in redo-list, add Ti to undo-list
  - For every Ti in L, if Ti is not in redo-list, add Ti to undo-list
Recovery with Concurrent Transactions (3/3)
- At this point, undo-list consists of incomplete transactions that must be undone, and redo-list consists of finished transactions that must be redone
- Recovery now continues as follows:
  - Scan the log backward from the end, and perform undo for each log record that belongs to a transaction Ti on undo-list, stopping when <Ti start> records have been found for every Ti in undo-list
  - Locate the most recent <checkpoint L> record
  - Scan the log forward from the <checkpoint L> record, and perform redo for each log record that belongs to a transaction Ti on redo-list
- It is important to undo the transactions in undo-list before redoing the transactions in redo-list (example in textbook)
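The list-construction steps above can be sketched as follows (a simplified illustration with hypothetical names, assuming the log contains a <checkpoint L> record):

```python
# Build undo-list and redo-list for concurrent-transaction recovery:
# scan backward to the <checkpoint L> record, classifying commit and
# start records, then account for transactions active at the checkpoint.

def build_lists(log):
    undo_list, redo_list = [], []
    for rec in reversed(log):            # backward scan from the end
        kind = rec[0]
        if kind == "checkpoint":
            active = rec[1]              # L: transactions active at checkpoint
            break
        if kind == "commit":
            redo_list.append(rec[1])
        elif kind == "start" and rec[1] not in redo_list:
            undo_list.append(rec[1])
    for t in active:                     # active at checkpoint, no commit seen
        if t not in redo_list and t not in undo_list:
            undo_list.append(t)
    return undo_list, redo_list

log = [
    ("start", "T0"), ("start", "T1"),
    ("checkpoint", ["T0", "T1"]),
    ("commit", "T1"),
    ("start", "T2"), ("commit", "T2"),
    ("start", "T3"),
]
print(build_lists(log))  # (['T3', 'T0'], ['T2', 'T1'])
```

T1 committed after the checkpoint, so it lands on redo-list; T0 was active at the checkpoint and never committed, and T3 started but never committed, so both land on undo-list.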
Log Record Buffering (1/2)
- Log record buffering: log records are buffered in main memory, instead of being output directly to stable storage
- Log records are output to stable storage when a block of log records in the buffer is full, or when a log force operation is executed
- A log force is performed to commit a transaction by forcing all its log records to stable storage
- Several log records can thus be output using a single output operation, reducing the I/O cost
Log Record Buffering (2/2)
- The rules below must be followed if log records are buffered:
  - Log records are output to stable storage in the order in which they are created
  - Transaction Ti enters the commit state only when the log record <Ti commit> has been output to stable storage
  - Before a block of data in main memory is output to the database, all log records pertaining to data in that block must have been output to stable storage
    - This rule is called the write-ahead logging or WAL rule
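The WAL rule can be sketched as follows (a simplified illustration with hypothetical names; last_lsn is an assumed parameter standing for the position of the last log record describing an update to the block):

```python
# Write-ahead logging (WAL) sketch: before a data block is output to
# disk, all log records pertaining to data in that block must already
# be on stable storage.

log_buffer = []          # log records buffered in main memory
stable_log = []          # log records on stable storage

def append_log(rec):
    """Add a log record to the in-memory log buffer."""
    log_buffer.append(rec)

def log_force():
    """Force all buffered log records to stable storage (e.g. at commit)."""
    stable_log.extend(log_buffer)
    log_buffer.clear()

def output_block(block, last_lsn):
    """WAL rule: flush the log up to last_lsn before writing the block."""
    if len(stable_log) <= last_lsn:
        log_force()                      # ensure log precedes data on disk
    # ... the actual block write to disk would happen here ...

append_log(("T0", "A", 1000, 950))       # this record has position 0
output_block("BA", last_lsn=0)           # forces the log out first
assert ("T0", "A", 1000, 950) in stable_log
```

Flushing the whole buffer is coarser than strictly necessary (only records up to last_lsn must reach stable storage), but it preserves the first rule above: records go out in creation order.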