8/2/2019 Part14 Crash
Part 14 -crash 1
Crash Recovery
in case of a system crash (failure) we require a recovery scheme to:
  - detect failures
  - restore the database to a consistent state
Failure Types
Volatile Storage
  - main memory and cache
  - normally does not survive a crash
Failure Types
Nonvolatile Storage
  - usually survives a crash
  - example: disk and magnetic tape
  - exceptions: head crash, etc.
Stable Storage
  - "never" lost (??)
  - can replicate on several nonvolatile media with independent failure modes
Failure Types
Logical Error
  - program-related error: divide by zero, overflow, access to non-existent memory, etc.
  - can often be restarted after a software fix is made
System Error
  - example: deadlock, or some undesirable system state entered
  - re-execution often possible
Failure Types
System Crash
  - some hardware problem, volatile memory lost, ...
Disk Failure
  - head crash, etc.
  - error during data transfer (sometimes recoverable)
Basic Terminology
input(X)
  transfer the physical block on which data item X resides into main memory
output(X)
  transfer the buffer block on which X resides onto its physical block (disk)
read(X, xi)
  assign the value of X to local variable xi:
  - if the block in which X resides is not in main memory, then issue an input(X)
  - assign xi the value of data item X from the buffer block
Basic Terminology
write(X, xi)
  assign the value of local variable xi to data item X in the buffer block:
  - if the block in which X resides is not in main memory, then issue an input(X) first
  - assign xi to X in buffer memory
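The four operations above can be sketched in a few lines of Python. This is only an illustrative model, not a real storage engine: the dictionaries `disk`, `buf` and `block_of`, the block names, and the function names `input_block` and `output_block` are assumptions made for the example.

```python
# Hypothetical model: blocks are dicts of data items, keyed by a block id.
disk = {"blk1": {"A": 1000}, "blk2": {"B": 2000}}   # physical blocks
buf = {}                                             # blocks in main memory
block_of = {"A": "blk1", "B": "blk2"}                # block holding each item

def input_block(x):
    """input(X): copy the physical block holding X into main memory."""
    b = block_of[x]
    buf[b] = dict(disk[b])

def output_block(x):
    """output(X): copy the buffer block holding X onto its physical block."""
    b = block_of[x]
    disk[b] = dict(buf[b])

def read(x):
    """read(X, xi): return X from the buffer, issuing input(X) if needed."""
    if block_of[x] not in buf:
        input_block(x)
    return buf[block_of[x]][x]

def write(x, value):
    """write(X, xi): store the value in the buffer block only, not on disk."""
    if block_of[x] not in buf:
        input_block(x)
    buf[block_of[x]][x] = value

a1 = read("A")
write("A", a1 - 50)
in_buffer, on_disk_before = buf["blk1"]["A"], disk["blk1"]["A"]
output_block("A")          # only now does the new value reach the disk
on_disk_after = disk["blk1"]["A"]
```

Note that write changes only the buffer copy; the disk copy changes only when output is issued.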
EXAMPLE
consider the following example from a banking system, where $50 is withdrawn from account A and deposited into account B:
read( A, a1 )
a1 = a1 - 50
write( A, a1 )
read( B, b1 )
b1 = b1 + 50
write( B, b1 )
Failure Modes
can leave the database in an inconsistent state, e.g.:
  - failure after output(A) but before output(B)
  - before output(A) and output(B) are executed, the physical database blocks and the memory blocks differ: a problem if a crash occurs!
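The first failure mode can be illustrated with a small simulation of the banking example. It is a sketch under simplifying assumptions: `disk` and `memory` are plain dictionaries, and the flag `crash_after_output_a` is a hypothetical way to model the crash.

```python
disk = {"A": 1000, "B": 2000}     # physical database (survives the crash)
memory = dict(disk)                # buffer copies (lost in the crash)

def transfer(crash_after_output_a):
    memory["A"] -= 50              # write(A, a1)
    memory["B"] += 50              # write(B, b1)
    disk["A"] = memory["A"]        # output(A)
    if crash_after_output_a:
        memory.clear()             # volatile storage is lost
        return
    disk["B"] = memory["B"]        # output(B)

transfer(crash_after_output_a=True)
total_after_crash = disk["A"] + disk["B"]   # $50 has disappeared
```

Before the transfer A + B is 3000; after the simulated crash the disk holds A = 950 and B = 2000, so the sum no longer matches.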
Transaction
a basic program unit
its execution preserves database consistency
  - the database is consistent both before and after its execution
a transaction may not always complete
  - it may be aborted for various reasons
  - the database must then be restored (rolled back) to the state before the transaction started
the transaction must be atomic
  - either all of its instructions are completed or none are performed
Crash Recovery Methods
Incremental Log with Deferred Updates
  - during transaction execution, all writes are deferred until the partial commit stage
  - all updates are recorded on the log, which is written to stable storage
  - for example: let A = 1000, B = 2000 at the start

T1                 Log
                   <T1, start>
read(A, a)
a := a - 50
write(A, a)        <T1, A, 950>
read(B, b)
b := b + 50
write(B, b)        <T1, B, 2050>
                   <T1, commit>
. . .
other transactions

the log is used to update the database after the transaction commits.
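A minimal sketch of the deferred-update idea, assuming a log kept as a Python list of tuples. The record layout `("write", txn, item, new_value)` and the helper names `run_t1` and `apply_committed` are illustrative assumptions, not notation from the slides.

```python
db = {"A": 1000, "B": 2000}   # database on disk
log = []                       # stands in for the log on stable storage

def run_t1():
    pending = {}                               # deferred writes, not yet in db
    log.append(("start", "T1"))
    a = pending.get("A", db["A"])              # read(A, a)
    a = a - 50
    pending["A"] = a                           # write(A, a) is deferred
    log.append(("write", "T1", "A", a))        # only the new value is logged
    b = pending.get("B", db["B"])              # read(B, b)
    b = b + 50
    pending["B"] = b                           # write(B, b) is deferred
    log.append(("write", "T1", "B", b))
    log.append(("commit", "T1"))

def apply_committed(txn):
    """Use the log to update the database, but only after txn has committed."""
    if ("commit", txn) not in log:
        return
    for rec in log:
        if rec[0] == "write" and rec[1] == txn:
            db[rec[2]] = rec[3]

run_t1()
before_apply = dict(db)        # database is still untouched here
apply_committed("T1")
```

Because nothing reaches the database before the commit record is logged, an uncommitted transaction never needs to be undone under this scheme.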
Recovery Procedure
redo(Ti)
  - sets all data values updated by Ti to their new values
  - Ti needs a redo if both <Ti, start> and <Ti, commit> are found in the log
  - redo is idempotent: it can be executed more than once with the same final result. This matters, for example, if the system crashes while performing a recovery.
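The redo rule and its idempotence can be sketched as follows. The tuple-based log format and the function name `redo_committed` are assumptions for illustration; T2 is a hypothetical second transaction that never committed.

```python
def redo_committed(log, db):
    """Redo every transaction with both a start and a commit record in the log."""
    started = {r[1] for r in log if r[0] == "start"}
    committed = {r[1] for r in log if r[0] == "commit"}
    for rec in log:                       # forward scan, in log order
        if rec[0] == "write" and rec[1] in started & committed:
            db[rec[2]] = rec[3]           # set the item to its new value
    return db

log = [
    ("start", "T1"), ("write", "T1", "A", 950), ("write", "T1", "B", 2050),
    ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "C", 600),   # T2 has no commit record
]
db = {"A": 1000, "B": 2000, "C": 700}

once = dict(redo_committed(log, db))
twice = dict(redo_committed(log, db))   # e.g. recovery itself was interrupted
```

Running the function a second time leaves the database unchanged, because redo assigns absolute new values rather than applying increments.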
Crash Recovery Methods
Incremental Log with Immediate Update
  - all updates are applied to the database; we keep an incremental log of all changes
  - <Ti, start> is written to stable storage when Ti begins
  - for each write, <Ti, data_item, old_value, new_value> is written to stable storage before any output(X) is performed
    e.g. write(X, 950)
  - when Ti partially commits, <Ti, commit> is written to the log
Recovery Procedure
[Incremental Log with Immediate Update]
redo(Ti)
  - the same as before: set updated items to their new values
undo(Ti)
  - performed if the log contains a <Ti, start> but no <Ti, commit>
  - restores the values of items updated by Ti to their old values
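A hedged sketch of recovery under immediate update, assuming log records of the form `("update", txn, item, old_value, new_value)`. The function name `recover` and the sample values (T0, T1, item C) are invented for the example.

```python
def recover(log, db):
    """Undo transactions with a start but no commit, then redo committed ones."""
    started = {r[1] for r in log if r[0] == "start"}
    committed = {r[1] for r in log if r[0] == "commit"}
    incomplete = started - committed
    for rec in reversed(log):             # undo: scan backwards, restore old values
        if rec[0] == "update" and rec[1] in incomplete:
            db[rec[2]] = rec[3]
    for rec in log:                       # redo: scan forwards, install new values
        if rec[0] == "update" and rec[1] in committed:
            db[rec[2]] = rec[4]
    return db

log = [
    ("start", "T0"),
    ("update", "T0", "A", 1000, 950),
    ("update", "T0", "B", 2000, 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("update", "T1", "C", 700, 600),      # T1 never commits
]
# Assumed disk state at the crash: A and C were output, B was not.
db = {"A": 950, "B": 2000, "C": 600}
recover(log, db)
```

After recovery the committed transfer by T0 is fully visible and T1's early update to C has been reversed.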
Checkpoints
recovery with logs requires the entire log to be scanned
  - the search time grows with log size
  - many redone transactions are unnecessary, since their updates have already been written to disk
we can maintain periodic checkpoints:
  - save all log records currently residing in main memory (if any) onto stable storage
  - output all modified buffer blocks to disk
  - output a <checkpoint> record to the log on stable storage
Checkpoints - recovery
Recovery:
  - find the most recent transaction Ti that started executing before the last checkpoint
  - all the redo and undo operations apply only to Ti and subsequent transactions Tj
  - much less time consuming
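The selection step can be sketched like this, assuming transactions run one at a time (serially) and a tuple-based log. The function name `transactions_to_examine` and the sample log are illustrative assumptions.

```python
def transactions_to_examine(log):
    """Return the transactions recovery must consider, assuming serial execution."""
    checkpoints = [i for i, r in enumerate(log) if r[0] == "checkpoint"]
    last_cp = checkpoints[-1] if checkpoints else -1
    first = 0                              # with no checkpoint, examine everything
    for i in range(last_cp, -1, -1):       # walk back from the checkpoint
        if log[i][0] == "start":
            first = i                      # most recent start before the checkpoint
            break
    return [r[1] for r in log[first:] if r[0] == "start"]

log = [
    ("start", "T0"), ("update", "T0", "A", 1000, 950), ("commit", "T0"),
    ("start", "T1"), ("update", "T1", "B", 2000, 2050),
    ("checkpoint",),
    ("update", "T1", "B", 2050, 2100), ("commit", "T1"),
    ("start", "T2"), ("update", "T2", "C", 700, 600),
]
todo = transactions_to_examine(log)
```

Here T0 completed before the checkpoint, so its updates are already on disk and it is skipped; only T1 and T2 need redo or undo decisions.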
Buffer Management
OSs with virtual memory have paging schemes to evict resident pages as required.
this may work against us:
  - the OS may evict a modified block before Ti commits
  - as well, log records are often held in main memory until a buffer block is full before being sent to stable storage
  - if Ti now crashes, an inconsistency may result
most OSs rarely support database requirements
Buffer Management
it may be possible for the db manager to allocate an area of memory and manage it independently of the OS (i.e. memory reserved for database use only).
thus every <Ti, data_item, old_value, new_value> record must be written to stable storage before output of the block on which the item resides (all entries).
before output of a block in main memory, all log records pertaining to the block must be written to stable storage first.
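The rule that log records reach stable storage before the block they describe can be sketched with a toy buffer manager. The class `BufferManager`, its attribute names, and the choice to flush the log buffer up to the last record for the block are assumptions made for this illustration.

```python
class BufferManager:
    """Toy buffer manager that forces log records out before a block is output."""

    def __init__(self, disk):
        self.disk = disk            # block id -> dict of items (on disk)
        self.pages = {}             # block id -> dict of items (in memory)
        self.log_buffer = []        # log records still in main memory
        self.stable_log = []        # log records already on stable storage

    def write(self, txn, block, item, new_value):
        page = self.pages.setdefault(block, dict(self.disk[block]))
        # Record <txn, item, old_value, new_value>, tagged with its block.
        self.log_buffer.append((txn, block, item, page[item], new_value))
        page[item] = new_value

    def output(self, block):
        # Flush the log up to the last record that pertains to this block.
        idx = [i for i, r in enumerate(self.log_buffer) if r[1] == block]
        if idx:
            upto = idx[-1] + 1
            self.stable_log.extend(self.log_buffer[:upto])
            del self.log_buffer[:upto]
        self.disk[block] = dict(self.pages[block])   # only now output the block

bm = BufferManager({"blk1": {"A": 1000}, "blk2": {"B": 2000}})
bm.write("T1", "blk1", "A", 950)
bm.write("T1", "blk2", "B", 2050)
bm.output("blk1")
```

After the call to output, the record for A is on stable storage while the record for B is still buffered, since blk2 has not been output.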
Shadow Paging
the database is partitioned into a number of fixed-length blocks (pages).
we can use a page table to translate each logical block into its physical block:
[Figure: logical page table with entries 1, 2, 3, ..., n pointing to physical pages on disk]
we maintain two page tables:
- current page table: used by Ti.
- shadow page table: a copy of the table before Ti executes, never changed during execution of Ti, and stored in stable storage.
Shadow Paging
example:
a write(X, xi) is issued and X resides on the k-th page:
  - if the k-th page is not in memory, then issue an input(X)
  - if this is the first write to the k-th page:
      - find a free page on disk
      - modify the current page table so the k-th entry points to the new page
  - assign xi to X in the buffer page
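The write path above can be sketched as a copy-on-first-write scheme. This is a simplification: the class name `ShadowPagedDB` is invented, pages are Python dicts, and the buffer is skipped so that the write goes straight to the newly allocated physical page.

```python
class ShadowPagedDB:
    """Toy shadow-paging store: first write to a page redirects it to a free page."""

    def __init__(self, pages):
        n = len(pages)
        self.disk = [dict(p) for p in pages] + [None] * n   # second half is free
        self.free = list(range(n, 2 * n))                   # free physical pages
        self.shadow = list(range(n))      # never changed while Ti executes
        self.current = list(self.shadow)  # used by Ti
        self.dirty = set()                # logical pages already written by Ti

    def write(self, k, item, value):
        if k not in self.dirty:                       # first write to the k-th page
            new_phys = self.free.pop(0)               # find a free page on disk
            self.disk[new_phys] = dict(self.disk[self.current[k]])
            self.current[k] = new_phys                # k-th entry points to new page
            self.dirty.add(k)
        self.disk[self.current[k]][item] = value      # assign xi to X

db = ShadowPagedDB([{"A": 1000}, {"B": 2000}])
db.write(0, "A", 950)
db.write(0, "A", 900)     # second write reuses the same new page
```

The old physical page 0 still holds A = 1000 and is still referenced by the shadow table, which is what makes rollback trivial.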
Shadow Paging
the shadow page table is stored in non-volatile memory just prior to the execution of Ti, so we can recover it after a crash.
when Ti commits, the current page table becomes the new shadow page table.
if the current page table is lost in a crash, it is simple to roll the system back to the last consistent state.
the overhead of log records is eliminated.
Shadow Paging
recovery is fast since there are no redos or undos to perform.
In order to commit a transaction:
  - all modified buffer pages in main memory are output to disk
  - output the current page table to disk (do not overwrite the shadow page table; we may need it to recover if a crash occurs now)
  - send the disk address of the current page table to stable storage; this overwrites the address of the previous shadow page table
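The three commit steps can be sketched as below. The dictionaries `disk` and `stable`, the addresses `"pt0"` and `"pt1"`, and the `crash_before_switch` flag are all hypothetical names used only to show why the final pointer update is the commit point.

```python
# Hypothetical layout: `disk` holds data pages and page tables by address;
# `stable` holds only the disk address of the shadow page table.
disk = {"pt0": [0, 1], 0: {"A": 1000}, 1: {"B": 2000}}
stable = {"shadow_addr": "pt0"}

def commit(current_table, modified_pages, new_table_addr, crash_before_switch=False):
    for phys, contents in modified_pages.items():
        disk[phys] = dict(contents)              # 1. output modified pages
    disk[new_table_addr] = list(current_table)   # 2. output current page table
    if crash_before_switch:                      #    (shadow table not overwritten)
        return
    stable["shadow_addr"] = new_table_addr       # 3. switch the address: commit point

def visible_state():
    """State seen after a restart: follow the shadow page table address."""
    state = {}
    for phys in disk[stable["shadow_addr"]]:
        state.update(disk[phys])
    return state

new_pages = {2: {"A": 950}, 3: {"B": 2050}}
commit([2, 3], new_pages, "pt1", crash_before_switch=True)
after_crash = visible_state()      # old, consistent state
commit([2, 3], new_pages, "pt1")
after_commit = visible_state()     # new state
```

A crash at any point before step 3 leaves the old shadow page table in force, so the database simply appears as it was before the transaction.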
Shadow Paging
Disadvantages:
data fragmentation:
  - the database becomes scattered over the disk (slow sequential access); we may need to repack to maintain fast sequential access
garbage collection:
  - after a commit, the old version of the data is no longer reachable (unreferenced) but is not part of free space. We must perform periodic garbage collection to recover the lost disk space.
Loss of Non-volatile Storage
typically does not occur frequently
do a periodic dump from disk to magnetic tape (?)
recover to the point of the last dump, then follow the log to restore the database.
Recovery with Concurrent Transactions
the scheme depends on the concurrency-control scheme used.
basically, to roll back a transaction, we must undo its updates.
situation:
  - T0 is rolled back: a data item, B, that it updated must be restored to its old value. We can use the undo information in the log for log-based recovery systems.
  - but if T1 performed another update to B before T0 is rolled back, then T1's update is lost when T0 is rolled back.
thus we require that if T updates data item B, then no other transaction may update B until T either commits or is rolled back.
this can be ensured with a strict two-phase locking scheme (exclusive locks held until the end of a transaction).
Recovery with Concurrent Transactions
Transaction Rollbacks
transaction Ti is rolled back by scanning the log backwards.
for every entry <Ti, Xj, V1, V2> found, the data item Xj is restored to its old value V1 (it is possible that Ti performed several updates to Xj).
continue the scan until <Ti, start> is found.
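The backward scan can be sketched as follows, assuming update records of the form `("update", Ti, Xj, V1, V2)`. The function name `rollback` and the sample log, in which T0 updates B twice while T1 runs concurrently on a different item, are illustrative assumptions.

```python
def rollback(log, db, txn):
    """Roll back txn by scanning the log backwards to its start record."""
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] == txn:
            item, old_value = rec[2], rec[3]
            db[item] = old_value             # restore Xj to V1
        elif rec[0] == "start" and rec[1] == txn:
            break                            # <Ti, start> found: stop the scan
    return db

log = [
    ("start", "T0"),
    ("update", "T0", "B", 2000, 2050),
    ("start", "T1"),
    ("update", "T1", "C", 700, 600),
    ("update", "T0", "B", 2050, 2100),       # second update to B by T0
]
db = {"B": 2100, "C": 600}
rollback(log, db, "T0")
```

Scanning backwards matters when a transaction updated the same item more than once: the last restore applied is the earliest old value, here 2000.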
Recovery with Concurrent Transactions
Checkpoints
the recovery scheme is more complex with concurrent transaction execution than in the previous form: several transactions may have been active at the last checkpoint.
we require that the checkpoint log entry be <checkpoint L>, where L is a list of the transactions active at the time of the checkpoint.
as before, it is assumed that transactions do not perform updates to either the log or to buffer blocks during the checkpoint.
Recovery with Concurrent Transactions
Restart Recovery
initially, create two empty lists, undo-list and redo-list, for transactions requiring these operations.
next, scan the log backwards until the first <checkpoint L> record is found; during the scan:
  - for each <Ti, commit> found, add Ti to the redo-list
  - for each <Ti, start> found, if Ti is not in the redo-list, then add it to the undo-list
next, check the list L in the checkpoint record:
  - for each Ti in L, if Ti is not in the redo-list, then add Ti to the undo-list
Recovery with Concurrent Transactions
once the two lists have been constructed:
  - rescan the log from the most recent record backwards, performing an undo for each log record that belongs to a transaction on the undo-list (log records of redo-list transactions are ignored). Stop the scan when <Ti, start> records have been found for every transaction Ti in the undo-list.
  - locate the most recent <checkpoint L> record again.
  - scan the log forward and perform a redo for each record that belongs to a transaction on the redo-list. Ignore log records of transactions in the undo-list.
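The whole restart procedure can be sketched in one function. The tuple-based log format, the function name `restart_recovery`, and the sample log are assumptions for illustration; the sketch also assumes the log contains at least one checkpoint record.

```python
def restart_recovery(log, db):
    """Build undo-list and redo-list from the last checkpoint, then undo and redo."""
    cp = max(i for i, r in enumerate(log) if r[0] == "checkpoint")
    undo_list, redo_list = set(), set()
    for rec in reversed(log[cp + 1:]):            # backwards to the checkpoint
        if rec[0] == "commit":
            redo_list.add(rec[1])
        elif rec[0] == "start" and rec[1] not in redo_list:
            undo_list.add(rec[1])
    for txn in log[cp][1]:                        # list L of active transactions
        if txn not in redo_list:
            undo_list.add(txn)

    pending = set(undo_list)                      # start records still to be found
    for rec in reversed(log):                     # undo pass, from the end
        if not pending:
            break
        if rec[0] == "update" and rec[1] in undo_list:
            db[rec[2]] = rec[3]                   # restore old value
        elif rec[0] == "start" and rec[1] in pending:
            pending.discard(rec[1])

    for rec in log[cp + 1:]:                      # redo pass, forward from checkpoint
        if rec[0] == "update" and rec[1] in redo_list:
            db[rec[2]] = rec[4]                   # install new value
    return undo_list, redo_list

log = [
    ("start", "T0"), ("update", "T0", "A", 1000, 950),
    ("start", "T1"), ("update", "T1", "B", 2000, 2050),
    ("checkpoint", ["T0", "T1"]),
    ("commit", "T0"),
    ("start", "T2"), ("update", "T2", "C", 700, 600),
    ("update", "T1", "B", 2050, 2100),
    ("commit", "T2"),
]
db = {"A": 950, "B": 2100, "C": 700}              # assumed disk state at the crash
undo_list, redo_list = restart_recovery(log, db)
```

In this example T1 was active at the checkpoint and never committed, so it lands on the undo-list, while T0 and T2 committed after the checkpoint and are redone.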