CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

61
CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014

Transcript of CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Page 1: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

CSE 3330 Database Concepts

Dr. Andy NagarSpring 2014

Page 2: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

A transaction is a sequence of operations that must be executed as a whole.

Every DB action takes place inside a transaction.

SAVINGS

User

$1000

CHECKING

User

$0

Either both (1) and (2) happen or neither!

Transfer $500

1. Debit savings

2. Credit checking

Page 3: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009

Transaction supportTransaction

Action, or series of actions, carried out by user or application, which reads or updates contents of database.

Logical unit of work on the database. Application program is series of

transactions with non-database processing in between.

Transforms database from one consistent state to another, although consistency may be violated during transaction.

3

Page 4: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

We abstract away most of the application code when thinking about transactions.

Read Balance1Write Balance1Read Balance2Write Balance2

Transfer $500

1. Debit savings

2. Credit checking

User’s

point of

view

Code

writer’s

point of

viewConcurrency

control ‘s &

recovery’s point

of view

Page 5: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Schedule: The order of execution of operations of two or more transactions.

R(A) R(C)

W(A)R(B)

W(C) R(B) W(B)

W(B)

Transaction1 Transaction2

Schedule S1

Tim

e

Read data item A

(typically a tuple)

Page 6: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009

Example transaction

6

Page 7: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Why do we need transactions?

R(A)W(A)

Transaction 1: Add $100 to account A

R(A)W(A)

•Tim

eTransaction 2:

Add $200 to account A

Page 8: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

R(A)

W(A)

R(A)W(A)

•Tim

eWhat will be the final account balance?

Transaction 1: Add $100 to account A

Transaction 2: Add $200 to account A

The Lost Update Problem

Page 9: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009

Lost update problemSuccessfully completed update is

overridden by another user.T1 withdrawing $10 from an account with

balx, initially $100.

T2 depositing $100 into same account. Serially, final balance would be $190.

9

Page 10: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009

Lost update problem

Loss of T2’s update avoided by preventing T1 from reading balx until after update.

10

Page 11: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

R(A)W(A)

F A I L

R(A)W(A)

•Tim

eWhat will be the final account balance?

Transaction 1: Add $100 to account A

Transaction 2: Add $200 to account A

Dirty reads cause problems.

Page 12: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Abort or roll back are the official words for “fail”.

Commit

All your writes will definitely absolutely be recorded and will not be undone, and all the values you read are committed too.

Abort/rollback

Undo all of your writes!

We’ll give a more technical definition later on.

Page 13: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

The concurrent execution of transactions must be such

that each transaction appears to execute in

isolation.

Page 14: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

The ACID properties:

Atomic– A transaction is performed entirely or not at all

Consistency Preservation– A transaction’s execution takes the database from one correct

state to another

Isolation– The updates of a transaction must not be made visible to other

transactions until it is committed

Durability– If a transaction changes the database and is committed, the

changes must never be lost because of subsequent failure

Page 15: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Serial Schedule: the transactions are performed one after the other.

Read(A)Write(A)Read(B)Write(B) Read(C) Write(C) Read(B) Write(B)

Read(A) Write(A) Read(B) Write(B) Read(C) Write(C) Read(B) Write(B)

Page 16: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.
Page 17: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.
Page 18: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.
Page 19: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.
Page 20: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.
Page 21: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.
Page 22: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

A schedule is serializable if it is guaranteed to give the same final result as some serial schedule.

Which of these are serializable?

Read(A) Read(A) Write(A)Write(A)Read(B)Write(B) Read(B) Write(B)

Read(A) Read(A)Write(A) Write(A)

Read(B)Write(B) Read(B) Write(B)

Read(A) Write(A) Read(A) Write(A)Read(B)Write(B) Read(B) Write(B)

Page 23: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Why is that last schedule ok?

Read(A) Write(A)Read(B)Write(B) Read(A) Write(A) Read(B) Write(B)

Will always give the same

result as

And this will never cause an

interleaving problem

Read(A) Write(A) Read(A) Write(A)Read(B)Write(B) Read(B) Write(B)

Page 24: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 24

Concurrency control protocolsTwo basic concurrency control techniques:

Locking, Timestamping.

Both are conservative approaches: delay transactions in case they conflict with other transactions.

Optimistic methods assume conflict is rare and only check for conflicts at commit.

Page 25: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 25

Locking methods Transaction uses locks to deny access to other transactions and so prevent incorrect updates.

Most widely used approach to ensure serializability.

Generally, a transaction must claim a shared (read) or exclusive (write) lock on a data item before read or write.

Lock prevents another transaction from modifying item or even reading it, in the case of a write lock.

Page 26: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 26

Locking – basic rulesIf transaction has shared lock on item, can

read but not update item.If transaction has exclusive lock on item,

can both read and update item.Reads cannot conflict, so more than one

transaction can hold shared locks simultaneously on same item.

Exclusive lock gives transaction exclusive access to that item.

Page 27: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 27

Incorrect locking schedule

Page 28: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 28

Incorrect locking scheduleIf at start, balx = 100, baly = 400, result

should be:balx = 220, baly = 330, if T7 executes before T8, or balx = 210, baly = 340, if T8 executes before T7.

However, result gives balx = 220 and baly = 340.

S is not a serializable schedule.Problem is that transactions release locks too

soon, resulting in loss of total isolation and atomicity.

To guarantee serializability, need an additional protocol concerning the positioning of lock and unlock operations in every transaction.

Page 29: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 29

Two- phase locking Transaction follows 2PL protocol if all locking operations precede first unlock operation in the transaction.

Two phases for transaction:Growing phase - acquires all locks but

cannot release any locks.Shrinking phase - releases locks but cannot

acquire any new locks.

Page 30: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 30

Incorrect Schedule

T1 releases its lock too soon. For 2PL, it needs to continue to keep its lock until all operations are over.

Page 31: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 31

Conflicting Operations

Page 32: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Certain pairs of operations conflict with each other.

Read(A) Read(A) Write(A)Write(A)Read(B)Write(B) Read(B) Write(B)

Read(A) Read(A)Write(A) Write(A)

Read(B)Write(B) Read(B) Write(B)

Read(A) Write(A) Read(A) Write(A)Read(B)Write(B) Read(B) Write(B)

Read(A) Write(A)

Write(A) Write(A)

Write(A) Write(A)

Write(A) Read(A)

Page 33: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Let’s clean up by collapsing all the nodes for a transaction into one big node.

Read(A) Read(A) Write(A)Write(A)Read(B)Write(B) Read(B) Write(B)

Read(A) Read(A)Write(A) Write(A)

Read(B)Write(B) Read(B) Write(B)

Read(A) Write(A) Read(A) Write(A)Read(B)Write(B) Read(B) Write(B)

Blue Black Blue Black Blue Black

Page 34: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Let’s clean up by collapsing all the nodes for a transaction into one big node.

Read(A) Read(A) Write(A)Write(A)Read(B)Write(B) Read(B) Write(B)

Read(A) Read(A)Write(A) Write(A)

Read(B)Write(B) Read(B) Write(B)

Read(A) Write(A) Read(A) Write(A)Read(B)Write(B) Read(B) Write(B)

Blue Black Blue Black Blue Black

A schedule is conflict serializable iff its dependency graph is acyclic.

Page 35: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Is this schedule conflict serializable?

R1(A) R2(C) W2(C)W1(A)R1(C)W1(C) R3(C) W3(C) R2(B) W2(B)

Schedule S1

T1 T2 T3 R2(C) W2(C) R2(B) W2(B) R1(A)W1(A)R1(C)W1(C) R3(C) W3(C)

Schedule S2

T1 T2 T3

Page 36: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

What serial schedule is S1 equivalent to, if any?

W(B) R(A) W(A)R(A)R(B)

R(C)W(B)

W(C) R)C(

S1 S2 R(A)

W(A)

R(C)

W(C)

R)C(W(B) R(A)R(B)W(B)

Page 37: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

R(A) W(A)R(A)R(C)

R(C)W(A)

W(C) R)A(

S1

What serial schedule is S1 equivalent to, if any?

Nothing

Page 38: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Locks are the basis of most protocols to guarantee serializability.Before you can read (write) item A, you must get

a read (write) lock on A from the lock manager, which you eventually give up.

Shared lock = read lock: many transactions can hold a shared lock on the same item at the same time.

Exclusive lock = write lock: only one transaction can hold an exclusive lock on an item at any given time.

Page 39: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

The lock manager makes decisions based on its compatibility matrix

X

X S

S

N

N

N

Y

Pro

cess

T2

Process T1

Page 40: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Lock management keeps a hash table of current locks

Locking and unlocking must be atomic: ensured by semaphores

# trans who have it locked

Lock type (S, X)

Queue of requests

Data item ID

Trans 109 has lock

Trans 448 wants X lock

Trans 599 wants S lock

Trans 47 has lock

Entry in the lock table

Queue to lock a particular data item

Page 41: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

2-Phase Locking (2PL): no new locks once you’ve given one up.Two phases:

getting locks (growing),

then releasing locks (shrinking).

T3

T2T1

R/W(A) R/W(A)

R/W(B) R/W(B) R/W(C)R/W(C)

At least one end of each arrow is a

Write

Page 42: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Locks can also lead to deadlocksDeadlock = circular wait among

transactions waiting for each other to release a lock. – Prevent: every transaction must

lock all items it needs in advance (ha!)

– Or detect

Livelock: a transaction cannot proceed for an indefinite period of time while other transactions in the system continue normally. – Fair waiting schemes (i.e., first-

come-first-served) for locks

T1 T2

Page 43: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Detect deadlocks by timeouts, or occasionally looking for cycles in a waits-for graph.

Edge from Ti to Tj iff Ti waits for Tj to release a lock

T1: S(A) S(D)…………..S(B)

T2: …………………X(B) ……………………………………X(C)

T3: …………………………………. S(D) S(C) ………………... X(A)

T4: …………………………………………………… X(B)

T1 T2

T4 T3

Page 44: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

Even S2PL does not solve the phantom problem.

Non-repeatable read! Would never happen in serial execution.

Undergrad Major Age

Alice CS 26

Bob ECE 36

Carla Math 45

David Psych 52

Elaine CS 67

B-tree on

Major

Find the oldest CS undergrad

Find the oldest CS undergrad

Add Frank, a 71-year-old CS major; commit.

Schedule

Elaine

Frank

Frank CS 71

Page 45: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 59

Deadlock An impasse that may result when two (or more) transactions are each waiting for locks held by the other to be released.

Page 46: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 60

DeadlockOnly one way to break deadlock: abort one

or more of the transactions.Deadlock should be transparent to user, so

DBMS should restart transaction(s).Three general techniques for handling

deadlock: Timeouts.Deadlock prevention.Deadlock detection and recovery.

Page 47: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 61

Timestamp methodsTransactions ordered globally so that older transactions, transactions with smaller timestamps, get priority in the event of conflict.

Conflict is resolved by rolling back and restarting transaction.

No locks so no deadlock.

Page 48: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 65

Recovery

Process of restoring database to a correct state in the event of a failure.

Need for Recovery ControlTwo types of storage: volatile (main memory)

and nonvolatile.Volatile storage does not survive system

crashes. Stable storage represents information that

has been replicated in several nonvolatile storage media with independent failure modes.

Page 49: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 66

Types of FailureSystem crashes, resulting in loss of main

memory.Media failures, resulting in loss of parts of

secondary storage.Application software errors.Natural physical disasters.Carelessness or unintentional destruction of

data or facilities.Sabotage.

Page 50: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 67

Transactions and recoveryTransactions represent basic unit of recovery.Recovery manager responsible for atomicity

and durability.If failure occurs between commit and database

buffers being flushed to secondary storage then, to ensure durability, recovery manager has to redo (rollforward) transaction’s updates.

If transaction had not committed at failure time, recovery manager has to undo (rollback) any effects of that transaction for atomicity.

Partial undo - only one transaction has to be undone.

Global undo - all transactions have to be undone.

Page 51: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 68

Transactions and recovery

DBMS starts at time t0, but fails at time tf. Assume data for transactions T2 and T3 have been written to secondary storage.

T1 and T6 have to be undone. In absence of any other information, recovery manager has to redo T2, T3, T4, and T5.

Page 52: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 69

Recovery facilitiesDBMS should provide following facilities to

assist with recovery:

Backup mechanism, which makes periodic backup copies of database.

Logging facilities, which keep track of current state of transactions and database changes.

Checkpoint facility, which enables updates to database in progress to be made permanent.

Recovery manager, which allows DBMS to restore database to consistent state following a failure.

Page 53: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 70

Log fileContains information about all updates to

database:Transaction records.Checkpoint records.

Often used for other purposes (for example, auditing).

Page 54: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 71

Log fileTransaction records contain:

Transaction identifier.Type of log record, (transaction start, insert,

update, delete, abort, commit).Identifier of data item affected by database

action (insert, delete, and update operations).Before-image of data item.

After-image of data item.Log management information.

Page 55: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 72

Log file

Page 56: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 73

Log fileLog file may be duplexed or triplexed.Log file sometimes split into two separate

random-access files.Potential bottleneck; critical in determining

overall performance.

Page 57: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 74

CheckpointingCheckpoint

Point of synchronization between database and log file. All buffers are force-written to secondary storage.

Checkpoint record is created containing identifiers of all active transactions.

When failure occurs, redo all transactions that committed since the checkpoint and undo all transactions active at time of crash.

Page 58: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 75

CheckpointingIn previous example, with checkpoint at time

tc, changes made by T2 and T3 have been written to secondary storage.

Thus:only redo T4 and T5,undo transactions T1 and T6.

Page 59: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 76

Recovery techniquesIf database has been damaged:

Need to restore last backup copy of database and reapply updates of committed transactions using log file.

If database is only inconsistent:Need to undo changes that caused

inconsistency. May also need to redo some transactions to ensure updates reach secondary storage.

Do not need backup, but can restore database using before- and after-images in the log file.

Page 60: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 77

Immediate updateUpdates are applied to database as they

occur.Need to redo updates of committed

transactions following a failure.May need to undo effects of transactions that

had not committed at time of failure.Essential that log records are written before

write to database. Write-ahead log protocol.

Page 61: CSE 3330 Database Concepts Dr. Andy Nagar Spring 2014.

©Pearson Education 2009 78

Immediate updateIf no “transaction commit” record in log,

then that transaction was active at failure and must be undone.

Undo operations are performed in reverse order in which they were written to log.