Transactions

CPSC 356 Database

Ellen Walker

Hiram College

(Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)

Transaction

• A logical unit of work for a database
  – Example: “move $5 from savings to checking”
  – { withdraw 5 from savings; add 5 to checking }

• Concurrent transactions must be prevented from interfering with each other

• Transactions must be “all or nothing”
  – If the transaction doesn’t complete, no changes should be made at all!
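A minimal sketch of the "all or nothing" transfer, using Python's built-in sqlite3 module. The table name `account`, the helper `transfer`, and the starting balances are assumptions made for illustration; they are not from the slides.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
            (balance,) = conn.execute("SELECT balance FROM account WHERE name = ?",
                                      (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # triggers the rollback
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("savings", 10), ("checking", 0)])
conn.commit()

ok = transfer(conn, "savings", "checking", 5)       # completes and commits
after_ok = balances(conn)
failed = transfer(conn, "savings", "checking", 50)  # fails halfway, rolled back
after_failed = balances(conn)
```

The second transfer withdraws from savings before it discovers the problem, yet the rollback leaves both balances exactly as they were after the first transfer.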

States of a Transaction

• Successful transactions are committed
• Failed partial transactions are rolled back
• Committed transactions cannot be undone. We must create a new (compensating) transaction to fix the database.

ACID Properties of a Transaction (Härder and Reuter, 1983)

• Atomicity — a transaction is either performed in its entirety or not at all

• Consistency — a transaction must take the database from one consistent state to another

• Isolation (Serializable) — if two transactions run at the same time, the result must look as if they ran sequentially in some arbitrary order; a transaction’s updates must not be visible to other transactions until it commits

• Durability — once a transaction commits, its result is permanent (must never be lost)

Concurrency Control

• Two or more transactions proceed concurrently, while preserving serializability (isolation)

• Transactions cannot interfere with each other
  – Lost update problem
  – Dirty read problem
  – Inconsistent analysis problem

Lost Update Problem

– Account A = $100, B = $200, C = $300
  • Transaction T transfers $4 from A to B
  • Transaction U transfers $3 from C to B
  • Should end A = $96, B = $207, C = $297
– U’s update of B is lost:

Transaction T                 Transaction U
bal=read(A)      $100
write(A,bal–4)   $96
                              bal=read(C)      $300
                              write(C,bal–3)   $297
bal=read(B)      $200
                              bal=read(B)      $200
                              write(B,bal+3)   $203
write(B,bal+4)   $204
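The interleaving above can be replayed in a few lines of Python. This is only a sketch: the `run` helper and the tuple encoding `(transaction, operation, item, delta)` are assumptions made for illustration, with each transaction keeping its own private `bal`.

```python
def run(schedule, db):
    """Execute (txn, op, item, delta) steps in order against a copy of db."""
    db = dict(db)
    bal = {}  # each transaction's private `bal` variable
    for txn, op, item, delta in schedule:
        if op == "read":
            bal[txn] = db[item]
        else:  # "write": store the transaction's bal plus delta
            db[item] = bal[txn] + delta
    return db

start = {"A": 100, "B": 200, "C": 300}
T_A = [("T", "read", "A", 0), ("T", "write", "A", -4)]
T_B = [("T", "read", "B", 0), ("T", "write", "B", +4)]
U_C = [("U", "read", "C", 0), ("U", "write", "C", -3)]
U_B = [("U", "read", "B", 0), ("U", "write", "B", +3)]

# Serial schedule: all of T, then all of U.
serial = run(T_A + T_B + U_C + U_B, start)

# Interleaved schedule from the slide: both read B before either writes it.
interleaved = run(T_A + U_C + [T_B[0], U_B[0], U_B[1], T_B[1]], start)
```

Here `serial` ends with B = 207, while `interleaved` ends with B = 204 because T's final write overwrites U's deposit.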

Dirty Read Problem

• Account A = $200, B = $200
  – Transaction T transfers $100 from A to B but fails!
  – Transaction U deposits $25 to A
  – Should end A = $225, B = $200
• Problem:
  – Transaction U read “dirty value” of A after $100 was taken…

Transaction T                 Transaction U
bal=read(A)        $200
write(A,bal–100)   $100
                              bal=read(A)        $100
                              write(A,bal+25)    $125
bal=read(B)        $200
…ROLLBACK!
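A small sketch of the same schedule in Python. It assumes, for illustration only, that rollback works by restoring a saved before-image of every item T wrote; the slide does not say how the rollback is carried out.

```python
db = {"A": 200, "B": 200}
undo_T = {}                       # before-images of the items T has written

t_bal = db["A"]                   # T: bal = read(A)          -> 200
undo_T.setdefault("A", db["A"])   # save A's old value before T overwrites it
db["A"] = t_bal - 100             # T: write(A, bal - 100)    -> 100 (uncommitted)

u_bal = db["A"]                   # U: bal = read(A)          -> 100 (dirty value)
db["A"] = u_bal + 25              # U: write(A, bal + 25)     -> 125

t_bal = db["B"]                   # T: bal = read(B)          -> 200, then T fails
after_u = dict(db)                # state just before the rollback

for item, old in undo_T.items():  # T: ROLLBACK restores its before-images
    db[item] = old
```

Under this assumption A is $125 before the rollback and $200 after it, so U's deposit is either based on a value that never officially existed or wiped out entirely. Neither outcome is the expected $225.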

Nonrepeatable Read Problem

• Similar to dirty read, but here one transaction reads the same item twice and gets different values, because another transaction updated the item between the two reads

Transaction T                 Transaction U
                              read(A)            $1000
sal=read(A)        $1000
                              (unrelated actions)
write(A,sal*1.1)   $1100
                              sal=read(A)        $1100

Inconsistent Analysis Problem

– Situation:
  • Transaction T gives everyone a 10% raise
  • Transaction U computes the average salary
– Problem:
  • Some salaries have been raised, some not when average is computed (avg should be 1500 or 1650)

Transaction T                 Transaction U
sal=read(A)        $1000
write(A,sal*1.1)   $1100
                              bal=read(A)        $1100
                              bal+=read(B)       $3100
sal=read(B)        $2000
write(B,sal*1.1)   $2200
                              avg = bal/2        $1550
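The three averages can be checked with a short calculation. This sketch uses integer arithmetic for the 10% raise (an illustrative choice, to avoid floating-point rounding), and the variable names are assumptions.

```python
salaries = {"A": 1000, "B": 2000}
raised = {name: pay * 11 // 10 for name, pay in salaries.items()}  # 10% raise

avg_before = sum(salaries.values()) / 2        # U runs entirely before T
avg_after = sum(raised.values()) / 2           # U runs entirely after T
avg_mixed = (raised["A"] + salaries["B"]) / 2  # U sees new A but old B
```

Only `avg_before` (1500) and `avg_after` (1650) correspond to a serial order; `avg_mixed` (1550) matches neither.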

Interleaving Causes Problems

• We need a concurrency control mechanism
  – Allow as much concurrency among transactions as possible (throughput)
  – Prevent other transactions from viewing intermediate values (not yet committed)

Definitions for Scheduling

• Schedule
  – A sequence of operations by a set of concurrent transactions that preserves order of operations within each transaction
• Serial Schedule
  – A schedule without any interleaving
• Nonserial Schedule
  – A schedule where operations from different transactions are interleaved

Conflict Serializability

• A serializable schedule has the same result as some serial schedule

• Recognize conflicts between transactions
  – Both transactions access the same variable
  – At least one of those accesses is a write

• When all conflicts happen in the same order (T before U or U before T), then the schedule is serializable; otherwise not.

Serializability Testing

• Draw a downward (forward in time) arrow for each conflict (when one transaction is writing). If all arrows point the same way, then the schedule is serializable

Transaction T                 Transaction U
bal=read(A)
write(A,bal–4)
                              bal=read(C)
                              write(C,bal–3)
bal=read(B)
write(B,bal+4)
                              bal=read(B)
                              write(B,bal+3)

Serializability Testing (cont.)

• If at least one arrow is pointing leftward and another arrow is pointing rightward, the schedule is not serializable

Transaction T                 Transaction U
bal=read(A)
write(A,bal–4)
                              bal=read(C)
                              write(C,bal–3)
                              bal=read(B)
bal=read(B)
write(B,bal+4)
                              write(B,bal+3)

Generalizing Serializability

• With more than two transactions, build a precedence graph (also called a conflict or serialization graph)
  – Each transaction is a node of the graph
  – For each conflict, draw an arc from the earlier transaction to the later transaction.

• If this graph has a cycle, then the schedule is not conflict serializable; if it has no cycle, the schedule is serializable
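The graph test can be sketched as follows. The encoding of a schedule as `(transaction, "r" or "w", item)` tuples and the function names are assumptions for illustration; the two sample schedules are the ones from the Serializability Testing slides.

```python
from itertools import combinations

def precedence_graph(schedule):
    """Return the set of (earlier_txn, later_txn) arcs, one per conflict."""
    arcs = set()
    # combinations() keeps schedule order, so the first op of each pair is earlier.
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "w" in (op1, op2):
            arcs.add((t1, t2))
    return arcs

def has_cycle(arcs):
    """Depth-first search for a cycle in the directed graph given by arcs."""
    graph = {}
    for a, b in arcs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:      # reached a node already on the current path
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))

good = [("T", "r", "A"), ("T", "w", "A"), ("U", "r", "C"), ("U", "w", "C"),
        ("T", "r", "B"), ("T", "w", "B"), ("U", "r", "B"), ("U", "w", "B")]
bad = [("T", "r", "A"), ("T", "w", "A"), ("U", "r", "C"), ("U", "w", "C"),
       ("U", "r", "B"), ("T", "r", "B"), ("T", "w", "B"), ("U", "w", "B")]
```

For `good`, every conflict on B runs from T to U, so the graph has the single arc (T, U). For `bad`, U's read of B precedes T's write while T's write precedes U's write, giving arcs in both directions and therefore a cycle.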

Serializability Testing vs. Enforcement

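• To test serializability, you have to create the graph and check for cycles
  – The cycle check itself is fast, but it needs the complete schedule, which is known only after the transactions have run
  – Testing the more general view serializability cannot be done efficiently (it is NP-complete, a result from the study of algorithms)

• Instead, let’s create extra constraints (locking) to enforce serializability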

Locking Algorithms

• Locking is a method of controlling concurrency using a lock (variable) to deny transactions access to certain objects

• Types of locking
  – Static locking
  – 2 Phase Locking

• Other algorithms (we won’t cover)
  – Optimistic concurrency control
  – Timestamp ordering

Using Locks

• Transaction must lock the data object before accessing it

• Transaction should unlock the data object when done

• If an item is locked, the transaction must wait until it is unlocked

• Example transaction:
  – Lock B; read B; … write B; unlock B; commit.
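As a rough analogy, the lock / read / write / unlock pattern looks like this with an in-memory mutex from Python's threading module. This is not how a DBMS lock manager is implemented; the `deposit` function and the repeat count are assumptions for illustration.

```python
import threading

balance = {"B": 200}
lock_B = threading.Lock()

def deposit(amount, times):
    for _ in range(times):
        with lock_B:                     # lock B (waits if another thread holds it)
            bal = balance["B"]           # read B
            balance["B"] = bal + amount  # write B
        # leaving the with-block unlocks B

t = threading.Thread(target=deposit, args=(4, 1000))  # plays the role of T
u = threading.Thread(target=deposit, args=(3, 1000))  # plays the role of U
t.start()
u.start()
t.join()
u.join()
```

Because each read-then-write pair runs while holding the lock, no update can be lost and B always ends at 200 + 4000 + 3000 = 7200.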

Types of Locks

• Shared lock
  – Transaction can read item only (read lock)

• Exclusive lock
  – Transaction can read and update item (write lock)

• Shared lock can be upgraded to exclusive lock.

• Exclusive lock can be downgraded to shared lock.
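The usual compatibility rule for these two lock types (many readers, or one writer) can be written as a small table. The names `COMPATIBLE` and `can_grant` are hypothetical, and "S" and "X" stand for shared and exclusive.

```python
# (lock already held by another transaction, lock being requested) -> compatible?
COMPATIBLE = {
    ("S", "S"): True,    # several transactions may read the same item
    ("S", "X"): False,   # cannot write while others are reading
    ("X", "S"): False,   # cannot read while another is writing
    ("X", "X"): False,   # only one writer at a time
}

def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every lock other transactions hold."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)
```

Under this rule an upgrade from shared to exclusive succeeds only when no other transaction holds any lock on the item, that is, when `can_grant("X", locks_held_by_others)` is true.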

Locking Protocols

• Locking alone doesn’t guarantee serializability
  – If an object is unlocked and locked again within a transaction, another transaction can “jump in” between

• Locking protocols prevent this
  – Static locking
  – 2 Phase locking

Static Locking

• Transaction locks all the data items before using any of them.
  – Usually the first operation in the transaction

• Transaction releases all locks at once when it’s done with the data
  – Usually at the end of the transaction

• This method limits concurrency but guarantees serializability

• Transaction must know in advance which objects it will use

2 Phase Locking

• Constraint: A transaction cannot request a lock on any data item after it has unlocked any data item.

• To maintain the constraint, use 2 phases:
  – Growing phase: transaction requests locks, but doesn’t release any locks (upgrades allowed)
    • The point at which a transaction holds locks on all the needed data objects is called the lock point
  – Shrinking phase: transaction releases locks, but doesn’t request any more locks (downgrades allowed)
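The constraint is easy to check for one transaction's sequence of lock operations. A minimal sketch, assuming the operations are given as `("lock" | "unlock", item)` pairs; the function name is hypothetical, and upgrades and downgrades are left out.

```python
def obeys_two_phase_locking(ops):
    """ops: the ("lock" | "unlock", item) requests of a single transaction, in order."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True          # the first unlock ends the growing phase
        elif action == "lock" and shrinking:
            return False              # lock requested after an unlock: not 2PL
    return True

two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
```

In `two_phase` the lock point is reached after `lock B`. In `not_two_phase` the transaction releases A before locking B, which is exactly the gap that lets another transaction "jump in".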

2-Phase Locking can cause Cascading Rollback

• With 2PL, after the transaction has released some of its locks, yet before it has committed the transaction, those intermediate results become visible

• When a transaction is rolled back, all modified data objects are restored

• What if another transaction reads those intermediate results, and this transaction later aborts?
  – All transactions that have read these data objects must also be rolled back (even if they’ve already completed!). This is called cascading rollback.

Rigorous & Strict 2 Phase Locking

• Rigorous 2PL
  – A transaction holds all its locks until it completes, when it commits (or aborts) and releases all of its locks in a single atomic action

• Strict 2PL
  – A transaction holds all its exclusive locks until it completes, when it commits (or aborts) and releases them in a single atomic action; shared locks may be released earlier

Deadlock

• Deadlock occurs when 2 or more transactions are each waiting for locks on items held by other waiting transactions (circular wait)

• Example: Dining Philosophers
  – 5 philosophers, 5 forks
  – To eat, you need both left and right forks
  – If each philosopher picks up a left fork and waits for a right fork to become available, deadlock!

2 Phase Locking can lead to Deadlock

• A transaction can request a lock on a data object while holding locks on other data objects, so a circular wait can result

• Resolved (after detecting deadlock) by:
  – Abort a deadlocked transaction, restore all the data objects it modified, release all its locks, and withdraw all its pending lock requests

Deadlock Detection

• Deadlock detection
  – Wait-for Graph (WFG)
    • If transaction T is waiting for a lock that transaction U holds, there is an arrow from T to U in the WFG
  – Lock manager is responsible for detection
    • It looks for cycles in its Wait-for Graph
    • If it finds a cycle, it must select and abort a transaction (the deadlock victim)
    • Choose victim based on age, number of changes already made, number of changes still to be made
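A sketch of detection and victim selection on a wait-for graph. The dictionary encoding, the function name `find_cycle`, and the "abort the youngest" policy are assumptions for illustration; the slide lists age as only one of several possible criteria. The simple path-based search is fine for small graphs but is not tuned for large ones.

```python
def find_cycle(wait_for):
    """wait_for maps each transaction to the set of transactions it is waiting on.
    Returns the list of transactions on one cycle, or None if there is no deadlock."""
    def dfs(node, path):
        if node in path:                    # came back to a node on the current path
            return path[path.index(node):]
        for nxt in wait_for.get(node, ()):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

# T waits for U, U waits for V, V waits for T (a cycle); W waits for T (not in it).
wait_for = {"T": {"U"}, "U": {"V"}, "V": {"T"}, "W": {"T"}}
start_time = {"T": 1, "U": 2, "V": 3, "W": 4}   # hypothetical timestamps

cycle = find_cycle(wait_for)
victim = max(cycle, key=start_time.get)         # assumed policy: abort the youngest

# Aborting the victim removes its node and every edge pointing at it.
remaining = {t: {u for u in waits if u != victim}
             for t, waits in wait_for.items() if t != victim}
```

W is blocked but is not part of the cycle, so it is never considered as a victim; once V is aborted the remaining graph has no cycle and the other transactions can proceed.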

Deadlock Prevention (Lock methods)

• Lock all items when transaction starts (static locking)
• Request locks in predefined order
  – May cause premature locking, which reduces concurrency

• Lock timeouts (enables preemption)
  – Each lock is invulnerable for a limited period, and vulnerable afterwards
  – If a transaction wants to access a data object protected by a vulnerable lock, the lock is broken and the transaction holding it is aborted

Deadlock Prevention (Timestamp)

– Transaction timestamps
  • Each transaction is assigned a unique timestamp when it starts
  • If a transaction needs to access a data object that is locked by another transaction, the timestamps of the two transactions are compared
    – Older transactions (smaller timestamps) generally have priority
    – Wait-for edges are allowed in only one direction (for example, only from older to younger), which prevents cycles

Eliminating Deadlock with Timestamps

• Wait-die: (aborts one)
  – If older transaction wants something held by younger transaction, it waits
  – If younger transaction wants something held by older transaction, it must die (the younger transaction is aborted)

• Wound-wait: (preempts resource)
  – If older transaction wants something held by younger transaction, it preempts it (the younger transaction is aborted)
  – If younger transaction wants something held by older transaction, it waits
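The two rules reduce to a timestamp comparison. A minimal sketch, where smaller timestamps mean older transactions; the function names and the returned strings are assumptions for illustration.

```python
def wait_die(requester_ts, holder_ts):
    """Older requester waits; younger requester dies (is aborted)."""
    return "wait" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts, holder_ts):
    """Older requester wounds (aborts) the younger holder; younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"
```

In both schemes the transaction that gets aborted is the younger one. Waiting is only ever allowed in one direction (older waits for younger under wait-die, younger waits for older under wound-wait), so the wait-for graph cannot contain a cycle.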

Locking in a Real DBMS

• Granularity
  – Lock by tuple -- possible “phantom”
  – Lock by table -- limits concurrency

• Isolation levels: (increasing order)
  – READ UNCOMMITTED (dirty reads)
  – READ COMMITTED (no dirty reads)
  – REPEATABLE READ (no nonrepeatable reads)
  – SERIALIZABLE (no phantoms)
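The list above can be read as a table of which anomalies each level still permits. The sketch below follows the SQL standard's definitions; a particular DBMS may be stricter than the standard at a given level. The names `PERMITS` and `weakest_level_preventing` are hypothetical.

```python
LEVELS = ["READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE"]

# Anomalies that each level still allows, per the SQL standard's definitions.
PERMITS = {
    "READ UNCOMMITTED": {"dirty read", "nonrepeatable read", "phantom"},
    "READ COMMITTED": {"nonrepeatable read", "phantom"},
    "REPEATABLE READ": {"phantom"},
    "SERIALIZABLE": set(),
}

def weakest_level_preventing(anomalies):
    """Return the lowest isolation level that permits none of the given anomalies."""
    for level in LEVELS:
        if not PERMITS[level] & set(anomalies):
            return level
```

For example, a report that must not see uncommitted data but can tolerate values changing between reads only needs READ COMMITTED, while a transaction that must not see phantoms needs SERIALIZABLE.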