1 CSE 480: Database Systems Lecture 24: Concurrency Control.

33
1 CSE 480: Database Systems Lecture 24: Concurrency Control

Transcript of 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

Page 1: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

1

CSE 480: Database Systems

Lecture 24: Concurrency Control

Page 2: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

2

Concurrency Control

Reasons for concurrency control– Although serial execution of a set of transactions may be correct,

concurrent (interleaved) transactions may be incorrect Lost Update Problem Temporary Update (or Dirty Read) Problem Incorrect Summary Problem Nonrepeatable read

– Database recovery from transaction failure or system crash becomes more complicated if you don’t control the read/write operations performed by the concurrent transactions

Page 3: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

3

Lost update problem

Lost Update Problem– When two transactions that update the same database items

have their operations interleaved in a way that makes the value of some database item incorrect

Page 4: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

4

Temporary Update (Dirty Read) problem

Temporary Update (or Dirty Read) Problem – When one transaction updates a database item and then the

transaction fails for some reason

– The updated item could be accessed by another transaction

Page 5: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

5

Incorrect Summary Problem

Incorrect Summary Problem– If a transaction is calculating an aggregate function while others

are updating some of these records, the aggregate function may calculate some values before they are updated and others after they are updated

Page 6: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

6

Nonrepeatable Read Problem

Nonrepeatable Read Problem– If a transaction reads the same data item twice and the item is

changed by another transaction between the two reads

Value of X has changed

Page 7: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

7

Transaction Schedule

To analyze the problems with concurrent transactions, we need to examine their transaction schedule

– A transaction schedule is an ordering of database operations from various concurrently executing transactions

Sa: r1(X) r2(X) w1(X) r1(Y) w2(X) w1(Y) Sb: r1(X) w1(X) r2(X) w2(X) r1(Y)

Page 8: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

8

Why Study Transaction Schedule?

Characteristics of a transaction schedule will determine– whether it is “easy” to recover from transaction failures

(see slide 10)

– whether concurrent execution of transactions is “correct” (see slide 11)

Page 9: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

9

Characterizing Schedules based on Recoverability

time

T1

T2

T3

T4

T5

T6

c1

c2

c3

c4

a6

What to do when transaction T6 aborts?- Do we need to rollback transactions that have already committed (e.g., T1..T4)- Do we need to rollback other uncommitted transactions beside T6?

x x

x

x

x

x

x

x

xx

x x

x

x x

xx

x

x

x

x

x

xx

x

x

x

x

x

x

x

x

x xxx

x x

Page 10: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

10

Characterizing Schedules based on Serializability

Is the effect of executing transactions in the order shown in schedule D equivalent to executing transactions in the order shown in schedule A or B?

Page 11: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

11

Example

Consider the following transaction schedule

r1(X) w1(X) r2(X) r1(Y) r2(Y) c2 w1(Y) a1

– If T1 aborts, do we need to rollback the committed transaction T2?

– Answer: yes

– Schedule is “non-recoverable” This type of schedule makes the recovery process more cumbersome

because we have to rollback transactions that have committed

Page 12: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

12

Recoverable Schedule

A schedule where no committed transactions need to be rolled back

A transaction T must not commit until all transactions T’ that have written an item that T reads have committed

Examples:– r1(X) w1(X) r2(X) r1(Y) w2(X) c2 a1

Nonrecoverable (T2 must be rolled back when T1 aborts)

– r1(X) r2(X) w1(X) r1(Y) w2(X) c2 w1(Y) a1

Recoverable (T2 does not have to be rolled back when T1 aborts)

– r2(X) w2(X) r1(X) r1(Y) w1(X) c2 w1(Y) a1

Recoverable (T2 does not have to be rolled back when T1 aborts)

Page 13: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

13

Is Recoverable Schedule Sufficient?

Example:

r1(X) w1(X) r2(X) w2(X) a1

– Recoverable because T2 has not committed before T1

– But the uncommitted transaction T2 must still be aborted when T1 aborts (cascading rollback)

Cascadeless schedule– A schedule with no cascading rollback, i.e., if a transaction T is

aborted, we only need to rollback T and no other transactions

– How do we ensure this?

Page 14: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

14

Cascadeless Schedules

Every transaction in the schedule reads only items that were written by committed transactions

Examples:– r1(X) w1(X) r2(X) r3(X) w2(X) c2 a1

Must rollback T2 and T3 (Not recoverable, not cascadeless)

– r1(X) r2(X) w1(X) r3(X) w2(X) c2 a1

Must rollback T3 only (Recoverable, not cascadeless)

– r1(X) r2(X) r3(X) w1(X) c2 a1

No need to rollback T2 nor T3 (Recoverable, cascadeless)

Cascadeless schedules are recoverable and avoid cascading rollback

Page 15: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

15

Recovery using System Log

From previous lecture, if the database system crashes, we can recover to a consistent database state by examining the log

Example of entries in a log record (T: transaction ID)1. [start_transaction,T4]

2. [read_item,T4.X]

3. [write_item,T4.X,4,11] (before image = 4, after image = 11)

4. [abort,T4]

– During recovery, we may undo the change in T4.X by using its “before image” (i.e., replace new value 11 with old value 4)

Page 16: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

16

Strict Schedules

But with concurrency, it is not always possible to restore the database to its original state after abort using the before image of data item

Example:

r2(X) r1(X) w1(X) w2(X) a1

– Is the transaction recoverable?

– Is the transaction cascadeless?

– If the original value for X is 5, T1 modifies X to 10 and T2 modifies it to 8. After we undo the changes of T1, will X returned to a correct value?

Page 17: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

17

Strict Schedules

A schedule in which we can restore the database to a consistent state after abort using the before image of data item

A schedule in which a transaction can neither read nor write an item X until the last transaction that wrote X has committed or aborted

Example:

r2(X) r1(X) w1(X) w2(X) a1

– Schedule is cascadeless but not strict

Page 18: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

18

Characterizing Schedules based on Recoverability

Summary– Recoverable schedules: no need to rollback committed

transactions

– Cascadeless schedules: no cascading rollback (rollback only the aborted transaction)

– Strict schedules: undo changes by aborted transaction by applying the before image of affected data items

Cascadeless schedules are recoverable Strict schedules are cascadeless and recoverable

More stringent condition means easier to do recovery from failure but less concurrency

Page 19: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

19

Serial Schedules

A schedule S is serial if all operations in transactions are executed consecutively in the schedule

– Otherwise, it is called nonserial schedule

Serial: r1(X) w1(X) r1(Y) w1(Y) r2(X) w2(X) Nonserial: r1(X) w1(X) r2(X) w2(X) r1(Y) w1(Y)

Page 20: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

20

Serial Schedules

Every serial schedule is correct, i.e., leads to a consistent database state

Schedule A Schedule Bread_item(X) read_item(X)

X = X + 3 X = X – 2Y

write_item(X) write_item(X)

read_item(X) read_item(X)

X = X – 2Y X = X + 3

write_item(X) write_item(X)

– But executions of serial schedules are highly inefficient (because there is no concurrency)

Page 21: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

21

Serializable Schedules

A schedule S is serializable if its execution is equivalent to some serial schedule of the same transactions

– Otherwise, S is non-serializable

We consider a special type of serializable schedule called conflict serializable schedule

Page 22: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

22

Conflict Serializable

A schedule S is said to be conflict serializable if it is conflict equivalent to some serial schedule of the same transactions

– Two schedules are said to be conflict equivalent if the order of any two conflicting operations is the same in both schedules

– Two operations in a transaction schedule are in conflict if They belong to different transactions They access the same data item At least one of them is a write operation

If a schedule is conflict serializable, we can reorder the nonconflicting operations until we form an equivalent serial schedule

Page 23: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

23

Testing for Conflict Serializability

Construct a precedence graph (serialization graph) where – Nodes are the transactions

– A directed edge is created from Ti to Tj if one of the operations in Ti appears before a conflicting operation in Tj

Create edge Ti Tj if schedule contains wi(X) rj(X)

Create edge Ti Tj if schedule contains ri(X) wj(X)

Create edge Ti Tj if schedule contains wi(X) wj(X)

A schedule is conflict serializable if and only if the precedence graph has no cycles.

Page 24: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

24

Example

Page 25: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

25

Example

This schedule is non-serializable because precedence graph has a cycle

Page 26: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

26

Example

This schedule is conflict serializable because precedence graph has no cycle

Page 27: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

27

Equivalent Serial Schedule

If a schedule S is conflict serializable, we can create an equivalent serial schedule S’ as follows:– Whenever an edge exists in the precedence graph from Ti to Tj,

Ti must appear before Tj in the equivalent serial schedule

– Schedule A is the equivalent serial schedule for schedule D

Precedence graph for schedule D

Page 28: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

28

Example

Is it conflict serializable?

What is the equivalent serial schedule?

Page 29: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

29

Example

Is it conflict serializable?

What is the equivalent serial schedule?

Page 30: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

30

Characterizing Schedules based on Serializability

Summary– Serial schedule is inefficient (no parallelism)

– Serializable schedule gives benefits of concurrent executions without giving up correctness

Concurrency control subsystem of DBMS must use certain protocol to ensure serializability of all schedules in which the transactions participate

– 2-Phase locking protocol (Chapter 22)

– May cause deadlocks

– DBMS will automatically abort one of the transactions, releasing the locks for other transactions to continue

Page 31: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

31

MySQL Example

Client 1: Client 2:

Consider two concurrent transactions

Page 32: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

32

MySQL Example (Deadlock)

Client 1: Client 2:

Page 33: 1 CSE 480: Database Systems Lecture 24: Concurrency Control.

33

MySQL Example (Deadlock)

Client 1: Client 2:

Client 1 will be kept busy waiting

Deadlock detected by concurrency control module of DBMS;Transaction for client 2 is aborted, allowing transaction for client 1 to continue