CS4432: Database Systems II Transaction Management Motivation 1.


Transcript of CS4432: Database Systems II Transaction Management Motivation 1.

Page 1: CS4432: Database Systems II Transaction Management Motivation 1.

CS4432: Database Systems II

Transaction Management: Motivation

1

Page 2: CS4432: Database Systems II Transaction Management Motivation 1.

2

DBMS Backend Components

Our next focus

Page 3: CS4432: Database Systems II Transaction Management Motivation 1.

3

Transactions

• A transaction = sequence of operations that either all succeed, or all fail

• Basic unit of processing in DBMS

• Transactions have the ACID properties:
  A = atomicity
  C = consistency
  I = isolation (independence)
  D = durability

Page 4: CS4432: Database Systems II Transaction Management Motivation 1.

Goal: The ACID properties

• Atomicity: All actions in the transaction happen, or none happen.

• Consistency: If each transaction is consistent, and the DB starts consistent, it ends up consistent.

• Isolation: Execution of one transaction is isolated from that of all others.

• Durability: If a transaction commits, its effects persist.

4

Page 5: CS4432: Database Systems II Transaction Management Motivation 1.

5

Integrity & Consistency of Data

• Data in the DB should always be correct and consistent

Name     Age
White    52
Green    3421
Gray     1

Is this data correct (consistent)?

How does the DBMS decide if data is consistent?

Page 6: CS4432: Database Systems II Transaction Management Motivation 1.

Integrity & Consistency Constraints

• Define predicates and constraints that the data must satisfy

• Examples:
  - x is key of relation R
  - x → y holds in R
  - Domain(x) = {Red, Blue, Green}
  - No employee should make more than twice the average salary

• Schema-level constraints: Add Constraint command

• Business constraints: use of triggers

Defining constraints was covered in CS3431.

Page 7: CS4432: Database Systems II Transaction Management Motivation 1.

7

FACT: DBMS is Not Consistent All the Time

Example: a1 + a2 + … + an = TOT (constraint)

Deposit $100 in a2:  a2 ← a2 + 100
                     TOT ← TOT + 100

        Initial state    Intermediate state    Final state
a2      50               150                   150
TOT     1000             1000                  1100
                         (not consistent)

A transaction hides intermediate states (even under failure)
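The deposit example can be traced with a minimal Python sketch. The function names and the balance of `a1` are hypothetical, chosen only so that the balances sum to the slide's TOT of 1000; the point is that the constraint fails between the two updates and holds again once both are done.

```python
def check(accounts, tot):
    """Constraint from the slide: the account balances must sum to TOT."""
    return sum(accounts.values()) == tot


def deposit_states(accounts, tot, name, amount):
    """Yield (state label, constraint holds?) at each point of the deposit."""
    yield "initial", check(accounts, tot)
    accounts[name] += amount           # a2 <- a2 + 100
    yield "intermediate", check(accounts, tot)
    tot += amount                      # TOT <- TOT + 100
    yield "final", check(accounts, tot)


# Hypothetical balances: only a2 = 50 and TOT = 1000 come from the slide.
accounts = {"a1": 950, "a2": 50}
states = list(deposit_states(accounts, 1000, "a2", 100))
print(states)
```

Only the intermediate state violates the constraint, which is the state a transaction is supposed to hide from everyone else.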

Page 8: CS4432: Database Systems II Transaction Management Motivation 1.

8

Concept of Transactions

Transaction: a collection of actions that preserve consistency

Consistent DB → T → Consistent DB’

Main Assumption: If T starts with a consistent state AND T executes in isolation, THEN T leaves a consistent state

Page 9: CS4432: Database Systems II Transaction Management Motivation 1.

9

How Can Constraints Be Violated?

• Transaction Bug
  – The semantics of the transaction is wrong
  – E.g., update a2 and not TOT
  – The DBMS can easily detect and prevent that (if constraints are defined)

• DBMS Bug
  – The DBMS fails to detect inconsistent states
  – You should not use this DBMS

• Hardware Failure
  – Disk crash, memory failure, …

• Concurrent Access
  – Many transactions accessing the data at the same time
  – E.g., T1: give 10% raise to programmers
          T2: change programmers → systems analysts

Hardware failures and concurrent access are our focus, and handling them is a major component of a DBMS.

Page 10: CS4432: Database Systems II Transaction Management Motivation 1.

10

How Can We Prevent/Fix Violations?

• Chapter 17: Due to failures only

• Chapter 18: Due to concurrent access only

• Chapter 19: Due to failures and concurrent access

Page 11: CS4432: Database Systems II Transaction Management Motivation 1.

Plan of Attack (ACID properties)

• First we will deal with “I”, by focusing on concurrency control.

• Then we will address “A” and “D” by looking at recovery.

• What about “C”?
  – Well, if you have the other three working, and you set up your integrity constraints correctly, then you get “C” for free

11

Page 12: CS4432: Database Systems II Transaction Management Motivation 1.

CS4432: Database Systems II

Transaction Management: Concurrency Control (Ch. 18)

12

Page 13: CS4432: Database Systems II Transaction Management Motivation 1.

13

Concurrent Transactions

T1, T2, T3, …, Tn → DB (consistency constraints)

• Many transactions access the data at the same time

• Some are reading, others are writing

• May conflict

Page 14: CS4432: Database Systems II Transaction Management Motivation 1.

14

Transactions: Example

T1: Read(A)          T2: Read(A)
    A ← A + 100          A ← A × 2
    Write(A)             Write(A)
    Read(B)              Read(B)
    B ← B + 100          B ← B × 2
    Write(B)             Write(B)

Constraint: A = B

• How to execute these two transactions?

• How to schedule the read/write operations?

Page 15: CS4432: Database Systems II Transaction Management Motivation 1.

15

A Schedule

An ordering of operations (reads/writes) inside one or more transactions over time

What is a correct outcome? That question leads to: what is a good schedule?

Page 16: CS4432: Database Systems II Transaction Management Motivation 1.

16

Schedule A

T1                    T2                    A      B
                                            25     25
Read(A); A ← A+100
Write(A);                                   125
Read(B); B ← B+100
Write(B);                                          125
                      Read(A); A ← A×2
                      Write(A);             250
                      Read(B); B ← B×2
                      Write(B);                    250
                                            250    250

Serial Schedule: T1, T2
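As a rough check of the serial case, here is a small Python sketch that runs the two transactions back to back. The function names are hypothetical; the starting values A = B = 25 are the ones shown for Schedule A, and T2 is assumed to double A and B.

```python
def t1(db):
    """T1 adds 100 to A and to B."""
    db["A"] += 100
    db["B"] += 100


def t2(db):
    """T2 doubles A and B."""
    db["A"] *= 2
    db["B"] *= 2


def run_serial(order, db):
    """Run whole transactions one after another on a copy of db."""
    db = dict(db)
    for txn in order:
        txn(db)
    return db


start = {"A": 25, "B": 25}
result_t1_t2 = run_serial([t1, t2], start)   # serial order T1, T2
result_t2_t1 = run_serial([t2, t1], start)   # serial order T2, T1
print(result_t1_t2, result_t2_t1)
```

The two serial orders give different final values, but each one preserves the constraint A = B, which is what makes any serial schedule acceptable.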

Page 17: CS4432: Database Systems II Transaction Management Motivation 1.

17

Schedule B

Serial Schedule: T2, T1

Page 18: CS4432: Database Systems II Transaction Management Motivation 1.

18

Serial Schedules !

• Definition: A schedule in which transactions are performed in a serial order (no interleaving)

• The Good: Consistency is guaranteed
  – Any serial schedule is “good”.

• The Bad: Throughput is low; we need to execute in parallel

Solution: interleave transactions in a schedule…

Page 19: CS4432: Database Systems II Transaction Management Motivation 1.

19

Schedule C

Schedule C is NOT serial, but it is Good

Page 20: CS4432: Database Systems II Transaction Management Motivation 1.

20

Schedule D

Schedule D is NOT serial, and it is Bad (not consistent)
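A small Python sketch can replay an interleaved schedule step by step. It assumes Schedule D is the interleaving written later in this transcript as SD = r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B), that T1 adds 100, that T2 doubles, and that A = B = 25 initially; the tuple encoding and function names are hypothetical.

```python
# What each transaction computes on a value it has read (assumed semantics).
UPDATE = {"T1": lambda v: v + 100, "T2": lambda v: v * 2}


def run(schedule, db):
    """Replay (txn, action, item) steps; computed values stay private until written."""
    db = dict(db)
    local = {}
    for txn, action, item in schedule:
        if action == "read":
            local[(txn, item)] = UPDATE[txn](db[item])   # read, then compute locally
        else:
            db[item] = local[(txn, item)]                # write the computed value
    return db


schedule_d = [
    ("T1", "read", "A"), ("T1", "write", "A"),
    ("T2", "read", "A"), ("T2", "write", "A"),
    ("T2", "read", "B"), ("T2", "write", "B"),
    ("T1", "read", "B"), ("T1", "write", "B"),
]
result_d = run(schedule_d, {"A": 25, "B": 25})
print(result_d, result_d["A"] == result_d["B"])
```

Under these assumptions A ends at 250 and B at 150, so the constraint A = B is broken even though each transaction on its own preserves it.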

Page 21: CS4432: Database Systems II Transaction Management Motivation 1.

21

Schedule E

Same as Schedule D but with a new T2’

Same schedule as D, but this one is Good (consistent)

Page 22: CS4432: Database Systems II Transaction Management Motivation 1.

22

What Is A ‘Good’ Schedule?

• Does not depend only on the sequence of operations
  – Schedules D and E have the same sequence
  – D produced inconsistent data
  – E produced consistent data
  – Transaction semantics played a role

• We want schedules that are guaranteed “good” regardless of:
  – The initial state and
  – The transaction semantics

• Hence we consider only:
  – The order of read/write operations
  – Any other computations are ignored (transaction semantics)

Example: Schedule S = r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)
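To work with this notation programmatically, one option is to parse a schedule string into a list of actions. This is only a sketch: the `Action` type and `parse` function are hypothetical helpers, not part of the course material.

```python
import re
from typing import NamedTuple


class Action(NamedTuple):
    op: str     # "r" for read, "w" for write
    txn: int    # transaction number, e.g. 1 for T1
    item: str   # database item, e.g. "A"


def parse(schedule: str) -> list[Action]:
    """Turn a string such as 'r1(A) w1(A)' into a list of Action tuples."""
    return [Action(op.lower(), int(txn), item)
            for op, txn, item in re.findall(r"([rwRW])(\d+)\((\w+)\)", schedule)]


S = parse("r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)")
print(S[:2])
```

Everything a transaction computes between its reads and writes is deliberately absent from this representation, which matches the decision to ignore transaction semantics.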

Page 23: CS4432: Database Systems II Transaction Management Motivation 1.

Example: Considering Only R/W Operations

23

Schedule S = r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)

Page 24: CS4432: Database Systems II Transaction Management Motivation 1.

24

Concept: Conflicting Actions

Conflicting actions: Two actions from two different transactions on the same object are conflicting iff at least one of them is a write

r1(A) w2(A): Transaction 1 reads A, Transaction 2 writes A (conflict)

w1(A) r2(A): Transaction 1 writes A, Transaction 2 reads A (conflict)

w1(A) w2(A): Transaction 1 writes A, Transaction 2 writes A (conflict)

r1(A) r2(A): Transaction 1 reads A, Transaction 2 reads A (no conflict)

Conflicting actions can cause anomalies, which is bad
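The definition translates almost directly into a predicate. In this sketch an action is a hypothetical `(op, txn, item)` tuple, with `op` being `"r"` or `"w"`.

```python
def conflicts(a, b):
    """True iff a and b come from different transactions, touch the same
    item, and at least one of them is a write."""
    (op1, txn1, item1), (op2, txn2, item2) = a, b
    return txn1 != txn2 and item1 == item2 and "w" in (op1, op2)


print(conflicts(("r", 1, "A"), ("w", 2, "A")))   # read/write on A by T1 and T2
print(conflicts(("r", 1, "A"), ("r", 2, "A")))   # two reads of A
```

Two actions of the same transaction, or actions on different items, never conflict under this definition.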

Page 25: CS4432: Database Systems II Transaction Management Motivation 1.

Anomalies with Interleaving

• Reading Uncommitted Data (WR conflicts, “dirty reads”):
  e.g., T1: A+100, B+100; T2: A*1.06, B*1.06

• Unrepeatable Reads (RW conflicts):
  e.g., T1: R(A), …, R(A), decrement; T2: R(A), decrement

• Overwriting Uncommitted Data (WW conflicts)

We need a schedule that is anomaly-free

Page 26: CS4432: Database Systems II Transaction Management Motivation 1.

Our Goal

• We need a schedule that is equivalent to some serial schedule
  – It should allow interleaving
  – Any serial order is good
  – It produces a consistent, anomaly-free result

Given schedule S: if we can shuffle the non-conflicting actions to reach a serial schedule L, then S is equivalent to L, so S is good

Page 27: CS4432: Database Systems II Transaction Management Motivation 1.

27

Example: Schedule C

Page 28: CS4432: Database Systems II Transaction Management Motivation 1.

28

Example: Schedule C

Sc = r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)

Sc’ = r1(A) w1(A) r1(B) w1(B) r2(A) w2(A) r2(B) w2(B)
      (all of T1, then all of T2)

r2(A) w2(A) and r1(B) w1(B) can be switched because they are not conflicting

Schedule C is equivalent to a serial schedule, so it is “Good”

Page 29: CS4432: Database Systems II Transaction Management Motivation 1.

29

Why Did Schedule C Turn Out to be Good? (Some Formalization)

Sc = r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)

On A: T1 → T2 (T1 precedes T2)
On B: T1 → T2 (T1 precedes T2)

No cycles, so Sc is “equivalent” to a serial schedule where T1 precedes T2.

Page 30: CS4432: Database Systems II Transaction Management Motivation 1.

30

Example: Schedule D

SD= r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)

• Can we shuffle non-conflicting actions to make T1 → T2 or T2 → T1?

Page 31: CS4432: Database Systems II Transaction Management Motivation 1.

31

Example: Schedule D

SD= r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)

• Can we make T1 first [T1 → T2]?
  – No… we cannot move r1(B) w1(B) forward
  – Why: r1(B) conflicts with w2(B), so it cannot move past it. The same holds for w1(B)

Page 32: CS4432: Database Systems II Transaction Management Motivation 1.

32

Example: Schedule D

SD= r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)

• Can we make T2 first [T2 → T1]?
  – No… we cannot move r2(A) w2(A) forward
  – Why: r2(A) conflicts with w1(A), so it cannot move past it. The same holds for w2(A)

Schedule D is NOT equivalent to a serial schedule, so it is “Bad”

Page 33: CS4432: Database Systems II Transaction Management Motivation 1.

33

Why Did Schedule D Turn Out to be Bad? (Some Formalization)

SD = r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)

On A: T1 → T2 (T1 precedes T2)
On B: T2 → T1 (T2 precedes T1)

A cycle exists, so SD is “not equivalent” to any serial schedule.

Page 34: CS4432: Database Systems II Transaction Management Motivation 1.

Recap

• Serial schedules are always “Good” (consistency + no anomalies)
  – But they limit the throughput

• Goal: find an interleaved schedule that is “equivalent to” a serial schedule

• Identify “conflicting actions”, and try to rearrange the non-conflicting ones to reach a serial schedule

• When formalized, this maps to dependency graphs and cycle testing

Next…

Page 35: CS4432: Database Systems II Transaction Management Motivation 1.

CS4432: Database Systems II

Transaction Management: Concurrency Control: Theory

35

Page 36: CS4432: Database Systems II Transaction Management Motivation 1.

Definitions

• Conflict Equivalent
  – S1, S2 are conflict equivalent schedules if S1 can be transformed into S2 by a series of swaps of non-conflicting actions.

• Conflict Serializable (Serializable for short)
  – A schedule S1 is conflict serializable if it is conflict equivalent to some serial schedule.

Schedule C is conflict serializable
Schedule D is not conflict serializable
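One way to test conflict equivalence without simulating swaps is to compare the relative order of every pair of conflicting actions: two schedules over the same actions can be swapped into each other exactly when those orders agree. The sketch below uses that characterization rather than the swap procedure itself; the helper names are hypothetical, and it assumes no action appears twice in a schedule.

```python
import re
from itertools import combinations


def parse(s):
    """'r1(A) w2(A)' -> [('r', 1, 'A'), ('w', 2, 'A')]"""
    return [(op, int(t), x) for op, t, x in re.findall(r"([rw])(\d+)\((\w+)\)", s)]


def conflict_pairs(schedule):
    """Ordered pairs (earlier, later) of conflicting actions in the schedule."""
    return {(a, b) for a, b in combinations(schedule, 2)
            if a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0])}


def conflict_equivalent(s1, s2):
    """Same actions, and every conflicting pair appears in the same order."""
    return sorted(s1) == sorted(s2) and conflict_pairs(s1) == conflict_pairs(s2)


sc = parse("r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)")
sd = parse("r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)")
t1_t2 = parse("r1(A) w1(A) r1(B) w1(B) r2(A) w2(A) r2(B) w2(B)")
t2_t1 = parse("r2(A) w2(A) r2(B) w2(B) r1(A) w1(A) r1(B) w1(B)")

c_ok = conflict_equivalent(sc, t1_t2)
d_ok = conflict_equivalent(sd, t1_t2) or conflict_equivalent(sd, t2_t1)
print(c_ok, d_ok)
```

With only two transactions there are just two serial orders to try; Schedule C matches the order T1, T2, while Schedule D matches neither.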

Page 37: CS4432: Database Systems II Transaction Management Motivation 1.

37

How to Determine This?

Answer: A Precedence Graph!

• If no cycles: the schedule is conflict serializable (Good)

• If cycles: the schedule is NOT conflict serializable (Bad)

Page 38: CS4432: Database Systems II Transaction Management Motivation 1.

38

Precedence Graph P(S) (S is a schedule)

Nodes: Transactions in S

Edges: Ti → Tj whenever the 3 conditions are met:
  - pi(A), qj(A) are actions in S (two actions, one from Ti and one from Tj)
  - pi(A) <S qj(A) (Ti’s action comes before Tj’s action)
  - at least one of pi, qj is a write (they are conflicting actions)

Page 39: CS4432: Database Systems II Transaction Management Motivation 1.

39

Precedence Graph

• Precedence graph for schedule S:
  – Nodes: Transactions in S
  – Edges: Ti → Tj whenever
    • S: … ri(X) … wj(X) …
    • S: … wi(X) … rj(X) …
    • S: … wi(X) … wj(X) …

Note: the two actions are not necessarily consecutive
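The edge rules can be turned into a short Python sketch that builds the edge set of P(S) from a schedule string. The function name and the representation of edges as `(i, j)` pairs meaning Ti → Tj are assumptions made for illustration.

```python
import re
from itertools import combinations


def precedence_graph(schedule: str) -> set[tuple[int, int]]:
    """Return the edges (i, j), meaning Ti -> Tj, of the precedence graph."""
    actions = re.findall(r"([rw])(\d+)\((\w+)\)", schedule)
    edges = set()
    # combinations() keeps schedule order, so the first action of each pair
    # occurs earlier in S; the pair need not be consecutive.
    for (op1, t1, x1), (op2, t2, x2) in combinations(actions, 2):
        if t1 != t2 and x1 == x2 and "w" in (op1, op2):
            edges.add((int(t1), int(t2)))
    return edges


edges_c = precedence_graph("r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B)")
edges_d = precedence_graph("r1(A) w1(A) r2(A) w2(A) r2(B) w2(B) r1(B) w1(B)")
print(edges_c, edges_d)
```

For Schedule C this yields the single edge T1 → T2, while Schedule D yields both T1 → T2 and T2 → T1.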

Page 40: CS4432: Database Systems II Transaction Management Motivation 1.

40

Graph Theory 101

Directed graph: nodes connected by directed edges

[Figure: one example of a cycle and one example that is not a cycle]

Page 41: CS4432: Database Systems II Transaction Management Motivation 1.

41

Theorem

P(S1) acyclic ⇔ S1 conflict serializable
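Given the theorem, checking conflict serializability reduces to cycle detection on the precedence graph. A minimal depth-first-search sketch follows; the function names are hypothetical, and edges are assumed to be `(i, j)` pairs meaning Ti → Tj.

```python
def has_cycle(edges):
    """Detect a directed cycle with depth-first search."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}   # node -> "active" (on the current DFS path) or "done"

    def visit(u):
        state[u] = "active"
        for v in graph[u]:
            # An edge back to a node on the current path closes a cycle.
            if state.get(v) == "active" or (v not in state and visit(v)):
                return True
        state[u] = "done"
        return False

    return any(u not in state and visit(u) for u in graph)


def conflict_serializable(edges):
    return not has_cycle(edges)


# Edge sets for Schedule C (T1 -> T2) and Schedule D (T1 -> T2 and T2 -> T1).
c_serializable = conflict_serializable({(1, 2)})
d_serializable = conflict_serializable({(1, 2), (2, 1)})
print(c_serializable, d_serializable)
```

The recursion depth is bounded by the number of transactions, so plain recursion is adequate for classroom-sized schedules.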

Page 42: CS4432: Database Systems II Transaction Management Motivation 1.

42

r2(x) r1(y) r1(z) r5(v) r5(w) w5(w)….

Time dim

Page 43: CS4432: Database Systems II Transaction Management Motivation 1.

Build P(A)

43

No cycles, so Schedule A is conflict serializable

Page 44: CS4432: Database Systems II Transaction Management Motivation 1.

44

Exercise 1:

• What is P(S) for S = w3(A) w2(C) r1(A) w1(B) r1(C) w2(A) r4(A) w4(D)?

• Is S conflict-serializable?

Page 45: CS4432: Database Systems II Transaction Management Motivation 1.

45

Exercise 2:

• What is P(S) for S = w1(A) r2(A) r3(A) w4(A)?

• Is S conflict-serializable?

Page 46: CS4432: Database Systems II Transaction Management Motivation 1.

Exercise 3:

• Build P(F). Is F conflict serializable?

Page 47: CS4432: Database Systems II Transaction Management Motivation 1.

How to Find the Equivalent Serial Order

47

No cycles, so Schedule A is conflict serializable. So what is the serial order equivalent to A?

Page 48: CS4432: Database Systems II Transaction Management Motivation 1.

How to Find the Equivalent Serial Order

48

• The serializability order can be obtained by a topological sort of the graph. This is a linear order consistent with the partial order of the graph.

  1. Take a transaction T with no incoming edges and put it next in the serial order (left to right)
  2. Delete T and its edges from the graph
  3. Repeat until all transactions are taken

There can be many valid orders; the result is not unique
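The procedure can be sketched in a few lines of Python. The function name is hypothetical, and the example graph is made up for illustration (it is not the graph of Schedule A from the slides); ties are broken by taking the smallest transaction number, which is just one of the valid choices.

```python
def serial_order(nodes, edges):
    """Repeatedly remove a transaction with no incoming edges."""
    nodes, edges = set(nodes), set(edges)
    order = []
    while nodes:
        free = sorted(n for n in nodes if all(v != n for _, v in edges))
        if not free:
            raise ValueError("graph has a cycle: no equivalent serial order")
        t = free[0]          # any member of free works; other picks give other orders
        order.append(t)
        nodes.remove(t)
        edges = {(u, v) for u, v in edges if u != t}   # delete T's outgoing edges
    return order


# Hypothetical precedence graph: T1 -> T2, T1 -> T3, T2 -> T4, T3 -> T4.
order = serial_order([1, 2, 3, 4], {(1, 2), (1, 3), (2, 4), (3, 4)})
print(order)
```

Picking T3 before T2 in the second step would give another equally valid serial order, which is why the result is not unique.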

Page 49: CS4432: Database Systems II Transaction Management Motivation 1.

How to Find the Equivalent Serial Order

49

One order: T5, T1, T2, T3, T4

Another order: T1, T3, T5, T2, T4

…

Page 50: CS4432: Database Systems II Transaction Management Motivation 1.

CS4432: Database Systems II

Concurrency Control: Enforcing Serializability: Locking

50

Page 51: CS4432: Database Systems II Transaction Management Motivation 1.

Enforcing Serializable Schedules

• DBMSs use a “Scheduler” that schedules the actions of transactions

• Transactions send their requests (R or W) to the Scheduler

• The scheduler prevents the formation of cycles
  – It grants permission to R or W only if no cycle will be formed

51

Page 52: CS4432: Database Systems II Transaction Management Motivation 1.

Locking Protocol

• The “Scheduler” uses a locking protocol to enforce serializability

• Two new actions
  – Lock (exclusive): li(A): Transaction Ti locks item A
  – Unlock: ui(A): Transaction Ti unlocks (releases) item A

(Figure: the scheduler keeps track of locks in a lock table)

Page 53: CS4432: Database Systems II Transaction Management Motivation 1.

53

Rule #1: Well-Formed Transactions

Ti: … li(A) … pi(A) … ui(A) …

Any action (R/W) must come after the lock (l) and before the unlock (u)

Rule 1 is at the level of each transaction, independent of the others

Page 54: CS4432: Database Systems II Transaction Management Motivation 1.

54

Rule #2: Legal Scheduler

S = …….. li(A) ………... ui(A) ……...
              (no lj(A) in between)

No transaction Tj can lock item A while it is already locked by another transaction Ti (Tj must wait until Ti releases its lock)

Rule 2 is at the level of the complete schedule (set of interleaving transactions)
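Both rules can be checked mechanically on a schedule string. The sketch below uses hypothetical helper names and its own small example schedules; it also assumes that a well-formed transaction eventually releases every lock it takes.

```python
import re


def parse(s):
    """'l1(A) r1(A) u1(A)' -> [('l', 1, 'A'), ('r', 1, 'A'), ('u', 1, 'A')]"""
    return [(op, int(t), x) for op, t, x in re.findall(r"([lurw])(\d+)\((\w+)\)", s)]


def well_formed(schedule, txn):
    """Rule #1 for one transaction: it reads/writes X only while holding
    X's lock, and it releases every lock it took (assumption)."""
    held = set()
    for op, t, x in schedule:
        if t != txn:
            continue
        if op == "l":
            if x in held:
                return False
            held.add(x)
        elif op == "u":
            if x not in held:
                return False
            held.remove(x)
        elif x not in held:        # read or write without holding the lock
            return False
    return not held


def legal(schedule):
    """Rule #2 for the whole schedule: nobody locks an item held by another."""
    owner = {}
    for op, t, x in schedule:
        if op == "l":
            if owner.get(x, t) != t:
                return False
            owner[x] = t
        elif op == "u" and owner.get(x) == t:
            del owner[x]
    return True


# Hypothetical schedules: in the first, T2 locks A while T1 still holds it.
bad = parse("l1(A) r1(A) l2(A) u1(A) r2(A) u2(A)")
good = parse("l1(A) r1(A) u1(A) l2(A) w2(A) u2(A)")
print(legal(bad), legal(good), well_formed(bad, 1), well_formed(bad, 2))
```

Note that the first schedule is illegal even though each of its transactions is well-formed, which shows that the two rules are independent.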

Page 55: CS4432: Database Systems II Transaction Management Motivation 1.

55

Exercise:

• What schedules are legal? What transactions are well-formed?

S1 = l1(A) l1(B) r1(A) w1(B) l2(B) u1(A) u1(B) r2(B) w2(B) u2(B) l3(B) r3(B) u3(B)

S2 = l1(A) r1(A) w1(B) u1(A) u1(B) l2(B) r2(B) w2(B) l3(B) r3(B) u3(B)

S3 = l1(A) r1(A) u1(A) l1(B) w1(B) u1(B) l2(B) r2(B) w2(B) u2(B) l3(B) r3(B) u3(B)

Page 56: CS4432: Database Systems II Transaction Management Motivation 1.

56

Schedule F: Let’s Add Some Locking!

Does the locking mechanism work? Does it guarantee a serializable schedule?

Page 57: CS4432: Database Systems II Transaction Management Motivation 1.

Still Something is Missing…

57

Even after applying the locks, the result is still not consistent!

Next: Rule #3 (Two-Phase Locking)