Synchronization Chapter 6 Part III Transactions. –Most of the lecture notes are based on slides by...


Synchronization

Chapter 6

Part III

Transactions

Giving credit where credit is due:

– Most of the lecture notes are based on slides by Prof. Jalal Y. Kawash at the University of Calgary

– Some slides are from Prof. Steve Goddard at the University of Nebraska-Lincoln; Prof. Harandi, Prof. Hou, Prof. Gupta, and Prof. Vaidya at the University of Illinois; and Prof. Kulkarni at Princeton University

– I have modified them and added new slides

CSCE455/855 Distributed Operating Systems

Transactions

• Example Transaction:

Transfer amount X from A to B:

debit(Account A; Amount X): A = A – X;

credit(Account B; Amount X): B = B + X;

• Either do the whole thing or nothing
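The all-or-nothing behavior of the transfer can be sketched in a few lines of Python. This is only an illustration under simple assumptions: accounts live in an in-memory dict, and the names `transfer` and `InsufficientFunds` are hypothetical, not part of the lecture material.

```python
class InsufficientFunds(Exception):
    """Raised when a debit would overdraw an account (hypothetical error type)."""


def transfer(accounts, src, dst, amount):
    """Move `amount` from src to dst, or leave `accounts` unchanged."""
    snapshot = dict(accounts)          # remember the initial state
    try:
        if accounts[src] < amount:
            raise InsufficientFunds(src)
        accounts[src] -= amount        # debit(A, X)
        accounts[dst] += amount        # credit(B, X); KeyError if dst is unknown
    except Exception:
        accounts.clear()
        accounts.update(snapshot)      # abort: restore the old values
        raise


accounts = {"A": 100, "B": 50}
transfer(accounts, "A", "B", 30)       # commits: A = 70, B = 80

try:
    transfer(accounts, "A", "C", 10)   # "C" does not exist, so the credit fails
except KeyError:
    pass                               # the debit was undone: A is still 70
```

The failed second transfer shows the point of the slide: the debit had already happened when the credit failed, yet an observer never sees A reduced without B (or C) increased.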

ACID Properties – AC

• Atomicity: No intermediate results are observable

– To an external observer, T jumps from the initial state to the final state, or never leaves the initial state

• Consistency: T produces consistent results only, otherwise it aborts

– T fulfills the consistency constraints of the application, no violation of system invariants

ACID Properties – ID

• Isolation: T looks like it is the only program running

– Notice the looks like!

• Durability: once T commits, the system never forgets its results

– T’s results become part of reality

Some Transaction Primitives

Examples of primitives for transactions:

Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file, a table, etc.
WRITE                Write data to a file, a table, etc.

Transaction Types

• Flat Transactions

• Flat Transactions with Savepoints

• Chained Transactions

• Nested Transactions

• Distributed Transactions

• Multi-level Transactions

• Open Nested Transactions

• Long-lived Transactions

Flat Transactions – All or Nothing

• Simplest type

• Basic building block for organizing an application into atomic actions

• Encloses a program with brackets:

– BEGIN_TRANSACTION

– END_TRANSACTION

• Everything between the brackets is performed, or nothing at all

• Strictly satisfies ACID

Flat Transactions – Limitations

• All-or-nothing is not always appropriate

• Committing partial results is sometimes beneficial, even though it violates atomicity

• Trip Planning: commit part of a transaction

• Bulk Updates: can be expensive to undo all updates

Trip Planning

a) Transaction to reserve three flights commits

BEGIN_TRANSACTION
  reserve LNK -> MSP;
  reserve MSP -> SEA;
  reserve SEA -> YVR;
END_TRANSACTION

b) Transaction aborts when the third flight is unavailable

BEGIN_TRANSACTION
  reserve LNK -> MSP;
  reserve MSP -> SEA;
  reserve SEA -> YVR full => ABORT_TRANSACTION

Flat Transactions with SavePoints

T:

BEGIN_TRANSACTION
  S1
  ...
  Sm
  SAVEPOINT
  Sm+1
  ...          <- Failure here: ROLLBACK
END_TRANSACTION

[Figure: extent of the ROLLBACK for a flat T without a savepoint versus a flat T with a savepoint]
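A minimal sketch of the savepoint idea, assuming the transaction's state fits in a Python dict. The class `SavepointTransaction` and its methods are hypothetical names chosen for this illustration; real systems log changes rather than copy the whole state.

```python
class SavepointTransaction:
    """Hypothetical helper: a flat transaction over a dict, with savepoints."""

    def __init__(self, data):
        self.data = data
        self._begin = dict(data)     # state at BEGIN_TRANSACTION
        self._savepoints = []        # states captured by SAVEPOINT

    def savepoint(self):
        self._savepoints.append(dict(self.data))

    def rollback(self):
        """Return to the latest savepoint, or to the start if there is none."""
        target = self._savepoints[-1] if self._savepoints else self._begin
        self.data.clear()
        self.data.update(target)


seats = {}
t = SavepointTransaction(seats)
seats["LNK->MSP"] = "reserved"
seats["MSP->SEA"] = "reserved"
t.savepoint()
seats["SEA->YVR"] = "pending"        # this leg turns out to be full
t.rollback()                         # undo only the work after the savepoint
```

After the rollback the first two reservations survive, whereas a transaction with no savepoint would return to its empty starting state.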

Nested Transactions – No D

• They organize T actions into a hierarchy

1. A nested T is a tree of Ts; sub-trees are flat or nested

2. Leaf Ts are flat

3. Root T = top-level T; all others are subTs

4. A subT can commit or roll back. Its commit will not take effect unless the root commits

5. When a subT rolls back, all its children are rolled back

• Satisfy ACI, not the D (except for top level)

Nested Transactions – Trip Planning

BEGIN_TRANSACTION

  BEGIN_SUB
    reserve LNK -> MSP;
  END_SUB

  BEGIN_SUB
    reserve MSP -> SEA;
  END_SUB

  BEGIN_SUB
    reserve SEA -> YVR;
  END_SUB

END_TRANSACTION

[Figure annotations: one sub-transaction is marked "Failure here"; the other two are marked "Can Commit"]

Distributed Transactions

• A flat transaction that runs in a distributed environment:

– Visit several nodes in the system

• Distributed T: structure is determined by the distribution of data in the DS

• Nested T: structure is determined by the functional decomposition of the application

Nested versus Distributed Ts

a) A nested transaction

b) A distributed transaction

How to Implement Transactions?

Achieving Atomicity (I)

• Private Workspace:

– Change a copy of the data, keeping the original intact

– COMMIT: copy the changed data to the original

– ROLLBACK: discard the copy

Private Workspace

a) The file index and disk blocks for a three-block file

b) The situation after a transaction has modified block 0 and appended block 3

c) After committing
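The three-block file scenario can be sketched as follows. This is an assumption-laden illustration: the file is modeled as a Python list of blocks, and `PrivateWorkspace` with its methods is a hypothetical name. Only modified or appended blocks are copied; unmodified blocks are read from the original.

```python
class PrivateWorkspace:
    """Hypothetical private workspace over a shared list of blocks."""

    def __init__(self, blocks):
        self.original = blocks       # shared file, untouched until commit
        self.private = {}            # block index -> private copy
        self.length = len(blocks)    # file length as seen by this transaction

    def read(self, i):
        if i in self.private:
            return self.private[i]
        return self.original[i]

    def write(self, i, value):
        self.private[i] = value      # the original block stays intact

    def append(self, value):
        self.private[self.length] = value
        self.length += 1

    def commit(self):
        # Install private blocks in index order so appends land in sequence.
        for i in sorted(self.private):
            if i < len(self.original):
                self.original[i] = self.private[i]
            else:
                self.original.append(self.private[i])
        self.private.clear()

    def rollback(self):
        self.private.clear()         # discard the copy
        self.length = len(self.original)


blocks = ["b0", "b1", "b2"]          # (a) a three-block file
ws = PrivateWorkspace(blocks)
ws.write(0, "b0'")                   # (b) modify block 0 ...
ws.append("b3")                      # ... and append block 3
before_commit = list(blocks)         # other transactions still see the old file
ws.commit()                          # (c) after committing
```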

Achieving Atomicity (II)

• Writeahead Log:

– Change the original data, logging every change before making it

– COMMIT: leave the changes

– ROLLBACK: restore from the log

Writeahead Log

a) A transaction

x = 0;
y = 0;
BEGIN_TRANSACTION;
  x = x + 1;
  y = y + 2;
  x = y * y;
END_TRANSACTION;

b) – d) The log before each statement is executed

(b) Log: [x = 0 / 1]

(c) Log: [x = 0 / 1] [y = 0 / 2]

(d) Log: [x = 0 / 1] [y = 0 / 2] [x = 1 / 4]
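The log entries in this example (old value / new value) can be reproduced with a small sketch. It assumes an in-memory store and a log of `(name, old, new)` tuples; `LoggedStore` is a hypothetical name, and a real write-ahead log would be forced to stable storage before each update.

```python
class LoggedStore:
    """Hypothetical store that logs every change before applying it."""

    def __init__(self, **initial):
        self.values = dict(initial)
        self.log = []                          # entries: (name, old, new)

    def write(self, name, new):
        self.log.append((name, self.values[name], new))   # log first ...
        self.values[name] = new                            # ... then change

    def commit(self):
        self.log.clear()                       # leave the changes in place

    def rollback(self):
        while self.log:                        # undo in reverse order
            name, old, _new = self.log.pop()
            self.values[name] = old


store = LoggedStore(x=0, y=0)
store.write("x", store.values["x"] + 1)        # log: [x = 0 / 1]
store.write("y", store.values["y"] + 2)        # log: [y = 0 / 2]
store.write("x", store.values["y"] ** 2)       # log: [x = 1 / 4]
log_before_rollback = list(store.log)
store.rollback()                               # x and y return to 0
```

Undoing in reverse order matters here: x is written twice, so replaying the log forward would restore x to 1 rather than to its original 0.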

Achieving Consistency & Isolation

• Concurrency Control: controlling the execution of concurrent Ts operating on shared data

• Consistency & Isolation: All Ts must appear as if they executed in some sequential order, one after another (Serializability)

Serializability

a) – c) Three transactions

(a) BEGIN_TRANSACTION x = 0; x = x + 1; END_TRANSACTION

(b) BEGIN_TRANSACTION x = 0; x = x + 2; END_TRANSACTION

(c) BEGIN_TRANSACTION x = 0; x = x + 3; END_TRANSACTION

d) Possible schedules

Schedule 1   x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 2   x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3;   Legal
Schedule 3   x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3;   Illegal
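One way to see why Schedule 3 is illegal is to run each schedule and compare the final value of x with what the serial orders can produce. The sketch below does only this final-state check, which is weaker than a full serializability test; the operation encoding and function names are assumptions made for the illustration.

```python
from itertools import permutations

# Each transaction does: x = 0; x = x + increment
INCREMENTS = {"a": 1, "b": 2, "c": 3}


def run(schedule):
    """Execute a list of ("set", v) / ("add", v) operations on x."""
    x = None
    for kind, value in schedule:
        x = value if kind == "set" else x + value
    return x


def serial_results():
    """Final values of x over every serial order of the three transactions."""
    results = set()
    for order in permutations(INCREMENTS):
        schedule = []
        for t in order:
            schedule += [("set", 0), ("add", INCREMENTS[t])]
        results.add(run(schedule))
    return results


def is_legal(schedule):
    return run(schedule) in serial_results()


def add(k):
    return ("add", k)


ZERO = ("set", 0)
schedule1 = [ZERO, add(1), ZERO, add(2), ZERO, add(3)]
schedule2 = [ZERO, ZERO, add(1), add(2), ZERO, add(3)]
schedule3 = [ZERO, ZERO, add(1), ZERO, add(2), add(3)]
```

Any serial order ends with x equal to 1, 2, or 3, depending on which transaction runs last. Schedules 1 and 2 end with x = 3, while Schedule 3 ends with x = 5, a value no serial order yields.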

Ts as Sequences of Reads and Writes

• Transaction = sequence of read and write operations

– Read of x returning 1 by T: rT(x)1

– Write of the value 2 to x by T: wT(x)2

Ta:

BEGIN_TRANSACTION x = 0; x = x + 1; END_TRANSACTION

BT wa(x)0 ra(x)0 wa(x)1 ET

Conflicting Operations

• Two operations o1 and o2 conflict if:

– Both o1 and o2 are on the same data item, and

– At least one of o1 and o2 is a write

Ta: wa(x)0 ra(x)0 wa(x)1

Tb: wb(x)0 rb(x)0 wb(x)2

Tc: wc(x)0 rc(x)0 wc(x)3
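The conflict definition translates directly into a predicate. The sketch below follows the two conditions exactly as stated on the slide; the `Op` record and its field names are assumptions for the illustration.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str     # transaction label, e.g. "a"
    kind: str    # "r" for read, "w" for write
    item: str    # data item, e.g. "x"


def conflicts(o1, o2):
    """Same data item, and at least one of the two operations is a write."""
    return o1.item == o2.item and "w" in (o1.kind, o2.kind)


read_a = Op("a", "r", "x")
read_b = Op("b", "r", "x")
write_b = Op("b", "w", "x")
write_y = Op("c", "w", "y")
```

Here `conflicts(read_a, write_b)` holds (read-write on x), while `conflicts(read_a, read_b)` does not (two reads), and neither does `conflicts(read_a, write_y)` (different items).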

Concurrency Control Algorithms (CCA)

• CCA: order read and write operations

– By using locks (critical sections)

– By using timestamps

• Pessimistic CCA: Act conservatively so that nothing can go wrong

• Optimistic CCA: Act aggressively, if something goes wrong, abort

Using Locks

• For T to access (read or write) a data item x:

– T requests a lock on x from the scheduler

• When T finishes accessing x:

– T releases the lock

• Scheduler grants acquisitions & releases on locks so that serializability is guaranteed

• Locks:

– Shared: can only read x

– Exclusive: can read and write x

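Example

Ta: x = 0; x = x + 1

Tb: x = 0; x = x + 2

Serializability: x = 1 or 2 only

Interleaved execution of Ta and Tb, acquiring and releasing x.lock around every single operation:

1. Acquire x.lock; w(x)0; Release x.lock

2. Acquire x.lock; w(x)0; Release x.lock

3. Acquire x.lock; r(x)0; Release x.lock

4. Acquire x.lock; w(x)1; Release x.lock

5. Acquire x.lock; r(x)1; Release x.lock

6. Acquire x.lock; w(x)3; Release x.lock

Result: x = 3, which no serial execution produces, so locking each operation separately does not guarantee serializability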

Two-Phase Locking – Pessimistic

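A transaction executes in two phases: a growing phase, in which it only acquires locks, and a shrinking phase, in which it only releases them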

2PL – Scheduler Rules

1. When sched receives operation oT(x)v:

a. If some operation o’ holds a lock on x and o’ conflicts with o: delay o

b. Else grant a lock to o and pass o to the data manager

2. Sched never releases a lock for x granted for operation o until the data manager acks the completion of o

3. Once sched releases a lock for T, T cannot be granted another lock. If T tries to acquire one, abort it
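The scheduler rules can be sketched as a small lock table. This is a simplified illustration under stated assumptions: the scheduler only answers "grant", "delay", or "abort" and does not queue delayed operations, and the caller is trusted to invoke `release` only after the data manager has acknowledged the operation. `TwoPhaseScheduler` and its method names are hypothetical.

```python
class TwoPhaseScheduler:
    """Hypothetical 2PL scheduler: decides, but does not queue or execute."""

    def __init__(self):
        self.holders = {}        # item -> {txn: "r" or "w"}
        self.shrinking = set()   # transactions that have released a lock

    def request(self, txn, kind, item):
        """Return "grant", "delay", or "abort" for a read ("r") or write ("w")."""
        if txn in self.shrinking:
            return "abort"       # no new lock after the first release
        held = self.holders.setdefault(item, {})
        for other, other_kind in held.items():
            if other != txn and "w" in (kind, other_kind):
                return "delay"   # a conflicting lock is held by another T
        if held.get(txn) != "w":
            held[txn] = kind     # keep the stronger mode on an upgrade
        return "grant"

    def release(self, txn, item):
        """Call only after the data manager acknowledged the operation."""
        self.holders.get(item, {}).pop(txn, None)
        self.shrinking.add(txn)


sched = TwoPhaseScheduler()
first = sched.request("Ta", "w", "x")    # granted: nobody holds x
second = sched.request("Tb", "r", "x")   # delayed: conflicts with Ta's write lock
sched.release("Ta", "x")                 # Ta enters its shrinking phase
third = sched.request("Tb", "r", "x")    # granted now
fourth = sched.request("Ta", "r", "y")   # aborted: Ta already released a lock
```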

2PL – Analysis

• 2PL guarantees serializability

• BUT not every serializable schedule can be generated by 2PL

• 2PL can cause a deadlock, and it reduces concurrency

(a) BEGIN_TRANSACTION x = 0; x = x + 1; x = 6; END_TRANSACTION

(b) BEGIN_TRANSACTION x = 0; x = x + 2; x = 8; END_TRANSACTION

(c) BEGIN_TRANSACTION x = 0; x = x + 3; x = 10; END_TRANSACTION

Strict 2PL

Do not release locks until a T is finished

2PL – Discussion

• Strict 2PL can also lead to a deadlock

– Use timeouts to preempt locks

• Strict 2PL advantages:

– Acquisitions and releases can be done more easily, without T’s knowledge

– T always reads committed data, which avoids cascaded aborts

• Every 2PL (both versions) schedule is serializable

• BUT not every serializable schedule can be generated by 2PL (both versions)

Pessimistic Timestamp Ordering (PTO)

• Each T has a timestamp, denoted T.ts

• Each operation in T has the same timestamp (T.ts)

• Using Lamport’s or vector timestamps, all Ts have unique timestamp values

• If T.ts < T’.ts then T must appear before T’ in the schedule

• With each data item x, associate:

– x.wts: largest ts of any T that executed wT(x)v

– x.rts: largest ts of any T that executed rT(x)v

• Update x.wts (resp. x.rts) whenever wT(x)v (resp. rT(x)v) occurs

PTO General Concept

• If T.ts < T’.ts then T must appear before T’ in the schedule

• Process transactions in a serial order

• Can use the same file, but must do it in order

• Therefore isolation is preserved

PTO – read operation

• When Sched receives rT(x)v operation:

– if T.ts < x.wts

\\ T tries to read the past

reject rT(x)v

roll T back (assign T a new ts and restart)

– if T.ts ≥ x.wts

\\ (Do we need to compare T.ts with x.rts?)

execute rT(x)v

x.rts = max(x.rts, T.ts)

PTO – write operation

• When Sched receives wT(x)v operation:

– if (T.ts < x.rts) or (T.ts < x.wts)

\\ T tries to write in the past

reject wT(x)v

roll T back (assign T a new ts and restart)

– else

execute wT(x)v

x.wts = T.ts
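The PTO read and write rules fit in two short functions. The sketch assumes integer timestamps with 0 meaning "never accessed", and signals a rejected operation with a hypothetical `Rollback` exception; assigning the new timestamp and restarting T is left to the caller.

```python
class Rollback(Exception):
    """Hypothetical signal: the operation is rejected and T must restart."""


class Item:
    def __init__(self, value=None):
        self.value = value
        self.rts = 0     # largest ts of any T that read the item
        self.wts = 0     # largest ts of any T that wrote the item


def pto_read(ts, item):
    if ts < item.wts:
        raise Rollback("T tries to read the past")
    item.rts = max(item.rts, ts)
    return item.value


def pto_write(ts, item, value):
    if ts < item.rts or ts < item.wts:
        raise Rollback("T tries to write in the past")
    item.value = value
    item.wts = ts


x = Item(0)
pto_write(2, x, 10)              # T with ts 2 writes x
seen = pto_read(3, x)            # T with ts 3 reads 10; x.rts becomes 3

try:
    pto_write(2, x, 20)          # ts 2 < x.rts = 3: rejected
    late_write_rejected = False
except Rollback:
    late_write_rejected = True
```

The read rule does not consult x.rts: a read by an older transaction after a younger one's read is harmless, since two reads never conflict.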

PTO – Analysis

• Will PTO cause deadlock?

– PTO is deadlock-free

• Each PTO schedule is serializable

• Not every serializable schedule is possible in PTO

• 2PL can produce schedules not possible under PTO, and vice versa

[Figure: the sets "All PTO Schedules" and "All 2PL Schedules" drawn inside "All serializable Schedules"]

Optimistic Timestamp Ordering (OTO)

• Each T makes its changes to data in its private workspace

• When done, either commit or restart

• Each data item x has x.wts and x.rts, updated as before

• When T needs to commit, check x’s ts: if there is a conflict then restart, else commit
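The slides leave the commit-time check unspecified, so the following is only one possible reading, with loud assumptions: each item carries just a write timestamp (x.rts is omitted for brevity), and a "conflict" means that some other transaction committed a write to an item after T first touched it. `OptimisticTxn` is a hypothetical name.

```python
class OptimisticTxn:
    """Hypothetical optimistic transaction over store[name] = [value, wts]."""

    def __init__(self, ts, store):
        self.ts = ts
        self.store = store
        self.seen = {}           # name -> wts observed at first access
        self.workspace = {}      # private copies of written items

    def read(self, name):
        value, wts = self.store[name]
        self.seen.setdefault(name, wts)
        return self.workspace.get(name, value)

    def write(self, name, value):
        self.seen.setdefault(name, self.store[name][1])
        self.workspace[name] = value     # the shared store is untouched

    def commit(self):
        """Validate, then install the workspace. False means: restart T."""
        for name, wts in self.seen.items():
            if self.store[name][1] != wts:
                return False             # someone wrote the item meanwhile
        for name, value in self.workspace.items():
            self.store[name] = [value, self.ts]
        return True


store = {"x": [0, 0]}
t1 = OptimisticTxn(1, store)
t2 = OptimisticTxn(2, store)
t1.write("x", t1.read("x") + 1)
t2.write("x", t2.read("x") + 2)
t1_committed = t1.commit()       # validates: x unchanged since t1 read it
t2_committed = t2.commit()       # fails: t1 wrote x after t2 read it
```

Neither transaction ever waits, which is the parallelism benefit; the cost shows up when t2 has to discard its work and restart.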

OTO (cont.)

• Parallelism is maximized

– No waiting on locks

– Inefficient when an abort is needed

• Pessimistic vs. Optimistic CCA

– For instance, PTO vs. OTO?

TO: Cascaded Aborts Problem

• For example, consider the following run with transactions T1 and T2:

– W1(x) R2(x) W2(y) R1(z): when can we commit T1 and T2?

– W1(x) R2(x) W2(y) C2 R1(z) C1

– This could be produced by a TO scheduler

• T2 commits even though it has read (i.e., R2(x)) from an uncommitted transaction

– Answer: a scheduler can keep a list of the other transactions each transaction has read from, and not let a transaction commit before this list consists of only committed transactions

• Cascaded aborts still possible!

– Answer: To avoid cascaded aborts, the scheduler can tag data written by uncommitted transactions as dirty, and never let a read operation start on such a data item before it is untagged
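The first answer, keeping a read-from list per transaction and delaying commit until that list contains only committed transactions, might look like the sketch below. `CommitTracker` and its methods are hypothetical names, and aborts are not modeled.

```python
class CommitTracker:
    """Hypothetical bookkeeping for commit dependencies between transactions."""

    def __init__(self):
        self.last_writer = {}    # item -> transaction that last wrote it
        self.reads_from = {}     # txn -> set of transactions it has read from
        self.committed = set()

    def write(self, txn, item):
        self.reads_from.setdefault(txn, set())
        self.last_writer[item] = txn

    def read(self, txn, item):
        deps = self.reads_from.setdefault(txn, set())
        writer = self.last_writer.get(item)
        if writer is not None and writer != txn and writer not in self.committed:
            deps.add(writer)     # txn has read uncommitted data

    def try_commit(self, txn):
        """Commit only if everything txn read from is already committed."""
        if self.reads_from.get(txn, set()) - self.committed:
            return False         # must wait
        self.committed.add(txn)
        return True


# The run from the example: W1(x) R2(x) W2(y) C2? R1(z) C1 C2
tracker = CommitTracker()
tracker.write("T1", "x")
tracker.read("T2", "x")
tracker.write("T2", "y")
c2_early = tracker.try_commit("T2")      # refused: T2 read from uncommitted T1
tracker.read("T1", "z")
c1 = tracker.try_commit("T1")            # T1 read from nobody
c2_late = tracker.try_commit("T2")       # allowed now that T1 has committed
```

This only delays T2's commit. If T1 aborted instead, T2 would still have to be aborted too, which is why the dirty-tag rule is needed to rule out cascaded aborts altogether.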

Appendix

Concurrency Control

General organization of managers for handling transactions.


Concurrency Control for Distributed Ts

General organization of managers for handling distributed transactions.