PMIT-6102 Advanced Database Systems


Page 1: PMIT-6102 Advanced Database Systems

Distributed DBMS, Slide 1. Lectured by Jesmin Akhter, Assistant Professor, IIT, JU

PMIT-6102 Advanced Database Systems

By Jesmin Akhter

Assistant Professor, IIT, Jahangirnagar University

Page 2: PMIT-6102 Advanced Database Systems

Lecture 11: Distributed Concurrency Control

Page 3: PMIT-6102 Advanced Database Systems

Outline

- Transaction
  - Role of the distributed execution monitor
  - Schedule
  - Detailed Model of the Distributed Execution Monitor
- Distributed Concurrency Control
  - Serializability in Distributed DBMS
  - Concurrency Control Algorithms
  - Timestamp ordering
- Deadlock
  - Centralized Deadlock Detection
  - Distributed Deadlock Detection

Page 4: PMIT-6102 Advanced Database Systems

Role of the distributed execution monitor

- The distributed execution monitor consists of two modules: a transaction manager (TM) and a scheduler (SC).
- The transaction manager is responsible for coordinating the execution of the database operations on behalf of an application.
- The scheduler, on the other hand, is responsible for implementing a specific concurrency control algorithm that synchronizes access to the database.
- A third component that participates in the management of distributed transactions is the local recovery manager (LRM) that exists at each site. Its function is to implement the local procedures by which the local database can be recovered to a consistent state following a failure.

Page 5: PMIT-6102 Advanced Database Systems

Role of the distributed execution monitor

- Each transaction originates at one site, which we will call its originating site.
- The execution of the database operations of a transaction is coordinated by the TM at that transaction's originating site.
- The transaction managers implement an interface for the application programs which consists of five commands: begin transaction, read, write, commit, and abort.

Page 6: PMIT-6102 Advanced Database Systems

Role of the distributed execution monitor

- Begin transaction: This is an indicator to the TM that a new transaction is starting. The TM does some bookkeeping, such as recording the transaction's name, the originating application, and so on, in coordination with the data processor.
- Read: If the data item to be read is stored locally, its value is read and returned to the transaction. Otherwise, the TM finds where the data item is stored and requests its value to be returned (after appropriate concurrency control measures are taken).
- Write: If the data item is stored locally, its value is updated (in coordination with the data processor). Otherwise, the TM finds where the data item is located and requests the update to be carried out at that site (after appropriate concurrency control measures are taken).
- Commit: The TM coordinates the sites involved in updating data items on behalf of this transaction so that the updates are made permanent at every site.
- Abort: The TM makes sure that no effects of the transaction are reflected in any of the databases at the sites where it updated data items.
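The five commands can be pictured with a small sketch. This is only an illustration, not the lecture's implementation: the `Site` and `TransactionManager` classes, the deferred-update strategy (writes are buffered until commit), and the catalog lookup in `_locate` are all assumptions, and concurrency control and failure handling are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical data processor holding the items stored at one site."""
    name: str
    store: dict = field(default_factory=dict)


class TransactionManager:
    """Toy TM at a transaction's originating site (no locking, no failures)."""

    def __init__(self, local, sites):
        self.local = local
        self.sites = {s.name: s for s in [local, *sites]}
        self.pending = {}  # txn name -> {(site name, item): new value}

    def _locate(self, item):
        # Assumed catalog lookup: prefer the local copy, else the first site storing the item.
        if item in self.local.store:
            return self.local
        return next(s for s in self.sites.values() if item in s.store)

    def begin_transaction(self, txn):
        self.pending[txn] = {}  # bookkeeping for the new transaction

    def read(self, txn, item):
        site = self._locate(item)
        # A transaction sees its own buffered writes first.
        return self.pending[txn].get((site.name, item), site.store[item])

    def write(self, txn, item, value):
        site = self._locate(item)
        self.pending[txn][(site.name, item)] = value  # deferred until commit

    def commit(self, txn):
        # Make the updates permanent at every site involved.
        for (site_name, item), value in self.pending.pop(txn).items():
            self.sites[site_name].store[item] = value

    def abort(self, txn):
        self.pending.pop(txn)  # discard buffered updates; no site is touched
```

With deferred updates, abort is trivial because no site has been modified yet; a real TM would also need an atomic commit protocol across sites.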

Page 7: PMIT-6102 Advanced Database Systems

Detailed Model of the Distributed Execution Monitor

[Figure: the distributed execution monitor, made up of the Transaction Manager (TM) and the Scheduler (SC). Labels: Begin_transaction, Read, Write, Commit, Abort; Results; With other TMs; Scheduling/Descheduling Requests; With other SCs; To data processor.]

Page 8: PMIT-6102 Advanced Database Systems

Serializability in Distributed DBMS

- Somewhat more involved. Two histories have to be considered:
  - local histories
  - global history
- For global transactions (i.e., the global history) to be serializable, two conditions are necessary:
  - Each local history should be serializable.
  - Two conflicting operations should be in the same relative order in all of the local histories where they appear together.

Page 9: PMIT-6102 Advanced Database Systems

Global Non-serializability

The following two local histories are individually serializable (in fact serial), but the two transactions are not globally serializable.

T1: Read(x)        T2: Read(x)
    x ← x + 5          x ← x * 15
    Write(x)           Write(x)
    Commit             Commit

LH1 = {R1(x), W1(x), C1, R2(x), W2(x), C2}
LH2 = {R2(x), W2(x), C2, R1(x), W1(x), C1}
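The second condition for global serializability (conflicting operations appear in the same relative order at every site) can be checked mechanically for this example. The sketch below is an assumption-laden illustration: histories are encoded as lists of hypothetical `(operation, transaction, item)` tuples with commits omitted, and only pairwise ordering is compared, so it does not test whether each local history is itself serializable.

```python
from itertools import combinations


def conflict_orders(history):
    """Return pairs (Ti, Tj) where an operation of Ti precedes a conflicting one of Tj."""
    orders = set()
    # combinations() keeps list order, so the first tuple always precedes the second.
    for (op1, t1, x1), (op2, t2, x2) in combinations(history, 2):
        # Two operations conflict if they come from different transactions,
        # touch the same item, and at least one of them is a write.
        if t1 != t2 and x1 == x2 and "W" in (op1, op2):
            orders.add((t1, t2))
    return orders


def consistent_across_sites(local_histories):
    """True if no pair of transactions is ordered one way at one site and the other way elsewhere."""
    all_orders = set().union(*(conflict_orders(h) for h in local_histories))
    return not any((tj, ti) in all_orders for (ti, tj) in all_orders)


# The two local histories from the slide (commits left out).
LH1 = [("R", 1, "x"), ("W", 1, "x"), ("R", 2, "x"), ("W", 2, "x")]
LH2 = [("R", 2, "x"), ("W", 2, "x"), ("R", 1, "x"), ("W", 1, "x")]

print(consistent_across_sites([LH1, LH2]))  # False: T1 before T2 at one site, T2 before T1 at the other
```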

Page 10: PMIT-6102 Advanced Database Systems

Concurrency Control Algorithms

- Pessimistic
  - Two-Phase Locking-based (2PL)
    - Centralized (primary site) 2PL
    - Primary copy 2PL
    - Distributed 2PL
  - Timestamp Ordering (TO)
    - Basic TO
    - Multiversion TO
    - Conservative TO
  - Hybrid
- Optimistic
  - Locking-based
  - Timestamp ordering-based

Page 11: PMIT-6102 Advanced Database Systems

Locking-Based Algorithms

- Transactions indicate their intentions by requesting locks from the scheduler (called the lock manager).
- Locks are either read locks (rl), also called shared locks, or write locks (wl), also called exclusive locks.
- Read locks and write locks conflict (because Read and Write operations are incompatible). Lock compatibility:

          rl    wl
    rl    yes   no
    wl    no    no

- Locking works nicely to allow concurrent processing of transactions.
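The compatibility rules translate directly into a lookup table. The following is a minimal sketch, assuming a hypothetical `LockManager` class that answers a request with `True` (granted) or `False` (the requester must wait); queuing of waiting transactions is not modelled.

```python
# (held mode, requested mode) -> may both be held by different transactions?
COMPATIBLE = {
    ("rl", "rl"): True,
    ("rl", "wl"): False,
    ("wl", "rl"): False,
    ("wl", "wl"): False,
}


class LockManager:
    """Toy lock manager for a single site."""

    def __init__(self):
        self.held = {}  # item -> list of (transaction, mode)

    def request(self, txn, item, mode):
        """Grant the lock and return True, or return False if txn has to wait."""
        holders = self.held.setdefault(item, [])
        for other, other_mode in holders:
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False
        holders.append((txn, mode))
        return True

    def release(self, txn, item):
        """Drop every lock that txn holds on item."""
        self.held[item] = [(t, m) for t, m in self.held.get(item, []) if t != txn]
```

Two readers can share an item, while a writer is granted the lock only once no other transaction holds any lock on it.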

Page 12: PMIT-6102 Advanced Database Systems

Two-Phase Locking (2PL)

- A transaction locks an object before using it.
- When an object is locked by another transaction, the requesting transaction must wait.
- Once a transaction has released a lock, it may not request another lock.

[Figure: number of locks held over the life of a transaction, from BEGIN to END. Locks are obtained in Phase 1 up to the lock point and released in Phase 2.]
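The two-phase rule is easy to test on the sequence of lock actions issued by one transaction. A minimal sketch, assuming the actions are given as hypothetical `("lock" | "unlock", item)` pairs in the order they were issued:

```python
def obeys_2pl(actions):
    """Return True if no lock is requested after the first unlock."""
    shrinking = False  # becomes True once Phase 2 (releasing) has started
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True
        elif kind == "lock" and shrinking:
            return False  # a lock request in the shrinking phase violates 2PL
    return True


print(obeys_2pl([("lock", "X"), ("lock", "Y"), ("unlock", "X"), ("unlock", "Y")]))  # True
print(obeys_2pl([("lock", "X"), ("unlock", "X"), ("lock", "Y")]))                   # False
```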

Page 13: PMIT-6102 Advanced Database Systems

Strict 2PL

Hold locks until the end.

[Figure: strict 2PL lock profile between BEGIN and END. Labels: Obtain lock, Release lock, Transaction duration, period of data item use.]

Page 14: PMIT-6102 Advanced Database Systems

Testing for Serializability

- Consider transactions T1, T2, …, Tk.
- Create a directed graph (called a conflict graph) whose nodes are transactions.
- Consider a history of transactions.
- If T1 unlocks an item and T2 locks it afterwards, draw an edge from T1 to T2, implying that T1 must precede T2 in any serial history: T1→T2.
- Repeat this for all unlock and lock actions of different transactions.
- If the graph has a cycle, the history is not serializable.
- If the graph is acyclic, a topological sort gives an equivalent serial history.

Page 15: PMIT-6102 Advanced Database Systems

Example

T1: Lock X
T1: Unlock X
T2: Lock X
T2: Lock Y
T2: Unlock X
T2: Unlock Y
T3: Lock Y
T3: Unlock Y

Edges: T1→T2, T2→T3

[Figure: conflict graph T1 → T2 → T3]
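The test for serializability can be run on this example. The sketch below assumes the history is encoded as hypothetical `(transaction, action, item)` tuples and relies on the standard-library `graphlib` module (Python 3.9 or later) for the topological sort; it is an illustration, not the lecture's code.

```python
from graphlib import CycleError, TopologicalSorter


def conflict_graph(history):
    """Map each transaction to the set of transactions that must precede it."""
    preds = {txn: set() for txn, _, _ in history}
    unlockers = {}  # item -> transactions that have unlocked it so far
    for txn, action, item in history:
        if action == "unlock":
            unlockers.setdefault(item, []).append(txn)
        elif action == "lock":
            # Every earlier unlocker of this item must precede the locker.
            preds[txn].update(t for t in unlockers.get(item, []) if t != txn)
    return preds


def serial_order(history):
    """Return an equivalent serial order, or None if the conflict graph has a cycle."""
    try:
        return list(TopologicalSorter(conflict_graph(history)).static_order())
    except CycleError:
        return None


example = [
    ("T1", "lock", "X"), ("T1", "unlock", "X"),
    ("T2", "lock", "X"), ("T2", "lock", "Y"),
    ("T2", "unlock", "X"), ("T2", "unlock", "Y"),
    ("T3", "lock", "Y"), ("T3", "unlock", "Y"),
]
print(serial_order(example))  # ['T1', 'T2', 'T3']
```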

Page 16: PMIT-6102 Advanced Database Systems

Theorem

Two-phase locking is a sufficient condition to ensure serializability.

Proof: By contradiction.

- If a history is not serializable, a cycle must exist in the conflict graph. This means the existence of a path such as T1→T2→T3 … Tk→T1.
- Following the path, T1 unlocked an item before T2 locked it, and T1 locked an item only after Tk unlocked it, which happened later along the chain. So T1 requested a lock after it had already released one.
- This violates the condition of two-phase locking.

Page 17: PMIT-6102 Advanced Database Systems

Centralized 2PL

- Only one of the sites has a lock manager. The transaction managers at the other sites communicate with it rather than with their own lock managers.
- This approach is also known as the primary site 2PL algorithm.
- The communication between the cooperating sites in executing a transaction according to a centralized 2PL (C2PL) algorithm is depicted in the figure on the next slide.
- This communication is between the transaction manager at the site where the transaction is initiated (called the coordinating TM), the lock manager at the central site, and the data processors (DP) at the other participating sites.
- The participating sites are those that store the data item and at which the operation is to be carried out.
- The order of messages is denoted in the figure.

Page 18: PMIT-6102 Advanced Database Systems

Centralized 2PL

- There is only one 2PL scheduler in the distributed system.
- Lock requests are issued to the central scheduler.

[Figure: message sequence among the coordinating TM, the central site LM, and the data processors at participating sites, in order: Lock Request, Lock Granted, Operation, End of Operation, Release Locks.]

Page 19: PMIT-6102 Advanced Database Systems


Distributed 2PL

2PL schedulers are placed at each site. Each scheduler handles lock requests for data at that site.

A transaction may read any of the replicated copies of item x, by obtaining a read lock on one of the copies of x. Writing into x requires obtaining write locks for all copies of x.
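The read-one, write-all rule for replicated items can be sketched as a function that decides which site schedulers must be asked for a lock. The `replicas` catalog, the `preferred` parameter, and the function name are assumptions made for illustration.

```python
def sites_to_lock(item, mode, replicas, preferred=None):
    """Return the sites whose schedulers must grant a lock on item.

    A read lock ("rl") is needed on one copy only; a write lock ("wl")
    is needed on every copy.
    """
    copies = replicas[item]
    if mode == "rl":
        # Any single copy will do; use the preferred site if it holds one.
        return [preferred if preferred in copies else copies[0]]
    if mode == "wl":
        return list(copies)
    raise ValueError(f"unknown lock mode: {mode}")


replicas = {"x": ["site1", "site2", "site3"]}
print(sites_to_lock("x", "rl", replicas, preferred="site2"))  # ['site2']
print(sites_to_lock("x", "wl", replicas))                     # ['site1', 'site2', 'site3']
```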

Page 20: PMIT-6102 Advanced Database Systems

Distributed 2PL Execution

[Figure: message sequence among the coordinating TM, the participating LMs, and the participating DPs, in order: Lock Request, Operation, End of Operation, Release Locks.]

Page 21: PMIT-6102 Advanced Database Systems

Timestamp Ordering

- Each transaction Ti is assigned a globally unique timestamp ts(Ti).
- The transaction manager attaches the timestamp to all operations issued by the transaction.
- Each data item is assigned a write timestamp (wts) and a read timestamp (rts):
  - rts(x) = largest timestamp of any read on x
  - wts(x) = largest timestamp of any write on x
- Conflicting operations are resolved by timestamp order.

Basic T/O:

for Ri(x):                              for Wi(x):
  if ts(Ti) < wts(x)                      if ts(Ti) < rts(x) or ts(Ti) < wts(x)
  then reject Ri(x)                       then reject Wi(x)
  else accept Ri(x)                       else accept Wi(x)
       rts(x) ← max(rts(x), ts(Ti))            wts(x) ← ts(Ti)
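The Basic T/O rules fit in a few lines of code. The sketch below assumes positive integer timestamps, with 0 standing for "never read or written", and a hypothetical `BasicTO` class. A write is rejected when the transaction is older than either the last reader or the last writer, and the read timestamp is updated with `max` so that it stays the largest read timestamp, matching its definition.

```python
class BasicTO:
    """Toy basic timestamp-ordering scheduler for one site."""

    def __init__(self):
        self.rts = {}  # item -> largest timestamp of any accepted read
        self.wts = {}  # item -> largest timestamp of any accepted write

    def read(self, ts, x):
        """Accept Ri(x) unless a younger transaction has already written x."""
        if ts < self.wts.get(x, 0):
            return False  # reject: the transaction must restart
        self.rts[x] = max(self.rts.get(x, 0), ts)
        return True

    def write(self, ts, x):
        """Accept Wi(x) unless a younger transaction has already read or written x."""
        if ts < self.rts.get(x, 0) or ts < self.wts.get(x, 0):
            return False  # reject: the transaction must restart
        self.wts[x] = ts
        return True


sc = BasicTO()
print(sc.read(5, "x"))   # True, rts(x) becomes 5
print(sc.write(3, "x"))  # False, a younger transaction (ts 5) already read x
print(sc.write(7, "x"))  # True, wts(x) becomes 7
```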

Page 22: PMIT-6102 Advanced Database Systems

Deadlock

- A transaction is deadlocked if it is blocked and will remain blocked until there is intervention.
- Locking-based CC algorithms may cause deadlocks.
- TO-based algorithms that involve waiting may cause deadlocks.
- Wait-for graph (WFG): if transaction Ti waits for another transaction Tj to release a lock on an entity, then Ti → Tj is an edge in the WFG.

[Figure: WFG edge Ti → Tj]

Page 23: PMIT-6102 Advanced Database Systems

Local versus Global WFG

Assume T1 and T2 run at site 1, and T3 and T4 run at site 2. Also assume T3 waits for a lock held by T4, which waits for a lock held by T1, which waits for a lock held by T2, which, in turn, waits for a lock held by T3.

[Figure: local WFGs, with T1 → T2 at site 1 and T3 → T4 at site 2, and the global WFG, in which T1 → T2 → T3 → T4 → T1 forms a cycle across the two sites.]
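The point of the example is that neither local WFG contains a cycle, yet the merged graph does. A minimal sketch, assuming each WFG is a dictionary from a transaction to the set of transactions it waits for, and that the inter-site waits are kept in a separate hypothetical `inter_site` dictionary:

```python
def has_cycle(wfg):
    """Depth-first search for a cycle in a wait-for graph."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}

    def visit(t):
        color[t] = GRAY
        for u in wfg.get(t, ()):
            c = color.get(u, WHITE)
            if c == GRAY or (c == WHITE and visit(u)):
                return True  # reached a node that is still on the current path
        color[t] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in list(wfg))


def merge(*wfgs):
    """Union the edges of several wait-for graphs into one."""
    merged = {}
    for g in wfgs:
        for t, waits in g.items():
            merged.setdefault(t, set()).update(waits)
    return merged


site1 = {"T1": {"T2"}}                     # T1 waits for T2
site2 = {"T3": {"T4"}}                     # T3 waits for T4
inter_site = {"T4": {"T1"}, "T2": {"T3"}}  # waits that cross sites

print(has_cycle(site1), has_cycle(site2))         # False False
print(has_cycle(merge(site1, site2, inter_site)))  # True
```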

Page 24: PMIT-6102 Advanced Database Systems

Deadlock Management

- Ignore: Let the application programmer deal with it, or restart the system.
- Prevention: Guaranteeing that deadlocks can never occur in the first place. Check the transaction when it is initiated. Requires no run-time support.
- Avoidance: Detecting potential deadlocks in advance and taking action to ensure that deadlock will not occur. Requires run-time support.
- Detection and Recovery: Allowing deadlocks to form and then finding and breaking them. As in the avoidance scheme, this requires run-time support.

Page 25: PMIT-6102 Advanced Database Systems

Deadlock Avoidance

- Transactions are not required to request resources a priori.
- Transactions are allowed to proceed unless a requested resource is unavailable.
- In case of conflict, transactions may be allowed to wait for a fixed time interval.
- Order either the data items or the sites and always request locks in that order.
- More attractive than prevention in a database environment.
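The ordering idea can be illustrated with a short sketch: if every transaction sorts the items it needs by the same key before requesting locks, no two transactions can wait for each other in opposite orders. The `site_of` catalog and the choice of `(site, item)` as the sort key are assumptions for this example.

```python
def acquisition_order(items, site_of):
    """Return the items in the global order in which locks should be requested."""
    # Order by site first, then by item name within a site.
    return sorted(items, key=lambda item: (site_of[item], item))


site_of = {"x": "site2", "y": "site1", "z": "site1"}
print(acquisition_order(["x", "y"], site_of))       # ['y', 'x']
print(acquisition_order(["z", "x", "y"], site_of))  # ['y', 'z', 'x']
```

Both transactions request `y` before `x`, so whichever obtains `y` first can proceed to `x` without the other holding it.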

Page 26: PMIT-6102 Advanced Database Systems

Deadlock Avoidance: Wait-Die & Wound-Wait Algorithms

WAIT-DIE Rule: If Ti requests a lock on a data item which is already locked by Tj, then Ti is permitted to wait iff ts(Ti) < ts(Tj). If ts(Ti) > ts(Tj), then Ti is aborted and restarted with the same timestamp.

- if ts(Ti) < ts(Tj) then Ti waits else Ti dies
- non-preemptive: Ti never preempts Tj
- prefers younger transactions

WOUND-WAIT Rule: If Ti requests a lock on a data item which is already locked by Tj, then Ti is permitted to wait iff ts(Ti) > ts(Tj). If ts(Ti) < ts(Tj), then Tj is aborted and the lock is granted to Ti.

- if ts(Ti) < ts(Tj) then Tj is wounded else Ti waits
- preemptive: Ti preempts Tj if Tj is younger
- prefers older transactions
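Both rules reduce to a single timestamp comparison, where a smaller timestamp means an older transaction. A minimal sketch, with hypothetical function names and string results chosen only for readability:

```python
def wait_die(ts_i, ts_j):
    """Ti (timestamp ts_i) requests a lock held by Tj (timestamp ts_j)."""
    # An older requester waits; a younger requester is aborted ("dies").
    return "Ti waits" if ts_i < ts_j else "Ti dies"


def wound_wait(ts_i, ts_j):
    """Ti (timestamp ts_i) requests a lock held by Tj (timestamp ts_j)."""
    # An older requester aborts ("wounds") the holder; a younger requester waits.
    return "Tj is wounded" if ts_i < ts_j else "Ti waits"


print(wait_die(1, 2), "|", wait_die(2, 1))      # Ti waits | Ti dies
print(wound_wait(1, 2), "|", wound_wait(2, 1))  # Tj is wounded | Ti waits
```

In both schemes the transaction that gets aborted is always the younger one, which is why a restarted transaction keeps its original timestamp and eventually becomes the oldest.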

Page 27: PMIT-6102 Advanced Database Systems

Deadlock Detection

- Transactions are allowed to wait freely.
- Wait-for graphs and cycles.
- Topologies for deadlock detection algorithms:
  - Centralized
  - Distributed
  - Hierarchical

Page 28: PMIT-6102 Advanced Database Systems

Centralized Deadlock Detection

- One site is designated as the deadlock detector for the system. Each scheduler periodically sends its local WFG to the central site, which merges them into a global WFG to determine cycles.
- How often to transmit?
  - Too often: higher communication cost, but lower delays due to undetected deadlocks
  - Too late: higher delays due to deadlocks, but lower communication cost
- Would be a reasonable choice if the concurrency control algorithm is also centralized.
- Proposed for Distributed INGRES.

Page 29: PMIT-6102 Advanced Database Systems

Distributed Deadlock Detection

- Sites cooperate in the detection of deadlocks.
- One example:
  - The local WFGs are formed at each site and passed on to other sites. Each local WFG is modified as follows:
    - Since each site receives the potential deadlock cycles from other sites, these edges are added to the local WFGs.
    - The edges in the local WFG which show that local transactions are waiting for transactions at other sites are joined with edges in the local WFGs which show that remote transactions are waiting for local ones.
  - Each local deadlock detector:
    - looks for a cycle that does not involve the external edge. If it exists, there is a local deadlock which can be handled locally.
    - looks for a cycle involving the external edge. If it exists, it indicates a potential global deadlock. Pass on the information to the next site.
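The two checks made by each local deadlock detector can be sketched as follows. This is a simplified illustration under stated assumptions: all remote transactions are collapsed into one hypothetical node `EX`, the function and parameter names are invented, and the step of passing information on to the next site is not modelled.

```python
EX = "EX"  # hypothetical node standing for all transactions at other sites


def _has_cycle(graph):
    """Depth-first search for a cycle in a directed graph {node: set of successors}."""
    on_path, done = set(), set()

    def visit(n):
        on_path.add(n)
        for m in graph.get(n, ()):
            if m in on_path or (m not in done and visit(m)):
                return True
        on_path.discard(n)
        done.add(n)
        return False

    return any(n not in done and visit(n) for n in list(graph))


def classify(local_edges, waits_for_remote, remote_waits_for):
    """Classify the state seen by one site's deadlock detector.

    local_edges: (Ti, Tj) pairs where local Ti waits for local Tj.
    waits_for_remote: local transactions waiting for a remote transaction.
    remote_waits_for: local transactions that a remote transaction waits for.
    """
    graph = {}
    for a, b in local_edges:
        graph.setdefault(a, set()).add(b)
    if _has_cycle(graph):
        return "local deadlock"  # cycle without the external edge
    for t in waits_for_remote:
        graph.setdefault(t, set()).add(EX)
    for t in remote_waits_for:
        graph.setdefault(EX, set()).add(t)
    if _has_cycle(graph):
        return "potential global deadlock"  # cycle through the external node
    return "no deadlock"


# Site 1 of the earlier example: T1 waits for T2, T2 waits for remote T3,
# and remote T4 waits for T1.
print(classify([("T1", "T2")], ["T2"], ["T1"]))  # potential global deadlock
```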

Page 30: PMIT-6102 Advanced Database Systems


Thank you