Chapter 22

Distributed DBMS Concepts and Design. CS 157B. Edward Chen.

description

Chapter 22. Distributed DBMS Concepts and Design. CS 157B. Edward Chen. Introduction. Distributed databases change the way data is shared, conceptually moving from centralization to decentralization. The development of computer networks promotes a decentralized mode of work.

Transcript of Chapter 22

Page 1: Chapter 22

Chapter 22

Distributed DBMS Concepts and Design

CS 157B

Edward Chen

Page 2: Chapter 22

Introduction

Distributed databases change the way data is shared, conceptually moving from centralization to decentralization.

The development of computer networks promotes a decentralized mode of work.

The development of distributed systems should improve both the ability to share data and the efficiency of data access.

Distributed systems should help resolve the "islands of information" problem.

Page 3: Chapter 22

Concepts

Distributed database

A logically interrelated collection of shared data, and a description of this data, physically distributed over a computer network.

Distributed DBMS

The software system that permits the management of the distributed database and makes the distribution transparent to users.

Page 4: Chapter 22

Concepts (cont’d)

In a distributed DBMS, a single logical database is split into a number of fragments.

Each fragment is stored on one or more computers under the control of a separate DBMS, with the computers connected by a network.

Each site is capable of independently processing user requests that require access to local data, and is also capable of processing data stored on other computers in the network.

Page 5: Chapter 22

Concepts (cont’d)

There are two types of application:

1) Local application: does not require data from other sites

2) Global application: does require data from other sites

A distributed DBMS needs to have at least one global application.
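The local/global distinction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fragment names, the site names, and the `FRAGMENT_SITES` mapping are hypothetical and do not come from the slides.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical placement of fragments at sites (illustrative names only).
FRAGMENT_SITES = {
    "staff_london": "london",
    "staff_glasgow": "glasgow",
    "property_london": "london",
}


@dataclass(frozen=True)
class Application:
    name: str
    home_site: str               # site where the application is issued
    fragments: tuple[str, ...]   # fragments the application reads or writes


def classify(app: Application) -> str:
    """Return 'local' if every fragment is stored at the home site, else 'global'."""
    sites = {FRAGMENT_SITES[f] for f in app.fragments}
    return "local" if sites <= {app.home_site} else "global"


def has_global_application(apps: list[Application]) -> bool:
    """The rule from the slide: a DDBMS needs at least one global application."""
    return any(classify(a) == "global" for a in apps)
```

For example, an application issued at `london` that touches both `staff_london` and `staff_glasgow` is classified as global, because one of its fragments is stored at another site.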

Page 6: Chapter 22

Concepts (cont’d)

A DDBMS has the following characteristics:

· A collection of logically related shared data.
· The data is split into a number of fragments.
· Fragments may be replicated.
· Fragments/replicas are allocated to sites.
· The sites are linked by a computer network.
· The data at each site is under the control of a DBMS.
· The DBMS at each site can handle local applications autonomously.
· Each DBMS participates in at least one global application.

Page 7: Chapter 22

Distributed Database Management System

It is not necessary for every site in the system to have its own local database, as shown.

The system is expected to make the distribution transparent to the user.

The distributed database is split into fragments that can be stored on different computers and perhaps replicated.

The objective of transparency is to make the distributed system appear like a centralized system.

Page 8: Chapter 22

Distributed Processing

In a distributed DBMS, the data is physically distributed across the network. If the data is centralized, then even though users may access it over the network, the system is not considered a distributed DBMS; it is simply distributed processing.

Page 9: Chapter 22

Advantages

· Reflects organizational structure
· Improved shareability and local autonomy
· Improved availability
· Improved reliability
· Improved performance
· Modular growth
· Less danger of single-point failure

Page 10: Chapter 22

Disadvantages

· Complexity
· Cost
· Security
· Integrity control more difficult
· Lack of standards
· Lack of experience
· Database design more complex
· Possible slow response

Page 11: Chapter 22

Homogeneous and Heterogeneous DDBMSs

Homogeneous DDBMS

In a homogeneous DDBMS, all sites use the same DBMS product.

A homogeneous system is much easier to design and manage.

This design supports incremental growth, because adding new sites to the DDBMS is easy.

It also allows increased performance by exploiting the parallel processing capability of multiple sites.

Page 12: Chapter 22

Homogeneous and Heterogeneous DDBMSs (cont’d)

Heterogeneous DDBMS

In a heterogeneous DDBMS, sites may run different DBMS products, which need not be based on the same underlying data model, so the system may be composed of RDBMS, ORDBMS, and OODBMS products.

In a heterogeneous system, translations are required for the different DBMSs to communicate.

To provide DBMS transparency, users must be able to make requests in the language of the DBMS at their local site.

The other sites that hold the requested data may have different hardware, different DBMS products, or a combination of different hardware and DBMS products.

Locating that data and performing any necessary translation are the tasks of the heterogeneous DDBMS.

Page 13: Chapter 22

Component Architecture of a DDBMS

· Local DBMS (LDBMS) component: has its own local system catalog that stores information about the data held at that site.
· Data communications (DC) component: the software that enables all sites to communicate with each other.
· Global system catalog (GSC): holds information specific to the distributed nature of the system, such as the fragmentation and allocation schemas.
· Distributed DBMS (DDBMS) component: the controlling unit of the entire system.
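To make the role of the global system catalog concrete, here is a minimal Python sketch of how fragmentation and allocation schemas could be consulted to decide which sites a request must reach. The relation, fragment, and site names and both function names are hypothetical; the slides do not prescribe any particular data structure or lookup policy.

```python
from __future__ import annotations

# Hypothetical GSC contents; every name below is illustrative.
FRAGMENTATION_SCHEMA = {            # relation -> fragments it is split into
    "Staff": ["Staff_1", "Staff_2"],
}
ALLOCATION_SCHEMA = {               # fragment -> sites holding a copy
    "Staff_1": ["site_A"],
    "Staff_2": ["site_B", "site_C"],  # a replicated fragment
}


def plan_access(relation: str, local_site: str) -> dict[str, str]:
    """Pick one site per fragment, preferring a local copy when one exists."""
    plan = {}
    for fragment in FRAGMENTATION_SCHEMA[relation]:
        sites = ALLOCATION_SCHEMA[fragment]
        plan[fragment] = local_site if local_site in sites else sites[0]
    return plan


def remote_sites(plan: dict[str, str], local_site: str) -> set[str]:
    """Sites that would have to be reached through the data communications component."""
    return {site for site in plan.values() if site != local_site}
```

A request for `Staff` issued at `site_C` would read `Staff_2` locally and contact only `site_A` for `Staff_1`. Falling back to the first listed site is an arbitrary choice made for the sketch; a real system would weigh cost and availability.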

Page 14: Chapter 22

Component Architecture of a DDBMS (cont’d)

Figure: Components of a DDBMS

Page 15: Chapter 22

FRAGMENTATION

Why fragmentation?

· Usage: Applications work with views rather than entire relations.

· Efficiency: Data is stored close to where it is most frequently used.

· Parallelism: With fragments as the unit of distribution, a transaction can be divided into several subqueries that operate on fragments.

· Security: Data not required by local applications is not stored locally, and consequently is not available to unauthorized users.

· Performance (a drawback): Global applications that require data from several fragments located at different sites may run more slowly.

Page 16: Chapter 22

FRAGMENTATION (cont’d)

Types of fragmentation

· Horizontal fragmentation: a subset of the tuples of a relation, defined as σ_p(R), where p is a predicate based on one or more attributes of the relation.

· Vertical fragmentation: a subset of the attributes of a relation, denoted Π_{a1, a2, ..., an}(R), where a1, a2, ..., an are attributes of the relation R.

· Mixed fragmentation: a horizontal fragment that is subsequently vertically fragmented, or a vertical fragment that is then horizontally fragmented.

Page 17: Chapter 22

FRAGMENTATION (cont’d)

Selection σ_p(R): defines a relation that contains only those tuples of R that satisfy the specified condition (predicate p). This is the operation that defines a horizontal fragment.

Projection Π_{a1, a2, ..., an}(R): defines a relation that contains a vertical subset of R, extracting the values of the specified attributes and eliminating duplicates. This is the operation that defines a vertical fragment.
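The selection and projection operators, and the three kinds of fragment built from them, can be sketched in Python over a list of dictionaries. The `STAFF` relation, its attribute names, and the branch predicate are made up for illustration. Repeating the key `staff_no` in each vertical fragment is an assumption of the sketch, made so that the fragments could later be rejoined.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical relation; rows and attribute names are illustrative only.
STAFF = [
    {"staff_no": "S1", "name": "Ann", "branch": "B3", "salary": 30000},
    {"staff_no": "S2", "name": "Bob", "branch": "B5", "salary": 12000},
    {"staff_no": "S3", "name": "Cy", "branch": "B3", "salary": 18000},
]


def select(rows: list[dict], predicate: Callable[[dict], bool]) -> list[dict]:
    """Selection: keep only the tuples that satisfy predicate p."""
    return [row for row in rows if predicate(row)]


def project(rows: list[dict], attrs: list[str]) -> list[dict]:
    """Projection: keep the named attributes and eliminate duplicate tuples."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


# Horizontal fragments: one selection per branch.
staff_b3 = select(STAFF, lambda r: r["branch"] == "B3")
staff_b5 = select(STAFF, lambda r: r["branch"] == "B5")

# Vertical fragments: each keeps the key staff_no (an assumption of this sketch).
staff_pay = project(STAFF, ["staff_no", "salary"])
staff_info = project(STAFF, ["staff_no", "name", "branch"])

# Mixed fragment: a horizontal fragment that is then vertically fragmented.
staff_b3_pay = project(staff_b3, ["staff_no", "salary"])
```

Note that `project(STAFF, ["branch"])` yields two tuples rather than three, because projection eliminates the duplicate `B3` value.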

Page 18: Chapter 22

Summary

A distributed database is a logically interrelated collection of shared data that is physically distributed over a computer network.

A DDBMS is different from a client-server system, even though the client-server architecture can be used to provide distributed DBMSs.

Both top-down and bottom-up design approaches can be used to design a DDBMS.

Page 19: Chapter 22

Summary (cont’d)

A relation may be divided into a number of subrelations called fragments, which may be horizontal, vertical, or mixed.

The three correctness rules of fragmentation are: completeness, reconstruction, and disjointness.
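As a rough illustration, the three correctness rules can be checked mechanically for horizontal fragments. The sketch below assumes each fragment is a list of rows stored as dictionaries with hashable values, and that reconstruction is a union. Vertical fragments would need a join for reconstruction and would be allowed to share key attributes, which this sketch does not cover. All function names are hypothetical.

```python
from __future__ import annotations


def _as_set(rows: list[dict]) -> set[tuple]:
    """Turn rows into a set of hashable (attribute, value) tuples."""
    return {tuple(sorted(row.items())) for row in rows}


def is_complete(relation: list[dict], fragments: list[list[dict]]) -> bool:
    """Completeness: every tuple of R appears in at least one fragment."""
    covered = set().union(*(_as_set(f) for f in fragments))
    return _as_set(relation) <= covered


def is_reconstructible(relation: list[dict], fragments: list[list[dict]]) -> bool:
    """Reconstruction: the union of the fragments gives back exactly R."""
    rebuilt = set().union(*(_as_set(f) for f in fragments))
    return rebuilt == _as_set(relation)


def is_disjoint(fragments: list[list[dict]]) -> bool:
    """Disjointness: no tuple appears in more than one fragment."""
    seen: set[tuple] = set()
    for fragment in fragments:
        rows = _as_set(fragment)
        if seen & rows:
            return False
        seen |= rows
    return True
```

Fragmenting a relation by a predicate and its negation would satisfy all three checks, while leaving one of the two fragments out would fail completeness.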

There are four allocation strategies regarding the placement of data: centralized, partitioned, complete replication, and selective replication.
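The four allocation strategies can be contrasted with a small Python sketch that maps each fragment to the sites holding a copy. The function name, the strategy keys, the `hot` parameter, and the round-robin placement are all assumptions made for illustration. A real design would place fragments according to where they are used.

```python
from __future__ import annotations


def allocate(strategy: str, fragments: list[str], sites: list[str],
             hot: set[str] | None = None) -> dict[str, list[str]]:
    """Return a fragment -> sites mapping for one of the four strategies."""
    if strategy == "centralized":
        # The whole database is held at a single site.
        return {f: [sites[0]] for f in fragments}
    if strategy == "partitioned":
        # Each fragment is stored at exactly one site. Round-robin is a
        # stand-in for a usage-based placement decision.
        return {f: [sites[i % len(sites)]] for i, f in enumerate(fragments)}
    if strategy == "complete":
        # Every site holds a copy of every fragment.
        return {f: list(sites) for f in fragments}
    if strategy == "selective":
        # Only the fragments named in `hot` are replicated everywhere;
        # the rest are stored at a single site each.
        hot = hot or set()
        return {f: list(sites) if f in hot else [sites[i % len(sites)]]
                for i, f in enumerate(fragments)}
    raise ValueError(f"unknown strategy: {strategy}")
```

Calling `allocate("selective", ["F1", "F2", "F3"], ["A", "B"], hot={"F2"})` replicates only `F2` at both sites and leaves `F1` and `F3` with a single copy each.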

Page 20: Chapter 22

The End