13-Unit13

Database Management Systems Unit 13

Sikkim Manipal University Page No.: 214

Unit 13 Distributed Database

Structure

13.1 Introduction to Distributed DBMS Concepts
  Objectives
  Self Assessment Question(s) (SAQs)
13.2 Client-Server Model
  Self Assessment Question(s) (SAQs)
13.3 Data Fragmentation, Replication, and Allocation Techniques for Distributed Database Design
  Self Assessment Question(s) (SAQs)
13.4 Summary
13.5 Terminal Questions (TQs)
13.6 Multiple Choice Questions (MCQs)
13.7 Answers to SAQs, TQs, and MCQs
  13.7.1 Answers to Self Assessment Questions (SAQs)
  13.7.2 Answers to Terminal Questions (TQs)
  13.7.3 Answers to Multiple Choice Questions (MCQs)

13.1 Introduction to Distributed DBMS Concepts

In a centralized database system, all system components, such as data, DBMS software, and storage devices, reside at a single computer or site, whereas in a distributed database system the data is spread over multiple computers connected by a network.

A distributed database (DDB) is thus a set of databases stored on multiple computers that appears to a user as a single database. Data in local and remote databases can be accessed and modified simultaneously over the network. Each database server in the DDB is controlled by its local DBMS, and the servers cooperate to maintain the consistency of the global database.

As a general goal, distributed computing systems divide a big,

unmanageable problem into smaller pieces and solve it efficiently in a

coordinated manner.

Fig. 13.1: Data distribution and replication among distributed databases

Objectives

To know about

o Client-Server Model

o Data fragmentation

o Replication

o Allocation Techniques for Distributed Database Design

Advantages of Distributed Databases

1. Increased reliability and availability: Reliability is broadly defined as the probability that a system is running at a certain point in time, whereas availability is the probability that the system is continuously available during a time interval. When the data and DBMS software are distributed over several sites, one site may fail while the other sites continue to operate. Only the data and software that exist at the failed site cannot be accessed. In a centralized system, failure at a single site makes the whole system unavailable to all users.

2. Improved performance: A large database is divided into smaller databases, keeping each piece of data at the site where it is needed most. This data localization reduces contention for CPU and I/O services and also reduces the access delays of a wide area network. Because a smaller database exists at each site, local queries and transactions that access data at a single site perform better. Parallel query processing is also possible: a single large transaction can be divided into a number of smaller transactions that execute at different sites at the same time.

3. Data sharing: Users at remote sites can access data through the distributed database management system (DDBMS) software.

4. Transparency: Ideally, a distributed database should be distribution transparent, hiding the details of where each file is physically stored within the system. It also provides network transparency: the command used to perform a task is independent of both the location of the data and the location of the system where the command was issued.

5. Easier expansion: In a distributed environment, expansion of the

system in terms of adding more data, increasing database size, or

adding more processors is much easier.

Additional Functions of Distributed Databases:

A DDBMS performs the following functions in addition to those of a centralized DBMS.

1. Distributed query processing: Distributed query processing means the

ability to access remote sites and transmit queries and data among the

various sites via the communication network.

2. Data tracking: The DDBMS should be able to keep track of data distribution, fragmentation, and replication by maintaining the DDBMS catalog.

3. Distributed transaction management: The DDBMS must be able to execute transactions that access data from more than one site, synchronizing access to distributed data and maintaining the integrity of the overall database.

4. Distributed database recovery: The ability to recover from individual site crashes and from new types of failures, such as the failure of a communication link.

5. Security: Distributed transactions must be executed with proper management of the security of the data and the authorization/access privileges of the users.

6. Distributed directory (catalog) management: A directory contains information (metadata) about data in the database. The directory may be global for the entire DDB or local for each site. The placement and distribution of the directory are design and policy issues.

These functions increase the complexity of a DDBMS over a centralized

DBMS.

Self Assessment Question(s) (SAQs) (For Section 13.1)

1. Define distributed database system

2. What are the advantages of Distributed database systems?

13.2 Client-Server Model

The client-server model is basic to distributed systems. It allows clients to make requests that are routed to the appropriate server in the form of transactions. The client-server model consists of three parts.

1. Client - The client is the machine (workstation or PC) running the front-end applications. It interacts with the user through the keyboard, display, and mouse. The client has no direct data access responsibilities; it provides the front-end application software for accessing the data on the server. The client initiates transactions, and the server processes them.

Interaction between client and server might be processed as follows during

processing of an SQL query.

1. The client parses a user query and decomposes it into a number of independent site queries. Each site query is sent to the appropriate server site.

2. Each server processes the local query and sends the resulting relation

to the client site.

3. The client site combines the results of the queries to produce the result

of the originally submitted query.
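
The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not how a real DDBMS is implemented: the site names, the sample Employee rows, and the helper functions `server_execute` and `client_query` are all hypothetical, and the "network" is just an in-process function call.

```python
# Hypothetical data: each "server site" holds some Employee tuples.
SITES = {
    "site_a": [{"name": "Asha", "dno": 10, "salary": 42000}],
    "site_b": [{"name": "Binod", "dno": 20, "salary": 38000},
               {"name": "Chitra", "dno": 20, "salary": 51000}],
    "site_c": [{"name": "Deepak", "dno": 30, "salary": 45000}],
}


def server_execute(site, predicate):
    """Step 2: a server evaluates the site query on its local data only."""
    return [row for row in SITES[site] if predicate(row)]


def client_query(predicate):
    """Steps 1 and 3: decompose into site queries, then combine the results."""
    # Step 1: one independent site query per server (here, the same predicate).
    site_queries = {site: predicate for site in SITES}
    # Step 2: each server returns its resulting relation to the client.
    partial_results = [server_execute(site, q) for site, q in site_queries.items()]
    # Step 3: the client combines the partial results (a simple union here).
    combined = [row for rows in partial_results for row in rows]
    return sorted(combined, key=lambda row: row["name"])


high_paid = client_query(lambda row: row["salary"] > 40000)
print([row["name"] for row in high_paid])
```

The user of `client_query` never names a site, which is the distribution transparency this section describes.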

So the server is called the database processor or back-end machine, whereas the client is called the application processor or front-end machine.

Another function controlled by the client is ensuring the consistency of replicated copies of a data item by using distributed concurrency control techniques. The client must also ensure the atomicity of global transactions by performing global recovery when certain sites fail. Finally, the client provides distribution transparency; that is, it hides the details of data distribution from the user.

2. Server - The server is the machine that runs the DBMS software. It is referred to as the back end. The server processes SQL and other query statements received from client applications. It can have large disk capacity and fast processors.

3. Network - The network enables remote data access through client-server and server-to-server communication.

Each computer in a network is a node and can act as a client, a server, or both, depending on the situation.

Advantages:

Client applications are not dependent on the physical location of the data. If the data is moved or distributed to other database servers, the application continues to function with little or no modification.

Servers provide multitasking and shared-memory facilities; as a result, they can deliver a high degree of concurrency and data integrity.

In a networked environment, shared data is stored on the servers rather than on all computers in the system. This makes it easier and more efficient to manage concurrent access. Inexpensive, low-end client workstations can access the remote data of the server effectively.

Self Assessment Question(s) (SAQs) (For Section 13.2)

1. Explain the concept of Client server model.

13.3 Data Fragmentation, Replication, and Allocation Techniques for Distributed Database Design

Data fragmentation: Fragmentation techniques break up the database into logical units, called fragments, that may be assigned for storage at the various sites. In a DDBMS, decisions must be made regarding which site should store which portions of the database. There are three types of fragmentation:

1. Horizontal fragmentation: Horizontal fragmentation divides a relation "horizontally" by grouping rows to create subsets of tuples, where each subset has a certain logical meaning. These fragments can then be assigned to different sites in the distributed system. For example, we may divide the Employee relation into three horizontal fragments with the conditions (DNO=10), (DNO=20), and (DNO=30); each fragment contains the Employee tuples working for a particular department.

2. Vertical fragmentation: Vertical fragmentation divides a relation "vertically" by columns, so each fragment keeps only certain attributes of the relation. For example, we may fragment the Employee relation into two vertical fragments: the first includes personal information (Name, Bdate, Address) and the second includes work-related information (SSN, Salary, Mgr no, etc.). To allow the full relation to be reconstructed, each vertical fragment should also include the primary key of the relation.

3. Mixed fragmentation: Mixing of horizontal and vertical fragmentation is

called mixed fragmentation.
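
The three kinds of fragmentation can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the sample Employee rows are invented, relations are plain lists of dictionaries, and the helper names (`horizontal_fragment`, `vertical_fragment`, `join_on_key`) are hypothetical. SSN is assumed to be the primary key and is kept in both vertical fragments so that the relation can be rebuilt.

```python
# Hypothetical Employee relation; "ssn" is assumed to be the primary key.
EMPLOYEE = [
    {"ssn": "111", "name": "Asha", "bdate": "1990-01-05", "address": "Gangtok",
     "salary": 42000, "mgr_no": "900", "dno": 10},
    {"ssn": "222", "name": "Binod", "bdate": "1988-07-19", "address": "Namchi",
     "salary": 38000, "mgr_no": "901", "dno": 20},
    {"ssn": "333", "name": "Chitra", "bdate": "1992-03-30", "address": "Gangtok",
     "salary": 51000, "mgr_no": "901", "dno": 20},
    {"ssn": "444", "name": "Deepak", "bdate": "1985-11-11", "address": "Mangan",
     "salary": 45000, "mgr_no": "902", "dno": 30},
]


def horizontal_fragment(relation, condition):
    """Keep whole rows that satisfy the condition (a selection)."""
    return [row for row in relation if condition(row)]


def vertical_fragment(relation, attributes):
    """Keep only the listed columns of every row (a projection)."""
    return [{a: row[a] for a in attributes} for row in relation]


def join_on_key(left, right, key):
    """Rebuild rows from two vertical fragments that share the key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left]


# Horizontal: one fragment per department; their union is the relation.
dno10 = horizontal_fragment(EMPLOYEE, lambda r: r["dno"] == 10)
dno20 = horizontal_fragment(EMPLOYEE, lambda r: r["dno"] == 20)
dno30 = horizontal_fragment(EMPLOYEE, lambda r: r["dno"] == 30)
rebuilt_h = dno10 + dno20 + dno30

# Vertical: personal and work-related columns, both carrying the key.
personal = vertical_fragment(EMPLOYEE, ["ssn", "name", "bdate", "address"])
work = vertical_fragment(EMPLOYEE, ["ssn", "salary", "mgr_no", "dno"])
rebuilt_v = join_on_key(personal, work, "ssn")

# Mixed: a vertical fragment taken from a horizontal fragment.
mixed = vertical_fragment(dno10, ["ssn", "name"])
print(mixed)
```

Horizontal fragments are recombined with a union and vertical fragments with a join on the key, which is why the key has to appear in every vertical fragment.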

Data Replication and Allocation: Replication is useful in improving the availability of data. The most extreme case is replication of the whole database at every site in the distributed system, which creates a fully replicated database. This can improve availability because the system can continue to operate as long as at least one site is up. It also improves retrieval performance for global queries, because the result of such a query can be obtained locally from any one site. The disadvantage is that it can slow down update operations, since a single logical update must be performed on every copy of the database to keep the copies consistent. Full replication also makes concurrency control and recovery techniques more expensive.
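
The trade-off described above can be made concrete with a rough calculation. The sketch below rests on simplifying assumptions that are not part of this unit: each site is up independently with the same probability, a read needs only one live copy, and an update writes every copy. The function names `availability` and `writes_per_update` are hypothetical.

```python
def availability(p_site_up, copies):
    """Probability that at least one of `copies` sites is up.

    Assumes sites fail independently, each being up with probability p_site_up.
    """
    return 1 - (1 - p_site_up) ** copies


def writes_per_update(copies):
    """Physical writes needed for one logical update when every copy is updated."""
    return copies


# Compare no replication (1 copy) with 2 and 4 copies, for sites that are up 90% of the time.
for n in (1, 2, 4):
    print(n, round(availability(0.9, n), 4), writes_per_update(n))
```

Under these assumptions, adding copies pushes read availability toward 1, while the cost of each update grows in proportion to the number of copies.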

The other extreme from full replication is no replication; that is, each fragment is stored at only one site. Between the two lies partial replication, in which some fragments of the database are replicated and others are not. For example, some people carry partially replicated databases with them on laptops.

Allocation: Each copy of a fragment must be assigned to a particular site in

the distributed system. This process is called data distribution or allocation.
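
An allocation can be pictured as a table that maps each fragment to the sites holding a copy of it. The following Python sketch is an illustration only: the fragment names, site names, and the functions `sites_for` and `replication_scheme` are hypothetical, and a real DDBMS would keep this information in its catalog.

```python
# Hypothetical allocation: fragment name -> set of sites that store a copy.
ALL_SITES = {"site_a", "site_b", "site_c"}
ALLOCATION = {
    "EMP_DNO10": {"site_a"},                      # stored once
    "EMP_DNO20": {"site_b", "site_c"},            # two copies
    "EMP_DNO30": {"site_a", "site_b", "site_c"},  # a copy at every site
}


def sites_for(fragment):
    """Return the sites where a copy of the fragment can be found."""
    return sorted(ALLOCATION[fragment])


def replication_scheme(allocation, all_sites):
    """Classify an allocation as full, partial, or no replication."""
    if all(sites == all_sites for sites in allocation.values()):
        return "full replication"
    if all(len(sites) == 1 for sites in allocation.values()):
        return "no replication"
    return "partial replication"


print(sites_for("EMP_DNO20"))
print(replication_scheme(ALLOCATION, ALL_SITES))
```

Choosing the contents of such a table, that is, which fragments to copy and where to place the copies, is the design decision that allocation refers to.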

Types of Distributed DB Systems:

In a DDB, the data and software are distributed over multiple sites connected by a network. DDBMSs can be categorized according to several factors.

The first factor is the degree of homogeneity of the DDBMS software. If all servers (or individual local DBMSs) use identical software and all users use identical software, the DDBMS is called homogeneous; otherwise, it is called heterogeneous. A related factor is the degree of local autonomy. At the high-autonomy extreme is the federated DDBMS (FDBS), or multidatabase system, in which each server is an independent DBMS with its own local users, local programmers, and DBA. In a heterogeneous FDBS, one server may be a relational DBMS, another a network DBMS, and a third a hierarchical DBMS. In such a case, it is necessary to have a canonical system language and language translators to translate the canonical language into the language of each server.

Self Assessment Question(s) (SAQs) (For Section 13.3)

1. What do you mean by data fragmentation? Explain different types.

2. Explain the concept of data replication and allocation.

13.4 Summary

In this unit we have learnt concepts such as

o Client-Server Model

o Data fragmentation

o Replication

o Allocation Techniques for Distributed Database Design

13.5 Terminal Questions (TQs)

1. Discuss briefly the advantages of distributed databases.

2. Discuss Data fragmentation, Replication, and Allocation Techniques for

Distributed Database Design.

13.6 Multiple Choice Questions (MCQs)

1. In ………… all system components, such as data, DBMS software, and storage devices, reside at a single computer or site.
a) A centralized database system
b) A distributed database system
c) Client and server architecture
d) None of the above

2. In ………… data is spread over multiple computers connected by a network.
a) A centralized database system
b) A distributed database system
c) Client and server architecture
d) None of the above

3. ………… is the machine (workstation or PC) running the front-end applications.
a) Server
b) Client
c) Client and server
d) None of the above

4. ………… enables remote data access through client-server and server-to-server communication.
a) The network
b) Client
c) Server
d) None of the above

13.7 Answers to SAQs, TQs, and MCQs

13.7.1 Answers to Self Assessment Questions (SAQs)

For Section 13.1

1. In a distributed database system, data is spread over multiple computers connected by a network. (Refer section 13.1)

2. Increased reliability and availability, Improved performance, Data

sharing, Transparency, Easier expansion (Refer section 13.1)

For Section 13.2

1. The client-server model is basic to distributed systems. It allows clients to make requests that are routed to the appropriate server in the form of transactions. (Refer section 13.2)

For Section 13.3

1. Data fragmentation: Fragmentation techniques break up the database into logical units, called fragments, that may be assigned for storage at the various sites. (Refer section 13.3)

2. Data Replication and Allocation: Replication is useful in improving the

availability of data. Each copy of a fragment must be assigned to a

particular site in the distributed system. This process is called data

distribution or allocation. (Refer section 13.3)

13.7.2 Answers to Terminal Questions (TQs)

1. Increased reliability and availability: Reliability is broadly defined as the probability that a system is running at a certain point in time, whereas availability is the probability that the system is continuously available during a time interval. (Refer section 13.1)

2. Data fragmentation: Techniques that are used to break up the database

into logical units called fragments, that may be assigned for storage at

the various sites. (Refer section 13.3)

13.7.3 Answers to Multiple Choice Questions (MCQs)

1. A

2. B

3. B

4. A