DISTRIBUTED DATABASES

JORGE POMBAR

Overview

Most businesses need to support databases at multiple sites.

We need a single application that can access multiple databases.

The goal is for the client to connect to a single server that can issue queries that affect all databases.

Problems

Some problems come up:

Each application must know and exploit the distribution of data in all of the multiple databases.

Also, the DBMS is responsible for maintaining consistency among all databases.

Solution

We consider these individual databases to be part of a larger distributed database.

Distributed databases are created by allowing servers to interact.

We create a server-to-server system.

Principles of Distributed Databases

A distributed database is a collection of databases that are related logically but are separated physically.

The DBMS needs to be able to use a single database connection to access and modify all of the distributed data.

It has a single schema whose tables can be distributed over many different database servers.
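The idea of one schema spread over several servers can be sketched with a small lookup catalog. This is only an illustration: the names `CATALOG`, `SERVERS`, `select_all`, the table names, and the server names are all hypothetical, and plain dictionaries stand in for physically separate databases.

```python
# Hypothetical catalog: one logical schema, each table placed on some server.
CATALOG = {
    "customers": "server_a",
    "orders": "server_b",
    "inventory": "server_b",
}

# Hypothetical per-server storage standing in for physically separate databases.
SERVERS = {
    "server_a": {"customers": [{"id": 1, "name": "Ana"}]},
    "server_b": {
        "orders": [{"id": 10, "customer_id": 1}],
        "inventory": [{"sku": "X1", "qty": 4}],
    },
}


def select_all(table):
    """Return all rows of a table; the caller never names a server."""
    server = CATALOG[table]          # look up where the table lives
    return SERVERS[server][table]    # fetch the rows from that server
```

A caller asks for `select_all("orders")` without knowing that the rows live on `server_b`, which is the effect a single schema is meant to give.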

Distribution of tables

Unfragmented: A table exists on one database and different tables are in different databases.

Horizontally fragmented: The rows of the table appear in multiple databases. Each row only appears in one database.

Vertically fragmented: The columns of a table appear in multiple databases. Only the key columns are duplicated.

Replicated: Some or all of the rows and columns are stored in more than one database.

No matter how the tables are distributed, the user must be presented with a schema that makes the distributed database look like a single database.
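Horizontal and vertical fragmentation, and the way a single table can be rebuilt from the fragments, can be sketched as follows. The sample `customers` data and all function names are hypothetical, and lists of dictionaries stand in for tables; a real DBMS would do this with its own partitioning and join machinery.

```python
# Hypothetical example data: a small "customers" table keyed by id.
customers = [
    {"id": 1, "name": "Ana", "branch": "north", "balance": 120},
    {"id": 2, "name": "Luis", "branch": "south", "balance": 75},
    {"id": 3, "name": "Mei", "branch": "north", "balance": 310},
]


def fragment_horizontally(rows, column):
    """Split rows by the value of one column; each row lands in exactly one fragment."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def fragment_vertically(rows, key, column_groups):
    """Split columns into fragments; only the key column is duplicated in each."""
    return [
        [{key: row[key], **{c: row[c] for c in cols}} for row in rows]
        for cols in column_groups
    ]


def merge_horizontal(fragments, key):
    """Rebuild the table as the union of the row fragments."""
    rows = [row for frag in fragments.values() for row in frag]
    return sorted(rows, key=lambda r: r[key])


def merge_vertical(fragments, key):
    """Rebuild the table by joining the column fragments on the key."""
    merged = {}
    for frag in fragments:
        for row in frag:
            merged.setdefault(row[key], {}).update(row)
    return sorted(merged.values(), key=lambda r: r[key])


by_branch = fragment_horizontally(customers, "branch")
by_columns = fragment_vertically(customers, "id", [["name", "branch"], ["balance"]])

# Either kind of fragmentation can be undone to present one logical table.
assert merge_horizontal(by_branch, "id") == customers
assert merge_vertical(by_columns, "id") == customers
```

The two `merge_*` functions play the role of the schema that hides the distribution: the user sees `customers` as one table regardless of how it was split.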

How it works

We have a central database server and a local database on each branch.

The client application connects to the local database server, which in turn connects to the other database servers in order to access data.

Any modification of data requires modification to all connected databases.

The client perceives that it is only interacting with a single, local database.
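The flow described above can be sketched with a toy in-memory model. The class `DatabaseServer` and its methods are hypothetical and exist only for illustration; the sketch ignores network failures and does not make updates atomic, both of which a real distributed DBMS has to handle.

```python
class DatabaseServer:
    """Hypothetical in-memory stand-in for one database server."""

    def __init__(self, name):
        self.name = name
        self.rows = {}    # key -> value stored at this site
        self.peers = []   # other servers this one can reach

    def connect(self, other):
        """Create a two-way server-to-server link."""
        self.peers.append(other)
        other.peers.append(self)

    def read(self, key):
        """Serve from local data if possible, otherwise ask the connected servers."""
        if key in self.rows:
            return self.rows[key]
        for peer in self.peers:
            if key in peer.rows:
                return peer.rows[key]
        raise KeyError(key)

    def write(self, key, value):
        """Apply the change locally and on every connected server."""
        self.rows[key] = value
        for peer in self.peers:
            peer.rows[key] = value


central = DatabaseServer("central")
branch = DatabaseServer("branch")
branch.connect(central)

# Data that exists only on the central server is still readable via the branch.
central.rows["item-7"] = "in stock"
assert branch.read("item-7") == "in stock"

# A write issued at the branch is propagated to the central server.
branch.write("item-9", "sold out")
assert central.rows["item-9"] == "sold out"
```

The client code only ever talks to `branch`, which matches the point that the client appears to work with a single, local database.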

Advantages of Distributed Databases

Autonomy and availability of data: Each site can always access its local data even if the network connection is down.

Independence of physical and logical layout: Changes to the layout are not the responsibility of client programs. They are handled by the database servers.

Physical locality: A database application that only needs part of the whole database can access its data locally, which gives a faster response time.

Improved performance: Data is distributed over many computers, so a query can execute faster because it uses the processing power of multiple computers.

Real world applications

Large retail chains: Best Buy, Macy's, Circuit City, etc.

Video rental stores.

Sharing academic research data.

References

Riccardi, G. (2001). Principles of Database Systems with Internet and Java Applications. Boston, MA: Addison Wesley.

Dye, Charles (1999, April). Oracle Distributed Systems. Retrieved May 8, 2008, from O'Reilly Web site: http://www.oreilly.com/catalog/ordistsys/chapter1