Term 2, 2004, Lecture 9, Distributed Databases, Marian Ursu, Department of Computing, Goldsmiths...
Term 2, 2004, Lecture 9, Distributed Databases Marian Ursu, Department of Computing, Goldsmiths College
Distributed databases
3
Outline
generalities
objectives
problems
1
Introduction
[Figure: several servers connected by a communication network; each server has its own applications and is a DBMS in its own right]
Introduction
distributed database = collection of connected sites
each site is a DB in its own right (1)
• has its own DBMS and its own users
• operations can be performed locally, as if the DB were not distributed
the sites collaborate (transparently from the user’s point of view)
the union of all DBs = the DB of the whole organisation (institution)
• (as opposed to (1))
physical or logical distribution
strict homogeneity (assumption)
Motivation
advantages:
matches the structure of the organisation
• example
efficiency of processing
• data is stored close to where it is being used
increased accessibility
• remote DBs can be accessed
disadvantage:
complexity
Implementations (systems)
commercial:
• ORACLE (Oracle Corporation)
• INGRES/STAR (Ask Group Inc., Ingres Division)
• DB2 (IBM)
they all provide some features for distributed databases
Fundamental principle
a distributed DB system should look to the user exactly like a non-distributed DB system
2
Objectives
local autonomy
no reliance on central site
location independence
fragmentation independence
replication independence
distributed query processing
distributed transaction management
Objectives are:
• not independent of each other
• not exhaustive
• sometimes contradictory
• of different degrees of importance (for the user)
Local autonomy
all operations at a certain site are fully controlled by that site
• not achievable (why?)
• therefore, autonomy should be achieved to the maximum extent possible
local data is locally owned and managed
• local data belongs to the local server even if it is accessible from other servers
• security, integrity, ..., are the responsibility of the local server
No reliance on a central site
reasons:
• bottleneck
• vulnerability
conclusion:
• all sites must be equal
Location independence
users should not have to know where data is physically stored
why do you think this is needed?
• think of application programs
what does this objective look like?
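One way to picture this objective is a name lookup that hides the site. The following is only a sketch: `CATALOGUE`, `SITES`, `fetch` and `move` are hypothetical names invented for illustration, and the in-memory dictionaries stand in for remote DBMSs.

```python
# Hypothetical catalogue: relation name -> site that stores it.
CATALOGUE = {"Emp": "London", "Dept": "Leeds"}

# Hypothetical per-site storage, standing in for remote DBMSs.
SITES = {
    "London": {"Emp": [("e1", "Sales"), ("e2", "Dev")]},
    "Leeds": {"Dept": [("Sales",), ("Dev",)]},
}


def fetch(relation):
    """Return the tuples of `relation`; the caller never names a site."""
    site = CATALOGUE[relation]  # location is resolved by the system
    return SITES[site][relation]


def move(relation, new_site):
    """Relocate a relation; code that calls fetch() does not change."""
    old_site = CATALOGUE[relation]
    SITES.setdefault(new_site, {})[relation] = SITES[old_site].pop(relation)
    CATALOGUE[relation] = new_site


before = fetch("Emp")
move("Emp", "Leeds")
after = fetch("Emp")  # same call, same answer, different site
```

The application program only ever calls `fetch("Emp")`, so moving the data does not force a change to the program.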
Data fragmentation
data fragmentation
• a relation can be divided into “fragments” for storage purposes
• motivation: performance - data is stored where it is mostly used
definition
• fragment = any subrelation derivable via restriction or projection
Data fragmentation - example
FRAGMENT Emp INTO
    Lo_Emp AT SITE 'London' WHERE Dept_id = 'Sales'
    Le_Emp AT SITE 'Leeds' WHERE Dept_id = 'Dev' ;
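The same fragmentation can be sketched in Python. This is an illustration only: the sample `emp` tuples and the `restrict` helper are assumptions, not part of the lecture, and only horizontal fragmentation (restriction) is shown.

```python
# Hypothetical Emp relation: (Emp_id, Dept_id) tuples.
emp = [("e1", "Sales"), ("e2", "Dev"), ("e3", "Sales"), ("e4", "Dev")]


def restrict(relation, predicate):
    """Relational restriction: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Horizontal fragments, one per site, mirroring the FRAGMENT statement.
fragments = {
    "London": restrict(emp, lambda row: row[1] == "Sales"),  # Lo_Emp
    "Leeds": restrict(emp, lambda row: row[1] == "Dev"),     # Le_Emp
}

# The original relation is recoverable as the union of its fragments.
reconstructed = sorted(fragments["London"] + fragments["Leeds"])
```

Because the two predicates are disjoint and together cover every tuple in the sample data, the union reconstructs `emp` without loss or duplication.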
Fragmentation independence / transparency
users should perceive data as if it were not fragmented
why?
it is the optimiser’s responsibility to determine which fragments need to be physically accessed
similar to views:
• retrieving
• updating (JOIN and UNION views)
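A rough sketch of the optimiser's job, assuming the Emp fragmentation by Dept_id from the example: given the restriction in a query, decide which fragments can contain matching tuples. `FRAGMENTS` and `fragments_needed` are hypothetical names; a real optimiser works on general predicates, not a single equality.

```python
# Hypothetical fragmentation schema: fragment -> (site, Dept_id value it holds).
FRAGMENTS = {
    "Lo_Emp": ("London", "Sales"),
    "Le_Emp": ("Leeds", "Dev"),
}


def fragments_needed(dept_id=None):
    """Return the fragments a query on Emp must physically access.

    dept_id=None models a query with no restriction on Dept_id,
    which has to read every fragment.
    """
    return sorted(
        name
        for name, (_site, held) in FRAGMENTS.items()
        if dept_id is None or held == dept_id
    )
```

The user writes the query against Emp in both cases; only the set of fragments touched differs.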
Data replication
copies of the same fragment can exist at different sites
reasons:
• better availability
• better performance
disadvantage:
• update propagation
Replication independence / transparency
users should not have to be aware of data replication
it is the optimiser’s responsibility to choose which replica to use
commercial systems:
• no full support for replication independence (update problems) - primary copy
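Replica choice for reads can be sketched as picking the cheapest reachable copy. Everything here is an assumption for illustration: the `REPLICAS` and `COST` tables, the cost figures, and the `choose_replica` function.

```python
# Hypothetical replica catalogue: fragment -> sites holding a copy.
REPLICAS = {"Lo_Emp": ["London", "Leeds"]}

# Hypothetical access cost from the querying site (e.g. round trip in ms).
COST = {"London": 5, "Leeds": 40}


def choose_replica(fragment, available):
    """Pick the cheapest reachable site that holds a copy of `fragment`."""
    candidates = [site for site in REPLICAS[fragment] if site in available]
    if not candidates:
        raise LookupError(f"no reachable replica of {fragment}")
    return min(candidates, key=COST.__getitem__)
```

With both sites up the cheaper copy is read (better performance); with London down the Leeds copy still answers (better availability).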
Distributed query processing
the system must have set-level operators
• one record at a time - too many messages (traffic)
• relational systems are therefore indicated
optimisation is particularly relevant!
• find the best way to move data across the network
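A back-of-the-envelope message count shows why record-at-a-time access is costly. The model is a deliberate simplification and an assumption: one request plus one reply per remote call, and a result set that fits in a single reply.

```python
def messages_record_at_a_time(n_matching):
    """One request and one reply for every record fetched."""
    return 2 * n_matching


def messages_set_level(n_matching):
    """One request carrying the whole query, one reply carrying the result set."""
    return 2


n = 1000  # hypothetical number of qualifying records
ratio = messages_record_at_a_time(n) / messages_set_level(n)
```

Under these assumptions the record-at-a-time interface sends `n` times as many messages as the set-level one.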
3
Problems
occur due to network utilisation
aim: minimise network utilisation
query processing
catalogue management
update propagation
recovery control
concurrency control
Query processing
in a distributed environment:
• query execution is distributed
• query optimisation is distributed: global optimisation and local optimisation
example:
• query on relation R issued at site X
• part of R, say Ry, stored at Y
• part of R, say Rz, stored at Z
• where is the query going to be executed?
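To make the question concrete, two strategies can be compared by the number of tuples shipped over the network. This is a sketch under stated assumptions: the fragment sizes and selectivity are invented, the query is a simple restriction, and only data volume is counted (no message or CPU costs).

```python
def tuples_shipped_move_data(fragment_sizes):
    """Strategy A: ship Ry and Rz whole to X and evaluate the query there."""
    return sum(fragment_sizes.values())


def tuples_shipped_move_query(fragment_sizes, selectivity):
    """Strategy B: evaluate the restriction at Y and Z, ship only the matches."""
    return sum(round(size * selectivity) for size in fragment_sizes.values())


sizes = {"Y": 10_000, "Z": 50_000}  # hypothetical |Ry| and |Rz|
selectivity = 0.01                   # hypothetical fraction of matching tuples

cost_a = tuples_shipped_move_data(sizes)
cost_b = tuples_shipped_move_query(sizes, selectivity)
```

With a selective restriction, executing at Y and Z and shipping only results moves far less data; a global optimiser would weigh such alternatives, while each site still optimises its local part.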
Catalogue management
what ‘other’ data does the catalogue include?
• fragmentation, replication ...
where should the catalogue be stored?
centralised
fully replicated
• loss of autonomy - update propagation!
partitioned
• non-local operations - very expensive!
combination of first and third
Central Catalogue
all updates, including local updates, have to be recorded in the central catalogue
disadvantages:
• bottleneck
• conflicts with the “no reliance on a central site” objective
Fully Replicated Catalogue
the entire database catalogue (not only the local one) is stored at each site
every time an update is made, it has to be recorded at each site
disadvantages:
• loss of local autonomy
• updates consume time and network traffic
Update propagation
problems because of replication
• data might become less available
primary copy scheme
• one copy is designated primary copy (unique)
• primary copies exist at different sites (distributed)
• an update is logically complete if the primary copy has been updated
• the site holding the primary copy would have to propagate the updates
violation of local autonomy
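The primary copy scheme can be sketched as follows. The class, its method names, and the site "York" are hypothetical and only illustrate the idea: an update counts as done once the primary copy is written, and the primary's site pushes it to the other copies later.

```python
class ReplicatedItem:
    """One data item with a copy at several sites; one copy is primary."""

    def __init__(self, sites, primary, value):
        assert primary in sites
        self.primary = primary
        self.copies = {site: value for site in sites}
        self.pending = []  # (site, value) updates not yet propagated

    def update(self, value):
        """Logically complete as soon as the primary copy is written."""
        self.copies[self.primary] = value
        for site in self.copies:
            if site != self.primary:
                self.pending.append((site, value))

    def propagate(self, reachable):
        """The primary's site pushes queued updates to the sites it can reach."""
        still_pending = []
        for site, value in self.pending:
            if site in reachable:
                self.copies[site] = value
            else:
                still_pending.append((site, value))
        self.pending = still_pending


item = ReplicatedItem(["London", "Leeds", "York"], primary="London", value=1)
item.update(2)                       # succeeds although York is unreachable
item.propagate(reachable={"Leeds"})  # York stays stale until it is reachable
```

The update does not wait for every copy, which helps availability, but a site may not update its own local copy unless the remote primary is reachable, which is where local autonomy is lost.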
Concurrency control
locking
• overhead - increased number of messages
primary copy strategy
• locking only the primary copy
• the primary copy’s site will propagate the update
• severe loss of autonomy
global deadlock
• two interlocked (waiting for each other) sites
• cannot be detected using a single site's wait-for graph - therefore, communication overhead
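A small sketch of why a global deadlock escapes local detection. The transactions T1 and T2, the sites X and Y, and the `has_cycle` helper are assumptions for illustration; each wait-for graph is given as (waiter, holder) pairs.

```python
def has_cycle(edges):
    """Detect a cycle in a wait-for graph given as (waiter, holder) pairs."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)

    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node already on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))


# Hypothetical transactions, each holding a lock at one site
# and waiting for a lock at the other.
site_x = [("T1", "T2")]  # at X, T1 waits for T2
site_y = [("T2", "T1")]  # at Y, T2 waits for T1

local_x = has_cycle(site_x)
local_y = has_cycle(site_y)
global_deadlock = has_cycle(site_x + site_y)
```

Neither local graph contains a cycle; the cycle appears only once the graphs are combined, and combining them is what costs the extra messages.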
Conclusion
generalities
objectives – in brief
problems – in brief