Data Management for Peer-to-Peer Computing: A Vision
Ali Rahbari
Outline
• P2P Data Networks
• Why P2P Databases are Different
• A P2P Database Scenario
• A logic for P2P Databases
• Propagation Strategy
• Architecture and Implementation Issues
P2P Data Networks: Basic Notions
• Node
– Database, File System, etc
• P2P network
– Indexed nodes with equal participant rights
• Services
– Query answering
– Query, results and update propagation
• Locality
– No global schema, no centralized control
– Nodes have only a partial vision of the world
• Autonomy
– Nodes are largely independent in their language, content, etc.
Roles for P2P DBs?
• Peers come and go, but must still be able to interoperate.
• To us, the big question is how to cope with DBs that
– are incomplete, overlapping, and mutually inconsistent
– dynamically appear and disappear
– have limited connectivity.
• Scenario
– Databases of medical patients
– Complete integration is likely to be infeasible
– But dynamic integration of DBs relevant to one patient could have high value.
A Model for P2P Databases
• Each peer is a node with a database. It exchanges data
and services with acquaintances (i.e. other peers).
• The set of acquaintances changes often, due to
– site availability
– changing usage patterns
• Peers are fully autonomous.
– No global control or central server.
H: Hospital
P: Pharmacist
D: Doctor
A Motivating Scenario
A patient may be described in several DBs, which can use
different patient id formats, disease descriptions,
etc.
1. When a patient is admitted
to the hospital, H becomes acquainted with D
2. The acquaintance is dropped when treatment is
over
3. When the doctor prescribes a drug, D becomes
acquainted with P
4. A patient is injured skiing, so more DBs get
involved (e.g. the Ski Clinic's)
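The changing acquaintances in the scenario above can be sketched in a few lines of Python. This is only an illustration: the Peer class, its method names, and the choice to treat an acquaintance as mutual are assumptions, not part of the slides.

```python
class Peer:
    """A node with a local database and a changing set of acquaintances."""

    def __init__(self, name):
        self.name = name
        self.acquaintances = set()

    def acquaint(self, other):
        # Assumption: an acquaintance is recorded on both sides.
        self.acquaintances.add(other.name)
        other.acquaintances.add(self.name)

    def drop(self, other):
        # discard() is a no-op if the acquaintance was never established.
        self.acquaintances.discard(other.name)
        other.acquaintances.discard(self.name)


H, D, P = Peer("H"), Peer("D"), Peer("P")
H.acquaint(D)  # step 1: the patient is admitted to the hospital
D.acquaint(P)  # step 3: the doctor prescribes a drug
H.drop(D)      # step 2: treatment is over

print(sorted(D.acquaintances))  # ['P']
```

No central registry is involved: each peer only keeps its own acquaintance set.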
Proposal: Local Relational Model (LRM)
• A logic for P2P data integration
• Instead of a global schema, each peer has
– coordination formulas – each specifies semantic interdependencies
between two acquaintances
– binary domain relations – each specifies how symbols in one database
translate to symbols in an acquaintance’s database.
• Each expression in a coordination formula is relative to just one
participating database
• Use coordination formulas and domain relations for query and
update processing.
A Coordination Formula
• p: pharmacist DB
medication(PrescriptionID, PatientID, Prod)
• d: doctor DB
treatment(TreatmentID, PatientID, Description, Type)
where Type ∈ {“hospital”, “home”}
• ∀(i:x).A(x) means for all x in the domain of database i, A(x) is true.
• A coordination formula:
∀(p:y).∀(p:z).(p: ∃(x).medication(x, y, z) →
d: ∃(w).treatment(w, y, z, “home”) )
“There’s a row in treatment in the doctor DB
for each row in medication in the pharmacist DB”
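As a rough illustration of what such a formula demands, the Python sketch below checks it against two tiny databases. The sample rows, the domain relation r_pd, and the helper names are all hypothetical, and the check is a simplified reading of the formula rather than the full LRM semantics.

```python
# Hypothetical sample rows; only the schemas come from the slide.
medication = [  # pharmacist DB p: (PrescriptionID, PatientID, Prod)
    ("rx1", "P-17", "aspirin"),
    ("rx2", "P-42", "ibuprofen"),
]
treatment = [  # doctor DB d: (TreatmentID, PatientID, Description, Type)
    ("t1", "D17", "aspirin", "home"),
    ("t2", "D42", "ibuprofen", "hospital"),
]

# Assumed domain relation from p to d, as pairs (value in p, value in d).
r_pd = {
    ("P-17", "D17"), ("P-42", "D42"),
    ("aspirin", "aspirin"), ("ibuprofen", "ibuprofen"),
}


def translate(value, relation):
    """All values in the target DB that correspond to `value`."""
    return {d2 for d1, d2 in relation if d1 == value}


def violations(medication, treatment, relation):
    """(PatientID, Prod) pairs in p with no matching "home" treatment in d."""
    home = {(pat, desc) for _, pat, desc, typ in treatment if typ == "home"}
    missing = []
    for _, pat, prod in medication:
        pats, prods = translate(pat, relation), translate(prod, relation)
        if not any((y, z) in home for y in pats for z in prods):
            missing.append((pat, prod))
    return missing


print(violations(medication, treatment, r_pd))  # [('P-42', 'ibuprofen')]
```

Here the second prescription violates the formula because its only treatment row has Type "hospital".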
Domain Relation
• A row <d1,d2> in domain relation r_ik specifies that value d1 in DB_i corresponds to
value d2 in DB_k
• r_ik may be partial
• r_ik and r_ki need not be symmetric
• Example - DB_i contains lengths in meters and
DB_k in kilometers (total but not symmetric)
– r_ik(x) = roundToClosestK(x)
r_ik(653)=1, r_ik(453)=0
– r_ki(x) = x*1000
r_ki(1)=1000
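The meters/kilometers example can be written out directly. The sketch below assumes that roundToClosestK rounds halves up; the function names r_ik and r_ki simply mirror the slide's notation.

```python
import math


def r_ik(meters):
    """Map a length in meters (DB_i) to the closest whole kilometer (DB_k)."""
    # floor(x + 0.5) rounds halves up; Python's built-in round() rounds
    # halves to the nearest even number, which is a different rule.
    return math.floor(meters / 1000 + 0.5)


def r_ki(kilometers):
    """Map a length in kilometers (DB_k) to meters (DB_i)."""
    return kilometers * 1000


print(r_ik(653), r_ik(453), r_ki(1))  # 1 0 1000
print(r_ki(r_ik(653)))                # 1000, not 653
```

Both mappings are total, but translating 653 m to kilometers and back yields 1000 m, which is why the two relations are not symmetric.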
Queries
• A query is a coordination formula of the form A(x) → i: q(x), where
– A(x) is a coordination formula
– x has n variables
– i is the database against which the query is posed
– q is a new n-ary predicate symbol
• A relational space is a pair <db,r> where db is a set of DBs and r associates an
r_ik with each pair of DBs
• <db,r> ⊨ f means that the relational space <db,r> satisfies the coordination formula f
• The answer to a query:
{d ∈ dom_i | <db,r> ⊨ ∃(i:x).(A(x) ∧ i:x=d)}
Interpreting a Query
• A query:
((i:P(x) ∧ j:R(y)) ∨ k:S(x,y) ) → h:q(x,y)
• Evaluate P,R,S in i,j,k (respectively)
• Map these results via r_ih, r_jh, r_kh to sets s_i, s_j, s_k
• And then compute ((s_i × s_j) ∪ s_k)
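The three evaluation steps can be sketched in Python as follows. The local results, the domain relations into h, and the helper names are hypothetical. The sketch also assumes that the two inner results are combined by a cross product (x and y are independent) and the outer ones by a union.

```python
from itertools import product

# Hypothetical results of evaluating P in i, R in j, and S in k.
P_i = {"a1", "a2"}        # values of x found in DB i
R_j = {"b1"}              # values of y found in DB j
S_k = {("c1", "c2")}      # (x, y) pairs found in DB k

# Assumed domain relations into DB h, as pairs (source value, value in h).
r_ih = {("a1", "h1"), ("a2", "h2")}
r_jh = {("b1", "h3")}
r_kh = {("c1", "h1"), ("c2", "h4")}


def map_values(values, relation):
    """Translate a set of values; values without an image are dropped."""
    return {d2 for d1, d2 in relation if d1 in values}


def map_tuples(tuples, relation):
    """Translate tuples component-wise, keeping every combination of images."""
    out = set()
    for tup in tuples:
        images = [map_values({v}, relation) for v in tup]
        out.update(product(*images))
    return out


s_i = map_values(P_i, r_ih)
s_j = map_values(R_j, r_jh)
s_k = map_tuples(S_k, r_kh)

answer = set(product(s_i, s_j)) | s_k
print(sorted(answer))  # [('h1', 'h3'), ('h1', 'h4'), ('h2', 'h3')]
```

Because domain relations may be partial, a value with no image in h silently drops out of the answer.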
P2P Databases: Proposed Solution
Coordinate query and update exchange between autonomous DBs using:
• Coordination Formulas
– Specify semantic interdependencies between data from two nodes
table to table: Cust → Customer
column to column: name(Cust) → nm(Customer)
• Binary Domain Relations
– Specify how the symbols used in one database translate to symbols used in another database
‘one’ → ‘uno’
CAN$1.00 → US$0.65
• Keep AUTONOMY and COORDINATION, as much as possible
What’s New in the Solution?
• No global schema, no central registry, no form of
control
• No need of system restructuring when new nodes
come and old ones go away
• We do not integrate, we COORDINATE.
– Integration is built at design time
– Coordination happens at runtime
Propagation Strategy: Basic notions
• Acquaintance
– Pair of nodes which have coordination formulas and binary domain relations with respect to each other
– Acquaintances can exchange data and services
• Interest Group
– Set of nodes with related content and acquaintances among them
• Group Manager (GM)
– Node of an Interest Group dedicated to group and query propagation management
– The GM has higher stability requirements and must be permanently active
• Query Scope
– Set of nodes which are supposed to answer a given query. The Query Scope is defined by the Group Manager
Query Propagation Strategy
1. User submits query Q ()
2. Node defines query topic
3. Node sends to Group Manager (GM) request to
define Query Scope (QS)
4. GM computes and sends back QS
5. Node 1 sends query to acquaintances in QS, and
reports this fact to GM
6. Nodes 2 and 4 send answer to node 1
7. Nodes propagate the query to their acquaintances
from QS and report this fact to GM
8. And so on…
9. Nodes which do not propagate any further report
this fact to GM
10. Propagation stops when “no more propagation” is
received from all boundary nodes
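The propagation steps can be simulated in a few lines of Python. This is a simplified, synchronous sketch: the topology, the function name propagate, and the log format are assumptions, and the answers sent back to the querying node (step 6) are left out.

```python
from collections import deque


def propagate(acquaintances, start, query_scope):
    """Simulate query propagation; return (reached nodes, reports sent to GM)."""
    gm_log = []
    reached = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Forward only to acquaintances inside the query scope that have
        # not been reached yet.
        targets = [n for n in acquaintances.get(node, ())
                   if n in query_scope and n not in reached]
        if targets:
            reached.update(targets)
            queue.extend(targets)
            gm_log.append((node, "reached", tuple(targets)))
        else:
            gm_log.append((node, "no more propagation", ()))
    # The queue is empty once every boundary node has reported.
    return reached, gm_log


# Hypothetical topology; only the query scope is taken from the slides.
acquaintances = {1: [2, 4, 3], 2: [6], 4: [8], 6: [9], 9: [11], 3: [5]}
query_scope = {2, 4, 6, 8, 9, 11}

reached, gm_log = propagate(acquaintances, 1, query_scope)
for report in gm_log:
    print(report)
```

Node 3 is an acquaintance of node 1 but lies outside the query scope, so the query never reaches it or node 5.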
[Figure: example network of nodes 1 to 11 with a Group Manager (GM). Message labels:
1. Q ()
2. Q (, topic)
3. QS (, topic) = ? (sent to GM)
4. QS (, topic) = (2, 4, 6, 8, 9, 11)
5. “nodes 2 and 4 are reached”
← Res2, ← Res4
“node 6 is reached”, “node 8 is reached”
“no more propagation from 8”, “no more propagation from 9”]
Implementation Architecture
• A classic multi-database system, with
– A protocol for adding/dropping acquaintances
– LRM query processing (domain mapping logic) that can cope with chains of
acquaintances
– Dynamic approach to materialized view creation
• Tools to help a user establish an acquaintance
Architecture
• P2P Layer
– Add-on that provides the P2P functionality
• Local Data Source (LDS)
– Database
– File system
• User Interface
– User queries
– Results
• Query Manager (QM) and Update Manager (UM)
– Responsible for query and update propagation
– Manage coordination and correspondence rules, acquaintances,
and interest groups
• Wrapper
– Provides a translation layer between the QM/UM and the LDS
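One way to picture how these components fit together is the minimal Python sketch below. All class and method names are hypothetical, the LDS is a plain dictionary, and translation through coordination formulas and domain relations is deliberately left out.

```python
class Wrapper:
    """Translation layer between the Query Manager and the local data source."""

    def __init__(self, tables):
        self.tables = tables  # stand-in LDS: table name -> list of row dicts

    def select(self, table, predicate):
        return [row for row in self.tables.get(table, []) if predicate(row)]


class QueryManager:
    """Answers queries locally and propagates them to acquaintances."""

    def __init__(self, name, wrapper):
        self.name = name
        self.wrapper = wrapper
        self.acquaintances = []  # other QueryManager objects

    def answer(self, table, predicate, visited=None):
        visited = set() if visited is None else visited
        visited.add(self.name)  # prevents a query from looping back
        rows = self.wrapper.select(table, predicate)
        for peer in self.acquaintances:
            if peer.name not in visited:
                rows += peer.answer(table, predicate, visited)
        return rows


doctor = QueryManager("D", Wrapper({"treatment": [{"patient": "17", "type": "home"}]}))
hospital = QueryManager("H", Wrapper({"treatment": [{"patient": "17", "type": "hospital"}]}))
hospital.acquaintances.append(doctor)
doctor.acquaintances.append(hospital)

rows = hospital.answer("treatment", lambda r: r["patient"] == "17")
print([r["type"] for r in rows])  # ['hospital', 'home']
```

The Query Manager never touches the data source directly, so a file-system source would only need a different Wrapper.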
Summary
• Why P2P databases are different
• A P2P database scenario
• A logic for P2P databases (LRM)
– Coordination formulas and domain relations
– Query semantics
• Architecture and implementation issues