Peer-to-Peer Infrastructure and Applications Andrew Herbert Microsoft Research, Cambridge +44 1223...
-
Upload
molly-coffey -
Category
Documents
-
view
215 -
download
0
Transcript of Peer-to-Peer Infrastructure and Applications Andrew Herbert Microsoft Research, Cambridge +44 1223...
Peer-to-Peer Infrastructure and Applications
Andrew HerbertMicrosoft Research, Cambridge+44 1223 [email protected]
Microsoft and the Grid Shared vision of the “virtual organization”
But focused on e-Business rather than e-Science Grid investments
Windows clusters for large scale computations TerraServer projects for large data sets Globus port to Windows Globus implementation of OGSA
Wrap Grid services as Web services Leverage web services as Grid infrastructure
E.g., Hailstorm user authentication
Microsoft Research
Web services enable wide area integration How to extend this to enable efficient wide scale
information sharing and collaboration? Move to a model of peer-to-peer service
implementation in contrast to today’s server-based model
Necessarily scalable and self-organizing Necessarily simple developer framework No conflict with WSDL, SOAP etc
Peer-to-Peer today
Music / video download Napster, Morpheus, Gnutella
Distributed computing SETI@Home
Research community looking for general purpose frameworks discovering useful applications
Peer-to-Peer applications Publish/Subscribe Event Notification (SCRIBE)
Share load of supporting topics and disseminating messages
from publishers to subscribers
Distributed document archive (PAST) Share load of storing documents reliably
Web caching (SQUIRREL) Share load of caching web pages
Dynamic directory (OVERLOOK) Share load of storing directory entries for dynamic data
A P2P framework requires: Content-based addressing
Hash content to key Route message to computer hosting that key
Dynamic caching and proxying Local computers stand in for remote ones Faster access, reduced load on key holder
Replication and automatic failover Store at K computers adjacent to key holder
Multicast cascade for group communication Each computer needs a spanning tree of
routes for reaching every other computer
Overlay Networks Peer-to-peer requires richer routing semantics
than IP IP routes to destination computer, not content URLs route to destination computer, not content IP multicast isn’t widely deployed
Solution: Overlay networks allow applications to participate in hop-by-hop
routing decisions Ideal overlay is efficient, self-organizing,
scalable, and fault-tolerant
Pastry outline Computers (Nodes) have unique Id
Typically 128 bits long
Primitive: Route (msg, key) Deliver msg to currently alive node with Id closest
numerically to key Scalable, efficient
Per node routing table O(log(N)) entries Route in O(log(N)) steps
Fault tolerant Self-fixes routing tables when nodes added,
deleted or fail
0XXX
1XXX
2XXX
3XXX
2321
2032
2001
0112
START
0112 routes a message
to key 2000.
First hop fixes first digit (2)
Second hop fixes second
digit (20)END
2001 closest live node to
2000.
Pastry routing
Routing table:For each level, nearest peer for other domains
Namespace leaf set: nearest Ids to “left” and “right” in name space
Each entry gives IP address for host
associated with Id
Pastry routing table
Pastry Routing Demo
Pastry node addition Want to add new node to the system Invent a new random nodeId X Go to a nearby or well-known node A
1. Route to “key” X via A (finds node Z with Id closest to X)
2. Obtain leaf set from Z and rebuild3. Obtain routing table entries from each node along the
route from A to Z and rebuild4. Register with each member of A’s namespace leaf
set so they adjust their leaf sets and rebuild
5. Find nearest leaf set node and use itsrouting table to improve locality
Scribe: A Pastry ApplicationPublisher Publisher
Subscriber Subscriber
Topic of interest
Publish / subscribe is a popular model for “event driven” systems with volatile membership
Decouple event publishers from event subscribers Publishers don’t know in advance who subscribers are Subscribers don’t know in advance who publishers are
Challenge is how to multicast notifications from topics efficiently
Scribe: architecture Topic hashed to a key Construct a multicast tree based on the Pastry
network Have the (Pastry) node with the closest Id to the
topic key be the root This node replicates knowledge of the topic to its k
nearest neighbours for resilience Pass event notification down through the tree
Each parent forwards event to it’s children Avoids over stressing network links close to the
topic node
Scribe: Topic creation
Each topic is assigned a topicId
Root of the multicast tree= node with nodeId numerically closest
Create(topic): route through Pastry to the topicId
T
Create(T)
Root
Scribe: subscribing
1100
1101
1001
0100 0111
1011
1111
1100
0111
0100
10001111
1000
1101
1001
1011
Scribe: event dissemination
Publish(topic, event)
Route through the Pastry network using the topicId as the destination
Dissemination along the multicast tree starting from the root
1100
1101
1011
0100 0111
1011
E
Scribe demo
Summary
Peer-to-peer techniques are good for wide area information sharing and collaborative computation
Overlay networks enable peer-to-peer distributed computing
Pastry is an efficient, scalable, self-organizing peer-to-peer framework
Pastry makes it easy to build powerful peer-to-peer applications
For more see: http://research.microsoft.com/~antr/Pastry/