Peer-to-Peer Systems

Peer-to-Peer Systems Presented By: Nazanin Dehghani Supervisor: Dr. Naser Yazdani




Peer-to-Peer Architecture

Client-server architecture: every client is bound statically to a specific server.
Peer-to-peer architecture: a more dynamic structure while still being a distributed system.
Peer-to-Peer Definition
A computer network in which each computer can act as a client or a server for the other computers in the network, allowing shared access to files and peripherals without the need for a central server.

Peer-to-Peer Applications
Properties
Nodes have to share their resources, such as memory, bandwidth, and processing power, directly with each other.

P2P networks should be robust to node churn.
Common Primitives
Join: how do I begin participating?
Publish: how do I advertise my file?
Search: how do I find a file?
Fetch: how do I retrieve a file?
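To make these four primitives concrete, here is a minimal interface sketch in Python. It is illustrative only: the class name `PeerProtocol` and the method signatures are assumptions, not part of any real P2P library.

```python
from abc import ABC, abstractmethod

class PeerProtocol(ABC):
    """Illustrative interface for the four common P2P primitives."""

    @abstractmethod
    def join(self, bootstrap_address: str) -> None:
        """Begin participating, typically by contacting a known bootstrap node."""

    @abstractmethod
    def publish(self, filename: str, data: bytes) -> None:
        """Advertise a file so other peers can discover it."""

    @abstractmethod
    def search(self, filename: str) -> list:
        """Return addresses of peers believed to hold the file."""

    @abstractmethod
    def fetch(self, filename: str, peer_address: str) -> bytes:
        """Retrieve the file contents from a specific peer."""
```

Each system discussed below (Napster, Gnutella, Chord, BitTorrent) realizes these primitives differently.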

Architecture of P2P Systems
Overlay Network

Graph Structure
Structured: aware of the topology of the overlay network.
Unstructured: no constraints on the topology of the overlay network.

How Did It Start?
A killer application: Napster, offering free music over the Internet.
Key idea: share the content, storage, and bandwidth of individual (home) users.

Model
Each user stores a subset of files.
Each user can access (download) files from all users in the system.
Main Challenge
Find where a particular file is stored.
Other Challenges
Scale: up to hundreds of thousands or millions of machines.

Dynamicity: machines can come and go at any time.
Napster
Assume a centralized index system that maps files (songs) to machines that are alive.
How to find a file (song):
Query the index system; it returns a machine that stores the required file (ideally the closest/least-loaded machine), then FTP the file from that machine.
Advantages: simplicity; easy to implement sophisticated search engines on top of the index system.
Disadvantages: robustness, scalability (?).
(Figure: Napster example with machines m1..m6 storing files A..F; a query for E is answered with m5, and the file is fetched from m5.)
Gnutella
Distribute the file location.
Idea: flood the request.
How to find a file:
Send the request to all neighbors.
Neighbors recursively multicast the request.
Eventually a machine that has the file receives the request, and it sends back the answer.
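The Napster-style design boils down to one centralized map from file names to live peers. The sketch below is a hypothetical illustration of that idea (class and method names are made up, not Napster's actual code).

```python
# Hypothetical sketch of a Napster-style centralized index.
class CentralIndex:
    def __init__(self):
        self.index = {}  # filename -> set of peer addresses that store it

    def register(self, peer, filenames):
        """A joining peer reports the files it stores."""
        for name in filenames:
            self.index.setdefault(name, set()).add(peer)

    def unregister(self, peer):
        """Remove a departed peer from every entry."""
        for peers in self.index.values():
            peers.discard(peer)

    def lookup(self, filename):
        """Return the peers that store the file; the client then fetches directly."""
        return sorted(self.index.get(filename, set()))

# Usage sketch
index = CentralIndex()
index.register("m5", ["E"])
index.register("m1", ["A"])
print(index.lookup("E"))  # ['m5'] -- the client would then FTP the file from m5
```

The index is simple and supports rich search, but it is also the single point of failure the later designs try to remove.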

Advantages: totally decentralized, highly robust.
Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL).
Gnutella
Queries are flooded for a bounded number of hops.
No guarantees on recall.
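A minimal sketch of TTL-bounded flooding, assuming simple in-memory `Node` objects; this is an illustration of the idea, not the actual Gnutella wire protocol.

```python
# Hypothetical sketch of Gnutella-style flooding with a TTL.
class Node:
    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []

def flood_query(node, filename, ttl, seen=None):
    """Recursively forward a query to all neighbors until the TTL expires or a hit is found."""
    if seen is None:
        seen = set()
    if node in seen or ttl <= 0:
        return None
    seen.add(node)
    if filename in node.files:
        return node                      # query hit: the answer travels back along the path
    for neighbor in node.neighbors:      # flood the request to every neighbor
        hit = flood_query(neighbor, filename, ttl - 1, seen)
        if hit is not None:
            return hit
    return None
```

The TTL bounds the damage of each query, but it is also exactly why recall is not guaranteed: a file stored beyond the TTL horizon is never found.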

Distributed Hash Tables (DHTs)
Abstraction: a distributed hash-table data structure with insert(id, item) and item = query(id) (or lookup(id)).
Note: item can be anything: a data object, document, file, or pointer to a file.
Proposals: CAN, Chord, Kademlia, Pastry, Tapestry, etc.

DHT Design Goals
Make sure that an identified item (file) is always found.
Scales to hundreds of thousands of nodes.
Handles rapid arrival and failure of nodes.
Distributed Hash Tables (DHTs)
Hash table interface: put(key, item), get(key).
O(log n) hops.
Guarantees on recall.

Structured Networks

(Figure: DHT abstraction; (key, item) pairs are spread across nodes, put(K1, I1) stores item I1 under key K1 at the responsible node, and get(K1) returns I1.)
Chord
In short: a peer-to-peer lookup service.
Solves the problem of locating a data item in a collection of distributed nodes, considering frequent node arrivals and departures.
The core operation in most P2P systems is efficient location of data items.
Supports just one operation: given a key, it maps the key onto a node.
Chord Characteristics
Simplicity, provable correctness, and provable performance.
Each Chord node needs routing information about only a few other nodes.
Resolves lookups via messages to other nodes (iteratively or recursively).
Maintains routing information as nodes join and leave the system.
Napster, Gnutella, etc. vs. Chord
Compared to Napster and its centralized servers, Chord avoids single points of control or failure through a decentralized design.
Compared to Gnutella and its widespread use of broadcasts, Chord avoids the lack of scalability by keeping only a small amount of routing information per node.
Addressed Difficult Problems (1)
Load balance: a distributed hash function spreads keys evenly over the nodes.
Decentralization: Chord is fully distributed; no node is more important than any other, which improves robustness.
Scalability: lookup cost grows logarithmically with the number of nodes, so even very large systems are feasible.
Addressed Difficult Problems (2)
Availability: Chord automatically adjusts its internal tables to ensure that the node responsible for a key can always be found.

Consistent Hashing
The hash function assigns each node and key an m-bit identifier using a base hash function such as SHA-1:
ID(node) = hash(IP, Port)
ID(key) = hash(key)
Properties of consistent hashing:
The function balances load: all nodes receive roughly the same number of keys.
When an Nth node joins (or leaves) the network, only an O(1/N) fraction of the keys are moved to a different location.
(Figure: identifier circle with m = 3; nodes 0, 1, and 3; keys 1, 2, and 6.)
Successor Nodes
successor(1) = 1, successor(2) = 3, successor(6) = 0
Node Joins and Departures
If node 7 joins, successor(6) becomes 7; if node 1 leaves, successor(1) becomes 3.
Scalable Key Location
A very small amount of routing information suffices to implement consistent hashing in a distributed environment.
Each node need only be aware of its successor node on the circle.
Queries for a given identifier can be passed around the circle via these successor pointers.
This resolution scheme is correct, BUT inefficient: it may require traversing all N nodes!
Acceleration of Lookups
Lookups are accelerated by maintaining additional routing information.
Each node maintains a routing table with (at most) m entries (where the identifier space has 2^m identifiers), called the finger table.
The ith entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i-1) on the identifier circle (clarification on the next slide).
s = successor(n + 2^(i-1)) (all arithmetic mod 2^m)
s is called the ith finger of node n, denoted by n.finger(i).node.
Finger Tables (1)
(Figure: finger tables for the example ring with nodes 0, 1, and 3. Node 0's starts 1, 2, 4 map to successors 1, 3, 0; node 1's starts 2, 3, 5 map to successors 3, 3, 0; node 3's starts 4, 5, 7 map to successors 0, 0, 0.)
Finger Tables (2): Characteristics
Each node stores information about only a small number of other nodes, and knows more about nodes closely following it than about nodes farther away.
A node's finger table generally does not contain enough information to determine the successor of an arbitrary key k.
Repeated queries to nodes that immediately precede the given key will lead to the key's successor eventually.
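A minimal sketch of the ID assignment and finger-table construction described above. It assumes a global view of the ring in a single `Ring` object (a real Chord node discovers its fingers through lookups); the constant `M`, the class, and the example addresses are illustrative.

```python
import hashlib

M = 8                       # identifier bits for the sketch; Chord itself uses 160-bit SHA-1 IDs
SPACE = 2 ** M

def chord_id(value: str) -> int:
    """Map a node address or key name to an m-bit identifier via SHA-1."""
    return int(hashlib.sha1(value.encode()).hexdigest(), 16) % SPACE

class Ring:
    """Illustrative, centralized view of a Chord ring (real Chord is fully distributed)."""
    def __init__(self, node_ids):
        self.nodes = sorted(node_ids)

    def successor(self, ident: int) -> int:
        """First node clockwise from ident, wrapping around the circle."""
        for n in self.nodes:
            if n >= ident:
                return n
        return self.nodes[0]

    def finger_table(self, n: int):
        """finger[i] = successor((n + 2**(i-1)) mod 2**M) for i = 1..M."""
        return [self.successor((n + 2 ** (i - 1)) % SPACE) for i in range(1, M + 1)]

# Usage sketch: hash three hypothetical node addresses onto the ring and build one finger table.
ring = Ring([chord_id(addr) for addr in ("10.0.0.1:4000", "10.0.0.2:4000", "10.0.0.3:4000")])
print(ring.finger_table(ring.nodes[0]))
```

Note how the fingers double their reach with each entry; that doubling is what later gives the O(log N) lookup bound.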

Node Joins with Finger Tables
(Figure: finger-table updates in the example ring when node 6 joins; entries that should now point to 6 are updated, and node 6 builds its own finger table and takes over key 6.)
Node Departures with Finger Tables
(Figure: finger-table updates in the example ring when node 1 leaves; entries that pointed to 1 are redirected to its successor, and node 1's keys move to node 3.)
Source of Inconsistencies: Concurrent Operations and Failures
A basic stabilization protocol is used to keep nodes' successor pointers up to date, which is sufficient to guarantee correctness of lookups.
Those successor pointers can then be used to verify the finger table entries.
Every node runs stabilize periodically to find newly joined nodes.
Stabilization after Join
Initially, np's successor is ns and ns's predecessor is np.
n joins: its predecessor is nil, and it acquires ns as its successor via some existing node.
n notifies ns that it is the new predecessor; ns acquires n as its predecessor.
np runs stabilize: it asks ns for its predecessor (now n) and acquires n as its successor.
np notifies n, and n acquires np as its predecessor.
All predecessor and successor pointers are now correct.
Fingers still need to be fixed, but old fingers will still work.
Failure Recovery
The key step in failure recovery is maintaining correct successor pointers.
To help achieve this, each node maintains a successor list of its r nearest successors on the ring.
If node n notices that its successor has failed, it replaces it with the first live entry in the list.
stabilize will correct finger table and successor-list entries pointing to the failed node.
Performance is sensitive to the frequency of node joins and leaves versus the frequency at which the stabilization protocol is invoked.
Chord: The Math
Every node is responsible for about K/N keys (N nodes, K keys).
When a node joins or leaves an N-node network, only O(K/N) keys change hands (and only to and from the joining or leaving node).
Lookups need O(log N) messages.
To re-establish routing invariants and finger tables after a node joins or leaves, only O(log^2 N) messages are required.
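A minimal sketch of the join/stabilize/notify interplay described above, using in-process objects instead of remote nodes and ignoring fingers and failures. The class and helper names are assumptions for illustration, modeled on the published Chord pseudocode rather than taken from any implementation.

```python
class ChordNode:
    """Illustrative successor/predecessor maintenance (fingers and failures omitted)."""
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def join(self, existing):
        """Join via any existing node: learn a successor, leave the predecessor nil."""
        self.predecessor = None
        self.successor = existing.find_successor(self.id)

    def find_successor(self, ident):
        """Walk successor pointers until the responsible node is found (no fingers here)."""
        node = self
        while not _between(ident, node.id, node.successor.id):
            node = node.successor
        return node.successor

    def stabilize(self):
        """Periodically verify our successor and tell it about us."""
        x = self.successor.predecessor
        if x is not None and _between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """candidate thinks it might be our predecessor."""
        if self.predecessor is None or _between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

def _between(x, a, b):
    """True if x lies on the half-open arc (a, b] of the identifier circle."""
    return (a < x <= b) if a < b else (x > a or x <= b)

# Usage sketch: a second node joins, then both run stabilize until the pointers converge.
a, b = ChordNode(10), ChordNode(200)
b.join(a)
for _ in range(3):
    a.stabilize(); b.stabilize()
```

After a few stabilization rounds, a and b point at each other as successor and predecessor, mirroring the slide's sequence of pointer updates.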

Chord Finger Table
(Figure: node N80's fingers reach 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring.)

Entry i in the finger table of node n is the first node that succeeds or equals n + 2^i.
In other words, the ith finger points 1/2^(n-i) of the way around the ring.
Small tables, but multi-hop lookup. Table entries: IP address and Chord ID.
Navigate in ID space, routing queries closer to the successor. log(n) table entries, log(n) hops.
(Figure: routing to a document whose ID falls between two fingers.)

Chord Routing
Upon receiving a query for item id, a node:
Checks whether it stores the item locally.
If not, forwards the query to the largest node in its successor table that does not exceed id.
(Figure: query(7) routed around an 8-identifier circle with nodes 0, 1, 2, and 6, using each node's successor table; item 7 is stored at node 0.)
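A minimal sketch of this forwarding rule on a 3-bit example ring. The nodes {0, 1, 2, 6} and the `FINGERS` tables below are assumptions chosen to be self-consistent for the sketch, not the exact figure from the slides.

```python
# Illustrative Chord-style routing on an 8-identifier circle with nodes 0, 1, 2, 6.
M = 3

def in_arc(x, a, b, include_b=True):
    """True if x lies on the arc from a (exclusive) to b on the identifier circle."""
    if a < b:
        return a < x <= b if include_b else a < x < b
    return x > a or (x <= b if include_b else x < b)

FINGERS = {           # node -> [successor(n + 2**i) for i in 0..M-1]
    0: [1, 2, 6],
    1: [2, 6, 6],
    2: [6, 6, 6],
    6: [0, 0, 2],
}

def closest_preceding_finger(node, query_id):
    """Scan fingers from farthest to nearest, returning the first that precedes query_id."""
    for finger in reversed(FINGERS[node]):
        if in_arc(finger, node, query_id, include_b=False):
            return finger
    return FINGERS[node][0]          # fall back to the immediate successor

def find_successor(start, query_id):
    """Forward the query until it falls between a node and its successor."""
    node, hops = start, [start]
    while not in_arc(query_id, node, FINGERS[node][0]):
        node = closest_preceding_finger(node, query_id)
        hops.append(node)
    hops.append(FINGERS[node][0])    # this last node is responsible for query_id
    return hops

print(find_successor(1, 7))          # [1, 6, 0]: the query hops 1 -> 6, and node 0 stores item 7
```

Each hop jumps to the farthest finger that does not overshoot the key, which is why the distance to the target roughly halves per hop.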

Node Join
Compute the ID.
Use an existing node to route to that ID in the ring; this finds s = successor(id).
Ask s for its predecessor, p.
Splice self into the ring just like a linked list:
p->successor = me
me->successor = s
me->predecessor = p
s->predecessor = me

Chord Summary
Routing table size? log N fingers.

Routing time? Each hop is expected to halve the distance to the desired id, so we expect O(log N) hops.
BitTorrent

Fairness
What about somebody who only downloads and never uploads?

What is the policy? An incentive mechanism.

Fetching Data
Once we know which node(s) have the data we want...
Option 1: fetch from a single peer.
Problem: we have to fetch from a peer who has the whole file.
Peers are not useful sources until they have downloaded the whole file, at which point they probably log off. :)
How can we fix this?

Chunk Fetching
More than one node may have the file. How can we tell?
We must be able to distinguish identical files: not necessarily the same filename, and the same filename is not necessarily the same file.
Use a hash of the file. Common choices: MD5, SHA-1, etc.
How to fetch? Get bytes [0..8000] from A and [8001..16000] from B.
Alternative: erasure codes.
BitTorrent
Written by Bram Cohen (in Python) in 2001.
Pull-based swarming approach.
Each file is split into smaller pieces, and nodes request desired pieces from neighbors (as opposed to parents pushing data that they receive).
Pieces are not downloaded in sequential order, which encourages contribution by all nodes.
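A minimal sketch of the byte-range fetching plus whole-file hash check described above. The peer objects and their `get_bytes` method are hypothetical stand-ins, not part of any real BitTorrent client.

```python
import hashlib

CHUNK_SIZE = 8000  # bytes per chunk, matching the [0..8000], [8001..16000] example above

def fetch_file(peers, file_size, expected_sha1):
    """Fetch different byte ranges from different peers, then verify the whole file."""
    chunks = []
    for i, start in enumerate(range(0, file_size, CHUNK_SIZE)):
        end = min(start + CHUNK_SIZE, file_size)
        peer = peers[i % len(peers)]               # round-robin over the available peers
        chunks.append(peer.get_bytes(start, end))  # hypothetical peer API
    data = b"".join(chunks)
    if hashlib.sha1(data).hexdigest() != expected_sha1:
        raise ValueError("hash mismatch: corrupted or wrong file")
    return data
```

BitTorrent refines this idea by hashing each piece separately (see the .torrent contents below), so a bad chunk can be re-fetched without discarding the whole download.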

BitTorrent Piece Selection
Rarest first.
Random first selection.

Peer Selection
Tit-for-tat.
Optimistic unchoking.


BitTorrent Swarm
Swarm: the set of peers all downloading the same file, organized as a random mesh.
Each node knows the list of pieces downloaded by its neighbors.
A node requests pieces it does not own from its neighbors.

How a node enters a swarm for file popeye.mp4
The file popeye.mp4.torrent is hosted at a (well-known) webserver, e.g. www.bittorrent.com.
The .torrent has the address of the tracker for the file.
The tracker, which runs on a webserver as well, keeps track of all peers downloading the file.
Step 1: the peer downloads popeye.mp4.torrent from the webserver.
Step 2: the peer contacts the tracker and receives the addresses of other peers.
Step 3: the peer connects to those peers and joins the swarm.
Contents of the .torrent file
URL of the tracker.
Piece length, usually 256 KB.
SHA-1 hashes of each piece in the file (for reliability).
A files field allows download of multiple files.
Terminology
Seed: a peer with the entire file.
Original Seed: the first seed.
Leech: a peer that is downloading the file (a fairer term might have been "downloader").
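To make the metainfo fields listed above concrete, here is their rough shape written as a Python dictionary. The field names follow the BitTorrent metainfo convention, but the values are made up for illustration, and a real .torrent file is bencoded rather than written by hand like this.

```python
# Illustrative shape of the data inside a .torrent file (values are invented).
torrent_metainfo = {
    "announce": "http://tracker.example.com:6969/announce",  # URL of the tracker
    "info": {
        "name": "popeye.mp4",
        "piece length": 256 * 1024,          # usually 256 KB
        "pieces": b"<20-byte SHA-1 of piece 0><20-byte SHA-1 of piece 1>...",
        "length": 734003200,                 # single-file mode; multi-file mode uses a "files" list
    },
}
```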

Peer-peer transactions: choosing pieces to request

Rarest-first: look at all pieces at all peers, and request the piece that is owned by the fewest peers.
Increases diversity in the pieces downloaded: it avoids the case where a node and each of its peers have exactly the same pieces, which increases throughput.
Increases the likelihood that all pieces are still available even if the original seed leaves before any one node has downloaded the entire file.
Choosing pieces to request
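A minimal sketch of rarest-first selection, assuming we already know which pieces each neighbor advertises; the function and variable names are illustrative, not taken from a real client.

```python
from collections import Counter

def pick_rarest_piece(my_pieces, neighbor_bitfields):
    """Pick the piece we still need that is held by the fewest neighbors."""
    counts = Counter()
    for pieces in neighbor_bitfields.values():   # {peer_id: set of piece indices it has}
        counts.update(pieces)
    candidates = [(count, piece) for piece, count in counts.items() if piece not in my_pieces]
    if not candidates:
        return None                               # nothing useful to request right now
    return min(candidates)[1]                     # rarest piece: fewest owners

# Usage sketch
neighbors = {"A": {0, 1, 2}, "B": {1, 2}, "C": {2, 3}}
print(pick_rarest_piece(my_pieces={2}, neighbor_bitfields=neighbors))  # 0: held by only one neighbor
```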

Random first piece: when a peer starts to download, it requests a random piece, so as to assemble its first complete piece quickly and then participate in uploads.

When the first complete piece is assembled, switch to rarest-first.
Tit-for-tat as incentive to upload
We want to encourage all peers to contribute.
Peer A is said to choke peer B if it (A) decides not to upload to B.
Each peer (say A) unchokes at most 4 interested peers at any time:
The three with the largest upload rates to A (this is where the tit-for-tat comes in).
Another randomly chosen peer (optimistic unchoke), to periodically look for better choices.
Why BitTorrent took off
Better performance through pull-based transfer: slow nodes don't bog down other nodes.
Allows uploading from hosts that have downloaded only parts of a file (in common with other end-host-based multicast schemes).
Pros and cons of BitTorrent
Pros:
Proficient in utilizing partially downloaded files.
Discourages freeloading by rewarding the fastest uploaders.
Encourages diversity through rarest-first, which extends the lifetime of the swarm.
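A minimal sketch of the unchoking policy described above: reward the top uploaders and add one random optimistic unchoke. The data structures (peer IDs, a rate map) are assumptions for illustration.

```python
import random

def choose_unchoked(interested_peers, upload_rate_to_me, k=3):
    """Unchoke the k peers uploading fastest to us, plus one random optimistic unchoke."""
    by_rate = sorted(interested_peers, key=lambda p: upload_rate_to_me.get(p, 0), reverse=True)
    unchoked = by_rate[:k]                                  # tit-for-tat: reward the best uploaders
    remaining = [p for p in interested_peers if p not in unchoked]
    if remaining:
        unchoked.append(random.choice(remaining))           # optimistic unchoke
    return unchoked

# Usage sketch
rates = {"A": 120, "B": 80, "C": 30, "D": 5, "E": 0}
print(choose_unchoked(list(rates), rates))   # ['A', 'B', 'C', plus one of 'D'/'E' at random]
```

The optimistic slot is what lets a newcomer with nothing to offer bootstrap its first pieces and enter the tit-for-tat exchange.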

Pros and cons of BitTorrent
Cons:
Assumes all interested peers are active at the same time; performance deteriorates if the swarm cools off.
Even worse: no trackers for obscure content.

Pros and cons of BitTorrent
Dependence on a centralized tracker: pro or con?
Single point of failure: new nodes can't enter the swarm if the tracker goes down.
Lack of a search feature: this prevents pollution attacks, but users need to resort to out-of-band search (well-known torrent-hosting sites or plain old web search).

Trackerless BitTorrent
To be more precise: BitTorrent without a centralized tracker, e.g. Azureus.
Uses a Distributed Hash Table (the Kademlia DHT).
The tracker is run by a normal end-host (not a webserver anymore).
The original seeder could itself be the tracker, or a node in the DHT could be randomly picked to act as the tracker.

CoolStreaming/DONet: A Data-Driven Overlay Network for Peer-to-Peer Live Media Streaming
P2P Live Video Streaming
Autonomous and selfish peers.
Churn.

Time-sensitive and deadline-prone data

Success of P2P-Based File Distribution
Distributes content quickly by utilizing the capacity of all peers.
Incentive mechanism: prevents peers from free-riding.

The incentive mechanism leads to the formation of clusters of peers with similar bandwidth.
Inefficiency of Local Incentive
Live video streaming: quality as the incentive. File distribution: fast download as the incentive.
New Coolstreaming
Provides peer-to-peer live streaming.

Data-driven design
Doesn't use a tree, mesh, or any other fixed structure; data flows are guided by the availability of data.
Core operations of DONet / CoolStreaming
DONet: Data-driven Overlay Network.
CoolStreaming: Cooperative Overlay Streaming, a practical DONet implementation.
Every node periodically exchanges data availability information with a set of partners.
It retrieves unavailable data from one or more partners, or supplies available data to partners.
The more people watching the stream, the better the watching quality will be.
The idea is similar to BitTorrent (BT).
A generic system diagram for a DONet node
Partnership manager: random selection of partners.
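A minimal sketch of the periodic availability exchange described above: each node advertises a buffer map of the segments it holds and pulls missing segments from its partners. The class and method names are illustrative, not code from the DONet/CoolStreaming paper.

```python
class DonetNode:
    """Illustrative data-driven node: advertise availability, pull what is missing."""
    def __init__(self, name):
        self.name = name
        self.segments = set()     # sequence numbers of video segments held locally
        self.partners = []        # other DonetNode objects

    def buffer_map(self):
        """Availability information periodically sent to partners."""
        return set(self.segments)

    def exchange_and_pull(self):
        """Ask each partner for its buffer map and request one missing segment from it."""
        for partner in self.partners:
            missing = partner.buffer_map() - self.segments
            if missing:
                segment = min(missing)     # e.g. most urgent (earliest-deadline) segment first
                self.segments.add(segment) # stands in for the actual data transfer
```

Because the data is deadline-prone, a real scheduler also weighs how close each missing segment is to its playback deadline, not just whether it is missing.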

Transmission scheduler: schedules transmission of video data.

Buffer map: records availability.
Coolstreaming
Two types of connections between peers: the partnership relationship and the parent-child relationship.

Multiple sub-streams.
Buffer partitioning.
Push-pull content delivery.
Parent re-selection.
An Example of Stream Decomposition
(Figure: a single stream of blocks with sequence numbers {1, 2, 3, ..., 13} is decomposed into, and recombined from, four sub-streams {S1, S2, S3, S4}, with blocks assigned to sub-streams round-robin.)
Structure of the Buffer in a Node
(Figure: with K sub-streams, a synchronization buffer holds the received and still-unavailable blocks of each sub-stream S1..SK before they are merged into the cache buffer.)
Parent Re-selection
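A minimal sketch of the round-robin decomposition and recombination shown in the example above; the function names are illustrative and this is not code from CoolStreaming.

```python
def decompose(blocks, k=4):
    """Split a single stream of blocks into k sub-streams, round-robin by sequence number."""
    substreams = [[] for _ in range(k)]
    for seq, block in enumerate(blocks, start=1):
        substreams[(seq - 1) % k].append(block)
    return substreams

def combine(substreams):
    """Merge the sub-streams back into the original block order."""
    merged, i = [], 0
    while any(i < len(s) for s in substreams):
        for s in substreams:
            if i < len(s):
                merged.append(s[i])
        i += 1
    return merged

blocks = list(range(1, 14))            # sequence numbers {1, 2, ..., 13}
subs = decompose(blocks)               # S1 = [1, 5, 9, 13], S2 = [2, 6, 10], ...
assert combine(subs) == blocks
```

Splitting the stream this way lets a node receive each sub-stream from a different parent, so the loss or re-selection of one parent affects only a quarter of the blocks.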


P2P: Summary
Many different styles; remember the pros and cons of each: centralized, flooding, swarming, unstructured and structured routing.
Lessons learned:
Single points of failure are bad.
Flooding messages to everyone is bad.
The underlying network topology is important.
Incentives are needed to discourage freeloading.
Privacy and security are important.
Structure can provide theoretical bounds and guarantees.