Post on 21-Mar-2018
What is P2P First generation systems Self-organized systems Structured systems Distributed Hash Table Conclusion
Introduction on Peer to Peer systems
Georges Da Costa
Yerevan, Armenian National Academy of Sciences
dacosta@irit.fr 1/55
Goal of this Lecture
What can P2P do, not only as a buzzword
What it can't do
Show some examples and algorithms
A Survey and Comparison of Peer-to-Peer Overlay Network Schemes, by Eng Keong Lua et al.
in IEEE Communications Surveys & Tutorials, March 2004
Harnessing the Power of Disruptive Technologies
published by O'Reilly, 2001
1 What is P2P
2 First generation systems
3 Self-organized systems
4 Structured systems
5 Distributed Hash Table
6 Conclusion
1 What is P2P
Universal
What do these have in common?
NetMeeting, Skype, Ekiga
IRC, MSN, ICQ, Jabber
Kazaa, Freenet, Napster, Gnutella
SETI@home, Folding@home
eBay, Flickr, Facebook
Definition
Philosophical one
Participants gathering their resources in order to achieve a common goal
Why?
Available resources
Large Hard Drives
Powerful CPUs
Decent Internet connection
Users want
More freedom
No ties to commercial companies
No infrastructure cost
A new (?) solution: Peer-to-Peer systems
Definition
Participants gathering their resources in order to achieve a common goal
Computers are running the same code
There is no global view of the system
View is limited to neighbors
Everyone has the same rights and duties
Peer-to-Peer: New name, old concept
An architecture already there
The Internet connects most existing computers
Most computers are not fully used
Idle time > 75% on personal computers
Storage systems are mostly empty
Already used between servers
Usenet
DNS
IP Routing
Comparison with Client/Server
In Client/Server, each node is either a Client or a Server. Usually there are a few Servers and many Clients.
Client/Server systems suffer from a single point of failure.
Client/Server systems are mostly static, at least on the Server side. Peer-to-Peer systems are dynamic.
Client/Server systems need human administrators
Client/Server does not scale well: each new client adds load on the servers
Comparison with Client/Server II
[Figure: a Client/Server topology (several Clients around one Server) compared with a Peer-to-Peer topology (interconnected Nodes)]
Comparison with Client/Server II
When a new participant joins a service, the resource consumption of the service increases
Client/Server: the power/connectivity of the server has to be increased
Peer-to-Peer: the service uses the resources contributed by the participant
Not so easy
Wanted
Scalability (1K, 100K, 1M nodes)
Dynamicity
Security (user, task)
Transparency
For the user (CPU, memory, disk)
For the network
Heterogeneity
Self-organization
Participation (66% of users are free riders)
NAT/firewall traversal
Self-organization
Participants
Highly volatile, participating voluntarily
No central administration
Resource discovery
Heterogeneity
Hardware
Users (15% of users have 94% of files)
Distribution of the resources
Trust
What's not new
Partial solutions
Scalability: web server farms
Dynamism: cell phones
Fault tolerance: redundant servers
Current Peer to Peer systems
Available applications
File sharing
Distributed storage
Content delivery
Distributed computing
Telephony/Chat
Games
Current Peer to Peer systems (cont)
Widely used
2004: According to British Web analysis firm CacheLogic, BitTorrent accounts for an astounding 35 percent of all the traffic on the Internet, more than all other peer-to-peer programs combined, and dwarfs mainstream traffic like Web pages
Start-ups
Skype (admittedly no longer a small start-up)
BitTorrent
UbiStorage
Two worlds
Internet Users
Security problems
Large scale
No control
Motivation needed
Private Area (Corp., Univ.)
Other means of security
Medium to large scale
Total control
2 First generation systems
Index Method
[Figure: Clients keep a static connection to a central Server; arrows show a client sending its file list, a client sending a request for a file, and the file transfer between clients]
Users send the list of their files to a server
To find a file, you send a request to the server
It answers with the list of clients owning the file
You directly contact the owners for the transfer
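The four steps above can be sketched in a few lines. This is only an illustration: the class name `IndexServer` and its methods `publish` and `lookup` are hypothetical names chosen here, not the API of Napster or of any other system.

```python
from collections import defaultdict


class IndexServer:
    """Central index: maps each file name to the set of clients owning it."""

    def __init__(self):
        self.owners = defaultdict(set)

    def publish(self, client, filenames):
        # A client sends the list of its files to the server.
        for name in filenames:
            self.owners[name].add(client)

    def lookup(self, filename):
        # The server answers with the clients owning the file; the transfer
        # itself then happens directly between clients, not via the server.
        return sorted(self.owners.get(filename, set()))


server = IndexServer()
server.publish("alice", ["song.mp3", "paper.pdf"])
server.publish("bob", ["song.mp3"])
owners = server.lookup("song.mp3")
```

Every request goes through `server`, which is why this design concentrates load and failure risk on a single machine.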
Index Method II
Systems
Napster, Mojonation, Yaga, Filetopia, Seti@Home
Problems
Scaling
Price
Hot spots
Attacks
Single point of failure
Useful when...
Small number of clients
Total control of transfers is needed (video game industry)
Performance is more important than cost
BitTorrent
Same approach as Napster, but:
Downloads are done in parallel
One server per file
Server manages all the details of transfers
Server enforces the rule "The more you share, the more you get"
Differences
Specialized for large files
Distributed due to the "One server per file" rule
Privacy
No privacy
Napster: the server knows about all transfers
BitTorrent: for each file, a server knows about all transfers
Flooding
[Figure: Clients connected directly to their neighbors, without any server]
You send your request to your neighbors
They forward it to their neighbors, and so on until reaching the Time To Live depth
Users with files corresponding to the request answer
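A minimal sketch of flooding with a Time To Live, assuming the network is given as an adjacency dictionary. The names `flood_search`, `graph` and `files` are invented for this example. One simplification is labeled in the code: a single shared `seen` set stands in for each node remembering which requests it has already forwarded.

```python
def flood_search(graph, files, start, wanted, ttl):
    """Return the nodes holding `wanted` that are at most `ttl` hops from `start`."""
    seen = {start}          # simplification: one shared set instead of per-node memory
    frontier = [start]
    hits = set()
    for _ in range(ttl):
        next_frontier = []
        for node in frontier:
            for neighbor in graph[node]:
                if neighbor in seen:
                    continue  # do not forward the same request twice
                seen.add(neighbor)
                if wanted in files.get(neighbor, ()):
                    hits.add(neighbor)  # this user answers the request
                next_frontier.append(neighbor)
        frontier = next_frontier
    return hits


graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b", "e"], "e": ["d"]}
files = {"c": {"x.txt"}, "e": {"x.txt"}}
near = flood_search(graph, files, "a", "x.txt", ttl=2)
far = flood_search(graph, files, "a", "x.txt", ttl=3)
```

With `ttl=2` the copy held by node `e` is out of reach, so the answer is incomplete even though the file exists in the network.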
Flooding II
Systems
Gnutella, Direct Connect
Characteristics
Distributed structure
No single point of failure
Denial of service difficult (but possible)
Not scalable
Resource consumption (network)
Incomplete answers
Privacy
Average to good privacy
Onion routing (good privacy)
No global view of the system
Usually easy to obtain the shared list of a node
Difficult to have a global impact
Super Peers
[Figure: several SuperPeers connected to each other, with ordinary Peers attached to a SuperPeer]
Super Peers act as local servers
Some reliable nodes act as super peers
Super peers are connected to each other with a Gnutella-like protocol
Each super peer acts as a local server for several peers
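The two-level idea can be sketched as follows. The class `SuperPeer` and its methods are hypothetical, and the search only asks directly linked super peers (a one-hop flood) to keep the example short; real systems forward further.

```python
class SuperPeer:
    """A reliable node that indexes the files of the peers attached to it."""

    def __init__(self, name):
        self.name = name
        self.index = {}       # file name -> set of attached peers owning it
        self.neighbors = []   # other super peers this one is linked to

    def register(self, peer, filenames):
        # An ordinary peer sends its file list to its super peer.
        for filename in filenames:
            self.index.setdefault(filename, set()).add(peer)

    def search(self, filename):
        # Answer from the local index, then ask neighboring super peers
        # (one hop only: an assumption made for this sketch).
        found = set(self.index.get(filename, set()))
        for other in self.neighbors:
            found |= other.index.get(filename, set())
        return found


sp1, sp2 = SuperPeer("sp1"), SuperPeer("sp2")
sp1.neighbors.append(sp2)
sp2.neighbors.append(sp1)
sp1.register("peer-a", ["x.txt"])
sp2.register("peer-b", ["x.txt", "y.txt"])
result = sp1.search("x.txt")
```

Ordinary peers never take part in the flood; only the super peers exchange requests.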
Super Peers II
Systems
Gnutella2, Kazaa
Characteristics
Less distributed structure
Some nodes are more loaded
Some nodes are more important
Scalable
Lower resource consumption, because the number of answers is limited
3 Self-organized systems
A case study: Freenet
Ian Clarke, University of Edinburgh (1999)
Keywords
A peer-to-peer file sharing system
Provide anonymity for authors and readers
A web of Freedom
Principle
Files are referenced by key
The key is obtained by applying SHA-1 to the file
The key is routed to locate the file
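Deriving a key from the file content can be sketched with the standard `hashlib` module. The function name `content_key` is made up for this example, and Freenet's actual key formats are richer than a bare SHA-1 digest.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Return the SHA-1 digest of the file content as 40 hexadecimal characters."""
    return hashlib.sha1(data).hexdigest()


key = content_key(b"some file content")
same_key = content_key(b"some file content")
other_key = content_key(b"another file")
```

The same content always gives the same key, so anyone holding the key can check that the file received is the one requested.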
Content Driven routing algorithm
Routing table contains a set of key/node pairs
Take the nearest key in the routing table to obtain the next node to consult.
Nearest key = nearest by lexical comparison
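A sketch of the next-hop choice, under two assumptions that are not in the slides: keys have the same length, and "nearest" is measured by reading each key as a big-endian number (which preserves lexical order for equal-length keys). The routing table content below is invented for illustration.

```python
def as_number(key: str) -> int:
    # For equal-length ASCII keys, numeric order equals lexical order.
    return int.from_bytes(key.encode("ascii"), "big")


def next_hop(routing_table, key):
    """Return the node associated with the known key nearest to `key`."""
    nearest = min(routing_table, key=lambda k: abs(as_number(k) - as_number(key)))
    return routing_table[nearest]


# Hypothetical routing table: key -> node to consult for that key.
routing_table = {"abc": "node c", "acd": "node b", "abb": "node d"}
hop = next_hop(routing_table, "abd")
```

Here `abd` is nearest to `abc`, so the request is forwarded to the node stored with `abc`.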
[Figure: a Request is forwarded step by step (steps 1 to 8) among nodes a, b, c, d and e, and the Data is returned; the routing table shown maps keys (abc, acd, abb) to nodes]
On the path of the answer
The file is replicated in the caches along the path
Cache: variant of Least Recently Used (LRU)
Routing tables are updated
→ the graph evolves (new links = new entries)
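Replication on the return path can be sketched with a plain LRU cache. The slides speak of a variant of LRU, so this is a simplification; the names `LRUCache` and `deliver` are invented for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used


def deliver(path, key, data):
    """Copy `data` into the cache of every node on the return path."""
    for cache in path:
        cache.put(key, data)


caches = [LRUCache(2), LRUCache(2)]
deliver(caches, "k1", b"file 1")
deliver(caches, "k2", b"file 2")
caches[0].get("k1")               # k1 is requested again on the first node
deliver(caches, "k3", b"file 3")  # each cache is full: one entry is evicted
```

On the first node the popular `k1` survives and `k2` is evicted; on the second node, where `k1` was not requested again, it is `k1` that disappears.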
[Figure: nodes a to e with their old links and the new links (new routing-table entries); the routing tables shown hold the keys abc, acd and abb]
Anonymity
Reader
Impossible to know whether a user is forwarding or initiating the request
Impossible to know whether a user is the last to receive a file
Writer
Once the file is in the system, the writer can disconnect
Impossible to know whether someone inserted a file or merely forwarded it
Some properties
Self-organization of the graph
Nodes specialize in files with close keys (learning process)
Good properties (Small World)
Files are automatically replicated according to their popularity
Hot spots are limited
Tolerant to attacks
Drawbacks
Trade-offs
Files might disappear (LRU cache)
The network is heavily loaded
Difficult to update a value
Impossible to know what is hosted locally
4 Structured systems
Pastry
Principle
Each file has a key
Each node has an identifier
Node with identifier Id manages keys whose values are near Id
Queries
Content driven queries
Suffix forwarding
Pastry II
Table of the node 4598
0598 1598 2598 3598 4598 5598 6598 7598
x098 x198 x298 x398 x498 x598 x698 x798
xx08 xx18 xx28 xx38 xx48 xx58 xx68 xx78
xxx0 xxx1 xxx2 xxx3 xxx4 xxx5 xxx6 xxx7
[Figure: links from node 4598 to its neighbors; node identifiers shown include 87CA, D598, 1598, 2118, 0325, 3E98, 0098 and 2BB8]
Neighbors of Id are chosen so that their identifiers share a suffix with Id
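Suffix forwarding can be sketched as "hand the query to a neighbor whose identifier shares a longer suffix with the key than mine does". The function names and the neighbor list below are illustrative only; this is not the full Pastry routing procedure.

```python
def common_suffix_len(a: str, b: str) -> int:
    """Number of trailing characters that `a` and `b` have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def next_hop(current: str, neighbors, key: str) -> str:
    """Pick the neighbor sharing the longest suffix with `key`, or stay put
    if no neighbor improves on the current node."""
    best = max(neighbors, key=lambda n: common_suffix_len(n, key))
    if common_suffix_len(best, key) > common_suffix_len(current, key):
        return best
    return current


# Hypothetical neighbor list for node 4598, for illustration only.
neighbors = ["1598", "2118", "0098", "2BB8", "0325"]
hop = next_hop("4598", neighbors, "7118")
```

Each hop matches at least one more trailing digit of the key, which is where the logarithmic bound on the number of messages comes from.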
Pastry III
Pros
Guaranteed ln(n) messages
Good path redundancy
Cons
Difficult to keep a synchronized neighbor table
Problem of data redundancy
No adaptation to data dynamicity
5 Distributed Hash Table
Current state of Peer to Peer systems
A lot of redundant systems
Typically File Sharing
Common basic component
Distributed index (Key, Value)
Key is typically the filename
Value is typically the file content or where to obtain it
Each Key is associated with a node
Generic Interface
Node Id: k-bit identifier (unique)
Key: k-bit identifier (unique)
Value: bytes (can be a file, an IP, ...)
Generic DHT (Distributed Hash Table)
put(key, value)
Stores (key, value) on the node responsible for key
value = get(key)
Retrieves the data associated with key
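The put/get interface can be tried out with a toy, single-process stand-in. Everything here is an assumption made for illustration: the class name `LocalDHT`, the 32-bit identifier size, hashing names with SHA-1 to get identifiers, and the rule "the responsible node is the one whose Id is closest to the key on the ring".

```python
import hashlib

K = 32             # identifier size in bits (assumption for this sketch)
SIZE = 2 ** K


def make_id(text: str) -> int:
    """Map a node name or a key to a K-bit identifier."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % SIZE


class LocalDHT:
    """All 'nodes' live in one process; a real DHT spreads them over a network."""

    def __init__(self, node_names):
        self.stores = {make_id(name): {} for name in node_names}

    def responsible_node(self, key_id: int) -> int:
        # Node whose Id is closest to the key, in either direction on the ring.
        return min(self.stores,
                   key=lambda n: min((n - key_id) % SIZE, (key_id - n) % SIZE))

    def put(self, key: str, value: bytes) -> None:
        self.stores[self.responsible_node(make_id(key))][key] = value

    def get(self, key: str):
        return self.stores[self.responsible_node(make_id(key))].get(key)


dht = LocalDHT(["node-1", "node-2", "node-3"])
dht.put("report.pdf", b"10.0.0.7")
value = dht.get("report.pdf")
```

Because `put` and `get` compute the responsible node in the same way, a lookup reaches the node that stored the value without any central index.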
Current implementations
Software
Kademlia
Chord
CAN
Usage
File sharing
Naming
Chat service
Databases
Still limited
Fundamental Problems
Complex requests
Data coherence
Requests with several answers
Implementation difficulties
Distribute workload evenly
Keys
Requests
Only local information
Dynamic information
Chord structure
Nodes are distributed on a circle
Keys are assigned to the node with Id just before their value
[Figure: identifier circle marked 0, 64, 128 and 192, with nodes 61 and 75; keys 62, 66 and 74 are in the store of node 61]
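The assignment rule stated on this slide can be sketched with a sorted list and `bisect`. Nodes 61 and 75 come from the figure; nodes 130 and 200 are added here only to fill the circle. A key equal to a node Id is given to that node, which is an assumption of this sketch.

```python
from bisect import bisect_right


def responsible_node(node_ids, key):
    """Return the node whose Id is the closest one at or before `key` on the circle."""
    ids = sorted(node_ids)
    pos = bisect_right(ids, key)   # number of node Ids <= key
    return ids[pos - 1]            # pos == 0 wraps around to the largest Id


nodes = [61, 75, 130, 200]
owner_of_62 = responsible_node(nodes, 62)
owner_of_74 = responsible_node(nodes, 74)
owner_of_10 = responsible_node(nodes, 10)
```

Keys 62, 66 and 74 all land on node 61, as in the figure, and key 10 wraps around the circle to node 200.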
Neighbors
Log(N) neighbors
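Neighbors are nodes $Id + 1$, $Id + 2$, $Id + 4$, ..., $Id + 2^i$, ..., $Id + 2^{k-1}$ (modulo $2^k$).
[Figure: identifier circle marked 0, 64, 128 and 192, with links from one node to its neighbors]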
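The list of neighbor identifiers is a one-liner. The function name `neighbor_targets` is made up, and k = 8 is assumed because the circle in the figure runs from 0 to 255.

```python
def neighbor_targets(node_id: int, k: int):
    """Identifiers Id + 2^i (modulo 2^k) for i = 0 .. k-1."""
    return [(node_id + 2 ** i) % (2 ** k) for i in range(k)]


targets = neighbor_targets(61, 8)
```

For node 61 this gives 62, 63, 65, 69, 77, 93, 125 and 189: many close neighbors and a few far ones, with k entries in total.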
Routing algorithm
Forward to the neighbor that most closely precedes the key
A query needs at most Log(N) messages
[Figure: identifier circle marked 0, 64, 128 and 192; a query for 40 is forwarded to the node responsible for Ids between 35 and 50]
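A sketch of the greedy lookup on a small ring. Assumptions made here: k = 8, a key belongs to the node at or just before it (the rule used in these slides), and the neighbor for each Id + 2^i is the first existing node at or after that identifier. The node set is invented, apart from 35 and 50 which match the figure.

```python
from bisect import bisect_left, bisect_right

K = 8
SIZE = 2 ** K   # circle of 256 identifiers


def responsible_node(ids, key):
    """Node with the largest Id <= key, wrapping around (ids must be sorted)."""
    return ids[bisect_right(ids, key) - 1]


def first_node_from(ids, ident):
    """First existing node at or after `ident`, wrapping around the circle."""
    return ids[bisect_left(ids, ident) % len(ids)]


def neighbors(ids, node):
    """Existing nodes reached from the targets node + 2^i (mod 2^K)."""
    found = {first_node_from(ids, (node + 2 ** i) % SIZE) for i in range(K)}
    return found - {node}


def route(ids, start, key):
    """Return the list of nodes visited while looking up `key` from `start`."""
    target = responsible_node(ids, key)
    path, current = [start], start
    while current != target:
        # Keep the neighbors that do not overshoot the key, then take the
        # one closest to it (smallest clockwise distance to the key).
        candidates = [n for n in neighbors(ids, current)
                      if (key - n) % SIZE < (key - current) % SIZE]
        current = min(candidates, key=lambda n: (key - n) % SIZE)
        path.append(current)
    return path


ids = [35, 50, 75, 130, 200]
path = route(ids, 50, 40)
```

Starting from node 50, the query for key 40 first jumps far around the circle to node 200, then lands on node 35, which is responsible for identifiers between 35 and 50.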
Chord characteristics
Efficient
If a (key, value) exists, the query will find it
Fast: Log2(1,000,000) ≈ 20
Small neighbor table: Log2(N) entries
Chord characteristics
Some problems
Security and privacy
Attacks
How to test and evaluate such a system?
Real performance (instead of number of messages)
Physical overlay
Logical topology mapped onto the physical network:
[Figure: a Query and its Answer between nodes N1 and N2, drawn over the physical network]
Conclusion
Peer-to-Peer systems are efficient for several uses (using resources at the edge of the network)
Recent systems are scalable
Low cost alternative to Client/Server
The field is mature enough to be used in real cases
Still not perfect
Trust & certification
Anonymity
Security
Performance
Lawyers' fees
When to use Peer to Peer systems
Limited budget
Large audience
Trusted users
Dynamic system, but not too dynamic
No guarantees needed
No control needed
Vision of the future
User centered
No more servers
All content provided and served by users
Only cooperation of peers
Wikipedia
Social networks
YouTube
Good Ol' Time web-pages