The BitTorrent content distribution system CS217 Advanced Topics in Internet Research Guest Lecture Nikitas Liogkas, 5/11/2006

The BitTorrent content distribution system

CS217 Advanced Topics in Internet Research

Guest Lecture

Nikitas Liogkas, 5/11/2006

Motivation

● flash crowd (aka Slashdot) effect: many clients, few servers
● Problem: servers cannot handle the load
● Solution: swarming
  ○ clients download pieces of the file from each other
  ○ has been proven to have good scaling and performance properties

Presentation outline

● Joining the system
● Encoding / metadata file
● Tracker protocol
● Peer wire protocol
● Piece selection
● Peer selection
● Client implementations
● Resources

Joining a torrent

Peers are divided into:
● seeds: have the entire file
● leechers: still downloading

[Figure: a new leecher, a website, a tracker, and a seed/leecher, with arrows labeled 1 (metadata file), 2 (join), 3 (peer list), and 4 (data request)]

1. obtain the metadata file (out of band)
2. contact the tracker
3. obtain a peer list (contains seeds & leechers)
4. contact peers from that list for data

Exchanging data

[Figure: leecher A sends "I have!" to the seed, leecher B, and leecher C]

● verify pieces using hashes
● download sub-pieces (blocks) in parallel
● advertise received pieces to the entire peer list
● interested: need pieces that a given peer has
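
The verification and "interested" bullets above can be sketched in a few lines. This is a minimal illustration, not any client's actual code: the function names and the use of Python sets to represent which pieces a peer holds are assumptions made for the example.

```python
import hashlib


def verify_piece(piece_data, expected_sha1):
    """Return True if a downloaded piece matches its SHA-1 hash from the metadata file."""
    return hashlib.sha1(piece_data).digest() == expected_sha1


def is_interested(my_pieces, peer_pieces):
    """We are interested in a peer if it holds at least one piece we are missing.

    Both arguments are collections of piece indices (an assumption of this sketch).
    """
    return bool(set(peer_pieces) - set(my_pieces))
```

A piece that fails verification would be discarded and requested again, and a "have" advertisement would only be sent once verification succeeds.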

Bencoding

● encoding format of the metadata file and tracker responses
● four types
  ○ byte strings
  ○ integers
  ○ lists
  ○ dictionaries (mapping keys to values)
● examples
  ○ 4:spam represents the string “spam”
  ○ i10e represents the integer 10
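
To make the four types concrete, here is a minimal encoder and decoder sketch. The function names `bencode` and `bdecode` are chosen for this example, and error handling for malformed input is deliberately thin.

```python
def bencode(value):
    """Encode an int, bytes/str, list, or dict as bencoded bytes."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are written in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def bdecode(data, pos=0):
    """Decode one value starting at pos; return (value, position after it)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            val, pos = bdecode(data, pos)
            result[key] = val
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid bencoded data at offset %d" % pos)
```

For instance, `bencode("spam")` yields `b"4:spam"` and `bencode(10)` yields `b"i10e"`, matching the two examples on the slide.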

Metadata file structure

● contains information necessary to contact the tracker and describes the files in the torrent
  ○ announce URL of tracker
  ○ file name
  ○ file length
  ○ piece length (typically 256 KB)
  ○ SHA-1 hashes of pieces for verification
  ○ also creation date, comment, creator, …
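
As a rough illustration of how the piece hashes relate to the file length and piece length, the sketch below splits a byte string into pieces and concatenates their 20-byte SHA-1 digests. The function names are hypothetical, and reading the whole file into memory is a simplification.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # the typical piece length quoted above


def piece_count(file_length, piece_length=PIECE_LENGTH):
    """Number of pieces; the last piece may be shorter than piece_length."""
    return (file_length + piece_length - 1) // piece_length


def piece_hashes(data, piece_length=PIECE_LENGTH):
    """Concatenate the SHA-1 digest (20 bytes) of every piece of data."""
    digests = []
    for offset in range(0, len(data), piece_length):
        digests.append(hashlib.sha1(data[offset:offset + piece_length]).digest())
    return b"".join(digests)
```

With these numbers, a 700 MB file (700 × 1024 × 1024 bytes) has 2800 pieces, so its hashes occupy 2800 × 20 = 56,000 bytes of the metadata file.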

Tracker protocol

● communicates with clients via HTTP/HTTPS
● client GET request
  ○ info_hash: uniquely identifies the file
  ○ peer_id: chosen by and uniquely identifies the client
  ○ client IP and port
  ○ numwant: how many peers to return (defaults to 50)
  ○ stats: bytes uploaded, downloaded, left
● tracker GET response
  ○ interval: how often to contact the tracker
  ○ list of peers, containing peer id, IP and port
  ○ stats: complete, incomplete
● tracker-less mode; based on the Kademlia DHT
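
A client's GET request could be assembled roughly as follows. This sketch only builds the URL and performs no network I/O; the tracker address, the peer_id value, and the function name `announce_url` are made up for the example.

```python
from urllib.parse import urlencode


def announce_url(tracker, info_hash, peer_id, port,
                 uploaded, downloaded, left, numwant=50):
    """Build the announce URL for a tracker GET request.

    info_hash and peer_id are 20-byte values; urlencode percent-escapes
    any bytes that are not URL-safe.
    """
    params = {
        "info_hash": info_hash,
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
        "numwant": numwant,
    }
    return tracker + "?" + urlencode(params)


# Hypothetical values, for illustration only.
url = announce_url("http://tracker.example.org/announce",
                   bytes(range(20)), b"-XX0001-123456789012",
                   port=6881, uploaded=0, downloaded=0, left=1000)
```

The tracker's reply would be a bencoded dictionary carrying the interval, the peer list, and the complete/incomplete counts listed above.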

Peer wire protocol

● implemented directly on top of TCP
● messages
  ○ handshake (maybe with bitfield)
  ○ keep-alive
  ○ choke / unchoke
  ○ interested / not interested
  ○ have (advertisement of a newly acquired piece)
  ○ request / piece
  ○ cancel (only used in “endgame mode”)
  ○ port (used in tracker-less mode)
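
The messages above are framed as a 4-byte big-endian length prefix followed by a 1-byte message id and a payload. The sketch below encodes a few of them; the helper names are invented for this example, and the 16 KB default block size is a common convention assumed here rather than something stated on the slide.

```python
import struct

# Message ids as defined by the peer wire protocol.
MESSAGE_IDS = {
    "choke": 0, "unchoke": 1, "interested": 2, "not_interested": 3,
    "have": 4, "bitfield": 5, "request": 6, "piece": 7, "cancel": 8, "port": 9,
}

# A keep-alive is a message of length zero with no id.
KEEP_ALIVE = struct.pack(">I", 0)


def encode_message(name, payload=b""):
    """Frame a message: <length prefix><message id><payload>."""
    return struct.pack(">IB", 1 + len(payload), MESSAGE_IDS[name]) + payload


def have(piece_index):
    """Advertise a newly acquired piece."""
    return encode_message("have", struct.pack(">I", piece_index))


def request(piece_index, begin, length=16 * 1024):
    """Request one sub-piece (block) of a piece."""
    return encode_message("request", struct.pack(">III", piece_index, begin, length))


def handshake(info_hash, peer_id):
    """68-byte handshake: protocol string, 8 reserved bytes, info_hash, peer_id."""
    pstr = b"BitTorrent protocol"
    return bytes([len(pstr)]) + pstr + bytes(8) + info_hash + peer_id
```

A cancel message carries the same three fields as a request, so it could be built the same way with the "cancel" id.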

Piece selection

● when downloading starts: choose at random
  ○ get complete pieces as quickly as possible
  ○ obtain something to offer to others
● after we have 4 pieces: pick (local) rarest first
  ○ achieves the fastest replication of rare pieces
  ○ obtain something of value
  ○ only get unique pieces from the seed
● endgame mode
  ○ defense against the “last-block problem”
  ○ send requests for missing sub-pieces to all peers in our peer list
  ○ send cancel messages upon receipt of a sub-piece
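
The first two policies (random first, then local rarest first) can be sketched as a single selection function. The data structures here are assumptions for illustration: `have` is the set of piece indices we already hold, and `peer_has` maps each peer in our peer list to the set of pieces it has advertised. Endgame mode is not modeled.

```python
import random
from collections import Counter


def pick_piece(have, peer_has, source, random_first=4, rng=random):
    """Choose the next piece to request from the peer `source`, or None."""
    wanted = peer_has[source] - have
    if not wanted:
        return None
    if len(have) < random_first:
        # Start-up phase: any piece will do, so pick at random.
        return rng.choice(sorted(wanted))
    # Count how many peers in our (local) peer list hold each piece.
    counts = Counter(p for pieces in peer_has.values() for p in pieces)
    rarest = min(counts[p] for p in wanted)
    # Break ties among equally rare pieces at random.
    return rng.choice(sorted(p for p in wanted if counts[p] == rarest))
```

"Local" matters here: rarity is estimated only from the peers we are connected to, not from the whole torrent.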

Last-block problem

● at the end of the download, a peer may have trouble finding the few missing pieces
● based on anecdotal evidence
● other proposals
  ○ network coding [Gkantsidis et al., Infocom’05]
  ○ prefer to upload to peers with similar file completeness; unfair for the peers having most of the pieces [Tian et al., Infocom’06]

Last-block problem – a myth?

● is it a problem after all?

[Figure from [Legout et al., INRIA-TR-2006], with permission]

Peer selection - unchoking

[Figure: leecher A, the seed, leecher B, and leecher C]

● periodically (typically every 10 seconds) calculate data-receiving rates
● upload to (unchoke) the fastest
● constant number of unchoking slots
● based on the “tit-for-tat” strategy
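
One unchoking round for a leecher might look like the sketch below. The function name and the default of 4 slots are assumptions (the slide only says the number of slots is constant), and rates are taken as already measured over the last interval.

```python
def choose_unchoked(receive_rates, interested, slots=4):
    """Return the set of peers to unchoke for the next round.

    receive_rates: dict mapping peer -> bytes/s we received from that peer
    interested:    peers that currently want data from us
    slots:         number of regular unchoking slots (assumed value)
    """
    ranked = sorted(interested,
                    key=lambda peer: receive_rates.get(peer, 0.0),
                    reverse=True)
    return set(ranked[:slots])
```

Called once per round (typically every 10 seconds), this rewards the peers that upload to us fastest, which is the tit-for-tat element of the strategy.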

Optimistic unchoking

● periodically select a peer at random and upload to it
  ○ typically every 3 unchoking rounds (30 seconds)
● multi-purpose mechanism
  ○ allow bootstrapping of new clients
  ○ continuously look for the fastest partners
  ○ robustness: every peer has a non-zero chance of interacting with any other peer
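
The rotation described above can be sketched as a small state machine that is advanced once per unchoking round. The class and method names are hypothetical, and how the optimistic slot is counted against the regular slots is left out.

```python
import random


class OptimisticUnchoker:
    """Pick a random choked-but-interested peer every few unchoking rounds."""

    def __init__(self, rounds_per_rotation=3, rng=None):
        self.rounds_per_rotation = rounds_per_rotation
        self.rng = rng or random.Random()
        self.round = 0
        self.current = None

    def next_round(self, interested, regular_unchoked):
        """Advance one round and return the optimistically unchoked peer (or None)."""
        candidates = sorted(set(interested) - set(regular_unchoked))
        rotate = self.round % self.rounds_per_rotation == 0
        if rotate or self.current not in candidates:
            # Re-draw on schedule, or if the current peer is no longer eligible.
            self.current = self.rng.choice(candidates) if candidates else None
        self.round += 1
        return self.current
```

With 10-second rounds and the default of 3 rounds per rotation, the optimistic peer changes every 30 seconds, matching the typical values on the slide.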

Seed unchoking

● old algorithm
  ○ unchoke the fastest leechers
  ○ problem: fastest peers may monopolize seeds
● new algorithm
  ○ periodically sort all leechers according to their last unchoke time
  ○ prefer the most recently unchoked leechers; on a tie, prefer the fastest
  ○ (presumably) achieves equal spread of seed bandwidth
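
The ordering step of the new algorithm, exactly as summarized above, can be written as a two-key sort. This is a sketch of the slide's description only, not of the mainline client's code: the tuple layout and function name are assumptions, and any additional rules the real client applies when rotating leechers through the slots are not modeled.

```python
def seed_unchoke_order(leechers):
    """Order leechers for a seed's unchoking decision.

    leechers: list of (peer, last_unchoke_time, upload_rate) tuples, where
    upload_rate is how fast we are uploading to that peer.
    Most recently unchoked first; ties broken by the higher rate.
    """
    ranked = sorted(leechers, key=lambda entry: (-entry[1], -entry[2]))
    return [peer for peer, _, _ in ranked]
```

For example, with entries `("a", 100, 5)`, `("b", 120, 1)`, and `("c", 100, 9)`, the order is b, c, a: b was unchoked most recently, and c beats a on rate.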

Downloading only from seeds

[Figure: a new list request sent to the tracker, which returns a peer list; peers shown are leecher A, the seed, leecher B, and leecher C]

● repeatedly query the tracker for peer lists
● distinguish the seeds, and receive data from them
● violates fairness model; may be harmful to honest peers

Rate- vs. volume-based selection

Proponents of rate-based decisions: [Cohen, P2PECON’03] and [INRIA TR’2006]

Proponents of volume-based decisions: [Bharambe et al., MSR-TR-2005], [Gkantsidis et al., Infocom’05], [Jun et al., P2PECON’05], and the eDonkey file-sharing system

No clear winner yet!

Client implementations

● mainline: written in Python; right now, the only one employing the new seed unchoking algorithm
● Azureus: the most popular, written in Java; implements a special protocol between clients (e.g. peers can exchange peer lists)
● other popular clients: ABC, BitComet, BitLord, BitTornado, μTorrent, Opera browser
● various non-standard extensions
  ○ retaliation mode: detect compromised/malicious peers
  ○ anti-snubbing: ignore a peer who ignores us
  ○ super seeding: seed masquerading as a leecher

Resources #1

● Basic BitTorrent mechanisms: [Cohen, P2PECON’03]
● BitTorrent specification Wiki: http://wiki.theory.org/BitTorrentSpecification
● Measurement studies: [Izal et al., PAM’04], [Pouwelse et al., Delft TR 2004 and IPTPS’05], [Guo et al., IMC’05], and [Legout et al., INRIA-TR-2006]

Resources #2

● Theoretical analysis and modeling: [Qiu et al., SIGCOMM’04] and [Tian et al., Infocom’06]
● Simulations: [Bharambe et al., MSR-TR-2005]
● Sharing incentives and exploiting them: [Shneidman et al., PINS’04], [Jun et al., P2PECON’05], and [Liogkas et al., IPTPS’06]

Conclusion and food for thought

● BitTorrent is fast and robust
● Yet, many parameters are arbitrarily set
  ○ number of unchoking slots
  ○ unchoking round duration
  ○ size of pieces / sub-pieces
● What can we learn from BitTorrent for the design of future P2P content distribution protocols?