2014-11-26 | Creating a BitTorrent Client with Scala and Akka, Part 1 (Vienna Scala User Group)
Part I
Dominik Gruber, @the_dom Scala Vienna User Group – Nov. 26, 2014
Creating a BitTorrent Client with Scala and Akka
Dominik Gruber • @the_dom • scala-torrent
Agenda
• Akka: Brief Introduction
• BitTorrent: Basic Procedure
• Bencoding
• Tracker
• Peer Wire Protocol
• Software Architecture
• Actor Model for the JVM
• Scala & Java API
• Part of the Typesafe Platform
• http://akka.io
• Simple Concurrency & Distribution: Asynchronous and distributed by design. High-level abstractions like Actors, Futures and STM.
• Elastic & Decentralized: Adaptive load balancing, routing, partitioning and configuration-driven remoting.
• Resilient by Design: Write systems that self-heal. Remote and/or local supervisor hierarchies.
• High Performance: 50 million msg/sec on a single machine. Small memory footprint; ~2.5 million actors per GB of heap.
BitTorrent
• Most popular protocol for peer-to-peer file sharing
• Designed in 2001
• Est. over 250 million users/month
• Currently responsible for 3.35% of all worldwide bandwidth
http://en.wikipedia.org/wiki/BitTorrent#mediaviewer/File:BitTorrent_network.svg
Basic Procedure
1. Load information from a metainfo (.torrent) file
2. Request peers from tracker
3. Connect to peers via handshake
4. Exchange data via predefined messages (Peer Wire Protocol)
Metainfo File
• In Bencode format
• Includes URL of the tracker, list of files / pieces,…
• The info hash (SHA-1 of the bencoded info dictionary) has to be calculated from the file; it identifies the torrent in communication with tracker and peers
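A minimal sketch of the info hash calculation, assuming the raw bencoded bytes of the info dictionary have already been cut out of the metainfo file. The object and method names (InfoHashSketch, infoHash, toHex) are hypothetical and not taken from scala-torrent.

```scala
import java.security.MessageDigest

// Hypothetical helper, not part of scala-torrent.
object InfoHashSketch {
  // SHA-1 over the raw bencoded bytes of the "info" dictionary (20-byte result).
  def infoHash(bencodedInfo: Array[Byte]): Array[Byte] =
    MessageDigest.getInstance("SHA-1").digest(bencodedInfo)

  // Lower-case hex rendering, handy for logging and comparison.
  def toHex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02x").mkString
}
```

The hash has to be taken over the bytes exactly as they appear in the file; re-encoding a parsed dictionary only gives the same result if the encoder reproduces the original byte for byte.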
Bencoding
• Integers: i<integer encoded in base ten ASCII>e, e.g. i42e
• Strings: <string length>:<string data>, e.g. 4:spam; 12:scala vienna
Bencoding
• Lists: l<bencoded values>e, e.g. l5:scala7:torrente
• Dictionaries: d<bencoded string><bencoded element>e, e.g. d5:scalal4:akka4:playe4:java15:enterprise helle
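The four rules can be illustrated with a small encoder. This is only a sketch: the object name BencodeSketch is hypothetical, strings are assumed to hold one byte per character, and dictionary keys are written in sorted order, as the BitTorrent specification asks for.

```scala
// Hypothetical helper, not the encoder used in scala-torrent.
object BencodeSketch {
  // Encodes Int/Long, String, List and Map values following the four rules.
  def encode(value: Any): String = value match {
    case i: Int      => s"i${i}e"
    case l: Long     => s"i${l}e"
    // Assumption: one character per byte, so length == byte count.
    case s: String   => s"${s.length}:$s"
    case xs: List[_] => xs.map(encode).mkString("l", "", "e")
    case m: Map[_, _] =>
      m.toSeq
        .map { case (k, v) => (k.toString, v) }
        .sortBy(_._1) // keys are emitted in sorted order
        .map { case (k, v) => encode(k) + encode(v) }
        .mkString("d", "", "e")
    case other =>
      throw new IllegalArgumentException(s"Cannot bencode: $other")
  }
}
```

For example, BencodeSketch.encode(List("scala", "torrent")) yields l5:scala7:torrente. Because keys are sorted, a dictionary with the keys scala and java is written with java first.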
Example: Ubuntu 14.04.1
d8:announce39:http://torrent.ubuntu.com:6969/announce13:announce-listll39:http://torrent.ubuntu.com:6969/announceel44:http://ipv6.torrent.ubuntu.com:6969/announceee7:comment29:Ubuntu CD releases.ubuntu.com13:creation datei1406245742e4:infod6:lengthi599785472e4:name31:ubuntu-14.04.1-server-amd64.iso12:piece lengthi524288e6:pieces22880:(…)
Parsing Bencode with Scala
• Scala Parser Combinators
• Since Scala 2.11 a separate library
• Any structured format can be defined and parsed through its custom DSL
Parsing Bencode with Scala
import scala.util.parsing.combinator._

object BencodeParser extends RegexParsers {
  // i<digits>e: zero, or an optionally negative number without leading zeros
  def integer: Parser[Int] =
    "i" ~> """(0|\-?[1-9]\d*)""".r <~ "e" ^^ (_.toInt)
  // (…)
}
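Parsing Bencode with Scala
object BencodeParser extends RegexParsers {
  // (…)
  // rep instead of rep1, so that the empty list "le" also parses
  def list: Parser[List[Any]] =
    "l" ~> rep(bencodeElem) <~ "e"

  // rep instead of rep1, so that the empty dictionary "de" also parses
  def dictionary: Parser[Map[String, Any]] =
    "d" ~> rep(string ~ bencodeElem) <~ "e" ^^ (_.map(x => (x._1, x._2)).toMap)
  // (…)
}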
Parsing Bencode with Scala
object BencodeParser extends RegexParsers {
  // (…)
  def string: Parser[String] = new Parser[String] {
    def apply(in: Input) = {
      val source = in.source
      val offset = in.offset
      val start = handleWhiteSpace(source, offset)
      // <length>:<data>; "*" instead of "+" so that the empty string "0:" also matches
      """(\d+):([\s\S]*)""".r findPrefixMatchOf source.subSequence(start, source.length) match {
        case Some(matched) =>
          val length = matched.group(1).toInt
          if (length <= matched.group(2).length)
            Success(
              matched.group(2).substring(0, length),
              // Skip the length digits, the colon and the string data
              in.drop(start + length.toString.length + 1 + length - offset)
            )
          else
            Failure("Provided length is longer than the remaining input", in.drop(start - offset))
        case None =>
          Failure("Input is not a string", in.drop(start - offset))
      }
    }
  }
  // (…)
}
Parsing Bencode with Scala
object BencodeParser extends RegexParsers {
  // (…)
  def bencodeElem = string | integer | list | dictionary

  def apply(input: String) = parseAll(bencodeElem, input)
}
Tracker
• HTTP service which holds information about a torrent
• The client contacts it via an HTTP GET request when a transfer is started, stopped, or completed
• Responds with a list of peers as a bencoded dictionary
• Tracker is a single point of failure; the DHT extension addresses this
Tracker: Example Request
http://torrent.ubuntu.com:6969/announce
  ?event=started
  &info_hash=-%06l%94H%0A%DC%F5%2B%FD%11%85%A7%5E%B4%DD%C1wvs
  &peer_id=-SC0001-546306326124
  &port=6881
  &numwant=50
  &downloaded=0
  &left=599785472
  &uploaded=0
  &compact=1
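The info_hash parameter in the request is the raw 20-byte hash with every byte outside the URL-unreserved set percent-encoded. A minimal sketch of that encoding follows; TrackerUrlSketch and urlEncode are hypothetical names, not taken from scala-torrent.

```scala
// Hypothetical helper, not part of scala-torrent.
object TrackerUrlSketch {
  // Characters that may appear unescaped: letters, digits, '-', '_', '.', '~'.
  private val unreserved: Set[Char] =
    (('a' to 'z') ++ ('A' to 'Z') ++ ('0' to '9')).toSet ++ Set('-', '_', '.', '~')

  // Percent-encodes arbitrary bytes for use as a query parameter value.
  def urlEncode(bytes: Array[Byte]): String =
    bytes.map { b =>
      val c = (b & 0xff).toChar
      if (unreserved(c)) c.toString else f"%%${b & 0xff}%02X"
    }.mkString
}
```

A general-purpose string encoder such as java.net.URLEncoder works on characters rather than bytes, which is why the sketch encodes byte by byte.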
Tracker: Example Response
d8:completei748e10:incompletei8e8:intervali1800e5:peers300:(…)e
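With compact=1 in the request, the peers value is a byte string with 6 bytes per peer: 4 bytes for the IPv4 address and 2 bytes for the port in network byte order, so the 300 bytes above would describe 50 peers. A sketch of decoding it, using the hypothetical names CompactPeersSketch and parse:

```scala
import java.net.{InetAddress, InetSocketAddress}

// Hypothetical helper, not part of scala-torrent.
object CompactPeersSketch {
  // Splits the compact peer string into 6-byte entries (4 bytes IPv4, 2 bytes port).
  def parse(peers: Array[Byte]): List[InetSocketAddress] =
    peers.grouped(6).filter(_.length == 6).map { entry =>
      val address = InetAddress.getByAddress(entry.take(4))
      // Port is big-endian; mask to treat the signed bytes as unsigned.
      val port = ((entry(4) & 0xff) << 8) | (entry(5) & 0xff)
      new InetSocketAddress(address, port)
    }.toList
}
```

A trailing incomplete entry is silently dropped in this sketch; a real client might prefer to treat it as a malformed response.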
Peer Wire Protocol
• Specifies communication with peers
• TCP
• Basic format of every message except the handshake: <length prefix><message ID><payload>
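Because TCP delivers a byte stream, a receiver has to cut it into frames itself. The sketch below assumes the usual 4-byte big-endian length prefix and a 1-byte message ID, and treats a length of zero as a keep-alive without an ID. MessageFrameSketch, Frame and decode are hypothetical names.

```scala
import java.nio.ByteBuffer

// Hypothetical helper, not part of scala-torrent.
object MessageFrameSketch {
  // id is None for a keep-alive (length prefix of zero).
  final case class Frame(id: Option[Byte], payload: Vector[Byte])

  // Returns the first complete frame plus the unconsumed bytes,
  // or None if the buffer does not yet hold a complete frame.
  def decode(buffer: Vector[Byte]): Option[(Frame, Vector[Byte])] =
    if (buffer.length < 4) None
    else {
      val length = ByteBuffer.wrap(buffer.take(4).toArray).getInt
      if (length < 0 || buffer.length - 4 < length) None
      else {
        val (body, rest) = buffer.drop(4).splitAt(length)
        val frame =
          if (length == 0) Frame(None, Vector.empty)
          else Frame(Some(body.head), body.tail)
        Some((frame, rest))
      }
    }
}
```

Calling decode repeatedly on the remaining bytes drains every complete message from a buffer; whatever is left over waits for the next chunk of data.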
PWP: Handshake
• <pstrlen><pstr><reserved><info_hash><peer_id>
• pstr = “BitTorrent protocol”
• reserved: Eight bytes, used to indicate supported extensions
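As a sketch, the handshake can be laid out with a ByteBuffer. It assumes a 1-byte pstrlen, all reserved bytes set to zero (no extensions), and a 20-byte info hash and peer ID, which gives 68 bytes in total. HandshakeSketch is a hypothetical name and independent of the Handshake class in scala-torrent.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets.ISO_8859_1

// Hypothetical helper, independent of scala-torrent's Handshake class.
object HandshakeSketch {
  val Pstr: Array[Byte] = "BitTorrent protocol".getBytes(ISO_8859_1)

  // Layout: <pstrlen><pstr><reserved><info_hash><peer_id>
  def marshal(infoHash: Array[Byte], peerId: Array[Byte]): Array[Byte] = {
    require(infoHash.length == 20, "info hash must be 20 bytes")
    require(peerId.length == 20, "peer id must be 20 bytes")
    ByteBuffer.allocate(1 + Pstr.length + 8 + 20 + 20)
      .put(Pstr.length.toByte)
      .put(Pstr)
      .put(new Array[Byte](8)) // reserved: no extensions advertised
      .put(infoHash)
      .put(peerId)
      .array()
  }
}
```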
PWP: Bitfield
• Sent immediately after handshake (optional)
• <len=0001+X><id=5><bitfield>
• X: length of bitfield
• Bitfield represents the pieces that have successfully been downloaded
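Looking up a piece in the bitfield comes down to a little bit arithmetic. The sketch assumes the layout from the BitTorrent specification, where the high bit of the first byte stands for piece 0. BitfieldSketch and hasPiece are hypothetical names.

```scala
// Hypothetical helper, not part of scala-torrent.
object BitfieldSketch {
  // True if the bit for the given piece index is set.
  // Assumes index >= 0; indices beyond the bitfield count as missing.
  def hasPiece(bitfield: Array[Byte], index: Int): Boolean = {
    val byteIndex = index / 8
    byteIndex < bitfield.length &&
      (bitfield(byteIndex) & (0x80 >> (index % 8))) != 0
  }
}
```

For a bitfield consisting of the single byte 0xA0 (binary 1010 0000), pieces 0 and 2 are available and piece 1 is not.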
PWP: Request
• <len=0013><id=6><index><begin><length>
• Request a block of a piece
• index: Piece number
• begin / length: Specify the block within the piece
• Requests can be canceled via the CANCEL message
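A request message could be serialized as in the sketch below, assuming a 4-byte big-endian length prefix and 4-byte big-endian integers for index, begin and length. RequestSketch is a hypothetical name and not the message model used in scala-torrent.

```scala
import java.nio.ByteBuffer

// Hypothetical helper, not part of scala-torrent.
object RequestSketch {
  // Layout: <len=0013><id=6><index><begin><length>; ByteBuffer is big-endian by default.
  def marshal(index: Int, begin: Int, length: Int): Array[Byte] =
    ByteBuffer.allocate(4 + 13)
      .putInt(13)      // length prefix: 1 byte ID + 3 * 4 bytes payload
      .put(6.toByte)   // message ID of REQUEST
      .putInt(index)
      .putInt(begin)
      .putInt(length)
      .array()
}
```

RequestSketch.marshal(0, 16384, 16384) would ask for the second 16 KiB block of piece 0; the block size is chosen only for illustration.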
PWP: Piece
• <len=0009+X><id=7><index><begin><block>
• X: Length of block
• If a piece has been fully received, it is acknowledged via the HAVE message
PWP: Other Messages
• keep-alive
• choke / unchoke
• interested / not interested
scala-torrent
• Akka/the actor model is a good fit for this project
• The components can be clearly separated
• Supervision is needed
• TCP communication via Akka I/O
• HTTP communication via spray
Actors (diagram): Coordinator, Connection Handler (receives incoming connections), Torrent, Tracker, Peer Connection
Connection Handler
import java.net.InetSocketAddress
import akka.actor.{Actor, ActorRef, Props}
import akka.io.{IO, Tcp}

object ConnectionHandler {
  case class CreatePeerConnection(peer: PeerInformation)
  case class PeerConnectionCreated(connection: ActorRef, peer: PeerInformation)
}

class ConnectionHandler(endpoint: InetSocketAddress, internalPeerId: String) extends Actor {
  import Tcp._
  import context.system
  import ConnectionHandler._

  // Torrent coordinator actor
  val coordinator = context.parent

  // Start listening to incoming connections
  IO(Tcp) ! Tcp.Bind(self, endpoint)

  // (…)
}
Connection Handler
class ConnectionHandler(endpoint: InetSocketAddress, internalPeerId: String) extends Actor {
  // (…)

  override def receive = {
    case CommandFailed(_: Bind) =>
      // TODO: Handle failure

    case c @ Connected(remoteAddress, _) =>
      val handler = createPeerConnectionActor(remoteAddress)
      sender() ! Register(handler)

    case CreatePeerConnection(peer) =>
      val peerConnection = createPeerConnectionActor(peer.inetSocketAddress)
      sender() ! PeerConnectionCreated(peerConnection, peer)
  }

  private def createPeerConnectionActor(remoteAddress: InetSocketAddress) =
    context.actorOf(
      Props(classOf[PeerConnection], remoteAddress, internalPeerId, coordinator),
      "peer-connection-" + remoteAddress.toString.replace("/", "")
    )
}
Peer Connection
class PeerConnection(remoteAddress: InetSocketAddress, internalPeerId: String, coordinator: ActorRef) extends Actor {
  // (…)

  override def receive: Receive = initialStage

  def initialStage: Receive = {
    case AttachToTorrent(t, m) =>
      torrent = Some(t)
      metainfo = Some(m)

    case SendHandshake if torrent.isDefined =>
      if (connection.isDefined) sendHandshake()
      else {
        IO(Tcp) ! Connect(remoteAddress)
        context become awaitConnectionForHandshake
      }

    case Received(data) => handleHandshakeIn(data)
    case PeerClosed => handlePeerClosed()
  }

  // (…)
}
Peer Connection
class PeerConnection(remoteAddress: InetSocketAddress, internalPeerId: String, coordinator: ActorRef) extends Actor {
  // (…)

  def handleHandshakeIn(data: ByteString) = {
    Handshake.unmarshal(data.toVector) match {
      case Some(handshake: Handshake) =>
        connection = Some(context.sender())
        if (torrent.isDefined) context become connected
        else coordinator ! IncomingPeerConnection(self, handshake)
      case None =>
        // TODO: Handle failure
    }
  }

  def sendHandshake() = {
    val handshake = Handshake(metainfo.get.info.infoHash, internalPeerId)
    connection.get ! Write(ByteString(handshake.marshal.toArray))
  }
}
Current Status
• Bencode parsing (and encoding)
• Communication with tracker
• Modelling of the PWP messages
• Handshake with clients
• TODO: File exchange (= the core)
Q & A
Source
• https://github.com/TheDom/scala-torrent
• http://www.bittorrent.org/beps/bep_0003.html
• http://jonas.nitro.dk/bittorrent/bittorrent-rfc.html
• https://wiki.theory.org/BitTorrentSpecification