P2P file sharing and traffic management

Dmitri Moltchanov

Department of Communications Engineering
Tampere University of Technology


October 8, 2014

Based on slides provided by R. Dunaytsev http://utopia.duth.gr/rdunayts/

Outline

1 Problem statement

2 Overprovisioning

3 Blocking traffic

4 Bandwidth caps

5 Traffic shaping

6 Network caching

7 Traffic localization

8 Learning outcomes

Dmitri Moltchanov (TUT) ELT-53206, Lecture 6 October 8, 2014 2 / 47


Problem Statement

What do ISPs have to cope with:

Tremendous growth in Internet traffic

According to Cisco VNI, global traffic will quadruple from 2009 to 2014

Mix of applications with very different requirements

Bandwidth-hungry file sharing vs. delay-sensitive multimedia streaming
Poor application performance during congestion leads to customer dissatisfaction

P2P content distribution

Lawful and illegal content sharing should be differentiated somehow
Use of encryption and obfuscation in P2P file sharing systems

Highly competitive marketplace

Customer dissatisfaction aggravates subscriber churn

The P2P overlay is completely mismatched with the underlying network topology!


Problem Statement (cont’d)

What would ISPs like to achieve:

Maximize revenue by:

Assuring high QoS to meet the requirements of applications and users
Providing competitive services and increasing subscriber base
Decreasing subscriber churn and increasing customer loyalty
Charging content owners and CDNs based on the QoS guarantees
Offering own content and tiered services based on the QoS guarantees

Minimize costs by:

Deferring investments in infrastructure upgrades
Reducing inter-ISP transit expenses
Deploying flexible and scalable solutions


Problem Statement (cont’d)

Proposed solutions for managing P2P (and other) traffic:

1 Acquire more bandwidth (aka overprovisioning)

2 Block P2P traffic

3 Implement bandwidth caps (aka quotas)

4 Shape P2P traffic

5 Utilize network caching (aka in-network storage)

6 Localize P2P traffic


Overprovisioning

Overprovisioning refers to the practice where links are dimensioned so that the available bandwidth exceeds the expected peak or the average traffic load by a certain margin

G. Finnie, "ISP Traffic Management Technologies", 2009

The most widely established technique!

The overprovisioning ratio varies widely and depends on:

Underlying network topology and technology
Volume of traffic and anticipated variation in traffic loads
Number of users
Mix of applications
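
The dimensioning rule in the definition above can be written as a one-line calculation. This is a minimal sketch; the function names and the 50% margin in the usage note are illustrative assumptions, not values from the lecture.

```python
def required_capacity_mbps(expected_load_mbps: float, margin: float) -> float:
    """Link capacity that exceeds the expected load by a fractional margin.

    `expected_load_mbps` may be the expected peak or the average load,
    depending on which one the operator dimensions against.
    """
    if expected_load_mbps < 0 or margin < 0:
        raise ValueError("load and margin must be non-negative")
    return expected_load_mbps * (1.0 + margin)


def overprovisioning_ratio(capacity_mbps: float, expected_load_mbps: float) -> float:
    """Ratio of provisioned capacity to expected load (1.0 means no headroom)."""
    if expected_load_mbps <= 0:
        raise ValueError("expected load must be positive")
    return capacity_mbps / expected_load_mbps
```

For example, an assumed peak load of 800 Mbit/s with a 50% margin calls for a 1200 Mbit/s link, i.e., an overprovisioning ratio of 1.5.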


Overprovisioning (cont’d)

Benefits of overprovisioning:

It is much easier to control an overprovisioned network
With an overprovisioned network, an ISP is prepared for the future

Shortcomings of overprovisioning:

Very limited life span (recall the operation of TCP-like protocols, which expand to use the available bandwidth!)
It also involves costly capital expenditures

Conclusion:

Adding extra bandwidth alone cannot solve the P2P problem
Acquiring more bandwidth as part of a growth strategy is necessary
Acquiring bandwidth solely to manage P2P traffic is untenable


Blocking Traffic

P2P blocking refers to the practice of blocking ports at the network access point that are commonly used by popular P2P applications

Benefits of blocking:

Reduction in bandwidth usage by preventing all P2P traffic from entering the network

Shortcomings of blocking:

It is not easy to block P2P traffic since it is able to masquerade as non-P2P traffic
Blocking all P2P traffic leads to customer dissatisfaction and aggravates subscriber churn

Conclusion:

Costs are reduced when P2P traffic is blocked, but so are revenues and brand equity
Network performance is increased, but so is subscriber churn


Blocking Traffic (cont’d)

Many P2P applications enable users to select a desired port or assign ports dynamically
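
A minimal sketch of port-based blocking shows why user-selected or dynamic ports defeat it. The blocklist below is an assumption for illustration: it uses the range 6881-6889 traditionally associated with BitTorrent clients, and the names are hypothetical rather than taken from any real firewall.

```python
# Illustrative blocklist: ports traditionally used by BitTorrent clients.
BLOCKED_PORTS = frozenset(range(6881, 6890))


def is_blocked(dst_port: int, blocked_ports: frozenset = BLOCKED_PORTS) -> bool:
    """Return True if a packet to this destination port would be dropped."""
    return dst_port in blocked_ports
```

A client left on a default port is blocked (`is_blocked(6881)`), but the same client moved to an arbitrary port, or to port 443, passes the filter unchanged.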


Bandwidth Caps

Bandwidth capping refers to the practice of limiting transfers to a specified amount of data over a period of time

If a user exceeds the cap, the ISP reduces the connection speed
The ISP may also offer the user the option to purchase additional bandwidth

E.g., Shaw Communications, Inc. – a Canadian telco

$35/month = 13 GB/month
$107/month = 250 GB/month
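
The cap mechanism can be sketched as simple per-subscriber accounting. This is an illustration under stated assumptions: the class, field names, and the rates in the usage note are hypothetical; only the two price points come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    cap_gb: float               # monthly data allowance
    full_rate_mbps: float       # speed while under the cap
    throttled_rate_mbps: float  # reduced speed once the cap is exceeded
    used_gb: float = 0.0        # volume transferred in the current period

    def record_transfer(self, gigabytes: float) -> None:
        """Add transferred volume to the monthly counter."""
        self.used_gb += gigabytes

    def allowed_rate_mbps(self) -> float:
        """Full speed until the cap is exceeded, throttled speed afterwards."""
        if self.used_gb > self.cap_gb:
            return self.throttled_rate_mbps
        return self.full_rate_mbps


def price_per_gb(monthly_fee: float, cap_gb: float) -> float:
    """Effective price of one gigabyte within a plan."""
    return monthly_fee / cap_gb
```

With the two plans quoted above, `price_per_gb(35, 13)` is about $2.69 per GB, while `price_per_gb(107, 250)` is about $0.43 per GB, so the heavier plan is cheaper per gigabyte even though it costs more per month.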


Bandwidth Caps (cont’d)

Benefits of bandwidth caps:

Discourage users from transferring more than the allocated cap
By charging different fees for different traffic volumes, ISPs recoup some of the additional costs that are incurred by the heavy-traffic users

Shortcomings of bandwidth caps:

Although providing bandwidth savings in the long run (bits per month), bandwidth quotas offer limited reductions on a short time-scale (bits per second)
Subscriber frustration with this approach is likely to aggravate customer dissatisfaction, especially if other ISPs do not implement similar caps

Conclusion:

In an "all-you-can-eat" broadband world, bandwidth caps leave subscribers frustrated and unsatisfied


Traffic Shaping

Traffic shaping refers to the practice of providing different treatments to different classes of traffic

Each individual packet that arrives is examined and classified (mostly based on DPI)

According to the priority of each class of traffic, packets are put into queues and then transmitted

This allows an ISP to give priority to certain classes of traffic, leaving whatever bandwidth is left over for others

Thus, traffic shaping includes:

1 Deep Packet Inspection (DPI)

2 Application priority management

3 Traffic shaping
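
The queueing step described above can be sketched as a strict-priority scheduler. This is a simplified illustration, not a description of any vendor's product: the traffic classes and their priorities are assumptions, classification is assumed to have already happened (e.g., by DPI), and real shapers typically also apply rate limits.

```python
import heapq
from itertools import count

# Hypothetical class priorities: a lower number is served first.
PRIORITY = {"voip": 0, "web": 1, "p2p": 2}


class PriorityScheduler:
    """Strict-priority queueing: always transmit the highest-priority packet."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a class

    def enqueue(self, traffic_class: str, packet) -> None:
        """Queue an already classified packet."""
        heapq.heappush(self._heap, (PRIORITY[traffic_class], next(self._seq), packet))

    def dequeue(self):
        """Return the next packet to transmit, or None if all queues are empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

Lower-priority classes such as P2P are served only when no higher-priority packet is waiting, which is the "whatever bandwidth is left over" behavior described above.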


Traffic Shaping (cont’d)

E.g., TVTEL, a Portuguese cable operator, deployed several ipoque Traffic Managers and implemented a priority management scheme that gives P2P traffic a lower priority and, thus, improves QoS for all other applications such as Web browsing and VoIP


Traffic Shaping (cont’d)

There is still no common agreement on what exactly "net neutrality" is

E.g., http://en.wikipedia.org/wiki/Network_neutrality

Some definitions:

1 In a neutral network, all packets should be treated equally on a best-effort basis

2 A neutral network is one that is free of restrictions on the kinds of equipment that may be attached, on the modes of communication allowed, without restriction of content, sites or platforms, and where communication is not unreasonably degraded


Traffic Shaping (cont’d)

Net neutrality vs. fairness:

Less than 20% of users generate over 80% of the traffic, causing slow download speeds and connection problems for all users
They pay the same fee but use more resources than the rest of the users
Should the bandwidth be distributed evenly?
ISPs call that "traffic shaping"
Net neutrality supporters call that "discrimination"


Traffic Shaping (cont’d)

Benefits of traffic shaping:

ISPs gain control over how their networks are being used, and can prioritize certain classes of traffic while throttling others
Associated P2P costs can be reduced in a way that avoids the drawbacks of completely blocking P2P traffic

Shortcomings of traffic shaping:

It introduces processing delays for all classes of traffic
Traffic shaping is sensitive to the accuracy of DPI
Issues concerning net neutrality arise

Conclusion:

Recommended


DPI

Deep Packet Inspection (DPI) is a method of packet analysis that examines the whole packet, not only the header

Inspecting headers only does not yield a reliable protocol or application classification anymore, as many modern applications use dynamic ports or even ports that have traditionally been used by other applications

DPI in traffic shaping systems does not read all packets

Instead, it only scans for patterns in the first few packets of each flow

About 1-3 packets for unencrypted and 3-20 packets for encrypted communication protocols

DPI application fields:

E-mail spam and anti-virus filtering
Intrusion detection and prevention systems
Traffic shaping systems
Content caching systems
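
The idea of scanning only the first few packets of each flow can be sketched as follows. This is a toy illustration, not a production classifier: the signature table is an assumption (the BitTorrent handshake begins with the byte 0x13 followed by "BitTorrent protocol"; the HTTP pattern is a crude stand-in), and the class name and the three-packet budget are hypothetical.

```python
# Illustrative payload signatures; real DPI engines use far richer rule sets.
SIGNATURES = {
    "bittorrent": b"\x13BitTorrent protocol",
    "http": b"HTTP/1.",
}
MAX_INSPECTED_PACKETS = 3  # assumed budget for unencrypted protocols


class FlowClassifier:
    """Scan payloads of the first few packets of each flow for known patterns."""

    def __init__(self):
        self._inspected = {}  # flow id -> number of packets scanned so far
        self._label = {}      # flow id -> application label once matched

    def inspect(self, flow_id, payload: bytes) -> str:
        if flow_id in self._label:
            return self._label[flow_id]  # already classified, skip scanning
        scanned = self._inspected.get(flow_id, 0)
        if scanned >= MAX_INSPECTED_PACKETS:
            return "unknown"             # budget exhausted, stop inspecting
        self._inspected[flow_id] = scanned + 1
        for label, pattern in SIGNATURES.items():
            if pattern in payload:       # match anywhere in the payload
                self._label[flow_id] = label
                return label
        return "unknown"
```

Once a flow is labeled, later packets are not scanned again, and a flow that shows no pattern within the budget stays "unknown", which is the gap that encryption and obfuscation exploit.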


DPI (cont’d)

Since ISPs actively interfere with P2P activities in order to reduce their bandwidth requirements, many P2P software clients have introduced an encryption protocol to prevent ISPs from identifying P2P traffic


DPI (cont’d)

DPI relies on:

Pattern matching – the scanning for strings or generic bit and byte patterns anywhere in the packet
Behavioral analysis – the scanning for patterns in the communication behavior of an application, including packet sizes, data rates, number of flows, etc.
Statistical analysis – the calculation of statistical indicators that can be used to identify application types, including the mean, the median, and the variation of values collected as part of the behavioral analysis

The P2P development community has shown itself to be very resistant to shaping techniques in the past, and has developed several tactics for hiding the true identity of packets:

Encryption
Variable offsets of the packet identifier
Identifier artificially split by fragmenting packets
Etc.
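
The statistical analysis step listed above can be illustrated with the standard library. This is a minimal sketch: the function name is hypothetical, packet sizes stand in for any value collected during behavioral analysis, and population variance is used as one possible measure of "variation".

```python
import statistics


def flow_indicators(packet_sizes):
    """Mean, median, and variance of the packet sizes observed in one flow."""
    return {
        "mean": statistics.mean(packet_sizes),
        "median": statistics.median(packet_sizes),
        "variance": statistics.pvariance(packet_sizes),
    }
```

Such indicators do not depend on payload contents, so they remain available when the payload is encrypted; how they map to application types is left open here.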


DPI (cont’d)

DPI vs. P2P traffic encryption and obfuscation as an armor/weapon race


Network Caching

Principle of caching:

Instead of content being delivered repeatedly through the network, highly demanded content is passed through the backbone only once and stored in the cache
Then, the content is delivered from the cache, without clogging up the network

Caching became popular in the mid-1990’s to address a huge increase in Internet usage associated with Web surfing

However, as the Internet infrastructure grew through a massive investment in fiber, the value of caching small Web objects decreased

A decade later, the Internet finds itself in a similar predicament, except this time the problem is caused by a massive adoption of Internet video and P2P file sharing


Network Caching (cont’d)

Network caching refers to the practice of maintaining a repository of the most frequently downloaded files in a local network

When a user performs a file search, this centralized store of popular P2P files is accessed first

If the file is present in the store, the file can be retrieved directly from it, thus reducing incoming traffic

If the file is not already present, it is retrieved from the source of the content, simultaneously copying it into the centralized store

Since an efficient caching solution needs DPI support, network caching and DPI are often used in combination
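
The lookup logic described above (check the store first, otherwise fetch from the source and keep a copy) can be sketched as follows. This is an illustration only: the class name and the `fetch_from_source` callback are hypothetical, and a real cache would also bound its size and evict unpopular files.

```python
class ContentCache:
    """Serve popular files locally; fetch from the source only on a miss."""

    def __init__(self, fetch_from_source):
        self._store = {}                 # content id -> cached data
        self._fetch = fetch_from_source  # callable that retrieves from the origin
        self.hits = 0
        self.misses = 0

    def get(self, content_id):
        if content_id in self._store:
            self.hits += 1               # served locally, no incoming traffic
            return self._store[content_id]
        self.misses += 1
        data = self._fetch(content_id)   # crosses the backbone once
        self._store[content_id] = data   # copy into the store while delivering
        return data
```

The ratio `hits / (hits + misses)` gives the share of requests that never leave the local network.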


Network Caching (cont’d)

Network caching makes it possible to:

Relieve network congestion by delivering extra bandwidth from the nearest cache
Improve QoS by accelerating content delivery
Increase ARPU (Average Revenue Per User) by linking the service package to the performance, improving customer satisfaction, and reducing subscriber churn
Monetize Internet traffic by charging content providers based on QoS guarantees and offering own content

P2P and video content respond well to caching because they have high reuse patterns


Network Caching (cont’d)

Network caching may cause legal headaches for ISPs

Currently, ISPs bear no legal responsibility for the legality of the content that goes through their networks if they meet certain requirements

E.g., Online Copyright Infringement Liability Limitation Act (OCILLA)
http://en.wikipedia.org/wiki/Online_Copyright_Infringement_Liability_Limitation_Act

So if a user spends all day illegally uploading and downloading copyrighted content, the fault lies with the individual pirate and not his/her ISP

Once an ISP gets in the business of content manipulation, it runs a serious risk of losing its legal immunity from piracy charges


Network Caching (cont’d)

Up to now, vendors have focused their sales efforts on markets where international bandwidth is expensive and the ROI for the solution is very quick (mainly Asia, Latin America, Eastern Europe, and Africa)

E.g., Oversi customers worldwide:


Network Caching (cont’d)

"Bringing popular content close to consumers" – some P2P protocols have inherent support for this

E.g., Freenet:

1 Each Freenet participant runs a node that provides the network with some storage space

2 A file is inserted into the network with an associated key

3 After insertion is finished, the publisher is free to shut down his node, since the file is stored in the network

4 To retrieve a file, a user sends out a request message containing the key

5 When the request finds a node containing a copy of the file, the file is returned through the search path

6 During relaying of data at both insert and request steps, nodes copy the data into their caches

7 As a result, a file can be replicated on and migrate to other nodes, getting closer to consumers
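
The caching-along-the-path behavior in the steps above can be sketched with a toy model. This is a strong simplification and not Freenet's actual routing: nodes are assumed to form a simple chain with a single next hop, whereas the real system routes by key; the class and the hop limits are hypothetical.

```python
class Node:
    """Toy node: stores data by key and forwards to a single next hop."""

    def __init__(self, name):
        self.name = name
        self.store = {}       # key -> data held by this node
        self.next_hop = None  # assumed single neighbor (chain topology)

    def insert(self, key, data, hops):
        """Store the data and relay the insert for a limited number of hops."""
        self.store[key] = data
        if hops > 1 and self.next_hop is not None:
            self.next_hop.insert(key, data, hops - 1)

    def request(self, key, hops):
        """Look up the key; on success, cache the data on the way back."""
        if key in self.store:
            return self.store[key]
        if hops <= 1 or self.next_hop is None:
            return None
        data = self.next_hop.request(key, hops - 1)
        if data is not None:
            self.store[key] = data  # each node on the return path keeps a copy
        return data
```

After one successful request, every node between the requester and the original holder has a copy, so later requests from the same region are answered closer to the consumer.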


Network Caching (cont’d)

DECADE: DECoupled Application Data Enroute

DECADE is an architecture that provides both P2P and non-P2P applications with access to in-network storage

Unlike proprietary solutions, this IETF project aims to provide a standard protocol for:

Authorization and accessing stored content
Management of stored content
Data search within in-network storage
Discovery mechanism to find location of in-network storage


Network Caching (cont’d)

Benefits of network caching:

Caching delivers more bandwidth, more services, and more revenues over existing infrastructures, without requiring costly infrastructure upgrades
Caching minimizes ISP’s transit expenses

Shortcomings of network caching:

This approach does nothing to alleviate upstream congestion
Issues concerning piracy arise

Conclusion:

Recommended


Traffic Localization

Client/server and P2P systems are implemented as virtual (overlay) networks of nodes and logical links built on top of an existing network

The overlay is a logical view that usually does not directly mirror the physical network topology

[Figure: overlay network on top of the physical network]


Traffic Localization (cont’d)

Nowadays, data are often available in several equivalent replicas on different hosts


Traffic Localization (cont’d)

Random peer selection (what about local seeders?)


Traffic Localization (cont’d)

P2P traffic localization could reduce an ISP’s transit expenses by selecting local peers to connect to


Traffic Localization (cont’d)

Traffic localization refers to the practice of providing some guidance to applications, which have to select one or more hosts from a set of candidates

The goal is to provide ISPs with the ability to optimize utilization of network resources while improving QoS for both P2P and non-P2P applications

The most critical part is how to enable locality awareness of participants

Most traffic localization solutions proposed so far assume collaboration between ISPs and P2P systems (aka P4P)

However, this assumption does not necessarily hold


Traffic Localization (cont’d)

Locality awareness: IP-to-location mapping in ads


Traffic Localization (cont’d)

www.ip2location.com: IP-to-ISP mapping
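
IP-to-ISP mapping of this kind can be sketched as a longest-prefix lookup. This is an illustration under stated assumptions: the prefix table, ISP names, and function names are hypothetical (the prefixes are reserved documentation ranges), and a real service would use a full database instead.

```python
import ipaddress

# Hypothetical prefix table built from documentation address ranges.
PREFIX_TO_ISP = {
    ipaddress.ip_network("192.0.2.0/24"): "ISP-A",
    ipaddress.ip_network("198.51.100.0/24"): "ISP-B",
}


def isp_of(ip: str):
    """Return the ISP owning the longest matching prefix, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    best = None
    for network, isp in PREFIX_TO_ISP.items():
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, isp)
    return best[1] if best else None


def same_isp(ip_a: str, ip_b: str) -> bool:
    """True only if both addresses map to the same known ISP."""
    isp_a = isp_of(ip_a)
    return isp_a is not None and isp_a == isp_of(ip_b)
```

A peer or tracker could use `same_isp` to decide which candidates count as local, without any cooperation from the ISP.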


Traffic Localization (cont’d)

ALTO: Application-Layer Traffic Optimization

ALTO is an architecture that provides guidance to applications, which have to select one or several hosts from a set of candidates that are able to provide a desired resource
This guidance should be based on parameters that affect performance and efficiency of the data transmission between the hosts
The ultimate goal is to improve QoS while reducing resource consumption in the underlying network infrastructure

The DECADE architecture can complement the ALTO effort in reducing cross-domain and backbone traffic
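
The kind of guidance ALTO is meant to provide can be illustrated as ranking candidate hosts by an ISP-supplied cost. This is a sketch, not the ALTO protocol itself: the host names, network locations, and cost values below are invented for illustration.

```python
# Hypothetical mapping of hosts to network locations defined by the ISP.
HOST_TO_LOCATION = {
    "peer-a": "loc-local",
    "peer-b": "loc-peering",
    "peer-c": "loc-transit",
}

# Hypothetical cost map: cost of sending traffic from one location to another.
COST_MAP = {
    "loc-local": {"loc-local": 1, "loc-peering": 5, "loc-transit": 10},
}


def rank_candidates(requester_location: str, candidates):
    """Order candidate hosts from cheapest to most expensive for the requester."""
    costs = COST_MAP[requester_location]
    return sorted(candidates, key=lambda host: costs[HOST_TO_LOCATION[host]])
```

The application still makes the final choice; the ranking only tells it which candidates the network operator would prefer.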


Traffic Localization (cont’d)

The biased choice of peers is one of the most "easy-to-implement" solutions for traffic localization in P2P systems

For instance, according to the biased selection of peers, a BitTorrent tracker is required to reply with those peers that are geographically close to the location of a given peer

The list should also contain some peers that are not in the local neighborhood of this peer; otherwise it is possible to fragment the entire network into many isolated pieces

Another way to enable choosing nearby nodes is to allow hosts to carry out measurements themselves

However, it may significantly increase prefetching delays in streaming media systems as these measurements are often time consuming
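
Biased peer selection at a tracker can be sketched as follows. This is a minimal illustration: the function name, the 20% share of non-local peers, and the assumption that peers are already split into local and remote lists are all hypothetical.

```python
import random


def biased_peer_list(local_peers, remote_peers, list_size, remote_fraction=0.2, rng=None):
    """Reply mostly with local peers, but keep a few remote ones.

    `local_peers` and `remote_peers` are lists. The remote share keeps the
    overlay connected so that it does not split into isolated local clusters.
    """
    if list_size <= 0:
        return []
    rng = rng or random.Random()
    wanted_remote = max(1, round(list_size * remote_fraction))
    n_remote = min(len(remote_peers), wanted_remote)
    n_local = min(len(local_peers), list_size - n_remote)
    # If there are too few local peers, fill the remaining slots with remote ones.
    n_remote = min(len(remote_peers), list_size - n_local)
    return rng.sample(local_peers, n_local) + rng.sample(remote_peers, n_remote)
```

With ten slots and the default share, a peer in a well-populated ISP receives eight local and two remote addresses; in a sparsely populated ISP the list falls back to remote peers.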


Traffic Localization (cont’d)

Benefits of traffic localization:

It allows finding optimal routes for packets traveling between specific source and destination addresses
This approach alleviates both downstream and upstream congestion

Shortcomings of traffic localization:

ISPs should provide topology and/or bandwidth information to P2P applications
The P2P development community must be willing to collaborate with ISPs

Conclusion:

Recommended

All the above-mentioned approaches (overprovisioning, DPI, traffic shaping, network caching, and traffic localization) are best used in combination


Learning Outcomes

Things to know:

Traffic management fundamentals (objectives, obstacles, etc.)
Main strategies for managing P2P traffic (overprovisioning, blocking, bandwidth caps, traffic shaping, network caching, and traffic localization)
DPI basics

Be ready to explain/compare/give examples
