Content Distribution, March 8, 2012 (2: Application Layer)


Review: P2P architecture

no always-on server

arbitrary end systems directly communicate

peers are intermittently connected and change IP addresses

[Figure: peer-peer architecture]

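File Distribution: Server-Client vs P2P

Question: How much time does it take to distribute a file from one server to N peers?

[Figure: a server with upload bandwidth us sends a file of size F through a network (with abundant bandwidth) to N peers; peer i has upload bandwidth ui and download bandwidth di]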

us: server upload bandwidth

ui: peer i upload bandwidth

di: peer i download bandwidth
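
The question on this slide has well-known lower bounds under a simple fluid model (the network core is not a bottleneck, and peers can re-upload bits as soon as they receive them). The sketch below computes both bounds: client-server time is at least max(N·F/us, F/dmin), and P2P time is at least max(F/us, F/dmin, N·F/(us + sum of ui)). The function names and the sample numbers are illustrative assumptions, not values from the slides.

```python
def client_server_time(F, u_s, d):
    """Lower bound when the server alone uploads all N copies of the file."""
    N = len(d)
    return max(N * F / u_s, F / min(d))


def p2p_time(F, u_s, u, d):
    """Lower bound when peers also upload the bits they have received."""
    N = len(d)
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))


# Illustrative numbers only: file size in Mbit, rates in Mbit/s.
F = 1000.0
u_s = 10.0
u = [1.0] * 100   # upload bandwidth of each of the N = 100 peers
d = [5.0] * 100   # download bandwidth of each peer

print(client_server_time(F, u_s, d))  # grows linearly with N
print(p2p_time(F, u_s, u, d))         # total upload capacity grows with N
```

With these numbers the client-server bound is 10000 seconds while the P2P bound is roughly 909 seconds, because every added peer also adds upload capacity.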

P2P content distribution issues

Issues

  Group management and data search

  Reliable and efficient file exchange

  Security/privacy/anonymity/trust

Approaches for group management and data search (i.e., who has what?)

  Centralized (e.g., BitTorrent tracker)

  Unstructured (e.g., Gnutella)

  Structured (Distributed Hash Tables [DHT])

Contents

P2P architecture and benefits

P2P content distribution

Content distribution network (CDN)

Why Content Networks?

More hops between client and Web server → more congestion!

Same data flowing repeatedly over links between clients and Web server

[Figure: Web server S and clients C1, C2, C3, C4 connected through IP routers]

Slides from http://www.cis.udel.edu/~iyengar/courses/Overlays.ppt

Why Content Networks?

Origin server is bottleneck as number of users grows

Flash Crowds (for instance, Sept. 11)

The Content Distribution Problem: Arrange a rendezvous between a content source at the origin server (www.cnn.com) and a content sink (us, as users)

Slides from http://www.cis.udel.edu/~iyengar/courses/Overlays.ppt

Example: Web Server Farm

Simple solution to the content distribution problem: deploy a large group of servers

Arbitrate client requests to servers using an “intelligent” L4-L7 switch

Pretty widely used today

[Figure: requests from grad.umd.edu and ren.cis.udel.edu arrive at an L4-L7 switch, which forwards them to www.cnn.com (Copy 1), www.cnn.com (Copy 2), and www.cnn.com (Copy 3)]
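
The slide does not say how the switch decides which copy gets a request. As a minimal sketch, one common policy is least-connections: send each new request to the replica currently handling the fewest requests. The class name, the policy choice, and the replica labels below are assumptions for illustration only.

```python
class LeastConnectionsSwitch:
    """Toy dispatcher: each new request goes to the replica with the fewest active requests."""

    def __init__(self, replicas):
        self.active = {name: 0 for name in replicas}

    def dispatch(self, client):
        # min() returns the first replica with the smallest count,
        # so ties are broken by the order the replicas were listed in.
        replica = min(self.active, key=self.active.get)
        self.active[replica] += 1
        return replica

    def finish(self, replica):
        # Called when a request completes and the replica frees up capacity.
        self.active[replica] -= 1


switch = LeastConnectionsSwitch(["copy1", "copy2", "copy3"])
clients = ["grad.umd.edu", "ren.cis.udel.edu", "ren.cis.udel.edu", "grad.umd.edu"]
assignments = [switch.dispatch(c) for c in clients]
print(assignments)
```

A real L4-L7 switch can also look at transport or application headers (for example the requested URL) before choosing; this sketch ignores the client entirely.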


Example: Caching Proxy

Mainly motivated by ISP business interests: reduces the bandwidth the ISP consumes from the Internet

Reduced network traffic

Reduced user-perceived latency

[Figure: inside the ISP, interceptors divert TCP port 80 traffic from clients ren.cis.udel.edu and merlot.cis.udel.edu to a proxy on the way to www.cnn.com on the Internet; other traffic bypasses the proxy]

But on Sept. 11, 2001

[Figure: new content (WTC News!) appears on Web server www.cnn.com; user mslab.kaist.ac.kr and two groups of 1,000,000 other hosts send requests through ISP caching proxies that hold old content; the legend marks caching proxies and congestion/bottleneck points]

Problems with discussed approaches: Server farms and Caching proxies

Server farms do nothing about problems due to network congestion

Caching proxies serve only their clients, not all users on the Internet

Content providers (say, Web servers) cannot rely on existence and correct implementation of caching proxies

Accounting issues with caching proxies. For instance, www.cnn.com needs to know the number of hits to the webpage for advertisements displayed on the webpage

Again on Sept. 11, 2001 with CDN

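[Figure: new content (WTC News!) on Web server www.cnn.com is carried over the distribution infrastructure to surrogates at locations labeled FL, IL, DE, NY, MA, MI, CA, WA; user mslab.kaist.ac.kr and two groups of 1,000,000 other users request the new content; the legend marks surrogates and the distribution infrastructure]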

Web replication - CDNs

Overlay network to distribute content from origin servers to users

Avoids large amounts of the same data repeatedly traversing potentially congested links on the Internet

Reduces Web server load

Reduces user-perceived latency

Tries to route around congested networks

CDN vs. Caching Proxies

Caches are used by ISPs to reduce bandwidth consumption; CDNs are used by content providers to improve quality of service to end users

Caches are reactive; CDNs are proactive

Caching proxies cater to their users (web clients) and not to content providers (web servers); CDNs cater to both the content providers (web servers) and the clients

CDNs give control over the content to the content providers; caching proxies do not
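
The reactive/proactive contrast can be sketched with two toy classes: a cache that fetches from the origin only after a client asks, and a surrogate that receives content before any request arrives. The class names, the dictionary standing in for the origin server, and the push call are hypothetical; no real proxy or CDN API is implied.

```python
class ReactiveCache:
    """Caching proxy: fetches an object from the origin only after a client asks for it."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}
        self.origin_fetches = 0

    def get(self, path):
        if path not in self.store:       # miss: go to the origin on demand
            self.store[path] = self.origin[path]
            self.origin_fetches += 1
        return self.store[path]


class ProactiveSurrogate:
    """CDN surrogate: content is pushed to it before any client request arrives."""

    def __init__(self):
        self.store = {}

    def push(self, path, body):          # driven by the content provider
        self.store[path] = body

    def get(self, path):
        return self.store[path]


origin = {"/sept11": "WTC News!"}        # stand-in for the origin server
cache = ReactiveCache(origin)
surrogate = ProactiveSurrogate()
surrogate.push("/sept11", origin["/sept11"])

first = cache.get("/sept11")             # first request reaches the origin
second = cache.get("/sept11")            # second request is served from the cache
pushed = surrogate.get("/sept11")        # surrogate never contacts the origin at request time
print(cache.origin_fetches, first == second == pushed)
```

In a flash crowd every cold cache sends its first request to the origin, while pushed surrogates already hold the new content.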

CDN Architecture

[Figure: a CDN containing surrogates, the request routing infrastructure, and the distribution & accounting infrastructure, placed between the origin server and the clients]

CDN Organization

Limelight/Google: placing CDN servers near a small # of ISP core nets

Akamai: placing CDN servers deep into a large # of ISP networks’ sites

Nano Data Center (NaDa): home gateways (STBs/modems) as CDN servers (peer-to-peer delivery among NaDa servers)

P2P software (BitTorrent, PPLive, etc.)

[Figure: NaDa digital media delivery platform; labels: modem, DSLAM, ONT, OLT, edge router, core router; network segments: access, metro/edge network, core network]

CDN Components

Distribution Infrastructure: Moving or replicating content from content source (origin server, content provider) to surrogates

Request Routing Infrastructure: Steering or directing content request from a client to a suitable surrogate

Content Delivery Infrastructure: Delivering content to clients from surrogates

Accounting Infrastructure: Logging and reporting of distribution and delivery activities

Server Interaction with CDN

1. Distribution Infrastructure: Origin server pushes new content to CDN OR CDN pulls content from origin server

2. Accounting Infrastructure: Origin server requests logs and other accounting info from CDN OR CDN provides logs and other accounting info to origin server

[Figure: origin server www.cnn.com connected to the CDN's distribution infrastructure (1) and accounting infrastructure (2)]

Client Interaction with CDN

1. Client to request routing infrastructure: Hi! I need www.cnn.com/sept11

2. Request routing infrastructure to client: Go to surrogate newyork.cnn.akamai.com

3. Client to surrogate (NY): Hi! I need content /sept11

Q: How did the CDN choose the New York surrogate over the California surrogate?

[Figure: client, request routing infrastructure, and CDN surrogates newyork.cnn.akamai.com (NY) and california.cnn.akamai.com (CA)]

Request Routing Techniques

Request routing techniques use a set of metrics to direct users to “best” surrogate

Proprietary, but underlying techniques known:

  DNS based request routing

  Content modification (URL rewriting)

  Anycast based (how common is anycast?)

  URL based request routing

  Transport layer request routing

  Combination of multiple mechanisms
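
As an illustration of content modification (URL rewriting), the origin can rewrite the URLs of embedded objects in a page so that browsers fetch them from a surrogate. The sketch below is a simplification under stated assumptions: it handles only `src="http://..."` attributes with a regular expression, and the image path is hypothetical. The slides do not describe any CDN's actual rewriting rules.

```python
import re

ORIGIN_HOST = "www.cnn.com"


def rewrite_urls(html, surrogate_host):
    """Point embedded objects (src="...") hosted on the origin at the chosen surrogate."""
    pattern = re.compile(r'(src=")http://' + re.escape(ORIGIN_HOST) + r'/')
    return pattern.sub(r'\g<1>http://' + surrogate_host + '/', html)


# Hypothetical page: the link keeps pointing at the origin, the image is redirected.
page = ('<a href="http://www.cnn.com/sept11">story</a>'
        '<img src="http://www.cnn.com/img/wtc.jpg">')
rewritten = rewrite_urls(page, "newyork.cnn.akamai.com")
print(rewritten)
```

The page itself is still served by the origin, which lets the origin keep counting page hits while the bulky embedded objects come from the surrogate.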


DNS based Request-Routing

Common due to the ubiquity of DNS as a directory service

A specialized DNS server is inserted into the DNS resolution process

This DNS server can return a different set of A, NS, or CNAME records based on policies/metrics

DNS based Request-Routing

[Figure: client test.nyu.edu (128.4.30.15) asks its local DNS server dns.nyu.edu (128.4.4.12) to resolve www.cnn.com; the local DNS server sends the DNS query for www.cnn.com to the Akamai DNS and gets the DNS response A 145.155.10.15; the client then opens a session with the surrogate at 145.155.10.15. Labels inside the Akamai CDN: surrogates 145.155.10.15 and 58.15.100.152, newyork.cnn.akamai.com, california.cnn.akamai.com]

Q: How does the Akamai DNS know which surrogate is closest?

DNS based Request-Routing

[Figure: the local DNS server dns.nyu.edu (128.4.4.12), acting for client test.nyu.edu (128.4.30.15), sends a DNS query for www.cnn.com to the Akamai DNS; the surrogates in the Akamai CDN measure to the client DNS and send their measurement results to the Akamai DNS]

DNS based Request-Routing

[Figure: client 76.43.35.53 uses client DNS 76.43.32.4 to resolve www.cnn.com; surrogates 145.155.10.15 and 58.15.100.152 in the Akamai CDN report measurements to the Akamai DNS]

Measurement reports:

Requesting DNS - 76.43.32.4, Available Bandwidth = 10 kbps, RTT = 10 ms

Requesting DNS - 76.43.32.4, Available Bandwidth = 5 kbps, RTT = 100 ms

Akamai DNS decision: Requesting DNS - 76.43.32.4, Surrogate - 145.155.10.15

DNS response: www.cnn.com A 145.155.10.15, TTL = 10s
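
A minimal sketch of the decision step on this slide: given measurement reports toward the requesting DNS server, pick a surrogate and answer with a short-TTL A record. The selection policy (lowest RTT, ties broken by higher bandwidth), the `Report` type, and the tuple used as an answer are assumptions; the slide also does not state which surrogate sent which report, so that pairing is inferred from the selected address.

```python
from dataclasses import dataclass


@dataclass
class Report:
    surrogate: str        # surrogate that took the measurement
    requesting_dns: str   # client DNS server the measurement was taken toward
    bandwidth_kbps: float
    rtt_ms: float


def choose_surrogate(reports, requesting_dns):
    """Pick the surrogate with the lowest RTT toward the requesting DNS server,
    breaking ties by higher available bandwidth (an assumed policy)."""
    candidates = [r for r in reports if r.requesting_dns == requesting_dns]
    best = min(candidates, key=lambda r: (r.rtt_ms, -r.bandwidth_kbps))
    return best.surrogate


def answer(name, reports, requesting_dns, ttl_s=10):
    """Build a simplified A-record answer as a tuple, not a real DNS message."""
    return (name, "A", choose_surrogate(reports, requesting_dns), ttl_s)


reports = [
    # Assumed pairing of surrogate and report; values are from the slide.
    Report("145.155.10.15", "76.43.32.4", bandwidth_kbps=10, rtt_ms=10),
    Report("58.15.100.152", "76.43.32.4", bandwidth_kbps=5, rtt_ms=100),
]
result = answer("www.cnn.com", reports, "76.43.32.4")
print(result)
```

The short TTL forces the client DNS to ask again after 10 seconds, which gives the CDN a chance to change its answer as measurements change.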


DNS based Request Routing: Discussion

Originator Problem: Client may be far removed from client DNS

Client DNS Masking Problem: Virtually all DNS servers, except for root DNS servers, honor requests for recursion

Q: Which DNS server resolves a request for test.nyu.edu?

Q: Which DNS server performs the last recursion of the DNS request?

Hidden Load Factor: A DNS resolution may result in drastically different load on the selected surrogate; this is an issue in load balancing requests and in predicting load on surrogates

Summary

P2P architecture and its benefits

P2P content distribution

  BitTorrent, Skype

Content distribution network (CDN)

  DNS-based request routing