Content Distribution, March 8, 2012
2: Application Layer
Review: P2P architecture
no always-on server
arbitrary end systems directly communicate
peers are intermittently connected and change IP addresses
[Figure: peer-peer]
File Distribution: Server-Client vs P2P
Question: How much time to distribute a file from one server to N peers?
[Figure: a server holding a file of size F and N peers, all attached to a network with abundant bandwidth; links labeled us, u1, d1, u2, d2, ..., uN, dN]
F: file size
us: server upload bandwidth
ui: peer i upload bandwidth
di: peer i download bandwidth
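The slide poses the question without giving the answer. As a minimal sketch, assuming the usual fluid model (the network core is never the bottleneck and a peer can forward bits as soon as it has them), the standard lower bounds are max(N*F/us, F/dmin) for client-server and max(F/us, F/dmin, N*F/(us + sum of ui)) for P2P, where dmin is the smallest peer download bandwidth. The function names and the sample numbers are illustrative assumptions, not taken from the slides.

```python
def client_server_time(F, us, d):
    """Lower bound on client-server distribution time.

    F  : file size (bits)
    us : server upload bandwidth (bits/s)
    d  : list of peer download bandwidths (bits/s), one entry per peer
    """
    n = len(d)
    # The server must upload N copies; the slowest peer must download one copy.
    return max(n * F / us, F / min(d))


def p2p_time(F, us, u, d):
    """Lower bound on P2P distribution time.

    u : list of peer upload bandwidths (bits/s), one entry per peer
    """
    n = len(d)
    # The server uploads at least one copy; the slowest peer downloads one copy;
    # N copies in total must be uploaded by the server and the peers together.
    return max(F / us, F / min(d), n * F / (us + sum(u)))


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the two bounds side by side.
    F, us, N = 1000.0, 100.0, 10
    u = [100.0] * N
    d = [50.0] * N
    print("client-server:", client_server_time(F, us, d))
    print("P2P:          ", p2p_time(F, us, u, d))
```

With these numbers the client-server bound is set by the server uploading N copies, while the P2P bound is set by the peers' download rate, since the peers' own upload capacity shares the work.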
P2P content distribution issues
Issues:
  Group management and data search
  Reliable and efficient file exchange
  Security/privacy/anonymity/trust
Approaches for group management and data search (i.e., who has what?):
  Centralized (e.g., BitTorrent tracker)
  Unstructured (e.g., Gnutella)
  Structured (Distributed Hash Tables [DHT])
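To make the "who has what?" question concrete for the structured approach, here is a minimal sketch of how a DHT-style system can map a file name to a responsible peer by hashing both onto the same identifier ring. It is a simplified consistent-hashing illustration under assumed names (`ring_id`, `responsible_peer`, and the peer labels are hypothetical). It omits routing tables, joins, and failures, and it is not the scheme of any particular system.

```python
import hashlib
from bisect import bisect_left


def ring_id(name, bits=16):
    """Hash a peer name or file name onto a ring of 2**bits identifiers."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def responsible_peer(key, peers, bits=16):
    """Return the peer whose ID is the first at or after the key's ID,
    wrapping around to the smallest ID at the end of the ring."""
    ring = sorted((ring_id(p, bits), p) for p in peers)
    ids = [pid for pid, _ in ring]
    idx = bisect_left(ids, ring_id(key, bits)) % len(ring)
    return ring[idx][1]


if __name__ == "__main__":
    peers = ["peer-a", "peer-b", "peer-c"]  # hypothetical peer names
    print(responsible_peer("some-file.iso", peers))
```

Every peer that knows the membership computes the same answer without asking a central tracker, which is the property that separates the structured approach from the centralized one.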
Contents
P2P architecture and benefits
P2P content distribution
Content distribution network (CDN)
Why Content Networks?
More hops between client and Web server means more congestion!
Same data flowing repeatedly over links between clients and Web server
[Figure: clients C1, C2, C3, and C4 reach Web server S through IP routers]
Slides from http://www.cis.udel.edu/~iyengar/courses/Overlays.ppt
Why Content Networks?
Origin server is bottleneck as number of users grows
Flash Crowds (for instance, Sept. 11)
The Content Distribution Problem: Arrange a rendezvous between a content source at the origin server (www.cnn.com) and a content sink (us, as users)
Example: Web Server Farm
Simple solution to the content distribution problem: deploy a large group of servers
Arbitrate client requests to servers using an “intelligent” L4-L7 switch
Pretty widely used today
[Figure: an L4-L7 switch receives requests from grad.umd.edu and ren.cis.udel.edu and distributes them across www.cnn.com (Copy 1), (Copy 2), and (Copy 3)]
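The arbitration step can be sketched in a few lines. This is a minimal illustration that assumes plain round-robin; the slide only says the switch is "intelligent" and does not name a policy, and real L4-L7 switches may also consider server load, connection counts, or request content. The class name `RoundRobinSwitch` is hypothetical.

```python
from itertools import cycle


class RoundRobinSwitch:
    """Hands each incoming request to the next server copy in turn."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._next_server = cycle(servers)

    def dispatch(self, client):
        # The client name is ignored by round-robin; a smarter policy
        # could use it (for example, to keep one client on one copy).
        return next(self._next_server)


if __name__ == "__main__":
    switch = RoundRobinSwitch(["copy-1", "copy-2", "copy-3"])
    for client in ["grad.umd.edu", "ren.cis.udel.edu",
                   "ren.cis.udel.edu", "grad.umd.edu"]:
        print(client, "->", switch.dispatch(client))
```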
Example: Caching Proxy
Motivated mainly by ISP business interests: reduces the ISP's bandwidth consumption from the Internet
Reduced network traffic
Reduced user perceived latency
[Figure: inside the ISP, intercepters divert TCP port 80 traffic from clients ren.cis.udel.edu and merlot.cis.udel.edu to the proxy, which sits between the clients and www.cnn.com on the Internet; other traffic is shown separately]
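The traffic and latency savings come from answering repeated requests locally. The following is a minimal sketch of that behavior, assuming a cache keyed by URL with no expiry or revalidation (a real proxy honors HTTP freshness rules, which are left out here). `CachingProxy` and `fetch_from_origin` are hypothetical names.

```python
class CachingProxy:
    """Serves a URL from its local cache when possible, otherwise asks the origin."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: url -> response body
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._cache:
            # Hit: no traffic leaves the ISP and the origin never sees the request.
            self.hits += 1
            return self._cache[url]
        # Miss: fetch once from the origin and keep a copy for later clients.
        self.misses += 1
        body = self._fetch(url)
        self._cache[url] = body
        return body


if __name__ == "__main__":
    origin_requests = []

    def origin(url):
        origin_requests.append(url)
        return "body of " + url

    proxy = CachingProxy(origin)
    for _ in range(3):
        proxy.get("www.cnn.com/")
    print("origin saw", len(origin_requests), "request(s);",
          "proxy hits:", proxy.hits)
```

The cache fills only after a client asks for something, so the proxy has no copy of content that nobody behind it has requested yet.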
But on Sept. 11, 2001
[Figure: the Web server www.cnn.com publishes new content (WTC News!); the user mslab.kaist.ac.kr and two groups of 1,000,000 other hosts send requests through ISP caching proxies that hold only old content; congestion/bottleneck points are marked]
Problems with discussed approaches: Server farms and Caching proxies
Server farms do nothing about problems due to network congestion
Caching proxies serve only their clients, not all users on the Internet
Content providers (say, Web servers) cannot rely on the existence and correct implementation of caching proxies
Accounting issues with caching proxies. For instance, www.cnn.com needs to know the number of hits to the webpage for advertisements displayed on the webpage
Again on Sept. 11, 2001 with CDN
[Figure: the Web server www.cnn.com feeds new content (WTC News!) into a distribution infrastructure of surrogates in FL, IL, DE, NY, MA, MI, CA, and WA; the user mslab.kaist.ac.kr and two groups of 1,000,000 other users request the new content from surrogates]
Web replication - CDNs
Overlay network to distribute content from origin servers to users
Avoids large amounts of the same data repeatedly traversing potentially congested links on the Internet
Reduces Web server load
Reduces user perceived latency
Tries to route around congested networks
CDN vs. Caching Proxies
Caches are used by ISPs to reduce bandwidth consumption; CDNs are used by content providers to improve quality of service to end users
Caches are reactive; CDNs are proactive
Caching proxies cater to their users (web clients) and not to content providers (web servers); CDNs cater to both the content providers (web servers) and the clients
CDNs give content providers control over the content; caching proxies do not
CDN Architecture
[Figure: two clients and an origin server connect to the CDN, which contains a Request Routing Infrastructure, a Distribution & Accounting Infrastructure, and two surrogates]
CDN Organization
Limelight/Google: placing CDN servers near a small # of ISP core nets
Akamai: placing CDN servers deep into a large # of ISP networks’ sites
Nano Data Center (NaDa): home gateways (STBs/modems) as CDN servers (peer-to-peer delivery among NaDa servers)
P2P software (BitTorrent, PPLive, etc.)
[Figure: NaDa Digital Media Delivery Platform spanning the access, metro/edge, and core networks; labeled elements include modem, DSLAM, ONT, OLT, edge router, and core router]
CDN Components
Distribution Infrastructure: Moving or replicating content from content source (origin server, content provider) to surrogates
Request Routing Infrastructure: Steering or directing content request from a client to a suitable surrogate
Content Delivery Infrastructure: Delivering content to clients from surrogates
Accounting Infrastructure: Logging and reporting of distribution and delivery activities
Server Interaction with CDN
[Figure: origin server www.cnn.com interacts with the CDN's Distribution Infrastructure (arrow 1) and Accounting Infrastructure (arrow 2)]
1. Origin server pushes new content to CDN OR CDN pulls content from origin server
2. Origin server requests logs and other accounting info from CDN OR CDN provides logs and other accounting info to origin server
Client Interaction with CDN
1. Hi! I need www.cnn.com/sept11
2. Go to surrogate newyork.cnn.akamai.com
3. Hi! I need content /sept11
Q: How did the CDN choose the New York surrogate over the California surrogate?
[Figure: the client talks to the CDN's Request Routing Infrastructure (steps 1 and 2) and then to Surrogate (NY), newyork.cnn.akamai.com (step 3); Surrogate (CA), california.cnn.akamai.com, is also shown]
Request Routing Techniques
Request routing techniques use a set of metrics to direct users to the “best” surrogate
Proprietary, but underlying techniques known:
  DNS based request routing
  Content modification (URL rewriting)
  Anycast based (how common is anycast?)
  URL based request routing
  Transport layer request routing
  Combination of multiple mechanisms
DNS based Request-Routing
Common due to the ubiquity of DNS as a directory service
A specialized DNS server is inserted into the DNS resolution process
This DNS server can return a different set of A, NS, or CNAME records based on policies/metrics
DNS based Request-Routing
[Figure: client test.nyu.edu (128.4.30.15) asks its local DNS server dns.nyu.edu (128.4.4.12) to resolve www.cnn.com; the local DNS server sends the DNS query for www.cnn.com to the Akamai DNS, which returns the DNS response A 145.155.10.15; the client then opens a session with that surrogate. The Akamai CDN contains surrogates 145.155.10.15 and 58.15.100.152, with the names newyork.cnn.akamai.com and california.cnn.akamai.com]
Q: How does the Akamai DNS know which surrogate is closest?
DNS based Request-Routing
[Figure: client test.nyu.edu (128.4.30.15) queries www.cnn.com through its local DNS server dns.nyu.edu (128.4.4.12), which sends a DNS query to the Akamai DNS; the surrogates in the Akamai CDN take measurements to the client DNS and send the measurement results to the Akamai DNS]
DNS based Request-Routing
[Figure: client 76.43.35.53 resolves www.cnn.com through client DNS 76.43.32.4 and the Akamai DNS; the Akamai CDN contains surrogates 145.155.10.15 and 58.15.100.152]
Measurement result 1: Requesting DNS - 76.43.32.4, Available Bandwidth = 10 kbps, RTT = 10 ms
Measurement result 2: Requesting DNS - 76.43.32.4, Available Bandwidth = 5 kbps, RTT = 100 ms
Akamai DNS selection: Requesting DNS - 76.43.32.4, Surrogate - 145.155.10.15
DNS response: www.cnn.com A 145.155.10.15, TTL = 10s
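The selection step in this example can be sketched as follows. Two things are assumptions rather than facts from the slides: that the 10 kbps / 10 ms result belongs to 145.155.10.15 (consistent with that surrogate being chosen, but not stated), and that the policy is "lowest RTT, then highest available bandwidth". Actual CDN policies are proprietary. The names `Measurement`, `choose_surrogate`, and `dns_answer` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    surrogate: str          # surrogate IP address
    requesting_dns: str     # IP address of the client's DNS server
    bandwidth_kbps: float   # available bandwidth toward the client DNS
    rtt_ms: float           # round-trip time toward the client DNS


def choose_surrogate(measurements, requesting_dns):
    """Pick the surrogate with the lowest RTT to the requesting DNS server,
    breaking ties by the highest available bandwidth (assumed policy)."""
    candidates = [m for m in measurements if m.requesting_dns == requesting_dns]
    if not candidates:
        raise LookupError("no measurements for " + requesting_dns)
    best = min(candidates, key=lambda m: (m.rtt_ms, -m.bandwidth_kbps))
    return best.surrogate


def dns_answer(name, measurements, requesting_dns, ttl_s=10):
    """Build the A record the CDN's DNS server would return, with a short TTL."""
    return {
        "name": name,
        "type": "A",
        "address": choose_surrogate(measurements, requesting_dns),
        "ttl": ttl_s,
    }


if __name__ == "__main__":
    results = [
        Measurement("145.155.10.15", "76.43.32.4", 10, 10),   # assumed pairing
        Measurement("58.15.100.152", "76.43.32.4", 5, 100),   # assumed pairing
    ]
    print(dns_answer("www.cnn.com", results, "76.43.32.4"))
```

The short TTL (10 s in the slide) lets the CDN change its answer soon after the measurements change, since the client DNS may reuse a cached answer only until the TTL expires.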
DNS based Request Routing: Discussion
Originator Problem: Client may be far removed from client DNS
Client DNS Masking Problem: Virtually all DNS servers, except for root DNS servers, honor requests for recursion
Q: Which DNS server resolves a request for test.nyu.edu?
Q: Which DNS server performs the last recursion of the DNS request?
Hidden Load Factor: A DNS resolution may result in drastically different load on the selected surrogate, which is an issue in load balancing requests and in predicting load on surrogates
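The hidden load factor can be illustrated with a small calculation. The sketch below assumes that each DNS resolution is cached by the client DNS and then reused by every client behind it, so the CDN's DNS server counts one resolution per resolver while the surrogate receives requests from all of that resolver's clients. The resolver sizes and request counts are hypothetical, and `surrogate_load` is an illustrative name.

```python
def surrogate_load(resolutions):
    """Total content requests per surrogate.

    resolutions: list of (surrogate, clients_behind_resolver, requests_per_client),
    one tuple per DNS resolution answered by the CDN's DNS server.
    """
    load = {}
    for surrogate, clients, requests_per_client in resolutions:
        load[surrogate] = load.get(surrogate, 0) + clients * requests_per_client
    return load


if __name__ == "__main__":
    # Two resolutions look identical to the CDN's DNS server...
    resolutions = [
        ("surrogate-1", 1, 5),        # resolver serving a single client
        ("surrogate-2", 10000, 5),    # resolver serving a large ISP
    ]
    # ...but the resulting surrogate loads differ by four orders of magnitude.
    print(surrogate_load(resolutions))
```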