Evaluation of the Proximity between Web Clients and their Local DNS Servers
Z. Morley Mao, UC Berkeley ([email protected])
C. Cranor, M. Rabinovich, O. Spatscheck, and J. Wang, AT&T Labs-Research
F. Douglis, IBM Research
Motivation
• Content Distribution Networks (CDNs)
  • Attempt to deliver content from servers close to users
[Figure: clients, the Internet, cache servers, and origin servers]
DNS based server selection
• Originator problem
  • Assumes that clients are close to their local DNS servers
• Verify the assumption that clients are close to their local DNS servers
[Figure: DNS resolution of www.service.com. Client.myisp.net queries its local DNS server ns.myisp.net, which queries A.GTLD-SERVERS.NET (referral to ns.service.com) and then the authoritative DNS server ns.service.com; the server IP address is returned to the client.]
Measurement setup
• Three components
  • 1x1 pixel embedded transparent GIF image
    <img src=http://xxx.rd.example.com/tr.gif height=1 width=1>
  • A specialized authoritative DNS server
    • Allows hostnames to be wild-carded
  • An HTTP redirector
    • Always responds with “302 Moved Temporarily”
    • Redirects to a URL with the client IP address embedded
[Figure: www.att.com page with the embedded 1x1 transparent GIF]
Embedded image request sequence
Participants: client [10.0.0.1], redirector for xxx.rd.example.com, local DNS server, name server for *.cs.example.com, content server for the image
1. HTTP GET request for the image
2. HTTP redirect to IP10-0-0-1.cs.example.com
3. Request to resolve IP10-0-0-1.cs.example.com
4. Request to resolve IP10-0-0-1.cs.example.com
5. Reply: IP address of content server
6. Reply: content server IP address
7. HTTP GET request for the image
8. HTTP response
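The redirect in step 2 puts the client's address into the hostname, so the name server that later sees the lookup can pair that address with the address of the querying local DNS server. The following is a minimal Python sketch of that encoding, assuming the `IP10-0-0-1` label format shown in the sequence. The function names are hypothetical, and the `/tr.gif` path is borrowed from the image tag in the measurement setup; the slides do not show the actual implementation.

```python
import ipaddress

ZONE = "cs.example.com"  # wildcard zone named on the slide


def encode_client_hostname(client_ip: str, zone: str = ZONE) -> str:
    """Embed an IPv4 address in a hostname: 10.0.0.1 -> IP10-0-0-1.<zone>."""
    addr = ipaddress.IPv4Address(client_ip)  # raises ValueError on bad input
    return "IP" + str(addr).replace(".", "-") + "." + zone


def decode_client_ip(hostname: str, zone: str = ZONE) -> str:
    """Recover the client address from a queried hostname (case-insensitive)."""
    name = hostname.rstrip(".").lower()
    suffix = "." + zone.lower()
    if not name.endswith(suffix):
        raise ValueError("hostname is outside the wildcard zone")
    label = name[: -len(suffix)]
    if not label.startswith("ip"):
        raise ValueError("hostname does not carry an encoded address")
    return str(ipaddress.IPv4Address(label[2:].replace("-", ".")))


def redirect_location(client_ip: str, path: str = "/tr.gif") -> str:
    """Location header value the redirector would send with its 302 reply."""
    return "http://" + encode_client_hostname(client_ip) + path


def record_association(query_name: str, ldns_ip: str) -> tuple:
    """Pair the client encoded in the query name with the querying LDNS."""
    return (decode_client_ip(query_name), ldns_ip)
```

For example, `record_association("IP10-0-0-1.cs.example.com", "192.0.2.53")` yields one client-LDNS association; the LDNS address here is an illustrative value, not data from the study.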
Measurement data
Site   Participant                          Image hit count   Duration
1      att.com                              20,816,927        2 months
2,3    Personal pages (commercial domain)   1,743             3 months
4      AT&T research                        212,814           3 months
5-7    University sites                     4,367,076         3 months
8-19   Personal pages (university domain)   26,563            3 months
Measurement statistics
Data type                                                                  Count
Unique client-LDNS associations                                            4,253,157
HTTP requests                                                              25,425,123
Unique client IPs                                                          3,234,449
Unique LDNS IPs                                                            157,633
Client-LDNS associations where client and LDNS have the same IP address   56,086
Proximity metrics
• AS clustering
• Network clustering
• Traceroute divergence
• Roundtrip time correlation
AS clustering
• Autonomous System (AS)
  • A single administrative entity with unified routing policy
• Observes if client and LDNS belong to the same AS
Network clustering
• [Krishnamurthy, Wang sigcomm00]
• Based on BGP routing information using the longest prefix match
• Each prefix identifies a network cluster
• Observes if client and LDNS belong to the same network cluster
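Longest prefix match can be illustrated with a small Python sketch. It assumes the BGP prefixes are already available as CIDR strings; the prefixes, addresses, and function names below are illustrative and not taken from the study. A linear scan is used for clarity, where a real implementation over hundreds of thousands of prefixes would more likely use a trie.

```python
import ipaddress


def build_cluster_table(prefixes):
    """Parse CIDR strings (e.g. taken from a BGP table dump) into networks."""
    return [ipaddress.ip_network(p) for p in prefixes]


def network_cluster(ip, table):
    """Return the most specific prefix containing ip, or None if none does."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    # Longest prefix match: the candidate with the largest prefix length wins.
    return max(matches, key=lambda net: net.prefixlen)


def same_cluster(client_ip, ldns_ip, table):
    """True when client and LDNS map to the same network cluster."""
    cluster = network_cluster(client_ip, table)
    return cluster is not None and cluster == network_cluster(ldns_ip, table)


# Illustrative prefixes only.
TABLE = build_cluster_table(["10.0.0.0/8", "10.1.0.0/16", "10.1.2.0/24"])
```

With this table, `10.1.2.3` and `10.1.2.200` both fall in `10.1.2.0/24`, while `10.1.9.9` only matches as far as `10.1.0.0/16`, so it lands in a different cluster.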
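Traceroute divergence
• [Shaikh et al. infocom00]
• Use the last point of divergence
• Traceroute divergence: Max(3,4)=4
[Figure: traceroute paths from the probe machine to the client and to the local DNS server; past the divergence point one branch has 3 hops and the other 4]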
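One reading of the metric, consistent with the Max(3,4)=4 example, is the longer of the two hop counts that remain after the last router shared by both paths. The Python sketch below follows that reading; the hop names and the function name are hypothetical, and the handling of paths with no shared hop is an assumption of this sketch rather than something the slide specifies.

```python
def traceroute_divergence(path_to_client, path_to_ldns):
    """Longer remaining hop count after the last hop common to both paths.

    Each path is the ordered list of hops from the probe machine (excluded)
    to the destination (included).
    """
    ldns_index = {hop: i for i, hop in enumerate(path_to_ldns)}
    # Walk the client path backwards to find the last hop both paths share.
    for i in range(len(path_to_client) - 1, -1, -1):
        hop = path_to_client[i]
        if hop in ldns_index:
            client_tail = len(path_to_client) - 1 - i
            ldns_tail = len(path_to_ldns) - 1 - ldns_index[hop]
            return max(client_tail, ldns_tail)
    # Assumption: with no shared hop, the probe itself is the divergence point.
    return max(len(path_to_client), len(path_to_ldns))


# Hypothetical hops: two shared routers, then branches of 3 and 4 hops.
to_client = ["r1", "r2", "a1", "a2", "client"]
to_ldns = ["r1", "r2", "b1", "b2", "b3", "ldns"]
divergence = traceroute_divergence(to_client, to_ldns)  # max(3, 4)
```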
Roundtrip time correlation
• Correlation between message roundtrip times from a probe site to the client and its LDNS server
• The probe site represents a potential cache server location
• A crude metric, highly dependent on the probe site
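The slide does not say which correlation coefficient was used. As a sketch under that caveat, the Python below computes the Pearson coefficient over paired roundtrip times, one pair per client-LDNS association measured from a single probe site. The function name and the sample values are illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Illustrative roundtrip times in milliseconds, not measured data.
rtt_to_clients = [10.0, 20.0, 30.0, 40.0]
rtt_to_ldns = [12.0, 22.0, 32.0, 42.0]
r = pearson(rtt_to_clients, rtt_to_ldns)
```

A coefficient near 1 would suggest that, from this probe site, clients and their LDNS servers look similarly distant; a different probe site can give a different answer, which is the dependence the slide warns about.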
Aggregate statistics of AS/network clustering
• Out of more than 13,000 ASes, the data covers close to 75%
• Out of 440,000 unique prefixes, the data covers close to 25% of all possible network clusters
• We have a representative data set
Metrics              # client clusters   # LDNS clusters   Total # clusters
AS clustering        9,215               8,590             9,570
Network clustering   98,001              53,321            104,950
Proximity analysis: AS, network clustering
Metrics           Client IPs   HTTP requests
AS cluster        64%          69%
Network cluster   16%          24%
• AS clustering: coarse-grained
• Network clustering: fine-grained
• Most clients not in the same routing entity as their LDNS
• Clients with LDNS in the same cluster slightly more active
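The two columns weight the same associations differently: once per client and once per HTTP request. A minimal Python sketch of that computation follows, assuming one record per client-LDNS association with a request count attached. The record layout, names, and toy data are hypothetical.

```python
from typing import NamedTuple


class Association(NamedTuple):
    client_cluster: str
    ldns_cluster: str
    requests: int  # HTTP requests attributed to this client


def same_cluster_shares(assocs):
    """Return (share of clients, share of requests) with LDNS in the same cluster."""
    if not assocs:
        raise ValueError("no associations given")
    same = [a for a in assocs if a.client_cluster == a.ldns_cluster]
    total_requests = sum(a.requests for a in assocs)
    client_share = len(same) / len(assocs)
    request_share = sum(a.requests for a in same) / total_requests
    return client_share, request_share


# Toy data: the same-cluster clients also send more requests.
toy = [
    Association("AS1", "AS1", 30),
    Association("AS2", "AS3", 10),
    Association("AS4", "AS4", 50),
    Association("AS5", "AS6", 10),
]
shares = same_cluster_shares(toy)
```

In the toy data the request share exceeds the client share, which is the pattern behind "slightly more active" in the table.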
Proximity analysis: Traceroute divergence
• Probe sites: NJ (UUNET), NJ (AT&T), Berkeley (Calren), Columbus (Calren)
• Sampled from top half of busy network clusters
• Median divergence: 4
• Mean divergence: 5.8-6.2
• Ratio of common to disjoint path length
  • 72%-80% of pairs traced have a common path at least as long as the disjoint path
Improved local DNS configuration
• For client-LDNS associations not in the same cluster, do we know an LDNS in the client’s cluster?
Metrics           Client IPs            HTTP requests
                  Original   Improved   Original   Improved
AS cluster        64%        88%        69%        92%
Network cluster   16%        66%        24%        70%
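The "improved" figures can be read as a what-if: a client counts as well placed if any LDNS seen in the data set sits in its cluster, whether or not the client uses it. The Python sketch below follows that reading for the client-count column. The pair layout, the function name, and the toy data are assumptions of this sketch.

```python
def original_and_improved(assocs):
    """assocs: list of (client_cluster, ldns_cluster) pairs.

    Returns (original share, improved share) of clients with an LDNS in
    their own cluster.
    """
    if not assocs:
        raise ValueError("no associations given")
    # Clusters known to contain at least one LDNS, learned from the same data.
    ldns_clusters = {ldns for _, ldns in assocs}
    original = sum(1 for client, ldns in assocs if client == ldns)
    # A client already in its LDNS's cluster is also counted here.
    improved = sum(1 for client, _ in assocs if client in ldns_clusters)
    return original / len(assocs), improved / len(assocs)


# Toy data: cluster B hosts an LDNS, but the client in B uses one in A.
toy_pairs = [("A", "A"), ("B", "A"), ("C", "B"), ("D", "A")]
result = original_and_improved(toy_pairs)
```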
Impact on commercial CDNs
• Data set
  • Client-LDNS associations
  • LDNS-CDN associations
  • Available CDN servers
• Clients with CDN server in cluster
• Verifiable clients: with responsive LDNS
• Misdirected clients: directed to a cache not in client’s cluster
• Clients with LDNS not in same cluster
Impact on commercial CDNs: AS clustering
CDN                                                                    CDN X           CDN Y           CDN Z
Clients with CDN server in cluster                                     1,679,515       1,215,372       618,897
Verifiable clients                                                     1,324,022       961,382         516,969
Misdirected clients (% of verifiable clients)                          809,683 (60%)   752,822 (77%)   434,905 (82%)
Clients with LDNS not in client’s cluster (% of misdirected clients)   443,394 (55%)   354,928 (47%)   262,713 (60%)
Impact on commercial CDNs: Network clustering
CDN                                                                    CDN X           CDN Y           CDN Z
Clients with cache server in cluster                                   264,743         156,507         103,448
Verifiable clients                                                     221,440         132,567         90,264
Misdirected clients (% of verifiable clients)                          154,198 (68%)   125,449 (94%)   87,486 (96%)
Clients with LDNS not in client’s cluster (% of misdirected clients)   145,276 (94%)   116,073 (93%)   84,737 (97%)
Less than 10% of all clients
Conclusion
• Novel technique for finding client and local DNS associations
  • Fast, non-intrusive, and accurate
• DNS based server selection works well for coarse-grained load-balancing
  • 64% of associations in the same AS
  • 16% of associations in the same network cluster
• Server selection can be inaccurate if server density is high
Related work
• Measurement methodology
  1. IBM (Shaikh et al.)
     Time correlation of DNS and HTTP requests from DNS and Web server logs
  2. Boston University (Bestavros et al.)
     Assigning multiple IP addresses to a Web server
  Differences from our work
     Our methodology: efficient, accurate, nonintrusive
  3. Web bugs
• Proximity metrics
  • Cisco’s Boomerang protocol: uses latency from cache servers to the LDNS