Ningning Hu Carnegie Mellon University 1
Optimizing Network Performance In Replicated Hosting
Peter Steenkiste (CMU)
with Ningning Hu (CMU),
Oliver Spatscheck (AT&T),
Jia Wang (AT&T)
Motivation
The question of how to use latency to select a replicated web server has been well studied
How about using available bandwidth?
Outline
Pathneck
Internet end user RTT distribution and access bandwidth distribution
Optimization results: for RTT, for bandwidth, and for data transmission time
Pathneck: Recursive Packet Train (RPT)
Two measurement packets are dropped at each router
ICMP packets allow source to estimate train length at each hop
Changes in train length provide bounds on the available bandwidth of each link
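[Figure: layout of the recursive packet train. 60 load packets (500 B each, TTL 100) sit between two groups of 20 measurement packets (60 B each). TTLs run 1, 2, ..., 20 in the leading group and 20, ..., 2, 1 in the trailing group.]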
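The train layout described above can be sketched in a few lines of Python. This is only an illustration of the TTL pattern, assuming the packet counts and TTL values shown on the slide; the function names `build_train` and `expiring_at_hop` are hypothetical and are not part of the Pathneck tool.

```python
def build_train(n_measure=20, n_load=60, load_ttl=100):
    """Return the initial TTL of every packet in the train, in send order."""
    head = list(range(1, n_measure + 1))   # leading measurement packets: 1, 2, ..., 20
    load = [load_ttl] * n_load             # load packets, all with a large TTL
    tail = list(range(n_measure, 0, -1))   # trailing measurement packets: 20, ..., 2, 1
    return head + load + tail


def expiring_at_hop(train, hop):
    """Indices of the packets whose TTL reaches 0 at router number `hop` (1-based)."""
    # A packet sent with TTL k is dropped by the k-th router on the path.
    return [i for i, ttl in enumerate(train) if ttl == hop]


train = build_train()
first, last = expiring_at_hop(train, 3)   # the two packets dropped at the third router
```

Each router within the first 20 hops drops exactly one packet from the head and one from the tail, which is what lets the source time the train at that hop from the two ICMP replies.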
Pathneck Operation
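[Figure: a packet train travelling from source S through routers R1, R2, and R3. Each router decrements the TTLs (load-packet TTLs go 100, 99, 98, 97), so the outermost measurement packets reach TTL 0 and are dropped at each hop. The source obtains the gap values g1, g2, and g3 for the three routers.]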
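As a rough illustration of how gap values might be read, the sketch below converts a gap into the rate of the train at that hop and flags the hop where the gap grows the most. It assumes that only the 60 load packets of 500 B count toward the train length, and the gap values are made up. It is a simplification for illustration, not Pathneck's actual detection algorithm.

```python
LOAD_BITS = 60 * 500 * 8  # 60 load packets of 500 B each, as on the RPT slide


def train_rate_mbps(gap_us):
    """Rate of the train implied by a gap in microseconds (bits per us equals Mbit/s)."""
    return LOAD_BITS / gap_us


def largest_gap_increase(gaps_us):
    """1-based hop number where the gap grows the most relative to the previous hop."""
    increases = [later - earlier for earlier, later in zip(gaps_us, gaps_us[1:])]
    return increases.index(max(increases)) + 2


# Hypothetical gap values (microseconds) reported for four consecutive hops.
gaps_us = [1200, 1250, 4800, 4900]
candidate_hop = largest_gap_increase(gaps_us)
rate_at_candidate = train_rate_mbps(gaps_us[candidate_hop - 1])
```

With these invented numbers the train stretches sharply at the third hop, so the link into that hop would be the candidate bottleneck, and the train rate there serves as a bound on its available bandwidth.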
Pathneck Properties
Pathneck is an active probing tool designed for locating Internet bottlenecks
It is efficient and effective
It also provides route, delay, and bandwidth information
For technical details, please see www.cs.cmu.edu/~hnn/pathneck
We improved Pathneck to cover the last hop
This allows us to measure the RTT and the access bandwidth of many end users
Methodology
Measurement sources: 18 nodes from a large tier-1 ISP
14 in the US, 3 in Europe, and 1 in East-Asia
A large fraction of paths cover other ISPs
They play the role of possible replica sites
Measurement destinations: 164,130 IP addresses from different prefixes
67,271 IPs correspond to real online hosts
Firewalls etc. sometimes require us to use an intermediate node as a “virtual” destination
They play the role of clients accessing the web
Results
Internet end user RTT distribution and access bandwidth distribution
Optimization results: for RTT, for bandwidth, and for data transmission time
RTT Distribution
The RTT “views” of Internet clients from different geographical locations are significantly different
[Figure: RTT distribution panels for US-NE, Europe, and East-Asia]
Bandwidth Distribution
[Figure: bandwidth distribution panels for US-NE, Europe, and East-Asia]
The bandwidth “views” are much more alike
End Access Bandwidth Distribution
Low access bandwidth still dominates among end users
40% < 2.2Mbps
50% < 4.2Mbps
62.5% < 10Mbps
Limited by downstream bandwidth of measurement source
Bottleneck Location Distribution
75% of bottleneck links are within the last two hops
There is little chance of avoiding these bottlenecks through replication
However, when access bandwidth is higher than 40Mbps, content replication can help to improve performance
Results
Internet end user RTT distribution and access bandwidth distribution
Optimization results: for RTT, for bandwidth, and for data transmission time
Optimization Algorithm
We use a simple greedy algorithm to optimize the performance of our replication infrastructure
In each step, select the replication node that has the largest marginal utility
The greedy algorithm has been shown to obtain results very close to the optimal results
For our study, it is only 0.1% worse than the optimal results from brute-force search
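The greedy selection and its brute-force comparison can be sketched as follows. The RTT table, the node names, and the choice of mean client RTT as the objective are assumptions made for illustration; the slides do not give the exact objective function.

```python
from itertools import combinations


def mean_best_rtt(chosen, rtt):
    """Mean over clients of the RTT to each client's closest chosen replica."""
    n_clients = len(next(iter(rtt.values())))
    return sum(min(rtt[r][c] for r in chosen) for c in range(n_clients)) / n_clients


def greedy_select(rtt, k):
    """Add, one at a time, the node that lowers the mean RTT the most."""
    chosen = []
    for _ in range(k):
        remaining = [r for r in rtt if r not in chosen]
        # The lowest resulting mean RTT is the largest marginal improvement.
        best = min(remaining, key=lambda r: mean_best_rtt(chosen + [r], rtt))
        chosen.append(best)
    return chosen


def brute_force_select(rtt, k):
    """Try every set of k nodes and keep the best one."""
    return min(combinations(rtt, k), key=lambda s: mean_best_rtt(s, rtt))


# Hypothetical RTTs in ms from three candidate sites to four clients.
rtt = {
    "us-east":   [20, 30, 120, 200],
    "europe":    [110, 100, 25, 180],
    "east-asia": [210, 190, 170, 30],
}
greedy = greedy_select(rtt, 2)
optimal = brute_force_select(rtt, 2)
```

On this toy table both methods pick the same two sites. Greedy needs on the order of n·k evaluations, whereas brute force has to examine every k-subset of the n candidates.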
RTT Optimization
RTT optimization results have a clear geographical pattern
The first 5 replicas provide most of the benefit
[Figure: RTT optimization results, with replicas labelled US-East, Europe, East-Asia, US-West, and US-Central]
Marginal Utility of RTT Optimization
The first 5 nodes have significant improvement (i.e., larger than 5%)
[ Marginal utility: the relative performance improvement from a specific node ]
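A minimal sketch of the marginal-utility calculation, assuming "relative improvement" means the change in the metric divided by its value before the node was added. The RTT series is invented, and the helper names are hypothetical.

```python
def marginal_utilities(perf):
    """Relative improvement contributed by each added node (lower perf is better)."""
    return [(prev - cur) / prev for prev, cur in zip(perf, perf[1:])]


def significant_count(utilities, threshold=0.05):
    """Number of leading nodes whose marginal utility exceeds the threshold."""
    count = 0
    for utility in utilities:
        if utility <= threshold:
            break
        count += 1
    return count


# Hypothetical mean RTT (ms) with 1, 2, 3, and 4 replicas deployed.
mean_rtt = [100.0, 80.0, 70.0, 68.0]
utilities = marginal_utilities(mean_rtt)
n_significant = significant_count(utilities)
```

For a metric where higher is better, such as bandwidth, the sign of the numerator would be flipped.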
Bandwidth Optimization
The first 2 replicas provide most of the benefit
Marginal Utility for B.W. Optimization
Only the first 2 (3) nodes have significant improvement
For Well-provisioned Access Links
Replication can indeed improve bandwidth performance for end users with access bandwidth larger than 40Mbps
[Figure annotations: 74%, 35%, 54Mbps]
Data Transmission Time
End-users’ data transmission time depends on delay, bandwidth, and data size
We estimate data transmission time using a simplified TCP model with a slow-start phase and a congestion-avoidance phase
The model assumes no packet loss
Slow start: transfer time is delay sensitive
Congestion avoidance: transfer time is bandwidth sensitive
Data size determines whether replication should optimize delay or bandwidth
We use the “slow-start size” as the crossover point
Results: 70% of paths have a slow-start size larger than 10KB, which is larger than the average web page
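One way to sketch such a loss-free model is shown below. The details are assumptions, not taken from the slides: the window starts at 2 segments of 1460 B, doubles every RTT until it reaches the bandwidth-delay product, and after that the path is treated as sending at the available bandwidth. The "slow-start size" is taken here to be the number of bytes sent before that point.

```python
def slow_start_size(rtt_s, bw_bps, mss=1460, init_segments=2):
    """Bytes sent while the window is still below the bandwidth-delay product."""
    bdp = bw_bps * rtt_s / 8          # bytes in flight needed to fill the path
    cwnd = mss * init_segments
    sent = 0
    while cwnd < bdp:
        sent += cwnd
        cwnd *= 2                     # no loss: the window doubles every RTT
    return sent


def transfer_time(size_bytes, rtt_s, bw_bps, mss=1460, init_segments=2):
    """Estimated seconds to deliver size_bytes under the simplified model."""
    bdp = bw_bps * rtt_s / 8
    cwnd = mss * init_segments
    elapsed, remaining = 0.0, size_bytes
    while remaining > 0 and cwnd < bdp:   # slow start: one window per RTT
        remaining -= cwnd
        elapsed += rtt_s
        cwnd *= 2
    if remaining > 0:                     # afterwards: limited by bandwidth
        elapsed += remaining * 8 / bw_bps
    return elapsed


small = transfer_time(10_000, rtt_s=0.1, bw_bps=8e6)       # finishes in slow start
large = transfer_time(10_000_000, rtt_s=0.1, bw_bps=8e6)   # dominated by bandwidth
```

In this sketch a 10 KB transfer takes the same time when the bandwidth doubles but twice as long when the RTT doubles, while a 10 MB transfer behaves the other way around, which is the crossover behaviour the slide describes.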
Data Transmission Time (2)
The transmission times for 10KB, 100KB, 1MB and 10MB are 0.4s, 1.1s, 6.4s, and 59.2s, respectively
Related Work
Content replication with different optimization metrics
Geographic location, network hops and latency, retrieval cost, update cost, storage cost, QoS guarantee, …
Greedy algorithm used in replica selection
Conclusion
Quantify Internet end-node access-bandwidth distribution and bottleneck location distribution
Two differences distinguish the optimization on bandwidth and on RTT
Geographic location is not important for bandwidth optimization
For throughput, only well-provisioned end users can benefit from content replication