Measuring and Understanding Internet Performance: A ...csrchang/Sun Yat-sen U-July-2012.pdf · 11...
Transcript of Measuring and Understanding Internet Performance: A ...csrchang/Sun Yat-sen U-July-2012.pdf · 11...
Rocky K. C. Chang
Internet Infrastructure and Security Group
The Hong Kong Polytechnic University
10 July 2012
Measuring and Understanding Internet
Performance: A Personal View
2 Graduate Forum, Sun Yat-sen University July 2012
Measuring end-to-end performance
Source: Akamai’s network performance comparison 3 Graduate Forum, Sun Yat-sen University July 2012
Why measuring network path? P
erf
orm
ance
me
tric
s
Latency
Delay variation (jitter)
Connectivity
Packet loss/reordering
Link/path capacity
Available Bandwidth
TCP throughput
Router hop (count)
Packet duplication
…
Ap
pli
cat
ion
s
Traffic engineering
• Network tomography
• Path fingerprinting
• Routing optimization
• QoS routing, admission control, channel assignment in WLAN
User profiling
• Network resource planning
• SLA verification
Application performance tuning
• Rate adaption for VoIP/video streaming apps
• Distance/location prediction for overlay networks, P2Ps, CDNs
…
4 Graduate Forum, Sun Yat-sen University July 2012
Graduate Forum, Sun Yat-sen University July 2012 5
An unfinished business
Much had been done in late 1990 and early 2000.
Very few measurement tools have made their way into wide
deployment.
The Internet is no longer friendly to measurement probes.
Many unfriendly and intelligent middleboxes
Measurement Lab from Google, PlanetLab, …
Measurement results may not reflect the experience of data
packets.
Continuous monitoring for inter-domain paths is hard
without receiving complaints.
Active Path-Quality Measurement
Challenges to active measurement Measurement scalability
Measure many network paths
Measurement reliability
Measurement will not be interfered or interrupted
Measurement representativeness
Measurement traffic representing the traffic of interest
Measurement accuracy
Measurement results are accurately statistically.
Bi-directional measurement
Measure both directions
Measuring multiple metrics
7 Graduate Forum, Sun Yat-sen University July 2012
Challenges to active measurement Measurement scalability
Cooperative measurement paradigm (e.g., OWAMP) not scalable
Measurement reliability
Interference from various middleboxes and firewalls
Measurement representativeness
Using control channel to measure data channel
Measurement accuracy
Sampling rate and patterns
Bi-directional measurement
Measure from both directions
Measuring multiple metrics
Need multiple tools
8 Graduate Forum, Sun Yat-sen University July 2012
A sampling of measurement tools
Graduate Forum, Sun Yat-sen University July 2012 9
Our approach to active measurement Measurement scalability
Non-cooperative measurement paradigm
Measurement reliability
Use standard protocol and legitimate application data
Measurement representativeness
Using data channel to measure data channel
Measurement accuracy
Supporting different sampling rate and patterns
Bi-directional measurement
Measure from only one direction
Measuring multiple metrics
Obtain multiple metrics from one side
10 Graduate Forum, Sun Yat-sen University July 2012
HTTP/OneProbe
Graduate Forum, Sun Yat-sen University July 2012 11
Use normal TCP data packet to measure data-path quality.
Use normal and basic TCP data transmission mechanisms
specified in RFC 793.
Integrated into normal HTTP application sessions.
OneProbe (TCP)
HT
TP
BitTorrent
RT
MP
… Data
clocking
Path
measure-
ment
What does HTTP/OneProbe offer?
Graduate Forum, Sun Yat-sen University July 2012 12
Continuous path monitoring in an HTTP session (stateful measurement)
All in one:
Round-trip time
Loss rate (uni-directional)
Reordering rate (uni-directional)
Capacity (uni-directional)
Loss-pair analysis
…
"Design and Implementation of TCP Data Probes for Reliable and Metric-Rich Network Path Monitoring,“ Proc. USENIX Annual Tech. Conf., June 2009.
OneProbe
RTT
Forward
Loss
Reverse
Loss
Forward
Reordering
Reverse
Reordering
Forward
Capacity Reverse
Capacity
Graduate Forum, Sun Yat-sen University July 2012 13
Graduate Forum, Sun Yat-sen University July 2012 14
Graduate Forum, Sun Yat-sen University July 2012 15
The probe design Send two back-to-back probe data packets.
Capacity measurement based on packet-pair dispersion
At least two packets for packet reordering
Determine which packet is lost.
16 Graduate Forum, Sun Yat-sen University July 2012
The probe design (cont’d)
Similarly for the response packets
Each probe packet elicits a response packet.
Adv. Window = 2 and acknowledge only 1 packet.
17 Graduate Forum, Sun Yat-sen University July 2012
Bootstrapping and continuous monitoring
18 Graduate Forum, Sun Yat-sen University July 2012
Loss and reordering measurement via
response diversity
19 Graduate Forum, Sun Yat-sen University July 2012
18 possible path events
20 Graduate Forum, Sun Yat-sen University July 2012
Based on their response packets
21 Graduate Forum, Sun Yat-sen University July 2012
Path event distinguishability
All 18 cases can be distinguished except for
A1. F1×R2 and F1×R3
A2. F1×RR and F1×R1
A3. F0×R3 and FR×R3
Resolving the ambiguities
A1 and A2: use RTT.
A3: use TCP timestamping.
22 Graduate Forum, Sun Yat-sen University July 2012
Our measurement methods Round-trip delay, asymmetric packet loss and packet reordering
measurement "Design and Implementation of TCP Data Probes for Reliable and Metric-Rich
Network Path Monitoring", Proc. USENIX Annual Tech. Conf., June 2009.
Capacity measurement "TRIO: Measuring Asymmetric Capacity with Three Minimum Round-Trip
Times", Proc. ACM CoNEXT Conf., Dec. 2011.
"A Minimum-Delay-Difference Method for Mitigating Cross-Traffic Impact on Capacity Measurement", Proc. ACM CoNEXT, December 2009.
Loss-pair measurement "Measurement of Loss Pairs in Network Paths", Proc. ACM/USENIX IMC,
November 2010.
Available bandwidth measurement "QDASH: A QoE-Aware DASH System", Proc. ACM Multimedia Systems Conf., Feb.
2012.
23 Graduate Forum, Sun Yat-sen University July 2012
Source
Non-cooperative
destination
The capacity measurement and loss-pair
measurement
Design and analyze three packet-pair methods for sound network measurements
24
Incorporate all the methods into a non-cooperative measurement
tool – HTTP/OneProbe [USENIX 08]
• MDDIF [CoNEXT 09], TRIO [CoNEXT 11], Loss pair [IMC 10]
• Fundamentals: decompose + recompose + recycle
Mitigate cross-traffic interference on path capacity measurement Eliminate measurement traffic interference on
asymmetric capacity measurement Recycle bad packet pairs to infer additional path properties
Graduate Forum, Sun Yat-sen University July 2012
Network capacity
2 8 Mbits/s
6 5
3
15 Mbits/s
4 1 Source Destination
Forward path
Reverse path
25
Link capacity One-way (forward-path) capacity Reverse-path capacity Asymmetric capacity Sub-path capacity
Graduate Forum, Sun Yat-sen University July 2012
Existing techniques: Identify the unaffected packet pair/train
Cross-traffic impact on packet pairs
p1
p1 p2
p1
p2
p2
Cro
ss
tra
ffic
Compressed PPD
p3
p3 p4
p3
p4
p4
Cro
ss
tra
ffic
Expanded PPD
p5
p5 p6
p5
p6
p6
Correct PPD
= S/Cb
Source
Destination
20 Mbits/s
8 Mbits/s
50 Mbits/s
Time
26
Round-trip
capacity
Correct PPD Correct PPD
Graduate Forum, Sun Yat-sen University July 2012
• Third PPD = p6’s delay – p5’s delay.
Delay difference = PPD
p1
p1 p2
p1
p2
p2
Cro
ss
tra
ffic
Compressed PPD
p3
p3 p4
p3
p4
p4
Cro
ss
tra
ffic
Expanded PPD
p5
p5 p6
p5
p6
p6
Correct PPD
= S/Cb
• The MDDIF method: Difference between first and
second packets’ minimum delays (minDelays)
Source
Destination
20 Mbits/s
8 Mbits/s
50 Mbits/s
Time
27
Round-trip
capacity
p3
p3 p2
p3
p2
p2
Graduate Forum, Sun Yat-sen University July 2012
rj-1 pj
dj-1 T
rj-1 pj
dj-1 T
pj rj
dj T
Source
Destination
Cr Cf pj pj-1 pj-1
pj rj
1-RTP (1,1)-TWP
rj-1 pj
TRIO: measuring asymmetric capacity
with three minRTTs
Exploit 1-RTP and (1,1)-TWP with Sf = Sr = S
28
• dj-1-dj-1 = S/Cf.
• dj-dj-1 = S/Cr.
• Avoid response interference!
pj-1
dj-1 R
T R
T T
• Reuse dj-1
• Avoid probe interference!
T
S/Cf
S/Cr
For self-diagnosis
dj R
AsymProbe, CapProbe,
PingPair
Taxonomy of capacity measurement
techniques
29
Clink, Pathchar, Pchar ACCSIG Available tools: Nettimer (tailgating) Packet quartet BBScope Envelope, MultiQ Bprobe, Pathrate,
Paśztor’s method,
PBM
MDDIF, TRIO DSLprobe,
SProbe
Loss-pair measurement
Packet pair with exactly one lost packet (defined by Liu & Crovella [liu01imw])
Path queueing delay Θ
LP01: Θj-1 = dj-1 – minRTT.
LP10: Θj = dj – minRTT. Buffer size of congested hop h’ [liu01imw]:
B = Θj x C(h’).
30
Source
Destination
dj
pj pj-1
LP10
dj-1
pj pj-1
LP01
Three questions:
1. Θj-1 = Θj?
2. Is B accurate? 3. Any additional info from Θj-1 and Θj?
Loss pairs
Graduate Forum, Sun Yat-sen University July 2012 31
Forward Path
Reverse Path
Collaborative path-quality
measurement
HARNET measurement (since 1 Jan
2009)
Graduate Forum, Sun Yat-sen University July 2012 33
Running OneProbe at the 8 Us
Graduate Forum, Sun Yat-sen University July 2012 34
24x365 probing of the paths to 40+ websites
Graduate Forum, Sun Yat-sen University July 2012 35
One
Pro
be@
HK
U
One
Pro
be@
CU
HK
One
Pro
be@
Cit
yU
One
Pro
be@
Pol
yU
One
Pro
be@
BU
One
Pro
be@
HK
UST
One
Pro
be@
HK
IED
One
Pro
be@
LU
40+ web servers selected by the JUCC
Planetopus,
database, etc
HKU CUHK PolyU CityU BU HKUST LU HKIED
Mea
sure
men
t si
de
Use
r si
de
Graduate Forum, Sun Yat-sen University July 2012 36
Application: Impact analysis of
submarine cable faults
Eyjafjallajöekull volcano eruption
Graduate Forum, Sun Yat-sen University July 2012 38
Path-quality degradation for NOK
(Finland) and ENG (in UK)
Graduate Forum, Sun Yat-sen University July 2012 39
Graduate Forum, Sun Yat-sen University July 2012 40
Network congestion caused by the
volcano ashes?
Graduate Forum, Sun Yat-sen University July 2012 41
The surges on packet loss and RTT occurred on 14 April
2009.
But
The onsets of the path congestion and air traffic disruption do
not entirely match.
Some of the peak loss rate and RTT occurred on weekends.
Path congestion can still be observed at the end of the
measurement period.
A SEA-ME-WE 4 cable fault
Graduate Forum, Sun Yat-sen University July 2012 42
The SEA-ME-WE 4 cable encountered a shunt fault on the
segment between Alexandria and Marseille on 14 April 2010.
The repair was started on 25 April 2010, and it took four
days to complete.
During the repair, the service for the westbound traffic to
Europe was not available.
"Non-cooperative Diagnosis of Submarine Cable Faults,” Proc.
PAM 2011, March 2011.
The SEA-ME-WE 4 cable
Graduate Forum, Sun Yat-sen University July 2012 43
A plausible explanation for the network
congestion
Graduate Forum, Sun Yat-sen University July 2012 44
The congestion in the FLAG network was caused by taking
on rerouted traffic from the faulty SEA-ME-WE 4 cable.
FLAG does not use the SEA-ME-WE 4 cable for Hong Kong
NOKIA, ENG3, and BBC.
FLAG uses FEA for Hong Kong NOKIA, ENG3, and BBC
TATA uses different cables between Mumbai and London.
The current phase
Graduate Forum, Sun Yat-sen University July 2012 45
• Server-side measurement methods
• Induce data from clients for measurement.
• Quality measurement without user intervension
• NetMagic implementation of measurement boxes
• Supporting client-side and server-side measurement
• http://www.netmagic.org/
• CERNET-2 measurement platform
• Deploy a measurement platform on CERNET-2
• IPv6 measurement
• Residential broadband measurement platform
• SLA measurement
• Facilitate a social network for network diagnosis and monitoring
More projects
Graduate Forum, Sun Yat-sen University July 2012 46
Network performance data analytics
What and when to induce for measurement?
What can we say from the measurement data with high confidence?
Automating diagnosis and patch-up of network performance problems
By a group of individual home users?
Network tomography
Creative performance data visualization
Spatial and temporal data visualization
Visualization for naked-eye diagnosis
Conclusions
Graduate Forum, Sun Yat-sen University July 2012 47
Develop a suite of atomic path-quality measurement methods. Atomic => application specific, e.g., video, cloud services Path quality => QoE Client side => server side
Network data research Mining network data Designing measurement “experiments” to facilitate network data
mining Towards a science of network research
Operational experience informs research; research underpins network operations Unearthing important problems and questions from operations Putting research output into practice.
Graduate Forum, Sun Yat-sen University July 2012 48
PAM 2013 @ HKPolyU
http://pam2013.comp.polyu.edu.hk/