OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35...

39
Rocky K. C. Chang Internet Infrastructure and Security Group The Hong Kong Polytechnic University 11 February 2011 ISMA 2011 AIMS-3 AIMS-III, 2011 1 OneProbe: Measuring network path quality with TCP data-packet pairs

Transcript of OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35...

Page 1: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Rocky K. C. Chang

Internet Infrastructure and Security Group

The Hong Kong Polytechnic University

11 February 2011

ISMA 2011 AIMS-3

AIMS-III, 20111

OneProbe: Measuring network path

quality with TCP data-packet pairs

Page 2: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Our group

AIMS-III, 20112

Active measurement

Non-cooperative path-quality measurement methodologies OneProbe (RTT, loss, reordering), capacity measurement, loss-pair

measurement, traceroute analysis

Applications Longitudinal analysis of network evolution, collaborative diagnosis of

routing and performance problems, impact analysis of submarine cable faults, …

Activities

Publications, research proposals, professional services

Work with HARNET, ISPs, data centers, ….

Plan to work with other groups, including CERNET in China

Page 3: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Outline

AIMS-III, 20113

1. Path-quality measurement methodologies

2. Applications

• Cooperative network measurement (a demo)

• An impact analysis of a submarine cable fault

3. Conclusions and future works

Page 4: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 20114

1. Path-quality measurement

Page 5: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Measuring e2e network paths

5 AIMS-III, 2011

Page 6: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Controlling both endpoints

E.g., one-way delay, OWAMP, TWAMP

Controlling one endpoint (non-cooperative measurement)

Using/hacking existing protocols

E.g., ping, tulip, sting …

Controlling zero endpoint

E.g., King

Active measurement models

6 AIMS-III, 2011

Page 7: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Controlling both endpoints

E.g., one-way delay, OWAMP, TWAMP

Controlling one endpoint (non-cooperative measurement)

Using/hacking existing protocols

E.g., ping, tulip, sting …

Controlling zero endpoint

E.g., King

Active measurement models

7 AIMS-III, 2011

Page 8: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

(Invalid) assumptions

Control-path quality = data-path quality

ICMP, TCP SYN, TCP RST

Middleboxes not an issue

Dropping, rate-limiting, additional latency

No changes in systems

Consecutive increment of IPID (e.g., tulip)

Sampling rate and pattern not an issue

8 AIMS-III, 2011

Page 9: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

(Invalid) assumptions

Control-path quality = data-path quality

ICMP, TCP SYN, TCP RST

Middleboxes not an issue

Dropping, rate-limiting, additional latency

No changes in systems

Consecutive increment of IPID (e.g., tulip)

Sampling rate and pattern not an issue

Invalid assumptions beget

unreliable measurement.

9 AIMS-III, 2011

Page 10: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Other problems in practice

Support only one or two metrics

Round-trip measurement

No control over packet sizes

Not integrated with application protocols

10 AIMS-III, 2011

Page 11: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Other problems in practice

Support only one or two metrics

Round-trip measurement

No control over packet sizes

Not integrated with application protocols

Practical issues stifle

deployment.

11 AIMS-III, 2011

Page 12: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Our design principles

Use normal data packet to measure data-path quality.

Use normal and basic data transmission mechanisms

Integrated into normal application sessions.

12 AIMS-III, 2011

Page 13: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Our design principles

Use normal data packet to measure data-path quality.

Use normal and basic data transmission mechanisms

Integrated into normal application sessions.

13 AIMS-III, 2011

Reliable measurement

Page 14: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

HTTP/OneProbe

Use normal TCP data packet to measure data-path quality.

Use normal and basic TCP data transmission mechanisms

specified in RFC 793.

Integrated into normal HTTP application sessions.

14 AIMS-III, 2011

OneProbe (TCP)

HT

TP

BitTorrent

RT

MP

… Data

clocking

Path

measure-

ment

Page 15: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

What does HTTP/OneProbe offer?

Continuous path monitoring in an HTTP session (stateful measurement)

All in one:

Round-trip time

Loss rate (uni-directional)

Reordering rate (uni-directional)

Capacity (uni-directional)

Loss-pair analysis

"Design and Implementation of TCP Data Probes for Reliable and Metric-Rich Network Path Monitoring,“ Proc. USENIX Annual Tech. Conf., June 2009.

OneProbe

RTT

Forward

Loss

Reverse

Loss

Forward

Reordering

Reverse

Reordering

Forward

CapacityReverse

Capacity

15 AIMS-III, 2011

Page 16: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

16 AIMS-III, 2011

Page 17: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

17 AIMS-III, 2011

Page 18: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

18 AIMS-III, 2011

Page 19: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

OneProbe: the probe design

Send two back-to-back probe

data packets.

Capacity measurement

Packet reordering

Determine which packet is lost.

Similarly for the response

packets

Each probe packet elicits a

response packet

19 AIMS-III, 2011

Page 20: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

OneProbe: Bootstrapping and continuous

monitoring

20 AIMS-III, 2011

Page 21: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

OneProbe: Loss and reordering

measurement via response diversity

21 AIMS-III, 2011

Page 22: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 201122

Discrepancy between ping RTT and OneProbe RTT

Page 23: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 201123

Highly asymmetric loss rates

Page 24: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 201124

Impact of configuration changes

Page 25: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 201125

2.1 Application: Collaborative path-

quality measurement

Page 26: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

HARNET measurement (since 1 Jan

2009)

AIMS-III, 201126

Page 27: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Running OneProbe at the 8 Us

AIMS-III, 201127

24x365 probing of the paths to 40+ websites

Page 28: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 201128

One

Pro

be@

HK

U

One

Pro

be@

CU

HK

One

Pro

be@

Cit

yU

One

Pro

be@

Pol

yU

One

Pro

be@

BU

One

Pro

be@

HK

UST

One

Pro

be@

HK

IED

One

Pro

be@

LU

40+ web servers selected by the JUCC

Planetopus,

database, etc

HKU CUHK PolyU CityU BU HKUST LU HKIED

Mea

sure

men

t si

deU

ser

side

Page 29: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 201129

Page 30: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 201130

2.2 Application: Impact analysis of

submarine cable faults

Page 31: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Eyjafjallajöekull volcano eruption

AIMS-III, 201131

Page 32: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Path-quality degradation for NOK

(Finland) and ENG (in UK)

AIMS-III, 201132

Page 33: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 201133

Page 34: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Network congestion caused by the

volcano ashes?

AIMS-III, 201134

The surges on packet loss and RTT occurred on 14 April

2009.

But

The onsets of the path congestion and air traffic disruption do

not entirely match.

Some of the peak loss rate and RTT occurred on weekends.

Path congestion can still be observed at the end of the

measurement period.

Page 35: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

A SEA-ME-WE 4 cable fault

AIMS-III, 201135

The SEA-ME-WE 4 cable encountered a shunt fault on the

segment between Alexandria and Marseille on 14 April 2010.

The repair was started on 25 April 2010, and it took four

days to complete.

During the repair, the service for the westbound traffic to

Europe was not available.

"Non-cooperative Diagnosis of Submarine Cable Faults,” Proc.

PAM 2011, March 2011.

Page 36: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

The SEA-ME-WE 4 cable

AIMS-III, 201136

Page 37: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

A plausible explanation for the network

congestion

AIMS-III, 201137

The congestion in the FLAG network was caused by taking

on rerouted traffic from the faulty SEA-ME-WE 4 cable.

FLAG does not use the SEA-ME-WE 4 cable for Hong Kong

NOKIA, ENG3, and BBC.

FLAG uses FEA for Hong Kong NOKIA, ENG3, and BBC

TATA uses different cables between Mumbai and London.

Page 38: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

Conclusions and current works

AIMS-III, 201138

Turning a network protocol into a measurement protocol.

Coming up a novel measurement method is just half a story.

Making it work in the non-cooperative Internet is hard.

Current works

Expanding OneProbe’s capability (e.g., asymmetric available

bandwidth)

Applications: fault localizations, SLA measurement, speed test,

net measurement neutrality, correlating with QoE, …

Page 39: OneProbe: Measuring network path quality with TCP data ... · A SEA-ME-WE 4 cable fault 35 AIMS-III, 2011 The SEA-ME-WE 4 cable encountered a shunt fault on the segment between Alexandria

AIMS-III, 201139

oneprobe.org