
1

Advanced Networking Technologies

Chapter 7: Transport Layer Evolution


2

Content

q TCP congestion control schemes
q Multipath TCP
q SCTP
q SPDY and HTTP/2
q QUIC


3

TCP congestion control: What is the problem?

q Relies on packets getting lost!
q Root cause of all the buffer problems
q Degrades quality for other services
q Takes some time to measure with large buffers

q Assumes packets get lost due to congestion
q What about wireless?

q Direct dependency on the bandwidth-delay product
q Maximum window 64 KB out of the box
q Problems utilizing intercontinental links
q Problems utilizing 10 Gbit/s links
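To make the window limitation concrete, a small back-of-the-envelope sketch (the numbers are illustrative, not from the slides):

```python
# Maximum throughput achievable with a given receive window: window / RTT.
# With the classic 16-bit window (64 KB) an intercontinental path stays far from full.
def max_throughput_bps(window_bytes, rtt_s):
    return window_bytes * 8 / rtt_s

rtt = 0.150                                       # ~150 ms intercontinental RTT
print(max_throughput_bps(64 * 1024, rtt) / 1e6)   # ~3.5 Mbit/s with a 64 KB window
print(10e9 * rtt / 8 / 1e6, "MB window needed")   # ~187.5 MB to fill a 10 Gbit/s pipe
```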


4

TCP congestion control: Significance of the problem

q Wireless routers implemented TCP stitching

q Nowadays:
q Router vendors implement AQM strategies

■ Discussed earlier
q Link layer has retransmissions & PAUSE frames
q WAN optimizers (decrease number of ACKs, etc.)

➡ Everything is built around TCP

[Figure: "Normal" TCP on one side; I-TCP, METP, SNOOP, or similar on the wireless side]

5

TCP evolution

q Can we let TCP evolve?
q Yes, but it takes time

■ New algorithms may make use of existing protocol fields
■ New fields via extensions

q RFCs take time to reach publication
q Need to be adopted by OSes & testers (chicken-or-egg problem)
q Must not break existing TCP algorithms
q Must not mess with fairness

q Major improvements these days: OS vendors "simply" implement new strategies
q CTCP
q CUBIC
q BBR


6

TCP SACK option

q Introduces Selective Acknowledgments
q RFC 2018 from 1996
q Redefined in RFC 3517 from 2003 & RFC 6675 from 2012
q Supported by all main operating systems

q Negotiated during handshake

q Simple solution?!
q Problem solved?!
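As a side note, the on-the-wire encoding is indeed simple; a minimal sketch of the two SACK-related options defined in RFC 2018 (the packing helper is illustrative):

```python
# SACK-Permitted (kind 4, length 2) is negotiated in the SYN; SACK blocks
# (kind 5) carry pairs of 32-bit left/right edges of received byte ranges.
import struct

SACK_PERMITTED = struct.pack("!BB", 4, 2)

def sack_option(blocks):
    # blocks: list of (left_edge, right_edge) sequence-number pairs
    payload = b"".join(struct.pack("!II", left, right) for left, right in blocks)
    return struct.pack("!BB", 5, 2 + len(payload)) + payload

print(sack_option([(1000, 1500), (2000, 2500)]).hex())  # 18-byte option with 2 blocks
```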


7

TCP SACK option – Implementation errors

(Misbehavior categories A1, A2, B, C, D, E, F, G per Ekiz et al.; ❌ = misbehavior found)

OS                            Misbehaviors found
FreeBSD 5.3-5.4               ❌ ❌
FreeBSD 6.0-8.0               (none)
Linux 2.2.20-2.6.18           ❌
Linux 2.6.31                  (none)
MacOS X 10.5-10.6             (none)
OpenBSD 4.2-4.8               ❌ ❌
OpenSolaris 2008.05-2009.06   ❌ ❌
Solaris 10                    ❌
Solaris 11                    ❌
Windows 2000-2003             ❌ ❌ ❌ ❌ ❌
Windows Vista-7               ❌ ❌


➡ Degraded performance, but eventually consistent as timeouts delete SACK state (luckily)

Ekiz et al, Misbehaviors in TCP SACK Generation, ACM SIGCOMM Computer Communication Review, 2011

8

TCP SACK option – Gain?

[Figure: average throughput [Mb/s] vs. burst error rate (1e-7 to 1e-3) for Tahoe, Reno, NewReno, SACK, and Westwood+]


Nguyen et al, An Implementation of the SACK-Based Conservative Loss Recovery Algorithm for TCP in ns-3 (extended version), Extended version of Wns3, 2015


9

C-TCP

q Compound TCP, introduced by Microsoft in 2005
q 3 RFC drafts posted until November 2008
q Enabled by default in Windows server editions; requires enabling on clients

q Idea: Two congestion windows
q "Normal" loss-based one
q One for delay, which also estimates the bottleneck queue
q Sum up, i.e., win = min(cwnd + dwnd, awnd)

q Delay-based congestion window increases quickly when a long delay is observed

q Decreases to 0 to reach "normal" steady-state behavior afterwards


10

C-TCP – Window behavior

[Figure: congestion window over time for standard TCP, CTCP, and the delay window DWND]


Tan et al.: A Compound TCP Approach for High-speed and Long Distance Networks, TR Microsoft, 2005

C-TCP assumes a backlog of γ packets at the bottleneck


11

C-TCP – Throughput in lossy networks

[Figure: throughput (Mbps, 0-700) vs. packet loss rate (0.01 down to 0) for Regular TCP, HSTCP, and CTCP]


12

C-TCP – Fairness

[Figure: bandwidth stolen (0-0.8) vs. packet loss rate for CTCP and HSTCP]


13

CUBIC

q Developed from an older, more complex algorithm: BIC
q Today's standard in Linux and macOS

q Ideas:
q Decrease queues in routers, send data at the expected bandwidth
q Aggressively increase bandwidth periodically to probe for more
q Scale the window using a cubic function


A. BIC Window Growth Function Before delving into CUBIC, let us examine the features of

BIC. The main feature of BIC is its unique window growth

function.

Fig. 1 shows the growth function of BIC. When it gets a

packet loss event, BIC reduces its window by a multiplicative

factor β. The window size just before the reduction is set to the

maximum Wmax and the window size just after the reduction is

set to the minimum Wmin. Then, BIC performs a binary search

using these two parameters – by jumping to the “midpoint”

between Wmax and Wmin. Since packet losses have occurred at

Wmax, the window size that the network can currently handle

without loss must be somewhere between these two numbers.

However, jumping to the midpoint could be too much

increase within one RTT, so if the distance between the

midpoint and the current minimum is larger than a fixed

constant, called Smax, BIC increments the current window size

by Smax (linear increase). If BIC does not get packet losses at the

updated window size, that window size becomes the new

minimum. If it gets a packet loss, that window size becomes the

new maximum. This process continues until the window

increment is less than some small constant called Smin at which

point, the window is set to the current maximum. So the

growing function after a window reduction will be most likely

to be a linear one followed by a logarithmic one (marked as

“additive increase” and “binary search” respectively in Fig. 1).

If the window grows past the maximum, the equilibrium

window size must be larger than the current maximum and a

new maximum must be found. BIC enters a new phase called

“max probing.” Max probing uses a window growth function

exactly symmetric to those used in additive increase and binary

search – only in a different order: it uses the inverse of binary

search (which is logarithmic; its reciprocal will be exponential)

and then additive increase. Fig. 1 shows the growth function

during max probing. During max probing, the window grows

slowly initially to find the new maximum nearby, and after

some time of slow growth, if it does not find the new maximum

(i.e., packet losses), then it guesses the new maximum is further

away so it switches to a faster increase by switching to additive

increase where the window size is incremented by a large fixed

increment.

The good performance of BIC comes from the slow increase

around Wmax and linear increase during additive increase and

max probing.

B. CUBIC Window Growth Function Although BIC achieves pretty good scalability, fairness, and

stability during the current high speed environments, the BIC’s

growth function can still be too aggressive for TCP, especially

under short RTT or low speed networks. Furthermore, the

several different phases of window control add a lot of

complexity in analyzing the protocol. We have been searching

for a new window growth function that while retaining most of

strengths of BIC (especially, its stability and scalability),

simplifies the window control and enhances its TCP

friendliness.

In this paper, we introduce a new high-speed TCP variant:

CUBIC. As the name of the new protocol represents, the

window growth function of CUBIC is a cubic function, whose

shape is very similar to the growth function of BIC. CUBIC is

designed to simplify and enhance the window control of BIC.

More specifically, the congestion window of CUBIC is

determined by the following function:

    W_cubic(t) = C (t - K)^3 + W_max    (1)

where C is a scaling factor, t is the elapsed time from the last window reduction, W_max is the window size just before the last window reduction, and K = (W_max β / C)^(1/3), where β is a constant multiplicative decrease factor applied for window reduction at the time of a loss event (i.e., the window reduces to β·W_max at the time of the last reduction).

Fig. 2 shows the growth function of CUBIC with the origin

at Wmax. The window grows very fast upon a window reduction,

but as it gets closer to Wmax, it slows down its growth. Around

Wmax, the window increment becomes almost zero. Above that,

CUBIC starts probing for more bandwidth in which the

window grows slowly initially, accelerating its growth as it

moves away from Wmax. This slow growth around Wmax

enhances the stability of the protocol, and increases the

utilization of the network while the fast growth away from Wmax

ensures the scalability of the protocol.

The cubic function ensures the intra-protocol fairness among

the competing flows of the same protocol. To see this, suppose

that two flows are competing on the same end-to-end path. The

[Fig. 1: The Window Growth Function of BIC: additive increase, binary search, and max probing around Wmax]
[Fig. 2: The Window Growth Function of CUBIC: steady-state behavior and max probing around Wmax]

Rhee et al.: CUBIC: A New TCP-Friendly High-Speed TCP Variant, ACM SIGOPS Operating Systems Review 42.5 (2008)

W_cubic(t) = C (t - (W_max β / C)^(1/3))^3 + W_max
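To get a feeling for the cubic curve in Eq. (1), a tiny numeric sketch; C = 0.4 and a backoff to 0.8·W_max follow the evaluation settings quoted above, and K is chosen here as the time at which the window is back at W_max (everything else is illustrative, this is not the Linux implementation):

```python
# Cubic window curve W(t) = C (t - K)^3 + W_max, with K picked so that
# W(0) equals the window right after the reduction (0.8 * W_max here).
def cubic_window(t, w_max, w_after_loss, c=0.4):
    k = ((w_max - w_after_loss) / c) ** (1.0 / 3.0)  # time until W(t) is back at W_max
    return c * (t - k) ** 3 + w_max

w_max = 100.0                                        # packets, before the last loss
for t in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0):             # seconds since the reduction
    print(t, round(cubic_window(t, w_max, 0.8 * w_max), 1))
# Output shows fast growth at first, a plateau around W_max, then probing beyond it.
```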

14

CUBIC – Throughput over time

III. PERFORMANCE EVALUATION

In this section, we present some performance results regarding the TCP friendliness and stability of CUBIC and other high-speed TCP variants. For CUBIC, we set β to 0.8, C to 0.4, and Smax to 160. We use NS-2 for simulation. The network topology is a dumbbell. For each simulation run, we run four flows of a high-speed protocol and four flows of regular long-term TCP SACK over the same end-to-end paths for the entire duration of the simulation; their starting times and RTTs are slightly varied to reduce the phase effect. About 10% of background traffic is added in both forward and backward directions of the dumbbell setup. For all experiments, unless noted explicitly, the buffer size of the Drop Tail routers is set to 100% of BDP.

Experiment 1: TCP Friendliness in Short-RTT Networks (simulation script available on the BIC web site): We test five high-speed TCP variants: CUBIC, BIC, HSTCP, Scalable TCP, and HTCP. We set the RTT of the flows to around 10 ms and vary the bottleneck bandwidth from 20 Mbps to 1 Gbps. Fig. 5 shows the throughput ratio of the long-term TCP flows over the high-speed flows (the TCP-friendly ratio) measured from these runs.

The surprising result is that BIC and STCP even show worse TCP friendliness over 20 Mbps than over 100 Mbps. However, we are still not sure of the exact reason for this result. Over 100 Mbps, all the high-speed protocols show reasonable friendliness to TCP. As the bottleneck bandwidth increases from 100 Mbps to 1 Gbps, the ratios for BIC, HSTCP and STCP drop dramatically, indicating unfair use of bandwidth with respect to TCP. Under all these environments, regular TCP can still use the full bandwidth. Scalable TCP shows the worst TCP friendliness in these tests, followed by BIC and HSTCP. CUBIC and HTCP consistently give good TCP friendliness.

Experiment 2: TCP Friendliness in Long-RTT Networks (simulation script available on the BIC web site): Although the TCP mode improves the TCP friendliness of the protocol, it does so mostly in short-RTT situations. When the BDP is very large with long RTT, the aggressiveness of the window growth function (more specifically, the congestion epoch length) has a more decisive effect on the TCP friendliness. As the epoch gets longer, it gives TCP flows more time to grow their windows.

An important feature of BIC and CUBIC is that they keep the epoch fairly long without losing scalability and network utilization. Generally, in AIMD, a longer congestion epoch means slower increase (or a smaller additive factor). However, this would reduce the scalability of the protocol, and the network would also be underutilized for a long time until the window becomes fully open (note that this is true only if the multiplicative decrease factor is large; but we cannot keep the multiplicative factor too small, since that implies much slower convergence to the equilibrium). Unlike AIMD, CUBIC increases the window to (or to the vicinity of) Wmax very quickly and then holds the window there for a long time. This keeps the scalability of the protocol high, while keeping the epoch long and the utilization high. This feature is unique to both BIC and CUBIC.

In this experiment, we vary the bottleneck bandwidth from 20 Mbps to 1 Gbps and set the RTT to 100 ms. Fig. 6 shows the throughput ratio of long-term TCP over the high-speed TCP variants. Over 20 Mbps, all the high-speed protocols show reasonable friendliness to TCP. As the bandwidth gets larger than 20 Mbps, the ratio drops quite rapidly. Overall, CUBIC shows a better friendly ratio than the other protocols.

Experiment 3: Stability (Simulation script available in the

[Fig. 5: TCP-Friendly Ratio in Short-RTT Networks: TCP/high-speed throughput ratio (%) vs. link speed (20 to 1000 Mbps) for CUBIC, BIC, HSTCP, STCP, HTCP]

Fig. 4: CUBIC window curves with competing flows (NS simulation in a network with 500Mbps and 100ms RTT), C = 0.4, β = 0.8.

[Fig. 6: TCP-Friendly Ratio in Long-RTT Networks: TCP/high-speed throughput ratio (%) vs. link speed (20 to 1000 Mbps) for CUBIC, BIC, HSTCP, STCP, HTCP]


15

CUBIC - Maybe fair but takes forever


Fig. 2. Cubic TCP cwnd time histories following startup of a second flow. Bandwidth is 10 Mbit/s (top), 250 Mbit/s (middle) and 500 Mbit/s (bottom). RTT is 200 ms, queue size 100% BDP, no web traffic.

This effect is reinforced by changes to the AIMD backoff factor. In standard TCP, flows back off cwnd by 0.5 on detecting packet loss. Strategies such as BIC-TCP and Cubic-TCP instead use a backoff factor of 0.8. As a result, flows release bandwidth more slowly when informed of congestion, again having the effect of slowing convergence.

B. Slow convergence implies prolonged unfairness.

One consequence of slow convergence is that periods of extreme unfairness between flows may persist for long periods, even in situations where flows do eventually converge to fairness. Such situations are masked when fairness results are presented purely in terms of long-term averages. However, this behaviour is immediately evident, for example, in the time histories shown in Figure 2, and it seems clear that it has


Fig. 3. Ratio of throughputs of two Cubic TCP flows with the same RTT (also sharing the same bottleneck link and operating the same congestion control algorithm) as the path propagation delay is varied. Flow throughputs are averaged over the last 200 s of each test run and so approximate asymptotic behaviour, neglecting initial transients. Results are shown for 10 Mbit/s and 250 Mbit/s bottleneck bandwidths. The bottleneck queue size is 100% BDP, no web traffic.


Fig. 4. Impact of web traffic on convergence. Evolution of mean bandwidth, averaged over 20 test runs, following startup of a second flow. 200 background web flows (100 in each direction). Link bandwidth is 250 Mbit/s, RTT is 200 ms, queue size 100% BDP.

important practical implications. For example, two identical file transfers may have very different completion times depending on the order in which they are started. Also, long-lived flows can gain a substantial throughput advantage at the expense of shorter-lived flows. The latter seems particularly problematic as the majority of TCP flows are short to medium sized, and so a single long-lived flow may potentially penalize a large number of users (akin to a form of denial of service). With regard to the last point, the impact of a long-lived flow on a short-lived flow is illustrated, for example, in Figure 5. Here, we measure the completion time for a download versus the size of the download. Measurements are shown (i) for the baseline case where no other flow shares the bottleneck link and (ii) for the case where a single long-lived flow shares the link and competes for bandwidth. It can be seen that in the baseline situation, Cubic-TCP, standard TCP and H-TCP all exhibit similar completion times. It is perhaps initially surprising that standard TCP performs so well in this test, in view of concerns about performance in high-speed paths. However, we note that the link in this example is provisioned with a BDP of buffering. A standard TCP flow slow-starts to


Leith et al.: Experimental evaluation of Cubic-TCP, 2008

16

Reinventing congestion control: BBR

q Estimates bottleneck bandwidth and round-trip propagation time (BBR)

q Developed at Google, available in recent Linux kernels

q Goal: Reduce buffer bloat by optimizing TCP
q Idea: "congestion-based" control, i.e. observe how much data is in flight
q Idea: keep queues filled at the sender only
q Another old idea: also use delay information (but differently from New Vegas etc.)
q Not clocked by ACKs, but paced

q Following slides based on:
q Cardwell et al.: BBR Congestion Control, IETF Meeting 97, Seoul, 2016
q Cardwell et al.: BBR: Congestion-Based Congestion Control, ACM Queue, 2016


17

BBR: Working point with increasing bandwidth

[Figure: delivery rate and RTT vs. amount in flight; markers: optimal operating point (at BDP), where loss-based CC starts controlling (at BDP + BufSize), where loss-based CC works in lossy networks]

18

Phases in BBR: 1. Exponential BW search

q Exponential BW search
q Increase then decrease exponentially
q Probes for max bandwidth by monitoring in-flight data & ACKs

[Figure: delivery rate / RTT vs. amount in flight (as before), optimal point at BDP]


19

Phases in BBR: 2. Drain queues

q Exponentially decrease in-flight data
q Clears queues fast again
q By monitoring in-flight data & ACKs

[Figure: delivery rate / RTT vs. amount in flight (as before), optimal point at BDP]

20

Phases in BBR: 3. Refresh measurements

q Periodically increase the send rate to probe for more bandwidth
q Periodically decrease it to probe for the minimal RTT
q Remember: BW * RTT = max in-flight data being processed (see the sketch below)

[Figure: delivery rate / RTT vs. amount in flight (as before), optimal point at BDP]
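A minimal sketch of the two estimators behind these phases (a windowed max of the delivery rate and a windowed min of the RTT) and the PROBE_BW-style gain cycling; this is not Google's implementation, and all names and values are illustrative:

```python
from collections import deque

class BbrEstimator:
    def __init__(self, bw_window=10, rtt_window_s=10.0):
        self.bw_samples = deque(maxlen=bw_window)   # recent delivery-rate samples
        self.rtt_samples = []                       # (timestamp, rtt) pairs
        self.rtt_window_s = rtt_window_s

    def on_ack(self, now, delivered_bytes, interval_s, rtt_s):
        self.bw_samples.append(delivered_bytes / interval_s)
        self.rtt_samples = [(t, r) for (t, r) in self.rtt_samples
                            if now - t <= self.rtt_window_s]
        self.rtt_samples.append((now, rtt_s))

    def btl_bw(self):        # bottleneck bandwidth estimate: windowed max
        return max(self.bw_samples)

    def rt_prop(self):       # propagation RTT estimate: windowed min
        return min(r for _, r in self.rtt_samples)

    def bdp_bytes(self):     # BW * RTT = max in-flight data without queuing
        return self.btl_bw() * self.rt_prop()

# PROBE_BW-style pacing-gain cycle: probe up briefly, drain, then cruise.
GAIN_CYCLE = [1.25, 0.75, 1, 1, 1, 1, 1, 1]

est = BbrEstimator()
est.on_ack(now=1.0, delivered_bytes=125_000, interval_s=0.1, rtt_s=0.04)
print(est.btl_bw(), est.rt_prop(), est.bdp_bytes())  # 1.25 MB/s, 40 ms, 50 kB in flight
```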


21

BBR behavior compared to CUBIC

[Figure: BBR and CUBIC start-up behavior: RTT (ms) and data sent or ACKed (MBytes) over time through the STARTUP, DRAIN, and PROBE_BW phases; CUBIC (red), BBR (green), ACKs (blue)]

22

Fairness to RENO & CUBIC: Sharing deep buffers with loss-based CC

At first CUBIC/Reno gains an advantage by filling deep buffers

But BBR does not collapse; it adapts: BBR's bw and RTT probing tends to drive the system toward fairness

Deep-buffer data point (buffer = 8 * BDP): bw = 10 Mbps, RTT = 40 ms
-> CUBIC 6.31 Mbps vs. BBR 3.26 Mbps

q CUBIC with a small advantage here, but about fair
q Depending on parameters


23

BBR - Throughput in lossy networks

BBR vs CUBIC synthetic bulk TCP test with 1 flow, bottleneck_bw 100Mbps, RTT 100ms

Fully use bandwidth, despite high loss


What does this mean for fairness?

24

All fixed? Maybe, maybe not yet


http://blog.cerowrt.org/post/birthday_problem/


25

Excursion: Why is everybody worried about fairness?

q Back in the very old days people tried to optimize network power

q Delay average weighted by throughput!

q We want to maximize P using only locally observable information
q Capacity of all links on the path
q # of users sharing each link on the path
q Message rate of all of these users

q Unfortunately this is impossible


J. Jaffe: Flow Control Power is Nondecentralizable, IEEE Transactions on Communications, 1981

  Power P = Total Throughput λ̂ / Average Delay D

26

Maximizing P [Kleinrock 78]

q Maximizing P for a single link yields:

    P' = (λ'·D - λ·D') / D^2

    P_max  =>  P' = 0  =>  λ'·D = λ·D'  =>  D/λ = dD/dλ

q For a single link (capacity μ) modelled by an M/M/1 system (not being overburdened):

    D = 1/(μ - λ),   dD/dλ = 1/(μ - λ)^2   =>   λ = μ/2
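A quick numeric check of this result (the values are arbitrary, purely illustrative):

```python
# For an M/M/1 link, P(lam) = lam / D(lam) = lam * (mu - lam); scanning lam
# confirms the maximum sits at lam = mu/2.
mu = 10.0                                   # link capacity [packets/s]
candidates = (mu * i / 1000 for i in range(1, 1000))
best_power, best_lam = max((lam * (mu - lam), lam) for lam in candidates)
print(best_power, best_lam)                 # -> 25.0 5.0, i.e. lam = mu/2
```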


27

Network power: Example network

q Gives us:

[Figure: two autonomous servers sending over links with capacities μ1 and μ2]

    λ̂ = (μ1 + μ2) / 2

    D = λ̂⁻¹ (2λ1/μ1 + 2λ2/μ2) = 4 / (μ1 + μ2)

    P = (1/8) (μ1 + μ2)^2

Optimal if μ1 = μ2

28

Network power: Counter example

q Assume µ1 >> µ2

q Let only server 1 send data

[Figure: the same two autonomous servers, now with only server 1 sending]

    λ̂ = μ1 / 2

    D = λ̂⁻¹ (2λ1/μ1 + 0/μ2) = 2 / μ1

    P = μ1^2 / 4, which is larger than (1/8)(μ1 + μ2)^2 for μ1 >> μ2
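A corresponding check of the counter-example (again with arbitrary values):

```python
# Compare the "fair" operating point (both servers send mu_i/2) with the selfish
# one (only server 1 sends, at mu1/2) for mu1 >> mu2, using P = throughput/delay.
mu1, mu2 = 100.0, 1.0
p_both = (mu1 + mu2) ** 2 / 8       # throughput (mu1+mu2)/2, delay 4/(mu1+mu2)
p_only1 = mu1 ** 2 / 4              # throughput mu1/2,       delay 2/mu1
print(p_both, p_only1)              # 1275.125 vs 2500.0: "power" is higher when server 2 is starved
```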


29

Network power: Conclusion

q (Now) obvious: Optimizing the overall network requires knowledge of the whole network ➡ would not scale

q More fundamental:
q "No performance criterion based on λ̂ and D is decentralizable."
q Details in [Jaf81]

q Focusing on global optimization metrics may simply not be the right thing to do…
q Fairness between flows is a local criterion
q Seems more suited


30

Content

q TCP congestion control schemes
q Multipath TCP
q SCTP
q SPDY and HTTP/2
q QUIC


31

Multipath TCP – Motivation

q TCP connections are bound to a host's IP addresses
q IP address determines routing between hosts (unless IP spoofing)

q In many scenarios insufficient (resilience, bandwidth, mobility)
q Mobile users
■ Handoff between WiFi and LTE
q Channel bundling
■ Why can't I have two DSL connections and use both?
q Data centers
■ Fat tree topologies
■ Aggregating links
■ Advantages with resilience and blocking


C. Paasch: Decoupled from IP, TCP is at last able to support multihomed hosts, ACM queue, March 2014

32

Multipath – Blocking in Fat Trees

➡ Not full control over routing, but at least the lowest layer


33

Multipath TCP – Objectives

q Scenario:
q Work in any scenario with multiple IP addresses
q No source routing or the like (the cross-product of IP addresses is already many)

q Fully backward compatible
q No change to the socket API
q No change to middleboxes
■ Firewalls
■ NICs (think of TSO)
■ WAN optimizers
q No unfairness to non-multipath-aware TCP
q Everything built with TCP option headers


34

Multipath TCP – Architecture

[Figure: identical stacks on both hosts: Application over the Socket API, MPTCP inside the transport layer multiplexing subflows TCP 1, TCP 2, ..., TCP n, over the network layer]


35

Multipath TCP – Connection setup

q Subflows created after devices agree to use MPTCP (middle-box safe)

q Keys exchanged during setup
q Used to bind other sessions cryptographically to the master session
q Kb is echoed back for stateless operation
■ Why? State already existing?
q Token to identify the MPTCP session derived from the keys (see the sketch below)

[Figure: message sequences for the initial connection and for establishing additional connections]
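A small sketch of the token derivation: RFC 6824 defines the token as the most significant 32 bits of the SHA-1 hash of the key; the helper name and example key are illustrative:

```python
import hashlib

def mptcp_token(key_8bytes: bytes) -> int:
    # Most significant 32 bits of SHA-1(key), used to identify the MPTCP session.
    digest = hashlib.sha1(key_8bytes).digest()
    return int.from_bytes(digest[:4], "big")

print(hex(mptcp_token(bytes.fromhex("0102030405060708"))))  # example 64-bit key
```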

36

Fun with middleboxes

q In NAT & firewall scenarios: Server cannot initiate new connection

q ➡ Devices announce available addresses via the ADD_ADDR option
q Client may establish the second connection in this scenario


37

(More) Fun with middleboxes

q Middleboxes mess with TCP streams (aggregate, rewrite, etc.)
q Rely on consistent sequence numbers per substream
q Requires an additional mechanism to track the sequence in the overall stream
■ Absolutely independent from substream handling
■ I.e. ACKs are sent for substreams, even if not expected in the aggregated stream

q Placement in the overall stream carried in a TCP option header?
q Not feasible (aggregation, TSO)

q Sender specifies a fixed "mapping" of data to subflows
q Receiver informed in advance
q May be remapped for retransmits (if one flow dies)
q Fall-back to single-flow TCP possible via an "infinite mapping"


38

MPTCP: Retransmits

q Obvious: Needs to deal with retransmits on the subflow level
q Middleboxes may introduce this
q MPTCP instances may therefore use it too

q Subflows may fail (temporarily or permanently)
q Needs retransmits over a different subflow
q Implies changes in the mapping
q Underlying TCP connection in the original flow still needs to retransmit
■ Would break the connection otherwise
■ ➡ Performance penalty

q Scheduling?


39

MPTCP: Congestion control (I)

q Observation: MPTCP “smears” congestion over the network

q Naïve solution for CC: use the congestion control of the subflows
q Unfair advantage against regular TCP
q Depends on the number of used flows
q Also may not be optimal
■ Naïve: λ1 = λ2 = λ3
■ Optimal: λ1 = λ2; λ3 = 0

q Also possible: measure & control subflows together
q May lead to "flappiness", i.e. sudden load switches
q May smear congestion too much

[Figure: example topology with subflow rates λ1, λ2, λ3]

40

MPTCP: Congestion control (II)

q Loosely coupled subflows:
q RFC 6356 suggests, with each ACK:
q Losses still halve cwnd_i
q Throttle subflows to not exceed the rate of a "virtual" single TCP flow
q Alpha controls the allowed violation of that condition
q Still: no load balancing between interfering flows
q Several scenarios with unfairness towards Reno TCP and between MPTCP instances discovered (not Pareto-optimal)


    cwnd_i += bytes_acked * MSS_i * min( α / cwnd_total , 1 / cwnd_i )

    α = cwnd_total * ( max_i( √cwnd_i / rtt_i ) / Σ_i( cwnd_i / rtt_i ) )^2

C. Raiciu: Practical Congestion Control for Multipath Transport Protocols, Tech. Rep, 2009
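A minimal sketch of this per-ACK update; the data structures and names are illustrative and not taken from an actual MPTCP stack (note that max_i(√cwnd_i/rtt_i)^2 equals max_i(cwnd_i/rtt_i^2), which is the form used below):

```python
# Coupled (LIA, RFC 6356) congestion-window increase on each ACK of subflow i.
# subflows: list of dicts with 'cwnd' (bytes) and 'rtt' (seconds).
def lia_increase(subflows, i, bytes_acked, mss):
    cwnd_total = sum(s['cwnd'] for s in subflows)
    alpha = cwnd_total * max(s['cwnd'] / s['rtt'] ** 2 for s in subflows) \
            / sum(s['cwnd'] / s['rtt'] for s in subflows) ** 2
    s = subflows[i]
    # never more aggressive than a single regular TCP flow on this path
    s['cwnd'] += bytes_acked * mss * min(alpha / cwnd_total, 1.0 / s['cwnd'])

# On loss, each subflow still halves its own cwnd (as stated on the slide).
paths = [{'cwnd': 20000.0, 'rtt': 0.02}, {'cwnd': 40000.0, 'rtt': 0.10}]
lia_increase(paths, 0, bytes_acked=1448, mss=1448)
print(paths[0]['cwnd'])
```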


41

MPTCP: Congestion control (III)

q Opportunistic linked-increases algorithm (OLIA)
q Also loosely coupled
q Addresses problems with Pareto optimality
q With each ACK:


R. Khalili et al: MPTCP Is Not Pareto-Optimal: Performance Issues and a Possible Solution, IEEE/ACM Transactions on Networking, 2013

    cwnd_i += ( (√cwnd_i / rtt_i) / Σ_i( cwnd_i / rtt_i ) )^2  +  α_i / cwnd_i

    (first term: like before; α_i controls the aggressiveness of subflow i)

q "i is positive for subflows that have not reached the estimated bandwidth/delay ratio

42

Multipath TCP – Discussion

q Handshake:
q Why is MP_CAPABLE sent three times?
q Why is the second handshake based on a normal TCP handshake?
q Why is it a four-way handshake?

q Scenarios:
q Can I use MPTCP with a single NIC?
q Can I use MPTCP with a single IP address?
q Can I use MPTCP to increase performance if I have two DSL lines (with NAT)?
q Does MPTCP help with delay problems?
q Do applications need to be aware of MPTCP? What does MPTCP mean for "legacy" applications?
q Security:
q IDS & firewalls?
q SYN cookies?


43

Content

q TCP congestion control schemes
q Multipath TCP
q SCTP
q SPDY and HTTP/2
q QUIC


44

Stream Control Transmission Protocol (SCTP)

q Protocol developed to transport SS7 messages over IP
q Reliable and message-oriented

q Like a crossbreed of TCP and UDP

q First RFC 2960 in October 2000, current RFC 4960 (September 2007)
q Many shortcomings of TCP were already known during the design phase
q So, from scratch, SCTP supported:

q Selective ACKs
q Multistreaming
q Multihoming
q Heartbeats
q TLV coding of extension headers
q SYN flood protection
q Better protocol state handling (no half-open connections)
q …


45

SCTP – Multistreaming

...upper-layer applications. In other words, the HOL effect is limited within the scope of individual streams, but does not affect the entire association.

Multistreaming and HOL blocking are illustrated in Fig. 4, where an SCTP association consisting of four streams is shown. Segments are identified by stream sequence numbers (SSNs) [1] that are unique within a stream, but different streams can have the same SSN. In the figure, SSN 11 in stream 1 has been delivered to the upper-layer application, and SSN 9 of the second stream is lost in the network; SSNs 10, 11, 12 are therefore queued in the buffer of the second stream, waiting for retransmitted SSN 9 to arrive. Arriving SSN 13 at stream 2 will also be queued. Similarly, SSN 4 of stream 3 is missing during the transmission, resulting in the blocking of SSNs 5, 6, and 7. For stream 4, SSN 21 is being delivered to the upper-layer application, while arriving SSN 23 will be queued in the buffer because of missing SSN 22. Note that when SSN 12 arrives at the buffer of stream 1, it can be delivered immediately even if the other streams are blocked. This illustrates that segments arriving on stream 1 can still be delivered to the upper-layer application, although streams 2 and 3 are (and stream 4 will be) blocked because of lost segments.

An example application of using SCTP multistreaming in Web browsing is shown in Fig. 5. Here, an HTML page is split into five objects: a Java applet, an ActiveX control, two images, and plain text. Instead of creating a separate connection for each object as in TCP, SCTP makes use of its multistreaming feature to speed up the transfer of HTML pages. By transmitting each object in a separate stream, the HOL effect between different objects can be eliminated. If one object is lost during the transfer, the others can still be delivered to the Web browser at the upper layer while the lost object is being retransmitted from the Web server. This results in a better response time to users while opening only one SCTP association for a particular HTML page.

CONGESTION CONTROL

SCTP congestion control is based on the well-proven rate-adaptive window-based congestion control scheme of TCP. This ensures that SCTP will reduce its sending rate during network congestion and prevent congestion collapse in a shared network. SCTP provides reliable transmission and detects lost, reordered, duplicate, or corrupt packets. It provides reliability by retransmitting lost or corrupt packets. However, there are several major differences between TCP and SCTP:

• SCTP incorporates a fast retransmit algorithm based on SACK gap reports similar to that of TCP SACK. This mechanism speeds up loss detection and increases the bandwidth utilization. One of the major differences between SCTP and TCP is that SCTP does not have an explicit fast recovery phase. SCTP achieves fast recovery automatically with the use of SACK [1].

• Compared to TCP, the use of SACK is mandatory in SCTP, which allows a more robust reaction in the case of multiple losses from a single window of data. This avoids a time-consuming slow start stage after multiple segment losses, thus saving bandwidth and increasing throughput.

• During slow start or congestion avoidance of SCTP, the congestion window (cwnd) is increased by the number of acknowledged bytes; in TCP it is increased by the number of ACK segments received. Since the TCP sender

[Figure 3: An SCTP association consisting of four streams carrying data from one upper-layer application: per-stream SCTP buffers between the source and destination applications, above IP/DLL/PHY]

[Figure 4: An illustration showing HOL blocking of individual streams at the receiver]

q Multistreaming at transport layer avoids head of line blocking


S. Fu: SCTP: State of the art in research, products, and technical challenges, IEEE Communications Magazine 42(4):64-76, April 2004
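A minimal sketch (not an SCTP implementation) of why per-stream sequence numbers avoid head-of-line blocking across streams: each stream releases data in its own SSN order, independently of the others:

```python
from collections import defaultdict

class StreamDemux:
    def __init__(self):
        self.expected = defaultdict(int)   # next SSN expected per stream
        self.buffers = defaultdict(dict)   # out-of-order segments per stream

    def receive(self, stream_id, ssn, payload):
        delivered = []
        self.buffers[stream_id][ssn] = payload
        # deliver as long as the next expected SSN of *this* stream is present
        while self.expected[stream_id] in self.buffers[stream_id]:
            delivered.append(self.buffers[stream_id].pop(self.expected[stream_id]))
            self.expected[stream_id] += 1
        return delivered                   # other streams are never blocked

demux = StreamDemux()
demux.receive(2, 1, "late")        # stream 2: SSN 0 missing -> buffered, nothing delivered
print(demux.receive(1, 0, "ok"))   # stream 1 delivers immediately: ['ok']
```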

46

SCTP – Connection management

q Four-way handshake
q Server allocates state AFTER the cookie echo
q INIT + INIT ACK may contain TLV-coded options
q What does this mean for extensibility? Think of the cookie mechanism

q Connection identified by two tags (cmp. IPsec SA)

q Shutdown leads to immediate packet flush

q No half-open connections
q Smaller protocol state machine

[Figure: SCTP four-way handshake and connection close message sequences]


47

SCTP – Chunks (I)

q SCTP common header only contains port numbers, a “verification tag” (i.e. connection id) & CRC-32 checksum

q Any payload & protocol data transported in "chunks"
q Used even for internal purposes, e.g. address configuration

q Multiple chunks may be aggregated in a packet


3.2 Counting Outstanding Bytes

As pointed out, cwnd has an influence on the network load and thus on the throughput. Therefore, the way the outstanding bytes that limit cwnd are counted is important and should be examined.

Looking at an SCTP packet containing several data chunks, the amount of user data can vary significantly with the size of the individual chunks (i.e. messages), assuming the same packet length.

[Fig. 1: SCTP packet format. (a) One chunk with 1436 bytes of user data: IP header (20) + SCTP common header (12) + data chunk header (16) + user data (1436). (b) 33 chunks, each containing 28 bytes of user data: IP header (20) + SCTP common header (12) + 33 x (data chunk header (16) + user data (28))]

In Figure 1(b) the packet contains 33 DATA chunks with 28 bytes of user data each, adding up to 924 bytes of user data, compared to 1436 bytes in the packet in Figure 1(a). Both packets have a size of 1484 bytes. Whereas the overhead is just 1% in (a), the headers add up to 36% in (b), and can be more than 60% for even smaller user message sizes.

Therefore, we have to distinguish between the amount of data that is injected into the network and the user data that arrive at the application layer. Whereas the first has a direct impact on the network load, the second results in the goodput. Both depend on the number of packets (1) that are allowed by the cwnd.

    NoOfPackets = ceil( cwnd / Size_P )                                  (1)

Calculating the size of a packet (Size_P), the headers for IP (H_IP) and SCTP (H_SCTP) have to be considered, as well as the size of the DATA chunks (Size_Chunk).

    Size_P = H_IP + H_SCTP + CPP * Size_Chunk                            (2)

The number of chunks per packet (CPP) is calculated as

    CPP = floor( (MTU - H_IP - H_SCTP) / (UMS + P_UMS + H_Chunk) )       (3)

The average user message size (UMS) per packet and the corresponding padding bytes (P_UMS) feature the variable parts of the packets.

I. Rüngeler et al.: Congestion and Flow Control in the Context of the Message-Oriented Protocol SCTP, Networking 2009
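A hedged sketch reproducing the packet sizes from Fig. 1 of the excerpt above (header sizes as drawn there); note that the exact overhead percentage depends on which headers one counts:

```python
H_IP, H_SCTP, H_CHUNK = 20, 12, 16   # byte sizes as used in Fig. 1

def packet_size(chunks, user_bytes_per_chunk):
    return H_IP + H_SCTP + chunks * (H_CHUNK + user_bytes_per_chunk)

def header_share(chunks, user_bytes_per_chunk):
    total = packet_size(chunks, user_bytes_per_chunk)
    return 1 - chunks * user_bytes_per_chunk / total

print(packet_size(1, 1436), round(header_share(1, 1436), 3))   # 1484 bytes, ~0.032 headers
print(packet_size(33, 28), round(header_share(33, 28), 3))     # 1484 bytes, ~0.377 headers
```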

48

SCTP – Chunks (II)

q General chunk header format

q Well-known chunk types:
  0 - Payload Data (DATA)
  1 - Initiation (INIT)
  2 - Initiation Acknowledgement (INIT ACK)
  3 - Selective Acknowledgement (SACK)
  4 - Heartbeat Request (HEARTBEAT)
  …

q If the chunk type is unknown, the highest 2 bits of the chunk type code determine the action:
q 00 - Stop processing the rest of the SCTP packet
q 01 - Stop and report an 'Unrecognized Chunk Type'
q 10 - Skip this chunk and continue processing
q 11 - Skip this chunk and continue processing, but report an error


    Chunk type (8 bit) | Chunk flags (8 bit) | Chunk length (16 bit)
    Chunk-specific value …
    Padding (up to 3 bytes)
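A minimal sketch of how a receiver can evaluate the two "action" bits of an unrecognized chunk type (actions as listed above):

```python
def handle_unknown_chunk(chunk_type: int) -> str:
    # The two highest bits of the chunk type tell the receiver how to react.
    action = (chunk_type >> 6) & 0b11
    return {
        0b00: "stop processing the rest of the SCTP packet",
        0b01: "stop processing and report an 'Unrecognized Chunk Type'",
        0b10: "skip this chunk and continue processing",
        0b11: "skip this chunk, continue processing, but report an error",
    }[action]

print(handle_unknown_chunk(0xC1))  # high bits 11: skip, continue, report an error
```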


49

SCTP – Data chunk

q Chunks may add their own header
q Example: Payload data chunk
q Flags carry the reorder requirement and fragmentation flags
q Payload Protocol Identifier is passed to the application transparently
q Stream fields are used to transport multiple data streams over an SCTP connection


    Type = 0 (8 bit) | Chunk flags (8 bit) | Chunk length (16 bit)
    Transmission Sequence Number (TSN, 32 bit)
    Stream Identifier (16 bit) | Stream Sequence Number (16 bit)
    Payload Protocol Identifier (32 bit)
    User Data
    Padding (up to 3 bytes)
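A hedged sketch packing the DATA chunk header fields shown above in network byte order; the values are illustrative and error handling is omitted:

```python
import struct

def data_chunk(flags, tsn, stream_id, ssn, ppid, user_data: bytes):
    length = 16 + len(user_data)            # header (16 bytes) + payload, before padding
    hdr = struct.pack("!BBHIHHI", 0, flags, length, tsn, stream_id, ssn, ppid)
    pad = b"\x00" * (-len(user_data) % 4)   # pad user data to a 4-byte boundary
    return hdr + user_data + pad

# flags 0x03: beginning and end fragment bits set, i.e. an unfragmented message
chunk = data_chunk(flags=0x03, tsn=1, stream_id=2, ssn=0, ppid=0, user_data=b"hello")
print(len(chunk), chunk.hex())              # 24 bytes: 16-byte header + 5 data + 3 padding
```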

50

SCTP – Multiple paths

q Alternate paths are probed by HEARTBEAT messages including a 64-bit nonce
q Addresses exchanged during the INIT sequence
q Allows secure setup of alternative paths
q Support for dynamic addresses added with RFC 5061
■ Addresses added and removed using authenticated chunks (iff globally addressable)
■ Still requiring verification

q Messages are only sent over the primary path
q Switch after failure detection
q Does not directly allow for load sharing!
q Multipath SCTP: https://tools.ietf.org/html/draft-tuexen-tsvwg-sctp-multipath-13 (December 2016, but no significant changes lately)


51

SCTP – Current State

q Not widely deployed

q Many reasons:
q No killer feature
q Application developers must explicitly enable it
q Firewalls & NATs?
q RFC for NAT support not even done yet


http://www.caida.org/data/realtime/passive/?monitor=equinix-chicago-dirA&row=timescales&col=sources&sources=proto&graphs_sing=ts&counters_sing=bits&timescales=24&timescales=168&timescales=672&timescales=17520

52

Content

q TCP congestion control schemes
q Multipath TCP
q SCTP
q SPDY and HTTP/2
q QUIC


53

SPDY and HTTP/2

q Is this not application layer?!

http://www.caida.org/data/realtime/passive/?monitor=equinix-chicago-dirA

54

SPDY and HTTP/2

q Is this not application layer?!


55

SPDY and HTTP/2

q In 2009 Google announced the development of an HTTP successor: SPDY
q Goal: 50% reduction of page load time
q Includes HTTP header compression
q As of 2015 it is deprecated

q Now HTTP/2 is the standard (RFC 7540)
q Shares many of the ideas of SPDY

q Addressed key problem:
q HTTP/1.1 pipelining is broken due to misbehaving applications and head-of-line blocking
q In practice mostly disabled ➡ problems with TCP congestion control
q Solution: building multi-stream support on top of TCP/TLS
■ Idea similar to SCTP, but heavily optimized for web traffic & backward compatible with home routers


56

HTTP/2 – Binary encoding


Ilya Grigorik: High Performance Browser Networking, O'Reilly, 2013

q HTTP/2 emulates "normal" HTTP to the application
q Internal encoding using binary data & compression

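For illustration, a sketch of the 9-byte frame header this binary framing uses (RFC 7540: 24-bit length, 8-bit type, 8-bit flags, 1 reserved bit plus 31-bit stream identifier); the example frame values are illustrative:

```python
import struct

def pack_frame_header(length, ftype, flags, stream_id):
    # 3-byte length, 1-byte type, 1-byte flags, 4 bytes reserved bit + stream id
    return struct.pack("!I", length)[1:] + struct.pack("!BBI", ftype, flags,
                                                       stream_id & 0x7FFFFFFF)

# A HEADERS frame (type 0x1) with END_STREAM (0x1) and END_HEADERS (0x4) flags
# on client-initiated stream 1 (odd stream number, as on the next slide):
print(pack_frame_header(length=0, ftype=0x1, flags=0x1 | 0x4, stream_id=1).hex())
```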


57

HTTP/2 – Streams

q Multiple streams may be interleaved
q Prevents head-of-line blocking

q Client-initiated streams carry odd numbers

q Proactive object delivery by the server over server-initiated streams
q Promises allow the server to advertise upcoming proactively pushed objects
q Streams may be prioritized


Ilya Grigorik: High Performance Browser Networking, O'Reilly, 2013

58

HTTP/2 – Performance (I)

q Obvious: Object pushing & binary encoding optimize speed


[Fig. 4: Page load time with an ADSL Livebox, 50 ms latency: HTTP/1.1 vs. HTTP/2 per web site]

...time of 400 ms. There is still some benefit: the page load time decreases by 20% on average. Naturally, we expect to see worse performance on a 3G network. The reality is that there was not enough packet loss on the 3G network to influence the page load time. The recent study by AT&T on SPDY's performance in [20] stated that the performance of SPDY was worse than that of HTTP/1.1 over cellular networks. One would expect this to be valid for HTTP/2 as it is an evolution of SPDY.

[Fig. 5: Page load time with a 3G modem, 400 ms latency: HTTP/1.1 vs. HTTP/2 per web site]

3) Local Area Network tests: Latency. Because the majority of Internet browsing is moving to mobile devices, it is worthwhile to look at the influence of latency and packet loss on HTTP/2. To this end, we first vary the network latency on our local platform.

Figure 6 shows the page load time in HTTP/1.1 and HTTP/2 for various latency values. For each value, we plotted the minimum and maximum value, the lower and upper quartiles, along with the median. Interestingly, an increasing latency widens the difference between HTTP/1.1 and HTTP/2, which means that HTTP/2 reacts well to latency. This suggests that this positive influence might also occur on cellular networks as they suffer from higher latency.

Packet loss. We saw HTTP/2 reacts positively to high latency. But another important characteristic of cellular networks is significant packet loss. That is why we conduct a similar experiment, this time with a fixed latency and varying the packet loss.

[Fig. 6: Impact of latency, 0% loss. Page load time in seconds vs. latency in milliseconds; by pairs, left: HTTP/1.1, right: HTTP/2]

Figure 7 shows a poor behaviour: the higher the packet loss, the smaller the benefit of HTTP/2. Furthermore, the page load time ratio between HTTP/2 and HTTP/1.1 often exceeds 1, meaning that HTTP/2 takes longer than HTTP/1.1.

This can be explained as follows: HTTP/2 uses only one TCP connection to communicate between the client and the server. When this single connection suffers from packet loss, all streams running over this unique TCP connection are negatively impacted. In HTTP/1.1 the situation is different, as several TCP connections are open between the client and the server, and this mitigates the packet loss problem. AT&T in [20] already found similar results for SPDY, which is the ancestor of HTTP/2.

[Fig. 7: Impact of packet loss, 100 ms latency. HTTP/2 over HTTP/1.1 page load time ratio per web site, at 0% and 6% loss]

From an overall perspective, HTTP/2 decreases page load times, because it goes past the head-of-line blocking issue by using multiplexing. However, several studies [9] [10] [20] have already stated that SPDY was negatively impacted by packet loss on cellular networks. This statement is likely to hold true for HTTP/2 because it keeps the same idea as SPDY of multiplexing requests over a single TCP connection. This problem stems from the underlying transport protocol, and as such only a switch to another transport protocol can solve it.

B. Evaluations on Server Push and Priority

Besides the multiplexing and compression mechanisms, there is a second class of new features which is optional

H. Saxcé et al.: Is HTTP/2 Really Faster Than HTTP/1.1?, 18th IEEE Global Internet Symposium, 2015


59

HTTP/2 – Performance (II)

q Key question: Does the larger congestion control window outweigh the loss due to head-of-line blocking?
q Discuss: Why may HOL blocking still occur?
q Discuss: What is the impact of loss and delay?


60

HTTP/2 – Performance (III)

[Same figures and excerpt as on slide 58: Fig. 6 (impact of latency) and Fig. 7 (impact of packet loss)]


H. Saxcé et al.: Is HTTP/2 Really Faster Than HTTP/1.1?, 18th IEEE Global Internet Symposium, 2015


61

HTTP/2 – Performance (IV)

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

qq

e

bingwikiped

eb site

Fig. 4. Page load time with an ADSL Livebox, 50ms latency.

time of 400ms. There is still some benefit: the page load timedecreases by 20% on average. Naturally, we expect to seeworse performances on a 3G network. The reality is that therewas not enough packet loss on the 3G network to influencethe page load time. The recent study by AT&T on SPDY’sperformances in [20] stated that the performances of SPDYwere worse than those of HTTP/1.1 over cellular networks.One would expect this to be valid over HTTP/2 as it is anevolution of SPDY.

0

5

10

15

20

25

qq

e

wikiped

bing

Page

L

Web site

HTTP/1.1HTTP/2

Fig. 5. Page load time with a 3G modem, 400ms latency.

3) Local Area Network tests: Latency. Because the major-ity of the Internet browsing is moving to mobile devices, it isworthwhile to look at the influence of latency and packet losson HTTP/2. To this end, we first vary the network latency onour local platform.

Figure 6 shows the page load time in HTTP/1.1 andHTTP/2 for various latency values. For each value, we plottedthe minimum and maximum value, the lower and upperquartiles, along with the median. Interestingly, an increasinglatency widens the difference between HTTP/1.1 and HTTP/2,which means that HTTP/2 reacts well to latency. This sug-gests that this positive influence might also occur on cellularnetworks as they suffer from higher latency.

Packet loss. We saw HTTP/2 reacts positively to high la-tency. But another important characteristic of cellular networksis important packet losses. That is why we conduct a similarexperiment, this time with a fixed latency and varying the

0

2

4

6

8

10

12

14

0 50 100 150 200

Page

Loa

d Ti

me

in s

econ

ds

Latency in milliseconds

HTTP/1.1HTTP/2

Fig. 6. Impact of latency, 0% loss. By pairs, left: HTTP/1.1, right: HTTP/2.

packet loss. Figure 7 shows a poor behaviour: the higher thepacket loss, the lesser the benefits of HTTP/2. Furthermore,the page load time ratio between HTTP/2 and HTTP/1.1 oftenexceeds 1, meaning that HTTP/2 takes longer than HTTP/1.1.

This can be explained as follows: HTTP/2 uses only oneTCP connection to communicate between the client and theserver. When this single connection suffers from packet loss,all streams running over this unique TCP connection arenegatively impacted. In HTTP/1.1, the situation is differentas several TCP connections are open between the client andthe server and this mitigates the packet loss problem. AT&Tin [20] already found similar results for SPDY who is theancestor of HTTP/2.

Fig. 7. Impact of packet loss, 100ms latency (y-axis: HTTP/2 over HTTP/1.1 page load time ratio per web site; curves for 0% and 6% loss).

From an overall perspective, HTTP/2 decreases page load times, because it goes past the head-of-line blocking issue by using multiplexing. However, several studies [9] [10] [20] have already stated that SPDY was negatively impacted by packet loss on cellular networks. This statement is likely to hold true for HTTP/2 because it keeps the same idea as SPDY of multiplexing requests over a single TCP connection. This problem stems from the underlying transport protocol, and as such only a switch to another transport protocol can solve it.

B. Evaluations on Server push and Priority

Besides the multiplexing and compression mechanisms, there is a second class of new features which is optional […]


H. Saxcé et al.: Is HTTP/2 Really Faster Than HTTP/1.1?, 18th IEEE Global Internet Symposium, 2015

62

Content

q TCP congestion control schemes
q Multipath TCP
q SCTP
q SPDY and HTTP/2
q QUIC



63

Quick UDP Internet Connections (QUIC)

q New transport layer protocol introduced by Google to remove shortcomings of SPDY/HTTP/2 over TCP

q Currently IETF draft
■ See https://tools.ietf.org/html/draft-ietf-quic-transport-04

q Goals:
■ Multi-streaming without HOL blocking
■ Multi-homing
■ Backward compatibility
■ Built-in security (i.e. TLS)
■ Reduced latency through a simpler handshake
■ Decoupling of the congestion control algorithm from the protocol
■ FEC


64

QUIC in the protocol stack

q QUIC operates at session/transport/application layer
q UDP only used for backward compatibility (port 80 or 443)
q Sessions identified by a 64-bit connection ID (see the demultiplexing sketch after this slide)


Classic stack: HTTP/2 over TLS 1.2 over TCP over IP
QUIC stack: HTTP/2 API over QUIC over UDP over IP

J. Iyengar: QUIC - Redefining Internet Transport
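As a rough illustration of the connection-ID bullet above (demultiplexing by a 64-bit connection ID instead of the UDP 4-tuple), the sketch below uses invented names and is not taken from any real QUIC stack; it only shows that a session keyed on the connection ID can survive a client address change:

```python
# Illustrative sketch: demultiplex sessions by connection ID rather than by
# (source address, source port), so a session survives an address change.
# SessionTable and its fields are hypothetical, not a real QUIC API.

class SessionTable:
    def __init__(self):
        self._sessions = {}            # 64-bit connection ID -> session state

    def lookup_or_create(self, connection_id, client_addr):
        session = self._sessions.get(connection_id)
        if session is None:
            session = {"connection_id": connection_id,
                       "addr": client_addr, "packets": 0}
            self._sessions[connection_id] = session
        elif session["addr"] != client_addr:
            # Same connection ID seen from a new address: keep the session,
            # just update the path information.
            session["addr"] = client_addr
        session["packets"] += 1
        return session

table = SessionTable()
s1 = table.lookup_or_create(0x1122334455667788, ("198.51.100.7", 53000))
s2 = table.lookup_or_create(0x1122334455667788, ("203.0.113.9", 40112))  # client moved
assert s1 is s2   # still the same session despite the address change
```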


65

QUIC – Packet format (according to RFC draft)

q Long header (first bit of the flags octet set to 1):
■ Flags / Type (8 bit)
■ Connection ID (64 bit)
■ Packet Counter (32 bit)
■ Version (32 bit)
■ Payload

q Short header (first bit 0; a C flag signals whether the Connection ID is present):
■ Flags (8 bit)
■ Connection ID (0 or 64 bit)
■ Packet Counter (8, 16 or 32 bit)
■ Payload
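A minimal sketch of this layout in Python follows. It assumes the draft-04-style field order shown above, treats bit 0x40 of the short-header flags as the C bit, and fixes the short-header packet counter at 32 bit; these are simplifications for illustration, not the draft's exact flag semantics:

```python
import struct

LONG_HEADER_FLAG = 0x80   # first bit set -> long header
CONN_ID_FLAG = 0x40       # assumed "C" bit: connection ID present in short header


def build_long_header(packet_type, connection_id, packet_counter, version, payload):
    """Pack flags | connection ID (64) | packet counter (32) | version (32) | payload."""
    flags = LONG_HEADER_FLAG | (packet_type & 0x7F)
    return struct.pack("!BQII", flags, connection_id, packet_counter, version) + payload


def parse_header(datagram):
    """Distinguish long and short headers by the first bit and unpack the fields."""
    flags = datagram[0]
    if flags & LONG_HEADER_FLAG:
        flags, conn_id, counter, version = struct.unpack("!BQII", datagram[:17])
        return {"form": "long", "type": flags & 0x7F, "connection_id": conn_id,
                "packet_counter": counter, "version": version, "payload": datagram[17:]}
    # Short header: optional 64-bit connection ID, then (here) a 32-bit counter.
    offset, conn_id = 1, None
    if flags & CONN_ID_FLAG:
        (conn_id,) = struct.unpack_from("!Q", datagram, offset)
        offset += 8
    (counter,) = struct.unpack_from("!I", datagram, offset)
    offset += 4
    return {"form": "short", "connection_id": conn_id,
            "packet_counter": counter, "payload": datagram[offset:]}


if __name__ == "__main__":
    pkt = build_long_header(0x01, 0x1122334455667788, 1, 0xFF000004, b"hello")
    print(parse_header(pkt))
```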

66

QUIC – Connection “establishment”


(Figure: handshake timeline comparison, TCP + TLS vs. QUIC, equivalent to TCP + TLS.)

0-RTT! Magic? No! Just no timeouts – properties may be cached “forever”


67

QUIC – Actual connection establishment

q Indication of server: alternate-protocol:443:quic,p=0.02

q Client initiates with version and server name

q Server “rejects”, giving certificates, configuration & a “source-address token” to prevent spoofing

q Normal “0-RTT” handshake follows (modelled in the sketch below)
■ Always contains the source-address token
■ Contains the server's DNS name

q Discuss:
■ What does this handshake mean for DoS resistance?
■ What does it mean for PFS?
■ What happens if the first packet is reordered?


(Annotation on the handshake figure: server starts to commit resources.)
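The sketch below models the flow described on this slide: first contact costs a REJ round trip that delivers certificates, config and a source-address token, while later connections send a “0-RTT” hello that carries the cached token. All class and field names are hypothetical illustrations, not Google's or the IETF draft's API:

```python
# Toy model of the connection setup described above; all names are invented.

class ToyServer:
    """Issues a source-address token on first contact, accepts hellos carrying it."""
    def __init__(self):
        self.valid_tokens = set()

    def handle_hello(self, hello):
        token = hello.get("source_address_token")
        if token is None:
            # First contact: "reject" with certificates, config and a token
            # bound to the client address (spoofing protection).
            new_token = "token:" + hello["client_addr"]
            self.valid_tokens.add(new_token)
            return {"type": "REJ", "config": "scfg-1", "certificates": ["cert"],
                    "source_address_token": new_token}
        if token in self.valid_tokens:
            return {"type": "SHLO"}      # handshake complete, data may follow
        return {"type": "REJ"}           # stale token: fall back to a full handshake


class ToyClient:
    def __init__(self, addr):
        self.addr = addr
        self.cache = {}                  # server name -> (config, token), no timeout

    def connect(self, server_name, server):
        cached = self.cache.get(server_name)
        if cached is None:
            # Inchoate hello: costs one extra round trip on first contact.
            rej = server.handle_hello({"client_addr": self.addr,
                                       "server_name": server_name})
            self.cache[server_name] = (rej["config"], rej["source_address_token"])
            cached = self.cache[server_name]
        config, token = cached
        # "0-RTT" hello: always carries the token and the server's DNS name.
        return server.handle_hello({"client_addr": self.addr,
                                    "server_name": server_name,
                                    "config": config,
                                    "source_address_token": token})


server = ToyServer()
client = ToyClient("198.51.100.7")
print(client.connect("example.org", server))   # first contact: REJ round, then SHLO
print(client.connect("example.org", server))   # cached config/token: straight to SHLO
```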

68

QUIC – Change in security model significant

Message sequence (Client / Attacker / Server), reconstructed from the figure:
1. Client → Server: 0-RTT key-exchange messages, 0-RTT data "request"
2. Server: accepts 0-RTT, processes "request"
3. Server: key-exchange response messages
4. Attacker: enforces loss of server state (e.g., reboot)
5. Attacker → Server: replays the 0-RTT key-exchange messages and the 0-RTT data "request"
6. Server: rejects after state loss for security reasons, sends key-exchange response messages rejecting 0-RTT
7. Client → Server: final key-exchange messages, resends the data "request" under the final key (to ensure reliable transmission)
8. Server: processes "request" (again)

Figure 1: Generic replay attack discovered by Daniel Kahn Gillmor in the IETF TLS working group discussion around TLS 1.3 [Res15b]. The 0-RTT data "request" could, e.g., be an HTTP request "POST /buy-something".

Note that the contrived requirement that the attacker is able to reboot the server (while the client keeps waiting for a response) vanishes in a real-world scenario with distributed server clusters, where the attacker instead simply forwards the 0-RTT messages to two servers and drops the first server's response. The described attack hence in particular affects the cryptographic design of QUIC, which (among others) specifically targets settings with distributed clusters. Since upholding the originally envisioned full 0-RTT replay protection is impossible, Langley and Chang write in the specification of July 2015 [LC15] (Rev 20150720) that this design is "destined to die" and will be replaced by (an adapted version of) the TLS 1.3 handshake. We, however, argue here that QUIC's strategy in Rev 20130620 still supports some kind of replay resistance, only at a different level. TLS 1.3, in contrast, forgoes any protection mechanisms and instead accepts replays as inevitable (on the channel level). Developers using TLS 1.3 are supposed to be provided with a different API call for sending 0-RTT data [Res16e, Appendix B.1], indicating its replayability, and are responsible for taking replays into account for such data.

There is, then, a significant conceptual gap between replays (of key-exchange messages and keys) on the key-exchange level, and the replay of user data faced on the level of the overall secure channel protocol in the 0-RTT setting. While the former can effectively be prevented within the key exchange protocol, this does not necessarily prevent the latter, which can be (and in practice is) induced by the network stack of the channel actively and automatically re-sending (presumably) rejected 0-RTT data under the main key. The latter type of logical, network-stack replays is hence fundamentally beyond what key exchange protocols can protect against.


M. Fischlin et al.: Replay Attacks on Zero Round-Trip Time: The Case of the TLS 1.3 Handshake Candidates, 2nd IEEE European Symposium on Security and Privacy (EuroS&P 2017)
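To connect this to practice, here is a small, hedged sketch of the kind of server-side handling the excerpt implies for 0-RTT data: refuse non-idempotent requests in 0-RTT and keep a short-lived strike register against duplicates. The class and parameter names are invented for illustration, and, as the text above stresses, such a guard cannot catch the client's own network stack legitimately resending the request under the final key:

```python
import time

class ZeroRTTGuard:
    """Illustrative 0-RTT strike register (hypothetical, not a real QUIC/TLS API)."""
    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        self.seen = {}                      # client nonce -> first-seen timestamp

    def accept_0rtt(self, nonce, is_idempotent):
        now = time.time()
        # Forget nonces that have aged out of the replay window.
        self.seen = {n: t for n, t in self.seen.items() if now - t < self.window}
        if not is_idempotent:
            return False                    # e.g. "POST /buy-something": never in 0-RTT
        if nonce in self.seen:
            return False                    # duplicate inside the window: likely replay
        self.seen[nonce] = now
        return True

guard = ZeroRTTGuard()
print(guard.accept_0rtt("nonce-1", is_idempotent=True))    # True: fresh, replay-safe request
print(guard.accept_0rtt("nonce-1", is_idempotent=True))    # False: replayed nonce
print(guard.accept_0rtt("nonce-2", is_idempotent=False))   # False: not safe to replay
```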


69

QUIC – Countering opportunistic ACK attacks

q Danger of opportunistic ACKs: a hostile client
■ Uses HTTP to “download” a huge file
■ Injects ACKs even though it has not received the data
■ Server uses up much of its bandwidth

q TCP offers no protection itself

q QUIC does offer protection, by allowing servers to skip sequence ranges (see the sketch below)
■ Design criterion
■ May reduce the load induced by hostile clients
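A hedged sketch of the idea follows (invented names, not the draft's mechanism in detail): the sender deliberately skips packet numbers now and then, so an ACK that covers a number that was never sent exposes an opportunistic acker.

```python
import random

class AckValidator:
    """Sketch of an optimistic-ACK defence via skipped packet numbers (illustrative only)."""
    def __init__(self, skip_probability=0.2, seed=42):
        self.rng = random.Random(seed)
        self.skip_probability = skip_probability
        self.next_pn = 1
        self.sent = set()
        self.skipped = set()

    def next_packet_number(self):
        # Occasionally burn a packet number without ever sending it.
        if self.rng.random() < self.skip_probability:
            self.skipped.add(self.next_pn)
            self.next_pn += 1
        pn = self.next_pn
        self.sent.add(pn)
        self.next_pn += 1
        return pn

    def validate_ack(self, acked):
        # Acknowledging a skipped (never-sent) number proves the peer is lying.
        return all(pn in self.sent for pn in acked)

v = AckValidator()
sent = [v.next_packet_number() for _ in range(20)]
print(v.validate_ack(sent))                    # True: honest receiver ACKs only sent numbers
print(v.validate_ack(range(1, v.next_pn)))     # False whenever a skipped number is "ACKed"
```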


70

QUIC – Production but work in progress

q Latest value found: 9.05% of Google traffic is QUIC

q General standardization of QUIC – In progress
q Using BBR with QUIC – In progress
q FEC support – Removed due to performance decrease
q Multihoming & multipath – Not implemented yet
q Requirement due to some middleboxes: There must always be a WORKING fallback path to TCP

q Other applications?
■ Currently very tight bundling to HTTP/2
■ Various difficulties: the first packet may be retransmitted silently, the fallback requirement, privacy issues due to tracking of the connection ID?
■ See https://tools.ietf.org/html/draft-kuehlewind-quic-applicability-00
