Software-defined networking: Change is hard Ratul Mahajan with Chi-Yao Hong, Rohan Gandhi, Xin Jin, Harry Liu, Vijay Gill, Srikanth Kandula, Mohan Nanduri, Roger Wattenhofer, Ming Zhang

Page 1:

Software-defined networking: Change is hard

Ratul Mahajan with

Chi-Yao Hong, Rohan Gandhi, Xin Jin, Harry Liu, Vijay Gill, Srikanth Kandula, Mohan Nanduri, Roger Wattenhofer, Ming Zhang

Page 2:

Inter-DC WAN: A critical, expensive resource

[Figure: map of data center sites: Hong Kong, Seoul, Seattle, Los Angeles, New York, Miami, Dublin, Barcelona]

Page 3:

But it is highly inefficient

Page 4:

One cause of inefficiency: Lack of coordination

Page 5:

Another cause of inefficiency: Local, greedy resource allocation

[Figure: two panels, "Local, greedy allocation" and "Globally optimal allocation", each with labels A through H]

[Latency inflation with MPLS-based traffic engineering, IMC 2011]

Page 6:

SWAN: Software-driven WAN

Goals: highly efficient WAN; flexible sharing policies

Key design elements: coordinate across services; centralize resource allocation

[Achieving high utilization with software-driven WAN, SIGCOMM 2013]

Page 7:

SWAN overview

[Figure: SWAN architecture. Components: SWAN controller, service brokers, network agents, service hosts, WAN. Labeled exchanges: traffic demand, BW allocation, rate limiting, topology and traffic, network config.]

Page 8:

Key design challenges

Scalably computing BW allocations

Avoiding congestion during network updates

Working with limited switch memory

Page 9:

Congestion during network updates

Page 10:

Congestion-free network updates

Page 11:

Computing congestion-free update plan

Leave scratch capacity $s$ on each link: ensures a plan with at most $\lceil 1/s \rceil - 1$ steps

Find a plan with minimal number of steps using an LP: search for a feasible plan with 1, 2, …, max steps

Use scratch capacity for background traffic
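
The search over step counts can be illustrated with a small sketch. It is a simplification rather than SWAN's LP: it only tries intermediate states that interpolate linearly between the current and target configurations, and it assumes that during a transition each flow may load a link with the larger of its old and new rates. The function names and the two-link example are hypothetical.

```python
def interpolate(old, new, steps, links):
    """Return the steps+1 states on the straight line from old to new."""
    return [
        {f: {l: old[f].get(l, 0.0)
                + (i / steps) * (new[f].get(l, 0.0) - old[f].get(l, 0.0))
             for l in links}
         for f in old}
        for i in range(steps + 1)
    ]

def transition_is_congestion_free(a, b, capacity, eps=1e-9):
    """Worst case while switches update asynchronously: each flow may
    occupy max(old rate, new rate) on every link."""
    for link, cap in capacity.items():
        worst = sum(max(a[f][link], b[f][link]) for f in a)
        if worst > cap + eps:
            return False
    return True

def min_steps(old, new, capacity, max_steps):
    """Try 1, 2, ..., max_steps transitions; return the first that is safe."""
    for k in range(1, max_steps + 1):
        states = interpolate(old, new, k, capacity)
        if all(transition_is_congestion_free(states[i], states[i + 1], capacity)
               for i in range(k)):
            return k
    return None  # no interpolated plan within the step budget

# Hypothetical example: two links of capacity 10, two flows of rate 8 that swap links.
capacity = {"L1": 10.0, "L2": 10.0}
old = {"A": {"L1": 8.0}, "B": {"L2": 8.0}}
new = {"A": {"L2": 8.0}, "B": {"L1": 8.0}}
steps_needed = min_steps(old, new, capacity, max_steps=8)
print(steps_needed)
```

In this example each link carries 8 of 10 units, so 20% of capacity is scratch, and the sketch needs 4 transitions. An LP that is free to pick arbitrary intermediate states may find shorter plans than interpolation does.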

Page 12:

SWAN provides congestion-free updates

[Figure: two plots with y-axis complementary CDF; x-axes: oversubscription ratio and extra traffic (MB)]

Page 13:

SWAN comes close to optimal

[Figure: throughput (relative to optimal) for SWAN, SWAN w/o rate control, and MPLS TE]

Page 14:

Deploying SWAN

[Figure: two panels, "Partial deployment" and "Full deployment", each showing a data center and the WAN]

Page 15:

The challenge of data plane updates in SDN

Not just about congestion: blackholes, loops, packet coherence, …

Page 16:

The challenge of data plane updates in SDN

Not just about congestion: blackholes, loops, packet coherence, …

The real world is even messier

[Figure: two CDFs of latency (seconds): Google's B4 and our controlled experiments]

Page 17:

Many resulting questions of interest

Fundamental: What consistency properties can be maintained, and how? Are property strength and ease of maintenance related?

Practical: How to quickly and safely update the data plane? This impacts failure recovery time, network utilization, and flow response time.

Page 18:

Minimal dependencies for a consistency property

[On consistent updates in software-defined networks, HotNets 2013]

Dependency scopes (columns, weakest to strongest): None, Self, Downstream subset, Downstream all, Global

Entries per property, in order of increasing scope:

Eventual consistency: Always guaranteed

Blackhole freedom: Impossible; Add before remove

Loop freedom: Impossible; Rule dependency forest; Rule dependency tree

Packet coherence: Impossible; Flow version numbers; Global version numbers

Congestion freedom: Impossible; Staged partial moves
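
As an illustration of the "add before remove" idea for blackhole freedom, here is a minimal sketch. It models a single flow with a hypothetical rule table that maps each switch to its next hop, applies rule changes one at a time, and checks that a packet can still reach the destination after every change. The switch names, operation tuples, and helper functions are assumptions made for the example.

```python
def delivers(rules, src, dst, max_hops=16):
    """Follow next-hop rules from src; True if the packet reaches dst."""
    node = src
    for _ in range(max_hops):
        if node == dst:
            return True
        if node not in rules:
            return False  # no matching rule: the packet is dropped (blackhole)
        node = rules[node]
    return False  # hop budget exhausted, which would indicate a loop

def apply_in_order(rules, ops):
    """Apply (action, switch, next_hop) operations one at a time,
    yielding a copy of the rule table after each one."""
    table = dict(rules)
    for action, switch, next_hop in ops:
        if action == "remove":
            table.pop(switch, None)
        else:  # "add" and "modify" both install a next hop
            table[switch] = next_hop
        yield dict(table)

# Move the flow from path S1 -> S2 -> S4 to path S1 -> S3 -> S4.
old = {"S1": "S2", "S2": "S4"}
add_first = [("add", "S3", "S4"), ("modify", "S1", "S3"), ("remove", "S2", None)]
remove_first = [("remove", "S2", None), ("add", "S3", "S4"), ("modify", "S1", "S3")]

safe = all(delivers(t, "S1", "S4") for t in apply_in_order(old, add_first))
unsafe = all(delivers(t, "S1", "S4") for t in apply_in_order(old, remove_first))
print(safe, unsafe)
```

Installing the new rule at S3 before redirecting S1, and removing the old rule at S2 last, keeps every intermediate table deliverable. Removing the S2 rule first leaves S1 forwarding to a switch with no rule.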

Page 19:

Fast, consistent network updates

[Figure: pipeline of a desired state generator followed by an update planner. Labels: routing policy, consistency property, current network state, target network state, update plan]

Forward fault correction: computes states that are robust to common faults

Dionysus: dynamically schedules network updates

Page 20:

Overview of forward fault correction

Control and data plane faults cause congestion: today, reactive data plane updates are needed to remove it

FFC handles faults proactively: it guarantees absence of congestion for up to k faults

Main challenge: too many possible faults. Addressed with a constraint reduction technique based on sorting networks

[Traffic engineering with forward fault correction, SIGCOMM 2014 (to appear)]

Page 21:

Congestion due to control plane faults

[Figure: two panels, "Current state" and "Target state"]

Page 22:

FFC for control plane faults

[Figure: four panels, "Current state", "Vulnerable target state", "Robust target state (k=1)", and "Robust target state (k=2)"]

Page 23:

Congestion due to data plane faults

[Figure: two panels, "Pre-failure traffic distribution" and "Post-failure traffic distribution"]

Page 24:

FFC for data plane faults

[Figure: two panels, "Vulnerable traffic distribution" and "Robust traffic distribution (k=1)"]

Page 25:

FFC guarantee needs too many constraints

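$$\forall l,\ \forall S \in \mathcal{S}_k:\quad \sum_{s \in S} T_l(s) \le c_l$$

$\mathcal{S}_k$: $\{S \mid S \text{ is a set of up to } k \text{ faulty switches}\}$

$T_l(s)$: additional traffic on link $l$ when switch $s$ is faulty

$c_l$: spare capacity of link $l$ in the absence of faults

Number of constraints is $\sum_{j=1}^{k} \binom{n}{j}$ for each link, where $n$ is the number of switches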

Page 26:

Efficient solution using sorting networks

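$x^{(m)}$: the $m$th largest variable in the array

Because the additional-traffic terms are nonnegative, bounding the sum of the $k$ largest variables bounds every set of up to $k$ faults

Use a bubble sort network to compute linear expressions for the $k$ largest variables

$O(nk)$ constraints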
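
The reduction can be checked numerically with a small sketch. It is not the LP encoding from the FFC paper: it runs k passes of a bubble sort network on concrete numbers, counts the compare-swap elements, and confirms that testing the sum of the k largest values gives the same answer as enumerating every set of up to k faults. The function names and sample values are hypothetical, and the equivalence assumes nonnegative values.

```python
from itertools import combinations

def top_k_via_bubble_network(values, k):
    """Run k bubble passes; each pass moves the largest remaining value
    to the end. Returns the k largest values (ascending) and the number
    of compare-swap elements used."""
    x = list(values)
    n = len(x)
    k = min(k, n)
    comparators = 0
    for p in range(k):
        for i in range(n - 1 - p):
            lo, hi = min(x[i], x[i + 1]), max(x[i], x[i + 1])
            x[i], x[i + 1] = lo, hi  # one compare-swap element
            comparators += 1
    return x[n - k:], comparators

def guarantee_by_enumeration(extra, spare, k):
    """Check every set of up to k faults (combinatorially many)."""
    return all(sum(S) <= spare
               for j in range(1, k + 1)
               for S in combinations(extra, j))

def guarantee_by_top_k(extra, spare, k):
    """Check only the sum of the k largest values."""
    largest, _ = top_k_via_bubble_network(extra, k)
    return sum(largest) <= spare

# Hypothetical additional traffic on one link, per faulty switch.
extra = [3, 1, 4, 1, 5]
largest, comparators = top_k_via_bubble_network(extra, 2)
print(largest, comparators)
print(guarantee_by_top_k(extra, 9, 2), guarantee_by_enumeration(extra, 9, 2))
```

With n values and k passes the network uses fewer than nk compare-swap elements, which is where the O(nk) count comes from.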

Page 27:

FFC performance in practice

[Figure: two panels, "Single-priority traffic" and "Multi-priority traffic"]

Page 28:

Fast, consistent network updates

[Figure: pipeline of a desired state generator followed by an update planner. Labels: routing policy, consistency property, current network state, target network state, update plan]

Forward fault correction: computes states that are robust to common faults

Dionysus: dynamically schedules network updates

Page 29:

Overview of dynamic update scheduling

Current schedulers pre-compute a static update schedule: can get unlucky with switch delays

Dynamic scheduling adapts to actual conditions

Main challenge: Tractably exploring “safe” schedules

[Dionysus: Dynamic scheduling of network updates, SIGCOMM 2014 (to appear)]

Page 30:

Downside of static schedules

[Figure: current and target states of a five-switch network (S1 to S5) carrying flows F1: 5, F2: 5, F3: 10, F4: 5; timelines of updates at switches S1 to S4 for flows F1 to F4 under static plans A and B]

Page 31:

Downside of static schedules

[Figure: current and target states (switches S1 to S5; flows F1: 5, F2: 5, F3: 10, F4: 5) with orderings of F1 to F4 under static plan A, static plan B, and a dynamic plan]

Dynamic plan: low update time regardless of latency variability

Page 32:

Challenge in dynamic scheduling

Tractably explore valid orderings: exponential number of orderings; cannot completely avoid planning

[Figure: current and target states of a five-switch network (S1 to S5) carrying flows F1 to F5]

Page 33:

Dionysus pipeline

[Figure: pipeline of a dependency graph generator followed by an update scheduler. Labels: consistency property, current network state, target network state, dependency graph]

Page 34:

Dionysus dependency graph

Nodes: updates and resources

Edges: dependencies among nodes

[Figure: current and target states of a five-switch network (S1 to S5) carrying flows F1 to F5]

Page 35:

Dionysus scheduling

NP-complete problem with capacity and memory constraints

Approach: critical path scheduling; treat strongly connected components as virtual nodes and favor them; rate limit flows to resolve deadlocks
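
Critical path scheduling can be sketched as follows. This is a simplified illustration, not the Dionysus scheduler: it assumes the dependency graph is acyclic, ignores capacity and memory resources, and issues one update at a time so that the priority order is visible. The graph encoding (each node maps to the nodes that depend on it) and the node names are hypothetical.

```python
from functools import lru_cache

def critical_path_lengths(children):
    """Length of the longest dependency chain starting at each node.
    `children` maps every node to the nodes that depend on it (DAG assumed)."""
    @lru_cache(maxsize=None)
    def length(node):
        return 1 + max((length(c) for c in children[node]), default=0)
    return {node: length(node) for node in children}

def schedule(children):
    """Repeatedly issue the ready update with the longest critical path."""
    cpl = critical_path_lengths(children)
    pending_parents = {node: 0 for node in children}
    for node, deps in children.items():
        for c in deps:
            pending_parents[c] += 1
    ready = [n for n, p in pending_parents.items() if p == 0]
    order = []
    while ready:
        # Longest critical path first; ties broken by name for determinism.
        ready.sort(key=lambda n: (-cpl[n], n))
        node = ready.pop(0)
        order.append(node)
        for c in children[node]:
            pending_parents[c] -= 1
            if pending_parents[c] == 0:
                ready.append(c)
    return order

# Hypothetical dependency graph: u1 -> u3 -> u4 is a chain; u2 is independent.
children = {"u1": ["u3"], "u2": [], "u3": ["u4"], "u4": []}
order = schedule(children)
print(order)
```

Here `u3` is issued before `u2` even though `u2` became ready earlier, because `u3` heads a longer chain of dependent updates.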

Page 36:

Dionysus leads to faster updates

Median improvement over static scheduling (SWAN): 60-80%

Page 37:

Dionysus reduces congestion due to failures

99th percentile improvement over static scheduling (SWAN): 40%

Page 38:

Fast, consistent network updates

[Figure: pipeline of a desired state generator followed by an update planner. Labels: routing policy, consistency property, current network state, target network state, update plan]

Forward fault correction: computes states that are robust to common faults

Dionysus: dynamically schedules network updates

Page 39:

Summary

SDN enables new network operating points such as high utilization

But it also poses a new challenge: fast, consistent data plane updates