Dynamic Scheduling of Network Updates
description
Transcript of Dynamic Scheduling of Network Updates
Dynamic Scheduling ofNetwork Updates
Xin JinHongqiang Harry Liu, Rohan Gandhi, Srikanth Kandula, Ratul Mahajan,
Ming Zhang, Jennifer Rexford, Roger Wattenhofer
2
SDN: Paradigm Shift in Networking
• Many benefits– Traffic engineering [B4, SWAN]
– Flow scheduling [Hedera, DevoFlow]
– Access control [Ethane, vCRIB]
– Device power management [ElasticTree]
Controller• Direct, centralized updates of forwarding rules in switches
3
Network Update is Challenging
• Requirement 1: fast– The agility of control loop
• Requirement 2: consistent– No congestion, no blackhole, no loop, etc.
Network
Controller
4
What is Consistent Network Update
Current State
A
D
CB
EF1: 5 F4: 5
F2: 5 F3: 10
Target State
A
D
CB
E
F1: 5
F4: 5
F2: 5 F3: 10
5
What is Consistent Network Update
• Asynchronous updates can cause congestion• Need to carefully order update operations
Current State
A
D
CB
EF1: 5 F4: 5
F2: 5 F3: 10
Target State
A
D
CB
E
F1: 5
F4: 5
F2: 5 F3: 10
A
D
CB
EF4: 5F1: 5
F3: 10F2: 5
Transient State
6
Existing Solutions are Slow
• Existing solutions are static [ConsistentUpdate’12, SWAN’13, zUpdate’13]
– Pre-compute an order for update operations
F3
F4
F2
F1
StaticPlan A
Target StateCurrent State
A
D
CB
EF4: 5
F2: 5
F1: 5
F3: 10A
D
CB
E
F1: 5
F4: 5
F2: 5 F3: 10
7
Existing Solutions are Slow
• Existing solutions are static [ConsistentUpdate’12, SWAN’13, zUpdate’13]
– Pre-compute an order for update operations
• Downside: Do not adapt to runtime conditions– Slow in face of highly variable operation completion time
F3
F4
F2
F1
StaticPlan A
8
Operation Completion Times are Highly Variable
>10x
• Measurement on commodity switches
9
Operation Completion Times are Highly Variable
• Measurement on commodity switches
• Contributing factors– Control-plane load– Number of rules– Priority of rules– Type of operations (insert vs. modify)
10
Static Schedules can be Slow
F3
F4
F2 F1StaticPlan B
F3
F4
F2
F1
StaticPlan A
F1F2F3F4
Time1 2 3 4 5
F1F2F3F4
Time1 2 3 4 5
F1F2F3F4
Time1 2 3 4 5
F1F2F3F4
Time1 2 3 4 5
Target StateCurrent State
A
D
CB
EF4: 5
F2: 5
F1: 5
F3: 10A
D
CB
E
F1: 5
F4: 5
F2: 5 F3: 10
No static schedule is a clear winner under all conditions!
11
Dynamic Schedules are Adaptive and Fast
F3
F4
F2 F1StaticPlan B
F3
F4
F2
F1
StaticPlan A
Target StateCurrent State
A
D
CB
EF4: 5
F2: 5
F1: 5
F3: 10A
D
CB
E
F1: 5
F4: 5
F2: 5 F3: 10
No static schedule is a clear winner under all conditions!
F3
F4
F2F1
DynamicPlan
Adapts to actual conditions!
12
Challenges of Dynamic Update Scheduling
• Exponential number of orderings• Cannot completely avoid planning
Current State
A
D
CB
EF3: 5
F2: 5
F1: 5F4: 5
F5: 10
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
13
Challenges of Dynamic Update Scheduling
• Exponential number of orderings• Cannot completely avoid planning
Current State
A
D
CB
EF3: 5F1: 5
F4: 5
F5: 10
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5F2: 5
F2
Deadlock
14
Challenges of Dynamic Update Scheduling
• Exponential number of orderings• Cannot completely avoid planning
Current State
A
D
CB
EF3: 5
F2: 5
F4: 5
F5: 10
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
F1
F1: 5
15
Challenges of Dynamic Update Scheduling
• Exponential number of orderings• Cannot completely avoid planning
Current State
A
D
CB
E
F2: 5
F4: 5
F5: 10
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
F1 F3
F1: 5
F3: 5
16
Challenges of Dynamic Update Scheduling
• Exponential number of orderings• Cannot completely avoid planning
Current State
A
D
CB
E F4: 5
F5: 10
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
F1 F3 F2
Consistent update plan
F2: 5
F1: 5
F3: 5
17
Dionysus Pipeline
Dependency Graph Generator
Update Scheduler
Network
CurrentState
TargetState
ConsistencyProperty
Encode valid orderings
Determine a fast order
18
Dependency Graph Generation
Dependency Graph Elements
Operation node
Resource node (link capacity, switch table size)Node
Edge Dependencies between nodes
Path node
Current State
A
D
CB
EF3: 5
F2: 5
F1: 5F4: 5
F5: 10
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
19
Dependency Graph Generation
Current State
A
D
CB
EF3: 5
F2: 5
F1: 5F4: 5
F5: 10
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
MoveF3
MoveF1
MoveF2
20
Dependency Graph Generation
Current State
A
D
CB
EF3: 5
F2: 5
F1: 5F4: 5
F5: 10
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
A-E: 5
MoveF3
MoveF1
MoveF2
21
Dependency Graph Generation
Current State
A
D
CB
EF1: 5
F4: 5
F5: 10
A-E: 5
5
MoveF3
MoveF1
MoveF2
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
F3: 5
F2: 5
22
Dependency Graph Generation
A-E: 55
5
MoveF3
MoveF1
MoveF2
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
Current State
A
D
CB
EF3: 5
F2: 5
F1: 5F4: 5
F5: 10
23
Dependency Graph Generation
A-E: 55
55
MoveF3
MoveF1
MoveF2
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
Current State
A
D
CB
EF3: 5
F2: 5
F1: 5F4: 5
F5: 10
24
Dependency Graph Generation
A-E: 55
D-E: 0
5
5 5
5
MoveF3
MoveF1
MoveF2
Target State
A
D
CB
E
F1: 5
F3: 5F4: 5
F5: 10
F2: 5
Current State
A
D
CB
EF3: 5
F2: 5
F1: 5F4: 5
F5: 10
25
Dependency Graph Generation
• Supported scenarios– Tunnel-based forwarding: WANs– WCMP forwarding: data center networks
• Supported consistency properties– Loop freedom– Blackhole freedom– Packet coherence– Congestion freedom
• Check paper for details
26
Dionysus Pipeline
Dependency Graph Generator
Update Scheduler
Network
CurrentState
TargetState
ConsistencyProperty
Encode valid orderings
Determine a fast order
27
Dionysus Scheduling
• Scheduling as a resource allocation problem
A-E: 55
D-E: 0
5
5 5
5
MoveF3
MoveF1
MoveF2
28
Dionysus Scheduling
• Scheduling as a resource allocation problem
A-E: 0
D-E: 0
5
5 5
5
MoveF3
MoveF1
MoveF2
Deadlock!
29
Dionysus Scheduling
• Scheduling as a resource allocation problem
A-E: 05
D-E: 05 5
5
MoveF3
MoveF1
MoveF2
30
Dionysus Scheduling
• Scheduling as a resource allocation problem
A-E: 05
D-E: 55
5
MoveF3
MoveF2
31
Dionysus Scheduling
• Scheduling as a resource allocation problem
A-E: 05
5
MoveF3
MoveF2
32
Dionysus Scheduling
• Scheduling as a resource allocation problem
A-E: 55 Move
F2
33
Dionysus Scheduling
• Scheduling as a resource allocation problem
MoveF2
Done!
34
Dionysus Scheduling
• Scheduling as a resource allocation problem
• NP-complete problems under link capacity and switch table size constraints
• Approach– DAG: always feasible, critical-path scheduling– General case: covert to a virtual DAG– Rate limit flows to resolve deadlocks
35
Critical-Path Scheduling
• Calculate critical-path length (CPL) for each node
– Extension: assign larger weight to operation nodes if we know in advance the switch is slow
• Resource allocated to operation nodes with larger CPLs
A-B:5
CPL=3
C-D:0
5
CPL=1
5
5 5
5
CPL=1
CPL=2
CPL=2
CPL=1
MoveF4
MoveF3
MoveF2
MoveF1
36
Critical-Path Scheduling
• Calculate critical-path length (CPL) for each node
– Extension: assign larger weight to operation nodes if we know in advance the switch is slow
• Resource allocated to operation nodes with larger CPLs
A-B:0
CPL=3
C-D:0
5
CPL=1
5
5
5
CPL=1
CPL=2
CPL=2
CPL=1
MoveF4
MoveF3
MoveF2
MoveF1
37
Handling Cycles
• Convert to virtual DAG– Consider each strongly connected
component (SCC) as a virtual node
CPL=1CPL=3
weight = 1weight = 2
• Critical-path scheduling on virtual DAG– Weight wi of SCC: number of operation nodes
A-E: 55
D-E: 0
5
5 5
5
MoveF3
MoveF1
MoveF2
MoveF2
38
Handling Cycles
• Convert to virtual DAG– Consider each strongly connected
component (SCC) as a virtual node
CPL=1CPL=3
weight = 1weight = 2
• Critical-path scheduling on virtual DAG– Weight wi of SCC: number of operation nodes
A-E: 05
D-E: 05 5
5
MoveF3
MoveF1
MoveF2
MoveF2
39
Evaluation: Traffic Engineering
WAN TE Data Center TE0
0.5
1
1.5
2
2.5
3
3.5
50th Percentile Update Time
OneShotDionysusSWAN
Upd
ate
Tim
e (s
econ
d)
Not congestion-free
Congestion-free
Congestion-free
40
Evaluation: Traffic Engineering
Improve 50th percentile update speed by 80%compared to static scheduling (SWAN), close to OneShot
WAN TE Data Center TE0
0.5
1
1.5
2
2.5
3
3.5
50th Percentile Update Time
OneShotDionysusSWAN
Upd
ate
Tim
e (s
econ
d)
Not congestion-free
Congestion-free
Congestion-free
41
Evaluation: Failure Recovery
OneShot Dionysus SWAN0
1
2
3
4
5
99th PercentileLink Oversubscription
Link
Ove
rsub
scrip
tion
(GB)
OneShot Dionysus SWAN0
0.5
1
1.5
2
2.5
3
99th PercentileUpdate Time
Upd
ate
Tim
e (s
econ
d)
42
Evaluation: Failure Recovery
Reduce 99th percentile link oversubscription by 40% compared to static scheduling (SWAN)
Improve 99th percentile update speed by 80% compared to static scheduling (SWAN)
OneShot Dionysus SWAN0
1
2
3
4
5
99th PercentileLink Oversubscription
Link
Ove
rsub
scrip
tion
(GB)
OneShot Dionysus SWAN0
0.5
1
1.5
2
2.5
3
99th PercentileUpdate Time
Upd
ate
Tim
e (s
econ
d)
43
Conclusion
• Dionysus provides fast, consistent network updates through dynamic scheduling– Dependency graph: compactly encode orderings– Scheduling: dynamically schedule operations
Dionysus enables more agile SDN control loops
44
Thanks!