Yaping Zhu Advisor: Prof. Jennifer Rexford With: Andy Bavier and Nick Feamster ( Georgia Tech )


Yaping Zhu

Advisor: Prof. Jennifer Rexford

With: Andy Bavier and Nick Feamster (Georgia Tech)

UFO: A Resilient Layered Routing Architecture


Scalability + High Availability?

Scalability:
  – Scalability of the routing control plane
  – Efficiency of the routing data plane

High availability: quick adaptation and re-routing


Can We Have the Best of Both Worlds?

                   Scalability   Availability
Internet Routing   Yes           No
Overlay Routing    No            Yes
UFO Routing        Yes           Yes

Basic idea:
1. Layered routing architecture (borrowing the idea from overlay routing)
2. Underlay support for efficient and scalable overlay routing


Outline

• Background
  – Internet routing architecture
  – Overlay routing (Resilient Overlay Networks)
  – Basic idea of the layered routing architecture

• Efficient overlay forwarding

• Scalable overlay monitoring

• Enhancing the scalability of UFO

• Implementation and Evaluation

• Conclusion and deployment


Internet Routing designed for Scalability

[Figure: Autonomous Systems (ASes) interconnected by peering and transit links]


Internet Routing without High Availability

• Scalability
  – Statistics: 25K ASes, 200K prefixes, millions of routers
  – Hierarchical: intra-domain / inter-domain routing
  – Prefix aggregation

• Routing protocols are oblivious to performance
  – Intra-domain: static link weights
  – Inter-domain: routing policies

• Slow outage detection and recovery
  – Disruptions during convergence
  – Performance suffers from black holes and loops


Scalable Internet Routing without Customization

• IP does destination-based forwarding
  – All traffic follows the same paths
  – Independent of the application requirements

• Yet, applications have different needs
  – Voice and gaming: low latency and loss
  – File sharing: high throughput

[Figure: two paths, one labeled "high throughput, but high latency" and the other "low latency, but low throughput"]



RON: Resilient Overlay Networks (by D. Andersen)

[Figure label: scalable IP routing substrate]


RON: Resilient Overlay Networks System Components

• Overlay control plane
  – Probing, overlay path evaluation
  – Disseminate routing messages, update routes

• Overlay data plane
  – Tunnel setup: packet encapsulation/decapsulation

• User opt-in method
  – DNS redirection to overlay server
  – Connection to overlay server: tunnels (e.g., VPN)
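
The data-plane step above (packet encapsulation/decapsulation) can be sketched as follows. This is only an illustration: the header layout (two 4-byte node IDs and a 2-byte length) and the names `encapsulate` / `decapsulate` are assumptions, not RON's actual packet format.

```python
import struct
from typing import Tuple

# Hypothetical overlay header (not RON's real format):
# 4-byte overlay source ID, 4-byte overlay destination ID,
# 2-byte payload length, all in network byte order.
HEADER = struct.Struct("!IIH")


def encapsulate(src_id: int, dst_id: int, payload: bytes) -> bytes:
    """Prefix the original packet with the overlay header."""
    return HEADER.pack(src_id, dst_id, len(payload)) + payload


def decapsulate(packet: bytes) -> Tuple[int, int, bytes]:
    """Strip the overlay header and return (src_id, dst_id, payload)."""
    src_id, dst_id, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated overlay packet")
    return src_id, dst_id, payload
```

An overlay node at the tunnel entry would call `encapsulate`, and the node at the tunnel exit would call `decapsulate` before handing the inner packet onward.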


Overlay Routing

• Pros:
  – High availability: end hosts discover network-level path failures and cooperate to re-route
  – Customization: forwarding paths tailored to the application

• Applications:
  – Content distribution (e.g., Akamai SureRoute)
  – Application-layer multicast


Overlay Routing: Poor Efficiency

• Problem: traffic must traverse the bottleneck link both inbound and outbound
  – Additional latency overhead
  – Additional traffic consumption

[Figure label: upstream ISP]


Overlay Routing: Poor Scalability

[Figure: overlay on top of a scalable IP routing substrate, with callouts:]
  – “I don’t know when a failure happens”
  – “Let’s just keep probing”
  – “Shall I re-route if one packet is lost?”


Overlay Routing: Poor Scalability

• Fundamental trade-off between probing frequency and adaptation speed
  – Quick adaptation -> aggressive probing at short time intervals -> poor scalability
  – RON supports only a small set of connected hosts (i.e., < 50 nodes)

• Cannot differentiate packet loss due to different events
  – Failure -> fast re-route
  – Congestion -> maybe re-route more slowly? -> oscillation?
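
A rough way to see the trade-off above, under the assumption (not stated on the slide) that every overlay node probes every other node once per probe interval. The function name and the sample sizes are illustrative only.

```python
def probes_per_second(num_nodes: int, probe_interval_s: float) -> float:
    """Total probe rate, assuming all-pairs probing once per interval."""
    if probe_interval_s <= 0:
        raise ValueError("probe interval must be positive")
    return num_nodes * (num_nodes - 1) / probe_interval_s


if __name__ == "__main__":
    # The probe rate grows quadratically with overlay size and
    # linearly with probing frequency.
    for n in (10, 50, 200):
        for interval in (12.0, 1.0):
            rate = probes_per_second(n, interval)
            print(f"{n} nodes, {interval} s interval: {rate:.1f} probes/s")
```

Shortening the interval to adapt faster multiplies the probe load by the same factor, which is why aggressive probing limits the overlay size.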



Can We Have the Best of Both Worlds?

                   Scalability   Availability   Customization
Internet Routing   Yes           No             No
Overlay Routing    No            Yes            Yes
UFO Routing        Yes           Yes            Yes


A Resilient Layered Routing Architecture

• Combination of underlay and overlay routing


UFO: Underlay Friendly to Overlays

• In-network support for overlays


A Resilient Layered Routing Architecture

• Questions:
  – Which functionality belongs to which layer?
  – What are the interfaces between the two layers?

• Cross-layer design
  – Efficiency improvement:
    • Direct control over forwarding table entries
  – Scalability improvement:
    • Explicit notification about changing network conditions


Outline

• Efficient overlay forwarding
  – Overlay forwarding on line cards
  – Hosting the overlay control plane

• Scalable overlay monitoring
  – Registration of overlay links
  – Notification of network events
  – Lazy recovery

• Enhancing the scalability of UFO

• Implementation and Evaluation

• Conclusion and deployment



Efficient Overlay Forwarding

• Problem: traffic must traverse bottleneck link both inbound and outbound

• Solution: reflection points in routers

[Figure label: upstream ISP]


Overlay Forwarding on Router Line Cards

• Building block: tunnels


Where Does the Overlay Control Plane Run? On Routers

• On routers, through router virtualization
  – Pro: fast updates of forwarding tables
  – Pro: efficient transmission of control messages
  – Pro: fate-sharing

[Figure: router with processors, switching fabric, and line cards]


Where Does the Overlay Control Plane Run? On Servers

• On a separate set of servers
  – Update forwarding tables on router line cards
  – Data packets reflected in-network

• Pros:
  – Cheap compared to a router
  – Compatibility with legacy overlay servers

• Cons:
  – Lack of fate-sharing



Scalable Overlay Monitoring

Assumption: rich connectivity, multiple alternative overlay paths
  – Overlays could even tolerate “false positive” notifications

What to notify? Different applications may want notification of different events

Notification benefits:
  – Accurate adaptation (compared with RON)
  – Reduced probing overhead and increased scalability


Scalable Overlay Monitoring

• Notifications preserve the overlay link abstraction
  – Message format: (overlay source, overlay destination, event)
  – Routers store state through explicit overlay registration

• Explicit notification about events that affect the performance of overlay applications
  – Physical failures of routers or links
  – Reachability failures: route withdrawal, routing session failure
  – Network congestion
  – A few lost “hello” packets
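
The message format above, (overlay source, overlay destination, event), could be modeled as in the sketch below. The event names mirror the four bullet items; the per-interface registration table and the function `notifications_for` are hypothetical, since the slides do not specify how routers index their registered state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Set, Tuple


class Event(Enum):
    PHYSICAL_FAILURE = "physical_failure"
    REACHABILITY_FAILURE = "reachability_failure"
    CONGESTION = "congestion"
    HELLO_LOSS = "hello_loss"


@dataclass(frozen=True)
class Notification:
    overlay_src: str
    overlay_dst: str
    event: Event


def notifications_for(
    registered: Dict[str, Set[Tuple[str, str]]],
    affected_interface: str,
    event: Event,
) -> List[Notification]:
    """Build one notification per overlay link registered on the interface.

    `registered` maps a router interface name (an assumed index) to the
    set of (overlay source, overlay destination) pairs registered on it.
    """
    links = sorted(registered.get(affected_interface, set()))
    return [Notification(src, dst, event) for src, dst in links]
```

Because each message names an overlay link rather than a physical link, the overlay never needs to learn the underlay topology.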


Registration of Overlay Links

[Figure: topology of overlay nodes and routers]

Overlay nodes: A, B, C

Routers: 1, 2, 3, 4

Register for unidirectional overlay links A->B and A->C


Periodic Registration of Overlay Links

[Figure: topology of overlay nodes A, B, C and routers 1, 2, 3, 4, annotated with the registration (A,B)]

ACK for successful registration


Periodic Registration of Overlay Links

[Figure: topology of overlay nodes A, B, C and routers 1, 2, 3, 4, annotated with the registrations (A,B) and (A,C)]

Registration kept as soft state

Periodic re-registration
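
Soft state with periodic refresh can be sketched as a table whose entries expire unless they are re-registered in time. The class name, the `lifetime_s` parameter, and the explicit `now` timestamps are illustrative assumptions; the slides give no lifetime or refresh period.

```python
from typing import Dict, Tuple


class RegistrationTable:
    """Soft-state table: entries vanish unless refreshed within lifetime_s."""

    def __init__(self, lifetime_s: float) -> None:
        self.lifetime_s = lifetime_s
        self._expiry: Dict[Tuple[str, str], float] = {}

    def register(self, src: str, dst: str, now: float) -> None:
        """Insert or refresh the entry for overlay link src->dst."""
        self._expiry[(src, dst)] = now + self.lifetime_s

    def expire(self, now: float) -> None:
        """Drop entries whose lifetime elapsed without a refresh."""
        self._expiry = {k: t for k, t in self._expiry.items() if t > now}

    def is_registered(self, src: str, dst: str, now: float) -> bool:
        """True while the entry exists and has not yet expired."""
        return self._expiry.get((src, dst), float("-inf")) > now
```

With this design a router needs no explicit teardown message: if an overlay node disappears, its registrations simply time out.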


Notification of Network Events

[Figure: topology of overlay nodes A, B, C and routers 1, 2, 3, 4, annotated with the registrations (A,B) and (A,C)]


Reactive Routing and Lazy Recovery

• Assumption: rich connectivity

• Reactive routing after notification
  – Re-route via alternative overlay paths
  – Disseminate the notification message to peers

• Lazy recovery
  – Stick to alternative overlay paths (e.g., for minutes)
  – Re-register for the failed overlay link
  – Reason: the transient period while recovery converges can cause loops and black holes
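
A minimal sketch of reactive routing with lazy recovery, assuming the overlay node keeps an ordered list of candidate paths and avoids a notified overlay link for a fixed hold-down period. The class `OverlayRouter`, the `hold_down_s` parameter, and the preference-ordered candidate list are hypothetical; the slides only say to stick to the alternative path for some minutes.

```python
from typing import Dict, List, Optional, Tuple


class OverlayRouter:
    """Picks among candidate overlay paths, avoiding recently failed links."""

    def __init__(self, hold_down_s: float) -> None:
        self.hold_down_s = hold_down_s
        self._avoid_until: Dict[Tuple[str, str], float] = {}

    def on_notification(self, src: str, dst: str, now: float) -> None:
        """Mark overlay link src->dst as unusable for the hold-down period."""
        self._avoid_until[(src, dst)] = now + self.hold_down_s

    def usable(self, src: str, dst: str, now: float) -> bool:
        """A link is usable once its hold-down period has elapsed."""
        return self._avoid_until.get((src, dst), float("-inf")) <= now

    def choose_path(
        self, candidates: List[List[str]], now: float
    ) -> Optional[List[str]]:
        """Return the first candidate path whose every hop is usable."""
        for path in candidates:
            if all(self.usable(a, b, now) for a, b in zip(path, path[1:])):
                return path
        return None
```

For example, with candidates `[["A", "B"], ["A", "C", "B"]]`, a notification for A->B makes the node use the detour through C until the hold-down expires, rather than switching back as soon as the underlay reports recovery.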



Unicast Registration is Inefficient

[Figure: topology annotated with the unicast registrations (B,A), (C,A), (D,A), (E,A)]

• Overlay nodes: A, B, C, D, E; routers: 1, 2, 3, 4

• Register for overlay links B->A, C->A, D->A, E->A


Unicast Notification is Inefficient

[Figure: topology of overlay nodes A, B, C, D, E and routers 1, 2, 3, 4, annotated with the unicast notifications (B,A), (C,A), (D,A), (E,A)]


Multicast Registration

[Figure: topology of overlay nodes A, B, C, D, E and routers 1, 2, 3, 4, annotated with the multicast group GroupA]


Multicast Notification

[Figure: topology of overlay nodes A, B, C, D, E and routers 1, 2, 3, 4, annotated with the multicast group GroupA]


Benefits of Multicast Registration/Notification

• Reduce registration state stored at routers
  – Unicast: store state for each (src, dst) pair, O(n^2)
  – Multicast: store state for each multicast group, O(n)

• Reduce notification message overhead

• Deployment benefits:
  – Exploit IP multicast (which routers already have)
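
The O(n^2) versus O(n) comparison above can be made concrete with a worst-case count. The sketch assumes that every ordered pair of overlay nodes is registered in the unicast case and that there is one multicast group per overlay node in the multicast case (as GroupA suggests for node A); both are assumptions for illustration.

```python
def unicast_state(n: int) -> int:
    """Worst-case entries when every ordered (src, dst) pair is registered."""
    return n * (n - 1)


def multicast_state(n: int) -> int:
    """Worst-case entries with one multicast group per overlay node."""
    return n


if __name__ == "__main__":
    for n in (5, 50, 500):
        print(f"n={n}: unicast={unicast_state(n)}, multicast={multicast_state(n)}")
```

For a 50-node overlay this is 2450 entries against 50 at a router that lies on every path.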



Prototype Implementation on VINI

• What’s finished?
  – RON
    • Control plane: probing and reactive routing
    • Data plane: overlay tunnel setup
    • User opt-in: user data packets delivered by overlays
  – UFO: notification of link failure

• What to do next?
  – UFO
    • Evaluate inter-domain routing convergence
    • Notification of link congestion
  – Run applications: e.g., VoIP


Prototype Implementation on VINI

• Overlay: RON

• Overlay FIB

• Client opt-in

• Notification by filter

[Figure: prototype on a PlanetLab VM. Labels: IP router, XORP (control), UML with interfaces eth0 to eth3, Click packet forwarding engine (data), UmlSwitch element, tunnel table, filters, RON, overlay FIB, VPN server, clients]


Evaluation Setup

• Topology
  – Routers and overlay nodes

[Figure: evaluation topology with nodes labeled s, d, and r]


Evaluation 1: Reactive Routing of RON

• How long does RON take to detect an outage?
  – RON probe interval: 12 s
  – RON probe timeout: 3 s
  – Average detection time = probe interval / 2 + probe timeout * 3

• What to evaluate?
  – Fundamental trade-off between probe frequency and detection time
  – Parameter: probe interval


Evaluation 1: Reactive Routing of RON

• Detection time = probe interval / 2 + probe timeout * 3
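
The detection-time formula can be evaluated directly. The sketch below plugs in the RON parameters from the slides (12 s probe interval, 3 s probe timeout); the function name and the `timeout_multiplier` parameter are illustrative, with the multiplier of 3 taken from the formula as given.

```python
def average_detection_time(
    probe_interval_s: float,
    probe_timeout_s: float,
    timeout_multiplier: int = 3,
) -> float:
    """Detection time = probe interval / 2 + probe timeout * multiplier."""
    return probe_interval_s / 2 + probe_timeout_s * timeout_multiplier


if __name__ == "__main__":
    # RON defaults from the slides: 12 / 2 + 3 * 3 = 15 seconds.
    print(average_detection_time(12.0, 3.0))
    # Shorter probe intervals reduce detection time but increase probing load.
    for interval in (2.0, 6.0, 12.0, 24.0):
        print(interval, average_detection_time(interval, 3.0))
```

Note that the timeout term (9 s here) is a floor: shortening the probe interval alone cannot push detection time below it.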


Evaluation 2: Comparison of Convergence Speed

• Controlled experiment
  – Fail a link by filtering all the packets

• Comparison of convergence speed
  – IP routing (XORP)
  – RON reactive routing
  – Reactive routing with UFO notification


Evaluation 2: Comparison of Convergence Speed

– IP routing (XORP)
  • Hello interval: 15 s
  • Router dead interval: 45 s

[Figure: plot with markers "Link down" and "Link up"]
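
As a rough guide to what these timers imply, the sketch below bounds the time until a neighbor is declared down. It assumes the usual hello/dead-timer behavior (hellos arrive exactly every hello interval, and the neighbor is declared dead once no hello has arrived for the dead interval) and ignores route recomputation and forwarding-table updates, so it is a lower bound on convergence rather than a measured result. The function name is hypothetical.

```python
from typing import Tuple


def dead_timer_detection_bounds(
    hello_interval_s: float, dead_interval_s: float
) -> Tuple[float, float]:
    """Return (min, max) delay between a failure and the dead timer firing.

    The dead timer restarts at each received hello. A failure just before
    the next hello gives the minimum delay; a failure just after a hello
    gives the maximum.
    """
    if dead_interval_s < hello_interval_s:
        raise ValueError("dead interval should not be below hello interval")
    return dead_interval_s - hello_interval_s, dead_interval_s


if __name__ == "__main__":
    low, high = dead_timer_detection_bounds(15.0, 45.0)
    print(f"neighbor declared down between {low} s and {high} s after failure")
```

Because the experiment fails the link by filtering packets, the routers see no interface-down signal and must rely on these timers.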


Evaluation 2: Comparison of Convergence Speed

– RON
  • Probe interval: 12 s
  • Probe timeout: 3 s
  • Re-route immediately after outage detection

[Figure: plot with markers "Link down", "Link up", and "RON up"]


Evaluation 2: Comparison of Convergence Speed

– UFO routing with explicit notification
  • Re-route immediately after outage notification

[Figure: plot with markers "Link down", "Link up", "RON up", and "UFO up"]



Deployability Benefits

• Forwarding support:
  – Low barriers to entry
  – Routers already have hardware for setting up tunnels
  – Upgrade a small fraction of routers for overlay forwarding

• Notification support:
  – Upgrade all routers to support notification (could start with one AS)
  – Performance benefits and business incentives
  – Better real-time applications: VoIP


Related Work

• Overlay routing
  – Detour (Collins98)
  – Resilient Overlay Networks (Andersen01)

• Improving forwarding efficiency
  – Path reflection and path painting (Jannotti02)

• Reducing probing overhead
  – Routing Underlay for Overlays (Nakao03)

• Network virtualization
  – VINI, GENI, CABO, VERA


Conclusion

• Contributions
  – Scalable overlay routing is feasible with in-network support
  – UFO provides strong reliability and a compelling deployment model

• Future work
  – Further performance evaluation
  – Applications: VoIP
  – Application-layer multicast (with NEC Lab)


Acknowledgement

• General Exam Committee:
  – Prof. Jennifer Rexford (Advisor)
  – Prof. Larry Peterson
  – Prof. Vivek Pai

• Collaborators:
  – Andy Bavier and Nick Feamster (Georgia Tech)

• Cabernet Research Group

• VINI Support, PlanetLab Operations


Questions?


FAQ: Recovery Notification?

• UFO does NOT support notification of recovery, because:
  – Alternative overlay paths are available (overlays don’t care!)
  – It is hard for routers to determine intra-domain convergence: determining data-plane convergence requires synchronization
  – It is hard for routers to determine inter-domain convergence