Using VINI to Test New Network Protocols Murtaza Motiwala, Georgia Tech Andy Bavier, Princeton...

Post on 27-Mar-2015


Using VINI to Test New Network Protocols

Murtaza Motiwala, Georgia Tech

Andy Bavier, Princeton University

Nick Feamster, Georgia Tech

Santosh Vempala, Georgia Tech

2

“The research agenda in measurement must change to consider measurement solutions which enlist the cooperation of routers. The need is so urgent that the deployment...can be finessed by cooperation between a few key ISPs. There is a rich vein of technical problems, hitherto considered only from an active measurement perspective, for which there can be new and effective...solutions.”

—Varghese and Estan, The Measurement Manifesto

3

Accountability and Availability

• Accountability: Detecting and locating the cause of performance degradations

– Proposal: In-band path diagnosis (Orchid)

– Need: Carry network traffic with modified packet formats, routers with packet marking capabilities

• Availability: Maintaining reachability to Internet destinations in the face of failing components

– Proposal: Path splicing

– Need: Support for running multiple routing protocols in parallel, modified packet formats, etc.

4

Data-Plane Accountability

• Mechanisms to detect and locate sources (and causes) of bad behavior

• Causes may be benign or malicious

– Congestion

– Faulty links

– Denial of service attack

• Recourse to avoid faulty or malicious elements

– Scalable network support for path diversity

5

One Mechanism: Out-of-Band

• Approach: Send additional probe traffic to capture network conditions

– Ping, traceroute, pathchar, etc.

• Problem: Measured performance may not reflect conditions experienced by data traffic

– May not capture transient faults

– Probes may be treated differently

– Introduces additional probe traffic, which may affect observed performance

6

Alternative: In-Band Path Diagnosis

• Store information about network diagnostics in the packet itself.

• Advantage: Diagnostic information reflects the conditions actually experienced by data traffic.

• Challenges

– Lost data packets mean lost diagnostics

– Distinguishing loss and reordering

– Recovering diagnostic information (from the receiver)

– Packet marking and storage requirements

7

Data-Plane Accountability

• Problem: Network elements drop packets, fail, and otherwise give rise to poor performance

• One Solution: In-Band Path Diagnosis

• Routers keep track of number of packets seen per flow

• Each router stamps each packet with current flow counter value

• If current counter value does not equal router’s expected packet count for that flow, router marks packet

High-level Overview

[Figure: packet layout with the new shim header placed between the IP header and the transport header]

8

Detailed Operation

• Suppose R2 and R3 have each lost one packet

• Next packet: R2 sees “gap” in counter value

– Marks packet with its ID, updates flow counter value

• Subsequent packets contain marks for packets further downstream
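
The counter-and-mark behavior described above can be illustrated with a toy simulation. This is a minimal sketch under several assumptions that are not in the slides: the shim header holds one counter and a single mark slot, a router that finds the mark slot already used stays out of sync so that its gap is reported on a later packet, and a mark means "a loss was noticed at this router", not an exact count. The names `Packet`, `Router`, and `send` are hypothetical and do not come from the Click implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Packet:
    flow: str
    counter: int = 0             # shim-header counter, restamped at each hop
    mark: Optional[str] = None   # ID of the router that reported a gap (assumed single slot)


class Router:
    def __init__(self, router_id: str) -> None:
        self.router_id = router_id
        self.seen: Dict[str, int] = {}   # per-flow count of packets seen

    def process(self, pkt: Packet, first_hop: bool = False) -> Packet:
        expected = self.seen.get(pkt.flow, 0) + 1
        if first_hop or pkt.counter == expected:
            self.seen[pkt.flow] = expected
        elif pkt.mark is None:
            # Gap detected: mark the packet and resynchronize with upstream.
            pkt.mark = self.router_id
            self.seen[pkt.flow] = pkt.counter
        else:
            # Mark slot already used (assumption): stay out of sync so the
            # gap is reported on a later packet.
            self.seen[pkt.flow] = expected
        pkt.counter = self.seen[pkt.flow]   # stamp with the current counter value
        return pkt


def send(n_packets: int, drops: Dict[int, str]) -> List[Tuple[int, Optional[str]]]:
    """Send packets through R1 -> R2 -> R3.

    drops maps a packet number to the router it fails to reach.
    Returns (packet number, mark) for every packet the receiver gets.
    """
    r1, r2, r3 = Router("R1"), Router("R2"), Router("R3")
    received = []
    for n in range(1, n_packets + 1):
        pkt = r1.process(Packet(flow="f"), first_hop=True)
        lost = False
        for router in (r2, r3):
            if drops.get(n) == router.router_id:
                lost = True
                break
            router.process(pkt)
        if not lost:
            received.append((n, pkt.mark))
    return received
```

With `send(6, {2: "R3", 3: "R2"})`, packet 4 arrives marked by R2 and packet 5 arrives marked by R3, which mirrors the sequence on the slide. The sketch also shows one of the listed challenges: if a marked packet is itself lost, the receiver never sees that mark.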

9

Implementation and Evaluation

• Implementation in Click

– Two main elements: ModifyIng, ModifyPkt

• Deployment on PL-VINI

– Evaluation under direct packet drops and induced routing instability

10

Some Recent Feedback

“the entire approach completely disregards the cost of implementation on routers. … The authors must demonstrate that what they are proposing is feasible at e.g., 40Gbps if it is going to be implemented on the fast path…”

11

Path Splicing: Main Idea

• Step 1: Run multiple instances of the routing protocol, each with slightly perturbed versions of the configuration

• Step 2: Allow traffic to switch between instances at any node in the protocol

Compute multiple forwarding trees per destination. Allow packets to switch slices midstream.

Feamster, Motiwala, and Vempala, Path Splicing with Network Slicing

12

Perturbations

• Goal: Each instance provides different paths

• Mechanism: Each edge is given a weight that is a slightly perturbed version of the original weight

– Two schemes: Uniform and degree-based

[Figure: “Base” graph between s and t with edge weights of 3, next to the perturbed graph with modified edge weights]
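
The perturbation step can be sketched as follows. The slides do not give the perturbation formula, so this sketch assumes a uniform scheme in which each weight w becomes w · (1 + U(0, scale)); the degree-based scheme is not shown. The function names `perturb`, `next_hops`, and `build_slices` and the `scale` parameter are hypothetical.

```python
import heapq
import random
from typing import Dict, List

Graph = Dict[str, Dict[str, float]]   # undirected: graph[u][v] == graph[v][u]


def perturb(graph: Graph, scale: float, rng: random.Random) -> Graph:
    """Copy the graph, scaling each edge weight by 1 + U(0, scale) (assumed scheme)."""
    out: Graph = {u: {} for u in graph}
    for u in graph:
        for v, w in graph[u].items():
            if v not in out[u]:               # perturb each undirected edge once
                new_w = w * (1.0 + rng.uniform(0.0, scale))
                out[u][v] = new_w
                out[v][u] = new_w
    return out


def next_hops(graph: Graph, dst: str) -> Dict[str, str]:
    """Run Dijkstra from dst; return each node's next hop toward dst."""
    dist = {dst: 0.0}
    hops: Dict[str, str] = {}
    heap = [(0.0, dst)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                          # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                hops[v] = u                   # v reaches dst through u
                heapq.heappush(heap, (nd, v))
    return hops


def build_slices(graph: Graph, dst: str, k: int,
                 scale: float = 0.5, seed: int = 0) -> List[Dict[str, str]]:
    """Slice 0 uses the original weights; slices 1..k-1 use perturbed copies."""
    rng = random.Random(seed)
    slices = [next_hops(graph, dst)]
    for _ in range(k - 1):
        slices.append(next_hops(perturb(graph, scale, rng), dst))
    return slices
```

Each entry of the returned list is one forwarding tree toward `dst`, expressed as a next-hop map. Larger values of `scale` make the slices more likely to differ from one another, at the cost of longer paths.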

13

Network Slicing

• Goal: Allow multiple instances to co-exist

• Mechanism: Virtual forwarding tables

[Figure: topology with nodes s, a, b, c, t; each slice has its own forwarding table (dst, next-hop): Slice 1 maps t to a, Slice 2 maps t to c]

14

Path Splicing in Practice

• Packet has shim header with routing bits

• Routers use lg(k) bits to index forwarding tables

– Shift bits after inspection

– Incremental deployment is trivial

– Persistent loops cannot occur

• To access different (or multiple) paths, end systems simply change the forwarding bits
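
The per-hop lookup described above might look like the following sketch. It assumes that k is a power of two, that each router reads the low-order lg(k) bits, and that a packet whose bits are exhausted (all zero) falls back to slice 0 for the rest of the path; none of these details are stated on the slides. The `tables` layout and the names `forward` and `route` are hypothetical.

```python
from typing import Dict, List, Tuple

# tables[node][slice_id][dst] -> next hop (hypothetical layout)
Tables = Dict[str, List[Dict[str, str]]]


def forward(node: str, dst: str, bits: int,
            tables: Tables, k: int) -> Tuple[str, int]:
    """One hop: pick a slice from the low lg(k) bits, then shift them out."""
    width = (k - 1).bit_length()              # lg(k) when k is a power of two
    slice_id = bits & ((1 << width) - 1)
    next_hop = tables[node][slice_id][dst]
    return next_hop, bits >> width


def route(src: str, dst: str, bits: int,
          tables: Tables, k: int, max_hops: int = 16) -> List[str]:
    """Follow the splicing bits hop by hop and return the path taken."""
    path = [src]
    node = src
    while node != dst and len(path) <= max_hops:
        node, bits = forward(node, dst, bits, tables, k)
        path.append(node)
    return path


# Two slices on a small example: s reaches t via a in slice 0, via c in slice 1.
EXAMPLE: Tables = {
    "s": [{"t": "a"}, {"t": "c"}],
    "a": [{"t": "t"}, {"t": "t"}],
    "c": [{"t": "t"}, {"t": "t"}],
}
```

Calling `route("s", "t", 0b0, EXAMPLE, k=2)` follows slice 0 through a, while `route("s", "t", 0b1, EXAMPLE, k=2)` switches to slice 1 at s and goes through c. Because the bits are consumed at every hop, an end system picks a different path simply by sending different bits.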

15

Design and Implementation

• Click and Quagga on PL-VINI

[Figure: a Classifier in front of two slice instances, each with its own Control Plane, Daemon, and Forwarding Table]

16

Challenges

• Can end hosts react quickly enough to recover?

– How does the end system find the alternate path?

• How does splicing perform for other topologies?

• Deployment Paths

– VINI

– Overlay

– Wireless

17

More Feedback

“What ramifications does the proposed technique have on state-of-the-art router hardware?...As the routing method is supposed to use in the routers, some traditional metrics (e.g. the influence on throughput or latency) should be used to compare the performance…”

“the entire approach completely disregards the cost of implementation on routers. … The authors must demonstrate that what they are proposing is feasible at e.g., 40Gbps if it is going to be implemented on the fast path…”

18

Questions

• What amount of “realism” should a testbed like VINI provide?

• How to convince

– Researchers

– Vendors

– …

• Might VINI be a deployment platform, rather than simply a testing platform?

19

20

Internet Routing Lacks Accountability

• Control Plane: Messages can be falsified

– Misconfiguration: AS 7007, ConEdison route leak

– Malice: Spammers stealing address space

• Data Plane: Data traffic is not guaranteed to travel where the routing protocol indicates

– Paths may not perform well

– Even if a faulty path could be located, no recourse

This talk: Detecting and isolating faulty elements and nodes. Some discussion about recourse.

21

Design Considerations

• Localization granularity: With what precision should a fault be located?

– From within a few ASes to actual network element

• Statistics granularity: With what precision should statistics be captured?

– From coarse, per-flow statistics to per-packet statistics

• Storage: How much state should be stored, and where should it be stored?

– In the router vs. in the packet

22

Design Considerations (cont.)

• Modifications to packet format: Modify packet format, or squeeze data into existing headers?

• Robustness to malice: Should the scheme be robust in the face of malice?

– Off-path: Hosts or routers off of the data path try to disrupt communication

– On-path: Malicious hosts or routers on-path may lie

23

Analysis of Accuracy

• Partially accurate: Faulty element identified, but not the correct number of lost packets

– Example: Counter overflow

• Misleading: Network fault is attributed to the incorrect network element

– Example: Packets containing information about packet loss are also lost

• No information: No information reported

24

Multipath: Promise and Problems

• Bad: If a link fails on each of the two paths, s is disconnected from t

• Want: End systems remain connected unless the underlying graph is disconnected

[Figure: two paths between s and t]

25

Reliability Approaches that of Underlying Graph

• GEANT (Real) and Sprint (Rocketfuel) topologies

• 1,000 trials

• p indicates probability edge was removed from base graph

[Figure: GEANT topology, degree-based perturbations. Annotations: reliability approaches optimal; average stretch is only 1.3]
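
One way to read the experiment above is as a Monte Carlo estimate. The sketch below assumes that reliability is the fraction of trials in which s can still reach t after each edge is removed independently with probability p, and that a spliced packet may use any surviving next-hop edge from any slice. The slides do not spell out the metric, so both points are assumptions, and the names `reachable`, `trial`, and `reliability` are hypothetical.

```python
import random
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = FrozenSet[str]


def reachable(adj: Dict[str, Set[str]], src: str, dst: str) -> bool:
    """Depth-first search over an adjacency map."""
    stack, visited = [src], {src}
    while stack:
        u = stack.pop()
        if u == dst:
            return True
        for v in adj.get(u, ()):
            if v not in visited:
                visited.add(v)
                stack.append(v)
    return False


def trial(edges: List[Edge], slices: List[Dict[str, str]],
          src: str, dst: str, p: float, rng: random.Random) -> Tuple[bool, bool]:
    """Fail each edge with probability p; test the graph and the spliced slices."""
    alive = [e for e in edges if rng.random() >= p]
    alive_set = set(alive)
    graph_adj: Dict[str, Set[str]] = {}
    for e in alive:
        u, v = tuple(e)
        graph_adj.setdefault(u, set()).add(v)
        graph_adj.setdefault(v, set()).add(u)
    spliced_adj: Dict[str, Set[str]] = {}
    for table in slices:                      # union of surviving next-hop edges
        for u, nh in table.items():
            if frozenset((u, nh)) in alive_set:
                spliced_adj.setdefault(u, set()).add(nh)
    return reachable(graph_adj, src, dst), reachable(spliced_adj, src, dst)


def reliability(edges: List[Edge], slices: List[Dict[str, str]],
                src: str, dst: str, p: float,
                trials: int = 1000, seed: int = 0) -> Tuple[float, float]:
    """Return (graph reliability, spliced reliability) over the given trials."""
    rng = random.Random(seed)
    graph_ok = spliced_ok = 0
    for _ in range(trials):
        g, s = trial(edges, slices, src, dst, p, rng)
        graph_ok += g
        spliced_ok += s
    return graph_ok / trials, spliced_ok / trials
```

The first number is the best any routing scheme could achieve on the failed graph, and the second can never exceed it because the spliced edges are a subset of the surviving edges. Comparing the two as the number of slices grows is the kind of comparison the "reliability approaches optimal" annotation refers to.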

26

Summary and Question

• Network virtualization to “cheat” on scalability tradeoffs

– Path diversity vs. scalability

– Efficiency vs. scalability

– Convergence vs. scalability

• What are the common abstractions, functions, etc. that the substrate should provide?

– Slicing

– Nesting

– “Knobs” for granularity control

– …?