Dynamics of Hot-Potato Routing in IP Networks
Jennifer Rexford, AT&T Labs—Research
http://www.research.att.com/~jrex
Joint work with Renata Teixeira, Aman Shaikh, and Timothy Griffin
Network “Operations” Research
Understand key operations tasks
- Traffic engineering
- Planned maintenance
- Problem troubleshooting
Develop sound models and methods
- Problem the operator is trying to solve
- Underlying protocols and mechanisms
- Control knobs for the operator to tune
Incorporate network measurements
- Characterize system behavior
- Detect and diagnose problems
- Input into “what if” models
Outline
Internet routing
- Interdomain and intradomain routing
- Coupling due to hot-potato routing
Measuring hot-potato routing
- Measuring the two routing protocols
- Correlating the two data streams
Performance evaluation
- Characterization on AT&T’s network
- Implications on operational practices
Conclusions and future directions
Autonomous Systems
[Figure: a packet travels from a client to a web server across autonomous systems 1 through 7; the AS path is 6, 5, 4, 3, 2, 1. Annotations point out multiple links between ASes in the middle of the path.]
Interdomain Routing (BGP)
Border Gateway Protocol (BGP)
- IP prefix: block of destination IP addresses
- AS path: sequence of ASes along the path
Policy configuration by the operator
- Path selection: which of the paths to use?
- Path export: which neighbors to tell?
[Figure: AS 1 originates prefix 12.34.158.0/24 (containing host 12.34.158.5) and announces “I can reach 12.34.158.0/24” to AS 2, which announces “I can reach 12.34.158.0/24 via AS 1” to AS 3.]
Intradomain Routing (IGP)
Interior Gateway Protocol (OSPF and IS-IS)
- Shortest path routing based on link weights
- Routers flood link-state information to each other
- Routers compute the “next hop” to reach other routers
Link weights configured by the operator
- Simple heuristics: link capacity or physical distance
- Traffic engineering: tuning link weights to the traffic
[Figure: example network with link weights on each edge; the highlighted shortest path to the destination has cost 2+1+5 = 8.]
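To make the link-weight mechanics concrete, here is a minimal sketch (not from the talk) of the shortest-path computation an IGP router performs, using Dijkstra's algorithm; the topology and weights are hypothetical, chosen so one path cost matches the figure's 2+1+5 example.

```python
import heapq

def shortest_path_costs(graph, source):
    """Dijkstra's algorithm: IGP path cost from source to every router.

    graph: {router: {neighbor: link_weight}}
    """
    costs = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, router = heapq.heappop(heap)
        if cost > costs.get(router, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph[router].items():
            new_cost = cost + weight
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return costs

# Hypothetical topology: the shortest path A->B->C->D costs 2+1+5 = 8,
# echoing the figure's path-cost example.
graph = {
    "A": {"B": 2, "C": 4},
    "B": {"A": 2, "C": 1},
    "C": {"A": 4, "B": 1, "D": 5},
    "D": {"C": 5},
}
print(shortest_path_costs(graph, "A"))  # {'A': 0, 'B': 2, 'C': 3, 'D': 8}
```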
Two-Level Internet Routing
Hierarchical routing
- Intra-domain: metric based
- Inter-domain: reachability and policy
Design principles
- Scalability
- Isolation
- Simplicity of reasoning
[Figure: AS 1, AS 2, and AS 3, each running intra-domain routing (IGP) internally and inter-domain routing (BGP) between one another.]
Autonomous system (AS) = network with a unified administrative routing policy (e.g., AT&T, Sprint, UCSD)
Coupling: Hot-Potato Routing
[Figure: an ISP network where router Z can reach dst through exit X (IGP cost 10) or exit Y (IGP cost 9); a packet to dst leaves through the closer exit, and a change in the internal costs (e.g., 9 to 11) moves it to the other exit.]
Hot-potato routing = an ISP’s policy of routing to the closest exit point when there is more than one route to a destination
Triggers inside the ISP network: failure, planned maintenance, traffic engineering.
Consequences:
- Router CPU overload
- Transient forwarding instability
- Traffic shifts
- Inter-domain routing changes
Routes to thousands of destinations switch exit points!
BGP Decision Process
1. Ignore if exit point unreachable
2. Highest local preference
3. Lowest AS path length
4. Lowest origin type
5. Lowest MED (with same next-hop AS)
6. Lowest IGP cost to next hop <- hot potato
7. Lowest router ID of BGP speaker
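Because the process is a strict sequence of tiebreaks, it maps naturally onto a lexicographic sort key. A minimal sketch under simplified assumptions (real BGP compares MED only between routes from the same next-hop AS; here it is treated as a flat attribute, and the Route fields are an illustrative encoding, not a vendor API):

```python
from dataclasses import dataclass

@dataclass
class Route:
    reachable: bool   # step 1: is the exit point reachable via the IGP?
    local_pref: int   # higher is better
    as_path_len: int  # lower is better
    origin: int       # IGP=0 < EGP=1 < incomplete=2, lower is better
    med: int          # lower is better (simplified; see lead-in caveat)
    igp_cost: int     # lower is better -- the hot-potato step
    router_id: int    # lowest wins, final tiebreak

def decision_key(r):
    # Negate local_pref so that min() prefers higher values.
    return (-r.local_pref, r.as_path_len, r.origin, r.med,
            r.igp_cost, r.router_id)

def best_route(routes):
    reachable = [r for r in routes if r.reachable]  # step 1
    return min(reachable, key=decision_key)

# Two routes tie on every BGP attribute, so the IGP cost decides (hot potato).
r1 = Route(True, 100, 3, 0, 0, igp_cost=10, router_id=1)
r2 = Route(True, 100, 3, 0, 0, igp_cost=9, router_id=2)
print(best_route([r1, r2]).router_id)  # 2 -- the closest exit point wins
```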
Hot-Potato Routing Model
Cost vector for Z: cX=10, cW=8, and cY=9
Egress set for dst: {X, Y}
Best route for Z: through Y, which is the closest egress
[Figure: router Z with IGP costs 10 to X, 9 to Y, and 8 to W; only X and Y are egress points for dst.]
Hot-potato change: a change in the cost vector causes a change in the best route
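A sketch of the model on this slide: the best route is the minimum-cost member of the egress set, not the overall closest router (W is closest to Z at cost 8, but is not an egress for dst).

```python
def best_egress(egress_set, cost_vector):
    """Pick the egress point with the lowest IGP cost (hot-potato routing)."""
    return min(egress_set, key=lambda router: cost_vector[router])

cost_vector = {"X": 10, "W": 8, "Y": 9}   # Z's IGP costs
egress_set = {"X", "Y"}                   # routers with a BGP route to dst

print(best_egress(egress_set, cost_vector))  # Y: W is closer but not an egress

# Hot-potato change: a shift in the cost vector changes the best route.
cost_vector["Y"] = 11
print(best_egress(egress_set, cost_vector))  # now X
```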
The Big Picture
[Figure: interdomain changes and cost vector changes overlap in hot-potato routing changes, i.e., interdomain changes caused by intradomain changes.]
Why Care about Hot Potatoes?
Understanding of Internet routing
- Frequency of hot-potato routing changes
- Influence on end-to-end performance
Operational practices
- Knowing when hot-potato changes happen
- Avoiding unnecessary hot-potato changes
- Analyzing externally-caused BGP updates
Distributed root cause analysis
- Each AS can tell which BGP updates it caused
- Someone should know why each change happened
Outline
Internet routing
- Interdomain and intradomain routing
- Coupling due to hot-potato routing
Measuring hot-potato routing
- Measuring the two routing protocols
- Correlating the two data streams
Performance evaluation
- Characterization on AT&T’s network
- Implications on network practices
Conclusions and future directions
Why is This So Darn Hard?
Noisy signals
- Multiple IGP messages per event
- Multiple BGP messages per change
- Large background of BGP updates
Routing protocols
- Protocols don’t divulge their reasons
- Highly configurable BGP policies
- Vendor-specific details (e.g., timers)
Monitoring limitations
- Limited number of vantage points
- Delivering the measurement data
- Time synchronization across collectors
Our Approach
Measure both protocols
- BGP and OSPF monitors
Correlate the two streams
- Match BGP updates with OSPF events
Analyze the interaction
[Figure: monitor M collects OSPF messages and BGP updates from routers X, Y, and Z in the AT&T backbone.]
Heuristic for Matching
- Classify BGP updates by possible OSPF causes
- Transform the stream of OSPF messages into routing changes
- Match BGP updates with OSPF events that happen close in time
[Figure: a timeline where raw OSPF messages (link failure, refresh, weight change) are reduced to routing changes (chg cost, del, chg cost) and matched against the stream of BGP updates that arrive close in time.]
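A sketch of the matching step under hypothetical data structures (the event encoding and the 180-second window are illustrative, not the talk's exact algorithm): each BGP routing change is matched to the closest-in-time OSPF routing change that could explain it.

```python
def match_updates(bgp_changes, ospf_changes, window=180.0):
    """Match each BGP routing change to a plausible OSPF cause.

    bgp_changes:  list of (timestamp, candidate_causes) pairs, where
                  candidate_causes is a set like {("CHG", "X"), ("CHG", "Y")}
    ospf_changes: list of (timestamp, event) pairs, e.g. (40.0, ("CHG", "X"))
    window:       max time difference in seconds (assumed value)
    """
    matches = []
    for bgp_time, candidates in bgp_changes:
        plausible = [(abs(bgp_time - t), t, ev) for t, ev in ospf_changes
                     if ev in candidates and abs(bgp_time - t) <= window]
        matches.append(min(plausible) if plausible else None)
    return matches

bgp = [(100.0, {("CHG", "X"), ("CHG", "Y")})]
ospf = [(40.0, ("CHG", "X")), (95.0, ("CHG", "Y"))]
print(match_updates(bgp, ospf))  # [(5.0, 95.0, ('CHG', 'Y'))]
```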
Pre-processing OSPF LSAs
Transform OSPF messages into routing changes from a router’s perspective
[Figure: from monitor M’s perspective in a four-router topology (M, X, Y, Z), OSPF LSAs become routing changes. A weight-change LSA (new weight 10) turns the path costs (X: 5, Y: 4) into (X: 5, Y: 7), emitting CHG Y, 7; an LSA delete removes X, emitting DEL X; an LSA add with weight 1 restores X at cost 5, emitting ADD X, 5.]
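One way to realize this transformation, sketched below as an assumption rather than the talk's exact pipeline: recompute path costs after each LSA and diff against the previous snapshot to emit ADD/DEL/CHG events.

```python
def ospf_routing_changes(old_costs, new_costs):
    """Diff two path-cost snapshots into routing changes (ADD/DEL/CHG)."""
    changes = []
    for router, cost in new_costs.items():
        if router not in old_costs:
            changes.append(("ADD", router, cost))
        elif old_costs[router] != cost:
            changes.append(("CHG", router, cost))
    for router in old_costs:
        if router not in new_costs:
            changes.append(("DEL", router))
    return changes

# The slide's sequence: a weight change, then a delete, then an add.
print(ospf_routing_changes({"X": 5, "Y": 4}, {"X": 5, "Y": 7}))  # [('CHG', 'Y', 7)]
print(ospf_routing_changes({"X": 5, "Y": 7}, {"Y": 7}))          # [('DEL', 'X')]
print(ospf_routing_changes({"Y": 7}, {"X": 5, "Y": 7}))          # [('ADD', 'X', 5)]
```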
Classifying BGP Updates
BGP update from Z:
- Announcement of (dst, X): can’t be caused by a cost change
- Withdrawal of (dst, Y): can’t be caused by a cost change
- Replacement of the route to dst with a different route through Y (same exit point): can’t be caused by a cost change
- Replacement with a route through a different exit point: see next slide!
[Figure: monitor M observes updates from router Z, which reaches dst through exits X and Y.]
Classifying BGP Updates
For a replacement that switches the exit point from Y to X, compare the old and new routes on the BGP attributes that precede the IGP-cost step:
- route via X is better: not explainable by a cost change
- route via X is worse: not explainable by a cost change
- routes are equally good: possibly caused by a cost change (CHG X? CHG Y?)
[Figure: monitor M observes router Z switching from exit Y to exit X for dst.]
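A sketch of the classification logic across these two slides, under a hypothetical route record (each route carries its exit point plus the tuple of BGP attributes compared before the IGP step):

```python
def possible_ospf_causes(old_route, new_route):
    """Which OSPF routing changes could explain this BGP update, if any?

    A route is (exit_point, bgp_attrs), where bgp_attrs encodes the decision
    attributes compared before the IGP-cost step (hypothetical encoding).
    Returns a set of candidate causes, empty if OSPF can't be the cause.
    """
    if old_route is None or new_route is None:
        return set()          # new announcement or withdrawal: not a cost change
    old_exit, old_attrs = old_route
    new_exit, new_attrs = new_route
    if new_exit == old_exit:
        return set()          # same exit point: not a cost change
    if new_attrs != old_attrs:
        return set()          # decided before the IGP step: not a cost change
    # Equally good up to the IGP tiebreak: a cost change to either
    # exit point could explain the switch.
    return {("CHG", new_exit), ("CHG", old_exit)}

# Z switches from exit Y to exit X with equally good BGP attributes:
print(possible_ospf_causes(("Y", (100, 3)), ("X", (100, 3))))
# {('CHG', 'X'), ('CHG', 'Y')} (set order may vary)
```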
The Role of Time
IGP link-state advertisements
- Multiple LSAs from a single physical event
- Group into a single cost vector change
BGP update messages
- Multiple BGP updates during convergence
- Group into a single BGP routing change
Matching IGP to BGP
- Avoid matching unrelated IGP and BGP changes
- Match related changes that are close in time
Characterize the measurement data to determine the right windows
[Figure: the time windows used: 10 sec (grouping LSAs), 70 sec (grouping BGP updates), 180 sec (matching IGP to BGP).]
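A sketch of the grouping step: messages whose inter-arrival gap stays below a threshold collapse into one event. The 10- and 70-second values below mirror the windows on this slide; the timestamps are made up.

```python
def group_into_events(timestamps, max_gap):
    """Group sorted timestamps into bursts separated by more than max_gap."""
    events, current = [], []
    for t in sorted(timestamps):
        if current and t - current[-1] > max_gap:
            events.append(current)
            current = []
        current.append(t)
    if current:
        events.append(current)
    return events

lsa_times = [0.0, 2.5, 4.0, 30.0]
print(group_into_events(lsa_times, max_gap=10))   # [[0.0, 2.5, 4.0], [30.0]]

bgp_times = [100.0, 150.0, 400.0]
print(group_into_events(bgp_times, max_gap=70))   # [[100.0, 150.0], [400.0]]
```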
Outline
Internet routing
- Interdomain and intradomain routing
- Coupling due to hot-potato routing
Measuring hot-potato routing
- Measuring the two routing protocols
- Correlating the two data streams
Performance evaluation
- Characterization on AT&T’s network
- Implications on network practices
Conclusions and future directions
Summary Results (June 2003)
High variability in % of BGP updates:

location       | min | max    | days > 10%
close to peers | 0%  | 3.76%  | 0
between peers  | 0%  | 25.87% | 5

One LSA can have a big impact:

location       | no impact | prefixes impacted
close to peers | 97.53%    | less than 1%
between peers  | 97.17%    | 55%
Delay for BGP Routing Change
Router vendor scan timer
- BGP table scan every 60 seconds
- OSPF changes arrive uniformly within the interval
Internal BGP hierarchy (route reflectors)
- Wait for another router to change its best route
- Introduces a wait for a second BGP scan
Transmitting many BGP messages
- Latency for transferring the data
We have seen delays of up to 180 seconds!
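A back-of-the-envelope sketch of where that number can come from: an OSPF change waits up to 60 seconds for the local BGP scan, up to another 60 seconds for a second scan behind a route reflector, plus the latency of transferring many BGP messages, which together can approach the observed 180 seconds. A small simulation under those assumptions (the uniform scan model and the 5-second transfer latency are assumptions, not measurements):

```python
import random

def bgp_change_delay(route_reflector=False, transfer_latency=0.0):
    """Simulated delay from an OSPF change to the resulting BGP update.

    Assumes a 60-second scan timer with the OSPF change arriving uniformly
    within the scan interval.
    """
    delay = random.uniform(0, 60)           # wait for the local BGP scan
    if route_reflector:
        delay += random.uniform(0, 60)      # wait for a second scan upstream
    return delay + transfer_latency

samples = [bgp_change_delay(route_reflector=True, transfer_latency=5)
           for _ in range(100_000)]
print(f"mean {sum(samples)/len(samples):.0f}s, max {max(samples):.0f}s")
# roughly: mean 65s, max approaching 125s before message-transfer delays
```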
Delay for BGP Change (1st Prefix)
[Figure: distribution of the delay from the OSPF change to the first BGP update, with annotations marking the Cisco BGP scan timer (60 seconds) and updates that wait for two BGP scans.]
iBGP Route Reflectors
[Figure: iBGP route reflector example. Z initially has routes (dst, Y) at IGP cost 18 and (dst, W) at cost 20; after an internal cost change the route via Y costs 21, so Z switches to W; only once the route reflector announces (dst, X) at cost 19 does Z settle on X, taking a second BGP scan.]
Scalability trade-off: less BGP state vs. more BGP updates from Z and longer convergence delay.
Transferring Multiple Prefixes
BGP Updates Over Prefixes
[Figure: cumulative % of BGP updates vs. % of prefixes, with curves for all updates, OSPF-triggered updates, and non-OSPF-triggered updates. OSPF-triggered BGP updates affect roughly 50% of prefixes nearly uniformly; the remaining prefixes have only one exit point. Contrast with non-OSPF-triggered BGP updates.]
Operational Implications
Forwarding plane convergence
- Accuracy of active measurements
Router proximity to exit points
- Likelihood of hot-potato routing changes
Cost in/out of links during maintenance
- Avoid triggering BGP routing changes
Forwarding Convergence
[Figure: routers R1 and R2 each have a path to dst; a link-weight change makes R2 prefer the path through R1.]
After the change, the scan process runs in R2 and R2 starts using R1 to reach dst. R1’s scan process can take up to 60 seconds to run, so R1 may still be forwarding to dst via R2. Packets to dst may be caught in a loop for up to 60 seconds!
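A sketch of the transient state: while R1's forwarding entry is stale, the two next-hop entries point at each other, and a simple walk of the forwarding tables exposes the loop (router names and tables are illustrative).

```python
def trace(fibs, start, dst, max_hops=16):
    """Follow next-hop entries toward dst, detecting forwarding loops."""
    hop, path = start, [start]
    while hop != dst:
        hop = fibs[hop][dst]
        if hop in path:
            return path + [hop], True    # loop detected
        path.append(hop)
        if len(path) > max_hops:
            break
    return path, False

# R2's scan has run (next hop R1), but R1's stale entry still points at R2.
fibs = {"R2": {"dst": "R1"}, "R1": {"dst": "R2"}}
print(trace(fibs, "R2", "dst"))   # (['R2', 'R1', 'R2'], True)

# After R1's scan runs (up to 60 seconds later), the loop disappears.
fibs["R1"]["dst"] = "dst"
print(trace(fibs, "R2", "dst"))   # (['R2', 'R1', 'dst'], False)
```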
Measurement Accuracy
Measurements of customer experience
- Probe machines have just one exit point!
[Figure: probe machines W1 and W2 attached to routers R1 and R2; during convergence their packets are caught in the loop to reach dst.]
Avoid Equal-distance Exits
[Figure: two configurations of router Z with exits X and Y toward dst. With equal IGP costs (10 and 10), small changes will make Z switch exit points to dst; with very unequal costs (1000 and 1), the choice is more robust to intra-domain routing changes.]
Careful Cost in/out Links
[Figure: router Z with exits X and Y toward dst; rather than removing a link at once, the operator raises its weight in stages (e.g., 5, 10, 100) to shift traffic gradually.]
Benefits:
- Traffic is more predictable
- Faster convergence
- Less impact on neighbors
Ongoing Work
Black-box testing of the routers
- Scan timer and its effects (forwarding loops)
- Vendor interactions (with Cisco)
Impact of IGP-triggered BGP updates
- Changes in the flow of traffic
- Forwarding loops during convergence
- Externally visible BGP routing changes
Improving isolation (cooling those potatoes!)
- Operational practices: preventing interactions
- Protocol design: weaken the IGP/BGP coupling
- Network design: internal topology/architecture
Thanks!
Any questions?
Exporting Routing Instability
[Figure: two scenarios for an AS with routers X, Y, and Z and destination dst. In one, an internal change moves the egress point and the AS sends a BGP announcement to its neighbor; in the other, the best route is unaffected: no change => no announcement.]
What to do?
Increase estimate for forwarding convergence
- For destinations/customers with multiple exit points
Extensions to measurement infrastructure
- Multiple access links for a probe machine
- Multiple probe machines with the same address
Better BGP implementation on the router
- Decrease the scan timer (maybe too much overhead?)
- Event-driven IGP/BGP implementation
Time Lag
OSPF-triggered BGP updates for June 25th, 2003
[Figure: cumulative % of BGP updates vs. (time BGP - time OSPF) in seconds. Annotations: most OSPF-triggered BGP updates lag from 10 to 60 seconds; ~15% of OSPF-triggered BGP updates in a day.]
Challenges
Lack of information in routing messages
- Routing protocols are designed to determine a path between two hosts, not to give the reason for a change
Example 1: BGP update caused by OSPF
[Figure: monitor M watches router Z, whose IGP cost to exit Y is 9 and to exit X is 10. OSPF reports CHG: X, 8, making X the closer exit, and M sees the BGP announcement (dst, X) replace (dst, Y).]
Example 2: BGP update NOT caused by OSPF
[Figure: the same setup, where M sees the identical BGP announcement (dst, X) replacing (dst, Y). In one step there is no OSPF message at all; in the next, a coincidental OSPF CHG: X, 8 arrives around the same time. From the messages alone, the monitor cannot tell whether OSPF caused the BGP update.]