Unicast Routing and BGP - Colorado State Universitycs557/slides/03.pdf · Unicast Routing and BGP...

59
Unicast Routing and BGP CSU CS557, Spring 2018 Instructor: Lorenzo De Carli (Slides by Christos Papadopoulos, remixed by Lorenzo De Carli)

Transcript of Unicast Routing and BGP - Colorado State Universitycs557/slides/03.pdf · Unicast Routing and BGP...

Unicast Routing and BGP

CSU CS557, Spring 2018 Instructor: Lorenzo De Carli

(Slides by Christos Papadopoulos, remixed by Lorenzo De Carli)

What is this lecture about?

• A general introduction to routing • A review of the BGP routing protocol • A discussion of BGP limitations

(assigned reading)

Forwarding V.S. Routing• Forwarding: the process of moving packets from

input to output based on: – the forwarding table – information in the packet

• Routing: process by which the forwarding table is built and maintained: – one or more routing protocols – procedures (algorithms) to convert routing info to

forwarding table

Forwarding Examples

• To forward unicast packets a router uses: – destination IP address – longest matching prefix in forwarding table

• To forward multicast packets: – source + destination IP address and incoming

interface – longest and exact match algorithms

Factors Affecting Routing

4

36

21

9

1

1D

A

FE

B

C

• Routing algorithms view the network as a graph

• Problem: find lowest cost path between two nodes

• Factors – static: topology – dynamic: load – policy

Two Main Approaches

• DV: Distance-vector protocols • LS: Link state protocols

Distance Vector Protocols

• Employed in the early Arpanet • Distributed next hop computation

– adaptive

• Unit of information exchange – vector of distances to destinations

• Distributed Bellman-Ford Algorithm

Distributed Bellman-FordStart Conditions: Each router starts with a vector of (zero) distances to all directly attached networks

Send step: Each router advertises its current vector to all neighboring routers.

Receive step: Upon receiving vectors from each of its neighbors, router computes its own distance to each neighbor. Then, for every network X, router finds that neighbor who is closer to X than any other neighbor. Router updates its cost to X. After doing this for all X, router goes to send step.

Example - Initial Distances

A

B

E

C

D

Info at node

ABCD

A B C

0 7 ~

7 0 1~ 1 0~ ~ 2

7

1

1

2

28

Distance to node

D

~

~20

E 1 8 ~ 2

1

8~20

E

E Receives D’s Routes

A

B

E

C

D

Info at node

ABCD

A B C

0 7 ~

7 0 1~ 1 0~ ~ 2

7

1

1

2

28

Distance to node

D

~

~20

E 1 8 ~ 2

1

8~20

E

E Updates Cost to C

A

B

E

C

D

Info at node

ABCD

A B C

0 7 ~

7 0 1~ 1 0~ ~ 2

7

1

1

2

28

Distance to node

D

~

~20

E 1 8 4 2

1

8~20

E

A Receives B’s Routes

A

B

E

C

D

Info at node

ABCD

A B C

0 7 ~

7 0 1~ 1 0~ ~ 2

7

1

1

2

28

Distance to node

D

~

~20

E 1 8 4 2

1

8~20

E

A Updates Cost to C

A

B

E

C

D

Info at node

ABCD

A B C

0 7 8

7 0 1~ 1 0~ ~ 2

7

1

1

2

28

Distance to node

D

~

~20

E 1 8 4 2

1

8~20

E

A Receives E’s Routes

A

B

E

C

D

Info at node

ABCD

A B C

0 7 8

7 0 1~ 1 0~ ~ 2

7

1

1

2

28

Distance to node

D

~

~20

E 1 8 4 2

1

8~20

E

A Updates Cost to C and D

A

B

E

C

D

Info at node

ABCD

A B C

0 7 5

7 0 1~ 1 0~ ~ 2

7

1

1

2

28

Distance to node

D

3

~20

E 1 8 4 2

1

8~20

E

Final Distances

A

B C

D

Info at node

ABCD

A B C

0 6 5

6 0 15 1 03 3 2

7

1

1

2

28

Distance to node

D

3

320

E 1 5 4 2

1

5420

E

E

Final Distances After Link Failure

A

B C

D

Info at node

ABCD

A B C

0 7 87 0 18 1 010 3 2

7

1

1

2

28

Distance to node

D

10320

E 1 8 9 11

189110

E

E

View From a Node

A

B

E

C

D

dest

ABCD

A B D

1 14 5

7 8 56 9 44 11 2

7

1

1

2

28

Next hop

E’s routing table

The Bouncing Effect

A

25

1

1

B

C

BC 2

1

dest costAC 1

1

dest cost

AB 1

2

dest cost

C Sends Routes to B

A

25 1

B

C

BC 2

1

dest costAC 1

~

dest cost

AB 1

2

dest cost

B Updates Distance to A

A

25 1

B

C

BC 2

1

dest costAC 1

3

dest cost

AB 1

2

dest cost

B Sends Routes to C

A

25 1

B

C

BC 2

1

dest costAC 1

3

dest cost

AB 1

4

dest cost

C Sends Routes to B

A

25 1

B

C

BC 2

1

dest costAC 1

5

dest cost

AB 1

4

dest cost

How Are These Loops Caused?

• Observation 1: – B’s metric increases

• Observation 2: – C picks B as next hop to A – But, the implicit path from C to A

includes itself!

Solution 1: Holddowns

• If metric increases, delay propagating information – in our example, B delays advertising route – C eventually thinks B’s route is gone,

picks its own route – B then selects C as next hop

• Adversely affects convergence

Other “Solutions”

• Split horizon – B does not advertise route to C – In general, prevents routes from being advertised

on the interface through which they were learned

• Poisoned reverse – B advertises route to C with infinite distance

Works for two-node loops – does not work for loops with more nodes

Example Where Split Horizon Fails

1

11

1

A B

C

D

1. When link breaks, C marks D as unreachable and reports that to A and B.

2. Suppose A learns it first. A now thinks best path to D is through B. A reports D unreachable to B and a route of cost=3 to C.

3. C thinks D is reachable through A at cost 4 and reports that to B.

4. B reports a cost 5 to A who reports new cost to C.

5. etc...

D: inf

D: 3D: 4

D: 5

D: 6

Avoiding the Bouncing Effect

Select loop-free paths • One way of doing this:

– each route advertisement carries entire path – if a router sees itself in path, it rejects the

route BGP does it this way Space proportional to diameter

Cheng, Riley et al

Distance Vector in Practice

• RIP and RIP2 – uses split-horizon/poison reverse

• BGP/IDRP – propagates entire path – path also used for effecting policies

BG and Inter-domain Routing

• BGP

30

Sources

– John Stewart III: “BGP4 - Inter-domain routing in the Internet”

– RFC1771: main BGP RFC – RFC1772-3-4: application, experiences, and

analysis of BGP – RFC1965: AS confederations for BGP – Christian Huitema: “Routing in the Internet”,

chapters 8 and 9 – Cisco tutorial on line

31

Autonomous Systems

• What is an AS? – a set of routers under a single technical administration

(A single organization may own multiple ASes) – a set of advertised prefixes – uses an interior gateway protocol (IGP) and common

metrics to route packets within the AS – uses an exterior gateway protocol (EGP) to route

packets to other AS’s • AS may use multiple IGPs and metrics, but appears

as single AS to other AS’s

32

Example

1 2

3

1.11.2

2.1 2.2

3.1 3.2

2.2.1

44.1 4.2

5

5.1 5.2

EGP

IGP

EGPEGP

IGP

IGP

IGPIGP

EGPEGP

33

History

• Mid-80s: EGP – reachability protocol (no shortest path) – did not accommodate cycles (tree topology) – evolved when all networks connected to

ARPANET • Limited size network topology • Result: BGP introduced as routing protocol

34

Path Vectors

• Each routing update carries the entire path as a sequence of ASes

• Loops are detected as follows: – when AS gets route check if AS already in path

• if yes, reject route • if no, add self and (possibly) advertise route further

• Advantage: – metrics are local - AS chooses path, protocol

ensures no loops35

Interconnecting BGP Peers

• BGP uses TCP to connect peers (port 179) • Advantages:

– BGP much simpler – no need for periodic refresh - routes are valid

until withdrawn, or the connection is lost – incremental updates

• Disadvantages – congestion control on a routing protocol?

36

Hop-by-hop Model

• BGP advertises to neighbors only those routes that it uses – consistent with the hop-by-hop Internet

paradigm – e.g., AS1 cannot tell AS2 to route to other ASes

in a manner different than what AS2 has chosen (need source routing for that)

37

AS Categories

– Stub: an AS that has only a single connection to one other AS - carries only local traffic

– Multi-homed: an AS that has connections to more than one AS, but does not carry transit traffic

– Transit: an AS that has connections to more than one AS, and carries both transit and local traffic (under certain policy restrictions)

38

AS Categories

AS1

AS2AS3AS1

AS2

AS3AS1

AS2

Stub

Multi-homed

Transit

39

Policy With BGP

• BGP provides capability for enforcing various policies

• Policies are not part of BGP: they are provided to BGP as configuration information

• BGP enforces policies by choosing paths from multiple alternatives and controlling advertisement to other AS’s

40

Examples of BGP Policies

• A multi-homed AS refuses to act as transit – limit path advertisement

• A multi-homed AS can become transit for some AS’s – only advertise paths to some AS’s

• An AS can favor or disfavor certain AS’s for traffic transit from itself – Pick appropriate routes by examining path vectors

41

BGP Is NOT Needed If:

• Single homed network (stub)

• AS does not provide downstream routing

• AS uses a default route

42

Path Selection Criteria

• Information based on path attributes • Attributes + external (policy) information • Examples:

– hop count – policy considerations – presence or absence of certain AS – path origin (EGP, IGP) – link dynamics (flapping, stable)

43

Path Attributes

• Categories (recall flags): – well-known mandatory (passed on) – well-known discretionary (passed on) – optional transitive (passed on) – optional non-transitive (if unrecognized, not passed

on) • Optional attributes allow for BGP extensions • Path attributes are used by routers to select routes • Many possible attributes; we are only going to

discuss one example44

AS_PATH Attribute

• Well-known, mandatory attribute • Important components:

– list of traversed AS’s • If forwarding to internal peer:

– do not modify AS_PATH attribute • If forwarding to external peer:

– prepend self into the path

45

AS_PATH Attribute

46

Delayed Internet Routing Convergence

• Labovitz00

47

BGP Problems: Delayed Convergence

• Question: – How long does it take for a route to fail-over?

• How to answer this question: – Experimental methodology – Explanation of observation using simple model

48

Key Idea

• Study and understand BGP convergence time –simulation –measurement

• Suggests bounds of O(n!) and O((n-3)*30s)

49

Why Is Convergence Important?

• Robustness – PSTN (telephone) fail-over times are in

milliseconds – Internet fail-over times are in 10s of seconds

50

Methodology

• Two years of traces • Introduce artificial faults across Internet

– but for only their AS, of course! – failures, repairs, fail-over

• Measure: – Tup: new route, Tdown: old route goes away, Tshort

& Tlong: a route shrinks or grows

51

Methodology

Internet-scale experimentation

52

Cum

ulat

ive

Perc

enta

ge o

f Eve

nts

0

25

50

75

100

Seconds Until Convergence0 20 40 60 80 100 120 140 160

TupTshortTlongTdown

Short

->Lon

g Fail

-Ove

r

New

Rou

te

Long

->Sh

ort F

ail-o

ver

Tup

and

Tsho

rt

Failu

re

• Less than half of Tdown events converge within two minutes • Long tailed distribution (up to 15 minutes)

Observed Convergence Latency

53

Impact on Traffic([Labovitz00a] figure 4a)

Why does loss go up?⇒Because there are route loops in the net causing packet drops.

54

How To Tell What’s Going On?

• Simulate BGP – model one router per AS – assume full routing mesh – ignore latency – synchronous processing via global queue ⇒simple model that captures key details

55

BGP Convergence Example

• What’s the problem here? • BGP’s loop detection prevents looped paths (e.g., in step 2 AS1 removes path

through 0 as it now includes 1 itself) • However, it does not prevent AS’es learning incorrect routes from neighbors

(e.g., in step 3 AS2 learns the non-existent 10R path from 1 and adds it to its routing table)

• Worst case (fully connected network graph): BGP tries all possible n! network paths before giving up and concluding that the route is not reachable

AS 0

AS 1AS 2

R

2 -> 0 21R2 -> 1 21R

2 -> 0 21R2 -> 1 21R

2 -> 0 21R2 -> 1 21R

2 -> 0 21R2 -> 1 21R

2 -> 0 21R2 -> 1 21R

0 -> 1 02R0 -> 2 02R

0 -> 1 02R0 -> 2 02R

2 -> 0 201R2 -> 1 201R

0 -> 2 W0 -> 1 W 1 -> 0 120R

1 -> 2 120R

2 -> 0 201R2 -> 1 201R

0 -> 1 02R0 -> 2 02R

2 -> 0 201R2 -> 1 201R 0 -> 2 W

0 -> 1 W 1 -> 0 120R1 -> 2 120R 0 -> 2 012R

0 -> 1 012R

0 -> 2 W0 -> 1 W 1 -> 0 120R

1 -> 2 120R 0 -> 2 012R0 -> 1 012R

Routing Tables

0(*R, 1R, 2R) 1(0R, *R, 2R) 2(0R, 1R, *R)

steady state

Messages Processing Messages Queued in System

1 -> 0 10R 1 -> 2 10R

0

R withdraws its route1 0 -> 2 01R 0 -> 1 01R

1 -> 0 10R 1 -> 2 10R

1 -> 0 12R1 -> 2 12R2

1 and 2 receive new announcement from 0

1 -> 0 10R 1 -> 2 10R0(-, -, *2R) 1(-, -, *2R) 2(*01R, 10R, -)

0 -> 1 01R 0 -> 2 01R

R -> 0 WR -> 1 WR -> 2 W

1 -> 0 12R1 -> 2 12R3

0 and 2 receive new announcement from 1

1 -> 2 12R1 -> 0 12R

4

1 -> 0 12R1 -> 2 12R5

0 and 1 receive new announcement from 2

0 and 2 receive new announcement from 1

6

N/A

N/A

N/A

N/A

N/A

N/A

N/A

0(-, *1R, 2R) 1(*0R, -, 2R) 2(*0R, 1R, -)

0(-, *1R, 2R) 1(-, -, *2R) 2(01R, *1R, -)

2 -> 0 20R2 -> 1 20R

2 -> 0 20R2 -> 1 20R

2 -> 0 20R2 -> 1 20R

2 -> 0 20R2 -> 1 20R

0 and 1 receive new announcement from 2

(steps omitted)

48 N/A 0(-, -, -) 1(-, -, -) 2(-, -, -)steady state

0(-, -, -) 1(-, -, *20R) 2(*01R, 10R, -)

0 -> 1 02R0 -> 2 02R

2 -> 0 201R

0(-, *12R, -) 1(-, -, *20R) 2(*01R, -, -)

0(-, *12R, 21R) 1(-, -, -) 2(*01R, -, -)

2 -> 1 201R

1 -> 0 W1 -> 2 W

Stage Time

5.1 Upper Bound on Convergence

182

Does this explain measurements?

• Tup/Tshort converge quickly because they shorten path length and therefore are quickly accepted

• Tdown/Tlong converge slowly because BGP tries hard to find all alternatives

57

What can be done about this?

• Route advertisement delay helps alleviate the problem (i.e. minimum waiting times between two successive advertisements of routes to a particular AS)

• Could do loop detection at sender side and not just receiver side (i.e., if a route goes through AS N, do not advertise that route to AS N)

58

See you on Thursday!

Cisco ASR 9000

Cisco router processor