Inter-Domain Routing: BGP

Brad Karp
UCL Computer Science

(drawn mostly from lecture notes by Hari Balakrishnan and Nick Feamster, MIT)

CS 6007/GC15/GA07
18th March, 2009

Outline

• Context: Inter-Domain Routing
• Relationships between ASes
• Enforcing Policy, not Optimality
• BGP Design Goals
• BGP Protocol
• eBGP and iBGP
• BGP Route Attributes
• Synthesis: Policy through Route Attributes

Context: Inter-Domain Routing

• So far, have studied intra-domain routing
  – Domain: group of routers owned by a single entity, typically numbering at most 100s
  – Distance Vector, Link State protocols: types of Interior Gateway Protocol (IGP)
• Today’s topic: inter-domain routing
  – Routing protocol that binds domains together into global Internet
  – Border Gateway Protocol (BGP): type of Exterior Gateway Protocol (EGP)

Context: Why Another Routing Protocol?

• Scaling challenge:
  – millions of hosts on global Internet
  – ultra-naïve approach: use DV or LS routing, each 32-bit host address is a destination
  – naïve approach: use DV or LS routing, each subnet’s address prefix (i.e., Ethernet broadcast domain) is a destination
  – DV and LS cannot scale to these levels
    • prohibitive message complexity for LS flooding
    • loops and slow convergence for DV
    • keeping routes current costs traffic proportional to product of number of nodes and rate of topological change

Context: Scaling Beyond the Domain

• Address allocation challenge:
  – Each host on Internet must have unique 32-bit IP address
  – How to enforce global uniqueness?
  – Onerous to consult central authority for each new host
• Hierarchical addressing: solves scaling and address allocation challenges

Context: Hierarchical Addressing

• Divide 32-bit IP address hierarchically
  – e.g., 128.16.64.200 is host at UCL
  – e.g., 128.16.64 prefix is UCL CS dept
  – e.g., 128.16 prefix is all of UCL
  – destination is a prefix
  – writing prefixes:
    • 128.16/16 means “high 16 bits of 128.16.x.y”
    • netmask 255.255.0.0 means “to find prefix of 32-bit address, bit-wise AND 255.255.0.0 with it”
  – prefixes need not be multiples of 8 bits long
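
The prefix and netmask notation above can be checked with a few lines of Python. This is a minimal sketch using the standard `ipaddress` module; the helper name `prefix_of` and the /18 example are illustrative assumptions, not part of the original slides.

```python
import ipaddress

def prefix_of(address: str, netmask: str) -> str:
    """Hypothetical helper: bit-wise AND a 32-bit address with a netmask."""
    addr = int(ipaddress.IPv4Address(address))
    mask = int(ipaddress.IPv4Address(netmask))
    return str(ipaddress.IPv4Address(addr & mask))

# 128.16/16 written out in full: the high 16 bits identify the prefix.
ucl = ipaddress.ip_network("128.16.0.0/16")
host = ipaddress.ip_address("128.16.64.200")

host_prefix = prefix_of("128.16.64.200", "255.255.0.0")
in_ucl = host in ucl

# A prefix length that is not a multiple of 8 (illustrative value only).
odd = ipaddress.ip_network("128.16.64.0/18")
odd_mask = str(odd.netmask)
```

Here `host_prefix` is the 128.16 prefix in dotted form, and `odd_mask` shows that an 18-bit prefix simply has a netmask whose third byte is only partly set.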

Hierarchical Addressing: Pro

• Routing protocols generally incur cost that increases with number of destinations
  – Hierarchical addresses aggregate
  – Outside UCL, single prefix 128.16 can represent thousands of hosts on UCL network
  – End result: “reduces” number of destinations in global Internet routing system
• Centralized address allocation easier for smaller user/host population
  – Hierarchical addresses assure global uniqueness with only local coordination
  – Inside UCL, local authority can allocate low-order 16 bits of host IP addresses under 128.16 prefix
  – End result: decentralized unique address allocation

Hierarchical Addressing: Con

• Inherent loss of information from global routing protocol, hence less optimal routes
  – Nodes outside UCL know nothing about UCL internal topology
  – UCL host in Antarctica has 128.16 prefix, so all traffic to it must be routed via London
• Host addresses indicate both host identity and network attachment point
  – Suppose I move my UCL laptop to Berkeley
  – IP address must change to a Berkeley one, so that it aggregates under Berkeley IP prefix!

Context: Autonomous Systems

• A routing domain is called an Autonomous System (AS)
• Each AS known by unique AS number (16-bit originally; 32-bit AS numbers are now also in use)
• IGPs (e.g., DV, LS) route among individual subnets
• EGPs (e.g., BGP) route among ASes
• AS owns one or handful of address prefixes; allocates addresses under those prefixes
• AS typically a commercial entity or other organization
• ASes often competitors (e.g., different ISPs)

Global Internet Routing: Naïve View

• Find globally shortest paths

• Dense connectivity with many redundant paths

• Route traffic cooperatively onto lightly loaded paths

No correspondence to reality!

Global Internet Routing, Socialist Style

• Multiple, interconnected ISPs
• ISPs all equal:
  – in how connected they are to other ISPs
  – in geographic extent of their networks

Little correspondence to reality!

Global Internet Routing: Capitalist Style

• Tiers of ISPs:
  – Tier 3: local geographically, end customers
  – Tier 2: regional geographically
  – Tier 1: global geographically, ISP customers, no default routes
• Each ISP an AS, runs own IGP internally
• AS operator sets policies for how to route to others, how to let others route to his AS

Reality!

AS-AS Relationships: Customers and Providers

• Smaller ASes (corporations, universities) typically purchase connectivity from ISPs
• Regional ISPs typically purchase connectivity from global ISPs
• Each such connection has two roles:
  – Customer: smaller AS paying for connectivity
  – Provider: larger AS being paid for connectivity
• Other possibility: ISP-to-ISP connection

AS-AS Relationship: Transit

• Provider-Customer AS-AS connections: transit
• Provider allows customer to route to (nearly) all destinations in its routing tables
• Transit nearly always involves payment from customer to provider

AS-AS Relationship: Peering

• Peering: two ASes (usually ISPs) mutually allow one another to route to some of the destinations in their routing tables
• Typically these are their own customers (to whom they provide transit)
• By contract, but usually no money changes hands, so long as traffic ratio is narrower than, e.g., 4:1

Financial Motives: Peering and Transit

• Peering relationship often between competing ISPs
• Incentives to peer:
  – Typically, two ISPs notice their own direct customers originate a lot of traffic for the other
  – Each can avoid paying transit costs to others for this traffic; shunt it directly to one another
  – Often better performance (shorter latency, lower loss rate) as avoid transit via another provider
  – Easier than stealing one another’s customers
• Tier 1s must typically peer with one another to build complete, global routing tables

Financial Motives: Peering and Transit (cont’d)

• Disincentives to peer:
  – Economic disincentive: transit lets ISP charge customer; peering typically doesn’t
  – Contracts must be renegotiated often
  – Need to agree on how to handle asymmetric traffic loads between peers

The Meaning of Advertising Routes

• When AS A advertises a route for destination D to AS B, it effectively offers to forward all traffic from AS B to D
• Forwarding traffic costs bandwidth
• ASes strongly motivated to control which routes they advertise
  – no one wants to forward packets without being compensated to do so
  – e.g., when peering, only let neighboring AS send to specific own customer destinations enumerated in peering contract

Advertising Routes for Transit Customers

• ISP motivated to advertise routes to its own customers to its transit providers
  – Customers paying to be reachable from global Internet
  – More traffic to customer means faster link customer must buy
• If ISP hears route for its own customer from multiple neighbors, should favor advertisement from own customer

Routes Heard from Providers

• If ISP hears routes from its provider (via a transit relationship), to whom does it advertise them?
  – Not to ISPs with peering relationships; they don’t pay, so no motivation to provide transit service for them!
  – To own customers, who pay to be able to reach global Internet

Example: Routes Heard from Providers

• ISP P announces route to C’P, own customer, to X

• X doesn’t announce C’P to Y or Z; no revenue from peering

• X announces C’P to Ci; they’re paying to be able to reach everywhere

Routes Advertised to Peers

• Which routes should an ISP advertise to ASes with whom it has peering relationships?
  – Routes for all own downstream transit customers
  – Routes to ISP’s own addresses
  – Not routes heard from upstream transit provider of ISP; peer might route via ISP for those destinations, but doesn’t pay
  – Not routes heard from other peering relationships (same reason!)

Example: Routes Advertised to Peers

• ISP X announces Ci to Y and Z

• ISP X doesn’t announce routes heard from ISP P to Y or Z

• ISP X doesn’t announce routes heard from ISP Y to ISP Z, or vice-versa

Route Export: Summary

• ISPs typically provide selective transit
  – Full transit (export of all routes) for own transit customers in both directions
  – Some transit (export of routes between mutual customers) across peering relationship
  – Transit only for transit customers (export of routes to customers) to providers
• These decisions about what routes to advertise motivated by policy (money), not by optimality (e.g., shortest paths)
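
The export rules summarized above can be written as a single predicate. This is a rough sketch rather than any router's actual configuration language; the relationship labels and the function name `should_export` are hypothetical, and treating the ISP's own addresses as exportable to providers is an assumption (the slides state it explicitly only for peers).

```python
# Hypothetical labels for where a route was learned / who the neighbor is.
OWN, CUSTOMER, PEER, PROVIDER = "own", "customer", "peer", "provider"

def should_export(learned_from: str, neighbor: str) -> bool:
    """Decide whether a route learned from `learned_from` is advertised to `neighbor`."""
    if neighbor == CUSTOMER:
        # Customers pay to reach everywhere, so they hear all routes.
        return True
    # Peers and providers do not pay this ISP: advertise only its own
    # addresses and its customers' routes (assumption: OWN goes to providers too).
    return learned_from in (OWN, CUSTOMER)

# Example: a route heard from a provider goes to customers but not to peers.
to_customer = should_export(PROVIDER, CUSTOMER)
to_peer = should_export(PROVIDER, PEER)
```

Writing the policy this way makes the economic logic visible: the only neighbors that ever receive provider or peer routes are the ones paying for transit.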

Route Import

• Router may hear many routes to same destination network
• Identity of advertiser very important
• Suppose router hears advertisement to own transit customer from other AS
  – Shouldn’t route via other AS; longer path!
  – Customer routes higher priority than routes to same destination advertised by providers or peers
• Routes heard over peering higher priority than provider routes
  – Peering is free; you pay provider to forward via them
• customer > peer > provider
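
The import ordering customer > peer > provider amounts to ranking candidate routes by who advertised them. A minimal sketch, assuming each candidate is a hypothetical `(neighbor_name, relationship)` pair:

```python
# Lower rank means more preferred: customer > peer > provider.
RANK = {"customer": 0, "peer": 1, "provider": 2}

def preferred(candidates):
    """Pick the candidate whose advertiser has the most preferred relationship."""
    return min(candidates, key=lambda route: RANK[route[1]])

# Three advertisements heard for the same destination (names are illustrative).
heard = [("P", "provider"), ("Y", "peer"), ("C", "customer")]
choice = preferred(heard)
```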

Border Gateway Protocol (BGP): Design Goals

• Scalability in number of ASes
• Support for policy-based routing
  – tagging of routes with attributes
  – filtering of routes
• Cooperation under competitive pressure
  – BGP designed to run on successor to NSFnet, the former single, government-run backbone

BGP Protocol

• BGP runs over TCP, port 179
• Router connects to other router, sends OPEN message
• Both routers exchange all active routes in their tables (possibly minutes, depending on routing table sizes)
• In steady state, two main message types:
  – announcements: changes to existing routes or new routes
  – withdrawals: retraction of previously advertised route
• No periodic announcements needed; TCP provides reliable delivery
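
Because updates are incremental, a router only needs to keep, per neighbor, a table that announcements add to or overwrite and withdrawals delete from. The sketch below illustrates that bookkeeping only; it is not the BGP wire format, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class NeighborTable:
    """Routes currently advertised by one neighbor: prefix -> attributes."""
    routes: dict = field(default_factory=dict)

    def announce(self, prefix: str, attributes: dict) -> None:
        # A new route, or a change to an existing one, replaces the entry.
        self.routes[prefix] = attributes

    def withdraw(self, prefix: str) -> None:
        # Retract a previously advertised route; ignore unknown prefixes.
        self.routes.pop(prefix, None)

table = NeighborTable()
table.announce("128.16.0.0/16", {"ASPATH": [64512, 64513]})
table.announce("128.16.0.0/16", {"ASPATH": [64512]})  # changed route
count_after_announce = len(table.routes)
table.withdraw("128.16.0.0/16")
count_after_withdraw = len(table.routes)
```

No timers or refreshes appear here: with reliable delivery, an entry stays valid until it is explicitly changed or withdrawn.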

BGP Protocol (cont’d)

• BGP doesn’t chiefly aim to compute shortest paths (or minimize other metric, as do DV, LS)
• Chief purpose of BGP is to announce reachability, and enable policy-based routing
• BGP announcement:
  – IP prefix: [Attribute 0] [Attribute 1] […]

eBGP and iBGP

• eBGP: external BGP advertises routes between ASes

• iBGP: internal BGP propagates external routes throughout receiving AS

eBGP and iBGP (cont’d)

• Each eBGP participant hears different advertisements from neighboring ASes
• Must propagate routes learned via eBGP throughout AS
• Design goals:
  – Loop-free forwarding: forwarding paths over routes learned via eBGP should not loop
  – Complete visibility: all routers within AS must choose same, best route to destination learned via eBGP

Within AS1, choosing external route to destination in AS2 amounts to choosing egress router within AS1

Simple iBGP: Full Mesh

• How to achieve complete visibility?
  – Push all routes learned via eBGP to all internal routers using iBGP
• Full Mesh: each eBGP router floods routes it learns to all other routers in AS
• Flooding done over TCP, using intra-AS routing provided by IGP (e.g., link state routing)
• Pro: simple
• Con: scales badly in intra-AS router count: O(e² + ei) iBGP sessions (where e eBGP routers, i iBGP routers)
• More scalable iBGP uses route reflectors or confederations; details in lecture notes
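
To see where the O(e² + ei) bound comes from, count the sessions directly. The sketch assumes that every pair of eBGP routers shares one session and that each eBGP router also has one session to each of the i purely internal routers; the function name is hypothetical.

```python
def full_mesh_sessions(e: int, i: int) -> int:
    """iBGP sessions in a full mesh with e eBGP routers and i internal routers."""
    between_border_routers = e * (e - 1) // 2   # the e^2 term
    border_to_internal = e * i                  # the e*i term
    return between_border_routers + border_to_internal

small = full_mesh_sessions(4, 20)
doubled = full_mesh_sessions(8, 20)
```

Doubling the number of eBGP routers from 4 to 8 more than doubles the session count, which is the scaling problem that route reflectors and confederations address.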

Synthesis: Routing with IGP + iBGP

• Every router in AS now learns two routing tables
  – IGP (e.g., link state) table: routes to every router within AS, via interface
  – EGP (e.g., iBGP) table: routes to every prefix in global Internet, via egress router IP
• Produce one integrated forwarding table
  – All IGP entries kept as-is
  – For each EGP entry
    • find next-hop interface i for egress router IP in IGP table
    • add entry: <foreign prefix, i>
  – End result: O(prefixes) entries in all routers’ tables
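
The merge of the two tables is a lookup per EGP entry. A minimal sketch, assuming both tables are plain dictionaries; the addresses and interface names are made up for illustration.

```python
def build_forwarding_table(igp: dict, egp: dict) -> dict:
    """Combine IGP (router IP -> interface) and EGP (prefix -> egress router IP)."""
    table = dict(igp)  # all IGP entries kept as-is
    for prefix, egress_ip in egp.items():
        # Resolve the egress router to the local next-hop interface.
        table[prefix] = igp[egress_ip]
    return table

# Hypothetical tables for one router inside the AS.
igp_table = {"10.0.0.1": "eth0", "10.0.0.2": "eth1"}
egp_table = {"128.16.0.0/16": "10.0.0.2", "18.0.0.0/8": "10.0.0.1"}

forwarding = build_forwarding_table(igp_table, egp_table)
```

The result has one entry per internal router plus one per foreign prefix, so its size is dominated by the number of prefixes.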

Using Route Attributes

• Recall: BGP route advertisement is simply:
  – IP Prefix: [Attribute 0] [Attribute 1] […]
• Administrators enforce policy routing using attributes:
  – filter and rank routes based on attributes
  – modify “next hop” IP address attribute
  – tag a route with attribute to influence ranking and filtering of route at other routers

NEXT HOP Attribute

• Indicates IP address of next-hop router
• Modified as routes are announced
  – eBGP: when border router announces outside of AS, changes to own IP address
  – iBGP: when border router disseminates within AS, changes to own IP address
  – iBGP: any iBGP router that repeats route to other iBGP router leaves unchanged

ASPATH Attribute: Path Vector Routing

• Contains full list of AS numbers along path to destination prefix
• Ingress router prepends own AS number to ASPATH of routes heard over eBGP
• Functions like distance vector routing, but with explicit enumeration of AS “hops”
• Barring local policy settings, shorter ASPATHs preferred to longer ones
• If router rejects routes that contain own AS number, it cannot choose route that loops among ASes!
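
The loop-avoidance rule is a membership test on the ASPATH. A small sketch, with hypothetical function names and AS numbers drawn from the private-use range:

```python
def accept(as_path: list, my_as: int) -> bool:
    """Reject any route whose ASPATH already contains our own AS number."""
    return my_as not in as_path

def prepend(as_path: list, my_as: int) -> list:
    """Return a new ASPATH with our AS number added at the front."""
    return [my_as] + as_path

MY_AS = 64512
heard = [64513, 64514]          # path heard from a neighbor
looping = [64513, 64512, 64514]  # path that already passed through us

ok = accept(heard, MY_AS)
rejected = not accept(looping, MY_AS)
extended = prepend(heard, MY_AS)
```

Because every AS on the path is listed explicitly, the check needs no distance metric or hold-down timer, unlike plain distance vector routing.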

MED Attribute: Choosing Among Multiple Exit Points

• ASes often connect at multiple points (e.g., global backbones)
• ASPATHs will be same length
• But AS’ administrator may prefer a particular transit point
  – …often the one that saves him money!
• MED Attribute: Multi-Exit Discriminator, allows choosing transit point between two ASes

MED Attribute: Example

• Provider P, customer C

• Source: Boston on P, Destination: San Francisco on C

• Whose backbone for cross-country trip?

• C wants traffic to cross country on P

MED Attribute: Example (cont’d)

• C adds MED attribute to advertisements of routes to D_SF
  – Integer cost
• C’s router in SF advertises MED 100; in BOS advertises 500
• P should choose route with lowest MED for destination D_SF
• Result: traffic crosses country on P
• AS need not honor MEDs from neighbor
• AS only motivated to honor MEDs from other AS with whom financial settlement in place; i.e., not done in peering arrangements
• Most ISPs prefer shortest-exit routing: get packet onto someone else’s backbone as quickly as possible
• Result: highly asymmetric routes! (why?)
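
The Boston-to-San-Francisco example can be replayed in a few lines. The MED values come from the slide; the IGP costs inside P are invented purely to illustrate shortest-exit routing, and the function name is hypothetical.

```python
# MEDs advertised by C's routers for destination D_SF (from the example).
med = {"SF": 100, "BOS": 500}

# Assumed IGP cost inside P from the Boston source to each exit point.
igp_cost_from_boston = {"SF": 3000, "BOS": 10}

def choose_exit(honor_med: bool) -> str:
    """Pick P's exit point: lowest MED if honored, else nearest exit."""
    metric = med if honor_med else igp_cost_from_boston
    return min(metric, key=metric.get)

with_med = choose_exit(honor_med=True)      # P carries traffic cross-country
without_med = choose_exit(honor_med=False)  # shortest-exit: hand off at once
```

If both ASes use shortest-exit routing, each direction of a conversation leaves the sender's network at the nearest exit, which suggests why the forward and reverse paths end up on different backbones.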

Synthesis: Multiple Attributes into Policy Routing

• How do attributes interact? Priority order:

Priority  Rule         Details
1         LOCAL PREF   Highest LOCAL PREF (e.g., prefer transit customer routes over peer and provider routes)
2         ASPATH       Shortest ASPATH length
3         MED          Lowest MED
4         eBGP > iBGP  Prefer routes learned over eBGP vs. over iBGP
5         IGP path     “Nearest” egress router
6         Router ID    Smallest router IP address
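
The priority order in the table behaves like a lexicographic comparison: each rule only matters when all earlier rules tie. The sketch below encodes exactly the six rules listed, as a sort key; the `Route` fields are hypothetical names, and deployed BGP implementations add further steps and restrictions that are not modeled here.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class Route:
    local_pref: int
    as_path: tuple
    med: int
    via_ebgp: bool
    igp_cost: int
    router_id: str

def sort_key(r: Route):
    """Smaller key means more preferred, following the six rules in order."""
    return (
        -r.local_pref,                          # 1: highest LOCAL PREF
        len(r.as_path),                         # 2: shortest ASPATH
        r.med,                                  # 3: lowest MED
        0 if r.via_ebgp else 1,                 # 4: eBGP over iBGP
        r.igp_cost,                             # 5: nearest egress router
        int(ipaddress.IPv4Address(r.router_id)),  # 6: smallest router ID
    )

def best(routes):
    return min(routes, key=sort_key)

# A customer route with a long path still beats a short provider route.
customer = Route(200, (64513, 64514, 64515), 0, True, 50, "10.0.0.2")
provider = Route(100, (64516,), 0, True, 5, "10.0.0.1")
winner = best([customer, provider])
```

Putting LOCAL PREF first is what lets policy override path length: ASPATH and later rules act only as tie-breakers among routes the operator values equally.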

Summary: Inter-Domain Routing with BGP

• Inter-domain routing chiefly concerned with policy, not optimality
  – Economic motivation: cost of carrying traffic
  – Different relationships demand different routing: customer-provider vs. peering
• BGP: Path-Vector inter-domain routing protocol
  – Scalable in number of ASes
  – Route attributes support policy routing
  – Loop-free at AS granularity
  – Shortest ASPATHs achieved, after policy enforced
• Behavior and configuration of BGP very complex and poorly understood; open research problem!