NANOG35-Multihoming

1 NANOG 35 BGP Multihoming Techniques Philip Smith <[email protected]> NANOG35 23-25 October 2005 Los Angeles

Transcript of NANOG35-Multihoming

NANOG 35
BGP Multihoming Techniques
Philip Smith
NANOG35, 23-25 October 2005, Los Angeles

Presentation Slides
  Available on ftp://ftp-eng.cisco.com/pfs/seminars/NANOG35-BGP-Multihoming.pdf
  And on the NANOG 35 meeting pages at http://www.nanog.org/mtg-0510/pdf/smith.pdf

Preliminaries
  Presentation has many configuration examples
    Uses Cisco IOS CLI
  Aimed at Service Providers
    Techniques can be used by many enterprises too
  Feel free to ask questions

BGP Multihoming Techniques
  Why Multihome?
  Definition & Options
  Preparing the Network
  Basic Multihoming
  Service Provider Multihoming
  Complex Cases & Caveats

Why Multihome?
  It's all about redundancy, diversity & reliability

Why Multihome?
  Redundancy
    One connection to internet means the network is dependent on:
      Local router (configuration, software, hardware)
      WAN media (physical failure, carrier failure)
      Upstream Service Provider (configuration, software, hardware)

Why Multihome?
  Reliability
    Business critical applications demand continuous availability
    Lack of redundancy implies lack of reliability implies loss of revenue

Why Multihome?
  Supplier Diversity
    Many businesses demand supplier diversity as a matter of course
    Internet connection from two or more suppliers
      With two or more diverse WAN paths
      With two or more exit points
      With two or more international connections
      Two of everything

Why Multihome?
  Not really a reason, but oft quoted
  Leverage:
    Playing one ISP off against the other for:
      Service Quality
      Service Offerings
      Availability

Why Multihome?
  Summary:
    Multihoming is easy to demand as requirement for any service provider or end-site network
    But what does it really mean:
      In real life?
      For the network?
      For the Internet?
    And how do we do it?

BGP Multihoming Techniques
  Why Multihome?
  Definition & Options
  Preparing the Network
  Basic Multihoming
  Service Provider Multihoming
  Complex Cases & Caveats

Multihoming: Definitions & Options
  What does it mean, what do we need, and how do we do it?

Multihoming Definition
  More than one link external to the local network
    two or more links to the same ISP
    two or more links to different ISPs
  Usually two external facing routers
    one router gives link and provider redundancy only

AS Numbers
  An Autonomous System Number is required by BGP
  Obtained from upstream ISP or Regional Registry (RIR)
    AfriNIC, APNIC, ARIN, LACNIC, RIPE NCC
  Necessary when you have links to more than one ISP or to an exchange point
  16 bit integer, ranging from 1 to 65534
    Zero and 65535 are reserved
    64512 through 65534 are called Private ASNs

Private-AS Application
  Applications
    An ISP with customers multihomed on their backbone (RFC2270)
    -or-
    A corporate network with several regions but connections to the Internet only in the core
    -or-
    Within a BGP Confederation
  [Diagram: AS 1880 and routers A, B, C, with private ASes 65001, 65002 and 65003; labels 193.0.32.0/24, 193.0.33.0/24, 193.1.34.0/24, 193.2.35.0/24 and 193.1.32.0/22 1880]

Private-AS Removal
  Private ASNs MUST be removed from all prefixes announced to the public Internet
    Include configuration to remove private ASNs in the eBGP template
  As with RFC1918 address space, private ASNs are intended for internal use
    They should not be leaked to the public Internet
  Cisco IOS
    neighbor x.x.x.x remove-private-AS

Configuring Policy
  Three BASIC Principles for IOS configuration examples throughout presentation:
    prefix-lists to filter prefixes
    filter-lists to filter ASNs
    route-maps to apply policy
  Route-maps can be used for filtering, but this is more advanced configuration

Policy Tools
  Local preference
    outbound traffic flows
  Metric (MED)
    inbound traffic flows (local scope)
  AS-PATH prepend
    inbound traffic flows (Internet scope)
  Communities
    specific inter-provider peering

Originating Prefixes: Assumptions
  MUST announce assigned address block to Internet
  MAY also announce subprefixes; reachability is not guaranteed
  Current RIR minimum allocation is /21
    Several ISPs filter RIR blocks on this boundary
    Several ISPs filter the rest of address space according to the IANA assignments
    This activity is called Net Police by some

Originating Prefixes
  The RIRs publish their minimum allocation sizes per /8 address block
    AfriNIC: www.afrinic.net/docs/policies/afpol-v4200407-000.htm
    APNIC: www.apnic.net/db/min-alloc.html
    ARIN: www.arin.net/reference/ip_blocks.html
    LACNIC: lacnic.net/en/registro/index.html
    RIPE NCC: www.ripe.net/ripe/docs/smallest-alloc-sizes.html
    Note that AfriNIC only publishes its current minimum allocation size, not the allocation size for its address blocks
  IANA publishes the address space it has assigned to end-sites and allocated to the RIRs:
    www.iana.org/assignments/ipv4-address-space
  Several ISPs use this published information to filter prefixes on:
    What should be routed (from IANA)
    The minimum allocation size from the RIRs

Net Police prefix list issues
  meant to punish ISPs who pollute the routing table with specifics rather than announcing aggregates
  impacts legitimate multihoming especially at the Internet's edge
  impacts regions where domestic backbone is unavailable or costs $$$ compared with international bandwidth
  hard to maintain; requires updating when RIRs start allocating from new address blocks
  don't do it unless consequences understood and you are prepared to keep the list current
    Consider using the Project Cymru bogon BGP feed instead
    http://www.cymru.com/BGP/bogon-rs.html

Multihoming Scenarios: Stub Network
  No need for BGP
  Point static default to upstream ISP
  Router will load share on the two parallel circuits
  Upstream ISP advertises stub network
  Policy confined within upstream ISP's policy
  [Diagram: AS100, AS101]

Multihoming Scenarios: Multi-homed Stub Network
  Use BGP (not IGP or static) to loadshare
  Use private AS (ASN > 64511)
  Upstream ISP advertises stub network
  Policy confined within upstream ISP's policy
  [Diagram: AS100, AS65530]

Multihoming Scenarios: Multi-Homed Network
  Many situations possible
    multiple sessions to same ISP
    secondary for backup only
    load-share between primary and secondary
    selectively use different ISPs
  [Diagram: AS300, AS200, AS100, Global Internet]

Multihoming Scenarios: Multiple Sessions to an ISP
  Use eBGP multihop
    eBGP to loopback addresses
    eBGP prefixes learned with loopback address as next hop
  Cisco IOS
    router bgp 65534
     neighbor 1.1.1.1 remote-as 200
     neighbor 1.1.1.1 ebgp-multihop 2
    !
    ip route 1.1.1.1 255.255.255.255 serial 1/0
    ip route 1.1.1.1 255.255.255.255 serial 1/1
    ip route 1.1.1.1 255.255.255.255 serial 1/2
  [Diagram: AS 65534, 1.1.1.1, AS 200]

Multiple Sessions to an ISP: eBGP multihop
  One eBGP-multihop gotcha:
    R1 and R3 are eBGP peers that are loopback peering
    Configured with:
      neighbor x.x.x.x ebgp-multihop 2
    If the R1 to R3 link goes down the session could establish via R2
  [Diagram: AS 200, AS 100, routers R1, R2, R3; Used Path, Desired Path]

Multiple Sessions to an ISP: eBGP multihop
  Try and avoid use of ebgp-multihop unless:
    It's absolutely necessary, or
    Loadsharing across multiple links
  Many ISPs discourage its use, for example:
    We will run eBGP multihop, but do not support it as a standard offering because customers generally have a hard time managing it due to:
      routing loops
      failure to realise that BGP session stability problems are usually due to connectivity problems between their CPE and their BGP speaker

Multihoming Scenarios: Multiple Sessions to an ISP
  Simplest scheme is to use defaults
  Learn/advertise prefixes for better control
  Planning and some work required to achieve loadsharing
    Point default towards one ISP
    Learn selected prefixes from second ISP
    Modify the number of prefixes learnt to achieve acceptable load sharing
  No magic solution
  [Diagram: AS 201, ISP, routers A, B, C, D]

BGP Multihoming Techniques
  Why Multihome?
  Definition & Options
  Preparing the Network
  Basic Multihoming
  Service Provider Multihoming
  Complex Cases & Caveats

Preparing the Network
  Putting our own house in order first

Preparing the Network
  We will deploy BGP across the network before we try and multihome
  BGP will be used therefore an ASN is required
  If multihoming to different ISPs, public ASN needed:
    Either go to upstream ISP who is a registry member, or
    Apply to the RIR yourself for a one off assignment, or
    Ask an ISP who is a registry member, or
    Join the RIR and get your own IP address allocation too (this option strongly recommended)!

Preparing the Network
  The network is not running any BGP at the moment
    single statically routed connection to upstream ISP
  The network is not running any IGP at all
    Static default and routes through the network to do routing

Preparing the Network: IGP
  Decide on IGP: OSPF or ISIS
  Assign loopback interfaces and /32 addresses to each router which will run the IGP
    Loopback is used for OSPF and BGP router id anchor
    Used for iBGP and route origination
  Deploy IGP (e.g. OSPF)
    IGP can be deployed with NO IMPACT on the existing static routing
    OSPF distance is 110, static distance is 1
    Smallest distance wins

Preparing the Network: IGP (cont)
  Be prudent deploying IGP; keep the Link State Database Lean!
    Router loopbacks go in IGP
    WAN point to point links go in IGP
    (In fact, any link where IGP dynamic routing will be run should go into IGP)
    Summarise on area/level boundaries (if possible), i.e. think about your IGP address plan

Preparing the Network: IGP (cont)
  Routes which don't go into the IGP include:
    Dynamic assignment pools (DSL/Cable/Dial)
    Customer point to point link addressing
      (using next-hop-self in iBGP ensures that these do NOT need to be in IGP)
    Static/Hosting LANs
    Customer assigned address space
    Anything else not listed in the previous slide

Preparing the Network: iBGP
  Second step is to configure the local network to use iBGP
  iBGP can run on
    all routers, or
    a subset of routers, or
    just on the upstream edge
  iBGP must run on all routers which are in the transit path between external connections
  [Diagram: AS200, routers A, B, C, D, E, F]

Preparing the Network: iBGP (Transit Path)
  iBGP must run on all routers which are in the transit path between external connections
  Routers C, E and F are not in the transit path
    Static routes or IGP will suffice
  Router D is in the transit path
    Will need to be in iBGP mesh, otherwise routing loops will result
  [Diagram: AS200, routers A, B, C, D, E, F]

Preparing the Network: Layers
  Typical SP networks have three layers:
    Core: the backbone, usually the transit path
    Distribution: the middle, PoP aggregation layer
    Aggregation: the edge, the devices connecting customers

Preparing the Network: Aggregation Layer
  iBGP is optional
    Many ISPs run iBGP here, either partial routing (more common) or full routing (less common)
    Full routing is not needed unless customers want full table
    Partial routing is cheaper/easier, might usually consist of internal prefixes and, optionally, external prefixes to aid external load balancing
    Communities and peer-groups make this administratively easy
  Many aggregation devices can't run iBGP
    Static routes from distribution devices for address pools
    IGP for best exit

Preparing the Network: Distribution Layer
  Usually runs iBGP
    Partial or full routing (as with aggregation layer)
  But does not have to run iBGP
    IGP is then used to carry customer prefixes (does not scale)
    IGP is used to determine nearest exit
  Networks which plan to grow large should deploy iBGP from day one
    Migration at a later date is extra work
    No extra overhead in deploying iBGP, indeed IGP benefits

Preparing the Network: Core Layer
  Core of network is usually the transit path
  iBGP necessary between core devices
    Full routes or partial routes:
      Transit ISPs carry full routes in core
      Edge ISPs carry partial routes only
  Core layer includes AS border routers

Preparing the Network: iBGP Implementation
  Decide on:
    Best iBGP policy
      Will it be full routes everywhere, or partial, or some mix?
    iBGP scaling technique
      Community policy?
      Route-reflectors?
      Techniques such as peer groups and peer templates?

Preparing the Network: iBGP Implementation
  Then deploy iBGP:
    Step 1: Introduce iBGP mesh on chosen routers
      make sure that iBGP distance is greater than IGP distance (it usually is)
    Step 2: Install customer prefixes into iBGP
      Check! Does the network still work?
    Step 3: Carefully remove the static routing for the prefixes now in IGP and iBGP
      Check! Does the network still work?
    Step 4: Deployment of eBGP follows

Preparing the Network: iBGP Implementation
  Install customer prefixes into iBGP?
  Customer assigned address space
    Network statement/static route combination
    Use unique community to identify customer assignments
  Customer facing point-to-point links
    Redistribute connected through filters which only permit point-to-point link addresses to enter iBGP
    Use a unique community to identify point-to-point link addresses (these are only required for your monitoring system)
  Dynamic assignment pools & local LANs
    Simple network statement will do this
    Use unique community to identify these networks

Preparing the Network: iBGP Implementation
  Carefully remove static routes?
  Work on one router at a time:
    Check that static route for a particular destination is also learned either by IGP or by iBGP
    If so, remove it
    If not, establish why and fix the problem
    (Remember to look in the RIB, not the FIB!)
  Then the next router, until the whole PoP is done
  Then the next PoP, and so on until the network is now dependent on the IGP and iBGP you have deployed

Preparing the Network: Completion
  Previous steps are NOT flag day steps
    Each can be carried out during different maintenance periods, for example:
      Step One on Week One
      Step Two on Week Two
      Step Three on Week Three
      And so on
    And with proper planning will have NO customer visible impact at all

Preparing the Network: Configuration Summary
  IGP essential networks are in IGP
  Customer networks are now in iBGP
    iBGP deployed over the backbone
    Full or Partial or Upstream Edge only
  BGP distance is greater than any IGP
  Now ready to deploy eBGP

BGP Multihoming Techniques
  Why Multihome?
  Definition & Options
  Preparing the Network
  Basic Multihoming
  BGP Traffic Engineering
  Complex Cases & Caveats

Basic Multihoming
  Learning to walk before we try running

Basic Multihoming
  No frills multihoming
  Will look at two cases:
    Multihoming with the same ISP
    Multihoming to different ISPs
  Will keep the examples easy
    Understanding easy concepts will make the more complex scenarios easier to comprehend
    All examples assume that the site multihoming has a /19 address block

Basic Multihoming
  This type is most commonplace at the edge of the Internet
    Networks here are usually concerned with inbound traffic flows
    Outbound traffic flows being nearest exit is usually sufficient
  Can apply to the leaf ISP as well as Enterprise networks

Basic Multihoming
  Multihoming to the Same ISP

Basic Multihoming: Multihoming to the same ISP
  Use BGP for this type of multihoming
    use a private AS (ASN > 64511)
    There is no need or justification for a public ASN
      Making the nets of the end-site visible gives no useful information to the Internet
  Upstream ISP proxy aggregates
    in other words, announces only your address block to the Internet from their AS (as would be done if you had one statically routed connection)

Two links to the same ISP
  One link primary, the other link backup only

Two links to the same ISP (one as backup only)
  Applies when end-site has bought a large primary WAN link to their upstream and a small secondary WAN link as the backup
    For example, primary path might be an E1, backup might be 64kbps

Two links to the same ISP (one as backup only)
  [Diagram: AS 65534 (routers A, B) connected to AS 100 (routers C, D, E) by a primary and a backup link]
  Border router E in AS100 removes private AS and any customer subprefixes from Internet announcement

Two links to the same ISP (one as backup only)
  Announce /19 aggregate on each link
    primary link:
      Outbound: announce /19 unaltered
      Inbound: receive default route
    backup link:
      Outbound: announce /19 with increased metric
      Inbound: receive default, and reduce local preference
  When one link fails, the announcement of the /19 aggregate via the other link ensures continued connectivity

Two links to the same ISP (one as backup only)
  Router A Configuration
    router bgp 65534
     network 121.10.0.0 mask 255.255.224.0
     neighbor 122.102.10.2 remote-as 100
     neighbor 122.102.10.2 description RouterC
     neighbor 122.102.10.2 prefix-list aggregate out
     neighbor 122.102.10.2 prefix-list default in
    !
    ip prefix-list aggregate permit 121.10.0.0/19
    ip prefix-list default permit 0.0.0.0/0
    !

Two links to the same ISP (one as backup only)
  Router B Configuration
    router bgp 65534
     network 121.10.0.0 mask 255.255.224.0
     neighbor 122.102.10.6 remote-as 100
     neighbor 122.102.10.6 description RouterD
     neighbor 122.102.10.6 prefix-list aggregate out
     neighbor 122.102.10.6 route-map routerD-out out
     neighbor 122.102.10.6 prefix-list default in
     neighbor 122.102.10.6 route-map routerD-in in
    !
    ip prefix-list aggregate permit 121.10.0.0/19
    ip prefix-list default permit 0.0.0.0/0
    !
    route-map routerD-out permit 10
     match ip address prefix-list aggregate
     set metric 10
    route-map routerD-out permit 20
    !
    route-map routerD-in permit 10
     set local-preference 90
    !

Two links to the same ISP (one as backup only)
  Router C Configuration (main link)
    router bgp 100
     neighbor 122.102.10.1 remote-as 65534
     neighbor 122.102.10.1 default-originate
     neighbor 122.102.10.1 prefix-list Customer in
     neighbor 122.102.10.1 prefix-list default out
    !
    ip prefix-list Customer permit 121.10.0.0/19
    ip prefix-list default permit 0.0.0.0/0

Two links to the same ISP (one as backup only)
  Router D Configuration (backup link)
    router bgp 100
     neighbor 122.102.10.5 remote-as 65534
     neighbor 122.102.10.5 default-originate
     neighbor 122.102.10.5 prefix-list Customer in
     neighbor 122.102.10.5 prefix-list default out
    !
    ip prefix-list Customer permit 121.10.0.0/19
    ip prefix-list default permit 0.0.0.0/0

Two links to the same ISP (one as backup only)
  Router E Configuration
    router bgp 100
     neighbor 122.102.10.17 remote-as 110
     neighbor 122.102.10.17 remove-private-AS
     neighbor 122.102.10.17 prefix-list Customer out
    !
    ip prefix-list Customer permit 121.10.0.0/19
  Router E removes the private AS and customer's subprefixes from external announcements
  Private AS still visible inside AS100

Two links to the same ISP
  With Loadsharing

Loadsharing to the same ISP
  More common case
  End sites tend not to buy circuits and leave them idle, only used for backup as in previous example
  This example assumes equal capacity circuits
    Unequal capacity circuits requires more refinement; see later

Loadsharing to the same ISP
  [Diagram: AS 65534 (routers A, B) connected to AS 100 (routers C, D, E) by Link one and Link two]
  Border router E in AS100 removes private AS and any customer subprefixes from Internet announcement

Loadsharing to the same ISP
  Announce /19 aggregate on each link
  Split /19 and announce as two /20s, one on each link
    basic inbound loadsharing
    assumes equal circuit capacity and even spread of traffic across address block
  Vary the split until perfect loadsharing achieved
  Accept the default from upstream
    basic outbound loadsharing by nearest exit
    okay in first approx as most ISP and end-site traffic is inbound

Loadsharing to the same ISP
  Router A Configuration
    router bgp 65534
     network 121.10.0.0 mask 255.255.224.0
     network 121.10.0.0 mask 255.255.240.0
     neighbor 122.102.10.2 remote-as 100
     neighbor 122.102.10.2 prefix-list routerC out
     neighbor 122.102.10.2 prefix-list default in
    !
    ip prefix-list default permit 0.0.0.0/0
    ip prefix-list routerC permit 121.10.0.0/20
    ip prefix-list routerC permit 121.10.0.0/19
    !
    ip route 121.10.0.0 255.255.240.0 null0
    ip route 121.10.0.0 255.255.224.0 null0
  Router B configuration is similar but with the other /20

Loadsharing to the same ISP
  Router C Configuration
    router bgp 100
     neighbor 122.102.10.1 remote-as 65534
     neighbor 122.102.10.1 default-originate
     neighbor 122.102.10.1 prefix-list Customer in
     neighbor 122.102.10.1 prefix-list default out
    !
    ip prefix-list Customer permit 121.10.0.0/19 le 20
    ip prefix-list default permit 0.0.0.0/0
  Router C only allows in /19 and /20 prefixes from customer block
  Router D configuration is identical

Loadsharing to the same ISP
  Loadsharing configuration is only on customer router
  Upstream ISP has to
    remove customer subprefixes from external announcements
    remove private AS from external announcements
  Could also use BGP communities

Two links to the same ISP
  Multiple Dualhomed Customers (RFC2270)

Multiple Dualhomed Customers (RFC2270)
  Unusual for an ISP just to have one dualhomed customer
    Valid/valuable service offering for an ISP with multiple PoPs
    Better for ISP than having customer multihome with another provider!
  Look at scaling the configuration
    Simplifying the configuration
    Using templates, peer-groups, etc
    Every customer has the same configuration (basically)

Multiple Dualhomed Customers (RFC2270)
  [Diagram: AS 100 (routers C, D, E) with three dualhomed customers, each using AS 65534: routers A1/B1, A2/B2 and A3/B3]
  Border router E in AS100 removes private AS and any customer subprefixes from Internet announcement

Multiple Dualhomed Customers
  Customer announcements as per previous example
  Use the same private AS for each customer
    documented in RFC2270
    address space is not overlapping
    each customer hears default only
  Router An and Bn configuration same as Router A and B previously

Multiple Dualhomed Customers
  Router A1 Configuration
    router bgp 65534
     network 121.10.0.0 mask 255.255.224.0
     network 121.10.0.0 mask 255.255.240.0
     neighbor 122.102.10.2 remote-as 100
     neighbor 122.102.10.2 prefix-list routerC out
     neighbor 122.102.10.2 prefix-list default in
    !
    ip prefix-list default permit 0.0.0.0/0
    ip prefix-list routerC permit 121.10.0.0/20
    ip prefix-list routerC permit 121.10.0.0/19
    !
    ip route 121.10.0.0 255.255.240.0 null0
    ip route 121.10.0.0 255.255.224.0 null0
  Router B1 configuration is similar but for the other /20

Multiple Dualhomed Customers
  Router C Configuration
    router bgp 100
     neighbor bgp-customers peer-group
     neighbor bgp-customers remote-as 65534
     neighbor bgp-customers default-originate
     neighbor bgp-customers prefix-list default out
     neighbor 122.102.10.1 peer-group bgp-customers
     neighbor 122.102.10.1 description Customer One
     neighbor 122.102.10.1 prefix-list Customer1 in
     neighbor 122.102.10.9 peer-group bgp-customers
     neighbor 122.102.10.9 description Customer Two
     neighbor 122.102.10.9 prefix-list Customer2 in
     neighbor 122.102.10.17 peer-group bgp-customers
     neighbor 122.102.10.17 description Customer Three
     neighbor 122.102.10.17 prefix-list Customer3 in
    !
    ip prefix-list Customer1 permit 121.10.0.0/19 le 20
    ip prefix-list Customer2 permit 121.16.64.0/19 le 20
    ip prefix-list Customer3 permit 121.14.192.0/19 le 20
    ip prefix-list default permit 0.0.0.0/0
  Router C only allows in /19 and /20 prefixes from customer block
  Router D configuration is almost identical

Multiple Dualhomed Customers
  Router E Configuration
    assumes customer address space is not part of upstream's address block
    router bgp 100
     neighbor 122.102.10.17 remote-as 110
     neighbor 122.102.10.17 remove-private-AS
     neighbor 122.102.10.17 prefix-list Customers out
    !
    ip prefix-list Customers permit 121.10.0.0/19
    ip prefix-list Customers permit 121.16.64.0/19
    ip prefix-list Customers permit 121.14.192.0/19
  Private AS still visible inside AS100

Multiple Dualhomed Customers
  If customers' prefixes come from ISP's address block
    do NOT announce them to the Internet
    announce ISP aggregate only
  Router E configuration:
    router bgp 100
     neighbor 122.102.10.17 remote-as 110
     neighbor 122.102.10.17 prefix-list my-aggregate out
    !
    ip prefix-list my-aggregate permit 121.8.0.0/13

Basic Multihoming
  Multihoming to different ISPs

Two links to different ISPs
  Use a Public AS
    Or use private AS if agreed with the other ISP
    But some people don't like the inconsistent-AS which results from use of a private-AS
  Address space comes from
    both upstreams, or
    Regional Internet Registry
  Configuration concepts very similar

Inconsistent-AS?
  Viewing the prefixes originated by AS65534 in the Internet shows they appear to be originated by both AS210 and AS200
    This is NOT bad
    Nor is it illegal
  Cisco IOS command is
    show ip bgp inconsistent-as
  [Diagram: AS 65534, AS 200, AS 210, Internet]

Two links to different ISPs
  One link primary, the other link backup only

Two links to different ISPs (one as backup only)
  Announce /19 aggregate on each link
    primary link makes standard announcement
    backup link lengthens the AS PATH by using AS PATH prepend
  When one link fails, the announcement of the /19 aggregate via the other link ensures continued connectivity

Two links to different ISPs (one as backup only)
  [Diagram: AS 130 (routers A, B) connected to AS 100 (router C) and AS 120 (router D), both reaching the Internet; one link announces the /19 block, the other announces the /19 block with longer AS PATH]

Two links to different ISPs (one as backup only)
  Router A Configuration
    router bgp 130
     network 121.10.0.0 mask 255.255.224.0
     neighbor 122.102.10.1 remote-as 100
     neighbor 122.102.10.1 prefix-list aggregate out
     neighbor 122.102.10.1 prefix-list default in
    !
    ip prefix-list aggregate permit 121.10.0.0/19
    ip prefix-list default permit 0.0.0.0/0

Two links to different ISPs (one as backup only)
  Router B Configuration
    router bgp 130
     network 121.10.0.0 mask 255.255.224.0
     neighbor 120.1.5.1 remote-as 120
     neighbor 120.1.5.1 prefix-list aggregate out
     neighbor 120.1.5.1 route-map routerD-out out
     neighbor 120.1.5.1 prefix-list default in
     neighbor 120.1.5.1 route-map routerD-in in
    !
    ip prefix-list aggregate permit 121.10.0.0/19
    ip prefix-list default permit 0.0.0.0/0
    !
    route-map routerD-out permit 10
     set as-path prepend 130 130 130
    !
    route-map routerD-in permit 10
     set local-preference 80

Two links to different ISPs (one as backup only)
  Not a common situation as most sites tend to prefer using whatever capacity they have
  But it shows the basic concepts of using local-prefs and AS-path prepends for engineering traffic in the chosen direction

Two links to different ISPs
  With Loadsharing

Two links to different ISPs (with loadsharing)
  Announce /19 aggregate on each link
  Split /19 and announce as two /20s, one on each link
    basic inbound loadsharing
  When one link fails, the announcement of the /19 aggregate via the other ISP ensures continued connectivity

Two links to different ISPs (with loadsharing)
  [Diagram: AS 130 (routers A, B) connected to AS 100 (router C) and AS 120 (router D), both reaching the Internet; one link announces the first /20 and the /19 block, the other announces the second /20 and the /19 block]

Two links to different ISPs (with loadsharing)
  Router A Configuration
    router bgp 130
     network 121.10.0.0 mask 255.255.224.0
     network 121.10.0.0 mask 255.255.240.0
     neighbor 122.102.10.1 remote-as 100
     neighbor 122.102.10.1 prefix-list firstblock out
     neighbor 122.102.10.1 prefix-list default in
    !
    ip prefix-list default permit 0.0.0.0/0
    !
    ip prefix-list firstblock permit 121.10.0.0/20
    ip prefix-list firstblock permit 121.10.0.0/19

Two links to different ISPs (with loadsharing)
  Router B Configuration
    router bgp 130
     network 121.10.0.0 mask 255.255.224.0
     network 121.10.16.0 mask 255.255.240.0
     neighbor 120.1.5.1 remote-as 120
     neighbor 120.1.5.1 prefix-list secondblock out
     neighbor 120.1.5.1 prefix-list default in
    !
    ip prefix-list default permit 0.0.0.0/0
    !
    ip prefix-list secondblock permit 121.10.16.0/20
    ip prefix-list secondblock permit 121.10.0.0/19

Two links to different ISPs (with loadsharing)
  Loadsharing in this case is very basic
  But shows the first steps in designing a load sharing solution
    Start with a simple concept
    And build on it!

Two links to different ISPs
  More Controlled Loadsharing

Loadsharing with different ISPs
  Announce /19 aggregate on each link
    On first link, announce /19 as normal
    On second link, announce /19 with longer AS PATH, and announce one /20 subprefix
      controls loadsharing between upstreams and the Internet
  Vary the subprefix size and AS PATH length until perfect loadsharing achieved
  Still require redundancy!

Loadsharing with different ISPs
  [Diagram: AS 130 (routers A, B) connected to AS 100 (router C) and AS 120 (router D), both reaching the Internet; one link announces the /19 block, the other announces the /20 subprefix and the /19 block with longer AS path]

Loadsharing with different ISPs
  Router A Configuration
    router bgp 130
     network 121.10.0.0 mask 255.255.224.0
     neighbor 122.102.10.1 remote-as 100
     neighbor 122.102.10.1 prefix-list default in
     neighbor 122.102.10.1 prefix-list aggregate out
    !
    ip prefix-list aggregate permit 121.10.0.0/19

Loadsharing with different ISPs
  Router B Configuration
    router bgp 130
     network 121.10.0.0 mask 255.255.224.0
     network 121.10.16.0 mask 255.255.240.0
     neighbor 120.1.5.1 remote-as 120
     neighbor 120.1.5.1 prefix-list default in
     neighbor 120.1.5.1 prefix-list subblocks out
     neighbor 120.1.5.1 route-map routerD out
    !
    route-map routerD permit 10
     match ip address prefix-list aggregate
     set as-path prepend 130 130
    route-map routerD permit 20
    !
    ip prefix-list subblocks permit 121.10.0.0/19 le 20
    ip prefix-list aggregate permit 121.10.0.0/19

Loadsharing with different ISPs
  This example is more commonplace
  Shows how ISPs and end-sites subdivide address space frugally, as well as use the AS-PATH prepend concept to optimise the load sharing between different ISPs
  Notice that the /19 aggregate block is ALWAYS announced

BGP Multihoming Techniques
  Why Multihome?
  Definition & Options
  Preparing the Network
  Basic Multihoming
  BGP Traffic Engineering
  Complex Cases & Caveats

Service Provider Multihoming
  BGP Traffic Engineering

Service Provider Multihoming
  Previous examples dealt with loadsharing inbound traffic
    Of primary concern at Internet edge
    What about outbound traffic?
  Transit ISPs strive to balance traffic flows in both directions
    Balance link utilisation
    Try and keep most traffic flows symmetric
    Some edge ISPs try and do this too
  The original Traffic Engineering

Service Provider Multihoming
  Balancing outbound traffic requires inbound routing information
    Common solution is full routing table
    Rarely necessary
      Why use the routing mallet to try to solve loadsharing problems?
    Keep It Simple is often easier (and $$$ cheaper) than carrying N-copies of the full routing table

Service Provider Multihoming: MYTHS!!
  Common MYTHS
  1: You need the full routing table to multihome
    People who sell router memory would like you to believe this
    Only true if you are a transit provider
    Full routing table can be a significant hindrance to multihoming
  2: You need a BIG router to multihome
    Router size is related to data rates, not running BGP
    In reality, to multihome, your router needs to:
      Have two interfaces,
      Be able to talk BGP to at least two peers,
      Be able to handle BGP attributes,
      Handle at least one prefix
  3: BGP is complex
    In the wrong hands, yes it can be!
  Keep it Simple!

Service Provider Multihoming: Some Strategies
Take the prefixes you need to aid traffic engineering
  Look at NetFlow data for popular sites
Prefixes originated by your immediate neighbours and their neighbours will do more to aid load balancing than prefixes from ASNs many hops away
  Concentrate on local destinations
Use default routing as much as possible
  Or use the full routing table with care

Service Provider Multihoming
Examples
  One upstream, one local peer
  One upstream, local exchange point
  Two upstreams, one local peer
Require BGP and a public ASN
Examples assume that the local network has its own /19 address block

Service Provider Multihoming
One upstream, one local peer

One Upstream, One Local Peer
Very common situation in many regions of the Internet
Connect to upstream transit provider to see the Internet
Connect to the local competition so that local traffic stays local
  Saves spending valuable $ on upstream transit costs for local traffic

One Upstream, One Local Peer
[Diagram: AS 110 with Router C facing Upstream ISP AS130 and Router A facing Local Peer AS120]

One Upstream, One Local Peer
Announce /19 aggregate on each link
Accept default route only from upstream
  Either 0.0.0.0/0 or a network which can be used as default
Accept all routes from local peer

One Upstream, One Local Peer
Router A Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.2 remote-as 120
 neighbor 122.102.10.2 prefix-list my-block out
 neighbor 122.102.10.2 prefix-list AS120-peer in
!
ip prefix-list AS120-peer permit 122.5.16.0/19
ip prefix-list AS120-peer permit 121.240.0.0/20
ip prefix-list my-block permit 121.10.0.0/19
!
ip route 121.10.0.0 255.255.224.0 null0
Note: prefix filters inbound

One Upstream, One Local Peer
Router A Alternative Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.2 remote-as 120
 neighbor 122.102.10.2 prefix-list my-block out
 neighbor 122.102.10.2 filter-list 10 in
!
ip as-path access-list 10 permit ^(120_)+$
!
ip prefix-list my-block permit 121.10.0.0/19
!
ip route 121.10.0.0 255.255.224.0 null0
Note: AS path filters are more trusting

One Upstream, One Local Peer
Router C Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.1 remote-as 130
 neighbor 122.102.10.1 prefix-list default in
 neighbor 122.102.10.1 prefix-list my-block out
!
ip prefix-list my-block permit 121.10.0.0/19
ip prefix-list default permit 0.0.0.0/0
!
ip route 121.10.0.0 255.255.224.0 null0

One Upstream, One Local Peer
Two configurations possible for Router A
  Filter-lists assume peer knows what they are doing
  Prefix-list higher maintenance, but safer
  Some ISPs use both
Local traffic goes to and from local peer, everything else goes to upstream

Aside: Configuration Recommendation
Private Peers
  The peering ISPs exchange prefixes they originate
  Sometimes they exchange prefixes from neighbouring ASNs too
Be aware that the private peer eBGP router should carry only the prefixes you want the private peer to receive
  Otherwise they could point a default route to you and unintentionally transit your backbone

Service Provider Multihoming
One Upstream, Local Exchange Point

One Upstream, Local Exchange Point
Very common situation in many regions of the Internet
Connect to upstream transit provider to see the Internet
Connect to the local Internet Exchange Point so that local traffic stays local
  Saves spending valuable $ on upstream transit costs for local traffic

One Upstream, Local Exchange Point
[Diagram: AS 110 with Router C facing Upstream ISP AS130 and Router A facing the IXP]

One Upstream, Local Exchange Point
Announce /19 aggregate to every neighbouring AS
Accept default route only from upstream
  Either 0.0.0.0/0 or a network which can be used as default
Accept all routes originated by IXP peers

One Upstream, Local Exchange Point
Router A Configuration
interface fastethernet 0/0
 description Exchange Point LAN
 ip address 120.5.10.1 255.255.255.224
 ip verify unicast reverse-path
!
router bgp 110
 neighbor ixp-peers peer-group
 neighbor ixp-peers prefix-list my-block out
 neighbor ixp-peers remove-private-AS
 neighbor ixp-peers route-map set-local-pref in
..next slide

One Upstream, Local Exchange Point
 neighbor 120.5.10.2 remote-as 100
 neighbor 120.5.10.2 peer-group ixp-peers
 neighbor 120.5.10.2 prefix-list peer100 in
 neighbor 120.5.10.3 remote-as 101
 neighbor 120.5.10.3 peer-group ixp-peers
 neighbor 120.5.10.3 prefix-list peer101 in
 neighbor 120.5.10.4 remote-as 102
 neighbor 120.5.10.4 peer-group ixp-peers
 neighbor 120.5.10.4 prefix-list peer102 in
 neighbor 120.5.10.5 remote-as 103
 neighbor 120.5.10.5 peer-group ixp-peers
 neighbor 120.5.10.5 prefix-list peer103 in
..next slide

One Upstream, Local Exchange Point
!
ip prefix-list my-block permit 121.10.0.0/19
ip prefix-list peer100 permit 122.0.0.0/19
ip prefix-list peer101 permit 122.30.0.0/19
ip prefix-list peer102 permit 122.12.0.0/19
ip prefix-list peer103 permit 122.18.128.0/19
!
route-map set-local-pref permit 10
 set local-preference 150
!

One Upstream, Local Exchange Point
Note that Router A does not generate the aggregate for AS110
  If Router A becomes disconnected from backbone, then the aggregate is no longer announced to the IX
  BGP failover works as expected
Note the inbound route-map which sets the local preference higher than the default
  This ensures that local traffic crosses the IXP
  (And avoids potential problems with uRPF check)

One Upstream, Local Exchange Point
Router C Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.1 remote-as 130
 neighbor 122.102.10.1 prefix-list default in
 neighbor 122.102.10.1 prefix-list my-block out
!
ip prefix-list my-block permit 121.10.0.0/19
ip prefix-list default permit 0.0.0.0/0
!
ip route 121.10.0.0 255.255.224.0 null0
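
The inbound route-map on Router A in the exchange point example sets local preference 150 on everything heard from IXP peers. A minimal sketch in Python of why that matters, assuming a heavily simplified path selection (highest local preference first, then shortest AS path) and a hypothetical situation in which the same peer prefix is also heard through the upstream. The Path class, the best_path function and the AS paths used are illustrative assumptions, not taken from the slides or from any router software; the value 100 is the usual default local preference.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

DEFAULT_LOCAL_PREF = 100  # default local preference when no policy sets one


@dataclass(frozen=True)
class Path:
    """One candidate BGP path for a prefix (hypothetical, simplified model)."""
    learned_from: str
    as_path: Tuple[int, ...]
    local_pref: int = DEFAULT_LOCAL_PREF


def best_path(candidates: Sequence[Path]) -> Path:
    """Prefer highest local preference, then shortest AS path.

    Real BGP best-path selection has many more steps (weight, origin, MED,
    IGP metric, ...); only these two are modelled here.
    """
    return max(candidates, key=lambda p: (p.local_pref, -len(p.as_path)))


# 122.0.0.0/19 is the prefix accepted from IXP peer AS100 in the slides.
# Assumption: AS100 prepends on the IXP, so this AS path is the longer one.
via_ixp_plain = Path("IXP peer AS100", as_path=(100, 100, 100))
via_ixp_tagged = Path("IXP peer AS100", as_path=(100, 100, 100), local_pref=150)
# Assumption: the same prefix is also heard somewhere in AS110 via the upstream.
via_upstream = Path("upstream AS130", as_path=(130, 100))

without_map = best_path([via_ixp_plain, via_upstream])
with_map = best_path([via_ixp_tagged, via_upstream])
print("without route-map:", without_map.learned_from)
print("with set-local-pref:", with_map.learned_from)
```

In this model the shorter AS path through the upstream wins when both copies carry the default local preference, while the copy tagged with 150 wins regardless of AS path length, which is the behaviour the route-map is there to guarantee.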

One Upstream, Local Exchange Point
Note Router A configuration
  Prefix-list higher maintenance, but safer
  uRPF on the IX facing interface
  No generation of AS110 aggregate
IXP traffic goes to and from local IXP, everything else goes to upstream

Aside: IXP Configuration Recommendation
IXP peers
  The peering ISPs at the IXP exchange prefixes they originate
  Sometimes they exchange prefixes from neighbouring ASNs too
Be aware that the IXP border router should carry only the prefixes you want the IXP peers to receive and the destinations you want them to be able to reach
  Otherwise they could point a default route to you and unintentionally transit your backbone
If IXP router is at IX, and distant from your backbone
  Don't originate your address block at your IXP router

Service Provider Multihoming
Two Upstreams, One Local Peer

Two Upstreams, One Local Peer
Connect to both upstream transit providers to see the Internet
  Provides external redundancy and diversity: the reason to multihome
Connect to the local peer so that local traffic stays local
  Saves spending valuable $ on upstream transit costs for local traffic

Two Upstreams, One Local Peer
[Diagram: AS 110 with Router C facing Upstream ISP AS130, Router D facing Upstream ISP AS140 and Router A facing Local Peer AS120]

Two Upstreams, One Local Peer
Announce /19 aggregate on each link
Accept default route only from upstreams
  Either 0.0.0.0/0 or a network which can be used as default
Accept all routes from local peer

Two Upstreams, One Local Peer
Router A
  Same routing configuration as in example with one upstream and one local peer
  Same hardware configuration

Two Upstreams, One Local Peer
Router C Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.1 remote-as 130
 neighbor 122.102.10.1 prefix-list default in
 neighbor 122.102.10.1 prefix-list my-block out
!
ip prefix-list my-block permit 121.10.0.0/19
ip prefix-list default permit 0.0.0.0/0
!
ip route 121.10.0.0 255.255.224.0 null0

Two Upstreams, One Local Peer
Router D Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.5 remote-as 140
 neighbor 122.102.10.5 prefix-list default in
 neighbor 122.102.10.5 prefix-list my-block out
!
ip prefix-list my-block permit 121.10.0.0/19
ip prefix-list default permit 0.0.0.0/0
!
ip route 121.10.0.0 255.255.224.0 null0

Two Upstreams, One Local Peer
This is the simple configuration for Router C and D
Traffic out to the two upstreams will take nearest exit
  Inexpensive routers required
  This is not useful in practice, especially for international links
  Loadsharing needs to be better

Two Upstreams, One Local Peer
Better configuration options:
  Accept full routing from both upstreams
    Expensive & unnecessary!
  Accept default from one upstream and some routes from the other upstream
    The way to go!

Two Upstreams, One Local Peer: Full Routes
Router C Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.1 remote-as 130
 neighbor 122.102.10.1 prefix-list rfc1918-deny in
 neighbor 122.102.10.1 prefix-list my-block out
 neighbor 122.102.10.1 route-map AS130-loadshare in
!
ip prefix-list my-block permit 121.10.0.0/19
! See www.cymru.com/Documents/bogon-list.html
! ...for RFC1918 and friends list
..next slide
Note: allow all prefixes in apart from RFC1918 and friends

Two Upstreams, One Local Peer: Full Routes
ip route 121.10.0.0 255.255.224.0 null0
!
ip as-path access-list 10 permit ^(130_)+$
ip as-path access-list 10 permit ^(130_)+_[0-9]+$
!
route-map AS130-loadshare permit 10
 match as-path 10
 set local-preference 120
route-map AS130-loadshare permit 20
 set local-preference 80
!

Two Upstreams, One Local Peer: Full Routes
Router D Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.5 remote-as 140
 neighbor 122.102.10.5 prefix-list rfc1918-deny in
 neighbor 122.102.10.5 prefix-list my-block out
!
ip prefix-list my-block permit 121.10.0.0/19
! See www.cymru.com/Documents/bogon-list.html
! ...for RFC1918 and friends list
Note: allow all prefixes in apart from RFC1918 and friends

Two Upstreams, One Local Peer: Full Routes
Router C configuration:
  Accept full routes from AS130
  Tag prefixes originated by AS130 and AS130's neighbouring ASes with local preference 120
    Traffic to those ASes will go over AS130 link
  Remaining prefixes tagged with local preference of 80
    Traffic to all other ASes will go over the link to AS140
Router D configuration same as Router C without the route-map

Two Upstreams, One Local Peer: Full Routes
Full routes from upstreams
  Expensive: needs lots of memory and CPU
  Need to play preference games
  Previous example is only an example; real life will need improved fine-tuning!
  Previous example doesn't consider inbound traffic; see earlier in presentation for examples

Two Upstreams, One Local Peer: Partial Routes
Strategy:
  Ask one upstream for a default route
    Easy to originate default towards a BGP neighbour
  Ask other upstream for a full routing table
    Then filter this routing table based on neighbouring ASN
    E.g. want traffic to their neighbours to go over the link to that ASN
    Most of what upstream sends is thrown away
    Easier than asking the upstream to set up custom BGP filters for you

Two Upstreams, One Local Peer: Partial Routes
Router C Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.1 remote-as 130
 neighbor 122.102.10.1 prefix-list rfc1918-nodef-deny in
 neighbor 122.102.10.1 prefix-list my-block out
 neighbor 122.102.10.1 filter-list 10 in
 neighbor 122.102.10.1 route-map tag-default-low in
!
..next slide
Note: allow all prefixes and default in; deny RFC1918 and friends
Note: AS filter list filters prefixes based on origin ASN

Two Upstreams, One Local Peer: Partial Routes
ip prefix-list my-block permit 121.10.0.0/19
ip prefix-list default permit 0.0.0.0/0
!
ip route 121.10.0.0 255.255.224.0 null0
!
ip as-path access-list 10 permit ^(130_)+$
ip as-path access-list 10 permit ^(130_)+_[0-9]+$
!
route-map tag-default-low permit 10
 match ip address prefix-list default
 set local-preference 80
route-map tag-default-low permit 20
!

Two Upstreams, One Local Peer: Partial Routes
Router D Configuration
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.5 remote-as 140
 neighbor 122.102.10.5 prefix-list default in
 neighbor 122.102.10.5 prefix-list my-block out
!
ip prefix-list my-block permit 121.10.0.0/19
ip prefix-list default permit 0.0.0.0/0
!
ip route 121.10.0.0 255.255.224.0 null0

Two Upstreams, One Local Peer: Partial Routes
Router C configuration:
  Accept full routes from AS130
    (or get them to send less)
  Filter ASNs so only AS130 and AS130's neighbouring ASes are accepted
  Allow default, and set it to local preference 80
  Traffic to those ASes will go over AS130 link
  Traffic to all other ASes will go over the link to AS140
  If AS140 link fails, backup via AS130, and vice-versa

Two Upstreams, One Local Peer: Partial Routes
Partial routes from upstreams
  Not expensive: only carry the routes necessary for loadsharing
  Need to filter on AS paths
  Previous example is only an example; real life will need improved fine-tuning!
  Previous example doesn't consider inbound traffic; see earlier in presentation for examples

Two Upstreams, One Local Peer
When upstreams cannot or will not announce default route
  Because of operational policy against using default-originate on BGP peering
  Solution is to use IGP to propagate default from the edge/peering routers

Two Upstreams, One Local Peer: Partial Routes
Router C Configuration
router ospf 110
 default-information originate metric 30
 passive-interface Serial 0/0
!
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.1 remote-as 130
 neighbor 122.102.10.1 prefix-list rfc1918-deny in
 neighbor 122.102.10.1 prefix-list my-block out
 neighbor 122.102.10.1 filter-list 10 in
!
..next slide

Two Upstreams, One Local Peer: Partial Routes
ip prefix-list my-block permit 121.10.0.0/19
! See www.cymru.com/Documents/bogon-list.html
! ...for RFC1918 and friends list
!
ip route 121.10.0.0 255.255.224.0 null0
ip route 0.0.0.0 0.0.0.0 serial 0/0 254
!
ip as-path access-list 10 permit ^(130_)+$
ip as-path access-list 10 permit ^(130_)+_[0-9]+$
!

Two Upstreams, One Local Peer: Partial Routes
Router D Configuration
router ospf 110
 default-information originate metric 10
 passive-interface Serial 0/0
!
router bgp 110
 network 121.10.0.0 mask 255.255.224.0
 neighbor 122.102.10.5 remote-as 140
 neighbor 122.102.10.5 prefix-list deny-all in
 neighbor 122.102.10.5 prefix-list my-block out
!
..next slide

Two Upstreams, One Local Peer: Partial Routes
ip prefix-list deny-all deny 0.0.0.0/0 le 32
ip prefix-list my-block permit 121.10.0.0/19
!
ip route 121.10.0.0 255.255.224.0 null0
ip route 0.0.0.0 0.0.0.0 serial 0/0 254
!

Two Upstreams, One Local Peer: Partial Routes
Partial routes from upstreams
  Use OSPF to determine outbound path
  Router D default has metric 10: primary outbound path
  Router C default has metric 30: backup outbound path
  Serial interface goes down, static default is removed from routing table, OSPF default withdrawn

Aside: Configuration Recommendation
When distributing internal default by iBGP or OSPF
  Make sure that routers connecting to private peers or to IXPs do NOT carry the default route
  Otherwise they could point a default route to you and unintentionally transit your backbone
  Simple fix for Private Peer/IXP routers:
    ip route 0.0.0.0 0.0.0.0 null0

BGP Multihoming Techniques
  Why Multihome?
  Definition & Options
  Preparing the Network
  Basic Multihoming
  Service Provider Multihoming
  Complex Cases & Caveats

Complex Cases & Caveats
How not to get stuck; how not to compromise routing system security

Complex Cases & Caveats
Complex Cases
  Multi-exit backbone
Caveats
  No default route on:
    Private peer edge router
    IXP peering router
  Separating transit and local paths
  Backup and non-backup
  Avoiding backbone hijack

Complex Cases
Multi-exit backbone

Multi-exit backbone
ISP with many exits to different service providers
  Could be large transit carrier
  Could be large regional ISP with a variety of international links to different continental locations
Load-balancing can be painful to set up
  Outbound traffic is often easier to balance than inbound

Multi-exit backbone
[Diagram: AS 110 backbone with routers A, B, C and D; Router B connects to the IXP, the other routers connect to the Internet through AS120, AS130, AS140, AS150, AS160 and AS170. Link labels: 155Mbps, 45Mbps, 2x 155Mbps, 45Mbps, 155Mbps, 45Mbps, 1Gbps]

Multi-exit backbone: Step One
How to approach this?
  Simple steps
Step One:
  The IXP is easy!
  Will usually be non-transit, so preferred path for all prefixes learned this way
  Outbound announcement: send our address block
  Inbound announcement: accept everything originated by IXP peers, high local-pref

Multi-exit backbone: Step Two
Where does most of the inbound traffic come from?
  Go to that source location, and check Looking Glass trace and AS-PATHs back to the neighbouring ASNs
  i.e. which of AS120 through AS170 is the closest to the source
Apply AS-path prepends such that the path through AS140 is one AS-hop closer than the other ASNs
  AS140 is the ISP's biggest pipe to the Internet
  This makes AS140 the preferred path to get from the source to AS110

Multi-exit backbone: Step Three
Addressing plan now helps
  Customers in vicinity of each of Router A, C and D addressed from contiguous address block assigned to each Router
  Announcements from Router A address block sent out to AS120 and AS130
  Announcements from Router C address block sent out to AS140 and AS150
  Announcements from Router D address block sent out to AS160 and AS170

Multi-exit backbone: Addressing Plan Assists Multihoming
[Diagram: same multi-exit backbone as above, with the customers around each exit router grouped into a Router A addressing zone, a Router C addressing zone and a Router D addressing zone]

Multi-exit backbone: Step Four
Customer type assists zone load balancing
  Two customer classes: Commercial & Consumer
  Commercial announced on T3 links
  Consumer announced on STM-1 links
Commercial
  Numbered from one address block in each zone
Consumer
  Numbered from the other address block in each zone

Multi-exit backbone: Example Summary (1)
Address block: 100.10.0.0/16
Router A zone: 100.10.0.0/18
  Commercial: 100.10.0.0/19
  Consumer: 100.10.32.0/19
Router C zone: 100.10.128.0/17
  Commercial: 100.10.128.0/18
  Consumer: 100.10.192.0/18
Router D zone: 100.10.64.0/18
  Commercial: 100.10.64.0/19
  Consumer: 100.10.96.0/19

Multi-exit backbone: Example Summary (2)
Router A announcement:
  100.10.0.0/16 with 3x AS-path prepend
  100.10.0.0/19 to AS130
  100.10.32.0/19 to AS120
Router B announcement:
  100.10.0.0/16
Router C announcement:
  100.10.0.0/16
  100.10.128.0/18 to AS150
  100.10.192.0/18 to AS140
Router D announcement:
  100.10.0.0/16 with 3x AS-path prepend
  100.10.64.0/19 to AS170
  100.10.96.0/19 to AS160

Multi-exit backbone: Summary
This is an example strategy
  Your mileage may vary
Example shows:
  where to start,
  what the thought processes are, and
  what the strategies could be

Caveats
Separating Transit and Local Paths

Transit and Local paths
Common problem is separating transit and local traffic for BGP customers
Transit provider:
  Provides internet access for BGP customer over one path
  Provides domestic access for BGP customer over another path
Usually required for commercial reasons
  Inter-AS traffic is unmetered
  Transit traffic is metered

Transit and Local paths
[Diagram: AS 110 (routers A, B, C, D), AS 120 (routers X, Y, Z), the IXP and the Internet; flows labelled "Traffic from the Internet" and "Private peering traffic". Caption: AS110 is transit provider but peers domestic routes with other providers (e.g. AS120) at the IXP]

Transit and Local paths
Assume Router X is announcing 192.168/16 prefix
Router C and D see two entries for 192.168/16 prefix:

RouterC#show ip bgp
   Network          Next Hop     Metric LocPrf Weight Path
* i192.168.0.0/16   10.0.1.1               100      0 120 i
*>i                 10.0.1.5               100      0 120 i

BGP path selection rules pick the highest next hop address
  So this could be Router A or Router B!
  No exit path selection here

Transit and Local paths
There are a few solutions to this problem
  Policy Routing on Router A according to packet source address
  GRE tunnels (gulp)
Preference is to keep it simple
  Minor redesign and use of BGP weight is a simple solution

Transit and Local paths (Network Revision)
[Diagram: revised network with AS 110 (routers A, B, C, D), AS 120 (routers X, Y, Z), the IXP and the Internet; flows labelled "Traffic from the Internet" and "Private peering traffic". Caption: AS110 is transit provider but peers domestic routes with other providers (e.g. AS120) at the IXP]

Transit and Local paths
Router B hears 192.168/16 from Router Y across the IXP
Router C hears 192.168/16 from Router Z across the private peering link
Router B sends 192.168/16 by iBGP to Router C:

RouterC#show ip bgp
   Network          Next Hop     Metric LocPrf Weight Path
*> 192.168.0.0/16   10.1.5.7               100      0 120 i
* i                 10.0.1.5               100      0 120 i

Best path is by eBGP to Router Z
  So Internet transit traffic to AS120 will go through private peering link

Transit and Local paths
Router D hears prefix by iBGP from both Router B and Router C
BGP best path selection might pick either path, depending on IGP metric, or next hop address, etc
Solution to force local traffic over the IXP link:
  Apply high local preference on Router B for all routes learned from the IXP peers

RouterD#show ip bgp
   Network          Next Hop     Metric LocPrf Weight Path
* i192.168.0.0/16   10.0.1.3               100      0 120 i
*>i                 10.0.1.5               120      0 120 i

Transit and Local paths
High local preference on B is visible throughout entire iBGP
  Including on Router C
As a result, Internet traffic now goes through the IX rather than over the private peering link as was intended

RouterC#show ip bgp
   Network          Next Hop     Metric LocPrf Weight Path
*  192.168.0.0/16   10.1.5.7               100      0 120 i
*>i                 10.0.1.5               120      0 120 i

Transit and Local paths
Solution: Use BGP weight on Router C for prefixes heard from AS120:

RouterC#show ip bgp
   Network          Next Hop     Metric LocPrf Weight Path
*> 192.168.0.0/16   10.1.5.7               100  50000 120 i
* i                 10.0.1.5               120      0 120 i

So Router C prefers private link to AS120 for traffic coming from Internet
Rest of AS110 prefers Router B exit through the IXP for local traffic

Transit and Local paths: Summary
Transit customer private peering connects to Border router
  Transit customer routes get high weight
Local traffic on IXP peering router gets high local preference
Internet return traffic goes on private interconnect
Domestic return traffic crosses IXP

Caveats
Backup and Non-backup

Transit and Local paths: Backups
For the previous scenario, what happens if private peering link breaks?
  Traffic backs up across the IXP
What happens if the IXP breaks?
  Traffic backs up across the private peering
Some ISPs find this backup arrangement acceptable
  It is a backup, after all

Transit and Local paths: IXP Non-backup
IXP actively does not allow transit
ISP solution:
  192.168/16 via IX tagged one community
  192.168/16 via PP tagged other community
  Using community tags, iBGP on IX router (Router B) does not send 192.168/16 to upstream border (Router C)
  Therefore Router C only hears 192.168/16 via private peering
  If the link breaks, backup is via AS110 and AS120 upstream ISPs

Transit and Local paths: IXP Non-backup
[Diagram: AS 110 (routers A, B, C, D), AS 120 (routers X, Y, Z), the IXP and the Internet; labels "AS120 backs up via their own upstream" and "Private peering traffic". Caption: AS110 is transit provider but peers domestic routes with other providers (e.g. AS120) at the IXP]

Transit and Local paths: Private Peering link Non-backup
With this solution, a breakage in the IX means that local peering traffic will still back up over private peering link
  This link may be metered
AS110 Solution:
  Router C does not announce 192.168/16 by iBGP to the other routers in AS110
  If IX breaks, there is no route to AS120
    Unless Router C is announcing a default route
    Whereby traffic will get to Router C anyway, and policy based routing will have to be used to avoid ingress traffic from AS110 going on the private peering link

Transit and Local paths: Private Peering link Non-backup
[Diagram: AS 110 (routers A, B, C, D), AS 120 (routers X, Y, Z), the IXP and the Internet; labels "Traffic from the Internet" and "Policy based routing may be needed to keep inter-AS traffic off private link". Caption: AS110 is transit provider but peers domestic routes with other providers (e.g. AS120) at the IXP]

Transit and Local paths: Summary
Not allowing BGP backup to do the right thing can rapidly get messy
But previous two scenarios are requested quite often
  Billing of traffic seems to be more important than providing connectivity
  But thinking through the steps required shows that there is usually a solution without having to resort to extreme measures

Caveats
Avoiding Backbone Hijack

Backbone Hijacks
Can happen when peering ISPs:
  are present at two or more IXPs
  have two or more private peering links
Usually goes undetected
  Can be spotted by traffic flow monitoring tools
Done because:
  Their backbone is cheaper than mine
Caused by misconfiguration of private peering routers

Avoiding Backbone Hijack
[Diagram: AS 110 (routers A, B, C, D) and AS 120 (routers X, Y, Z) meeting at IXP1 and IXP2, both connected to the Internet. Caption: AS120 deliberately uses AS110's backbone as transit path between two points in the local network]

Avoiding Backbone Hijack
AS110 peering routers at the IXPs should only carry AS110 originated routes
  When AS120 points static route for an AS120 destination to AS110, the peering routers have no destination apart from back towards AS120, so the packets will oscillate until TTL expiry
  When AS120 points static route for a non-AS110 destination to AS110, the peering routers have no destination at all, so the packet is dropped

Avoiding Backbone Hijack
Same applies for private peering scenarios
  Private peering routers should only carry the prefixes being exchanged in the peering
  Otherwise abuses are possible
What if AS110 is providing the full routing table to AS120?
  AS110 is the transit provider for AS120

Avoiding Backbone Hijack
[Diagram: AS 110 (routers A, B, C, D) and AS 120 (routers X, Y, Z), both connected to the Internet. Caption: AS120 deliberately uses AS110's backbone as transit path between two points in the local network]

Avoiding Backbone Hijack
Router C carries a full routing table on it
  So we can't use the earlier trick of only carrying AS110 prefixes
Reverse path forwarding check?
  But that only checks the packet source address, not the destination, and the source is fine!
BGP Weight
  Recall that BGP weight was used to separate local and transit traffic in the previous example
  If all prefixes learned from AS120 on Router C had local weight increased, then destination is back out the incoming interface
  And the same can be done on Router B

Avoiding Backbone Hijack: Summary
These are but two examples of many possible scenarios which have become frequently asked questions
Solution is often a lot simpler than imagined
  BGP Weight, selective announcement by iBGP, simple network redesigns

Summary

Summary
Multihoming is not hard, really
  Keep It Simple & Stupid!
Full routing table is rarely required
  A default is often just as good
  If customers want 170k prefixes, charge them money for it

Presentation Slides
Available on
  ftp://ftp-eng.cisco.com/pfs/seminars/NANOG35-BGP-Multihoming.pdf
And on the NANOG 35 meeting pages at
  http://www.nanog.org/mtg-0510/pdf/smith.pdf

BGP Multihoming Techniques
Philip Smith
NANOG35
23-25 October 2005
Los Angeles