PATH VECTOR ROUTING AND THE BORDER GATEWAY PROTOCOL
Transcript of PATH VECTOR ROUTING AND THE BORDER GATEWAY PROTOCOL
PATHVECTORROUTINGANDTHEBORDERGATEWAYPROTOCOL
READING:SECTIONS4.3.3PLUSOPTIONALREADING
COS461:ComputerNetworksSpring2010(MW3:00‐4:20inCOS105)
MikeFreedmanhDp://www.cs.princeton.edu/courses/archive/spring10/cos461/
1
GoalsofToday’sLecture
• Path‐vectorrouQng– Fasterconvergencethandistancevector– MoreflexibilityinselecQngpaths
• InterdomainrouQng– AutonomousSystems(AS)– BorderGatewayProtocol(BGP)
• BGPconvergence– CausesofBGProuQngchanges– PathexploraQonduringconvergence
2
InterdomainRouQngandAutonomousSystems(ASes)
3
InterdomainRouQng• InternetisdividedintoAutonomousSystems– DisQnctregionsofadministraQvecontrol– Routers/linksmanagedbyasingle“insQtuQon”– Serviceprovider,company,university,…
• HierarchyofAutonomousSystems– Large,Qer‐1providerwithanaQonwidebackbone– Medium‐sizedregionalproviderwithsmallerbackbone– Smallnetworkrunbyasinglecompanyoruniversity
• InteracQonbetweenAutonomousSystems– InternaltopologyisnotsharedbetweenASes– …but,neighboringASesinteracttocoordinaterouQng
4
AutonomousSystemNumbers
5
AS Numbers are 16 bit values.
• Level 3: 1 • MIT: 3 • Harvard: 11 • Yale: 29 • Princeton: 88 • AT&T: 7018, 6341, 5074, … • UUNET: 701, 702, 284, 12199, … • Sprint: 1239, 1240, 6211, 6242, … • …
Currently over 50,000 in use.
whois–hwhois.arin.netas88
6
OrgName: Princeton University OrgID: PRNU Address: Office of Information Technology Address: 87 Prospect Avenue City: Princeton StateProv: NJ PostalCode: 08540 Country: US
ASNumber: 88 ASName: PRINCETON-AS ASHandle: AS88 Comment: RegDate: Updated: 2008-03-07
RTechHandle: PAO3-ARIN RTechName: Olenick, Peter RTechPhone: +1-609-258-6024 RTechEmail: [email protected] …
ASNumberTrivia
• ASnumberisa16‐bitquanQty– So,65,536uniqueASnumbers
• Somearereserved(e.g.,forprivateASnumbers)– So,only64,510areavailableforpublicuse
• ManagedbyInternetAssignedNumbersAuthority– Givesblocksof1024toRegionalInternetRegistries– IANAhasallocated39,934ASnumberstoRIRs(Jan’06)
• RIRsassignASnumberstoinsQtuQons– RIRshaveassigned34,827(Jan’06)– Only21,191arevisibleininterdomainrouQng(Jan’06)
• Recentlystartedassigning32‐bitAS#s(2007)7
GrowthofASnumbers
8
InterdomainRouQng
• AS‐leveltopology– DesQnaQonsareIPprefixes(e.g.,12.0.0.0/8)– NodesareAutonomousSystems(ASes)– EdgesarelinksandbusinessrelaQonships
9
1
2
3 4
5
6 7
Client Web server
ChallengesforInterdomainRouQng
• Scale– Prefixes:200,000,andgrowing– ASes:20,000+visibleones,and60Kallocated– Routers:atleastinthemillions…
• Privacy– ASesdon’twanttodivulgeinternaltopologies– …ortheirbusinessrelaQonshipswithneighbors
• Policy– NoInternet‐widenoQonofalinkcostmetric– Needcontroloverwhereyousendtraffic– …andwhocansendtrafficthroughyou
10
Path‐VectorRouQng
11
Shortest‐PathRouQngisRestricQve• Alltrafficmusttravelonshortestpaths
• AllnodesneedcommonnoQonoflinkcosts
• IncompaQblewithcommercialrelaQonships
12
Regional ISP1
Regional ISP2
Regional ISP3
Cust1 Cust3 Cust2
National ISP1
National ISP2
YES
NO
Link‐StateRouQngisProblemaQc
• TopologyinformaQonisflooded– Highbandwidthandstorageoverhead– ForcesnodestodivulgesensiQveinformaQon
• EnQrepathcomputedlocallypernode– Highprocessingoverheadinalargenetwork
• MinimizessomenoQonoftotaldistance– Worksonlyifpolicyissharedanduniform
• TypicallyusedonlyinsideanAS– E.g.,OSPFandIS‐IS
13
DistanceVectorisontheRightTrack
• Advantages– Hidesdetailsofthenetworktopology– Nodesdetermineonly“nexthop”towardthedest
• Disadvantages– MinimizessomenoQonoftotaldistance,whichisdifficultinaninterdomainsemng
– SlowconvergenceduetothecounQng‐to‐infinityproblem(“badnewstravelsslowly”)
• Idea:extendthenoQonofadistancevector– Tomakeiteasiertodetectloops
14
Path‐VectorRouQng
• Extensionofdistance‐vectorrouQng– SupportflexiblerouQngpolicies– Avoidcount‐to‐infinityproblem
• Keyidea:adverQsetheenQrepath– Distancevector:senddistancemetricperdestd– Pathvector:sendtheen,repathforeachdestd
15
3 2 1
d
“d: path (2,1)” “d: path (1)”
data traffic data traffic
FasterLoopDetecQon
• Nodecaneasilydetectaloop– LookforitsownnodeidenQfierinthepath– E.g.,node1seesitselfinthepath“3,2,1”
• Nodecansimplydiscardpathswithloops– E.g.,node1simplydiscardstheadverQsement
16
3 2 1
“d: path (2,1)” “d: path (1)”
“d: path (3,2,1)”
FlexiblePolicies
• Eachnodecanapplylocalpolicies– PathselecQon:Whichpathtouse?
– Pathexport:WhichpathstoadverQse?
• Examples– Node2maypreferthepath“2,3,1”over“2,1”
– Node1maynotletnode3hearthepath“1,2”
17
2 3
1
2 3
1
BorderGatewayProtocol(BGP)
18
BorderGatewayProtocol
• InterdomainrouQngprotocolfortheInternet
– Prefix‐basedpath‐vectorprotocol– Policy‐basedrouQngbasedonASPaths– Evolvedduringthepast18years
19
• 1989 : BGP-1 [RFC 1105], replacement for EGP
• 1990 : BGP-2 [RFC 1163] • 1991 : BGP-3 [RFC 1267] • 1995 : BGP-4 [RFC 1771], support for CIDR • 2006 : BGP-4 [RFC 4271], update
BGPOperaQons
20
Establish session on TCP port 179
Exchange all active routes
Exchange incremental updates
AS1
AS2
While connection is ALIVE exchange route UPDATE messages
BGP session
IncrementalProtocol
• AnodelearnsmulQplepathstodesQnaQon– StoresalloftheroutesinarouQngtable– AppliespolicytoselectasingleacQveroute– …andmayadverQsetheroutetoitsneighbors
• Incrementalupdates– Announcement
• UponselecQnganewacQveroute,addnodeidtopath• …and(opQonally)adverQsetoeachneighbor
– Withdrawal• IftheacQverouteisnolongeravailable• …sendawithdrawalmessagetotheneighbors
21
IncrementalProtocol
• AnodelearnsmulQplepathstodesQnaQon– StoresalloftheroutesinarouQngtable– AppliespolicytoselectasingleacQveroute– …andmayadverQsetheroutetoitsneighbors
• Incrementalupdates– Announcement
• UponselecQnganewacQveroute,addnodeidtopath• …and(opQonally)adverQsetoeachneighbor
– Withdrawal• IftheacQverouteisnolongeravailable• …sendawithdrawalmessagetotheneighbors
22
ExportacQveroutes
• InconvenQonalpathvectorrouQng,anodehasonerankingfuncQon,whichreflectsitsrouQngpolicy
23
BGPRoute• DesQnaQonprefix(e.g.,128.112.0.0/16)• RouteaDributes,including
– ASpath(e.g.,“701888”)– Next‐hopIPaddress(e.g.,12.127.0.121)
24
AS 88 Princeton
128.112.0.0/16 AS path = 88 Next Hop = 192.0.2.1
AS 7018 AT&T
AS 11 Yale
192.0.2.1
128.112.0.0/16 AS path = 7018 88 Next Hop = 12.127.0.121
12.127.0.121
ASPATHADribute
25
AS7018 128.112.0.0/16 AS Path = 88
AS 1239 Sprint
AS 1755 Ebone
AT&T
AS 3549 Global Crossing
128.112.0.0/16 AS Path = 7018 88
128.112.0.0/16 AS Path = 3549 7018 88
AS 88
128.112.0.0/16 Princeton
Prefix Originated
AS 12654 RIPE NCC RIS project
AS 1129 Global Access
128.112.0.0/16 AS Path = 7018 88
128.112.0.0/16 AS Path = 1239 7018 88
128.112.0.0/16 AS Path = 1129 1755 1239 7018 88
128.112.0.0/16 AS Path = 1755 1239 7018 88
BGPPathSelecQon
• Simplestcase– ShortestASpath– ArbitraryQebreak
• Example– Three‐hopASpathpreferredoverafive‐hopASpath
– AS12654preferspaththroughGlobalCrossing
• But,BGPisnotlimitedtoshortest‐pathrouQng– Policy‐basedrouQng
26
128.112.0.0/16 AS Path = 3549 7018 88
AS 1129
128.112.0.0/16 AS Path = 1129 1755 1239 7018 88
AS 3549 Global Crossing
AS 12654 RIPE NCC RIS project
Global Access
ASPathLength!=RouterHops
• ASpathmaybelongerthanshortestASpath
• Routerpathmaybelongerthanshortestpath
27
2 AS hops, 8 router hops
s d
3 AS hops, 7 router hops
BGPConvergence
28
CausesofBGPRouQngChanges
• Topologychanges– Equipmentgoingupordown– Deploymentofnewroutersorsessions
• BGPsessionfailures– Duetoequipmentfailures,maintenance,etc.– Or,duetocongesQononthephysicalpath
• ChangesinrouQngpolicy– Changesinpreferencesintheroutes– Changesinwhethertherouteisexported
• PersistentprotocoloscillaQon– ConflictsbetweenpoliciesindifferentASes
29
BGPSessionFailure• BGPrunsoverTCP
– BGPonlysendsupdateswhenchangesoccur
– TCPdoesn’tdetectlostconnecQvityonitsown
• DetecQngafailure– Keep‐alive:60seconds– HoldQmer:180seconds
30
AS1
AS2
• ReacQngtoafailure– Discardallrouteslearnedfromtheneighbor– Sendnewupdatesforanyroutesthatchange– Overheadincreaseswith#ofroutes
• WhymanyQer‐1ASesfilterprefixes</24
RouQngChange:BeforeandAter
31
0
1 2
3
0
1 2
3
(1,0) (2,0)
(3,1,0)
(2,0)
(1,2,0)
(3,2,0)
RouQngChange:PathExploraQon
• AS1– Deletetheroute(1,0)– Switchtonextroute(1,2,0)– Sendroute(1,2,0)toAS3
• AS3– Sees(1,2,0)replace(1,0)– Comparestoroute(2,0)– SwitchestousingAS2
32
0
1 2
3
(2,0)
(1,2,0)
(3,2,0)
RouQngChange:PathExploraQon
• IniQalsituaQon– DesQnaQon0isalive– AllASesusedirectpath
• WhendesQnaQondies– AllASeslosedirectpath– Allswitchtolongerpaths– Eventuallywithdrawn
• E.g.,AS2– (2,0)(2,1,0)– (2,1,0)(2,3,0)– (2,3,0)(2,1,3,0)– (2,1,3,0)null 33
1 2
3
0
(1,0) (1,2,0) (1,3,0)
(2,0) (2,1,0) (2,3,0)
(2,1,3,0)
(3,0) (3,1,0) (3,2,0)
BGPConvergesSlowly
• Pathvectoravoidscount‐to‐infinity– But,ASessQllmustexploremanyalternatepaths– …tofindthehighest‐rankedpaththatissQllavailable
• Fortunately,inpracQce– MostpopulardesQnaQonshaveverystableBGProutes– AndmostinstabilityliesinafewunpopulardesQnaQons
• SQll,lowerBGPconvergencedelayisagoal– Canbetensofsecondstotensofminutes– HighforimportantinteracQveapplicaQons– …orevenconvenQonalapplicaQon,likeWebbrowsing
34
35
BGPNotGuaranteedtoConverge
Exampleknownasa“disputewheel”
(3d)isavailable
(2d)isavailable
(3d)isnotavailable
(1d)isavailable (2d)isnot
available
(1d)isnotavailable
Conclusions
• BGPissolvingahardproblem– RouQngprotocoloperaQngataglobalscale– Withtensofthousandsofindependentnetworks– Thateachhavetheirownpolicygoals– Andallwantfastconvergence
• KeyfeaturesofBGP– Prefix‐basedpath‐vectorprotocol– Incrementalupdates(announcementsandwithdrawals)
• Nextlecture:Tricksforsemngpolicy!
36