Deploying and Troubleshooting BGP Networks
© 2000, Cisco Systems, Inc. CCIE’00 Paris
Agenda
• Basics
• Peering
• Attributes and Route Selection Algorithm
• Prefix Generation and Aggregation
Agenda (cont)
• Soft Reconfiguration
• Internal mesh reduction
• MP-BGP
Basics
Autonomous system
• Collection of networks under a single technical administration
• Range: 1 to 65,535 (private: 64512 to 65534)
[Figure: routers A to E spread across AS 123, AS 456 and AS 678]
Autonomous systems
• Stub AS
[Figure: a stub AS connected to an ISP]
Autonomous systems
• Multihomed Nontransit AS
AS 3
AS 1
AS 2
Autonomous systems
• Multihomed Transit AS
AS 3
AS 1
AS 2
• BGP session established on top of TCP (port 179)
• Reliable transport layer
• TCP needs a routing layer (IGP)
BGP session
• BGP uses a database (BGP table)
• Databases are exchanged after session set up
• Only incremental updates are sent afterwards
[Figure: BGP and the IGP feeding the FIB]
BGP table
• BGP supports CIDR
• NLRI (Network Layer Reachability Information): the information carried and exchanged by BGP
Generalities
• eBGP is used to exchange NLRI between Autonomous Systems
• iBGP is used to carry NLRI within the Autonomous System
• A BGP router has internal and/or external neighbors
iBGP vs eBGP
[Figure: an eBGP session between AS 1 and AS 2, and an iBGP session inside an AS]
• Learns multiple paths via internal and external BGP speakers
• Picks THE best path and installs it in the IP forwarding table
• Policies applied by influencing the best path selection
General operation
• BGP speaker advertises only the routes that it uses itself
“hop-by-hop” routing paradigm
• Reliable transport protocol (TCP)
no need for BGP to implement fragmentation, retransmission, acknowledgements or sequencing
assumes a “graceful” close: all outstanding data will be delivered
General operation
• Routes learned from eBGP -> advertised to all peers
• Routes learned from iBGP -> advertised only to eBGP peers
a full iBGP mesh is therefore required!!
• Propagate ONLY the best path
Information Transfer
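The two advertisement rules above can be sketched in a few lines of Python. This is an illustration only, not part of the original deck: the function name `should_advertise` and the "ebgp"/"ibgp" labels are assumptions, and route reflectors and confederations are ignored.

```python
def should_advertise(learned_from, peer_type):
    """Return True if a best path learned from `learned_from` may be sent to `peer_type`.

    Both arguments are the strings "ebgp" or "ibgp" (illustrative labels).
    """
    if learned_from == "ebgp":
        # Routes learned from an external peer go to every peer.
        return True
    # Routes learned from an internal peer are only passed on to external peers,
    # which is why every iBGP speaker needs a session to every other one.
    return peer_type == "ebgp"
```

The `ibgp` to `ibgp` case returning False is exactly the reason the full iBGP mesh is needed.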
When should you use BGP?
• Most appropriate for
Multihomed transit and non-transit AS
Scaling large networks
Deploying new IP-VPN services (MBGP)
• Not appropriate on stub AS (static route instead)
Peering
Peers
[Figure: BGP peers A to E spread across AS 100, AS 101 and AS 102]
BGP message types
• OPEN
• UPDATE
• NOTIFICATION
• KEEPALIVE
• Message size: 19 to 4096 octets
Open message
Fields, in order:
Version (1 byte)
My autonomous system (2 bytes)
Hold Time (2 bytes)
BGP identifier (4 bytes)
Opt param Len (1 byte)
Optional parameters (variable)
Hold time = maximum time (in seconds) that may elapse between the receipt of successive UPDATE or KEEPALIVE messages. Negotiated when the session starts.
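As a rough illustration of the OPEN layout and the hold-time negotiation, here is a minimal Python sketch. It is not from the original slides: the helper names are assumptions, it handles only 2-byte AS numbers and no optional parameters, and the common 19-byte BGP header is left out.

```python
import socket
import struct

# Fixed part of the OPEN body:
# version (1), my AS (2), hold time (2), BGP identifier (4), opt param len (1)
OPEN_FORMAT = "!BHH4sB"


def build_open(my_as, hold_time, bgp_id, version=4):
    """Pack an OPEN body that carries no optional parameters."""
    return struct.pack(OPEN_FORMAT, version, my_as, hold_time,
                       socket.inet_aton(bgp_id), 0)


def parse_open(body):
    """Unpack the fixed 10-byte part of an OPEN body into a dict."""
    version, my_as, hold_time, raw_id, opt_len = struct.unpack(
        OPEN_FORMAT, body[:10])
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        "bgp_id": socket.inet_ntoa(raw_id),
        "opt_param_len": opt_len,
    }


def negotiated_hold_time(local, remote):
    """Both peers use the smaller of the two proposed hold times."""
    return min(local, remote)
```

For example, `parse_open(build_open(109, 180, "131.108.10.1"))` gives back the same AS number, hold time and identifier that were packed.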
Notification message
Fields: Error code (1 byte), Error subcode (1 byte), Data (variable)
Error codes and subcodes:
1 - Message header error: 1: Connection not synchronized; 2: Bad message length; 3: Bad message type
2 - OPEN message error: 1: Unsupported version number; 2: Bad peer AS; 3: Bad BGP identifier; 4: Unsupported optional parameter; 5: Authentication error; 6: Unacceptable hold time
3 - UPDATE message error: 1: Malformed attribute list; 2: Unrecognised well-known attribute; 3: Missing well-known attribute; (…)
4 - Hold timer expired: no subcode
5 - Finite state machine error: no subcode
6 - Cease: no subcode
Update message
Unfeasible Routes Length (2 bytes)
Withdrawn routes (variable length): list of <Length, Prefix> entries for unreachable routes
Total Path Attribute Length (2 bytes)
Path Attributes (variable length)
NLRI information (variable length): list of <Length, Prefix> entries
Path Attributes
• 4 Categories:
Well-Known mandatory (ex: AS_Path, next-hop, origin)
Well-Known discretionary (ex: local pref)
Optional transitive: should be passed along even if not supported (ex: community, aggregator)
Optional nontransitive (ex: MED)
Keepalive message
• 19 Byte BGP header with no data
• Periodically exchanged.
• Hold time = max time between successive Keepalive and Update messages.
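A KEEPALIVE is nothing more than the common BGP header. The following Python sketch (an illustration, with an assumed helper name `build_keepalive`) builds one: a 16-byte marker of all ones, a 2-byte length and a 1-byte type.

```python
import struct

# Message type codes carried in the last byte of the header
OPEN, UPDATE, NOTIFICATION, KEEPALIVE = 1, 2, 3, 4

HEADER_LENGTH = 19  # marker (16) + length (2) + type (1)


def build_keepalive():
    """Return a KEEPALIVE message: a bare 19-byte header with no data."""
    marker = b"\xff" * 16
    return marker + struct.pack("!HB", HEADER_LENGTH, KEEPALIVE)
```

The length field equals the smallest legal message size, which is where the 19-octet lower bound on BGP messages comes from.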
Neighbor negotiation’s finite state machine
[Figure: states Idle (START), Connect, Active, OpenSent, OpenConfirm, Established (GOAL)]
Idle state:
Start event (including a reset) -> Connect
Connect state: BGP is waiting for the transport session to start
TCP session successful -> OpenSent
TCP session not OK -> Active
Connect retry timer expires -> new TCP session
Active state: BGP tries to establish a TCP session and listens for other potential peers
TCP session successfully established -> OpenSent
Connect retry timer expires -> Connect
Troubleshooting tip: a neighbor state flip-flopping between Connect and Active indicates a problem with the TCP session. Use extended ping to check reachability.
OpenSent state: Open message sent. BGP waits for the neighbor’s Open message
In case of error (ex: bad version) -> Notification message sent
If the Open message is OK, send a Keepalive -> OpenConfirm
If a TCP disconnect is received -> Active
OpenConfirm state: BGP waits for a Keepalive
Keepalive received -> Established
Notification message received -> Idle
Established state: sends periodic Keepalives
If a Notification message is received or sent -> Idle
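The neighbor state machine walked through above can be written as a small transition table. This Python sketch is a simplification for illustration: the event names are invented here, and any event not listed for a state simply drops the session back to Idle.

```python
# (current state, event) -> next state; event names are illustrative only
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_ok"): "OpenSent",
    ("Connect", "tcp_fail"): "Active",
    ("Connect", "retry_timer"): "Connect",
    ("Active", "tcp_ok"): "OpenSent",
    ("Active", "retry_timer"): "Connect",
    ("OpenSent", "open_ok"): "OpenConfirm",
    ("OpenSent", "tcp_disconnect"): "Active",
    ("OpenConfirm", "keepalive"): "Established",
    ("Established", "keepalive"): "Established",
}


def run(events, state="Idle"):
    """Feed a list of events to the state machine and return the final state."""
    for event in events:
        # Errors, notifications and unexpected events all fall back to Idle.
        state = TRANSITIONS.get((state, event), "Idle")
    return state
```

A clean session setup is `run(["start", "tcp_ok", "open_ok", "keepalive"])`; the Connect/Active flip-flop from the troubleshooting tip shows up as alternating `tcp_fail` and `retry_timer` events.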
[Figure: router A (AS 109, 131.108.0.0/16) and router B (AS 110, 150.10.0.0/16) connected over 131.108.10.0/24; A is .1 and B is .2]
• BGP speakers in different AS
• Should be directly connected
• Configuration:
Router B
router bgp 110
 network 150.10.0.0
 neighbor 131.108.10.1 remote-as 109
Router A
router bgp 109
 network 131.108.0.0
 neighbor 131.108.10.2 remote-as 110
eBGP Peering
[Figure: same topology as the previous slide; router A peers with router B’s address 150.10.0.1, which is not on the shared 131.108.10.0/24 link]
• Non directly connected neighbors
-> ebgp-multihop
• Configuration:
Router B
router bgp 110
 neighbor 131.108.10.1 remote-as 109
 neighbor 131.108.10.1 update-source ethernet 0
Router A
router bgp 109
 neighbor 150.10.0.1 remote-as 110
 neighbor 150.10.0.1 ebgp-multihop
ip route 150.10.0.1 255.255.255.255 131.108.10.2
iBGP Peering
[Figure: routers A (.1) and B (.2) in AS 123 connected over 131.108.10.0/24; router B has loopback 10.0.0.2/32]
• BGP speakers in same AS
• Use loopback interfaces -> update-source loopback 0
• Configuration:
Router B
router bgp 123
 neighbor 131.108.10.1 remote-as 123
 neighbor 131.108.10.1 update-source loopback 0
Router A
router bgp 123
 neighbor 10.0.0.2 remote-as 123
• Use of <ebgp-multihop>
• Use the loopback on both routers
• Provide routing between the loopback interfaces across the DMZ (the example uses static routes)
• Configuration:
router bgp 201
 neighbor x.x.x.x remote-as ISP-AS
 neighbor x.x.x.x update-source loopback0
 neighbor x.x.x.x ebgp-multihop
!
ip route x.x.x.x 255.255.255.255 next-hop0/1
ip route x.x.x.x 255.255.255.255 next-hop0/2
[Figure: AS 201 connected to the ISP over two parallel links]
Load Balancing across parallel links
Typical issue with eBGP multihop
• Use specific static routes
ex: ip route x.x.x.x 255.255.255.255 next-hop0/1
• Without a specific (host) static route, you could end up learning via BGP a better prefix (longer match) for reaching the neighbor.
-> The session restarts continuously.
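The longest-match problem described above can be reproduced with Python's standard `ipaddress` module. This is a sketch under assumptions: the function name, the `(prefix, source)` route format and the 192.0.2.0/24 documentation addresses are all made up for illustration.

```python
import ipaddress


def longest_match(routes, destination):
    """Return (prefix, source) of the most specific route covering `destination`.

    `routes` is a list of (prefix, source) pairs, e.g. ("192.0.2.0/24", "static").
    Returns None when no route matches.
    """
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, source in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network, source))
    if not matches:
        return None
    # The route with the longest prefix length wins.
    network, source = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), source


PEER = "192.0.2.1"  # hypothetical multihop neighbor address

# A /24 static route is overridden as soon as BGP learns a /28 covering the peer.
risky = [("192.0.2.0/24", "static"), ("192.0.2.0/28", "bgp")]

# A host (/32) static route can never be beaten by a longer match.
safe = risky + [("192.0.2.1/32", "static")]
```

With `risky`, the route to the peer now depends on the very session it is supposed to carry; with `safe`, the static host route keeps winning.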
MultiPath Support
• Router peering with multiple routers in neighboring AS
• Install multiple routes in IP routing table
• Routes should be identical
• Next-hop is set to self (use loopback interface)
[Figure: router A in AS 201 peering with ISP routers D and F]
• Configuration:
router bgp 201
 neighbor 141.153.12.1 remote-as 2
 neighbor 141.153.17.2 remote-as 2
 maximum-paths 2
• <sh ip route>
B 144.10.0.0/16 [20/0] via 141.153.12.1, 00:03:29
                [20/0] via 141.153.17.2, 00:03:29
MultiPath Support (Cont.)
Summary: Typical Peering issues
• Extended ping fails -> IGP issue
• Update source missing
• No directly connected route to neighbor (eBGP) + forgot ebgp-multihop
• ebgp-multihop but wrong (or not specific enough) static route to neighbor
Attributes and Route Selection Algorithm
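• AS-path (well-known mandatory)
• Next-hop (well-known mandatory)
• Origin (well-known mandatory)
• Local preference (well-known discretionary)
• Atomic aggregate (well-known discretionary)
• Aggregator (optional transitive)
• Community (optional transitive)
• Multi Exit Discriminator (MED) (optional nontransitive)
BGP Attributes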
Synchronization
“In a transit network, a route learned from an external peer should not be advertised to other eBGP peers until all the routers in the local AS have learned about it.”
• Rtr A won’t advertise the prefixes from AS 209 until the IGP converges.
• Turn synchronization off!
next-hop has to be known via IGP
router bgp 1880
 no synchronization
[Figure: routers A and B in AS 1880, with neighboring AS 209 and AS 690]
Synchronization
• Rtr A won’t advertise the prefixes from AS 209 until the IGP converges.
• Solutions:
redistribute BGP into the IGP (NOT recommended!)
run BGP on rtr B
[Figure: routers A, B and C in AS 1880, with neighboring AS 209 and AS 690]
Synchronization
• Why?
not a transit network
all routers in transit path run BGP
• Advantages
carry fewer routes in IGP
BGP converges faster
no synchronization
[Figure: router A (AS 109, 131.108.0.0/16) and router B (AS 110, 150.10.0.0/16) connected over 131.108.10.0/24 (.1 and .2)]
• The next hop to reach a network
eBGP: IP address of the peer
iBGP: the NEXT_HOP advertised by eBGP is kept
The IGP should carry routes to the NEXT_HOPs
• Recursive route lookup
Unlinks BGP from the physical topology
Allows the IGP to make an intelligent forwarding decision
• Unreachable next-hop -> route not used
NEXT_HOP
[Figure: routers A, B and C with addresses 150.1.1.1, 150.1.1.2 and 150.1.1.3; AS 200 and AS 201; prefix 192.68.1.0/24]
• Example:
A and B are in the same AS
Router A will advertise 192.68.1.0/24 with a NEXT_HOP of 150.1.1.3.
• More efficient!
Third-Party NEXT_HOP
• Use of <next-hop-self>
• Example:
A and B are in the same AS
Router A will advertise 150.10.0.0 with a NEXT_HOP of 131.108.10.1, but router C can’t reach the next-hop!!
• Configuration (rtr A):
router bgp 109
 network 150.10.0.0
 neighbor 131.108.10.3 next-hop-self
[Figure: routers A, B and C on the Frame Relay network 131.108.10.0 (.1, .2 and .3); network 150.10.0.0]
Third-Party NEXT_HOP
• Alternative to configuring a specific IP address to be the next-hop for BGP routes
• Syntax (route-map command):
set ip next-hop peer-address
Override Third-Party Next-Hop
• set ip next-hop: best used on an outbound route-map
• Be careful when manipulating next-hop and default routes. Routing loops can occur!
Solution: Good network design
Override Third-Party Next-Hop (Cont.)
• Cisco specific (a sort of router-internal local preference)
• Local to the router
Not propagated
• Value: 0 - 65535
• Default: originated locally = 32768, other = 0
WEIGHT
• Indication of preferred path to exit the local AS
• Global to the local AS
• Paths with highest LOCAL-PREF are most desirable (default = 100)
The default can be changed with: bgp default local-preference <value>
LOCAL_PREF
• Configuration (rtr A):
router bgp 109
 neighbor x.x.x.x remote-as 1880
 neighbor x.x.x.x route-map foo in
!
route-map foo permit 10
 match as-path 2
 set local-preference 120
!
ip as-path access-list 2 permit ^1880_
[Figure: router A with neighboring AS 1755 and AS 1880; AS 666 and AS 690 beyond them; traffic needs to go to AS 690]
LOCAL_PREF (Cont.)
• AS-PATH contains the list of ASs the update had to traverse.
• The sending router updates the AS-PATH with its own AS number.
• BGP uses the AS-PATH to detect routing loops.
AS_PATH
• Each time the router receives an eBGP update it checks the AS-PATH.
• If it finds its own AS number in the AS-PATH, the update is discarded.
AS_PATH
[Figure: router A in AS 1880 (141.253.10.0/24), router B in AS 690, router C in AS 200]
1. Router A sends update for 141.253.10.0/24 with AS_PATH: 1880
2. Router B sends update for 141.253.10.0/24 with AS_PATH: 690 1880
3. Router C sends update for 141.253.10.0/24 with AS_PATH: 200 690 1880
4. Router A will detect its own AS number and will discard the update
AS_PATH
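The four steps above map directly onto two small functions. This Python sketch is illustrative only; `send_update` and `accept_update` are assumed names, and the AS_PATH is modelled as a plain list of AS numbers.

```python
def send_update(as_path, own_as):
    """Return the AS_PATH as sent to an eBGP peer: own AS number put in front."""
    return [own_as] + list(as_path)


def accept_update(as_path, own_as):
    """An eBGP update is discarded when the receiver's own AS is already in the path."""
    return own_as not in as_path


# Walk the update for 141.253.10.0/24 around the loop A -> B -> C -> A.
path_from_a = send_update([], 1880)           # step 1
path_from_b = send_update(path_from_a, 690)   # step 2
path_from_c = send_update(path_from_b, 200)   # step 3
```

Step 4 is then `accept_update(path_from_c, 1880)`, which rejects the update because 1880 is already present.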
[Figure: you, connected to the Internet through ISP 1 and ISP 2]
Problem: 80% of the incoming traffic comes from ISP 1
AS_PATH manipulation: AS-PATH prepending
[Figure: AS 250 connected to the Internet through ISP 1 and ISP 2; it announces As-path: 250 250 250 on one link and As-Path: 250 on the other]
AS_PATH manipulation: AS-PATH prepending
Solution:
route-map prepend permit 10
 match as-path 2
 set as-path prepend 250 250
• neighbor x.x.x.x remove-private-AS
available for eBGP neighbors only
The update must have an AS_PATH made up exclusively of private AS numbers.
Confederations: a private AS is removed only if it comes after the confederation’s set of ASs
remove-private-AS will not work if the private ASN you want to remove is the neighboring one!
AS_PATH manipulation: Private-AS Removal
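The conditions listed above can be summarised in a short Python sketch. It is a simplified model, not the IOS implementation: the function names are assumptions, confederations are not handled, and the private range is the one given earlier in this deck (64512 to 65534).

```python
def is_private(asn):
    """Private AS range as stated earlier in this deck."""
    return 64512 <= asn <= 65534


def remove_private_as(as_path, neighbor_as):
    """Return the AS_PATH after private-AS removal, before the local AS is added.

    The path is only stripped when it consists exclusively of private AS
    numbers and the eBGP neighbor's own AS is not one of them.
    """
    if not as_path or not all(is_private(asn) for asn in as_path):
        return list(as_path)  # mixed or empty path: left untouched
    if neighbor_as in as_path:
        return list(as_path)  # would hide the neighbor's own AS: left untouched
    return []
```

For a single-homed customer in AS 65001, `remove_private_as([65001], 690)` yields an empty path, so the upstream only sees the provider's AS once it has been prepended.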
Private-AS - Application
• Applications include:
ISP with single-homed customers
Scaling big corporate networks
[Figure: router A in AS 1880 (193.1.34.0/24) with customers AS 65003 (193.2.35.0/24), AS 65002 (193.0.33.0/24) and AS 65001 (193.0.32.0/24); advertised upstream: 193.1.32.0/22 with AS_PATH 1880]
Misc issue with AS_PATH
• Error message: #%BGP-3-INSUFCHUNKS: Insufficient chunk pools for aspath
• Router keeps working fine!!!
• Appears when the router gets an update with an AS_PATH longer than 50 ASs
• Since 12.0(11) and 12.1(2), only appears when the AS_PATH is longer than 125 ASs
• Origin of the prefix
• Values:
IGP (i) = via network command
EGP (e) = learned from EGP
incomplete (?) = redistribution
ORIGIN
• Indication (to external peers) of the preferred path into an AS
used in multiple entry AS
non-transitive
• Compared only for routes from the same AS
• Lower MED value is more preferable
Multi-Exit Discriminator (MED)
• Configuration (rtr B):
router bgp 1755
 neighbor x.x.x.x remote-as 1880
 neighbor x.x.x.x route-map set_MED out
!
route-map set_MED permit 10
 match as-path 2
 set metric 2
!
ip as-path access-list 2 permit _690$
[Figure: routers A and B; AS 1755, AS 690, AS 1880 and AS 209]
MED
• set metric-type internal
enables BGP to advertise a MED that corresponds to the IGP metric value
changes are monitored (and readvertised if needed) every 600 s
the interval is set with: bgp dynamic-med-interval <secs>
MED & IGP Metric
• MED is compared ONLY for prefixes received from the same AS
(unless bgp always-compare-med is enabled)
• If the AS_PATH is made up of only confederation sub-ASs, its length is not considered AND the MED is not compared
• If an update is received with no MED, the router (by default) assigns it a value of 0
MED Comparison
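A minimal Python sketch of the comparison rules above follows. The dictionary keys (`neighbor_as`, `med`) and the function name are assumptions made for the example, and the confederation case is not modelled.

```python
def prefer_by_med(path_a, path_b, always_compare_med=False):
    """Return the path preferred by MED, or None when MED cannot decide.

    Each path is a dict with a "neighbor_as" key and an optional "med" key.
    """
    same_as = path_a["neighbor_as"] == path_b["neighbor_as"]
    if not (same_as or always_compare_med):
        return None  # MED is only compared for paths from the same AS
    # A missing MED is treated as 0 (the default behaviour described above).
    med_a = path_a.get("med", 0)
    med_b = path_b.get("med", 0)
    if med_a == med_b:
        return None  # tie: later steps of the selection decide
    return path_a if med_a < med_b else path_b
```

Note that a path without a MED beats a path with any positive MED, which is a common surprise when only one of two links sets the attribute.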
• Used to group destinations and apply a common policy
• Each prefix can belong to multiple communities
• Not propagated by default; enabled per neighbor with:
neighbor ip-address send-community
• Defined in RFC 1997
Community Attribute
Community Attribute (Cont.)
• 32 bits long; the first 16 bits are used to indicate the ASN
ip bgp-community new-format
set community AS:community [additive]
set community none
erases all the values in the attribute
set comm-list <number> delete
erases selected communities
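The AS:community notation is just a split of the 32-bit value into two 16-bit halves. The Python sketch below shows the arithmetic; the function names are illustrative assumptions.

```python
def encode_community(asn, value):
    """Pack AS:value into the 32-bit community number (ASN in the upper 16 bits)."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | value


def decode_community(community):
    """Split a 32-bit community number back into (asn, value)."""
    return community >> 16, community & 0xFFFF


def new_format(community):
    """Render a community the way the AS:community notation displays it."""
    asn, value = decode_community(community)
    return f"{asn}:{value}"
```

For example, `109:120` is stored as 109 × 65536 + 120 = 7143544, and `new_format(7143544)` turns it back into the readable form.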
• internet = all routes are members of this community
• no-export = do not advertise to eBGP peers
• no-advertise = do not advertise to any peer
• local-AS = do not advertise outside local AS (used with confederations)
Well-Known Communities
[Figure: routers A to G in AS 100 and AS 200; the more specific 170.10.X.X routes are sent with No-Export, so only 170.10.0.0/16 is advertised further]
No-Export Community
Extended Community Attribute
• Extended range
8 Bytes (64 bits)
• Structure
type:value
Value may be of the form AS:xxx
draft-ramachandra-bgp-ext-communities-01
• 1 Only consider paths with reachable NEXT_HOPs
• 2 Do not consider iBGP path if not synchronized
• 3 Highest WEIGHT
• 4 Highest LOCAL_PREF
• 5 Prefer locally originated route
• 6 Shortest AS_PATH
BGP Path Selection
• 7 Lowest ORIGIN code: IGP < EGP < incomplete
• 8 Lowest Multi-Exit Discriminator (MED)
8a IF bgp always-compare-med, then compare it for all paths
8b Considered only if paths are from the same neighbor AS
• 9 Prefer an External path over an Internal one
• 10 Lowest IGP metric to the NEXT_HOP
BGP Path Selection
BGP Path Selection (Cont.)
• 11 IF multipath is enabled, the router may install up to N parallel paths in the routing table
• 12 For eBGP paths, select the “oldest” to minimize route-flap
• 13 Lowest Router-ID
Originator-ID is considered for reflected routes
• 14 Shortest Cluster-List
Client must be aware of RR attributes!
• 15 Lowest neighbor IP address
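Most of the list above is an ordered series of tie-breakers, which can be expressed as a sort key. The following Python sketch is a deliberately reduced model: the field names and the `make_path` helper are assumptions, MED is compared for all paths (as if always-compare-med were set), and steps 2, 11, 12 and 14 are left out.

```python
import ipaddress

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


def make_path(**fields):
    """Build a path dict with neutral defaults (hypothetical helper)."""
    path = dict(weight=0, local_pref=100, local=False, as_path=[],
                origin="igp", med=0, ebgp=True, igp_metric=0,
                router_id="0.0.0.1", neighbor="0.0.0.1",
                next_hop_reachable=True)
    path.update(fields)
    return path


def sort_key(path):
    """Smaller tuple = better path; the order follows steps 3 to 15."""
    return (
        -path["weight"],                                  # 3 highest weight
        -path["local_pref"],                              # 4 highest local pref
        0 if path["local"] else 1,                        # 5 locally originated
        len(path["as_path"]),                             # 6 shortest AS_PATH
        ORIGIN_RANK[path["origin"]],                      # 7 lowest origin code
        path["med"],                                      # 8 lowest MED
        0 if path["ebgp"] else 1,                         # 9 external over internal
        path["igp_metric"],                               # 10 lowest IGP metric
        int(ipaddress.ip_address(path["router_id"])),     # 13 lowest router-ID
        int(ipaddress.ip_address(path["neighbor"])),      # 15 lowest neighbor address
    )


def best_path(paths):
    """Step 1 filters unreachable next hops, then the tie-breakers apply in order."""
    usable = [p for p in paths if p["next_hop_reachable"]]
    return min(usable, key=sort_key) if usable else None
```

Because the key is compared left to right, a higher LOCAL_PREF wins even against a shorter AS_PATH, which is the property the LOCAL_PREF example earlier in the deck relies on.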
Prefix Generation And Aggregation
Say what?!
• Networks originated by the local router
• Matching IGP route must exist
dynamic or static entry in routing table
• Example:
router bgp 109
 network 200.10.10.0
 network 198.10.0.0 mask 255.255.0.0
!
ip route 198.10.0.0 255.255.0.0 null 0
<network> Command
• From IGP
Typically NOT a good thing!
• Static routes pointed to null0
• Example:
router bgp 109
 redistribute static
!
ip route 198.10.0.0 255.255.0.0 null 0
Redistribution
• Combine different routes into one
• Advertised as coming from the local AS
• A component must exist in the BGP table
Aggregate Addresses
• Aggregator attribute
AS number of the last BGP speaker that formed the aggregate route
IP address of the BGP speaker that formed the aggregate route
• Atomic Aggregate attribute
indicates a more specific route exists
a BGP speaker receiving this attribute shall not remove it when propagating the route
• Useful for debugging. They do not affect route selection.
Aggregation Attributes
Aggregate Attributes
NEXT_HOP = local
WEIGHT = 32768
LOCAL_PREF = best
AS_PATH = AS_SET or nothing
ORIGIN = worst
MED = none
• With no options it propagates the aggregate and all the components
• summary-only
Advertise ONLY the aggregate (no components)
Example:
router bgp 109
 aggregate-address 198.10.0.0 255.255.0.0 summary-only
<aggregate address>
• AS_SET
unordered set of all ASs traversed
helps avoid loops
• The as-set option advertises the aggregate and the components AND includes AS_SET information in the path
as-set
• Example:
router bgp 1880
 network 193.1.34.0
 aggregate-address 193.0.32.0 255.255.254.0 as-set
Prefixes and AS_PATHs advertised:
193.1.34/24 1880
193.0.33/24 1880 1881
193.0.32/24 1880 1883
193.0.32/23 1880 {1881,1883}
[Figure: router A in AS 1880 (193.1.34/24) with neighboring AS 1881 (193.0.33/24) and AS 1883 (193.0.32/24)]
as-set (Cont.)
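The AS_PATH shown for the aggregate in this example can be derived with a short Python sketch. It is a simplification for illustration: the function name is an assumption, every AS from the component paths goes into the set, and no attempt is made to keep a common leading sequence.

```python
def aggregate_as_path(local_as, component_paths):
    """Render the AS_PATH of an aggregate built with the as-set option.

    `component_paths` holds the AS_PATHs of the components as received by
    the aggregating router (so without the local AS).
    """
    as_set = set()
    for path in component_paths:
        as_set.update(path)
    as_set.discard(local_as)
    if not as_set:
        return str(local_as)
    # An AS_SET is unordered; it is sorted here only to get stable output.
    members = ",".join(str(asn) for asn in sorted(as_set))
    return f"{local_as} {{{members}}}"
```

With the components from the slide, `aggregate_as_path(1880, [[1881], [1883]])` produces the same `1880 {1881,1883}` path, so AS 1881 and AS 1883 can still detect a loop if the aggregate comes back to them.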
suppress-map = suppress specific components
advertise-map = create an aggregate from specific components
attribute-map = set attributes for the aggregate route
Options (Cont.): suppress-map | advertise-map | attribute-map
• Conditionally advertise prefixes: useful for dual homing
• Syntax:
neighbor <address> advertise-map <route-map> non-exist-map <route-map>
non-exist-map is periodically checked; if satisfied (i.e. routes are not in the BGP table), the prefixes matched by the advertise-map are advertised to the neighbor
Conditional Advertisement
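The conditional advertisement logic can be pictured with the following Python sketch. It is a rough model under assumptions: the two maps are reduced to plain lists of prefixes, the BGP table to a set, and the function name is invented for the example.

```python
def prefixes_to_advertise(bgp_table, advertise_map, non_exist_map):
    """Return the prefixes sent to the neighbor on one periodic check.

    bgp_table:      set of prefixes currently in the BGP table
    advertise_map:  prefixes to advertise when the condition holds
    non_exist_map:  prefixes whose absence triggers the advertisement
    """
    condition_met = not any(prefix in bgp_table for prefix in non_exist_map)
    if not condition_met:
        return []  # the watched routes are present: stay silent
    # Only prefixes that are actually in the BGP table can be advertised.
    return [prefix for prefix in advertise_map if prefix in bgp_table]
```

In a dual-homing setup the non-exist-map would typically match a route learned over the primary link, so the backup announcement only appears once that route has disappeared.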
Soft Reconfiguration
• Allows policies to be changed without clearing the neighbor
• Both inbound and outbound
Inbound requires additional memory
Outbound is more efficient
BGP Soft-Reconfiguration
• Outbound does not require any configuration
• Inbound configuration:
router bgp 30
 neighbor 141.153.12.2 remote-as 32
 neighbor 141.153.12.2 soft-reconfiguration inbound
 neighbor 141.153.12.2 route-map filter in
 neighbor 141.153.30.2 remote-as 31
• <clear ip bgp x.x.x.x soft [in|out]>
Soft-Reconfiguration
Managing Policy Changes
clear ip bgp <addr> [soft] [in|out]
• <addr> may be any of the following
x.x.x.x: IP address of a peer
*: all peers
ASN: all peers in an AS
external: all external peers
peer-group <name>: all peers in a peer-group
Route Refresh Capability
• Facilitates non-disruptive policy changes
• No configuration is needed
• No additional memory is used
• clear ip bgp x.x.x.x in
Internal mesh reduction
• An IBGP speaker does not advertise IBGP-learned information to a third IBGP speaker!!!
• Avoids routing information loops
• A full mesh does not scale
• The following solutions do not change the current behaviour
Route reflectors
Confederation
IBGP Mesh
[Figure: routers A, B and C in AS 100 with a full IBGP mesh]
Normal IBGP
[Figure: routers A, B and C in AS 100, one of them acting as route reflector]
Route Reflector: Principle
• Multiple levels of RR
[Figure: route reflectors (RR) in AS 1; router B in AS 2]
Route-reflector
• Originator_ID Attribute
carries the RID of the originator of the route in the local AS
• Cluster_list Attribute
The local cluster-id (RR router-ID) is added when the update is reflected (added by the RR)
Loop Avoidance
• When an RR receives an update:
It checks whether its cluster-id is in the cluster-list
If its cluster-id is in the cluster-list, the update is silently discarded
If the BGP update is OK, the RR adds its cluster-id to the cluster-list and reflects the update (according to the rules)
• With multiple RRs in the same cluster, the same cluster-id should be set on each of them by configuration
Loop Avoidance
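The cluster-list check performed by a route reflector can be sketched as follows in Python. The update is modelled as a dict with a `cluster_list` key and the function name `reflect` is an assumption; the Originator_ID check and the client/non-client reflection rules are not modelled.

```python
def reflect(update, cluster_id):
    """Return the update as reflected by an RR, or None when it is discarded.

    `update` is a dict with a "cluster_list" key (list of cluster-ids).
    """
    if cluster_id in update["cluster_list"]:
        return None  # the update has already been through this cluster: loop
    reflected = dict(update)
    # The RR puts its own cluster-id at the front of the list before reflecting.
    reflected["cluster_list"] = [cluster_id] + list(update["cluster_list"])
    return reflected
```

Two RRs configured with the same cluster-id will therefore drop each other's reflected updates, which is the intended behaviour inside a single cluster.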
• Collection of sub-ASs
• Visible to the outside world as a single AS
• Uses reserved AS numbers for the internal sub-ASs
• IBGP is fully meshed within each sub-AS
• EBGP between sub-ASs
Confederations
[Figure: Confederation 100 made up of sub-AS 65001, sub-AS 65002 and sub-AS 65003, with routers A, B and C]
Confederation
• Mini-ASs have eBGP-like connections to other mini-ASs
• However, they do carry all the usual IBGP information: MED, local-pref, next-hop.
Confederation: Principle
[Figure: Confederation 100 made up of sub-AS 65001, 65002, 65003 and 65004, with routers A to H. AS_PATH of 180.10.0.0/16 as it propagates:
received from AS 200: 180.10.0.0/16 200
inside the confederation: 180.10.0.0/16 {65002} 200, then 180.10.0.0/16 {65004 65002} 200
advertised outside the confederation: 180.10.0.0/16 100 200]
Confederation: AS-path
• Route-Reflectors
– Easy to configure (clients are unchanged)
– RR configuration does not require any downtime
– RR will scale easily
RR vs Confederations
• Confederations
– Maintenance is complex due to the reconfiguration of ALL routers in the AS
– Sub-confederations may have different BGP policies
RR vs Confederations
Route Dampening
• Route flaps ripple through the entire Internet
up and down of path
change in attributes
• Wastes CPU
• Objective: reduce the scope of route flap propagation
Route Flap Dampening
[Figure: penalty (0 to 4) plotted against time (0 to 25), with the Suppress-Limit and Reuse-Limit thresholds]
Route Flap Dampening
• Add fixed penalty for each flap
flap = withdraw or attribute change
• Exponentially decay penalty
half-life determines rate
• Penalty above suppress-limit = do not advertise the route, even when it is up
• Penalty decayed below reuse-limit = advertise the route again
Flap Dampening: Operation
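The penalty mechanism described above can be simulated with a few lines of Python. This is a sketch under assumptions: the class name is invented, time is in minutes, and the numeric defaults (penalty 1000 per flap, suppress-limit 2000, reuse-limit 750, half-life 15) are illustrative values chosen for the example rather than figures taken from this deck.

```python
class DampenedRoute:
    """Track the flap penalty of one route (times are in minutes)."""

    def __init__(self, half_life=15.0, suppress_limit=2000.0,
                 reuse_limit=750.0, flap_penalty=1000.0):
        self.half_life = half_life
        self.suppress_limit = suppress_limit
        self.reuse_limit = reuse_limit
        self.flap_penalty = flap_penalty
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay: the penalty halves every `half_life` minutes.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now):
        """Record a flap (withdraw or attribute change) at time `now`."""
        self._decay(now)
        self.penalty += self.flap_penalty
        if self.penalty > self.suppress_limit:
            self.suppressed = True

    def is_suppressed(self, now):
        """Return True while the route must not be advertised."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_limit:
            self.suppressed = False
        return self.suppressed
```

With these values, three flaps one minute apart push the penalty above the suppress-limit, and the route is only advertised again roughly half an hour later, once the penalty has decayed below the reuse-limit.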
MP-BGP
• Extension to the BGP protocol in order to carry routing information about other protocols
ex: Multicast, MPLS-VPN, IPv6, CLNS, ...
• Exchange of Multi-Protocol NLRI must be negotiated at session setup
BGP Capabilities negotiation
Multi-Protocol BGP
• New non-transitive and optional BGP attributes
MP_REACH_NLRI
“Carry the set of reachable destinations together with the next-hop information to be used for forwarding to these destinations” (RFC2283)
MP_UNREACH_NLRI
Carry the set of unreachable destinations
Multi-Protocol BGP - RFC2283
• Attribute contains one or more triples
1) Address Family Identifier (AFI) with Sub-AFI
Identifies the protocol information carried in the NLRI field
2) Next-Hop Information
Next-hop address must be of the same family
3) NLRI
Multi-Protocol BGP - RFC2283
• BGP routers establish BGP sessions through the OPEN message
• OPEN message contains optional parameters
• The BGP session is terminated if OPEN parameters are not recognised
• A new optional parameter: CAPABILITIES
BGP Capabilities Negotiation
• A BGP router sends an OPEN message with CAPABILITIES parameter containing its capabilities:
Multiprotocol extension
Route-refresh
...
BGP Capabilities Negotiation
• BGP routers determine capabilities of their neighbors by looking at the capabilities parameters in the open message
• Unknown or unsupported capabilities may trigger the transmission of a NOTIFICATION message
BGP Capabilities Negotiation
• MBGP: Multiprotocol BGP for Multicast NLRIs
Multicast-BGP
• Unicast and Multicast routes are carried through same BGP session
MBGP
• AFI and Sub-AFI are part of MP_REACH_NLRI and MP_UNREACH_NLRI
AFI = 1 (IPv4)
Sub-AFI = 1 (NLRI is used for unicast)
Sub-AFI = 2 (NLRI is used for multicast)
Sub-AFI = 3 (NLRI is used for both unicast and multicast)
• Separate BGP tables
MBGP
• MBGP routes are used for the RPF check
• MBGP does NOT propagate any multicast state
• Same rules apply to path selection and validation
BGP attributes (AS-Path, LocalPref, MED, …)
• Recursive RPF lookup is done in the unicast routing table
MBGP
• BGP/MBGP configuration allows you to
define which NLRI types are exchanged (unicast, multicast, both)
set the NLRI type through route-maps (redistribution)
define policies through standard BGP attributes (for unicast and/or multicast NLRI)
• Translation between multicast and unicast NLRIs
MBGP
[Figure: AS 321 and AS 123 connected over 192.168.100.0/24 with one BGP session for unicast and multicast NLRI; a sender, a receiver and an RP in each AS]
BGP: 192.168.100.2 open active, local address 192.168.100.1
BGP: 192.168.100.2 went from Active to OpenSent
BGP: 192.168.100.2 sending OPEN, version 4
BGP: 192.168.100.2 OPEN rcvd, version 4
BGP: 192.168.100.2 rcv OPEN w/ option parameter type: 2, len: 6
BGP: 192.168.100.2 OPEN has CAPABILITY code: 1, length 4
BGP: 192.168.100.2 OPEN has MP_EXT CAP for afi/safi: 1/1
BGP: 192.168.100.2 rcv OPEN w/ option parameter type: 2, len: 6
BGP: 192.168.100.2 OPEN has CAPABILITY code: 1, length 4
BGP: 192.168.100.2 OPEN has MP_EXT CAP for afi/safi: 1/2
BGP: 192.168.100.2 went from OpenSent to OpenConfirm
BGP: 192.168.100.2 went from OpenConfirm to Established
MBGP
[Figure: AS 321 and AS 123 with a single BGP session across loopback interfaces; unicast traffic uses 192.168.100.0/24, multicast traffic from the sender uses 192.168.200.0/24; 192.168.25.0/24 is used for both]
router bgp 321
 network 192.168.100.0 nlri unicast
 network 192.168.200.0 nlri multicast
 network 192.168.25.0 nlri unicast multicast
 neighbor 192.168.1.1 remote-as 123 nlri unicast multicast
 neighbor 192.168.1.1 ebgp-multihop 255
 neighbor 192.168.1.1 update-source Loopback0
 neighbor 192.168.1.1 route-map setNH out
!
route-map setNH permit 10
 match nlri multicast
 set ip next-hop 192.168.200.2
!
route-map setNH permit 15
 match nlri unicast
 set ip next-hop 192.168.100.2
MBGP and non-congruent topologies
• An IP network infrastructure delivering private network services over a public infrastructure
Use a layer 3 backbone
Scalability, easy provisioning
Global as well as non-unique private address space
MPLS-VPN: What is an IP VPN?
• Private trunks over a TELCO/SP shared infrastructure
Leased/Dialup lines
FR/ATM circuits
IP (GRE) tunnelling
• Transparency between provider and customer networks
• Optimal routing requires full mesh over backbone
VPN Models - The Overlay model
• Both provider and customer network use same network protocol
• CE and PE routers have a routing adjacency at each site
• All provider routers hold the full routing information about all customer networks
• Private addresses are not allowed
VPN Models - The Peer model
• Same as Peer model BUT !!!
• Provider Edge routers receive and hold routing information only about directly connected VPNs
• Reduces the amount of routing information a PE router will store
• Routing information is proportional to the number of VPNs a router is attached to
• MPLS is used within the backbone to switch packets (no need for full routing)
VPN Models - MPLS-VPN: The True Peer model
MPLS VPN Connection Model
[Figure: MPLS backbone of P and PE routers with CE routers attached to the PEs; customer sites in VPN_A and VPN_B use prefixes 10.1.0.0, 10.2.0.0, 10.3.0.0, 11.5.0.0 and 11.6.0.0, with 10.1.0.0 and 10.2.0.0 each appearing at two sites; MP-iBGP sessions run between the PE routers]
• P routers (LSRs) are in the core of the MPLS cloud
• PE routers use MPLS with the core and plain IP with CE routers
• P and PE routers share a common IGP
• PE routers are fully meshed with MP-iBGP
[Figure: two PE routers on either side of a VPN backbone IGP of P routers, joined by an MP-iBGP session]
• Multiple routing tables (VRFs) are used on PEs
Each VRF contains customer routes
Customer addresses can overlap
VPNs are isolated
• MP-BGP is used to propagate these addresses between PE routers
MPLS VPN Connection Model
• BGP always propagates ONE route per destination
• What if two customers are using the same address?
BGP will propagate only one route - PROBLEM !!!
• Therefore MP-BGP must distinguish between customer addresses
MPLS VPN Connection Model: Addresses overlap
MPLS VPN Connection Model: Route propagation through MP-BGP
MP-BGP assigns an RD (Route Distinguisher) to each route to make it unique, so that all routes can be propagated
MP-BGP assigns a Route-Target so that remote PEs can insert the route into the corresponding routing table (VRF)
The Route-Target is the colour of the route
VPN-IPv4 update: RD1:Net1, Next-hop=PE-1, SOO=Site1, RT=Yellow, Label=10
VPN-IPv4 update: RD2:Net1, Next-hop=PE-1, SOO=Site1, RT=Green, Label=12
VPN-IPv4 updates are translated into IPv4 addresses and inserted into the VRF corresponding to the RT value
[Figure: PE-1 and PE-2 across a VPN backbone IGP of P routers; CE-1; Site-1 VPN-A, Site-1 VPN-B, Site-2 VPN-A and Site-2 VPN-B, each with an "update for Net1"]
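The role of the RD can be sketched in a few lines of Python. This is an illustration only, not router code: the RD values `100:1` and `100:2`, the `/16` mask and the helper name `build_table` are assumptions made for the example. It shows that a table keyed on the bare prefix keeps one route, while a table keyed on (RD, prefix) keeps both.

```python
# Illustrative sketch only: RD values, the /16 mask and all names are assumptions.

def build_table(updates, key):
    """Build a route table; a later update with the same key overwrites the earlier one."""
    table = {}
    for update in updates:
        table[key(update)] = update
    return table


# Two customers announce the same prefix from different VPNs.
updates = [
    {"rd": "100:1", "prefix": "10.1.0.0/16", "vpn": "VPN_A"},
    {"rd": "100:2", "prefix": "10.1.0.0/16", "vpn": "VPN_B"},
]

# Keyed on the prefix alone: the second route replaces the first.
plain_table = build_table(updates, key=lambda u: u["prefix"])

# Keyed on (RD, prefix): both routes survive.
vpnv4_table = build_table(updates, key=lambda u: (u["rd"], u["prefix"]))

print(len(plain_table), len(vpnv4_table))
```

The only difference between the two tables is the lookup key, which is the point of prepending the RD.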
VPN Connection Model: Route propagation through MP-BGP
When a PE router receives an MP-BGP route, it checks the route-target value
If that value matches the one configured for import into a particular routing table, the route is inserted into it
The label associated with the route is stored and used to send packets towards the destination
VPN-IPv4 update: RD1:Net1, Next-hop=PE-1, SOO=Site1, RT=Yellow, Label=10
VPN-IPv4 update: RD2:Net1, Next-hop=PE-1, SOO=Site1, RT=Green, Label=12
VPN-IPv4 updates are translated into IPv4 addresses and inserted into the VRF corresponding to the RT value
[Figure: PE-1 and PE-2 across a VPN backbone IGP of P routers; CE-1; Site-1 VPN-A, Site-1 VPN-B, Site-2 VPN-A and Site-2 VPN-B, each with an "update for Net1"]
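The route-target check described above might be modelled as follows. This is a hedged sketch, not the IOS implementation: the `Vrf` class, the `import_update` helper, the pairing of Yellow with VPN-A and Green with VPN-B, and the extra "Red" update are all assumptions. The two Net1 updates reuse the values shown on the slide.

```python
# Illustrative sketch only: class, helper and the RT-to-VRF pairing are assumptions.
from dataclasses import dataclass, field


@dataclass
class Vrf:
    name: str
    import_rts: set                              # route-targets this VRF imports
    routes: dict = field(default_factory=dict)   # net -> (next_hop, label)


def import_update(vrfs, update):
    """Install the update in every VRF that imports one of its RTs.

    Returns the names of the VRFs that accepted the route; an empty list
    means the receiving PE has no use for the update and can discard it.
    """
    installed = []
    for vrf in vrfs:
        if vrf.import_rts & update["rts"]:
            # Store the next hop and the VPN label for later forwarding.
            vrf.routes[update["net"]] = (update["next_hop"], update["label"])
            installed.append(vrf.name)
    return installed


vrf_a = Vrf("VPN-A", {"Yellow"})
vrf_b = Vrf("VPN-B", {"Green"})
pe2_vrfs = [vrf_a, vrf_b]

yellow = {"rd": "RD1", "net": "Net1", "next_hop": "PE-1", "rts": {"Yellow"}, "label": 10}
green = {"rd": "RD2", "net": "Net1", "next_hop": "PE-1", "rts": {"Green"}, "label": 12}
red = {"rd": "RD3", "net": "Net9", "next_hop": "PE-1", "rts": {"Red"}, "label": 99}

results = [import_update(pe2_vrfs, u) for u in (yellow, green, red)]
print(results)
```

Both VRFs end up with a route for Net1, each with its own label, and the update carrying an RT that no VRF imports is not stored anywhere.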
MPLS VPN Connection Model: MP-BGP Update
• VPN-IPv4 address
Route Distinguisher (64 bits)
Makes the IPv4 route globally unique
RD is configured in the PE for each VRF
IPv4 address (32 bits)
• Extended Community attribute (64 bits)
Site of Origin (SOO): identifies the originating site
Route-target (RT): identifies the set of sites the route has to be advertised to
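As a rough illustration of the sizes above, the following Python sketch packs a 64-bit RD and a 32-bit IPv4 address into a 96-bit VPN-IPv4 address. The RD layout used here (a 2-byte type field of 0, a 2-byte AS number, a 4-byte assigned number) is an assumption taken from the BGP/MPLS VPN specification rather than from these slides, and the function name and sample values are hypothetical.

```python
# Illustrative sketch only: RD layout (type 0) and sample values are assumptions.
import ipaddress
import struct


def vpn_ipv4(asn, assigned, ipv4):
    """Return the 12-byte VPN-IPv4 address: 8-byte RD followed by 4-byte IPv4."""
    # Type 0 RD: 2-byte type, 2-byte AS number, 4-byte assigned number.
    rd = struct.pack("!HHI", 0, asn, assigned)
    return rd + ipaddress.IPv4Address(ipv4).packed


addr_a = vpn_ipv4(100, 1, "10.1.0.0")
addr_b = vpn_ipv4(100, 2, "10.1.0.0")

print(addr_a.hex())
print(addr_b.hex())
```

The last four bytes of both results are identical; only the RD part differs, which is what lets BGP carry the two routes side by side.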
MPLS VPN Connection Model: MP-BGP Update
Any other standard BGP attribute
Local Preference, MED, Next-hop, AS_PATH, Standard Community, ...
A Label identifying:
The outgoing interface
The VRF where a lookup has to be done (aggregate label)
The BGP label will be the second label in the label stack of packets travelling in the core
• Existing BGP techniques can be used to scale the route distribution: route reflectors
• Each edge router needs only the information for the VPNs it supports
Directly connected VPNs
• RRs are used to distribute VPN routing information
Scaling
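To put a number on why route reflectors help, the sketch below compares iBGP session counts. It assumes, for illustration only, that every PE peers with every RR and that the RRs are fully meshed among themselves; the function names and the figures of 100 PEs and 2 RRs are hypothetical.

```python
# Illustrative sketch only: the peering layout and the sample sizes are assumptions.

def full_mesh_sessions(n_pe):
    """iBGP sessions needed when every PE peers with every other PE."""
    return n_pe * (n_pe - 1) // 2


def rr_sessions(n_pe, n_rr):
    """Sessions when each PE peers with every RR and the RRs are fully meshed."""
    return n_pe * n_rr + n_rr * (n_rr - 1) // 2


mesh = full_mesh_sessions(100)
reflected = rr_sessions(100, 2)
print(mesh, reflected)
```

Under these assumptions the full mesh grows with the square of the number of PEs, while the reflected design grows linearly.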
Scaling
• Very highly scalable:
Initial VPN release: 1000 VPNs x 1000 sites/VPN = 1,000,000 sites
Architecture supports 100,000+ VPNs, 10,000,000+ sites
BGP “segmentation” through RRs is essential !!!!
• Easy to add new sites
• configure the site on the PE connected to it
• the network automagically does the rest
[Figure: the same MPLS VPN topology (VPN_A and VPN_B sites with prefixes 10.1.0.0, 10.2.0.0, 10.3.0.0, 11.5.0.0 and 11.6.0.0; CE, PE1, PE2 and P routers) with two Route Reflectors (RR) added]
• Route Reflectors may be partitioned
Each RR stores routes for a set of VPNs
• Thus, no BGP router needs to store information for ALL VPNs
• PEs peer with RRs according to the VPNs they directly connect to
MPLS-VPN: Scaling BGP
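The partitioning rule can be expressed as a small lookup. This Python sketch is an assumption-laden illustration: the split of VPN_A to RR1 and VPN_B to RR2, the VPNs attached to PE1 and PE2, and the helper name `rrs_for_pe` are all invented for the example.

```python
# Illustrative sketch only: the RR partitioning and PE attachments are assumptions.

def rrs_for_pe(rr_vpns, pe_vpns):
    """Return the RRs a PE should peer with: those serving at least one of its VPNs."""
    return sorted(rr for rr, vpns in rr_vpns.items() if vpns & pe_vpns)


# Each RR stores routes for its own set of VPNs.
rr_vpns = {"RR1": {"VPN_A"}, "RR2": {"VPN_B"}}

pe1_peers = rrs_for_pe(rr_vpns, {"VPN_A"})
pe2_peers = rrs_for_pe(rr_vpns, {"VPN_A", "VPN_B"})
print(pe1_peers, pe2_peers)
```

A PE attached to a single VPN needs only the RR serving that VPN, so neither the PE nor any one RR has to hold every VPN's routes.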
An iBGP full mesh between PEs results in flooding all VPN routes to all PEs
This causes scaling problems when there is a large number of routes. In addition, PEs need only the routes for their attached VRFs
Therefore each PE will discard any VPN-IPv4 route that does not carry a route-target configured for import into any of the attached VRFs
This significantly reduces the amount of information each PE has to store
The volume of the BGP table is equivalent to the volume of the attached VRFs (nothing more)
MPLS-VPN Scaling: BGP updates filtering
Conclusion
Summary
• BGP represents a viable solution today for Service Providers to:
Offer new world IP-VPN services
Interconnect transit and non-transit ASes to the Internet
• And for Enterprise customers to:
Scale big networks and dual-home their AS
Thanks to
• Stefano Previdi for his slides!!!
• You for your attention!!!!!