Cheng Tien Ee, Byung-Gon Chun, Vijay Ramachandran, Kaushik Lakshminarayanan,
Scott Shenker
UC Berkeley
ACM SIGCOMM 2007
Presenter: Te-Yuan Huang
Resolving Inter-Domain Policy Disputes
Outline
  BGP Brief Review
  BGP Oscillation
  Goal of this work
  Approach – Global Preference Metrics
  Practical Issues
  Results
  Conclusion
BGP Brief Review
  Border Gateway Protocol (BGP)
    External BGP (eBGP): routing protocol between ASes (inter-AS)
    Internal BGP (iBGP): routing protocol within an AS (intra-AS)
      Shortest iBGP distance to egress point
  Policy-based
    Expressiveness
    Autonomy
Internet Routing is Policy-Based
[Figure: example AS-level topology with Comcast, Abilene, AT&T, Cogent, and MIT]
Expressiveness: BGP can realize ASes' objectives
Autonomy: ASes can configure policies independently
Expressiveness vs. autonomy
Copyright ® Weifeng Chen
Expressiveness: Ranking and Filtering
  Ranking: route selection
  Filtering: route advertisement
[Figure: example routes labeled Customer, Competitor, Primary, and Backup]
Ranking: controls traffic out of the network (traffic engineering)
Filtering: controls traffic into the network (business contracts)
Must have autonomy!
BGP Brief Review – Cont.
  Routing advertisement
    Only exchanged with direct neighbors
  BGP attributes
    Local preference
    Multi-exit discriminator (MED)
BGP Brief Review – Cont.
  BGP Attributes – Local Preference
    Set according to policy decision
[Figure: the advertisement for 172.16.1.0/24 arrives over two routes. The route A to 200 has Local Pref = 1 and the route B to 200 has Local Pref = 0. One of them is marked as the preferred route.]
Higher value is preferred
BGP Brief Review – Cont.
  BGP Attributes – Multi-exit discriminator (MED)
    Recommendation from neighbors
[Figure: the advertisement for 172.16.1.0/24 arrives with MED = 5 on one link and with MED = 1 on the other. The preferred route is marked.]
    Lower value is preferred
    Note: some providers may not listen to MEDs
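A minimal Python sketch of how the two attributes enter route selection, assuming standard BGP semantics: highest LOCAL_PREF first, then shortest AS path, then lowest MED. The Route class, its field names, and the AS-path lengths are illustrative assumptions, not taken from the slides, and the remaining tie-breakers of the real decision process are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str      # hypothetical neighbor label
    local_pref: int    # set by local policy
    as_path_len: int   # assumed value, only for illustration
    med: int           # suggested by the neighbor


def best_route(routes):
    """Pick one route: highest LOCAL_PREF, then shortest AS path, then lowest MED."""
    return min(routes, key=lambda r: (-r.local_pref, r.as_path_len, r.med))


routes = [
    Route("172.16.1.0/24", "A", local_pref=1, as_path_len=2, med=5),
    Route("172.16.1.0/24", "B", local_pref=0, as_path_len=2, med=1),
]
chosen = best_route(routes)
print(chosen.next_hop)
```

Because LOCAL_PREF is compared before MED, the local policy decision overrides the neighbor's recommendation whenever the two disagree.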
Problem of BGP - Oscillation
  Root of the problem: policy & autonomy!
    ISPs administer their networks independently
      No global view of all ASes
      No global control
    ISPs' policies reflect business relations, and thus are not revealed
  Conflicts in policies can cause persistent route oscillations
    Hard to detect and to resolve
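BGP Oscillation - Example
[Figure: nodes A, B, and C, each connected to destination D. Ranking labels next to the nodes: (CD)(D), (AD)(D), (BD)(D). In this first stage the nodes use the direct routes AD, BD, and CD.]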
BGP Oscillation - Example (Cont.)
[Figure: same topology and ranking labels. The nodes have switched to the routes labeled BCD, CAD, and ABD.]
Route BD is gone!
Route AD is gone!
Route CD is gone!
Switch back to the first stage
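The two stages of this example can be reproduced with a small Python sketch. It assumes the ranking orientation used by the later "Precedence in Action" slides (B prefers the route through A over its direct route, and likewise C through B and A through C), and it assumes that all nodes re-select at the same instant; real BGP is asynchronous. The names RANKING, available, step, and run are illustrative.

```python
# Permitted paths per node, most preferred first (orientation is an assumption).
RANKING = {
    "A": [("A", "C", "D"), ("A", "D")],
    "B": [("B", "A", "D"), ("B", "D")],
    "C": [("C", "B", "D"), ("C", "D")],
}


def available(path, current):
    """A direct path is always usable; a longer one needs the next hop to use its tail."""
    if len(path) == 2:
        return True
    return current[path[1]] == path[1:]


def step(current):
    """Every node simultaneously picks its best path given the neighbors' current choices."""
    return {
        node: next(p for p in ranking if available(p, current))
        for node, ranking in RANKING.items()
    }


def run(rounds):
    state = {node: (node, "D") for node in RANKING}  # first stage: direct routes
    history = [state]
    for _ in range(rounds):
        state = step(state)
        history.append(state)
    return history


for state in run(4):
    print(state)
```

Under these assumptions the state alternates between "all direct" and "all two-hop" forever: as soon as every node moves to its two-hop route, the direct routes those two-hop routes depended on are gone.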
Related Works – Good News
  Hierarchical business structure results in convergence [Gao, Rexford 2001]
    Relationships are either peer-to-peer, customer-provider, or backup
    Valley-free characteristics
  BUT...
    Violations of these assumptions exist
      Complex agreements
      Business mergers
      Misconfigurations
    Still difficult to verify stability [Griffin 2002]
    Prefer to separate economics and network convergence
Related Works – Bad News
  Theorem [FJB ’05]:
    If ASes keep the freedom to filter routes arbitrarily,
    only shortest-path routing would guarantee convergence
  Moreover, ISPs would not use shortest-path routing
    Policy autonomy is much more important for ISPs
What should we do, then?
  Rely on natural business arrangements
    And hope they provide a stable hierarchy?
  Eliminate policy autonomy
    Because of its potential to induce route oscillations?
  NEITHER of them sounds FEASIBLE
Goal
  A simple extension to BGP
    Only activated when a persistent oscillation is detected
  Desired properties:
    Do not reveal any ISP policies
    Detect and resolve oscillation locally in real time
    If there is no oscillation, let BGP keep running unchanged
    Distinguish transient oscillation from persistent oscillation
    Don't unnecessarily dismiss routes
One More Related Works – Dispute Wheel [Griffin, 2002]
  Any policy-induced oscillation can be characterized by a dispute wheel
    No dispute wheels → no route oscillation
  A set of pivot nodes
  At each pivot p_i:
    A spoke path Q_i from p_i to the destination d
    A rim path R_(i+1) from p_i to the next pivot p_(i+1)
    p_i prefers the path p_i R_(i+1) p_(i+1) Q_(i+1) d over p_i Q_i d
One More Related Works – Dispute Wheel (Cont.)
Approach
Detect oscillation &
Resolve it
Oscillation Revisit
[Figure: nodes 1, 2, and 3, each connected to destination 0. Route rankings, most preferred first:]
  Node 1: 1 3 0, then 1 0
  Node 3: 3 2 0, then 3 0
  Node 2: 2 1 0, then 2 0
Varadhan, Govindan, & Estrin, “Persistent Route Oscillations in Interdomain Routing”, 1996
Griffin, Shepherd, & Wilfong, “The Stable Paths Problem and Interdomain Routing”, ToN, 2002
Dispute wheel: global, cyclic relationship among rankings
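The dispute-wheel condition can be checked mechanically for this three-node example. The Python sketch below assumes the rankings are read as "most preferred first" and uses strict preference, as on the dispute-wheel slide; the function name and the dictionary layout are illustrative, not taken from the paper.

```python
def is_dispute_wheel(pivots, spoke, rim, ranking):
    """Return True if every pivot ranks 'rim path + next pivot's spoke' above its own spoke.

    pivots  : pivot nodes in wheel order
    spoke   : pivot -> its spoke path to the destination
    rim     : pivot -> rim path from this pivot to the next pivot
    ranking : pivot -> list of permitted paths, most preferred first
    """
    for i, p in enumerate(pivots):
        nxt = pivots[(i + 1) % len(pivots)]
        # The rim path ends at the next pivot, so drop that node from the spoke.
        via_rim = rim[p] + spoke[nxt][1:]
        ranked = ranking[p]
        if via_rim not in ranked or spoke[p] not in ranked:
            return False
        if ranked.index(via_rim) >= ranked.index(spoke[p]):
            return False
    return True


pivots = [1, 3, 2]
spoke = {1: (1, 0), 3: (3, 0), 2: (2, 0)}
rim = {1: (1, 3), 3: (3, 2), 2: (2, 1)}
ranking = {
    1: [(1, 3, 0), (1, 0)],
    3: [(3, 2, 0), (3, 0)],
    2: [(2, 1, 0), (2, 0)],
}
print(is_dispute_wheel(pivots, spoke, rim, ranking))
```

Such a check needs every node's ranking in one place, which is exactly the global view that no single AS has; the approach on the following slides avoids it.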
High-level Ideas
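[Figure: nodes 1, 2, and 3, each connected to destination 0. Node 1 prefers 1 3 0 over 1 0, node 3 prefers 3 2 0 over 3 0, and node 2 prefers 2 1 0 over 2 0.]
Node 2 locally suspects that it may be involved in an oscillation
  It expresses its suspicion to its neighbors
  If the neighbors also express the same concern,
  the presence of a dispute wheel is confirmed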
Precedence Metric
  The tool for ASes to express their concern
    Detect dispute wheel
    Resolve dispute wheel
  A history of observed route advertisements
  Maintain a global precedence value
    Global: shared by more than one AS, but not by all ASes
Precedence Metric – Cont.
  The precedence metric consists of global and local values

  AS PATH   Global Precedence   Local Precedence
            (Incoming Ad)       (Policy Decision)
  P0        2                   0
  P1        1                   1
  P2        0                   1

  Local precedence: rank based on the BGP decision process
  First, look at the global value (lowest wins)
    Choose route P2
    Advertise P2 with 0 (global) + 1 (local) = 1
Precedence Metric – Cont.
  The precedence metric consists of global and local values

  AS PATH   Global Precedence   Local Precedence
            (Incoming Ad)       (Policy Decision)
  P0        0                   0
  P1        0                   1
  P2        0                   1

  Local precedence: rank based on the BGP decision process
  Since the global values are all zero
    Look at the local value instead
    Exactly the same as BGP
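The selection rule shown in the two tables can be written as a lexicographic comparison. This is a minimal Python sketch, assuming lower values win for both the global and the local precedence; the function name select and the dictionary layout are illustrative.

```python
def select(table):
    """Pick a route by lowest global value, then lowest local value.

    table maps an AS path label to (global_precedence, local_precedence).
    Returns the chosen path and the value to advertise (global + local).
    """
    path = min(table, key=lambda p: table[p])  # tuples compare global first
    global_value, local_value = table[path]
    return path, global_value + local_value


with_dispute = {"P0": (2, 0), "P1": (1, 1), "P2": (0, 1)}
no_dispute = {"P0": (0, 0), "P1": (0, 1), "P2": (0, 1)}

print(select(with_dispute))  # ('P2', 1)
print(select(no_dispute))    # ('P0', 0)
```

When every global value is zero the comparison falls through to the local rank, so the choice is the one plain BGP would make.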
Precedence in Action
[Figure: nodes A, B, and C, each connected to destination D. D advertises [D:0] to each node; the nodes advertise [AD:0], [BD:0], and [CD:0].]

  B's History Table
  AS PATH   Global Value   Local Value
  AD        0              0
  D         0              1
Precedence in Action – Cont.
[Figure: same topology. Advertisements shown: [ACD:0], [CBD:0], [BAD:0] and [AD:1], [BD:1], [CD:1].]

  B's History Table
  AS PATH   Global Value   Local Value
  AD        0 → 1          0
  D         0              1

No more oscillation
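The walkthrough can be replayed with a simplified Python sketch. It is a reading of the slides under stated assumptions, not the paper's algorithm: all nodes update at the same instant, the local value is a route's rank among the routes remembered in the history table (usable or not), the advertised value is global + local, and a node keeps the last global value it saw for a route that has become unavailable. PREFS and simulate are illustrative names.

```python
# Neighbor paths each node accepts, most preferred first (assumed orientation).
PREFS = {
    "A": [("C", "D"), ("D",)],
    "B": [("A", "D"), ("D",)],
    "C": [("B", "D"), ("D",)],
}


def simulate(rounds, keep_history=True):
    adverts = {"D": (("D",), 0)}        # node -> (advertised path, global value)
    history = {n: {} for n in PREFS}    # node -> {neighbor path: last global value}
    trace = []
    for _ in range(rounds):
        new_adverts = {"D": (("D",), 0)}
        for node, ranking in PREFS.items():
            if not keep_history:
                history[node] = {}      # forget routes that are no longer advertised
            usable = set()
            # Every node is adjacent to every other node in this example.
            for path, value in adverts.values():
                if path in ranking:
                    history[node][path] = value
                    usable.add(path)
            # Local value: rank among all remembered routes, usable or not.
            known = [p for p in ranking if p in history[node]]
            local = {p: i for i, p in enumerate(known)}
            best = min(usable, key=lambda p: (history[node][p], local[p]))
            new_adverts[node] = ((node,) + best, history[node][best] + local[best])
        adverts = new_adverts
        trace.append({n: adverts[n] for n in PREFS})
    return trace


for state in simulate(5):
    print(state)
```

Under these assumptions the nodes settle on their direct routes advertised with value 1 after the two-hop routes have been tried once, while `simulate(5, keep_history=False)` keeps alternating. This matches the role the slides give to remembering routes encountered during the oscillation.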
Precedence in Action – Cont.
  On the other hand, if no dispute exists
    Nodes pick their most preferred route
    Global precedence values advertised are all 0
Properties of Precedence Metric
  Knowledge of routes encountered during oscillations + precedence metric
    → No further policy-induced oscillations can occur
  After convergence,
    if a non-zero global value exists → a dispute wheel existed
  Only the global value is advertised
    No policy is revealed
  The properties are proved in the paper
Practical Issues
  1. Knowledge of routes encountered during oscillation
       Which routes to store
       For how long
  2. Distinguish transient and persistent oscillation
       Transient oscillation
         Often occurs after an outage
         Lasts several seconds
  3. Minimize the troubleshooting information
       Do not reveal ISP policies
       Still enough for troubleshooting
Which Routes and How Long?
  Which routes to store for dispute detection?
    More preferred, unavailable routes
    Used to confirm disputes resulting in oscillation
  How long should routes be stored?
    Long enough to distinguish transient from persistent oscillation
    Upper bound
      Time for a route change to propagate around the dispute wheel
      Nd * MRAI (Minimum Route Advertisement Interval)
        Nd: number of nodes around the wheel (rim nodes included)
        The node counter is sent whenever the global value is incremented
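As a worked example of the upper bound, a short Python sketch: it assumes MRAI = 30 seconds, the value used in the evaluation slides, and a hypothetical wheel of three nodes. The function name is illustrative.

```python
def history_timeout(num_wheel_nodes, mrai_seconds=30.0):
    """Upper bound Nd * MRAI on how long an unavailable route is kept."""
    return num_wheel_nodes * mrai_seconds


print(history_timeout(3))  # 90.0 seconds for a three-node wheel
```

The bound grows linearly with the wheel size, which is why the node counter has to travel with the precedence value: a node cannot know Nd from local information alone.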
Transient Oscillation
  Transient oscillations: those that disappear without the use of our solution
  Reset the global value when policy is changed
    Ex: link breakage
  Do not permanently suppress any route
[Figure: routes R1 and R2 with the corresponding routes AR1 and AR2]
  AR1 is eventually evicted by timeout
Troubleshooting Information
  Transmitted upstream as <Route Identifier, AS, Seq#>
  Only for troubleshooting
    The precedence mechanism can run without this information
Small Summary of Achieved Goals
  Able to handle transient and persistent oscillations
  Minimal revelation of policies
    All information exchanged is shared through BGP
  No requirement for global knowledge
    Route selection is still made locally
Changes to Routers
  History table
    Global precedence value
    Feasible or not
  Router counter
  Adaptive convergence window
    The amount of time to keep infeasible routes
Evaluation
  Simulation
    Event-based, packet-level, asynchronous simulator
  Parameters:
    Batched route advertisement, MRAI = 30 seconds
    Delay jitter
      Randomly selected in [0, 1] second
    Link propagation delay: 10 ms
  Two types of graphs
    Simple graph
    AS-level network topology dumped from RouteViews
RouteViews Graph
  Retrieved route dump on 3rd Jan, 2007
    24307 ASes, 56914 inter-AS links
  However, AS policy is not included
    And it is also impossible to retrieve
  The authors build their own routing policies
    Restricted to next-hop preference
    Fine-tune the route inflation
      Depth limit + BFS
      Route inflation = actual length of AS path / shortest AS path
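The route-inflation ratio can be computed with a breadth-first search for the shortest AS path. This is a minimal Python sketch on a made-up four-AS graph; the graph, the function names, and the choice to measure path length in hops are assumptions for illustration, and the source and destination are assumed to be distinct and connected.

```python
from collections import deque


def shortest_hops(graph, src, dst):
    """Hop count of the shortest path from src to dst, found by BFS."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return None  # dst is unreachable


def route_inflation(graph, as_path):
    """Actual AS-path length divided by the shortest AS-path length (in hops)."""
    shortest = shortest_hops(graph, as_path[0], as_path[-1])
    return (len(as_path) - 1) / shortest


# Hypothetical AS-level adjacency list.
graph = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
print(route_inflation(graph, [1, 2, 3, 4]))  # 1.5
```

A ratio of 1.0 means the policy route is as short as the shortest path; larger values mean policy forced a detour.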
Evaluation Result
  Metrics
    Convergence time
      A network has converged when the routing tables of all nodes no longer change
    Memory requirement
Evaluation Result – Cont.
  Convergence time
    Compared to shortest path
      Half of the nodes take 20% more time to converge
    Insignificant difference with/without dispute
Handling Misbehaving ASes
  Misbehaving AS:
    Does not select the route with the lowest value
    Does not properly update the global value
  Misbehavior detection
    Requires a monitor to keep track of ASes
    Beyond the scope of this work
Future Work
Conclusion
  Precedence solution
    Operates only when oscillation exists
    No additional ISP information is revealed
    No global view of the Internet is needed
    Routes are not dismissed due to transient oscillations
Like vs. Dislike
  Like
    Provides a simple idea to solve a really difficult problem
    Considers practical issues as well
  Dislike
    All routers need to be changed
    Cannot detect and resolve misbehavior