Post on 23-Dec-2015
Traffic Matrix Estimation for Traffic Engineering
Mehmet Umut Demircin
Traffic Engineering (TE)
Tasks:
Load balancing
Routing protocol configuration
Dimensioning
Provisioning
Failover strategies
Particular TE Problem
Optimizing routes in a backbone network to avoid congestion and failures.
Minimize the maximum link utilization.
MPLS (Multi-Protocol Label Switching): linear programming solution to a multi-commodity flow problem.
Traditional shortest-path routing (OSPF, IS-IS): compute a set of link weights that minimizes congestion.
Traffic Matrix (TM)
A traffic matrix provides, for every ingress point i into the network and every egress point j out of the network, the volume of traffic Ti,j from i to j over a given time interval.
TE utilizes traffic matrices in diagnosis and management of network congestion.
Traffic matrices are critical inputs to network design, capacity planning and business planning.
Traffic Matrix (cont’d)
Ingress and egress points can be routers or PoPs.
Determining the Traffic Matrix
Direct measurement: the TM is computed directly by collecting flow-level measurements at ingress points.
Additional infrastructure needed at routers. (Expensive!)
May reduce forwarding performance at routers.
Terabytes of data per day.
Solution = Estimation
TM Estimation
Available information:
Link counts from SNMP data.
Routing information (link weights).
Additional topological information (peerings, access links).
Assumptions on the distribution of demands.
Traffic Matrix Estimation: Existing Techniques and New Directions
A. Medina, N. Taft, K. Salamatian, S. Bhattacharyya, C. Diot
SIGCOMM 2002
Three Existing Techniques
Linear Programming (LP) approach. O. Goldschmidt, ISMA Workshop, 2000.
Bayesian estimation. C. Tebaldi, M. West, J. of the American Statistical Association, June 1998.
Expectation Maximization (EM) approach. J. Cao, D. Davis, S. Vander Wiel, B. Yu, J. of the American Statistical Association, 2000.
Terminology
c = n(n-1) origin-destination (OD) pairs.
X: traffic matrix, written as a vector (X_j is the data transmitted by OD pair j).
Y = (y_1, y_2, …, y_r): vector of link counts.
A: r-by-c routing matrix (a_ij = 1 if link i belongs to the path associated with OD pair j, 0 otherwise).
Y = AX
r << c => infinitely many solutions!
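The effect of r << c can be seen on a toy case. The sketch below is illustrative only: the 2-link, 3-OD-pair routing matrix and the demand values are made up, and `link_counts` is a hypothetical helper name, not something from the papers discussed here.

```python
def link_counts(A, X):
    """Return Y = A X for a routing matrix A (list of rows) and demand vector X."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]


# Hypothetical toy case: r = 2 links, c = 3 OD pairs.
# a_ij = 1 when link i lies on the path of OD pair j.
A = [
    [1, 1, 0],  # link 1 carries OD pairs 1 and 2
    [0, 1, 1],  # link 2 carries OD pairs 2 and 3
]

X_true = [5, 3, 2]   # one possible set of OD demands
X_other = [8, 0, 5]  # a different set of OD demands

Y_true = link_counts(A, X_true)
Y_other = link_counts(A, X_other)

# Both demand vectors produce identical link counts, so the link counts
# alone cannot tell them apart.
print(Y_true, Y_other)
```

Any estimation technique therefore has to bring in extra information (a prior, a distributional assumption, or an objective function) to pick one solution.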
Linear Programming
Objective:
Constraints:
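A sketch of one way such an LP can be written in the terminology above, offered as an assumption rather than as the exact formulation on this slide. The weights w_j are a hypothetical choice of objective coefficients and are not defined anywhere in these slides; flow-conservation constraints may be added as well.

```latex
\begin{align*}
\text{maximize}   \quad & \sum_{j=1}^{c} w_j X_j \\
\text{subject to} \quad & \sum_{j=1}^{c} a_{ij} X_j \le y_i, \qquad i = 1, \dots, r, \\
                        & X_j \ge 0, \qquad j = 1, \dots, c.
\end{align*}
```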
Statistical Approaches
Bayesian Approach
Assumes the X_j are independent and Poisson distributed with mean λ_j.
Λ = (λ_1, …, λ_c) needs to be estimated. (A prior is needed.)
Conditioning on link counts: P(X,Λ|Y)
Uses Markov Chain Monte Carlo (MCMC) simulation method to get posterior distributions.
Ultimate goal: compute P(X|Y)
Expectation Maximization (EM)
Assumes the X_j are independent Gaussian random variables.
Y=AX implies:
Requires a prior for initialization.
Incorporates multiple sets of link measurements.
Uses the EM algorithm to compute the MLE.
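The consequence of the Gaussian assumption can be written out with a standard property of linear transformations. The symbols λ (mean vector of X) and Σ (its covariance, diagonal because the X_j are independent) are introduced here for illustration and are assumptions about notation.

```latex
X \sim \mathcal{N}(\lambda, \Sigma)
\quad\Longrightarrow\quad
Y = AX \sim \mathcal{N}\!\left(A\lambda,\; A \Sigma A^{\mathsf{T}}\right)
```

The link counts are thus Gaussian as well, and their likelihood depends only on the parameters of X, which is what the EM iterations maximize.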
Comparison of Methodologies
Considers PoP-to-PoP traffic demands.
Two different topologies (4-node, 14-node).
Synthetic TMs (constant, Poisson, Gaussian, uniform, bimodal).
Comparison criteria:
Estimation errors yielded.
Sensitivity to prior.
Sensitivity to distribution assumptions.
4-node topology
4-node topology results
14-node topology
14-node topology results
Marginal Gains of Known Rows
New Directions
Lessons learned:
Model assumptions do not reflect the true nature of traffic (multimodal behavior).
Dependence on priors.
Link counts are not sufficient. (Generally, more data is available to network operators.)
Proposed solutions:
Use choice models to incorporate additional information.
Generate a good prior solution.
New statement of the problem
X_ij = O_i · α_ij
O_i: outflow from node (PoP) i.
α_ij: fraction of O_i going to PoP j.
Equivalent problem: estimating α_ij.
Solution via Discrete Choice Models (DCM).
User choices.
ISP choices.
Choice Models
Decision makers: PoPs.
Set of alternatives: egress PoPs.
Attributes of decision makers and alternatives: attractiveness (capacity, number of attached customers, peering links).
Utility maximization with random utility models.
Random Utility Model
U_ij = V_ij + ε_ij: utility of PoP i choosing to send a packet to PoP j.
Choice problem:
Deterministic component:
Random component: mlogit model used.
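Under a multinomial logit model, the probability of choosing alternative j is the softmax of the deterministic utilities. The sketch below uses that standard result to turn utilities into fractions α_ij and then into demands X_ij = O_i · α_ij. The function names and the utility values are hypothetical, and how V_ij is built from PoP attributes is left out.

```python
import math


def mlogit_fractions(utilities):
    """Map deterministic utilities V_ij (keyed by egress PoP j) to fractions alpha_ij.

    Uses the multinomial logit choice probability exp(V_ij) / sum_k exp(V_ik).
    """
    v_max = max(utilities.values())  # subtract the max for numerical stability
    weights = {j: math.exp(v - v_max) for j, v in utilities.items()}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def od_demands(outflow, utilities):
    """Return X_ij = O_i * alpha_ij for one ingress PoP i."""
    return {j: outflow * a for j, a in mlogit_fractions(utilities).items()}


# Hypothetical utilities of three egress PoPs as seen from one ingress PoP.
V_i = {"B": 1.0, "C": 0.0, "D": 0.0}
alpha_i = mlogit_fractions(V_i)
X_i = od_demands(100.0, V_i)
print(alpha_i)
print(X_i)
```

The fractions sum to one by construction, so the estimated demands out of PoP i always add up to its measured outflow O_i.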
Results
Two different models (Model 1: attractiveness; Model 2: attractiveness + repulsion).
Fast Accurate Computation of Large-Scale IP Traffic Matrices from Link Loads
Y. Zhang, M. Roughan, N. Duffield, A. Greenberg
Sigmetrics 2003
Highlights
Router to router traffic matrix is computed instead of PoP to PoP.
Performance evaluation with real traffic matrices.
Tomogravity method (Gravity + Tomography)
Tomogravity
Two-step modeling.
Gravity model: initial solution obtained using edge link load data and ISP routing policy.
Tomographic estimation: the initial solution is refined by applying quadratic programming to minimize the distance to the initial solution subject to tomographic constraints (link counts).
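The refinement step can be sketched as a small optimization problem. Here t is the traffic matrix written as a vector, x the vector of link counts, and A the routing matrix; t_g is a name assumed here for the gravity-model solution. The plain Euclidean distance is also an assumption, since a weighted norm could be used instead.

```latex
\begin{align*}
\min_{t} \quad & \lVert t - t_g \rVert_2^2 \\
\text{subject to} \quad & A t = x
\end{align*}
```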
Gravity Modeling
General formula:
Simple gravity model: Try to estimate the amount of traffic between edge links.
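A minimal sketch of a simple gravity estimate, assuming the common form in which traffic from edge link i to edge link j is proportional to the traffic entering at i times the traffic leaving at j. The function name and the link loads are hypothetical.

```python
def simple_gravity(traffic_in, traffic_out):
    """Estimate T(i, j) = T_in(i) * T_out(j) / sum_k T_out(k) for all edge link pairs.

    traffic_in:  dict mapping edge link -> traffic entering the network there
    traffic_out: dict mapping edge link -> traffic leaving the network there
    """
    total_out = sum(traffic_out.values())
    if total_out == 0:
        raise ValueError("total outgoing traffic must be positive")
    return {
        (i, j): traffic_in[i] * traffic_out[j] / total_out
        for i in traffic_in
        for j in traffic_out
    }


# Hypothetical edge link loads (same units, same measurement interval).
t_in = {"a": 60, "b": 40}
t_out = {"a": 30, "b": 70}
estimate = simple_gravity(t_in, t_out)
print(estimate)
```

Each row of the estimate sums to the measured incoming traffic of that link, so the result already matches the edge measurements before any tomographic refinement.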
Generalized Gravity Model
Four traffic categories: transit, outbound, inbound, internal.
Peers: P1, P2, …
Access links: a1, a2, …
Peering links: p1, p2, …
Tomography
Solution should be consistent with the link counts.
Reducing the computational complexity:
Hundreds of backbone routers, tens of thousands of unknowns.
Observations:
Some elements of the BR-to-BR matrix are empty. (Multiple BRs in each PoP, shortest paths)
Topological equivalence. (Reduces the number of IGP simulations)
Quadratic Programming
Problem Definition:
Use SVD to solve the inverse problem.
Use Iterative Proportional Fitting (IPF) to ensure non-negativity.
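IPF only ever multiplies entries by non-negative scale factors, which is why it preserves non-negativity. The sketch below shows the classic two-marginal version (fitting row and column sums), not the link-constraint version used on backbone data. The function name, seed matrix, and targets are hypothetical.

```python
def ipf(matrix, row_targets, col_targets, iterations=100, tol=1e-9):
    """Scale a non-negative matrix until its row and column sums match the targets.

    Assumes sum(row_targets) == sum(col_targets); otherwise no exact fit exists.
    """
    m = [row[:] for row in matrix]  # work on a copy
    for _ in range(iterations):
        # Row step: rescale each row to its target sum.
        for i, target in enumerate(row_targets):
            s = sum(m[i])
            if s > 0:
                m[i] = [v * target / s for v in m[i]]
        # Column step: rescale each column to its target sum.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in m)
            if s > 0:
                for row in m:
                    row[j] *= target / s
        # Columns are exact after the column step, so only rows need checking.
        if all(abs(sum(row) - t) <= tol for row, t in zip(m, row_targets)):
            break
    return m


seed = [[1.0, 1.0], [1.0, 1.0]]  # hypothetical non-negative starting point
fitted = ipf(seed, row_targets=[60.0, 40.0], col_targets=[30.0, 70.0])
print(fitted)
```

Entries that start at zero stay at zero, so known-empty elements of the matrix are preserved by the fitting.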
Evaluation of Gravity Models
Performance of proposed algorithm
Comparison
Robustness
Measurement errors
x=At+ε
ε=x*N(0,σ)
Questions?