
Traffic management

2007

An example
Executive participating in a worldwide videoconference
Proceedings are videotaped and stored in an archive
  Edited and placed on a Web site
  Accessed later by others
During conference
  Sends email to an assistant
  Breaks off to answer a voice call

What this requires
For video
  sustained bandwidth of at least 64 kbps
  low loss rate
For voice
  sustained bandwidth of at least 8 kbps
  low loss rate
For interactive communication
  low delay (< 100 ms one-way)
For playback
  low delay jitter
For email and archiving
  reliable bulk transport
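The requirements listed above can be written down as a small lookup table and checked against measured path properties. This is only a sketch: the field names, the function meets_requirements, and the max_loss threshold are assumptions for illustration (the slide says only "low loss rate" without giving a number).

```python
# Bandwidth and delay figures come from the slide; names are illustrative.
REQUIREMENTS = {
    "video": {"min_kbps": 64},
    "voice": {"min_kbps": 8},
}
MAX_ONE_WAY_DELAY_MS = 100  # bound for interactive communication


def meets_requirements(kind, kbps, loss_rate, one_way_delay_ms, max_loss=0.01):
    """Return True if a measured path can carry an interactive stream of `kind`.

    max_loss is an assumed threshold standing in for "low loss rate".
    """
    req = REQUIREMENTS[kind]
    return (
        kbps >= req["min_kbps"]
        and loss_rate <= max_loss
        and one_way_delay_ms < MAX_ONE_WAY_DELAY_MS
    )


if __name__ == "__main__":
    print(meets_requirements("voice", kbps=16, loss_rate=0.001, one_way_delay_ms=40))
    print(meets_requirements("video", kbps=32, loss_rate=0.001, one_way_delay_ms=40))
```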

What if…
A million executives were simultaneously accessing the network?
  What capacity should each trunk have?
  How should packets be routed? (Can we spread load over alternate paths?)
  How can different traffic types get different services from the network?
  How should each endpoint regulate its load?
  How should we price the network?
These types of questions lie at the heart of network design and operation, and form the basis for traffic management.

Traffic management
Set of policies and mechanisms that allow a network to efficiently satisfy a diverse range of service requests
Tension is between diversity and efficiency
Traffic management = Connectivity + Quality of Service
Traffic management is necessary for providing Quality of Service (QoS)
Subsumes congestion control (congestion == loss of efficiency)
One of the most challenging open problems in networking

Outline
Economic principles
Traffic models
Traffic classes
Time scales - Mechanisms

Basics: utility function
Users are assumed to have a utility function that maps from a given quality of service to a level of satisfaction, or utility
  Utility functions are private information
  Cannot compare utility functions between users (?)
Rational users take actions that maximize their utility
Can determine utility function by observing preferences

Example
Let u = S - a t
  u = utility from file transfer
  S = satisfaction when transfer infinitely fast (t = 0)
  t = transfer time
  a = rate at which satisfaction decreases with time
As transfer time increases, utility decreases
If t > S/a, user is worse off! (reflects time wasted) -- timeout
Assumes linear decrease in utility
S and a can be experimentally determined
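The linear utility u = S - a t and its break-even point t = S/a can be sketched in a few lines. The function names and the sample values S = 10, a = 2 are illustrative assumptions, not measurements.

```python
def utility(S, a, t):
    """Linear utility of a file transfer that takes time t."""
    return S - a * t


def break_even_time(S, a):
    """Transfer time beyond which utility is negative (user is worse off)."""
    return S / a


if __name__ == "__main__":
    S, a = 10.0, 2.0  # assumed values for illustration
    for t in (0.0, 2.5, 5.0, 6.0):
        print(f"t={t}: u={utility(S, a, t)}")
    print("break-even at t =", break_even_time(S, a))
```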

Social welfare
Suppose network manager knew the utility function of every user
Social welfare is maximized when some combination of the utility functions (such as sum) is maximized
An economy (network) is efficient when increasing the utility of one user must necessarily decrease the utility of another -- conservation law
An economy (network) is envy-free if no user would trade places with another (better performance also costs more) (?)
Goal: maximize social welfare
  subject to efficiency, envy-freeness, and making a profit

Example
Assume: two users - A, B
  Single switch, each user imposes load 0.4 (= ρ)
  Same delay to both users: delay = d
  A's utility: 4 - d
  B's utility: 8 - 2d (wishes less delay than A)
Conservation law (KL_Q_v2_117): $\sum_{i=1}^{N} \rho_i q_i = \text{Constant}$
  0.4d + 0.4d = C => d = 1.25 C
  social welfare (sum of utilities) = 12 - 3.75 C
If B's delay is reduced to 0.5C, then A's delay = 2C
  sum of utilities = 12 - 3C
Increase in social welfare (sum of utilities) need not benefit everyone
  A loses utility, but may pay less for service
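The arithmetic of the two-user example (load 0.4 each, utilities 4 - d and 8 - 2d, conservation law 0.4 d_A + 0.4 d_B = C) can be checked numerically. This is a sketch with C set to 1 for convenience; the function names are illustrative.

```python
RHO = 0.4  # load imposed by each user, from the slide


def delay_a_given_b(d_b, C=1.0):
    """Solve the conservation law RHO*d_A + RHO*d_B = C for d_A."""
    return (C - RHO * d_b) / RHO


def welfare(d_a, d_b):
    """Sum of utilities: A's is 4 - d_A, B's is 8 - 2*d_B."""
    return (4 - d_a) + (8 - 2 * d_b)


if __name__ == "__main__":
    C = 1.0
    d_equal = C / (2 * RHO)          # same delay for both users: 1.25 C
    print("equal delays:", d_equal, "welfare:", welfare(d_equal, d_equal))
    d_b = 0.5 * C                    # favour the delay-sensitive user B
    d_a = delay_a_given_b(d_b, C)    # A's delay rises to 2 C
    print("B favoured: d_A =", d_a, "welfare:", welfare(d_a, d_b))
```

With C = 1 the welfare rises from 12 - 3.75 to 12 - 3, while A's own utility falls, matching the slide.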

Some economic principles
Lowering delay of delay-sensitive traffic increased welfare
  can increase welfare by matching service menu to user requirements
  BUT need to know what users want (signaling)
A single network that provides heterogeneous QoS is better than separate networks for each QoS - Q theory
  unused capacity is available to others
For typical utility functions, welfare increases more than linearly with increase in capacity (?)
  individual users see smaller overall fluctuations
  can increase welfare by increasing capacity - Q theory, law of large numbers

Principles applied
A single wire that carries both voice and data is more efficient than separate wires for voice and data
  IP Phone
  ADSL
Moving from a 20% loaded 10 Mbps Ethernet to a 20% loaded 100 Mbps Ethernet will still improve social welfare
  increase capacity whenever possible
Better to give 5% of the traffic lower delay than all traffic low delay
  should somehow mark and isolate low-delay traffic
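One way to see why a faster link at the same 20% load still helps is a queueing sketch. The M/M/1 model and the 1500-byte packet size below are assumptions made for illustration; the slides do not name a queueing model. Under M/M/1 the mean time in system is (L/C) / (1 - ρ), so at equal utilization the delay shrinks in proportion to capacity.

```python
def mm1_mean_delay(capacity_bps, utilization, packet_bits=12_000):
    """Mean time in system (seconds) for an M/M/1 queue.

    Assumes Poisson arrivals and exponentially distributed packet sizes
    with mean packet_bits; both are modeling assumptions.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    service_time = packet_bits / capacity_bps
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    slow = mm1_mean_delay(10e6, 0.2)
    fast = mm1_mean_delay(100e6, 0.2)
    print(f"10 Mbps at 20% load:  {slow * 1e3:.3f} ms")
    print(f"100 Mbps at 20% load: {fast * 1e3:.3f} ms")
```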

The two camps
Can increase welfare either by
  matching services to user requirements, or
  increasing capacity blindly
small and smart vs. big and dumb
Which is cheaper? No one is really sure!
It seems that smarter ought to be better
  otherwise, to get low delays for some traffic, we need to give all traffic low delay, even if it doesn't need it
But, perhaps, we can use the money spent on traffic management to increase welfare
We will study traffic management, assuming that it matters!

Outline
Economic principles
Traffic models (+ our research)
Traffic classes
Time scales - Mechanisms

Traffic models
To effectively manage traffic, need to have some idea of how users or aggregates of users behave => traffic model
  e.g. average size of a file transfer
  e.g. how long a user uses a modem
Models change with network usage
We can only guess about the future
Two types of models - hard to match both
  from measurements
  mathematical analysis - educated guesses

Telephone traffic models
How are calls placed?
  call arrival model
  studies show that time between calls is drawn from an exponential distribution
  memoryless: the fact that a certain amount of time has passed since the last call gives no information of time to next call
  call arrival process is therefore Poisson
How long are calls held?
  usually modeled as exponential
  however, measurement studies show it to be heavy tailed
  means that a significant number of calls last a very long time
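The memoryless property, and how a heavy-tailed holding time differs from it, can be sketched with survival functions. The Pareto distribution is used here only as one common example of a heavy-tailed law; the slides do not say which distribution the measurement studies fitted, and the parameter values are arbitrary.

```python
import math


def exp_survival(t, rate):
    """P(T > t) for an exponential distribution."""
    return math.exp(-rate * t)


def exp_conditional_survival(s, t, rate):
    """P(T > s + t | T > s); equals P(T > t) because of memorylessness."""
    return exp_survival(s + t, rate) / exp_survival(s, rate)


def pareto_survival(t, alpha, t_min=1.0):
    """P(T > t) for a Pareto distribution (assumed heavy-tailed stand-in)."""
    return 1.0 if t <= t_min else (t_min / t) ** alpha


def pareto_conditional_survival(s, t, alpha, t_min=1.0):
    """P(T > s + t | T > s): grows with s, so long calls tend to stay long."""
    return pareto_survival(s + t, alpha, t_min) / pareto_survival(s, alpha, t_min)


if __name__ == "__main__":
    print(exp_survival(2, 0.5), exp_conditional_survival(5, 2, 0.5))
    print(pareto_conditional_survival(2, 10, 1.5),
          pareto_conditional_survival(100, 10, 1.5))
```

For the exponential case the elapsed time s drops out; for the heavy-tailed case, the longer a call has already lasted, the more likely it is to last another t units.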

Internet traffic modeling
A few apps account for most of the traffic
  WWW, FTP, telnet
A common approach is to model apps (this ignores distribution of destination!)
  time between app invocations
  connection duration
  # bytes transferred
  packet interarrival distribution
Little consensus on models - hard problem
  Poisson models, Markov-modulated models
  Self-similar models

Internet traffic models: features
LAN connections differ from WAN connections
  higher bandwidth (more bytes/call)
  longer holding times
Many parameters are heavy-tailed --> self-similarity
  Examples: # bytes in call, call duration
  means that a few calls are responsible for most of the traffic
  these calls must be well-managed
  can have long bursts
  also means that even aggregates with many calls may not be smooth
New models appear all the time, to account for rapidly changing traffic mix
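The claim that a few calls carry most of the traffic when sizes are heavy-tailed can be illustrated by sampling. This is a sketch under assumptions: Pareto-distributed call sizes with shape alpha = 1.2 stand in for "heavy-tailed", and exponential sizes serve as the light-tailed comparison; neither parameter comes from the slides.

```python
import random


def top_share(sizes, fraction=0.01):
    """Fraction of total bytes carried by the largest `fraction` of calls."""
    ordered = sorted(sizes, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


def demo(n=100_000, alpha=1.2, seed=1):
    """Return (heavy-tailed share, light-tailed share) for the top 1% of calls."""
    rng = random.Random(seed)
    heavy = [rng.paretovariate(alpha) for _ in range(n)]
    light = [rng.expovariate(1.0) for _ in range(n)]
    return top_share(heavy), top_share(light)


if __name__ == "__main__":
    heavy_share, light_share = demo()
    print(f"top 1% of Pareto calls carry      {heavy_share:.1%} of bytes")
    print(f"top 1% of exponential calls carry {light_share:.1%} of bytes")
```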

Fractal dimension (1)
A point has no dimensions
A line has one dimension - length
A plane has two dimensions - length and width, no depth
Space, a huge empty box, has three dimensions - length, width, and depth
Fractals can have fractional (or fractal) dimension
  A fractal might have dimension of 1.6 or 2.4
If you divide a line segment into N identical parts, each part will be scaled down by the ratio r = 1/N

Fractal dimension (2)
A two dimensional object, such as a square, can be divided into N self-similar parts, each part being scaled down by the factor
  $r = 1 / N^{1/2}$
If one divides a self-similar D-dimensional object into N smaller copies of itself, each copy will be scaled down by a factor r, where
  $r = 1 / N^{1/D}$
Given a self-similar object of N parts scaled down by the factor r, we can compute its fractal dimension (also called similarity dimension) from the above equation as
  $D = \log(N) / \log(1/r)$

The Koch Snowflake
  $D = \log(4) / \log(3) \approx 1.26$
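The similarity dimension D = log(N) / log(1/r) is easy to evaluate directly. A minimal sketch (the function name is illustrative); the Koch curve replaces each segment by N = 4 copies scaled by r = 1/3.

```python
import math


def similarity_dimension(n_parts, r):
    """Fractal (similarity) dimension of an object made of n_parts copies,
    each scaled down by the factor r (0 < r < 1)."""
    return math.log(n_parts) / math.log(1.0 / r)


if __name__ == "__main__":
    print("line, N=3, r=1/3:   ", similarity_dimension(3, 1 / 3))
    print("square, N=4, r=1/2: ", similarity_dimension(4, 1 / 2))
    print("Koch curve, N=4, r=1/3:", similarity_dimension(4, 1 / 3))
```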

Fractal images
Fractal images can be very beautiful. Here are some of them, enjoy them!
Fractals are as much an art as a science.

Self-similarity (1)
How long is the coast-line of Great Britain?
Using sticks of different size S to estimate the length L of a coastline

Self-similarity (2)
It turns out that as the scale of measurement decreases the estimated length increases without limit.
Thus, if the scale of the (hypothetical) measurements were to be infinitely small, then the estimated length would become infinitely large!
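The stick-measurement effect can be reproduced on an idealized "coastline". The sketch below uses the Koch curve as a stand-in for a real coast (an assumption for illustration): measured with sticks of size (1/3)^n, the curve needs 4^n sticks, so the estimated length is (4/3)^n and grows without limit as the stick shrinks.

```python
def koch_measurement(n):
    """Return (stick size S, estimated length L) for a unit-base Koch curve
    measured at refinement level n."""
    stick = (1.0 / 3.0) ** n
    count = 4 ** n           # number of sticks needed at this scale
    return stick, count * stick


if __name__ == "__main__":
    for n in range(8):
        stick, length = koch_measurement(n)
        print(f"S = {stick:.6f}  ->  L = {length:.4f}")
```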

Self-similarity (3)
self-similarity
  any portion of the curve, if blown up in scale, would appear identical to the whole curve (see a coast-line from an airplane)
  Thus the transition from one scale to another can be represented as iterations of a scaling process
fractal (by Mandelbrot)
  any curve or surface that is independent of scale
  This property is referred to as self-similarity

Self-similar feature of traffic
Fractal characteristics
  order of dimension = fractal
Self-similar feature
  across wide range of time scales
Burstiness
  across wide range of time scales
Long-range dependence
  autocorrelations that span many time scales
  ACF (autocorrelation function): $r(k) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} X_t X_{t+k} \, dt$
  ACF does not decay exponentially as lag increases
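For a finite, sampled trace the ACF is estimated rather than computed as a limit. A minimal sketch of the usual sample estimator follows; note that, unlike the formula on the slide, it subtracts the sample mean and normalizes by the lag-0 value so that r(0) = 1. The function name is illustrative.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelation of the sequence x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


if __name__ == "__main__":
    # A strictly alternating series is strongly anti-correlated at lag 1.
    series = [1.0, -1.0] * 50
    print(sample_acf(series, 3))
```

On a long-range dependent trace, the values returned for large lags would fall off slowly (roughly as a power law) instead of exponentially.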

Presence of self-similar features in measured network traffic
Ethernet [Bellcore Leland 1992]
Variable-Bit-Rate video traces [Bellcore 95]
WAN-TCP [Paxson], NSFNET [Klivan], CERNET [TJU]
Common Channel Signaling Network (CCSN) - SS7 - Terabyte range [Bellcore]
MAN-DQDB and LAN cluster [Cinotti]
FTP - a file server [Raatikainen]
Web [Crovella] [TJU]
ATM-Bay Area Gigabit Testbed (BAGNet) [Siddevad]
WiFi [Stanford, UCSD, TJU]

Self-similar models
Traditional models can only capture short-range dependence -- Poisson process, ARIMA(p,d,q) etc.
Computer network traffic exhibits self-similarity, i.e. long-range dependence.
Self-similar models
  Some only describe the long-range dependence and can't be used to describe the short-range dependence
    FGN (fractional Gaussian noise)
    FDN (fractional differencing noise) = FARIMA(0,d,0)
  The FARIMA(p,d,q) (fractional autoregressive integrated moving average) model can describe both long-range and short-range dependence simultaneously
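One common way to generate an approximate FARIMA(0,d,0) series is to truncate its infinite moving-average expansion: X_t = sum_j psi_j e_{t-j}, with psi_0 = 1 and psi_j = psi_{j-1} (j - 1 + d) / j. The sketch below uses that textbook construction; it is not necessarily the generation method used in the work these slides summarize, and the truncation length n_weights and the Gaussian noise are assumptions.

```python
import random


def fd_weights(d, n):
    """First n moving-average weights of (1 - B)^(-d)."""
    psi = [1.0]
    for j in range(1, n):
        psi.append(psi[-1] * (j - 1 + d) / j)
    return psi


def farima_0d0(d, length, n_weights=500, seed=0):
    """Approximate FARIMA(0,d,0) sample path (0 < d < 0.5 for stationarity
    with long-range dependence), using a truncated MA expansion."""
    rng = random.Random(seed)
    psi = fd_weights(d, n_weights)
    noise = [rng.gauss(0.0, 1.0) for _ in range(length + n_weights)]
    return [
        sum(psi[j] * noise[t + n_weights - j] for j in range(n_weights))
        for t in range(length)
    ]


if __name__ == "__main__":
    print(fd_weights(0.3, 5))
    print(farima_0d0(0.3, 5))
```

The weights decay slowly (as a power of j), which is what gives the series its long memory; with d = 0 every weight after the first is zero and the series reduces to white noise.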

Network delay on FGN models

Network delay on FARIMA models

Network delay on FARIMA models with non-Gaussian distribution

N*N Cell Switch with Input and Output Queueing
[Figure: non-blocking space-division cell switch with inputs 1..N and outputs 1..N]

Cell Loss Probability versus Input Buffer Size

FARIMA(p,d,q) model and application
FARIMA models
Generating FARIMA processes
Traffic modeling using FARIMA models
Traffic prediction using FARIMA models
Prediction-based bandwidth allocation
Prediction-based admission control

The definition of chaos
Definition: Chaos is aperiodic time-asymptotic behaviour in a deterministic non-linear dynamic system which exhibits sensitive dependence on initial conditions.
Sensitive dependence on initial conditions (butterfly effect): in nonlinear systems, making small changes in the initial input values will have dramatic effects on the final outcome of the system.
The result of the butterfly effect is unpredictability.

Chaos vs. Randomness
Do not confuse chaotic with random:
Random:
  irreproducible and unpredictable
Chaotic:
  deterministic - same initial conditions lead to same final state… but the final state is very different for small changes to initial conditions
  difficult or impossible to make long-term predictions
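Both properties, determinism and sensitive dependence on initial conditions, can be seen in a standard toy system. The logistic map x_{n+1} = r x_n (1 - x_n) with r = 4 is used below purely as an illustration; it does not appear in the slides and is not a traffic model.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map from x0 and return all visited states."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


if __name__ == "__main__":
    a = logistic_trajectory(0.2, 60)
    b = logistic_trajectory(0.2, 60)            # same start: identical path
    c = logistic_trajectory(0.2 + 1e-10, 60)    # tiny perturbation
    print("reproducible:", a == b)
    for n in (5, 20, 40, 60):
        print(f"step {n}: |difference| = {abs(a[n] - c[n]):.3e}")
```

The same initial condition always reproduces the same trajectory, yet the perturbed run separates from the original after a few dozen steps, which is why long-term prediction fails in practice.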

Nonlinear Instabilities in TCP-RED
Priya Ranjan, Eyad H. Abed and Richard J. La, INFOCOM 2002
This work develops a discrete time feedback system model for a simplified TCP network with RED (Random Early Detection) control.
The model involves sampling the buffer occupancy variable at certain instants.
The non-linear dynamical model is used to analyze the TCP-RED operating point and its stability with respect to various RED controller and system parameters.

Fig.5. Bifurcation diagram of average and instantaneous queue length w.r.t. w, max = 0.1
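For orientation, the two ingredients of the RED controller that such an analysis varies can be sketched as follows: an exponentially weighted average of the queue length with weight w, and a drop probability that rises linearly between two thresholds up to a maximum p_max. This is the commonly described basic RED rule, not the discrete-time map from the cited paper; the threshold values are hypothetical, and reading "max = 0.1" in the caption as the maximum drop probability is an assumption.

```python
def update_average(avg, queue_len, w):
    """Exponentially weighted moving average of the queue length."""
    return (1.0 - w) * avg + w * queue_len


def drop_probability(avg, min_th, max_th, p_max):
    """Basic RED drop probability as a function of the averaged queue."""
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return p_max * (avg - min_th) / (max_th - min_th)


if __name__ == "__main__":
    avg = 0.0
    for q in (10, 30, 60, 90, 90, 90):   # hypothetical instantaneous queues
        avg = update_average(avg, q, w=0.2)
        p = drop_probability(avg, min_th=20, max_th=80, p_max=0.1)
        print(f"queue={q:3d}  avg={avg:6.2f}  p_drop={p:.4f}")
```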

Study on the chaotic nature of wireless traffic
The correlation dimension of wireless traffic traces is a non-integer number.
The largest Lyapunov exponent of wireless traffic traces is positive.
The principal components analysis showed that the intrinsic information of the traffic is concentrated in the first few lower-index components.
All these results indicate that the wireless traffic is a low dimensional chaotic system.
This gives us a good theoretical basis for the analysis and modeling of wireless traffic using Chaos Theory.

SVM-Based Models for Predicting WLAN Traffic
SVM (Support Vector Machine): a novel type of learning machine, presented by V. Vapnik et al. in 1995
Advantages
  Good generalization performance: SVM implements the Structural Risk Minimization Principle, which seeks to minimize the upper bound of the generalization error rather than only minimize the training error
  Absence of local minima: Training SVM is equivalent to solving a linearly constrained quadratic programming problem. Hence the solution of SVM is unique and globally optimal
  Small amount of training samples: In SVM, the solution to the problem only depends on a subset of training data points, called support vectors

Related Works
SVM Applications
  Pattern recognition, document classification, etc.
  Time series prediction, Internet traffic prediction, call classification for AT&T's natural dialog system, multi-user detection and signal recovery for a CDMA system, SVM-based bandwidth reservation in sectored cellular communications
Our work:
  Applying SVM for wireless traffic prediction
    One-step-ahead prediction
    Multi-step ahead prediction
  Comparing its performance with three baseline predictors

One-step-ahead prediction
Show the one-step-ahead prediction performance for various prediction methods
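One-step-ahead prediction is usually framed as supervised learning on sliding windows: the last m samples form the input and the next sample is the target. The sketch below shows that framing only. The helper names, the window length, and the last-value predictor are hypothetical; the slides do not name the three baseline predictors or the SVM configuration, so none of this should be read as the authors' setup.

```python
def make_windows(series, m):
    """Split a series into (inputs, targets) for one-step-ahead prediction."""
    inputs = [series[i:i + m] for i in range(len(series) - m)]
    targets = [series[i + m] for i in range(len(series) - m)]
    return inputs, targets


def last_value_predictor(window):
    """Naive predictor: the next value equals the most recent one."""
    return window[-1]


def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    traffic = [1, 2, 3, 4, 5]            # toy series, not measured data
    X, y = make_windows(traffic, m=2)
    preds = [last_value_predictor(w) for w in X]
    print(X, y, mean_absolute_error(preds, y))
```

A learned model such as a support vector regressor would be trained on the same (inputs, targets) pairs and compared against simple predictors with the same error metric.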