Experimental validation of the ON–OFF packet-level model for IP traffic

Computer Communications 30 (2007) 975–989


Daniel Zaragoza *, Carlos Belo

Instituto de Telecomunicações-IST, Av. Rovisco Pais, 1049-001 Lisboa, Portugal

* Corresponding author. Tel.: +351 21 841 8454; fax: +351 21 841 8472. E-mail addresses: [email protected] (D. Zaragoza), [email protected] (C. Belo).

Available online 18 September 2006

Abstract

Much effort has been spent in analyzing and modeling IP traffic from the perspective of the packet-counting process that aggregates packet arrivals over fixed time intervals. Much less has been done regarding the modeling of the patterns of arrival of IP packets. The main purpose of this study was to determine whether the ON–OFF packet-level model is an adequate model. As shown in this paper, at the aggregate level the answer is yes, to great accuracy. Further, it compares favorably to other models such as modulated Poisson arrivals. At the IP-flow level, the ON–OFF model is not adequate; rather, an active–inactive model is appropriate. The reason is that the rate in the ON state varies greatly from one ON period to the other. We further argue experimentally that it is only necessary to consider a small number of hosts and host pairs to account for the impact on a queuing system and on the long-term variability of the traffic. The starting point of the approach is to analyze and model traffic along the dimensions of the packet size and packet inter-arrival processes. The method is applied to well-known and publicly available traces from the ITA and NLANR repositories, as well as traces of traffic captured in our institution premises.

© 2006 Elsevier B.V. All rights reserved.

Keywords: IP traffic; Analysis; Modeling; ON–OFF; Packet-level model

1. Introduction

The analysis and modeling of packet traffic, and Internet Protocol (IP) traffic in particular, has received considerable attention from the research community in the last decade; see [1] and included references.

IP traffic is commonly defined via a counting process – the number of packets or bytes in a fixed time interval. Much effort has been spent in analyzing, finding models for, and providing physical explanations of the scaling properties of the counting process. Much less has been done regarding the analysis and modeling of the patterns of arrivals of IP packets, i.e., how individual packets appear in the counts.

The main purpose of this study was to determine whether a packet-level ON–OFF model is adequate and accurate. The well-known (Section 2) packet-level ON–OFF model for a source is characterized by the following properties: (a) the source alternates between ON and OFF periods, whose durations are random variables. The ON and OFF variables are mutually independent and are drawn according to (generally) different distributions. Each ON (OFF) period is independent of other ON (OFF) periods. (b) During an ON period the source emits packets of constant size at constant intervals, which are the same for all ON periods, that is, the ON bit-rate is a constant; during an OFF period the source does not emit packets. If, for example, the ON bit-rate varies from one ON period to the other, or the ON and OFF periods are not independent, then the model is not ON–OFF; rather, it is an active–inactive model and additional distributions must be supplied.
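To make the definition concrete, the sketch below generates arrivals from such a source. It is an illustration only: the function and parameter names are ours, and the exponential/Pareto samplers are hypothetical placeholders (the paper itself uses empirical distributions measured from each trace).

```python
import random

def on_off_source(n_bursts, on_iat, pkt_size, burst_len_dist, off_dist, seed=0):
    """Yield (arrival_time_s, size_bytes) pairs for one ON-OFF source.

    During an ON period, packets of constant pkt_size arrive every
    on_iat seconds; burst lengths and OFF durations are drawn
    independently from the supplied samplers, as the model requires.
    """
    rng = random.Random(seed)
    t = 0.0
    for _ in range(n_bursts):
        for _ in range(burst_len_dist(rng)):  # ON period: constant size and iat
            yield t, pkt_size
            t += on_iat
        t += off_dist(rng)                    # OFF period: no packets

# Hypothetical parameters, roughly in the range of the old Ethernet traces:
pkts = list(on_off_source(
    n_bursts=1000,
    on_iat=1.3e-3,                            # ON-state iat, ~10 Mb/s line rate
    pkt_size=1196,                            # bytes
    burst_len_dist=lambda r: 1 + int(r.expovariate(1 / 6)),
    off_dist=lambda r: r.paretovariate(1.5) * 1e-2,
))
```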

To increase confidence in a model, standard practice consists in performing a large number of experiments under different conditions; however, a single counter-example is sufficient to invalidate that model. In order to increase the confidence in the ON–OFF model, and to find when and where it is not applicable, we use different traces, captured at different sites and at different times (Section 3.1). Most of these traces are publicly available at [2] or [3]; this allows building upon known results, whenever applicable. A necessary second step is to obtain the parameters of the model and to verify that it leads to the same results as the original trace. This second step is a necessary exercise in model fitting; we emphasize that model fitting is not the main purpose of this work. Given the variety of traces considered, the actual parameters differ from trace to trace; what is retained is whether the patterns of packet arrivals are best represented as ON–OFF. As a third step, we also compare the ON–OFF model with alternative models, which are actually simpler to fit, and verify that they are less accurate.

The main purpose of this paper is to show that the ON–OFF packet-level model is an accurate model for IP traffic at the aggregate level, i.e., irrespective of senders and receivers, at least for the traces we consider here. At the level of IP-flows, defined here by the 2-tuple (source address, destination address), the ON–OFF model is not applicable. The reason is that the ON bit-rate is highly variable from one ON period to the other; thus the corresponding model is more an active–inactive model than ON–OFF. Further, whenever possible (not all traces contain full headers), we have verified that such variability is not due to congestion. That is, the traces we examined for this purpose do not contain retransmitted packets or retransmission requests. However, an in-depth analysis of this variability is not in scope here, and further development must be left for future work. This paper also reports on the methods that we have used to obtain the parameters of the ON–OFF models. The methods at the aggregate level and at the IP-flow level are different.

Besides packet-level modeling at the aggregate level and IP-flow level, two additional points are developed here. The first one concerns the number of simultaneously active fast flows that most influence the queuing behavior and, by way of consequence, are the most important in a given trace. That number is commonly small, up to 10–15, over all traces. The second point concerns the question of whether these fast senders/flows are also responsible for the observed long-term variability of traffic. For some traces the answer is yes, but for others we must also include slower senders. Overall, the number of these senders is still small, up to a few tens.

The paper is organized as follows. In the next section, we discuss related work.

In Section 3 we provide background material. We first give details about the traces we consider. We then review the ON–OFF model while motivating its use in this work.

Section 4 presents an overview of the method we use for the analysis of a given trace. From the perspective of modeling, the main result of this analysis is that some packet sizes, or ranges of sizes, are more important than others in terms of their effects on a queuing system and on the traffic. We call this set the dominant component; the rest is called the remaining component. The dominant component in each trace is made of the packet sizes that senders use to transfer relatively large amounts of data. Popular sizes are 1500 B and 512 or 536 B (B stands for byte). However, implementers and system administrators are free to set the packet size to whatever value they deem appropriate; therefore, these IP packet sizes vary from site to site.

Section 5 is devoted to the modeling at the aggregate level. The models developed there are for the dominant component. Our modeling approach considers two processes. One is a modulating process. The other, which is in focus here, is the process of individual packet arrivals. The modulating process is obtained from the trace and is not modeled; it is a device used to take into account the long-term variability of the traffic in each trace. To validate our packet-level models we use an event-driven queuing system with an infinite buffer. The following three requirements are used to validate a model. (a) The simulated traces have the same number of packets and the same bit-rate as the original trace. Because there are several hundred thousand to millions of packets in a trace, our intent is to reproduce the original trace in one shot only. (b) A model is validated when both the overflow probability and the density (actually, the histogram) of the queue content are the same as, or very close to (within a few percent), those of the original trace with the same output capacity. (c) This matching must be obtained over a range of utilizations producing low buffer occupancy (tens of Kbytes) to high buffer occupancy (Mbytes range). The reason for this requirement is presented in Section 6.
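For concreteness, the validation metric can be computed with an event-driven FIFO of the following kind. This is a minimal sketch under our own naming, not the authors' simulator: the queue drains at the output capacity between arrivals, and the backlog sampled at arrivals yields the overflow probability P(Q > x) and the histogram of the queue content.

```python
def queue_contents(packets, capacity_bps):
    """Event-driven FIFO with an infinite buffer. `packets` is an iterable
    of (arrival_time_s, size_bytes) pairs in time order; `capacity_bps` is
    the output link rate. Returns the backlog in bytes seen at each arrival."""
    backlog_bits, last_t, samples = 0.0, None, []
    for t, size in packets:
        if last_t is not None:
            # Drain at link rate during the inter-arrival gap.
            backlog_bits = max(0.0, backlog_bits - (t - last_t) * capacity_bps)
        backlog_bits += 8 * size
        samples.append(backlog_bits / 8)
        last_t = t
    return samples

def overflow_prob(samples, x_bytes):
    """Empirical P(Q > x) over the sampled queue contents."""
    return sum(s > x_bytes for s in samples) / len(samples)
```

Utilization is then the trace's average bit-rate divided by capacity_bps, so sweeping capacity_bps produces the low- to high-occupancy range that requirement (c) asks for.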

In Section 6, we compare other modeling approaches and their accuracy relative to the present approach. The experimental results presented in this section also lead us to conjecture that, in an open-loop context, the patterns of packet arrivals have no relevance at high buffer occupancy. Therefore, comparisons and validation of packet-level models must include measurements at low buffer occupancy.

In Section 7, we use the well-known idea of an IP-flow, i.e., a sequence of packets from a source IP address to a destination IP address, to extend the ON–OFF model to that level. We also show that the number of simultaneously active flows that impact the queuing behavior is small.

In Section 8, we argue that the observed long-term variability of the traffic can be accounted for by considering a small number of hosts and host pairs.

The conclusions and discussion are presented in Section 9.

In the remainder of the paper, we use the following abbreviations. Inter-arrival time is abbreviated as iat. Measurement point is abbreviated as MP. Semi-Markov chain is abbreviated as SMC. BL stands for burst length. LRD stands for long-range dependence (or dependent). Other abbreviations such as TCP, HTTP, FTP, and DNS are common in the field.


2. Related work

The literature on traffic modeling is vast; we stress the following topics and ideas that are closely related to the present work. These topics include the ON–OFF and other packet-level models, traffic decomposition, the technique of Poissonification, traffic non-stationarity, the existence of fast IP-flows, the analysis of traffic along the dimensions of packet size and packet iat, and the use of a queuing system as an analysis/validation tool.

Surveys on traffic modeling appeared in [1,4–8]. With the exception of [4], these surveys focus on the mathematical modeling of the counting process. For mathematical details, the reader is referred to these papers. Ref. [4] presents an extensive list of possible mathematical models together with their properties, as of 1997. Ref. [1] is oriented towards structural models, while Ref. [6] focuses on black-box models. In Ref. [5], in addition to the survey, an attempt is made to model the closed-loop nature of TCP via chaotic maps. Ref. [7] focuses on the modeling of the long-term variability of network traffic, i.e., its LRD property. Ref. [8] focuses on the multiscaling properties of traffic at shorter time-scales.

The ON–OFF model, either in the form of a packet-level model or a fluid-flow model, is well known in the teletraffic community [9]. It was introduced in the late 1970s in the context of packetized voice [10]. As early as 1974, an ON–OFF model (without being named so) was considered in [11] for data-terminal traffic. It has also been widely considered for IP traffic, in particular to explain its LRD property [1]. A connection has been made with the distribution of file sizes, which was found to be heavy-tailed.

A related model is the packet-train model of [12]. It was derived from a token-ring local area network trace and is not exactly ON–OFF, as packets in both directions are included in the train. We quote the conclusions of [12]: "The intertrain time is a user parameter, and it depends on the frequency with which applications use the network. The intercar interval is a system parameter and depends on the network hardware and software." This conclusion was stated in 1986. Here, we use a variety of more recent IP traffic traces, and our claim is that the ON–OFF model is adequate and accurate. However, the actual parameters depend on the site where the traffic has been captured. In addition, unlike [12], we verify that the model produces the same effects on a queuing system as the original traffic.

According to the ON–OFF model, a file or digital object is transferred at constant rate. Moreover, the rate is the same for all transfers. In [13] this second assumption is shown not to be valid. In particular, the authors analyze traffic at the TCP-flow level, i.e., TCP connections, and find that the connection rate (size/duration) varies from connection to connection and is best modeled by a Weibull distribution. Here, we consider IP-flows rather than TCP-flows. We find that the rate of IP-flow bursts is variable within the same IP-flow as well as across IP-flows.

The authors of [14] propose a compound model in the form of a Poisson cluster process. The arrival of TCP-flows is modeled with a Poisson process, while the packet arrivals within flows are modeled with a Gamma-renewal process. In that paper, logscale diagrams (LDs) are used as the main validation tool. While for some traces the model provides a reasonable fit to the LDs, for other traces the fit is not so good. Here, our metric is the queuing behavior at low and high buffer occupancy.

In Refs. [15,16] traffic is analyzed by decomposition. In [15], the burstiness of the traffic in 1 s intervals comes from two applications: Sybase and DNS. The other applications produce smooth traffic. In [16], the traffic trace is shown to be multifractal at the IP level (see also [1,6,8] for mathematical details). This multifractality is that of TCP. In turn, HTTP traffic is multifractal, while the FTP-data traffic is monofractal. Here, we also use the idea of decomposition but, because we work at the IP level, our decomposition is based on packet size. We decompose traffic into a dominant component and a remaining component.

The technique of Poissonification has been used in [15] to address the issue of the critical time scale. The technique, called the semi-experimental method, has also been used in [14] with a different purpose. We also make extensive use of that technique. The advantage is that a Poisson process is neutral with respect to the counting process, besides being very easy to generate.
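As an illustration, Poissonification takes only a few lines: keep the packets (and hence sizes and count) but redraw the arrival instants from a Poisson process of the same mean rate. A minimal sketch with our own naming, not the exact procedure of [14,15]:

```python
import random

def poissonify(packets, seed=0):
    """Redraw arrival instants as a Poisson process with the same mean
    rate, keeping the packet count and sizes (semi-experimental method).
    `packets` is a list of (arrival_time_s, size_bytes) in time order."""
    rng = random.Random(seed)
    mean_iat = (packets[-1][0] - packets[0][0]) / (len(packets) - 1)
    t, out = packets[0][0], []
    for _, size in packets:
        out.append((t, size))
        t += rng.expovariate(1.0 / mean_iat)  # exponential iats -> Poisson arrivals
    return out
```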

The question of non-stationarity has been considered in [17–19]. Here, we introduce non-stationarity in the form of the modulating process. Another form of non-stationarity that we have observed takes the form of level-shifts, plateaus, or trends in the long-term average of the bit-rate process. These patterns can span very large portions of a given trace. Although, at the aggregate level, the modulating process can account for these patterns, for reasons of simplicity we have chosen to consider only traces or sub-traces (which still have many packets) that do not show these patterns. Besides visual inspection of a plot of the bit-rate process, we use a queuing system to detect these patterns. By definition, the utilization of a link is the ratio of the measured average bit-rate of the trace to the capacity of the output link. We collect and plot, e.g., the average queue content every second, for different values of the utilization of the output link. We expect this measure to fluctuate for the duration of the trace. If the queue does not empty during, e.g., half of the trace and is empty for the other part, the trace is split.

In Ref. [20], traffic in the Sprint backbone is analyzed at the AS-flow level (AS stands for Autonomous System) using essentially the (discrete) wavelet transform. In particular, the authors observe different small-time scaling behaviors depending on where the trace has been captured. They introduce the notions of dense and sparse (AS-)flows to explain that observation. We find that fast IP-flows do have a great impact on the queuing behavior. However, in some traces we must also include slower senders to account for the long-term variability of the traffic.

Analysis of traffic along the dimensions of the packet size and/or packet iat processes can be found in [21–23]. For the traces considered in those papers, LRD has been established. Using different traces, opposite views are presented in [19,24]. Recently, the results of [19] have been contradicted in [25]. As discussed in Section 4, our principal tool for traffic analysis is a semi-Markov chain (SMC) rather than the autocorrelation function or the wavelet transform. This SMC tool allows us to conclude that, for the traces we consider, including the MRA backbone trace, the packet sizes are not independently drawn. The same applies to the packet iats. Indeed, this lack of independence is the basis for our ON–OFF packet-level modeling.

The idea of using a queuing system as an analysis/validation tool has been independently proposed in [26]. However, there are differences in the implementation of the idea. In [26], traffic is aggregated in 10 ms intervals and the resulting fluid queue content is analyzed and compared to that of the original trace. Here, we use an event-driven system that takes packets as input. The main reason for this choice is that information on the queue content is lost when a fluid arrival process is used instead of the actual packet arrival process. This is clearly visible when the histograms of the queue content are compared. On the other hand, two advantages of fluid queue simulations over event-driven simulations are a much faster simulation time under high-load conditions and, as discussed above, the quick detection of non-stationarity in a trace.

Finally, the origin of (TCP) flow rates has been considered in [27]. The relationship between volume and rate is also studied there. The classification provided in that paper is a starting point for an explanation of the widely variable rates that we observe at the IP-flow level.

3. Background material

In this section, we first present general information about the traces considered in this study. We then motivate the ON–OFF model as a packet-level model. Finally, we give details regarding the ON–OFF packet-level model.

Table 1. General information about the traces used in this paper

Name          Original filename      Date/Captured on     Duration   # IP snd./recv.   Comments
AUG89(-if0)   BC-pAug89.TL.Z         August 1989/Ether    52 min     N/A
OCT89(-if11)  BC-pOct89.TL.Z         October 1989/Ether   600 s      N/A
LBL3(-if0)    lbl-tcp-3.tar.Z        January 1994/Ether   2 h        1540/1613         50 hosts send 80% of bytes
DEC1(-if10)   dec-pkt-1.tar.Z        March 1995/Ether     32 min     2416/2764         10 hosts send 79.6% of bytes
BUF1(-if1)    BUF-1039113560-1.tsh   December 2002/OC3    90 s       1476/5061         50 hosts send 88.5% of bytes
MRA(-if2)     MRA-1076599531-1.tsh   February 2004/OC12   90 s       77344/234450      1 host sends 3.6% of bytes (500 send 70%)
IT2(-if2)     n/a                    March 2004/Ether     2 h        679/36            1 host sends 39% of bytes
IT3(-if2)     n/a                    March 2004/Ether     3 h        720/38            1 host sends 55% of bytes

3.1. Traces considered

Table 1 provides general information about the traces considered in this paper. The first six traces are publicly available; the reader is referred to the ITA [2] and NLANR [3] Internet sites for further information. Most of these traces are well-known and have been studied and modeled many times from the counting-process perspective; we do not repeat the results of these studies here. We also indicate in Table 1 the number of different IP addresses of senders and receivers, which ranges from a few tens to several hundred thousand. These traces were captured at seven different sites and cover about 15 years of IP traffic, from 1989 to 2004. Other traces from the NLANR repository have been analyzed, but as they were not modeled, we do not mention them here. The choice of traces was made to take into account the evolution of IP traffic due to new applications and the variety of link capacities, while also including the "famous" Bellcore traces. We use the per-interface (if) NLANR classification to differentiate directions of traffic flow. Whenever applicable, we use only the direction with more traffic. The ITA traces, labeled if0, were captured on a shared Ethernet and both directions are mixed. Further, the "Bellcore" traces are in the form of a {time, size} series only; the LBL and DEC traces are traces of TCP traffic without information about sequence numbers and receive window. The remaining traces are also TCP traffic only but include sequence numbers and receiver window, which allows detecting retransmissions and flow control due to low or zero window announcements. The OCT89-if11 sub-trace corresponds to the excerpt from 1150 s to the end of the original trace, which shows a level-shift. It contains 450738 packets for an average bit-rate of 4 Mb/s. The DEC1-if10 trace corresponds to the excerpt from 1100 s to 3000 s of the original trace; this excerpt corresponds to the plateau in the original trace. It contains 1.2 million packets for an average bit-rate of 1.33 Mb/s. The BUF and MRA traces are Internet2 traffic. As concerns our institution (IT) traces, IT2-if2 and IT3-if2, they were captured using the windump program via the copy port of the 100 Mb/s Ethernet switch located right behind our access router. We have separated the traffic into local traffic, which includes traffic to and from our university via a high-speed wired link, outgoing Internet traffic (if1), and incoming Internet traffic (if2). At the time of the capture, the main access of our institution to the Internet was via a point-to-point (lossy) 802.11b wireless link. The access was further average-rate-limited to 1 Mb/s by Cisco's Committed Access Rate algorithm running in the router of our Internet Service Provider.

3.2. Motivation for the ON–OFF packet-level model

As a picture is worth a thousand words, Fig. 1 illustrates our motivation for a packet-level ON–OFF model. In the figure, we plot excerpts of actual packet iat series (in ms). The excerpts on the left plot are for the aggregate traffic, i.e., irrespective of sending/receiving hosts, but for selected packet size ranges. The right plot displays excerpts for specific host pairs. The plots show signs of ON–OFF patterns both in the old AUG89 (1989) trace and in the recent BUF1 (2002) and MRA (2004) traces. Bursts of packets appear approximately back-to-back at, or close to, the MP line rate (10 Mb/s, OC3, or OC12), separated by silence. The logarithmic y-axis shows that silence can span several orders of magnitude. Similar observations can be made when considering specific host pairs (right plot).

[Fig. 1. Excerpts from packet iat series for selected sizes; both plots show iat (ms, log scale) versus iat number. Left: aggregate traffic (AUG89-if0, 1000–1100 B packets; BUF1-if1, 1400–1500 B packets; MRA1-if2, 1400–1500 B packets). Right: selected host pairs (BUF1-if1 56->433, 1500 B; LBL3 145->250, 1500 B; MRA1-if2 38->39, 1500 B).]

The figure also illustrates another point. In the plot for the AUG89 trace, and quite consistently in the whole trace, one can see that a burst of packets is followed by a silence of about 4–6 ms, after which two packets appear, themselves followed by a large (>10 ms) silence before the cycle restarts. A similar pattern (a periodicity of about 12–15 ms, leading to a peak rate of about 2 Mb/s) is visible on the right plot for the 1500 B packets that host 145 sends to host 250 in the LBL3 trace. These packets are from the X-window connection that exists between these two hosts throughout the trace. Essentially, only these two hosts send/receive 1500 B packets in that trace. There are four simultaneous connections between hosts 38 and 39 in the MRA trace. Packets from these connections occasionally pile up for transmission in the sender and appear at the MP at full OC12 rate. These observations suggest that thresholds for deciding whether a packet belongs to the same ON state as the previous packets (alternatively, for deciding whether the interval between two packets defines an OFF period) are of importance for the purpose of obtaining an approximate but accurate packet-level model. For example, a tentative threshold of 10 ms could be used for the AUG89 trace, while it could be set at 12 ms for the LBL3 trace due to the major influence of the X-window connection (see also Fig. 5). For faster links, this threshold is actually lower. Another point worth noting is that, due to the great variety of traces, it is not reasonable to expect the same, or similar, set of model parameters. Other factors of importance, such as long-term variability, hinder considering a purely stationary model; details are discussed in Section 5.

3.3. The ON–OFF packet-level model

As mentioned in Section 1, according to the ON–OFF packet-level model, while the source is in the ON state, packets are emitted with a constant interval between them (i.e., at constant rate) from a source to a destination. In the OFF state, no packet is emitted.

The model is characterized by the distributions of the ON and OFF periods, the packet size, and the packet iat in the ON state. For data traffic the ON time is not directly available; a more convenient parameter is the distribution of the burst length (BL, a.k.a. burst size), i.e., the number of packets appearing at peak rate. For convenience, the model is illustrated in Fig. 2. That model is used for the dominant component of a given trace.

[Fig. 2. Illustration of the ON–OFF packet-level model: ON when packet iat ≤ TO, OFF when packet iat > TO; the example shows an ON period with BL = 7.]

To obtain these parameters from the data we proceed as follows. First, a threshold, or timeout, TO, that determines an ON or OFF period is selected. TO is judiciously chosen and further adjusted by trial and error. To obtain an estimate for TO, plots like Fig. 1 are helpful but, given the large number of packets, are of limited practical use. To refine the choice, we use a plot of the histogram of iats and a subsequent SMC analysis (Section 4) over the whole series for the dominant component.

Once the threshold is defined, the OFF (empirical) distribution is obtained from the data. In general, it is not possible to fit an equation to the distribution; we therefore use the empirical distribution "as is". Our purpose is to validate the ON–OFF model, not to determine what distributions should apply to all traces.

The (empirical) distribution of BL is measured from the data. This is the number of packets with iat ≤ TO. In general we also use this empirical distribution "as is".

The remaining parameters, i.e., the constant packet iat in the ON state (not TO) and the packet size, are obtained from the data. The constant packet iat corresponds to the mode of the histogram of iats up to TO having 1-step dependency; this corresponds to packets arriving at approximately the MP line rate. The packet size is taken as the average over the dominant component. Additionally, the iat value can serve to adjust the number of packets produced and the bit-rate to match those of the original trace. Unlike the packet iats, the packet size can be made state-dependent (in particular, for the LBL3 trace, where the 1500 B and 552 B packets are equally important), where the states are those of the modulating process mentioned in Section 1 (see also Section 5).
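Putting the above together, the parameters can be measured from an arrival-time series in a single pass of the following shape. This is a sketch with our own names; TO and the histogram bin width are trace-specific choices, as discussed above.

```python
from collections import Counter

def extract_on_off_params(times, TO, bin_us=10):
    """Split an arrival-time series (seconds) into ON bursts and OFF gaps
    using timeout TO; return the empirical BL distribution, the OFF-gap
    samples, and the modal in-burst iat (the assumed ON-state iat)."""
    burst_lengths, off_gaps, on_bins = [], [], Counter()
    bl = 1
    for t0, t1 in zip(times, times[1:]):
        iat = t1 - t0
        if iat <= TO:
            bl += 1
            on_bins[int(iat * 1e6) // bin_us] += 1  # iat histogram, bin_us bins
        else:
            burst_lengths.append(bl)
            off_gaps.append(iat)
            bl = 1
    burst_lengths.append(bl)
    mode_bin = on_bins.most_common(1)[0][0] if on_bins else None
    on_iat = (mode_bin + 0.5) * bin_us * 1e-6 if mode_bin is not None else None
    return Counter(burst_lengths), off_gaps, on_iat
```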

The previous ON–OFF model is used for the aggregate traffic. For the modeling at the IP-flow level, we increase the value of TO and consider only bursts of 10 packets or more (therefore, some low-rate bursts as well as entire IP-flows are eliminated). As discussed in Section 7, at the IP-flow level the ON–OFF model is not applicable; rather, an active–inactive approach is more appropriate.

4. Traffic analysis

In this section, we present the method we used for the analysis of a given trace from the perspective of packet size and iat. We emphasize a particular point, i.e., the detection of the dominant component, which is the most relevant for modeling. Before describing the method proper, we discuss the main tools we use for the analysis.

We extensively use histograms because details are better visualized. In particular, for the iat process, the minimum bin size used in the histogram is chosen depending on the accuracy of the timestamps and the MP link capacity. However, distributions do not reveal possible correlation and dependency structures.

A well-known non-parametric tool used to obtain information about the underlying second-order structure of the process under consideration is the autocorrelation function. However, it only indicates an average tendency, although it may reveal existing periodicities. Wavelet analysis is believed to be a robust tool but is mainly used for the detection of scaling properties of the counting process. In particular, scaling in a process can be obtained with independent but heavy-tailed variables, or the process may have a light-tailed marginal distribution but be LRD. We refer the reader to the recent Refs. [28,29] for a discussion of the strengths, limitations, and pitfalls in the interpretation of (discrete) wavelet analysis results. We prefer to use a SMC as a measurement tool. It appears that we have rediscovered an old technique used in [30] for the analysis of bit errors in telephone channels. This is a parametric approach (the states of the chain are user-defined) but it allows obtaining fine-grained views of a time series. The SMC is used here as a tool for exploratory data analysis. An analogy could be that of an engineer turning the knobs of his measurement device (e.g., a spectrum analyzer) to better focus on characteristics of a signal. Given the non-overlapping user-defined states, we measure the transition matrix and the holding time (BL) in each state. For example, for the AUG89 trace we can define the states of the iat to be "0–10 ms" and "above 10 ms" and verify (via the BL distribution) that the small-iat state is rather persistent while the large-iat state has much less persistence. Note also that the SMC approach allows proving that the values in a series are not independent. Indeed, when values are independently drawn, the transition matrix has identical rows, and the per-state distribution of holding times is geometric, with parameter derived from the corresponding diagonal element of the transition matrix. On the other hand, the SMC tool only gives an indication that there is, at least, one-step dependency for the selected states. In particular, we cannot conclude that the corresponding process is (semi-)Markovian or that the selected states provide a good model. Finally, this SMC analysis gives us a starting point for the selection of the timeout value used for the separation of ON and OFF periods mentioned in Section 3.3.
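As an illustration of the tool (our own sketch, with the user-chosen state boundaries passed as an argument), the transition matrix and per-state holding times can be measured as follows. Under independence, every row of the matrix would be identical and each holding-time distribution geometric.

```python
import numpy as np

def smc_analysis(series, boundaries):
    """Classify each sample into states (state i covers values in
    ]boundaries[i-1], boundaries[i]]), then measure the state transition
    matrix and the holding time (BL) spent in each state per visit."""
    states = np.searchsorted(np.asarray(boundaries), np.asarray(series))
    n_states = len(boundaries) + 1
    trans = np.zeros((n_states, n_states))
    holding = {s: [] for s in range(n_states)}
    run_state, run_len = states[0], 1
    for prev, cur in zip(states[:-1], states[1:]):
        trans[prev, cur] += 1
        if cur == run_state:
            run_len += 1
        else:
            holding[run_state].append(run_len)   # close the finished run
            run_state, run_len = cur, 1
    holding[run_state].append(run_len)
    trans /= np.maximum(trans.sum(axis=1, keepdims=True), 1)  # row-normalize
    return trans, holding

# E.g., the AUG89 iat states "0-10 ms" and "above 10 ms":
# trans, holding = smc_analysis(iats, boundaries=[0.010])
```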

The previous discussion is illustrated in Fig. 3 for the BUF1 trace. The left plot shows the histogram of the iats for packets in the 1400–1500 B range. The main mode is at 80 µs, which is the transmission time of a 1500 B packet on an OC3 link. A secondary mode is at 160 µs. Also indicated are the three states selected and the corresponding transition matrix. The right plot shows the measured BL distribution for states 1 and 2 and, for comparison, the associated geometric distributions. Clearly, these packets do not appear independently of one another. After performing a state aggregation, a good value of the ON–OFF timeout for this trace is 160 µs, which corresponds to the secondary histogram mode.

The steps of the method are as follows. We first verify that a trace does not contain backward time jumps, negative packet sizes, or suspicious periodic bit-rate variations. Second, because our method focuses on packet iats, we determine the accuracy of the timestamps. This is obtained by analyzing the differences between the minimum theoretical iat and the measured iat. For this purpose, the layer-1 and layer-2 technologies and framing used on the MP link must be known.
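As an example of that check (the helper below is ours; the Ethernet overhead figures are standard): the minimum iat between two back-to-back packets is the on-the-wire transmission time of one framed packet, so measured iats below it reveal timestamp inaccuracy.

```python
def min_theoretical_iat(size_bytes, line_rate_bps, l2_overhead_bytes):
    """On-the-wire transmission time of one framed packet, i.e., the
    smallest possible iat between two back-to-back packets of this size."""
    return 8 * (size_bytes + l2_overhead_bytes) / line_rate_bps

# 1500 B IP packets on 10 Mb/s Ethernet: 38 B of preamble, MAC header,
# FCS, and inter-frame gap give ~1.23 ms, close to the ~1.3 ms ON-state
# iat reported for the Ethernet traces in Table 4.
print(min_theoretical_iat(1500, 10e6, 38))
```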

[Fig. 3. Example of SMC analysis of packet iat for the BUF1-if1 trace, 1400–1500 B packets. Left: iat histogram in 10 µs bins (total iat count: 409,070; 58% are ≤ 100 µs, 80.8% are ≤ 200 µs), with the three selected states: state 1 = ]70 µs, 100 µs], state 2 = ]100 µs, 200 µs], state 3 = > 200 µs. Measured transition matrix (rows = from-state):
        1      2      3
  1   0.737  0.170  0.093
  2   0.393  0.376  0.231
  3   0.224  0.318  0.447
Right: measured BL distribution per state for states 1 and 2, with the associated geometric fits.]


The basic idea of the third step, upstream queuing, is from [31]. However, we extend it by measuring, among others, the BL distribution of the back-to-back packets and the distribution of the sizes of queued packets. Although the percentage of back-to-back packets may be relatively high, this analysis allows us to conclude that the links we study are not congested. Indeed, over all traces, the maximum number of back-to-back packets is only up to a few tens. The distribution of the sizes of queued packets gives a first indication of what packets to look for. The next step in the method is, in a combined fashion, the analysis of packet sizes, packet iats, and bit-rate using the tools already mentioned. This analysis reveals that, for the traces considered, packet sizes are not independently drawn either.

The detection of the dominant component proceeds in three steps:

(a) For each packet size, we measure its contribution to the average bit-rate of the trace. The results of this first step are given in Table 2 for the top10 packet sizes of some traces. Well-known sizes are sought first, but they may not appear in a trace, nor need they be the most numerous. For example, the non-standard 144 B and 156 B packets are the most numerous in the Bellcore traces but have negligible influence compared to the 1064 B and 1072 B packets present in these traces.

(b) We select intervals of packet sizes and plot their bit-rate over, e.g., time intervals of 1 s (e.g., Fig. 5).

(c) For these candidate sizes, we analyze, via the histogram and SMC tools, their bit-rate demand at the smallest possible time-scale, which is the transmission time on the MP link. From the analysis we deduce that specific packet sizes, or ranges of sizes, have dependent short iats that come in bursts. In addition, their bit-rate demand is high, and when high it is sustained. This strongly suggests the presence of a dominant component in the trace. To further validate the hypothesis, we "Poissonify" (with the same average iat) the arrival of the packets of that supposed component without changing their size and number. We merge that new trace with the remaining component extracted from the original trace, and compare the queuing behavior with the original. The results are that these packets have a major influence on the bit-rate process and on a queuing system. Applying the procedure to the remaining component shows that it has no influence. The results of this procedure are shown in Fig. 4 for the BUF (left) and LBL3 (right) traces. It is visible that poissonifying the packets of the dominant component has a tremendous effect, while poissonifying the packets of the remaining component has no effect.

Table 2. Examples of packet sizes versus average bit-rate (BR) demand

AUG89 (1,000,000 pkts; avg BR 1059.66 kb/s)   LBL3 (1,789,995 pkts; avg BR 350.75 kb/s)   BUF1-if1 (634,461 pkts; avg BR 63,975 kb/s)
Size (B)  % pkts  % BR                         Size (B)  % pkts  % BR                       Size (B)  % pkts  % BR
1072      17.4    44.8                         552       19.9    62.3                       1500      54.7    72.4
1500      7.3     26.2                         40        32.1    7.3                        1496      6.5     8.6
144       20.3    7.0                          1500      0.84    7.2                        1372      3.3     4.0
156       18.4    6.9                          41        14.2    3.3                        1420      3.0     3.8
920       0.9     1.95                         72        5.4     2.2                        1300      2.3     3.6
552       1.25    1.7                          576       0.28    0.90                       1368      1.4     1.7
1224      0.51    1.5                          104       1.3     0.76                       1336      0.94    1.1
46        10.5    1.2                          45        2.4     0.60                       576       1.5     0.78
1064      0.44    1.1                          60        1.6     0.55                       1216      0.32    0.64
124       2.93    0.87                         80        0.9     0.44                       628       1.1     0.61

[Fig. 4. Poissonification of components, effect on a queuing system; utilization in parentheses. Left: BUF1-if1, D = 1400–1500 B. Right: LBL3, D = D1 + D2 with D1 = 500–600 B and D2 = 1400–1500 B. ORIG is the original trace, D the dominant component, R the remaining component; O stands for Original (unchanged), P for Poissonified.]

Table 3 gives the actual dominant component for each trace considered here. In some traces, the dominant component is made of specific sizes, while for others it is made of a range, e.g., 1400–1500 B. Such a range includes the more specific sizes 1400, 1420, 1496, and 1500 B. Other sizes, such as 1300, 1336, 1368, and 1372 B, also appear (Table 2) but are much less numerous and influential than the first range, so we included them in the remaining component.

Table 3. Dominant component for the traces considered in this paper

Trace      Dominant component
AUG89      1000–1100 B (1072 B) and 1400–1500 B (1500 B)
OCT89      1000–1100 B (1064 B and 1072 B) and 1400–1500 B
LBL3       500–600 B (552 B) and 1400–1500 B (1500 B)
DEC1       500–600 B (552 B) and 1400–1500 B (1500 B)
BUF1-if1   1400–1500 B (1500 B)
MRA-if2    1400–1500 B
IT2-if2    1420 B and 1064 B
IT3-if2    1420 B

Fig. 5 shows, as an example, the bit-rate processes for the dominant (dom) and remaining (rem) components for the BUF1-if1 (over 100 ms) and LBL3 (over 10 s) traces. For better readability, the plots for the remaining component have been offset (right plot) and the averages are given. Clearly, that component is low-rate and smooth, although variance-time plots (not shown) would lead one to conclude that it also has high long-term variability. As concerns the averages, the remaining component is not negligible compared to the dominant component. The dominant component shows long-term variability (notice the low-frequency undulations) and possibly spikiness, as in the LBL3 trace. Fig. 5 is also intended to illustrate the fact that IP traffic can be qualitatively quite different from trace to trace. Thus, finding models (if they exist) that apply to all is quite challenging. While the plot for the BUF trace shows some smoothness, the corresponding plot for the LBL3 trace is quite spiky. Both traces show variability over five orders of magnitude; both traces have the same number of senders (Table 1), although the first is 90 s long while the second is 2 h long.

[Fig. 5. Plots of bit-rate processes for the dominant and remaining components. Left: BUF1-if1 trace, BR (Mb/s) versus time over 100 ms intervals (BRdom avg: 54.4 Mb/s, BRrem avg: 9.6 Mb/s). Right: LBL3 trace, BR (kb/s) versus time over 10 s intervals; the 1500 B and 552 B components are plotted separately (BR 500–600 B avg: 226.7 kb/s, BR 1400–1500 B avg: 25.2 kb/s, BRrem avg: 98.8 kb/s, offset by 1100).]

5. ON–OFF packet-level modeling at the aggregate level

For some traces (in particular, the BUF trace), it is possible to fit a stationary packet-level ON–OFF model that gives the same queuing results (within a few percent) as the original trace up to a utilization of 0.8, which corresponds to medium buffer occupancy. However, due to the long-term variability of the traffic (Fig. 5), the model fails at higher utilization. This failure leads us to consider a modulating process to take into account this high variability without having to actually model that process. In this approach, we keep the ON–OFF packet arrival but make it dependent on the state of the modulating process. In this section, we describe how the modulating process is determined and discuss the results of the modeling approach.

Finding an adequate modulating process proceeds in several steps. For a given trace, a well-matched modulating process is obtained when the queuing behaviors are the same for both the original trace and the packet-level model under low to high buffer occupancy.

The first step is to define a time interval, S, and then to define states (three or four, depending on the trace) of bit-rate, for the dominant component only. Besides visual inspection of the bit-rate process (similar to Fig. 5), good help is obtained from histograms of the bit-rate with small bins (e.g., 100 kb/s bins). From the histogram and the counts in the various bins, we define states corresponding to low, medium (spread around the average), medium-high, and high rate. Once the states are defined, the parameters of the per-state ON–OFF process are obtained from the trace as described in Section 3.3. Therefore, the ON–OFF process is conditionally stationary given the state. A new trace is produced according to the model; the times of state change in this generated trace are those of the original trace. We then merge this new trace with the untouched remaining component of the original trace. Finally, the queuing behaviors are compared. In practice, the tricky part is low buffer occupancy. In our experience, for high buffer occupancy the choice of S is not critical and many models (including not-tuned ON–OFF ones) provide a good match. When the queuing behavior is not satisfactory at low occupancy, the interval S is reduced until a good match is obtained. One trace (LBL3) required up to four passes through the procedure.

Table 4. Parameter set for the modulating process and ON–OFF model at the aggregate traffic level

Parameter    AUG89(-if0)           OCT89(-if11)           LBL3(-if0)            DEC1(-if10)            MRA(-if2)
S            1 s                   1 s                    10 s                  5 s                    100 ms
BR states    (1) [0-600]           (1) [0-3500]           (1) [0-250]           (1) [0-1100]           (1) [0-180]
             (2) [600-1000]        (2) [3500-4000]        (2) [250-500]         (2) [1100-1600]        (2) [180-210]
             (3) [1000-2000]       (3) [4000-5000]        (3) [500-950]         (3) [1600-2000]        (3) >210
             (4) >2000 (kb/s)      (4) >5000 (kb/s)       (4) >950 (kb/s)       (4) >2000 (kb/s)       (Mb/s)
TO           10 ms                 10 ms                  12 ms                 10 ms                  139 µs
ON iat       1.3 ms (all states)   1.3 ms (all states)    1.65 ms (all states)  1.4 ms (all states)    25 µs (all states)
Packet size  1196 B (all states)   1101 B (all states)    (1) 556 B             (1) 560 B              1481 B (all states)
                                                          (2) 557 B             (2) 561 B
                                                          (3) 637 B             (3) 569 B
                                                          (4) 835 B             (4) 554 B, or 570 B for all states

Table 4 presents the parameter set per modeled trace. The complete model also includes the per-state empirical distributions (not shown). In Fig. 6 we plot, as examples, the queuing results for the LBL3 and MRA traces (ORIG stands for the original trace). Note the good matching between the results from the original trace and the model, for both the overflow probability and the histogram of the queue content, at low as well as high buffer occupancy. For the LBL3 trace the matching is not perfect at a utilization of 0.2; however, the model is still a good approximation. The values in Table 4 indicate that the bit-rate states vary considerably from one trace to the other. The parameter S is on the order of seconds for the old Ethernet traces, while it is on the order of 100 ms for the more recent high-speed traces. Further, for all traces the ON packet iat approximately corresponds to line rate, while the packet size is approximately constant in all cases except for the LBL3 trace, where the 1500 B packets of the X-window session are of great importance. The packet size is the per-state average packet size taken from the trace (dominant component only). The TO value is independent of the states as well as independent of S, which, being large compared to TO, allows for several ON–OFF cycles per S-interval. This TO value is obtained from the analysis of the trace as a whole, as per Sections 3.3 and 4.
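A sketch of the resulting generator follows (our own names: `params` would hold the per-state samplers and constants measured from the trace, and `state_changes` the state-change instants read off the original trace, as described above).

```python
import random

def modulated_on_off(state_changes, params, t_end, seed=0):
    """Modulated ON-OFF generation: within each state interval, emit
    ON-OFF traffic with that state's parameters. `state_changes` is a
    time-ordered list of (time_s, state); `params[state]` is a tuple
    (burst_len_dist, off_dist, on_iat_s, pkt_size_bytes)."""
    rng = random.Random(seed)
    out = []
    intervals = zip(state_changes, state_changes[1:] + [(t_end, None)])
    for (t0, state), (t1, _) in intervals:
        bl_dist, off_dist, on_iat, size = params[state]
        t = t0
        while t < t1:
            for _ in range(bl_dist(rng)):     # ON period at constant iat
                if t >= t1:
                    break
                out.append((t, size))
                t += on_iat
            t += off_dist(rng)                # OFF period
    return out
```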

6. Comparison with other possible approaches

The previous approach may appear complicated, using a number of heuristics to obtain the ON–OFF timeout and the S-interval values, which are furthermore trace-specific. One may wonder whether a simpler, more straightforward, and more generally applicable method could give accurate results. As an example, we show here that a very straightforward modulated Poisson approach is not accurate.


[Fig. 6. Comparison of queuing behavior between the original trace and the ON–OFF model (ORIG vs ONOFF; utilization in parentheses). Top: LBL3 trace, overflow probability %P(Q > x) at utilizations 0.2–0.5 and queue-content density at utilizations 0.2–0.3. Bottom: MRA trace, overflow probability and density at utilizations 0.7–0.9 (1 KB = 1024 B).]

Fig. 7 shows a comparison of queuing results for the LBL3 trace when a Poisson process is used instead of the ON–OFF process. In this setting, the S-interval and the packet sizes are the same as in Table 4. The only difference is that, instead of being ON–OFF, the arrival of packets is Poisson with the same per-state average rate as the trace.

Fig. 7 shows that the modulated Poisson approach is not accurate enough for the overflow probability at utilizations 0.2 and 0.3 (left plot), where the Poisson curves are well above or well below the original curves. The right plot shows that the histograms are not well reproduced for low buffer values at different utilizations; compare with the top-right plots of Fig. 6. Note also that the tail of the overflow probability is well reproduced for large buffer occupancy. Such good reproduction of the tail at high buffer occupancy occurs for this trace, as well as for other traces, when an aggregate fluid approach is adopted (not shown), or when the ON–OFF model is not tuned. These observations lead us to conjecture that, in an open-loop context, for high buffer occupancy the patterns of packet arrivals have no, or little, relevance. This also strongly suggests that the validation of packet-level models must include comparisons at low buffer occupancy.

[Fig. 7. Comparison of queuing behavior between the original LBL3 trace and modulated Poisson packet arrivals (ORIG vs PoissPkts; utilization in parentheses). Left: overflow probability %P(Q > x) at utilizations 0.2–0.4. Right: queue-content density at utilizations 0.2–0.3.]

7. Modeling at the IP-flow level

In this section, we show that the ON–OFF model does not extend to the IP-flow level, for the reason that the ON rate varies greatly from one ON period to the other. Another fact is that the number of simultaneously active influential flows is quite small.

7.1. IP-flow level modeling

We consider IP-flows, i.e., unidirectional sequences of packets having the same source address and destination address, and the packet sizes of the dominant component. For each trace, we select the topX (X = 100–400) most-sending (elephant) IP-flows. For most traces, this number includes all IP-flows. By extending the ON–OFF approach described above, together with an appropriate selection of the timeout value, it is possible to extract a per-IP-flow subset of the packets of the dominant component that, from the perspective of the queuing effects, is of great importance in a given trace. That subset is composed of the bursts of 10 packets or more of a selection of IP-flows. The selection is validated in two steps. First, we Poissonify the arrival of these packets, without changing their size, and verify that they have a major impact on the queuing behavior. Second, using the actual times of arrival of the IP-flow bursts, we replace the arrival of packets within bursts by a deterministic or Poisson process with the same average iat and packet size as those of the original burst. We then verify that the queuing behavior is the same as that of the original trace for low buffer occupancy as well as for high buffer occupancy. As for the modeling at the aggregate level, the timeout value is of importance for low buffer occupancy.

Table 5 displays, in the second row, the parameter values for the traces we applied our method to. The first parameter is the minimum number of packets per burst (10 packets), the second is the minimum bit-rate per burst (in kb/s or Mb/s), and the third is the timeout used for the selection of per-IP-flow bursts. Note that the selections of Table 5 are such that adding bursts (by lowering the bit-rate threshold) does not improve the matching, while removing bursts greatly worsens that matching. The table also indicates (third row) the percentages relative to the total number of packets present in the trace and to the packets of the dominant component, respectively. These values show that, from the set of packets of the dominant component, it is necessary to consider only a subset, which is of greater importance for the queuing behavior.

Table 5. Parameters for the IP-flow bursts selection (second row: minimum packets per burst, minimum bit-rate, timeout; third row: % of all packets in the trace, % of dominant-component packets)

Trace        LBL3-if0                            DEC1-if10           BUF1-if1          MRA-if2          IT2-if2
Selection    (10P-250k-1 s) and (50P-150k-1 s)   (10P-400k-150 ms)   (10P-20M-15 ms)   (10P-5M-15 ms)   (10P-150k-100 ms)
Percentages  (7.9%, 36.8%) and (9.6%, 44.8%)     (12.6%, 32.6%)      (29.4%, 34.6%)    (8.05%, 33.6%)   (41%, 64.4%)
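The per-flow selection can be sketched as follows (our own naming; the thresholds correspond to those in Table 5): split each IP-flow's packet series on the timeout, then keep the bursts meeting the packet-count and bit-rate minima.

```python
def select_bursts(times, sizes, TO, min_pkts=10, min_rate_bps=400e3):
    """For one IP-flow, given arrival times (s) and sizes (bytes) in time
    order, return (start_time, n_packets, rate_bps) for each qualifying
    burst, where a burst ends at any iat exceeding TO."""
    bursts, start = [], 0
    for i in range(1, len(times) + 1):
        if i == len(times) or times[i] - times[i - 1] > TO:
            n, dur = i - start, times[i - 1] - times[start]
            if n >= min_pkts and dur > 0:
                rate = 8 * sum(sizes[start:i]) / dur   # average burst bit-rate
                if rate >= min_rate_bps:
                    bursts.append((times[start], n, rate))
            start = i
    return bursts
```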

[Fig. 8. Queuing results for ON–OFF modeling at the IP-flow level; utilization in parentheses. Left: MRA-if2, original trace versus Poissonified burst packets (ORIG vs PoissPkts) at utilization 0.9. Middle: MRA-if2 (10P-5M-15 ms), ORIG versus DET and POI burst replacements at utilizations 0.6–0.8. Right: LBL3 (10P-250k-1 s), ORIG versus DET and POI at utilizations 0.1, 0.15, and 0.6–0.8.]

Fig. 8 shows examples of queuing results. The left plot is obtained by Poissonifying (PoissPkts label) all the packets of the bursts of Table 5 for the MRA trace. It is visible that these packets have a great influence. The middle and right plots are obtained by replacing the actual per-burst packet arrival by a deterministic iat process (DET label) or by a Poisson process (POI label), as mentioned above. The plot for the LBL3 trace at utilization 0.1 suggests that there is some structure which is not captured by the coarse 1-s timeout.

Once the selection is validated, we further analyze the properties of the IP-flow bursts, such as bit-rate and burst iat. The left plot of Fig. 9 shows the per-burst bit-rate (minimum of 5 Mb/s, as per Table 5) for the top50 flows of the MRA trace. Not all flows are present, due to the selection rule. Only one of the large flows (flow #1) has an approximately constant ON bit-rate. Some flows have rates spanning three orders of magnitude. Because the ON bit-rate fluctuates widely from one ON period to the other, the ON–OFF model is not applicable at the IP-flow level. Rather, an active–inactive model is more adequate, and one must supply the joint distribution of the burst length and rate. Although it is not in the scope of this paper to find an explanation for these widely varying rates, we note that congestion and packet losses are not responsible for this behavior.

The right plot of Fig. 9 shows a scatter plot of bit-rate(x-axis) versus burst-length (y-axis) for all the bursts of atleast 10 packets that we detected per IP-flow in the DECtrace. Similar plots are obtained for other traces althoughactual values are different; similar observations can alsobe made. In this plot, the selected bursts (Table 5) are thosewith rate larger than 400 kb/s. The plot illustrates the factsthat (a) it is not possible to observe large and high ratebursts. (b) Not all IP-flows and their bursts have an impact

F1-if1 MRA-if2 IT2-if2

0P-20M-15 ms) (10P-5M-15 ms) (10P-150k-100 ms)9.4%, 34.6%) (8.05%, 33.6%) (41%, 64.4%)

0.001

0.01

0.1

1

10

100

0 20 40 60 80 100 12

% P

(Q>

x)

ORIG(.1) DET(.1) POI(.1)

ORIG(.15) DET(.15) POI(.15)

80 100 120 140

POI(.6) DET(.6)POI(.7) DET(.7)POI(.8) DET(.8)

LBL3 (10P-250k-1s)

0.001

0.01

0.1

1

10

100

0 20 40 60 80 100 120 140x (KB)

% P

(Q>

x)

ORIG(.1) DET(.1) POI(.1)

ORIG(.15) DET(.15) POI(.15)

80 100 120 140

POI(.6) DET(.6)POI(.7) DET(.7)POI(.8) DET(.8)

B)

-5M-15ms)

the IP-flow level; utilization is in parenthesis.

MRA1-if2 (10P-5M-15ms)- max=537Mb/s

1

10

100

1000

0 5 10 15 20 25 30 35 40 45 50pair nb

BR

DEC1-if10 (10P-xxk-150ms) minBR=35kb/s, maxBR=10Mb/s

10

100

1000

10000

10 100 1000 10000BR(kb/s)

BL

Fig. 9. Left: bit-rate per burst for the top50 IP-flows in the MRA trace. Right: Burst length versus bit-rate for 10-packet bursts in the DEC1 trace.



7.2. The number of simultaneously active IP-flows – Flow queuing

An interesting question regarding the selected IP-flows is: how many are simultaneously active? The answer is illustrated in Fig. 10, where, for better visualization, we only present excerpts. Except for the MRA trace, that number is at most four. In these traces, the queuing behavior is generally influenced by a single high-rate IP-flow. For the MRA trace, that number is up to 15. In this trace, the average number of active IP-flows is 5.6; further, a Poisson distribution reasonably fits the empirical distribution of that number (the number of bursts is 49,864), and the burst iats are uncorrelated. This relatively small average corresponds to the active IP-flows that most influence a queuing system. However, in this 90-s trace, our scripts detected a total of 3940 IP-flows that send at least 10 packets of size between 1400 and 1500 B. Therefore, even in a backbone trace, a small number of IP-flows can account for the effects on a queuing system.
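One way to reproduce the count of simultaneously active bursts, and the Poisson check, is sketched below. The interval width, the 90-s horizon, and the scipy-based comparison are our own assumptions.

    import numpy as np

    def active_counts(bursts, dt=0.1, t_end=90.0):
        # bursts: (start_s, end_s) pairs for the selected IP-flow bursts.
        # Returns the number of bursts overlapping each interval
        # [k*dt, (k+1)*dt) over [0, t_end).
        starts = np.arange(0.0, t_end, dt)
        counts = np.zeros(len(starts), dtype=int)
        for s, e in bursts:
            counts[(starts < e) & (starts + dt > s)] += 1
        return counts

    # Crude check of the Poisson fit at the sample mean:
    #   from scipy.stats import poisson
    #   c = active_counts(bursts)
    #   emp = np.bincount(c) / len(c)                # empirical pmf
    #   fit = poisson.pmf(np.arange(len(emp)), c.mean())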

8. An empirical explanation of the long-term variability of IP traffic

In the previous sections, we were interested in the selection of packets or IP-flows that most influence the queuing behavior.

Fig. 10. The number of simultaneously active influential flows in different traces.

Another question is whether this selection is also sufficient to account for the observed long-term variability of the traffic. For some traces, as illustrated in the left plot of Fig. 11, the selection is adequate. We obtained this plot by removing from the trace the packets corresponding to the selection of Table 5 and adding an offset for better visualization and comparison with the original data. The removal of these packets has a manifest smoothing effect on the 1-s data and on the 30-s moving average (labeled MA) shown. However, for other traces, a complementary approach is necessary to provide a better account of the long-term traffic variability.

This complementary approach consists in detecting the sender hosts (or host pairs) that are the major contributors to the traffic variability. The right plot of Fig. 11 illustrates the approach for the MRA backbone trace. The plot represents (a) the bit-rate per 100-ms interval for the original dominant component (top curve); (b) the bit-rate achieved by a selection of senders (label SA) having a bit-rate larger than 1 Mb/s in each interval (middle curve); (c) the difference of the two (bottom curve, with an offset). For better visualization, a 2-s (forward) moving average is also plotted for each curve. Qualitatively, it is visible that the difference curve is much smoother than the middle curve, which, nicely enough, follows the original curve. The undulations of the moving average, as well as the spikes visible at the 100-ms time scale, have been greatly reduced. Quantitatively, first, there is not a great difference in the averages: 111.7 Mb/s for the middle curve and 76.9 Mb/s for the bottom curve.
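A sketch of the sender-selection and differencing step follows. The record layout and function names are our own illustration; only the 100-ms interval and the 1 Mb/s threshold come from the text.

    import numpy as np
    from collections import defaultdict

    def split_by_fast_senders(packets, dt=0.1, thresh_bps=1e6):
        # packets: (timestamp_s, size_bytes, sender) tuples. For each
        # interval of width dt, split the aggregate bit-rate into the
        # part contributed by senders above `thresh_bps` in that
        # interval (the SA curve) and the remainder (difference curve).
        t0 = min(ts for ts, _, _ in packets)
        bits = defaultdict(lambda: defaultdict(float))  # interval -> sender -> bits
        for ts, size, src in packets:
            bits[int((ts - t0) / dt)][src] += 8.0 * size
        n = max(bits) + 1
        fast, rest = np.zeros(n), np.zeros(n)
        for k, senders in bits.items():
            for src, b in senders.items():
                (fast if b / dt >= thresh_bps else rest)[k] += b / dt
        return fast, rest  # per-interval bit-rates in b/s

A 2-s moving average at dt = 0.1 s is then, e.g., np.convolve(fast, np.ones(20) / 20.0, mode='same').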


Fig. 11. Accounting for long-term variability of the bit-rate process by differencing. Left: LBL3 trace (1-s intervals). Right: MRA trace (100 ms intervals).


Further, the application of the variance–time method results in a Hurst parameter of 0.83, 0.83, and 0.68 for the data of the top, middle, and bottom curves, respectively. This confirms the qualitative impression that the long-term variability is smaller for the bottom curve, and that the data of the middle curve reproduce the original long-term variability.
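The variance–time method used here is the classical one: aggregate the series over non-overlapping blocks of size m and fit the slope of log Var(m) against log m, which behaves as 2H - 2. A minimal sketch, with the block sizes and their spacing chosen arbitrarily:

    import numpy as np

    def hurst_variance_time(x, block_sizes=None):
        # Variance-time estimate of the Hurst parameter H.
        x = np.asarray(x, dtype=float)
        if block_sizes is None:
            block_sizes = np.unique(
                np.logspace(0, np.log10(len(x) // 10), 20).astype(int))
        log_m, log_v = [], []
        for m in block_sizes:
            n_blocks = len(x) // m
            if n_blocks < 2:
                continue
            means = x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)
            v = means.var()
            if v > 0:
                log_m.append(np.log(m))
                log_v.append(np.log(v))
        slope = np.polyfit(log_m, log_v, 1)[0]   # slope = 2H - 2
        return 1.0 + slope / 2.0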

The left plots of Fig. 12 are more striking. The plots represent, per 100-ms interval, the number of hosts sending (a) at least one packet of any size in the interval; (b) at least one packet of the dominant component; (c) packets of the dominant component, but achieving a bit-rate of at least 1 Mb/s over the interval; (d) as before, but with a bit-rate of 5 Mb/s. The rounded average numbers are 2877, 317, 46, and 4, respectively. These numbers clearly speak for themselves. The right plots of Fig. 12 are for the LBL3 trace with a 5-s interval. The number of senders of packets of any size clearly shows an increasing trend, which is put in evidence by the least-squares regression line. We therefore conclude that, even for a backbone link (there are over 77,000 different sending IP addresses in this 90-s trace), it is necessary to take into account only a few host pairs as concerns the queuing behavior (previous section), and a few tens of senders as concerns the long-term variability of the traffic. The right plots in Fig. 12 also show that the variability of the number of senders in a trace does not constitute a good indicator of the long-term variability of the traffic. Many sources send too little data to have an influence either on a queuing system or on the bit-rate process.

Fig. 12. Number of senders (SA). Left: MRA trace (100-ms intervals). Right: LBL3 trace (5-s intervals).

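The sender counts of Fig. 12 amount to the following computation; the size range of the dominant component (1400-1500 B, as quoted for the MRA trace above) and the rate thresholds come from the text, while the names and structure are a hypothetical illustration.

    import numpy as np
    from collections import defaultdict

    def senders_per_interval(packets, dt=0.1, min_rate_bps=0.0,
                             size_range=(0, 10**9)):
        # Distinct senders per dt-interval, counting only packets whose
        # size falls inside `size_range` and only senders whose bit-rate
        # over the interval reaches `min_rate_bps`.
        t0 = min(ts for ts, _, _ in packets)
        bits = defaultdict(lambda: defaultdict(float))
        for ts, size, src in packets:
            if size_range[0] <= size <= size_range[1]:
                bits[int((ts - t0) / dt)][src] += 8.0 * size
        counts = np.zeros(max(bits) + 1, dtype=int)
        for k, senders in bits.items():
            counts[k] = sum(1 for b in senders.values()
                            if b / dt >= min_rate_bps)
        return counts

Selection (c), for instance, would be senders_per_interval(pkts, 0.1, 1e6, (1400, 1500)); for a count series c, np.polyfit(np.arange(len(c)), c, 1) gives the least-squares trend line.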

9. Conclusions and discussion

The main motivation for this study was to determine whether an ON–OFF packet-level model is an adequate and accurate model for IP traffic. For that purpose, we used a number of traces captured at different sites and different times. Our main conclusions are as follows.

Conclusion 1. For the aggregate traffic, the ON–OFF packet-level model is adequate to great accuracy (Section 5).

Discussion 1. For a given trace, the essence of the method consists in: (a) the determination of the dominant component, and (b) the determination of the adequate modulating process. The modulating process is a device that accounts for the long-term variability of IP traffic, common to all traces we considered. The method used for the detection of the dominant component, as well as of other characteristics of a given trace, is described in Section 4. Given our focus on the packet level, the modulating process was not modeled. However, in some cases, a model is known for that modulation. For example, at the 1-s time-scale, a good model for the Bellcore traces is Fractional Brownian Motion (FBM) [32]. In essence, the approach developed here consists in sampling the Gaussian marginal distribution of the FBM for these traces and producing individual packets according to an ON–OFF model. Note that each of the other traces requires its own model for the modulation. That model cannot be FBM because, unlike the Bellcore traces, the autocorrelation functions at different time-scales are very dissimilar.
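As an illustration of this generation recipe, the sketch below draws exponential ON and OFF durations purely for convenience (the distributions fitted in Section 5 would be used in practice) and lets a user-supplied modulating function, e.g. a sampled FBM marginal held constant over 1-s intervals, rescale the ON rate; all names and defaults are our own.

    import random

    def on_off_source(duration_s, pkt_size_b, on_rate_bps,
                      mean_on_s, mean_off_s, modulation=lambda t: 1.0):
        # One ON-OFF source: during ON periods, packets of constant size
        # are emitted at the (modulated) ON bit-rate; OFF periods are
        # silent. `modulation` must return positive values.
        t, out = 0.0, []
        while t < duration_s:
            on_end = t + random.expovariate(1.0 / mean_on_s)
            while t < min(on_end, duration_s):
                out.append((t, pkt_size_b))
                t += 8.0 * pkt_size_b / (on_rate_bps * modulation(t))
            t = max(t, on_end) + random.expovariate(1.0 / mean_off_s)
        return out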

Conclusion 2. We have shown that the ON–OFF model compares favorably to other, easier-to-fit, packet-level approaches such as modulated Poisson arrivals (Section 6). Further, our experimental results strongly suggest that, in an open-loop context: (a) the patterns of arrival of IP packets have no relevance at high buffer occupancy, and (b) the comparison and validation of packet-level models ought to be done at low buffer occupancy.

Conclusion 3. We have experimentally shown that the ON–OFF model does not extend to the IP-flow level (Section 7). This is because the ON bit-rate varies widely from one ON period to another. This fact has also been verified in Internet2 traces showing no evidence of congestion or packet losses.

Discussion 3. Providing a full model at the IP-flow level requires, at least, the distribution of burst iat and the joint distribution of burst length and burst rate. Further, these distributions would depend on the location and time of the traffic capture; that is, the distributions would be quite different for the old DEC trace and the recent Internet2 traces. The modeling and explanation of such characteristics were not in scope here.

Conclusion 4. We have shown that, for each trace where IP addresses were available, including a backbone trace, the number of simultaneously active IP-flows that are important in terms of the effects on a queuing system is quite low, up to a maximum of about 10–15. Further, one needs only consider the fastest among the most-sending IP-flows.

Discussion 4. On the one hand, this result is interesting for per-flow queuing. Indeed, one would need to keep state for very few flows. On the other hand, one needs to detect these few flows among the possibly many thousands that may co-exist. This raises the following question: if one finds the motivation for doing so, how does one efficiently detect these important flows?

Conclusion 5. Finally, we have argued in Section 8 that, even for a backbone trace, it is necessary to consider only up to a few tens of senders among several tens of thousands to account for the long-term variability of the traffic.

Discussion 5. Such low numbers render manageable the analysis of per-host or per-host-pair traffic patterns on an individual rather than statistical basis.

Limitations. The models developed here are open loop. The majority of IP traffic consists of data carried by TCP, which operates in a closed-loop fashion. Therefore, the proposed models (as any open-loop model) are descriptive rather than predictive. That is, because TCP adapts its sending pattern to the network conditions, one cannot use an open-loop model for prediction purposes. It is our opinion that open-loop models are still useful for gaining an understanding of actual IP traffic. Furthermore, we are not aware of any publication showing that open-loop models are useless or wrong for real TCP traffic subjected to a low loss ratio (e.g., below 1% of packet losses) or a moderate load (e.g., below 60% of link capacity).

Possible future work. The traces analyzed and modeled here are from research sites and networks; it would be interesting to determine whether our analyses and conclusions still hold for traffic captured at the edges of commercial networks providing Internet access via cable, xDSL, or standard modem technologies.

Acknowledgement

Work funded by grants SFRH/BD/8163/2002 and POSI/EIA/60061/2004 of Fundação para a Ciência e a Tecnologia of Portugal.

References

[1] W. Willinger, V. Paxson, R.H. Riedi, M.S. Taqqu, Long range dependence and data network traffic, in: P. Doukhan, G. Oppenheim, M.S. Taqqu (Eds.), Long Range Dependence: Theory and Applications, Birkhauser, 2002.

[2] The Internet Traffic Archive. Available from: http://ita.ee.lbl.gov, 13-04-2006.

[3] Passive Measurement and Analysis (PMA). Available from: http://pma.nlanr.net/, 13-04-2006.

[4] D.L. Jagerman, B. Melamed, W. Willinger, Stochastic modeling of traffic processes, in: J. Dshalalow (Ed.), Frontiers in Queuing, CRC Press, Boca Raton, FL, 1997.

[5] A. Erramilli, M. Roughan, D. Veitch, W. Willinger, Self-similar traffic and network dynamics, Proc. IEEE V90 (N5) (2002).

[6] I.W.C. Lee, A.O. Fapojuwo, Stochastic processes for computer network traffic modeling, Computer Commun. 29 (2005) 1–23.

[7] O. Cappe, E. Moulines, J.-C. Pesquet, A. Petropulu, X. Yang, Long-range dependence and heavy-tail modeling for teletraffic data, IEEE Signal Processing Magazine, 2002.

[8] P. Abry, R. Baraniuk, P. Flandrin, R. Riedi, D. Veitch, Multiscale nature of network traffic, IEEE Signal Processing Magazine, 2002.

[9] J. Roberts, U. Mocci, J. Virtamo (Eds.), Broadband Network Teletraffic, Final Report of Action COST 242, Springer-Verlag, Berlin, 1996.

[10] J. Weinstein, Fractional speech loss and talker activity model for TASI and for packet-switched speech, IEEE Trans. Com. V26 (N5) (1978) 1253–1257.

[11] G.C. Groenendaal, L.O.G. Kristiansson, J.B.H. Peek, The behavior of a hold and forward concentrator for a type of dependent interarrival statistics, IEEE Trans. Com. V22 (N3) (1974) 273–277.

[12] R. Jain, S.A. Routhier, Packet trains: measurements and a new model for computer network traffic, IEEE JSAC V4 (N6) (1986).

[13] J. Aracil, D. Morato, Characterizing Internet load as a non-regular multiplex of TCP streams, in: Proc. ICCCN'00, 2000.

[14] N. Hohn, D. Veitch, P. Abry, Cluster processes, a natural language for network traffic, IEEE Trans. Sig. Proc. V51 (N8) (2003) 2229–2244.

[15] V. Bolotin, J. Coombs-Reyes, D. Heyman, Y. Levy, D. Liu, IP traffic characterization for planning and control, Proc. ITC 16 (1999) 425–436.

[16] S. Molnar, T.D. Dang, Scaling analysis of IP traffic components, Proc. ITC Spec. Sem. (2000) 18-1–18-9.

[17] J. Cao, W.S. Cleveland, D. Lin, D.X. Sun, On the nonstationarity of Internet traffic, Proc. ACM Sigmetrics'01 (2001) 102–112.

[18] C. Liu, S.V. Wiel, J. Yang, A nonstationary traffic train model for fine scale inference from coarse scale counts, IEEE JSAC V21 (N6) (2003) 895–907.

[19] T. Karagiannis, M. Molle, M. Faloutsos, A. Broido, A nonstationary Poisson view of Internet traffic, Proc. Infocom'04 (2004).

[20] Z.-L. Zhang, V.J. Ribeiro, S. Moon, C. Diot, Small-time scaling behavior of Internet backbone traffic, Computer Networks 48 (2005) 315–334.

[21] P. Abry, D. Veitch, Wavelet analysis of long-range dependent traffic, IEEE Trans. Info. Theory V44 (N1) (1998) 1–15.

[22] J. Gao, I. Rubin, Multiplicative multifractal modeling of long-range dependent network traffic, Int. J. Commun. Syst. 14 (2001) 783–801.

[23] L. Yao, M. Agapie, J. Ganbar, M. Doroslovacki, Long range dependence in Internet backbone traffic, Proc. ICC'03 (2003) 1611–1615.

[24] J. Cao, W.S. Cleveland, D. Lin, D.X. Sun, Internet traffic tends toward Poisson and independence as the load increases, in: C. Holmes, D. Denison, M. Hansen, B. Yu, B. Mallick (Eds.), Nonlinear Estimation and Classification, Springer, Berlin, 2002.

[25] H. Jiang, C. Dovrolis, Why is the Internet traffic bursty in short time scales, in: Proc. ACM Sigmetrics'05, 2005.

[26] D.A. Rolls, G. Michailidis, F. Hernandez-Campos, Queuing analysis of network traffic: methodology and visualization tools, Computer Networks 48 (2005) 447–473.

[27] Y. Zhang, L. Breslau, V. Paxson, S. Shenker, On the characteristics and origins of Internet flow rates, in: Proc. ACM Sigcomm'02, 2002.

[28] S. Stoev, M.S. Taqqu, C. Park, J.S. Marron, On the wavelet spectrum diagnostic for Hurst parameter estimation in the analysis of Internet traffic, Computer Networks 48 (2005) 423–445.

[29] D. Veitch, N. Hohn, P. Abry, Multifractality in TCP/IP traffic: the case against, Computer Networks 48 (2005) 293–313.

[30] P.A.W. Lewis, D.R. Cox, A statistical analysis of telephone circuit error data, IEEE Trans. Com. V14 (N4) (1966) 382–389.

[31] J. Cao, W.S. Cleveland, D. Lin, D.X. Sun, The effect of statistical multiplexing on the long-range dependence of Internet packet traffic, Bell Labs Technical Report, 2002.

[32] I. Norros, On the use of fractional Brownian motion in the theory of connectionless networks, IEEE JSAC V13 (N6) (1995) 953–962.