HaTCh: A Two-level Caching Scheme for Estimating the ...

6
HaTCh: A Two-level Caching Scheme for Estimating the Number of Active Flows Sungwon Yi, Xidong Deng, George Kesidis, and Chita R. Das Department of Computer Science & Engineering The Pennsylvania State University University Park, PA 16802 syi, xdeng, kesidis, das @cse.psu.edu Abstract—In this paper, we present a Markov model to examine the capability of SRED in estimating the number of active flows. We show that the SRED cache hit rate can be used to quantify the number of active flows. We then propose a modified SRED scheme, called HaTCh (Hash-based Two-level Caching), that uses hashing and a two-level caching mechanism to accurately estimate the number of active flows under various workloads. We formulate a preliminary Markov model of the proposed scheme to show its effectiveness in preventing the monopoly of misbehaving flows. Simulation results indicate that the proposed scheme provides better estimation of the number of active flows compared to SRED, stabilizes the estimation with respect to workload fluctuations, and prevents performance degradation by efficiently isolating the misbehaving flows. I. I NTRODUCTION Internet congestion control is an important, but admittedly complex problem primarily because of the unpredictable traffic dynamics. Several active queue management (AQM) schemes have been proposed for congestion control to minimize high packet loss rates and global synchronization in the Inter- net [1] [2] [3] [4] [5] [6]. Two crucial functions of an AQM scheme are to estimate the level of congestion and to respond accordingly either by randomly dropping or marking the pack- ets. In these schemes, average queue length [2] [4] [5] or link idle time and packet loss due to the buffer overflow [1] was used as an indication of congestion level. However, none of these schemes are very effective since they give limited information about the level of congestion. Recently a new approach, called Stabilized RED (SRED), 1 [7] drew wide attention since it proposed to use the number of active flows as an indication of the con- gestion level. It is believed that the number of active flows is a better indicator of network congestion compared to other parameters such as average queue length and packet loss event. Further, the number of active flows can be used as a configuration parameter to stabilize the AQM control systems [8] [3]. Therefore, an accurate estimation of this number can provide a better handle to congestion control. In SRED, a small cache memory, called zombie list, is used to record the M most recently seen flows. Each cache line, zom- bie, contains the source and destination address pair, last arrival time, and hit count of the flow. Each arriving packet (source and destination address) is compared with a randomly selected cache line. If the addresses match (called a hit), the hit count of In this paper, we use “SRED” to indicate the flow estimation capability of SRED. the cache line is increased by 1. Otherwise (called a miss), the selected cache line is replaced by the arriving flow’s address with a replacement probability . To estimate the number of active flows, SRED maintains a hit frequency and updates with on a hit, and with on a miss, where is a time constant. The inverse of the hit fre- quency ( ) is used to estimate the number of active flows. Although the SRED concept and the use of number of ac- tive flows as an indication of congestion level are quite novel, a detailed performance analysis showed several limitations. For example, one of the problems in SRED is that the estimated number of flows fluctuates as the number of flows in the net- work increases. This implies that the severity of the congestion is not accurately captured. Second, although misbehaving flows such as UDP can be identified by computing the hit count and calculating the total occurrence of the flow as described in [7], SRED still under-estimates the number of active flows when the traffic mix includes both TCP and UDP connections. To address these problems, we first present a mathematical model to analyze the estimation capability of SRED, and show how the steady state hit frequency of the cache model can be used to estimate the number of active flows. We then propose a modified SRED scheme, called HaTCh (Hash-based Two-level Caching), that uses hashing and a two-level caching mechanism (L1 and L2) to accurately estimate the number of active flows under various workloads. With the hashing scheme, each arriving packet is hashed into one of the partitioned subcaches, and the hit frequency is main- tained for each subcache. Therefore, the hit probability of an arriving packet is improved. The hashing scheme stabilizes the estimation through this improved hit probability. On the other hand, the purpose of the L1 cache in HaTCh is to isolate the misbehaving flows from the L2 cache. The two-level caching scheme accurately estimates the number of active flows by iso- lating the misbehaving flows to prevent monopolization of the L2 cache, and to yield more room in the L2 cache for the con- forming flows. We extend the SRED analytical model for the proposed two- level caching scheme to demonstrate its effectiveness in isolat- ing the misbehaving flows. Then we analyze the performance of the HaTCh scheme through extensive simulation using the ns-2 simulator [9]. The simulation results indicated that two-level scheme not only stabilizes the estimation but also improves the accuracy of estimation for various workloads. In particular, HaTCh outperformed SRED when the traffic included misbe- having flows.

Transcript of HaTCh: A Two-level Caching Scheme for Estimating the ...

HaTCh: A Two-level Caching Scheme for Estimating the Number of

Active FlowsSungwon Yi, Xidong Deng, George Kesidis, and Chita R. Das

Department of Computer Science & EngineeringThe Pennsylvania State University

University Park, PA 16802fsyi, xdeng, kesidis, [email protected]

Abstract—In this paper, we present a Markov model to examinethe capability of SRED in estimating the number of active flows.We show that the SRED cache hit rate can be used to quantifythe number of active flows. We then propose a modified SREDscheme, called HaTCh (Hash-based Two-level Caching), thatuses hashing and a two-level caching mechanism to accuratelyestimate the number of active flows under various workloads.We formulate a preliminary Markov model of the proposedscheme to show its effectiveness in preventing the monopoly ofmisbehaving flows. Simulation results indicate that the proposedscheme provides better estimation of the number of active flowscompared to SRED, stabilizes the estimation with respect toworkload fluctuations, and prevents performance degradation byefficiently isolating the misbehaving flows.

I. I NTRODUCTION

Internet congestion control is an important, but admittedlycomplex problem primarily because of the unpredictable trafficdynamics. Several active queue management (AQM) schemeshave been proposed for congestion control to minimize highpacket loss rates and global synchronization in the Inter-net [1] [2] [3] [4] [5] [6]. Two crucial functions of an AQMscheme are to estimate the level of congestion and to respondaccordingly either by randomly dropping or marking the pack-ets. In these schemes, average queue length [2] [4] [5] or linkidle time and packet loss due to the buffer overflow [1] was usedas an indication of congestion level. However, none of theseschemes are very effective since they give limited informationabout the level of congestion.

Recently a new approach, called Stabilized RED(SRED),1 [7] drew wide attention since it proposed touse the number of active flows as an indication of the con-gestion level. It is believed that the number of active flowsis a better indicator of network congestion compared to otherparameters such as average queue length and packet lossevent. Further, the number of active flows can be used as aconfiguration parameter to stabilize the AQM control systems[8] [3]. Therefore, an accurate estimation of this number canprovide a better handle to congestion control.

In SRED, a small cache memory, calledzombie list, is usedto record the M most recently seen flows. Each cache line,zom-bie, contains the source and destination address pair, last arrivaltime, and hit count of the flow. Each arriving packet (sourceand destination address) is compared with a randomly selectedcache line. If the addresses match (called a hit), the hit count of

1In this paper, we use “SRED” to indicate the flow estimation capability ofSRED.

the cache line is increased by 1. Otherwise (called a miss), theselected cache line is replaced by the arriving flow’s addresswith a replacement probabilityr. To estimate the number ofactive flows, SRED maintains a hit frequencyf(t) and updatesf(t) with (1��)f(t�1)+� on a hit, and with(1��)f(t�1)on a miss, where� is a time constant. The inverse of the hit fre-quency (f(t)�1) is used to estimate the number of active flows.

Although the SRED concept and the use of number of ac-tive flows as an indication of congestion level are quite novel, adetailed performance analysis showed several limitations. Forexample, one of the problems in SRED is that the estimatednumber of flows fluctuates as the number of flows in the net-work increases. This implies that the severity of the congestionis not accurately captured. Second, although misbehaving flowssuch as UDP can be identified by computing the hit count andcalculating thetotal occurrenceof the flow as described in [7],SRED still under-estimates the number of active flows when thetraffic mix includes both TCP and UDP connections.

To address these problems, we first present a mathematicalmodel to analyze the estimation capability of SRED, and showhow the steady state hit frequency of the cache model can beused to estimate the number of active flows. We then propose amodified SRED scheme, called HaTCh (Hash-based Two-levelCaching), that uses hashing and a two-level caching mechanism(L1 and L2) to accurately estimate the number of active flowsunder various workloads.

With the hashing scheme, each arriving packet is hashed intoone of the partitioned subcaches, and the hit frequency is main-tained for each subcache. Therefore, the hit probability of anarriving packet is improved. The hashing scheme stabilizes theestimation through this improved hit probability. On the otherhand, the purpose of the L1 cache in HaTCh is to isolate themisbehaving flows from the L2 cache. The two-level cachingscheme accurately estimates the number of active flows by iso-lating the misbehaving flows to prevent monopolization of theL2 cache, and to yield more room in the L2 cache for the con-forming flows.

We extend the SRED analytical model for the proposed two-level caching scheme to demonstrate its effectiveness in isolat-ing the misbehaving flows. Then we analyze the performance ofthe HaTCh scheme through extensive simulation using the ns-2simulator [9]. The simulation results indicated that two-levelscheme not only stabilizes the estimation but also improves theaccuracy of estimation for various workloads. In particular,HaTCh outperformed SRED when the traffic included misbe-having flows.

The rest of this paper is organized as follows: In Section II,we present a mathematical model to analyze the estimation ca-pability of SRED, and describe the limitations of SRED. Theproposed estimation scheme, HaTCh, is detailed in Section III.In Section IV, simulation results are presented followed by theconcluding remarks in Section V.

II. SRED: STABILIZED RED

A. An analytic model for SRED

The key idea of SRED is to relate the cache hit frequency tothe number of active flows. However, this was not proved for-mally in the original paper [7] that relied on simulation to arriveat the conclusion. In this section, we present a Markov modelto understand the concept of SRED and analyze its estimatingbehavior.

Consider a small cache memory (zombie list) with M lines,andN independent flows, each with a rate�i packets/s. Here,the packet arrival process maynot be necessarily Poisson. Weonly assume that the flow IDs of the multiplexed sequence ofarriving packets are independent. If we assumeXi is the num-ber of cache lines with the flow IDi in the cache and�X is thevector that maintains the number of cache lines occupied byeach flow, then�X can be presented as a Markov chain, wheretransitions occur at packet arrivals.�X and its state space can bedefined as

PNi=1Xi =M , and

�X 2 f �m =

0BBB@

m1

� � �mi

� � �mN

1CCCA�� 0 � mi �M;

NXi=1

mi =Mg � SM;N :

Here, the transition rates of the Markov chain are dependenton both the replacement probability and the outcome of thecache comparison between an arriving packet and a randomlyselected cache line. Based on SRED’s functionality, for a givencache state�m, the number of cache lines for each flow in thecache remains the same either in the event of a cache hit, orin the event of a cache miss with probability1 � r. However,the state of the cache changes from�m to �ij �m in the event ofa cache miss with replacement probabilityr. In the later case,the new state�ij �m is defined as:

�ij �m =

0BBBBBBB@

m1

� � �mi + 1� � �

mj � 1� � �mN

1CCCCCCCA

if i < j;

for all the cache states�m such thatmi < M and0 < mj . Ifi > j, then the i and j terms are swapped.

Now the transition probability of this cache model can bewritten for a cache hit or a cache miss without replacement as:P ( �X(t+ 1) = �m

�� �X(t) = �m) (1)

� Phit( �m; �m) + Pmiss( �m; �m) (1� r)

=

NXi=1

mi

M�

�iPNk=1 �k

+

NXi;j=1i6=j

mj

M�

�iPNk=1 �k

(1� r):

For a cache miss that is replaced with probabilityr, the ex-pression is:

P ( �X(t+ 1) = �ij �m�� �X(t) = �m) (2)

� P ( �m; �ij �m) r =

NXi;j=1i6=j

mj

M�

�iPNk=1 �k

r:

For the detailed description of these term, please refer to [10].For a given state�m such thatmi < M and0 < mj , the

detailed balance equations of this system for anyr becomes:

�( �m)P ( �m; �ij �m) = �(�ij �m)P (�ij �m; �m): (3)

Here,�( �m) is the steady state distribution of�X. Thus, (3) be-comes

�( �m) = �(�ij �m)�j

mj

�mi + 1

�i: (4)

We have found that the following distribution solves the de-tailed balance equation (3):

�( �m) =

QN

n=1

�mnn

mn!GMN

; (5)

where the normalizing constant is

GMN =X

�m2SM;N

NYn=1

�mnn

mn!:

Thus, we can conclude that the process�X is time reversible.Now, let us denotef(t) as the hit frequency, andH as the

steady state hit probability of the cache. Then,H includes thesummation of all states as:

H =X

�m2SM;N

�( �m)Phit( �m; �m): (6)

In practice, H can be estimated by a first order autoregressiveprocess, defined as:

f(t) = (1� �)f(t� 1) + � � 1 fhit at tth packet arrivalg

for 0 < � < 1, where1 represents an indicator function of acache hit. It is clear thatH is a limiting point of the processf . We assume that the choice of� is such thatf(t) convergesfaster than the rate of change of N. (TCP estimates the roundtrip time using an autoregressive process too.) Here,� plays arole as a time constant that determines the speed of the model toreach the steady state. Finally, using (1), (5) and (6), we expressthe steady state hit frequency of the given cache model as thefollowing theorem.

Theorem 1:Under the assumption of independent packetflow identifiers, the hit frequency calculated by a first order au-toregressive process for the single cache system (SRED) con-verges to

H =X

�m2SM;N

0BB@

QN

n=1

�mnn

mn!GMN

NXi=1

(mi

M�

�iPN

k=1 �k)

1CCA (7)

TABLE ITHE ESTIMATED NUMBER OF FLOWS

Number of Memory Estimated Number Estimated NumberFlows Used Size by Equation (7) by Simulation (r = 0:25)

10 10 12.054910 20 10 11.6001

40 10 11.078180 10 10.724450 50 59.0031

50 100 50 58.4126200 50 54.6074400 50 51.9261100 100 121.2941

100 200 100 118.5075400 100 111.7792800 100 106.5374

in the steady state.Observe that if each�i in the summand of the numerator of

(7) is replaced by the maximum value of�max, thenH is less

than or equal to�maxPN

k=1 �k. Similarly, if each�i in the summand

of the numerator of (7) is replaced by the minimum value of

�min, thenH is greater than or equal to�minPN

k=1 �k. Therefore,

we have the following two corollaries.Corollary 1:

�minPN

k=1 �k� H �

�maxPN

k=1 �k: (8)

Corollary 2: If all the arriving rates,�is, are equal for allN ,then the upper and lower bounds onH in (8) are equal, leadingto

H =1

N:

Due to the large state space (M+N�1CM), we first investigatethe accuracy of our model with a relatively small cache mem-ory (10 cache lines) and a small numbers of flows (10 or less)by enumerating the entire state space. We then extend the val-idation for large cache sizes (up to 800 cache lines) and morenumber of flows (up to 100) by using various state space trun-cation techniques.

Table I shows the results of the comparison between the es-timation by (7) and the simulation results. The results indicatethat the model is quite robust in estimating the number of flows.The model provided accurate estimates regardless of the cachesize, while the simulation over-estimated the number of flowsas the cache size decreased.

B. Limitations of SRED

In order to examine the capability of SRED in estimating thenumber of active flows, we performed a number of simulationsusing a dumbbell topology with a common link bandwidth of45Mbps. In the simulations, all the sources randomly initiatepacket transmission between 0 to 1s, each intermediate nodehas a buffer size of 600 packets, and the packet size is fixed at 1K bytes. SRED is deployed at the intermediate node to estimatethe number of active flows with a cache size of 1000 lines, re-placement probability (r) = 0.25, and� = 0.001, as used in[7]. Figure 1 shows the estimated number of active flows with

0 20 40 60 80 1000

5

10

15

20

25

30

35

40

Time (s)

Estim

ated

Num

ber o

f Flow

s

(a) 20 TCP sources

0 20 40 60 80 1000

20

40

60

80

100

120

140

160

180

200

Time (s)

Estim

ated

Num

ber o

f Flo

ws

(b) 100 TCP sources

0 20 40 60 80 1000

200

400

600

800

1000

1200

1400

1600

1800

2000

Time (s)

Estim

ated

Num

ber o

f Flo

ws

(c) 1000 TCP sources

0 20 40 60 80 1000

1000

2000

3000

4000

5000

6000

7000

8000

Time (s)

Estim

ated

Num

ber o

f Flo

ws

(d) 4000 TCP sources

Fig. 1. Estimated number of flows with SRED

SRED when 20, 100, 1000, and 4000 flows are used. The esti-mation of SRED is quite accurate with small number of flows,but it severely fluctuates as the number of flows increases. Thefluctuation results from the low hit probability when a largenumber of flows is used. Another problem in SRED is thatwhen misbehaving flows are present in a network, thezombielist is contaminated by these flows, and the estimation abilityof SRED is significantly degraded. We demonstrate the impactof misbehaving flows using the SRED mathematical model inTable II and through a simulation study in Section IV.

We again used a cache size of 10 lines, with 10 flows, andset the arrival rate of misbehaving flows (�m) as 2 and 3 timesthat of the conforming flows (�c) to mimic the behavior of mis-behaving flows in the mathematical model (equation (7)). Thesmall memory size and number of flows are not enough to cap-ture the exact effect of misbehaving flows, but it helps us to pre-dict the tendency when large number of flows is used. Table IIshows that as the fraction of misbehaving flows increases, thenumber of active flows is under-estimated. When misbehavingflows become a dominant fraction of the traffic, the estimatednumber starts to recover from under-estimation. These resultshave a similar trend as that of the simulation results with a total500 flows (TCP + UDP) presented in Section IV (Figure 4).

In summary, SRED exhibits unstable estimation with a largenumber of flows, and under-estimation in the presence of mis-behaving flows.

TABLE IIIMPACT OF MISBEHAVING FLOWS IN ESTIMATING THE NUMBER OF FLOWS

Fraction of Estimated Number Estimated NumberMisbehaving Flows of Flows of Flowsin Total Workload (when�m = 2 � �c) (when�m = 3 � �c)

0 % 10 1010% 9.309 8.00120% 9.000 7.54130% 8.897 7.52940% 8.907 7.71250% 9.002 8.000

14

12

3

3

1

2

����������������������������������������������������������������������������������������������������

����������������������������������������������������������������������������������������������������

����������������������������������������������������������������������

����������������������������������������������������������������������

L2 Cache(Zombie List)

Packet

CASE 1: L1 HitCASE 2: L1 Miss / L2 HitCASE 3: L1 Miss / L2 Miss

Hash(SRC,DST)

Hit

Miss

UpdateHit Frequency

&# of flows

L1 Cache

Subcache 0

Subcache 1

Subcache k-1

Subcache k-1

Subcache 2

Subcache 1

Subcache 0

New

Subcache 2

( Cleared )

line withReplace Cache

probability r

Miss

Hit

Update

Miss

Fig. 2. HaTCh (Hash-based Two-level Caching) Architecture

III. T HE PROPOSEDHASH-BASED TWO-LEVEL CACHING

SCHEME (HATCH)

Based on the discussions in the previous section, we proposea new active flow estimation scheme, called HaTCh (Hash-based Two-level Caching). The HaTCh scheme consists of twoparts. The first part is a hashing scheme to stabilize the estima-tion, and the other is a two-level caching scheme to isolate themisbehaving flows that contaminate thezombie listand lead tounder-estimation.

A. The Hash-Based Estimation

The key idea of the hashing scheme comes from the fact thatSRED’s hit probability is too low for large number of flows.This problem can be alleviated by using a hashing scheme. Toimplement hashing, the single cache memory (zombie list) ispartitioned intok small chunks, called subcaches. Whenevera packet arrives, the packet is hashed into a subcache usingthe source and destination addresses, and a cache line is ran-domly selected for comparison from the subcache. Note thata connection is always hashed to the same subcache with thistechnique. Each subcache maintains its own hit frequency andthe estimated number of flows, and the estimated number offlows per subcache is aggregated to find the total number of es-timated flows. The performance of the hash-based estimationmay degrade when most active flows are hashed into one ortwo subcaches, but this can be alleviated by periodically scat-tering the hash function as noted in [11]. When all the flowshave the same sending rate and round trip time (RTT), the hitprobability of a flow is( 1

N)2 under SRED, but the hit proba-

bility increases to( kN)2 when k subcaches are used in HaTCh.

This improved hit probability stabilizes the fluctuation of hitfrequency and flow estimation.

B. The Two-level Caching Scheme

The motivation for a two-level caching comes from the factthat the misbehaving flows send more packets than the con-forming flows, and this increases the hit rate. Therefore, de-veloping an efficient scheme to isolateexcesspackets from thememory is the key for accurate estimation. We accomplish thisthrough the two-level cache design.

Figure 2 shows the basic organization of HaTCh, which com-bines hashing and the two-level caching. As a packet arrives, it

is hashed into one of the L1 subcaches using the source and des-tination addresses. Then a randomly selected cache line fromthe L1 subcache is compared with the arriving packet. If theL1 cache hits (Case 1 in Figure 2), the hit count of the cacheline is increased by 1, but the hit frequency of this subcache re-mains the same. If the L1 cache misses, the corresponding L2subcache is selected. Then, a randomly selected L2 cache linein the corresponding subcache is compared with the arrivingpacket. If the L2 cache hits (Case 2 in Figure 2), the previouslyselected L1 cache line is updated with this L2 cache line, andthe hit count of L1 cache line is set to 1. The L2 cache line isthen cleared. If the L2 cache misses (Case 3 in Figure 2), theL2 cache line is replaced with the arriving packet with a proba-bility r. Irrespective of whether there is a hit or miss in the L2cache, the hit frequency and the number of estimated flows forthe subcache are recalculated, and also the total estimated num-ber of flows. The sequence of operations for the three cases isshown by the solid, dotted and dashed lines in Figure 2.

The key features of HaTCh are following: First, HaTCh takesadvantage of the improved hit probability though hashing inboth L1 and L2 caches. This stabilizes the estimation processof HaTCh. Second, the hit frequency is recalculated only whenthere is a L1 cache miss. L1 cache is updated only when a flowhits the L2 cache and thus, the L1 cache is generally sharedby the flows that hit the L2 cache. Filtering these flows to L1cache provides a fair chance for all the flows to update the hitfrequency. Third, on an L2 cache hit, the selected cache lineis cleared to yield room for the following conforming flows;a mechanism called L2 cache cleaning. Ideally, the L2 cache(zombie list) should be shared uniformly among all competingflows to yield an accurate flow estimation. Although the misbe-having flows are filtered in the L1 cache, the flows that missedthe L1 cache will fill the L2 cache more aggressively than con-forming flows. Therefore, the cleaning mechanism in the L2cache also contributes to fair sharing of the L2 cache.

C. A Preliminary Model of the HaTCh Scheme

In this section, we present a mathematical model for the pro-posed HaTCh scheme to demonstrate its effectiveness in isolat-ing the misbehaving flows. The model extends the single cachedesign to capture the two-level cache memory withM1 lines forthe L1 cache andM2 lines for the L2 cache (M1 � M2), andN independent flows. If we assume (Xi, Yi) be the number ofcache lines with the flow IDi in the L1 and L2 caches respec-tively, and (�X, �Y ) be a pair of vectors representing the numberof cache lines in L1 and L2 occupied by each flow, then (�X,�Y ) represents the Discrete State model of the system. Now thestate space for�X , �Y can be defined similarly as has been donein Section II.

Let us assume an arriving packet has a flow IDi, and arandomly selected cache line from the L1 cache has a flowID j. For a given cache state (�c, �m), if there is a L1 cachemiss and L2 cache hit, the state changes to ij(�c, �m), whereci = ci+1; cj = cj�1, andmi = mi�1. Then, the transitionprobability of the two-level cache model for an L1 cache missand L2 cache hit can be written as:

P (( �X; �Y )(t+ 1) = i;j(�c; �m)�� ( �X; �Y )(t) = (�c; �m))

� P ((�c; �m); ij(�c; �m))

=

NXi;j=1i6=j

cj

M1

�miPN

z=1mz

��iPN

z=1 �z

=

NXi=1

(1�ci

M1

)miPN

z=1mz

��iPN

z=1 �z: (9)

Note that the steady state hit frequency is determined by thesteady state cache line distribution and the steady state hit prob-ability in (6) under SRED. Therefore, for the two-level cacheunder HaTCh, we can write the following Theorem from (6)and (9).

Theorem 2:Under the assumption of independent packetflow identifiers, the hit frequency calculated by a first order au-toregressive process for the two-level cache system (HaTCh)converges to

H =X

(�c; �m)2SM1;M2;N

�(�c; �m)P ((�c; �m); ij(�c; �m)) (10)

=X

(�c; �m)2SM1;M2;N

�(�c; �m)

NXi=1

(1�ci

M1

)miPN

z=1mz

��iPN

z=1 �z

in the steady state.In the above equation�(�c; �m) represents the steady state dis-

tribution of ( �X; �Y ), and ij(�c; �m) represents the state of HaTChwhen there is a L1 cache miss and a L2 cache hit for a givenstate (�c; �m). Unlike the SRED model, it is difficult to find aclosed form expression for�(�c; �m). Thus, in stead of comput-ing the state probability distribution, we explain how equation(10) captures the misbehaving flows.

In (7) for the SRED model, the effect of the misbehav-ing flows is magnified by both the arrival rate of the misbe-having flows,�i, and the number of cache lines occupied bythe flows, mi

M, which is proportional to the arrival rate. This

leads to under-estimation of SRED when misbehaving flowsare present. In contrast, (10) clearly explains how HaTCh ef-fectively isolates misbehaving flows and yields more accurateestimation. First, HaTCh clears the selected L2 cache line onan L2 cache hit (L2 cache cleaning mechanism) to create moreroom for the following conforming flows, and this contributesto fair distribution of the L2 cache lines for all active flows.The term miP

N

z=1mz

in (10) captures this because it represents

the probability that the flow ID in the L2 cache isi. Second, thedistribution of the L1 cache lines, which is occupied by otherflows, 1 � ci

M1term in (10), mitigates the effect of arrival rate

of the misbehaving flows,�i. As a result, HaTCh yields moreaccurate and stable estimation even in the presence of misbe-having flows.

The above analysis shows a formal technique to prove theeffectiveness of the proposed scheme. We were unable to com-puteH from (10) since the state space enumeration and thesteady state probability�(�c; �m) computation was extremelyhard. We validate our claim about the effectiveness of HaTChthrough simulation results in the next section.

0 20 40 60 80 1000

5

10

15

20

25

30

35

40

Time (s)

Estim

ated

Num

ber o

f Flow

s

r = 0.01r = 0.25

(a) 20 TCP sources

0 20 40 60 80 1000

20

40

60

80

100

120

140

160

180

200

Time (s)

Estim

ated

Num

ber o

f Flo

ws

r = 0.01r = 0.25

(b) 100 TCP sources

0 20 40 60 80 1000

200

400

600

800

1000

1200

1400

1600

1800

2000

Time (s)

Estim

ated

Num

ber o

f Flo

ws

r = 0.01r = 0.25

(c) 1000 TCP sources

0 20 40 60 80 1000

1000

2000

3000

4000

5000

6000

7000

8000

Time (s)

Estim

ated

Num

ber o

f Flo

ws

r = 0.01r = 0.25

(d) 4000 TCP sources

Fig. 3. Estimated number of flows with HaTCh

IV. SIMULATION RESULTS

Due to space limitations, we only present the part of the sim-ulations results. In-depth simulation results and discussion onperformance of HaTCh with respect to impact of burst traffic,TCP round trip time (RTT), and HaTCh configuration such ashash size (k) can be found in [10]. In our simulation, weused the same topology described in Section II, and configuredHaTCh with a hash size of 10 (k = 10), the L2 cache replace-ment probability (r) as 0.01, and L1 and L2 cache sizes as 100and 1000 lines, respectively. Figure 3 shows the estimated num-ber of active flows with HaTCh for two different replacementprobabilities (r = 0:25 and0:01) when 20,100,1000, and 4000TCP flows are used. The estimated number of flows is remark-ably stabilized compare to Figure 1 at the cost of a little bit ofmore delay. However, HaTCh also showed a tendency to under-estimate the number of flows for a replacement probability of0.25 as the workloads increased. Successive arrival of the bursttraffic sources (TCP connections) causes more hit within a shortperiod of time, and this results in more stable but lower estima-tion. The accuracy of the estimation is significantly improvedwith a replacement probability of 0.01 as discussed in [10]. Upto 1000 flows, the estimated number of flows oscillated within20% range with HaTCh. As the number of flows increased, thefluctuation was more and also the number of flows were under-estimated due to the traffic burst and insufficient cache size.

Theoretically, SRED like mechanism (including HaTCh) cankeep track ofN

rflows as noted in [7]. We found that if the num-

ber of active flows (N ) is greater than the number of L2 cachelines (M2), the HaTCh performance starts to degrade (oscilla-tion exceeds 20%). However, Unlike SRED, HaTCh showedsmooth oscillation even for 8000 TCP flows, but the oscillationgradually increased as we increased the number of flows from1000 to 8000. Since the memory requirement for each cacheline is very marginal (about 32 bytes), we believe that the L2cache can accommodate large number of flows (32M bytes L2cache can support 1 million flows with about 20% error bound,and 8 millions flows with 50% error bound). We also foundthat the performance of HaTCh was not affected significantlyby the size of L1 cache as long as the number of L1 cache lines

010

2030

4050

0

20

40

60

80

1000

1000

2000

3000

UDP/Total Workload (%)Time (s)

Est

imat

ed n

umbe

r of

flow

s

(a)�UDP = 2 � �fair

010

2030

4050

0

20

40

60

80

1000

1000

2000

3000

UDP/Total Workload (%)Time (s)

Est

imat

ed n

umbe

r of

flow

s

(b) �UDP = 3 � �fair

Fig. 4. Impact of misbehaving flows in estimating the number of active flowswith SRED

010

2030

4050

0

20

40

60

80

1000

500

1000

UDP/Total Workload (%)Time (s)

Est

imat

ed n

umbe

r of

flow

s

(a)�UDP = 2 � �fair

010

2030

4050

0

20

40

60

80

1000

500

1000

UDP/Total Workload (%)Time (s)

Est

imat

ed n

umbe

r of

flow

s

(b) �UDP = 3 � �fair

Fig. 5. Impact of misbehaving flows in the estimating the number of activeflows with HaTCh

is greater than the number of misbehaving flows. We leave thesize of L1 and L2 cache lines as design parameters that deter-mine target number of misbehaving flows and target number ofactive flows. In our experiments, we set the L1 cache size to 100lines to control up to 100 misbehaving flows. This configurationprovided the best performance for our simulation environment,where HaTCh effectively isolated up to 100 UDP flows (20%of total workload) in Figure 5 (b).

To demonstrate the effect of misbehaving flows on HaTChperformance, we performed simulations with different sendingrates of misbehaving UDP flows. We also varied the UDP traf-fic workload from 0 to 50% of the total workload. Figure 4shows the results of SRED estimation with 500 total flows. InFigure 4 (a), we set the sending rate of the misbehaving UDPflows (�UDP ) as 2 times of the fair share of the link bandwidth(�fair). Although the estimated number of flows went downslightly, the fluctuation was not alleviated. We observed moreclear impact of misbehaving flows when we increased the UDPsending rate to 3 times of the fair share of the link bandwidthin Figure 4 (b). Here, SRED began to under-estimate the num-ber of flows as the UDP workload increased up to 30%, and theestimation recovered after as the UDP flows became dominantpart of the traffic (40 and 50%).

We repeated the same experiments using HaTCh. Note thatwe used different scaling factors in the graphs to show the re-sults. Clearly in Figure 5 (a), HaTCh’s estimation is remarkablystable, and the misbehaving flows were successfully isolated asexpected. HaTCh showed the same performance up to 20%UDP workload in Figure 5 (b) as well. As the UDP workloadincreased, HaTCh also showed graceful performance degrada-tion since a large number of excess packets from misbehavingflows could not be captured in the L1 cache. However, the es-timation error was significantly reduced with HaTCh comparedto SRED.

V. CONCLUDING REMARKS

Designing AQM schemes has been an active area of researchin the Internet community to minimize network congestion, andthus, improve performance. However, the complexity of the In-ternet traffic dynamics has eluded researchers in finding an ef-ficient solution. Recently, SRED [7] was proposed for provid-ing a better handle to congestion by using the number of activeflows as an indicator of congestion. It was shown through asimulation study that an accurate estimation of the number ofactive flows is essential to monitor the level of congestion.

In this paper, we have presented a mathematical frameworkfor analyzing the estimation behavior of SRED. The modelshowed that the steady state hit frequency of the SRED cachemodel can be used to estimate the number of active flows. Themodel is accurate enough to capture the effect of misbehavingflows in the estimation. In order to alleviate the drawbacks ofSRED, we have proposed a modified SRED, called HaTCh, thatuses hashing and a two-level caching mechanism. A prelimi-nary model of the proposed scheme was presented to demon-strate its effectiveness. Also, we conducted extensive simula-tion for an in-depth performance analysis of both the schemes.It was observed that the proposed HaTCh scheme improves notonly the estimation accuracy and stability compared to SRED,but also the robustness of the estimation by effectively isolatingthe misbehaving flows to avoid cache contamination.

Our future work involves further investigation of the two-level cache configuration for better performance and develop-ment of a complete AQM mechanism based on the HaTCh con-cept. Also, the model needs fine tuning to include other sys-tem/design parameters.

REFERENCES

[1] W. Feng, D. D. Kandlur, S. Debanjan, and Kang Shin, “Stochastic FairBlue - A Queue Management Algorithm for Enforcing Fairness,” inPro-ceedings of IEEE INFOCOM, April 2001, pp. 1520–1529.

[2] S. Floyd and V. Jacobson, “Random Early Detection Gateways for Con-gestion Avoidance,”IEEE/ACM Transactions on Networking, vol. 1, no.4, pp. 397–413, August 1993.

[3] S. Kunniyur and R. Srikant, “Analysis and Design of an Adaptive VirtualQueue(AVQ) Algorithm for Active Queue Management,” inProceedingsof the ACM SIGCOMM, August 2001, pp. 123–134.

[4] D. Lin and R. Morris, “Dynamics of Random Early Detection,” inPro-ceedings of the ACM SIGCOMM, September 1997, pp. 127–137.

[5] R. Mahajan and S. Floyd, “RED-PD: Controlling High Bandwidth Flowsat the Congested Router,” ICSI Technical Report TR-01-001, April 2001.

[6] R. Pan, B. Prabhakar, and K. Psounis, “CHOKe - A Stateless ActiveQueue Management Scheme for Approximating Fair Bandwidth Alloca-tion,” in Proceedings of IEEE INFOCOM, March 2000, pp. 942–951.

[7] T. Ott, T. Lakshman, and L. Wong, “SRED: Stabilized RED,” inPro-ceedings of IEEE INFOCOM, March 1999, pp. 1346–1355.

[8] C.V. Hollot, Vishal Misra, Don Towsley, and Wei-Bo Gong, “A ControlTheoretic Analysis of RED,” inProceedings of IEEE INFOCOM, April2001, pp. 1510–1519.

[9] “Network simulator (Ns),” On-line document, Available fromhttp://www.isi.edu/nsnam .

[10] Sungwon Yi, Xidong Deng, and George Kesidis Chita R. Das, “A Modi-fied SRED for Estimating Accurate Number of Active Flows ,” The Penn-sylvania State University Technical Report CSE03-004, 2003.

[11] P. McKenney, “Stochastic Fairness Queueing,” inProceedings of IEEEINFOCOM, March 1990, pp. 733–740.