Storage Allocation in Prefetching Techniques of Web Caches. D. Zeng, F. Wang, S. Ram. Appeared in Proceedings of the ACM Conference on Electronic Commerce (EC’03), San Diego, June 9-12, 2003. Presented by Laura D. Goadrich


Page 1:

Storage Allocation in Prefetching Techniques of Web Caches

D. Zeng, F. Wang, S. Ram

Appeared in Proceedings of the ACM Conference on Electronic Commerce (EC’03), San Diego, June 9-12, 2003

Presented by Laura D. Goadrich

Page 2:

The Web
  Large-scale distributed information system where data objects are published and accessible by users

Problems caused by the demand for increased web capacity:
  Network traffic congestion
  Web server overloads

Solution: web caching

Page 3:

Web caching: Benefits:
  Improves web performance (reduces access latency)
  Increases web capacity
  Alleviates traffic congestion (reduces network bandwidth consumption)
  Reduces the number of client requests (workload)
  Possibly improves failure tolerance and robustness of the Web (maintains cached copies of web objects for unreachable networks)

Prefetching: anticipate users’ future needs

This research:
  Focuses on making cache-related storage capacity decisions (storage capacity limits the number of prefetched web objects)
  Therefore: allocate cache storage in prefetching
  The authors state this focus has not been researched

Page 4:

Ideas:

Current research:
  Predict user web accesses without considering the cache storage limit

This research: optimization-based models
  Maximize hit rate
  Maximize byte hit rate
  Minimize access latency
  (the first two are the primary goals of web caching)

Benefit of this research: guide the operations of a prefetching system

Page 5:

Web prefetching techniques

Client-initiated policies
  User A is likely to access URL U2 right after URL U1
  Patterns learned via Markov algorithms

Server-initiated policies
  Anticipate future requests based on server logs and proactively send the corresponding Web objects to participating cache servers or client browsers
  Top-n algorithm

Hybrid policies
  Combine user access patterns from clients and general statistics from servers to improve the quality of prediction

Shortcoming of these policies: they do not address how to decide which Web objects to prefetch given limited storage capacity
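To make the client-initiated idea concrete, here is a minimal sketch of a first-order Markov predictor. It is an illustration only, not the method of any particular system cited in the talk; the function names and the toy sessions are hypothetical. It counts which URL follows which, then turns the counts into the kind of (i, Pi) predictions that a storage allocation model takes as input.

```python
from collections import Counter, defaultdict


def learn_transitions(sessions):
    """Count URL-to-next-URL transitions over a list of sessions (lists of URLs)."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current_url, next_url in zip(session, session[1:]):
            counts[current_url][next_url] += 1
    return counts


def predict_next(counts, current_url):
    """Return {next_url: estimated probability} for the page visited after current_url."""
    followers = counts.get(current_url)
    if not followers:
        return {}  # no history for this URL, so no prediction
    total = sum(followers.values())
    return {url: n / total for url, n in followers.items()}


# Hypothetical access history: U2 follows U1 in two of three sessions.
sessions = [["U1", "U2", "U3"], ["U1", "U2"], ["U1", "U4"]]
model = learn_transitions(sessions)
predictions = predict_next(model, "U1")
```

The predictor says nothing about object sizes or cache capacity, which is exactly the gap the allocation models address.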

Page 6:

Assumptions/Notation

$C$: maximum amount of storage space available to store prefetched Web objects

$i$: URL of potential interest

$P_i$: predicted probability with which URL $i$ will be visited

$(i, P_i)$: prediction of users’ future accesses

$N$: set of all URLs of potential interest

$S_i$ (with $S_i < C$): size of the Web object referred to by $i$

Page 7:

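Hit Rate (HR) Model

$$\max Z_{HR} = \sum_{i \in N} P_i X_i \qquad (1)$$

subject to

$$\sum_{i \in N} S_i X_i \le C \qquad (2)$$

$$X_i \in \{0, 1\}, \quad i \in N \qquad (3)$$

where $X_i = 1$ if the Web object referred to by URL $i$ is prefetched and $X_i = 0$ otherwise.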

Page 8:

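Byte Hit Rate (BHR) Model

$$\max Z_{BHR} = \frac{1}{\sum_{i \in N} P_i S_i} \sum_{i \in N} P_i S_i X_i \qquad (4)$$

subject to

$$\sum_{i \in N} S_i X_i \le C \qquad (2)$$

$$X_i \in \{0, 1\}, \quad i \in N \qquad (3)$$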

Page 9:

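Access Latency (AL) Model

$$\max Z_{AL} = \frac{1}{\sum_{i \in N} P_i (\alpha_i + \beta_i S_i)} \sum_{i \in N} P_i (\alpha_i + \beta_i S_i) X_i \qquad (7)$$

subject to

$$\sum_{i \in N} S_i X_i \le C \qquad (2)$$

$$X_i \in \{0, 1\}, \quad i \in N \qquad (3)$$

$\alpha_i$: number of seconds to establish the network connection between the client machine and the Web server hosting $i$

$\beta_i$: number of seconds per byte to transmit $i$ over the network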
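As a worked illustration of the three objectives, the sketch below evaluates the hit rate, byte hit rate, and access latency savings for a given set of prefetched URLs. It assumes the objectives are the probability-weighted sums described on the HR, BHR, and AL slides, with the latter two normalized by their totals over all URLs; the function name and the sample numbers are hypothetical.

```python
def objectives(chosen, probabilities, sizes, alpha, beta):
    """Return (Z_HR, Z_BHR, Z_AL) for the URL indices in `chosen`.

    probabilities[i] = P_i, sizes[i] = S_i,
    alpha[i] = connection setup time, beta[i] = transfer time per byte.
    """
    all_urls = range(len(sizes))
    # Latency to fetch object i from its origin server: alpha_i + beta_i * S_i
    latency = [alpha[i] + beta[i] * sizes[i] for i in all_urls]

    z_hr = sum(probabilities[i] for i in chosen)
    z_bhr = (sum(probabilities[i] * sizes[i] for i in chosen)
             / sum(probabilities[i] * sizes[i] for i in all_urls))
    z_al = (sum(probabilities[i] * latency[i] for i in chosen)
            / sum(probabilities[i] * latency[i] for i in all_urls))
    return z_hr, z_bhr, z_al


# Hypothetical instance with three URLs; URLs 0 and 1 are prefetched.
probabilities = [0.5, 0.25, 0.25]
sizes = [10, 20, 30]
alpha = [1.0, 1.0, 1.0]
beta = [0.1, 0.1, 0.1]
z_hr, z_bhr, z_al = objectives([0, 1], probabilities, sizes, alpha, beta)
```

The same selection scores differently under each objective, which is why the talk treats them as three separate models sharing the same constraints.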

Page 10:

Transforming HR, BHR & AL into the Knapsack problem

Benefits of the Knapsack problem
  Well studied
  “Easiest” NP-hard problem
  Can be solved optimally by a pseudo-polynomial algorithm based on dynamic programming
  A fully polynomial approximation scheme is possible

Focus on greedy algorithm (due to paper length limits)
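The pseudo-polynomial approach mentioned above can be sketched with the textbook 0/1 knapsack dynamic program. This is a generic illustration, not code from the paper: it assumes object sizes and capacity are non-negative integers, treats the per-URL benefit (for example P_i in the HR model) as the item value, and uses hypothetical names and sample data.

```python
def knapsack_dp(values, sizes, capacity):
    """Solve the 0/1 knapsack exactly in O(len(values) * capacity) time.

    Returns (best_value, chosen_indices). Sizes and capacity must be
    non-negative integers.
    """
    n = len(values)
    # best[j][c] = best total value using only the first j items within capacity c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        value, size = values[j - 1], sizes[j - 1]
        for c in range(capacity + 1):
            best[j][c] = best[j - 1][c]  # option 1: skip item j-1
            if size <= c and best[j - 1][c - size] + value > best[j][c]:
                best[j][c] = best[j - 1][c - size] + value  # option 2: take it
    # Walk the table backwards to recover which items were taken.
    chosen, c = [], capacity
    for j in range(n, 0, -1):
        if best[j][c] != best[j - 1][c]:
            chosen.append(j - 1)
            c -= sizes[j - 1]
    chosen.reverse()
    return best[n][capacity], chosen


# Hypothetical instance: three URLs, capacity of 7 size units.
best_value, chosen = knapsack_dp([0.5, 0.3, 0.4], [6, 3, 4], 7)
```

The running time grows with the numeric value of the capacity, so measuring sizes in coarser units (kilobytes rather than bytes, say) shrinks the table at the cost of some rounding.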

Page 11:

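Greedy Algorithm:

1. Sort all URLs into a sequence $i_1, i_2, \ldots, i_{|N|}$ such that

$$\frac{P_{i_1}}{S_{i_1}} \ge \frac{P_{i_2}}{S_{i_2}} \ge \cdots \ge \frac{P_{i_{|N|}}}{S_{i_{|N|}}}$$

2. Determine a threshold $k$ defined as:

$$k = \max \left\{ j \in \{1, 2, \ldots, |N|\} : \sum_{l=1}^{j} S_{i_l} \le C \right\}$$

3. Prefetch Web objects referred to by URLs $i_1, i_2, \ldots, i_k$:

$$X_i = \begin{cases} 1, & \text{if } i \in \{i_1, i_2, \ldots, i_k\} \\ 0, & \text{otherwise} \end{cases}$$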
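The three steps above can be sketched as follows for the hit rate objective, where URLs are ranked by probability per unit of storage. This is a minimal illustration under that assumption; the function name and the sample data are hypothetical.

```python
def greedy_prefetch(probabilities, sizes, capacity):
    """Greedy hit-rate policy: prefetch the longest prefix of the
    P_i / S_i ranking whose total size fits within capacity."""
    # Step 1: sort URL indices by probability per unit of storage, best first.
    order = sorted(range(len(sizes)),
                   key=lambda i: probabilities[i] / sizes[i],
                   reverse=True)
    # Step 2: find the threshold k, the longest prefix that still fits.
    chosen, used = [], 0
    for i in order:
        if used + sizes[i] > capacity:
            break  # first object that does not fit ends the prefix
        chosen.append(i)
        used += sizes[i]
    # Step 3: the returned indices are the URLs with X_i = 1.
    return chosen


# Hypothetical instance: four URLs, capacity of 50 size units.
probabilities = [0.3, 0.25, 0.2, 0.05]
sizes = [20, 25, 40, 2]
prefetched = greedy_prefetch(probabilities, sizes, 50)
```

Because the policy stops at the first object that does not fit, it can leave capacity unused even when a smaller, lower-ranked object would still fit. That is one reason a gap to the optimal solution can appear when object sizes vary widely.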

Page 12:

Other Allocation Policies Tested

Optimal policy using CPLEX
  Disadvantages:
    Complex
    Increased implementation time
    Difficult to implement

Top-n
  Developed for Web usage prediction
  Used to regulate storage allocations by appropriately setting n
  Equivalent to Greedy BHR, relying only on Pi
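A minimal sketch of Top-n used as an allocation policy follows. The slide does not say how n is chosen, so the rule here (the largest n whose top-ranked objects fit in the available storage) is an assumption, and the function name and sample data are hypothetical. Ranking by Pi alone is what makes it coincide with a greedy policy for the byte hit rate objective, where the benefit per unit of storage is Pi * Si / Si = Pi.

```python
def top_n_prefetch(probabilities, sizes, capacity):
    """Rank URLs by predicted probability only, then prefetch the top n,
    with n set to the largest prefix that fits within capacity (assumed rule)."""
    order = sorted(range(len(sizes)),
                   key=lambda i: probabilities[i],
                   reverse=True)
    n, used = 0, 0
    while n < len(order) and used + sizes[order[n]] <= capacity:
        used += sizes[order[n]]
        n += 1
    return order[:n]


# Hypothetical instance: four URLs, capacity of 50 size units.
probabilities = [0.3, 0.25, 0.2, 0.05]
sizes = [20, 25, 40, 2]
top_n_choice = top_n_prefetch(probabilities, sizes, 50)
```

On this instance Top-n selects URLs 0 and 1 for a hit rate of 0.55, while ranking by Pi / Si would also fit the small URL 3 and reach 0.60. The difference comes entirely from ignoring object size in the ranking.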

Page 13:

Simulations

Parameter   Small          Large
|N|         50             200
rep         Text           Multimedia
C           100,000        100,000
α/β         5,000 (slow)   30,000 (fast)

LN(μ,σ) = lognormal distribution with scale e^μ (its median) and shape σ
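To show what drawing object sizes from LN(μ,σ) looks like, here is a small sketch using Python's standard library. It is not the authors' simulation code: the function name, the seed, and the pairing of each σ with a particular |N| are assumptions made for illustration. The two parameter pairs LN(10,.05) and LN(10,1) are the ones that appear in the experimental conditions.

```python
import math
import random


def sample_sizes(count, mu, sigma, seed=0):
    """Draw `count` object sizes from a lognormal distribution LN(mu, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.lognormvariate(mu, sigma) for _ in range(count)]


# sigma = 0.05: sizes cluster tightly around e^10 (about 22,000).
low_variance_sizes = sample_sizes(50, 10, 0.05)
# sigma = 1: sizes spread over orders of magnitude around the same median.
high_variance_sizes = sample_sizes(200, 10, 1.0)

low_spread = max(low_variance_sizes) / min(low_variance_sizes)
high_spread = max(high_variance_sizes) / min(high_variance_sizes)
median_scale = math.exp(10)
```

The ratio of largest to smallest size stays close to 1 for σ = 0.05 and becomes very large for σ = 1, which is the "objects vary greatly in size" condition.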

Page 14:

Performance Comparison

                                 Hit Rate             Byte Hit Rate        % Savings in Access Latency
Experimental Condition           Opt   G-HR  Top-n    Opt   G-HR  Top-n    Opt   G-HR  Top-n
a=50,  LN(10,.05), b=5000        .47   .45   .44      .45   .44   .44      .45   .44   .44
a=50,  LN(10,.05), b=30000       .47   .45   .44      .45   .44   .44      .46   .44   .44
a=50,  LN(10,1),   b=5000        .47   .44   .35      .32   .28   .28      .33   .29   .29
a=50,  LN(10,1),   b=30000       .47   .44   .35      .32   .28   .28      .38   .34   .32
a=200, LN(10,.05), b=5000        .36   .34   .33      .34   .33   .33      .34   .33   .33
a=200, LN(10,.05), b=30000       .36   .34   .33      .34   .33   .33      .35   .34   .33
a=200, LN(10,1),   b=5000        .36   .34   .25      .20   .17   .17      .22   .18   .18
a=200, LN(10,1),   b=30000       .36   .34   .25      .20   .17   .17      .27   .23   .21

The values of a and b match the simulation settings for |N| (50, 200) and α/β (5,000, 30,000) respectively.

Page 15:

Results
  Greedy algorithms and Top-n in general achieve reasonable performance
  Greedy algorithms outperform Top-n with respect to hit rate and access latency
  There exists a relatively large performance gap between an optimal approach and fast heuristic methods when Web objects vary greatly in size
    Suggests the need for developing more sophisticated allocation policies, such as a dynamic programming-based approach

Page 16:

Contributions:

Focus: stress the importance of effective storage allocation in prefetching

Paper contributions:
1. Provide new formulations for prefetching storage allocation
2. Create computationally efficient allocation policies by treating storage allocation as a knapsack problem
3. The models lead to a more precise understanding of the applicability and effectiveness of the Top-n policy

Page 17:

Future Work

Trace-based simulation
  Actual web access logs
  More realistic environment

Modeling
  Integrate allocation models with caching storage management models, i.e., cache replacement

Page 18:

Changes - Recommendations
  Not renaming the same constraints
  More resources (5 articles, 2 books)
  Discuss feasible solve times (opt)
  Test/hypothesize implementation strategies for real application