Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University {mor, hector, paepcke}@cs.stanford.edu http://www-db.stanford.edu/
Date posted: 20-Dec-2015

Page 1: Evaluation of Delivery Techniques for Dynamic Web Content Mor Naaman, Hector Garcia-Molina, Andreas Paepcke Department of Computer Science Stanford University.

Evaluation of Delivery Techniques for Dynamic Web Content

Mor Naaman, Hector Garcia-Molina, Andreas Paepcke

Department of Computer Science

Stanford University

{mor, hector, paepcke}@cs.stanford.edu

http://www-db.stanford.edu/

Page 2:

Dynamic Web is Ubiquitous

Page 3:

Problems with Dynamic Pages

• Generation of pages is resource-intensive

• Pages are too dynamic, or too personalized, to be cached

• Higher load on servers (page generation and delivery)

• More network traffic

Page 4:

We Evaluate Two Competing Solutions (both address at least the network load)

ESI (Edge Side Includes; Oracle, Akamai)

• Enables assembly of pages from small fragments

• Fragments can be cached on specialized network caches (edge servers)

• Fragments are assembled on the edge server
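The ESI bullets above can be illustrated with a minimal sketch. It assumes a template that marks fragments with `<esi:include src="..."/>` tags; the `EdgeCache` class, the fragment paths, and the fixed TTL are hypothetical names chosen for illustration, not part of the slides or of any vendor's implementation.

```python
import re
from typing import Callable, Dict, Tuple

# Matches <esi:include src="..."/> tags in a page template.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


class EdgeCache:
    """Hypothetical edge server that caches fragments for a fixed TTL."""

    def __init__(self, fetch_from_origin: Callable[[str], str], ttl_seconds: float):
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds
        self.store: Dict[str, Tuple[float, str]] = {}  # src -> (expiry time, body)
        self.origin_requests = 0

    def get_fragment(self, src: str, now: float) -> str:
        entry = self.store.get(src)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no traffic to the main site
        self.origin_requests += 1  # miss or expired: fetch from the origin
        body = self.fetch_from_origin(src)
        self.store[src] = (now + self.ttl_seconds, body)
        return body

    def assemble(self, template: str, now: float) -> str:
        # Replace every include tag with the (possibly cached) fragment body.
        return ESI_INCLUDE.sub(lambda m: self.get_fragment(m.group(1), now), template)


# Toy origin: a dictionary lookup stands in for the main site.
fragments = {"/frag/header": "<h1>Books</h1>", "/frag/price/42": "<p>$10</p>"}
edge = EdgeCache(fragments.__getitem__, ttl_seconds=60)
template = '<html><esi:include src="/frag/header"/><esi:include src="/frag/price/42"/></html>'
page = edge.assemble(template, now=0.0)
```

In this sketch the client always receives the full assembled page, while the main site is contacted only when a fragment is missing or expired.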

Class-Based Delta Encoding

• Computes delta of generated page from a chosen base file

• Base files can be cached on network caches

• Client receives delta from the server and base file from cache; applies delta to base file to get final page
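The delta-encoding flow above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's `difflib.SequenceMatcher` as a stand-in differencing algorithm, and the delta format (`copy` and `insert` operations) and the function names `make_delta` and `apply_delta` are hypothetical, not the encoding evaluated in the slides.

```python
from difflib import SequenceMatcher
from typing import List, Tuple, Union

# A delta is a list of operations:
#   ("copy", start, end) reuses base[start:end] from the cached base file,
#   ("insert", text) carries new content that is not in the base file.
Op = Union[Tuple[str, int, int], Tuple[str, str]]


def make_delta(base: str, page: str) -> List[Op]:
    """Server side: describe the generated page relative to the base file."""
    ops: List[Op] = []
    matcher = SequenceMatcher(None, base, page, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("insert", page[j1:j2]))
        # "delete" needs no operation: that base range is simply not copied.
    return ops


def apply_delta(base: str, delta: List[Op]) -> str:
    """Client side: rebuild the final page from the base file and the delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.append(base[op[1]:op[2]])
        else:
            out.append(op[1])
    return "".join(out)


base = "<html><h1>Portfolio</h1><p>ACME: 10.00</p></html>"
page = "<html><h1>Portfolio</h1><p>ACME: 10.25</p></html>"
delta = make_delta(base, page)
# Characters the server must actually send (ignoring per-operation overhead).
payload = sum(len(op[1]) for op in delta if op[0] == "insert")
```

Only the `insert` operations carry page content over the server link; the `copy` operations refer to the base file that the client obtains from a cache.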

Page 5:

A Page Content Model

• Page composed from groups; groups include items.

• Page construction modeled as two-phase selection (groups, then items)

[Figure labels: Groups, Items]
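A minimal sketch of the two-phase selection described above. The catalog contents, the function name `build_page`, and the uniform random choice in both phases are assumptions made for illustration; the slides do not specify how each phase selects.

```python
import random
from typing import Dict, List


def build_page(catalog: Dict[str, List[str]], groups_per_page: int,
               items_per_group: int, rng: random.Random) -> Dict[str, List[str]]:
    """Select groups first, then items within each selected group."""
    # Phase 1: choose which groups appear on the page.
    chosen_groups = rng.sample(sorted(catalog), groups_per_page)
    # Phase 2: within each chosen group, choose the items to show.
    return {
        group: rng.sample(catalog[group], min(items_per_group, len(catalog[group])))
        for group in chosen_groups
    }


# Hypothetical catalog for a personalized portal page.
catalog = {
    "news": ["headline-1", "headline-2", "headline-3"],
    "stocks": ["ACME", "INIT", "XYZ"],
    "weather": ["palo-alto"],
}
portal_page = build_page(catalog, groups_per_page=2, items_per_group=2,
                         rng=random.Random(0))
```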

Page 6:

Our Simulation

Test-case web pages:

• Book pages in Amazon-style website

• MyYahoo-type personalized pages

• Personalized stock portfolio pages

• A simple personalized weather page

Page 7:

Simulation of ESI

• Assuming Zipf-like distribution for groups and items ($\text{popularity}_i = k / i^{\alpha}$)

• Performance highly dependent on $\alpha$ (ranging from 0.7 to 1.5 in our simulations)

• Hit rate estimates for items: [formula not preserved in this transcript; its terms are the arrival rate, TTL = item time-to-live, and a constant]

Sample simulation results (bookstore-type resource, with "backend" servers)

Alpha = 0.8

[Figure: Traffic vs. TTL. x-axis: Time-to-live (seconds), 0 to 8000; y-axis: Traffic (Gb per day), 0 to 300; series: Client-Edge, Edge-backend, Backend-Main site]

[Figure: Hit-rate vs. value of Zipfian parameter. x-axis: Alpha, 0.7 to 1.5; y-axis: Hit rate, 0% to 100%; series: Edge hit rate, System hit rate]
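To make the roles of the Zipfian parameter and the TTL concrete, here is a small Monte Carlo sketch. It is not the authors' simulator or their hit-rate formula: the function names, the Poisson arrival process, the single cache level, and the rule that a fragment's TTL starts when it is fetched on a miss are all assumptions.

```python
import random
from typing import Dict, List


def zipf_popularity(n_items: int, alpha: float) -> List[float]:
    """Normalized Zipf-like popularity: p_i proportional to 1 / i**alpha."""
    weights = [1.0 / (i ** alpha) for i in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_hit_rate(n_items: int, alpha: float, ttl: float,
                      request_rate: float, n_requests: int, seed: int = 0) -> float:
    """Estimate the hit rate of one TTL-based cache by replaying random requests."""
    rng = random.Random(seed)
    popularity = zipf_popularity(n_items, alpha)
    requested = rng.choices(range(n_items), weights=popularity, k=n_requests)
    expiry: Dict[int, float] = {}  # item -> time at which its cached copy expires
    now = 0.0
    hits = 0
    for item in requested:
        now += rng.expovariate(request_rate)  # assumed Poisson arrivals
        if expiry.get(item, -1.0) > now:
            hits += 1  # fresh copy in the cache
        else:
            expiry[item] = now + ttl  # miss: fetch and cache for ttl seconds
    return hits / n_requests


short_ttl = simulate_hit_rate(1000, alpha=0.8, ttl=1.0, request_rate=10.0, n_requests=20000)
long_ttl = simulate_hit_rate(1000, alpha=0.8, ttl=1000.0, request_rate=10.0, n_requests=20000)
flat = simulate_hit_rate(1000, alpha=0.7, ttl=60.0, request_rate=10.0, n_requests=20000)
skewed = simulate_hit_rate(1000, alpha=1.5, ttl=60.0, request_rate=10.0, n_requests=20000)
```

Under these assumptions, a longer TTL and a more skewed popularity distribution (larger alpha) both raise the estimated hit rate, which is the qualitative trend the two figures above report.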

Page 8:

Class-Based Delta Encoding Simulation

• For some pages, client likely to be able to re-use base files

[Figure: Traffic vs. number of base files. x-axis: Number of base files, 0 to 1000; y-axis: Traffic (Gb per day), 0 to 300; series: Traffic Without DE, Aggregate Client Traffic, Main Site Traffic]

[Figure: Traffic vs. Same-Base threshold. x-axis: Threshold (Bytes), 0 to 15000; y-axis: Traffic (Gb per day), 0 to 300; series: Aggregate Client Traffic, Main Site Traffic]

• For other pages, client-cache link traffic is higher than before. To minimize client traffic, use same base file owned by client if delta is larger than threshold

Page 9:

Sample Comparison Numbers

MyYahoo-type pages

       Savings – client link    Savings – server link    Edge cache usage
ESI    0%                       62%                      1.5Mb
DE     66%                      87%                      3.2Mb

Amazon-style Book pages

       Savings – client link    Savings – server link    Edge cache usage
ESI    0%                       30%                      1.2Mb
DE     -8%                      82%                      2.2Mb

Page 10:

Conclusions

Legend: Excellent *, Good +, Bad -, Sometimes ~

                                                                    ESI    DE
Reduces server traffic                                              +      *
Reduces client traffic                                              -      ~
Reduces computational load on web server                            *      -
Performance dependent on web page structure                         Yes    Yes
Performance dependent on characteristics of data                    Yes    No
Benefits greater when popularity rises                              Yes    Less
Requires main site hardware/software installation                   No     Yes
Requires web-page code changes                                      Yes    No
Requires network infrastructure (CDN services)                      Yes    No
Can exploit information available from CDN for page construction    Yes    No

All the details: http://dbpubs.stanford.edu/pub/2003-7