Squirrel: A peer-to-peer web cache
Sitaram Iyer
Joint work with Ant Rowstron (MSRC)
and Peter Druschel
Peer-to-peer Computing
Decentralize a distributed protocol:
– Scalable
– Self-organizing
– Fault tolerant
– Load balanced
Not automatic!!
Web Caching
1. Latency, 2. External bandwidth, 3. Server load.
ISPs, Corporate network boundaries, etc.
Cooperative Web Caching: group of web caches tied together and acting as one web cache.
Centralized Web Cache
[Figure: browsers on the LAN, each with a local cache, send requests to a single dedicated web cache at the LAN/Internet boundary, which forwards misses to the web server. Sharing between clients is what makes the cache effective.]
Decentralized Web Cache
[Figure: the same browsers and their per-node caches cooperate directly over the LAN, with no dedicated cache machine between the LAN and the web server.]
• Why?
• How?
Why peer-to-peer?
1. Cost of dedicated web cache → No additional hardware
2. Administrative costs → Self-organizing
3. Scaling needs upgrading → Resources grow with clients
4. Single point of failure → Fault-tolerant by design
Setting
• Corporate LAN
• 100 - 100,000 desktop machines
• Single physical location
• Each node runs an instance of Squirrel
• The browser on each node uses its local Squirrel instance as its proxy
Pastry
Peer-to-peer object location and routing substrate.
Distributed Hash Table: reliably map an object key to a live node.
Routes in log_{2^b}(N) steps
(e.g. 3-4 steps for 100,000 nodes, with base 2^b = 16)
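The key-to-node mapping that Pastry provides can be sketched as follows. This is a hypothetical, simplified view that cheats with a global node list; real Pastry routes hop-by-hop by matching key digits, and the 128-bit key size here is an assumption for illustration.

```python
import hashlib

KEY_BITS = 128  # assumed size of the circular key space

def key_of(url: str) -> int:
    """Hash a URL into the circular key space (top 128 bits of SHA-1)."""
    return int.from_bytes(hashlib.sha1(url.encode()).digest(), "big") >> (160 - KEY_BITS)

def home_node(url: str, node_ids: list[int]) -> int:
    """Return the live node whose id is numerically closest to the URL's
    key, with wraparound handled modulo the key space. This is the node
    the DHT would deliver the request to."""
    k = key_of(url)
    ring = 1 << KEY_BITS
    return min(node_ids, key=lambda n: min((n - k) % ring, (k - n) % ring))
```

Because the mapping depends only on the hash of the URL and the set of live node ids, every client resolves the same URL to the same home node without any coordination.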
Home-store model
[Figure: the client hashes the URL to a key; the request is routed over the LAN to the key's home node, which stores the object and fetches it from the origin server over the Internet on a miss.]
…that's how it works!
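A minimal sketch of what the home node does in the home-store model; the names (`fetch_from_origin`, `handle_request_at_home`) are illustrative, not from the Squirrel paper, and real Squirrel also tracks freshness metadata for revalidation.

```python
# Per-node object store on the home node.
cache: dict[str, bytes] = {}

def fetch_from_origin(url: str) -> bytes:
    """Placeholder for an HTTP GET to the origin server over the WAN."""
    return b"<html>example</html>"

def handle_request_at_home(url: str) -> bytes:
    """Run on the URL's home node: serve a hit from the local cache,
    or fetch from the origin on a miss and cache the result."""
    if url not in cache:
        cache[url] = fetch_from_origin(url)
    return cache[url]
```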
Directory model
Client nodes always store objects in their local caches.
Main difference between the two schemes: whether the home node also stores the object.
In the directory model, the home node stores only pointers to clients that recently fetched the object (delegates), and forwards requests to them.
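The home node's bookkeeping in the directory model might look like this sketch; the directory bound and all names are assumptions for illustration, not the paper's API.

```python
import random
from typing import Optional

directory: dict[str, list[str]] = {}  # URL -> ids of recent delegates
MAX_DELEGATES = 4  # assumed small bound on directory size

def record_delegate(url: str, client: str) -> None:
    """After a client fetches the object, remember it as a delegate."""
    entries = directory.setdefault(url, [])
    if client not in entries:
        entries.append(client)
        if len(entries) > MAX_DELEGATES:
            entries.pop(0)  # evict the oldest pointer

def pick_delegate(url: str) -> Optional[str]:
    """Forward target for a request: a random delegate, or None if the
    home has no directory entry (the client must go to the origin)."""
    entries = directory.get(url)
    return random.choice(entries) if entries else None
```

Note that the home node never stores the object itself, only these small pointers, which is exactly the design choice the evaluation later compares against home-store.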
Directory model
[Figure: the client's request is routed to the home node; the home node forwards it to a random entry among its delegates, which serves the object from its local cache.]
(skip) Full directory protocol
[Figure: full directory protocol message diagram among client, home node, delegate, and origin server. Requests (req) are routed via other nodes to the home; 1a: "no dir, go to origin" sends the client directly to the origin server; c, e: the home forwards the request to a delegate, which returns the object; the home revalidates with the origin server via a conditional GET (cGET req); replies are object or not-modified.]
Recap
• Two endpoints of the design space, based on the choice of storage location.
• At first sight, both seem to do about equally well (e.g. hit ratio, latency).
Quirk
Consider:
– a web page with many images, or
– a heavily browsing node.
In the directory scheme, many home nodes end up pointing to one delegate.
Home-store: natural load balancing.
.. evaluation on trace-based workloads ..
Trace characteristics
                               Redmond        Cambridge
Total duration                 1 day          31 days
Number of clients              36,782         105
Number of HTTP requests        16.41 million  0.971 million
Peak request rate              606 req/sec    186 req/sec
Number of objects              5.13 million   0.469 million
Number of cacheable objects    2.56 million   0.226 million
Mean cacheable object reuse    5.4 times      3.22 times
Total external bandwidth
[Figure: total external bandwidth in GB (lower is better, y-axis 85 to 105 GB) vs per-node cache size in MB (0.001 to 100, log scale), Redmond trace. Curves: Directory, Home-store, No web cache, Centralized cache.]
Total external bandwidth
[Figure: total external bandwidth in GB (lower is better, y-axis 5.5 to 6.1 GB) vs per-node cache size in MB (0.001 to 100, log scale), Cambridge trace. Curves: Directory, Home-store, No web cache, Centralized cache.]
LAN Hops
[Figure: fraction of cacheable requests (0% to 100%) vs total hops within the LAN (0 to 6), Redmond trace. Series: Centralized, Home-store, Directory.]
LAN Hops
[Figure: fraction of cacheable requests (0% to 100%) vs total hops within the LAN (0 to 5), Cambridge trace. Series: Centralized, Home-store, Directory.]
Load in requests per sec
[Figure: number of such seconds (log scale, 1 to 100,000) vs max objects served per node per second (0 to 50), Redmond trace. Series: Home-store, Directory.]
Load in requests per sec
[Figure: number of such seconds (log scale, 1 to 1e7) vs max objects served per node per second (0 to 50), Cambridge trace. Series: Home-store, Directory.]
Load in requests per min
[Figure: number of such minutes (log scale, 1 to 100) vs max objects served per node per minute (0 to 350), Redmond trace. Series: Home-store, Directory.]
Load in requests per min
[Figure: number of such minutes (log scale, 1 to 10,000) vs max objects served per node per minute (0 to 120), Cambridge trace. Series: Home-store, Directory.]
Conclusion
• Possible to decentralize web caching.
• Performance comparable to a centralized cache.
• Better in terms of cost, administration, scalability and fault tolerance.
(backup) Storage utilization
Redmond            Home-store    Directory
Total              97,641 MB     61,652 MB
Mean per-node      2.6 MB        1.6 MB
Max per-node       1,664 MB      1,664 MB
(backup) Fault tolerance
                      Home-store    Directory
Equations    Mean     H/O           (H+S)/O
             Max      Hmax/O        max(Hmax, Smax)/O
Redmond      Mean     0.0027%       0.198%
             Max      0.0048%       1.5%
Cambridge    Mean     0.95%         1.68%
             Max      3.34%         12.4%
(backup) Full home-store protocol
[Figure: message diagram. The client's request (1, 2: req) is routed through other nodes to the home node over the LAN. 3a: object or not-modified returned from the home. 3b: on a miss, the home forwards the request (b: req) to the origin server over the WAN, and the object or not-modified comes back from the origin.]