The little engine(s) that could: scaling online social networks
-
Upload
josep-m-pujol -
Category
Technology
-
view
3.029 -
download
0
description
Transcript of The little engine(s) that could: scaling online social networks
![Page 1: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/1.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
The Little Engine(s) that could: Scaling Online Social Networks
Josep M. Pujol, Vijay Erramilli, Georgos Siganos, Xiaoyuan Yang
Nikos Laoutaris, Parminder Chhabra and Pablo Rodriguez
Telefonica Research, Barcelona
![Page 2: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/2.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Problem
Solution
Implementation
Conclusions
2
![Page 3: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/3.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Scalability is a pain: the designers’ dilemma
“Scalability is a desirable
property of a system, a network,
or a process, which indicates its
ability to either handle growing
amounts of work in a graceful
manner or to be readily enlarged"
The designers’ dilemma:
wasted complexity and long
time to market
vs.
short time to market but risk
of death by success
3
![Page 4: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/4.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Towards transparent scalability
The advent of the Cloud: virtualized machinery + gigabit network
Hardware scalability
Application scalability
4
![Page 5: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/5.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Towards transparent scalability
Elastic resource allocation for the
Presentation and the Logic layer:
More machines when load increases.
Components are stateless, therefore,
independent and duplicable
Components are interdependent,
therefore, non-duplicable on demand.
5
![Page 6: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/6.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Scaling the data backend...
• Full replication
– Load of read requests decrease with the number of servers
– State does not decrease with the number of servers
6
![Page 7: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/7.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Scaling the data backend...
• Full replication
– Load of read requests decrease with the number of servers
– State does not decrease with the number of servers
– Maintains data locality
7
![Page 8: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/8.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Scaling the data backend...
• Full replication
– Load of read requests decrease with the number of servers
– State does not decrease with the number of servers
– Maintains data locality
Operations on the data (e.g. a query) can be resolved in a single
server from the application perspective
Application Logic
Data Store
Application Logic
Data Store
Data Store
Data Store
Centralized Distributed
8
![Page 9: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/9.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Scaling the data backend...
• Full replication
– Load of read requests decrease with the number of servers
– State does not decrease with the number of servers
– Maintains data locality
• Horizontal partitioning or sharding
– Load of read requests decrease with the number of servers
– State decrease with the number of servers
– Maintains data locality as long as…
9
![Page 10: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/10.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Scaling the data backend...
• Full replication
– Load of read requests decrease with the number of servers
– State does not decrease with the number of servers
– Maintains data locality
• Horizontal partitioning or sharding
– Load of read requests decrease with the number of servers
– State decrease with the number of servers
– Maintains data locality as long as…
… the splits are disjoint (independent)
Bad news for OSNs!!
10
![Page 11: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/11.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Why sharding is not enough for OSN?
• Shards in OSN can never be disjoint because:
– Operations on user i require access to the data of other users
– at least one-hop away.
• From graph theory: – there is no partition that for all
nodes all neighbors and itself are in the same partition if there is a single connected component.
Data locality cannot be
maintained by partitioning a
social network!!11
![Page 12: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/12.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
A very active area…
• Online Social Networks are massive systems spanning hundreds of machines in multiple datacenters
• Partition and placement to minimize network traffic and delay
– Karagiannis et al. “Hermes: Clustering Users in Large-Scale Email Services” -- SoCC 2010
– Agarwal et al. “Volley: Automated Data Placement of Geo-Distributed Cloud Services” -- NSDI 2010
• Partition and replication towards scalability
– Pujol et al “Scaling Online Social Networks without Pains” --NetDB / SOSP 2009
12
![Page 13: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/13.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Problem
Solution
Implementation
Conclusions
13
![Page 14: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/14.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
14
![Page 15: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/15.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
Me
15
![Page 16: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/16.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
[]
Inbox
Friends
16
![Page 17: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/17.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
[]
key-101
17
![Page 18: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/18.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
[key-101]
key-101
18
![Page 19: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/19.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
[key-146, key-121, key-105, key-101]
key-101
key-105
key-121
key-146
19
![Page 20: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/20.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
[key-146, key-121, key-105, key-101]
key-101
key-105
key-121
key-146
Let’s see what’s going onin the world
20
![Page 21: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/21.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
[key-146, key-121, key-105, key-101]
key-101zz
key-105
key-121
key-146
Let’s see what’s going onin the world
1) FETCH –inbox [from server ]A
21
![Page 22: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/22.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
[key-146, key-121, key-105, key-101]
key-101
key-105
key-121
key-146
Let’s see what’s going onin the world
1) FETCH –inbox [from server ]
R= [key-146, key-121, key-105, key-101]
A
22
![Page 23: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/23.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
[key-146, key-121, key-105, key-101]
key-101
key-105
key-121
key-146
Let’s see what’s going onin the world
1) FETCH –inbox [from server ]
2) FETCH text WHERE key IN R [from
server ]
R= [key-146, key-121, key-105, key-101]
A
?
23
![Page 24: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/24.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1 2 3 4
5
6789
0
[key-146, key-121, key-105, key-101]
key-101
key-105
key-121
key-146
Let’s see what’s going onin the world
1) FETCH –inbox [from server ]
2) FETCH text WHERE key IN R [from
server ]
Where is the data?
R= [key-146, key-121, key-105, key-101]
A
?
24
![Page 25: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/25.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s Operations 101
1 2 3 4
5
6789
0
[key-146, key-121, key-105, key-101]
key-101
key-105
key-121
key-146
1) FETCH –inbox [from server ]
2) FETCH text WHERE key IN R [from
server ]
in , then data locality!
R= [key-146, key-121, key-105, key-101]
A
?
A
A AA
A
25
![Page 26: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/26.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s Operations 101
1 2 3 4
5
6789
0
[key-146, key-121, key-105, key-101]
key-101
key-105
key-121
key-146
1) FETCH –inbox [from server ]
2) FETCH text WHERE key IN R [from
server ]
in , then no
data locality!
R= [key-146, key-121, key-105, key-101]
A
?
E
A FB
B F E
26
![Page 27: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/27.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
OSN’s operations 101
1) Relational Databases Selects and joins across multiple shards of the database are possible but performance is poor (e.g. MySQL Cluster, Oracle Rack)
2) Key-Value Stores (DHT) More efficient than relational databases: multi-get primitives to transparently fetchdata from multiple servers.
But it’s not a silver bullet:o lose SQL query language =>
programmatic querieso lose abstraction from data operationso suffer from high traffic, eventually
affecting performance:• Incast issue• Multi-get hole• latency dominated by the worse
performing server
27
![Page 28: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/28.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
28
![Page 29: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/29.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0
Server 1 Server 2
Users are assigned toservers randomlyRandom partition
(DHT-like)
29
![Page 30: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/30.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Random Partitioning – DHT
30
![Page 31: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/31.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Random Partitioning – DHT
31
![Page 32: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/32.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Random Partitioning – DHT
How do we enforce data locality?
32
![Page 33: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/33.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Random Partitioning – DHT
How do we enforce data locality?
Replication
33
![Page 34: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/34.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0
4 7
34
![Page 35: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/35.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0
4 7
#4 Master + Replica#4 Master + Replica
35
![Page 36: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/36.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 That does it for user #3, but
what about the others?
4 7
36
![Page 37: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/37.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 That does it for user #3, but
what about the others?
The very same procedure…4 7
37
![Page 38: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/38.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads for any user is
guaranteed!
4 7 1
0 5
3 2 8
9 6
After a while…
38
![Page 39: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/39.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads for any user is
guaranteed!
4 7 1
0 5
3 2 8
9 6
39
![Page 40: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/40.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads for any user is
guaranteed!
4 7 1
0 5
3 2 8
9 6
40
![Page 41: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/41.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads for any user is
guaranteed!
Random partition + Replication can lead
to high replication overheads!4 7 1
0 5
3 2 8
9 6
This is full replication!!
41
![Page 42: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/42.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads for any user is
guaranteed!
Random partition + Replication can lead
to high replication overheads!4 7 1
0 5
3 2 8
9 6
Can we do better?
42
![Page 43: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/43.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads in any user is
guaranteed!
Random partition + Replication can lead
to high replication overheads!4 7 1
0 5
3 2 8
9 6
We can leverage the underlyingsocial structure for the partition…
43
![Page 44: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/44.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads in any user is
guaranteed!
Random partition + Replication can lead
to high replication overheads!4 7 1
0 5
3 2 8
9 6
We can leverage the underlyingsocial structure for the partition…
44
![Page 45: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/45.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads in any user is
guaranteed!
Random partition + Replication can lead
to high replication overheads!4 7 1
0 5
3 2 8
9 6
We can leverage the underlyingsocial structure for the partition…
0 43
9
2
68
5
7
1 Social Partition
45
![Page 46: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/46.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads in any user is
guaranteed!
Random partition + Replication can lead
to high replication overheads!4 7 1
0 5
3 2 8
9 6
We can leverage the underlyingsocial structure for the partition…
0 43
9
2
68
5
7
1 Social Partition and Replication: SPAR
3 2
46
![Page 47: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/47.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads in any user is
guaranteed!
Random partition + Replication can lead
to high replication overheads!4 7 1
0 5
3 2 8
9 6
We can leverage the underlyingsocial structure for the partition…
0 43
9
2
68
5
7
1 Social Partition and Replication: SPAR
… only two replicas to achieve data
locality
3 2
47
![Page 48: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/48.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Maintaining data locality
1 2 3 45
6789
0
1 23
4
5
67
8
9
0 Data locality for reads in any user is
guaranteed!
Random partition + Replication can lead
to high replication overheads!4 7 1
0 5
3 2 8
9 6
We can leverage the underlyingsocial structure for the partition…
0 43
9
2
68
5
7
1 Social Partition and Replication: SPAR
… only two replicas to achieve data
locality
3 2
48
![Page 49: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/49.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
SPAR Algorithm from 10000 feet
“All partition algorithms are equal, but some are more equal than others”
49
![Page 50: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/50.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
SPAR Algorithm from 10000 feet
“All partition algorithms are equal, but some are more equal than others”
1) Online (incremental) a) Dynamics of the SN (add/remove node, edge)b) Dynamics of the System (add/remove server)
2) Fast (and simple) a) Local informationb) Hill-climbing heuristicc) Load-balancing via back-pressure
3) Stable a) Avoid cascades
4) Effective a) Optimize for MIN_REPLICA (i.e. NP-Hard)b) While maintaining a fixed number of replicas per
user for redundancy (e.g. K=2)
50
![Page 51: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/51.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Looks good on paper, but...
Real OSN data
• Twitter – 2.4M users, 48M edges,
12M tweets for 15 days (Twitter as of Dec08)
• Orkut (MPI)– 3M users, 223M edges
• Facebook (MPI)– 60K users, 500K edges
Partition algorithms
• Random (DHT)
• METIS
• MO+
• modularity optimization + hack
• SPAR online
... let’s try it for real
51
![Page 52: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/52.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
How many replicas are generated?
Twitter 16 partitions Twitter 128 partitions
Rep. overhead % over K=2 Rep. overhead % over K=2
SPAR 2.44 +22% 3.69 +84%
MO+ 3.04 +52% 5.14 +157%
METIS 3.44 +72% 8.03 +302%
Random 5.68 +184% 10.75 +434%
52
![Page 53: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/53.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
How many replicas are generated?
Twitter 16 partitions Twitter 128 partitions
Rep. overhead % over K=2 Rep. overhead % over K=2
SPAR 2.44 +22% 3.69 +84%
MO+ 3.04 +52% 5.14 +157%
METIS 3.44 +72% 8.03 +302%
Random 5.68 +184% 10.75 +434%
2.44 replicas per user (on average)2 to guarantee redundacy
+ 0.44 to guarantee data locality (+22%)
2.44
53
![Page 54: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/54.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
How many replicas are generated?
Twitter 16 partitions Twitter 128 partitions
Rep. overhead % over K=2 Rep. overhead % over K=2
SPAR 2.44 +22% 3.69 +84%
MO+ 3.04 +52% 5.14 +157%
METIS 3.44 +72% 8.03 +302%
Random 5.68 +184% 10.75 +434%
Replication with random partitioning, too costly!
54
![Page 55: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/55.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
How many replicas are generated?
Twitter 16 partitions Twitter 128 partitions
Rep. overhead % over K=2 Rep. overhead % over K=2
SPAR 2.44 +22% 3.69 +84%
MO+ 3.04 +52% 5.14 +157%
METIS 3.44 +72% 8.03 +302%
Random 5.68 +184% 10.75 +434%
Other social based partitioning algorithmshave higher replication overhead than SPAR.
55
![Page 56: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/56.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Problem
Solution
Implementation
Conclusions
56
![Page 57: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/57.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
The SPAR architecture
57
![Page 58: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/58.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
The SPAR architecture
Application, 3-tiers58
![Page 59: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/59.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
The SPAR architecture
Application, 3-tiers
Application, 3-tiersSPAR Middleware
(data-store specific)
59
![Page 60: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/60.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
The SPAR architecture
Application, 3-tiers
Application, 3-tiersSPAR Middleware
(data-store specific)
SPAR Controller : Partiton manager and
Directory Service
60
![Page 61: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/61.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
SPAR in the wild
• Twitter clone (Laconica, now Statusnet)
– Centralized architecture, PHP + MySQL/Postgres
• Twitter data
– Twitter as of end of 2008, 2.4M users, 12M tweets in 15 days
• The little engines
– 16 commodity desktops: Pentium Duo at 2.33Ghz, 2GB RAM connected with Gigabit-Ethernet switch
• Test SPAR on top of:
– MySQL (v5.5)
– Cassandra (v0.5.0)
61
![Page 62: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/62.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
SPAR on top of MySQL
• Different read request rate (application level):– 1 call to LDS, 1 join to on the notice and notice inbox to get the last 20
tweets and its content
– User is queried only once per session (4min)
– Spar allows for data to be partitioned, improving both query times and cache hit ratios.
– Full replication suffers from low cache hit ratio and large tables (1B+)
MySQL Full-Replication SPAR + MySQL
99th < 150ms 16 req/s 2500 req/s
62
![Page 63: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/63.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
SPAR on top of Cassandra
Network bandwidth is not an issue but Network I/O and CPU I/O are:
– Network delays is reduced• Worse performing server produces
delays
– Memory hit ratio is increased• Random partition destroys correlations
•Different read request rate (application level):
Vanilla Cassandra
SPAR + Cassandra
99th < 100ms 200 req/s 800 req/s
63
![Page 64: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/64.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Problem
Solution
Implementation
Conclusions
64
![Page 65: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/65.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
Conclusions
SPAR provides the means
to achieve transparent
scalability
1) For applications using
RDBMS (not necessarily
limited to OSN)
2) And, a performance
boost for key-value stores
due to the reduction of
Network I/O
65
![Page 66: The little engine(s) that could: scaling online social networks](https://reader033.fdocuments.us/reader033/viewer/2022051323/547c8015b4af9fb9158b51bb/html5/thumbnails/66.jpg)
ACM/SIGCOMM 2010 – New Delhi, India
http://research.tid.es/jmps/
@solso
Thanks!
66