Social Hash - USENIX · Summary Assignment problems are common in distributed systems design...

Social Hash: an Assignment Framework for Optimizing Distributed Systems Operations on Social Networks Alon Shalita, Brian Karrer, Igor Kabiljo, Arun Sharma, Alessandro Presta, Aaron Adcock, Herald Kllapi, and Michael Stumm March 2016

Page 1: Social Hash - USENIX · Summary Assignment problems are common in distributed systems design Proposed Social Hash framework for solving assignment problems Two-level design optimizes

Social Hash: an Assignment Framework for Optimizing Distributed Systems Operations on Social Networks

Alon Shalita, Brian Karrer, Igor Kabiljo, Arun Sharma, Alessandro Presta, Aaron Adcock, Herald Kllapi, and Michael Stumm

March 2016

Page 2

Assignment Problem

[Diagram: Alon’s HTTP requests and Igor’s HTTP requests reach a Point-of-Presence (PoP) in front of several front-end clusters, each drawn with a cache.]

Page 3

Assignment Problem

[Diagram: Alon’s HTTP requests and Igor’s HTTP requests reach a Point-of-Presence (PoP) in front of several front-end clusters, each drawn with a cache.]

Page 4

Assignment Problem Optimization

[Diagram: a Point-of-Presence (PoP) in front of several front-end clusters, each drawn with a cache.]

Page 5

Solution Requirements

Balanced

[Diagram: a Point-of-Presence (PoP) in front of several front-end clusters, each drawn with a cache.]

Page 6

Solution Requirements

Adaptive

[Diagram: Alon’s HTTP requests reach a Point-of-Presence (PoP) in front of several front-end clusters, each drawn with a cache.]

Page 7

Solution Requirements

Stable

[Diagram: Alon’s HTTP requests reach a Point-of-Presence (PoP) in front of several front-end clusters, each drawn with a cache.]

Page 8

Solution Requirements

Fast decision

[Diagram: Alon’s HTTP requests reach a Point-of-Presence (PoP) in front of several front-end clusters, each drawn with a cache.]

Page 9

Social Hash framework

Page 10

Social Hash framework

[Diagram: three layers, from top to bottom: Components (e.g., compute clusters or storage subsystems), Groups, and Objects (e.g., data records or HTTP requests). Static assignment maps objects to groups; dynamic assignment maps groups to components.]

Page 11

Static assignment

▪ Goal: assign similar objects to the same group

▪ Data access pattern -> represent as graph -> graph partitioning

▪ Large-scale optimization: slow, time-consuming
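
The pipeline on this slide (access pattern, then graph, then partitioning) can be illustrated on a toy graph. The sketch below is not the partitioner used in production, which the slides do not describe. It is a simple pairwise-swap local search, picked here because swaps never change group sizes. The function names and the six-vertex graph are made up for illustration.

```python
from itertools import combinations


def within_group_edges(edges, group_of):
    """Count edges whose two endpoints are assigned to the same group."""
    return sum(1 for u, v in edges if group_of[u] == group_of[v])


def refine_by_swaps(edges, group_of):
    """Greedy local search over a balanced starting assignment.

    Two vertices in different groups are swapped whenever the swap strictly
    increases the number of within-group edges. A swap never changes group
    sizes, so the starting balance is preserved. The result is a local
    optimum, not necessarily the best partition.
    """
    group_of = dict(group_of)  # work on a copy
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(group_of), 2):
            if group_of[a] == group_of[b]:
                continue
            before = within_group_edges(edges, group_of)
            group_of[a], group_of[b] = group_of[b], group_of[a]
            if within_group_edges(edges, group_of) > before:
                improved = True
            else:
                # Undo a swap that did not help.
                group_of[a], group_of[b] = group_of[b], group_of[a]
    return group_of


# Hypothetical access graph: two triangles joined by the single edge c-d.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
# Balanced but poor starting assignment: each triangle is split across groups.
START = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0, "f": 1}

refined = refine_by_swaps(EDGES, START)
print(within_group_edges(EDGES, START), "->", within_group_edges(EDGES, refined))
# prints: 2 -> 6
```

Each pass recounts every edge for every candidate pair, which is far too slow for a large graph and is one way to see why the slide calls large-scale optimization slow.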

Page 12

Dynamic assignment

▪ Goal: adapt to maintain load balance by altering the group -> component assignment

  ▪ hardware changes

  ▪ dynamic workload

  ▪ addition and removal of objects

▪ Two-level framework separates optimization from adaptation

  ▪ Slow optimization -> static

  ▪ Fast adaptation -> dynamic

▪ Group-to-component ratio controls tradeoff
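
As a rough illustration of fast adaptation at the group level, the sketch below places groups on components with a greedy heaviest-first rule and then re-places only the orphaned groups when a component disappears. The slides do not say which balancing rule the dynamic step uses, so the greedy rule, the function names, and the load numbers are all assumptions.

```python
import heapq
from collections import defaultdict


def component_loads(assignment, group_load):
    """Total load per component under a group -> component assignment."""
    loads = defaultdict(float)
    for group, component in assignment.items():
        loads[component] += group_load[group]
    return dict(loads)


def assign_groups(group_load, components, keep=None):
    """Greedy placement: heaviest unplaced group goes to the least-loaded component.

    Groups listed in `keep` stay where they are, provided their component
    still exists. Only the remaining groups are placed, so an update moves
    as few groups as possible.
    """
    keep = {g: c for g, c in (keep or {}).items() if c in components}
    loads = {c: 0.0 for c in components}
    for group, component in keep.items():
        loads[component] += group_load[group]
    heap = [(load, c) for c, load in loads.items()]
    heapq.heapify(heap)

    assignment = dict(keep)
    pending = sorted((g for g in group_load if g not in keep),
                     key=lambda g: (-group_load[g], g))
    for group in pending:
        load, component = heapq.heappop(heap)
        assignment[group] = component
        heapq.heappush(heap, (load + group_load[group], component))
    return assignment


# Hypothetical per-group load (for example, requests per second).
GROUP_LOAD = {"g0": 5, "g1": 4, "g2": 3, "g3": 3, "g4": 2, "g5": 1}

initial = assign_groups(GROUP_LOAD, ["c0", "c1", "c2"])
# Component c2 is removed: only the groups it hosted are placed again.
after_removal = assign_groups(GROUP_LOAD, ["c0", "c1"], keep=initial)
print(component_loads(initial, GROUP_LOAD))
print(component_loads(after_removal, GROUP_LOAD))
```

With more groups per component, each group is a smaller unit of load, so a balancer like this one has finer pieces to move.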

Page 13

Social Hash framework

[Diagram: a request’s key is looked up in the Social Hash Tbl to get a group g, and g is looked up in the Assignment Tbl to get a component c. A failed lookup goes to missing key assignment. Graph Partitioning, driven by graph specifications, feeds the Social Hash Tbl; Dynamic Assignment, driven by monitoring info and the operator console, feeds the Assignment Tbl.]
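
The two-table lookup on this slide can be sketched as two dictionary reads. The class and method names below are hypothetical, and the fallback for a missing key (a deterministic hash over the known groups) is an assumption: the slide names a missing key assignment step but does not say how it picks a group.

```python
import zlib


class SocialHashLookup:
    """Two-step lookup: key -> group (static), then group -> component (dynamic)."""

    def __init__(self, social_hash_table, assignment_table):
        self.social_hash_table = social_hash_table  # key -> group
        self.assignment_table = assignment_table    # group -> component
        self._groups = sorted(assignment_table)

    def group_for(self, key):
        group = self.social_hash_table.get(key)
        if group is None:
            # Assumed fallback for a missing key: hash it onto a known group.
            index = zlib.crc32(key.encode("utf-8")) % len(self._groups)
            group = self._groups[index]
        return group

    def component_for(self, key):
        return self.assignment_table[self.group_for(key)]


# Hypothetical tables.
lookup = SocialHashLookup(
    social_hash_table={"alon": 0, "igor": 0, "arun": 1},
    assignment_table={0: "cluster-A", 1: "cluster-B"},
)
print(lookup.component_for("alon"))      # cluster-A
print(lookup.component_for("new-user"))  # some valid cluster, chosen by hash
```

Changing a group's entry in the assignment table reroutes every key in that group without touching the Social Hash table, which is what keeps the per-request decision fast.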

Page 14

HTTP Request Routing

Page 15

Social Hash for Facebook’s web routing

▪ Objects: HTTP requests identified by user; Components: front-end clusters

▪ PoP: dynamic assignment by hash ring

[Diagram: a Point-of-Presence (PoP) in front of front-end clusters, each drawn with a cache.]
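
A hash ring for the group-to-cluster step might look like the following minimal consistent-hashing sketch. It is a generic construction, not the ring used at the PoPs: the class name, the number of ring points per cluster, and the use of SHA-256 are assumptions made for illustration.

```python
import bisect
import hashlib


def _point(label):
    """Map a string to a position on the ring (a 64-bit integer)."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent hash ring that maps group ids to clusters."""

    def __init__(self, clusters, points_per_cluster=64):
        ring = sorted(
            (_point(f"{cluster}#{i}"), cluster)
            for cluster in clusters
            for i in range(points_per_cluster)
        )
        self._positions = [position for position, _ in ring]
        self._owners = [cluster for _, cluster in ring]

    def cluster_for(self, group_id):
        position = _point(f"group:{group_id}")
        # First ring point clockwise from the group's position, wrapping around.
        index = bisect.bisect_right(self._positions, position) % len(self._positions)
        return self._owners[index]


before = HashRing(["A", "B", "C"])
after = HashRing(["A", "B"])  # cluster C taken out of rotation

moved = [g for g in range(2000) if before.cluster_for(g) != after.cluster_for(g)]
# Every group that moved was previously served by the removed cluster.
print(all(before.cluster_for(g) == "C" for g in moved))
```

Groups that were not on the removed cluster keep their cluster, which fits the stability requirement from the earlier slides.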

Page 16

Edge locality for Facebook’s web routing

[Plot: edge locality (y-axis, 0.00 to 1.00) versus number of groups (x-axis, log scale, 10 to 100,000)]

▪ Production routing: 21k groups for tens of front-end clusters

▪ Over half of friendships are within groups

▪ Updated on a weekly basis (~1% movement)
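
The two quantities on this slide can be written down directly: edge locality is the fraction of friendships whose endpoints share a group, and movement is the fraction of users whose group changes between two versions of the table. The sketch below uses hypothetical function names and a four-user example. Treating movement as a fraction of users is an assumption, since the slide only says "~1% movement".

```python
def edge_locality(friendships, group_of):
    """Fraction of friendship edges whose two endpoints are in the same group."""
    if not friendships:
        return 0.0
    local = sum(1 for u, v in friendships if group_of[u] == group_of[v])
    return local / len(friendships)


def movement(old_table, new_table):
    """Fraction of users present in both tables whose group changed."""
    shared = old_table.keys() & new_table.keys()
    if not shared:
        return 0.0
    moved = sum(1 for user in shared if old_table[user] != new_table[user])
    return moved / len(shared)


# Hypothetical friendship graph and two weekly versions of the table.
FRIENDSHIPS = [("alon", "igor"), ("alon", "arun"), ("igor", "arun"), ("arun", "brian")]
WEEK_1 = {"alon": 0, "igor": 0, "arun": 1, "brian": 1}
WEEK_2 = {"alon": 0, "igor": 0, "arun": 0, "brian": 1}

print(edge_locality(FRIENDSHIPS, WEEK_1))  # 0.5
print(edge_locality(FRIENDSHIPS, WEEK_2))  # 0.75
print(movement(WEEK_1, WEEK_2))            # 0.25
```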

Page 17

Live traffic experiment: TAO miss rate

[Plot: percentage change in TAO miss rate (%) (y-axis, −30 to 10) by day over two weeks (x-axis, Sun through Sat, twice)]

▪ Orange: traffic shifts

▪ Red: duration of test

▪ Green: updated Social Hash table

Page 18

Storage Sharding

Page 19

Assignment Problem 2: Storage sharding

[Diagram: Arun’s query and the storage machines holding the data records it reads]

Objects: data records

Components: storage machines

Page 20

Assignment Problem 2: Storage sharding

[Diagram: Arun’s query and the storage machines holding the data records it reads]

Objects: data records

Components: storage machines

Page 21

Static assignment

▪ Minimize fanout through bipartite graph partitioning

▪ Graph contains recent queries and data records

▪ edge => query accesses data record

▪ Dotted: edge locality optimization

▪ Solid: fanout optimization

[Plot: average fanout (y-axis, 10 to 50) versus number of groups (x-axis, log scale, 3 to 10,000)]
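
Fanout, the quantity minimized on this slide, can be computed from the bipartite query-record graph: for each query, count the distinct groups that hold the records it accesses. The sketch below uses a hypothetical function name and a two-query example. It only evaluates a given assignment and does not perform the partitioning itself.

```python
def average_fanout(query_records, group_of):
    """Mean number of distinct groups touched per query.

    `query_records` maps each query to the data records it accesses, i.e.
    the edges of the bipartite graph. `group_of` maps each record to a group.
    """
    if not query_records:
        return 0.0
    fanouts = [len({group_of[record] for record in records})
               for records in query_records.values()]
    return sum(fanouts) / len(fanouts)


# Hypothetical recent queries and the records each one reads.
QUERY_RECORDS = {"q1": ["r1", "r2", "r3"], "q2": ["r3", "r4"]}

SCATTERED = {"r1": 0, "r2": 1, "r3": 2, "r4": 3}  # every record in its own group
COLOCATED = {"r1": 0, "r2": 0, "r3": 0, "r4": 1}  # co-accessed records share a group

print(average_fanout(QUERY_RECORDS, SCATTERED))  # 2.5
print(average_fanout(QUERY_RECORDS, COLOCATED))  # 1.5
```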

Page 22

Storage sharding deployment

▪ Graph database with thousands of storage servers

▪ Group-to-component ratio of 8

▪ Static assignment every few months

▪ Results:

  ▪ Average latencies decreased by over 50%

  ▪ CPU utilization decreased by over 50%

Page 23

Summary

▪ Assignment problems are common in distributed systems design

▪ Proposed Social Hash framework for solving assignment problems

▪ Two-level design optimizes performance with graph partitioning

▪ Two Facebook integrations in production for over a year

  ▪ HTTP Request Routing: > 25% reduction in TAO miss rate

  ▪ Storage Sharding: Latency and CPU utilization reduced by over 50%