An Analysis of Facebook Photo Caching


Transcript of An Analysis of Facebook Photo Caching

Qi Huang, Ken Birman, Robbert van Renesse (Cornell), Wyatt Lloyd (Princeton, Facebook),

Sanjeev Kumar, Harry C. Li (Facebook)

An Analysis of Facebook Photo Caching

250 Billion* Photos on Facebook

(Figure: photos surface in Profile, Feed, and Album; behind them sit the Cache Layers and the Storage Backend. This talk is a full-stack study of both.)

* Internet.org, Sept. 2013

Preview of Results

Current stack performance:
• Browser cache is important (absorbs 65+% of requests)
• Photo popularity distribution shifts across cache layers

Opportunities for improvement:
• Smarter algorithms can do much better (S4LRU)
• A collaborative geo-distributed Edge cache is worth trying

Facebook Photo-Serving Stack

Client-based Browser Cache
• Every client has a browser cache (millions of them)
• A hit is served by a local fetch, without leaving the client

Stack Choice
• Cache misses travel either through the Facebook stack or through Akamai, a Content Distribution Network (CDN); both offer caches and storage
• Focus here: the Facebook stack

Geo-distributed Edge Cache (FIFO)
• Browser Cache misses from millions of clients go to an Edge Cache at a Point of Presence (PoP); there are tens of Edge Caches
• Purpose:
1. Reduce cross-country latency
2. Reduce Data Center bandwidth

Single Global Origin Cache (FIFO)
• Edge Cache misses go to a single global Origin Cache in the Data Centers (four of them)
• Requests are routed to Origin Cache hosts by Hash(url), so each photo has one fixed home host
• Purpose: minimize I/O-bound operations at the storage backend
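
The deck only says that Hash(url) picks the Origin Cache host. A minimal sketch of that idea (host names and the hash choice are illustrative assumptions):

```python
import hashlib

ORIGIN_HOSTS = ["origin-0", "origin-1", "origin-2", "origin-3"]  # hypothetical hosts

def origin_host_for(url: str) -> str:
    # Hash the photo URL onto one host, so every request for the same
    # photo always lands in the same Origin Cache partition.
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return ORIGIN_HOSTS[int(digest, 16) % len(ORIGIN_HOSTS)]
```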

Haystack Backend
• Origin Cache misses are served by the storage backend, Haystack, in the Data Center


How did we collect the trace?

Trace Collection
• Instrumentation scope: the whole stack, from Browser Cache through Edge and Origin Caches to the Backend (Haystack)
• Two sampling options:
– Request-based: collect X% of requests
– Object-based: collect all requests for X% of objects (the approach used here)
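
A minimal sketch of object-based sampling, assuming photos carry a stable ID: hashing the ID means the same X% of objects are logged at every layer, with all of their requests.

```python
import hashlib

SAMPLE_PERCENT = 1  # illustrative sampling rate

def is_sampled(photo_id: str) -> bool:
    # Deterministic: either every layer logs every request for this photo,
    # or none do, giving fair coverage of unpopular objects.
    h = int(hashlib.md5(photo_id.encode("utf-8")).hexdigest(), 16)
    return h % 100 < SAMPLE_PERCENT
```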

Sampling on a Power-law Workload
(Figure: request count vs. object rank, comparing a request-based sample with an object-based sample)
• Request-based sampling is biased toward popular content and inflates apparent cache performance
• Object-based sampling gives fair coverage of unpopular content

Trace Collection
• Client side: 77.2M requests (Desktop) from 12.3M browsers, covering 1.4M photos with all requests for each
• Server side (Edge Cache, Origin Cache, resizers, and Backend; 12.3K servers): 2.6M photo objects, all requests for each

Analysis
• Traffic sheltering effects of caches
• Photo popularity distribution
• Cache size, algorithm, and collaborative Edge
• In the paper:
– Stack performance as a function of photo age
– Stack performance as a function of social connectivity
– Geographical traffic flow

Traffic Sheltering

Layer                Requests arriving   Hit ratio     Share of all traffic served
Browser Cache        77.2M               65.5%         65.5%
Edge Cache           26.6M               58.0%         20.0%
Origin Cache         11.2M               31.8%          4.6%
Backend (Haystack)    7.6M               (serves all)   9.9%
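
These numbers are internally consistent; a quick check (a sketch, using only the counts from the figure):

```python
arriving = {"Browser": 77.2e6, "Edge": 26.6e6, "Origin": 11.2e6, "Backend": 7.6e6}
order = ["Browser", "Edge", "Origin", "Backend"]
total = arriving["Browser"]
for layer, nxt in zip(order, order[1:] + [None]):
    served = arriving[layer] - (arriving[nxt] if nxt else 0.0)  # absorbed at this layer
    print(f"{layer}: hit ratio {served / arriving[layer]:.1%}, "
          f"traffic share {served / total:.1%}")
# Browser: 65.5% / 65.5%; Edge: ~58% / ~20%; Origin: ~32% / ~4.7%; Backend: 100% / ~9.8%
```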

Photo popularity and its cache impact

Popularity Distribution
(Figure: request count vs. photo popularity rank at each layer)
• At the Browser, popularity resembles a power-law distribution
• "Viral" photos become the head of the distribution at the Edge
• Skew is reduced after each layer of caching
• At the Backend, popularity resembles a stretched exponential distribution
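
For reference, the two shapes in standard textbook form (not the deck's fitted parameters): a power law relates request count $f$ to popularity rank $r$ as

$$f(r) \propto r^{-\alpha},$$

while a stretched exponential has a tail of the form

$$P(X > x) = e^{-(x/x_0)^{c}}, \qquad 0 < c < 1,$$

whose head is flatter than a power law's, matching the reduced skew seen at the Backend.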

Popularity with Absolute Traffic
• Storage/cache designers should pick the layer they serve: each layer sees a different distribution and traffic volume

Popularity Impact on Caches
(Figure: share of traffic served by each layer across four popularity groups: High, Medium, Low, Lowest, each holding 25% of requests)
• The Browser cache's traffic share decreases gradually from popular to unpopular groups
• The Edge serves a consistent 22~23% share, except for the tail group (7.8%)
• The Origin contributes most for the "Low" group (9.3%)
• The Backend serves the tail: 70% of "Lowest"-group requests reach Haystack

Can we make the cache better?

Simulation
• Replay the trace (first 25% used as warm-up)
• Estimate the base (currently deployed) cache size
• Evaluate two hit ratios: object-wise and byte-wise
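
A minimal sketch of this style of replay, under stated assumptions: the trace is a list of (photo_id, byte size) requests, and the cache is a byte-capacity LRU (the deck does not fix the simulator's eviction policy; swap in another policy as needed):

```python
from collections import OrderedDict

def replay(trace, capacity_bytes, warmup_frac=0.25):
    """Replay (photo_id, size) pairs through an LRU cache and report
    object-wise and byte-wise hit ratios, ignoring the warm-up prefix."""
    cache = OrderedDict()                    # photo_id -> size, LRU order
    used = 0
    warmup = int(len(trace) * warmup_frac)
    hits = reqs = hit_bytes = req_bytes = 0
    for i, (photo_id, size) in enumerate(trace):
        hit = photo_id in cache
        if hit:
            cache.move_to_end(photo_id)      # refresh recency
        else:
            cache[photo_id] = size
            used += size
            while used > capacity_bytes:     # evict from the LRU end
                _, evicted_size = cache.popitem(last=False)
                used -= evicted_size
        if i >= warmup:                      # count only post-warm-up requests
            reqs += 1
            req_bytes += size
            if hit:
                hits += 1
                hit_bytes += size
    return hits / reqs, hit_bytes / req_bytes
```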

Edge Cache with Different Sizes
(Figure: object hit ratio vs. cache size for the San Jose Edge, with an infinite-cache line at the top)
• Picked the San Jose Edge (high traffic, median hit ratio)
• "x" marks the estimated current deployment size, at a 59% hit ratio; larger sizes reach 65% and 68%
• Matching the infinite-cache hit ratio needs roughly 45x the current capacity

Edge Cache with Different Algorithms
• Both LRU and LFU outperform FIFO, but only slightly

S4LRU
(Figure: the cache space divided into four LRU segments L0 through L3, with L3 holding the most recently promoted items)
• A missed object is inserted at the head of L0
• A hit promotes an object to the head of the next segment up (capped at L3)
• Overflow from a segment's tail moves down one segment; overflow from L0 is evicted from the cache
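
A minimal sketch of S4LRU along those lines (count-based rather than byte-based for brevity; the class name and equal segment sizing are assumptions):

```python
from collections import OrderedDict

class S4LRU:
    """Four stacked LRU segments L0..L3 sharing one capacity. Misses enter
    L0; hits promote one segment up; tail overflow cascades downward and
    finally out of L0."""

    def __init__(self, capacity, levels=4):
        self.per_level = capacity // levels
        self.segments = [OrderedDict() for _ in range(levels)]  # index 0 = L0

    def access(self, key):
        """Return True on a hit; update segment placement either way."""
        for i, seg in enumerate(self.segments):
            if key in seg:
                del seg[key]
                self._insert(min(i + 1, len(self.segments) - 1), key)
                return True
        self._insert(0, key)   # miss: new object starts in L0
        return False

    def _insert(self, level, key):
        # Insert at the MRU end of `level`; push any overflow down a level.
        while level >= 0:
            seg = self.segments[level]
            seg[key] = True                   # new key lands at the MRU end
            if len(seg) <= self.per_level:
                return
            key, _ = seg.popitem(last=False)  # pop the LRU tail, demote it
            level -= 1
        # overflow past L0 means eviction from the whole cache
```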

Edge Cache with Different Algorithms
(Figure: hit ratio vs. cache size for S4LRU, FIFO, the clairvoyant bound, and an infinite cache)
• S4LRU improves the most: roughly 68% at the current size (vs. 59% for FIFO), and it matches today's ratio with about 1/3x the capacity
• The clairvoyant algorithm (Bélády) shows there is still much room for improvement

Origin Cache
• S4LRU improves the Origin even more than the Edge (a 14% gain)

Which Photo to Cache
• Recency and frequency are what give S4LRU its lead
• Do age and social factors also play a role?

Collaborative cache on the Edge

Geographic Coverage of Edge
• Each Edge alone has a small working set; 9 Edges carry high-volume traffic
• Do clients stay with their local Edge?
• Atlanta example: only 20% of its requests are served locally; the rest go to D.C. (35%), Miami (20%), Chicago (10%), Dallas (5%), NYC (5%), and California (5%)
• So Atlanta has 80% of its requests served by remote Edges

Geographic Coverage of Edge
• The local share varies widely: Atlanta 20%, LA 18%, Miami 35%, NYC 35%, Dallas 50%, Chicago 60%
• Substantial remote traffic is normal, which amplifies each Edge's working set

Collaborative Edge
• "Independent" aggregates the results of all high-volume Edges operating separately
• A "Collaborative" Edge, with all Edges acting as one logical cache, increases the hit ratio by 18%
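
One simple way to realize a collaborative Edge (an illustrative assumption; the deck does not prescribe a mechanism) is to give every photo a home Edge by hashing, so the participating caches hold one copy between them:

```python
import hashlib

EDGES = ["atlanta", "chicago", "dallas", "dc", "la",
         "miami", "nyc", "san-jose", "seattle"]  # hypothetical Edge names

def home_edge(photo_id: str) -> str:
    # A local miss is forwarded to the photo's home Edge before falling
    # through to the Origin, so the Edges behave like one large cache.
    h = int(hashlib.sha1(photo_id.encode("utf-8")).hexdigest(), 16)
    return EDGES[h % len(EDGES)]
```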

Related Work
• Storage analysis: BSD file system (SOSP '85), Sprite (SOSP '91), NT (SOSP '99), NetApp (SOSP '11), iBench (SOSP '11)
• Content distribution analysis: Cooperative caching (SOSP '99), CDN vs. P2P (OSDI '02), P2P (SOSP '03), CoralCDN (NSDI '10), Flash crowds (IMC '11)
• Web analysis: Zipfian (INFOCOM '00), Flash crowds (WWW '02), Modern web traffic (IMC '11)

Conclusion

• Quantify caching performance across the full photo-serving stack

• Quantify how popularity changes across layers of caches

• Show that recency, frequency, age, and social factors all impact cache performance

• Outline the potential gain of collaborative caching