An Analysis of Facebook Photo Caching

Page 1: An Analysis of Facebook Photo Caching

An Analysis of Facebook Photo Caching

Qi Huang, Ken Birman, Robbert van Renesse (Cornell), Wyatt Lloyd (Princeton, Facebook), Sanjeev Kumar, Harry C. Li (Facebook)

Page 2: An Analysis of Facebook Photo Caching

250 Billion* Photos on Facebook

[Diagram labels: Profile, Feed, Album; Cache Layers; Storage Backend; Full-stack Study]

* Internet.org, Sept. 2013

Page 3: An Analysis of Facebook Photo Caching

Preview of Results

Current Stack Performance

• Browser cache is important (reduces requests by 65+%)

• Photo popularity distribution shifts across layers

Opportunities for Improvement

• Smarter algorithms can do much better (S4LRU)

• Collaborative geo-distributed cache is worth trying

Page 4: An Analysis of Facebook Photo Caching

Facebook Photo-Serving Stack

[Diagram: Client]

Page 5: An Analysis of Facebook Photo Caching

Client-based Browser Cache

[Diagram: Client with Browser Cache; Local Fetch]

Page 6: An Analysis of Facebook Photo Caching

Client-based Browser Cache

[Diagram: Client (Millions) with Browser Cache]

Page 7: An Analysis of Facebook Photo Caching

Stack Choice

[Diagram: Client (Millions) with Browser Cache; Facebook Stack (Caches, Storage); Akamai Content Distribution Network (CDN)]

• Focus: Facebook stack

Page 8: An Analysis of Facebook Photo Caching

Geo-distributed Edge Cache (FIFO)

[Diagram: Client (Millions) with Browser Cache; PoP (Tens) with Edge Cache]

Page 9: An Analysis of Facebook Photo Caching

Geo-distributed Edge Cache (FIFO)

[Diagram: Client (Millions) with Browser Cache; PoP (Tens) with Edge Cache]

Purpose

1. Reduce cross-country latency

2. Reduce Data Center bandwidth

Page 12: An Analysis of Facebook Photo Caching

Single Global Origin Cache (FIFO)

[Diagram: Client (Millions) with Browser Cache; PoP (Tens) with Edge Cache; Data Center (Four) with Origin Cache]

Page 13: An Analysis of Facebook Photo Caching

Single Global Origin Cache (FIFO)

[Diagram: Client (Millions) with Browser Cache; PoP (Tens) with Edge Cache; Data Center (Four) with Origin Cache]

Purpose

1. Minimize I/O-bound operations

Page 14: An Analysis of Facebook Photo Caching

Single Global Origin Cache (FIFO)

[Diagram: Client (Millions) with Browser Cache; PoP (Tens) with Edge Cache; Data Center (Four) with Origin Cache]

Routing to the Origin Cache: Hash(url)
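
The Hash(url) label suggests that a request is mapped to an Origin Cache server by hashing the photo URL, so that every Edge sends the same photo to the same server and the four data centers behave like one cache. A minimal sketch of that idea follows. The server names, the choice of SHA-256, and the modulo mapping are assumptions made for illustration; the slides do not say how the hash is computed.

```python
import hashlib

# Hypothetical server names; the talk does not list Origin Cache hosts.
ORIGIN_SERVERS = ["origin-0", "origin-1", "origin-2", "origin-3"]


def route_to_origin(url, servers=ORIGIN_SERVERS):
    """Map a photo URL to one Origin Cache server by hashing the URL."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer bucket selector.
    bucket = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[bucket]


# The same URL always lands on the same server, whichever Edge asks.
print(route_to_origin("https://example.com/photo/123_n.jpg"))
```

A plain modulo mapping reshuffles most URLs when the server list changes; a consistent-hashing scheme would avoid that, at the cost of more code.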

Page 16: An Analysis of Facebook Photo Caching

Haystack Backend

[Diagram: Client (Millions) with Browser Cache; PoP (Tens) with Edge Cache; Data Center (Four) with Origin Cache and Backend (Haystack)]

Page 17: An Analysis of Facebook Photo Caching

How did we collect the trace?

Page 18: An Analysis of Facebook Photo Caching

Trace Collection

[Diagram: Instrumentation Scope over the stack: Client with Browser Cache; PoP with Edge Cache; Data Center with Origin Cache and Backend (Haystack); (Object-based sampling)]

• Request-based: collect X% of requests

• Object-based: collect all requests for X% of objects
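
The difference between the two sampling schemes can be sketched in a few lines. This is an illustration only: the trace layout of (client_id, object_id) tuples, the function names, and the use of a hash to pick objects are assumptions, not details from the talk.

```python
import hashlib
import random


def sample_requests(trace, percent, seed=0):
    """Request-based: keep each request independently with probability percent/100."""
    rng = random.Random(seed)
    return [req for req in trace if rng.random() * 100 < percent]


def sample_objects(trace, percent):
    """Object-based: keep every request for a deterministic subset of object ids."""
    def selected(object_id):
        digest = hashlib.sha256(object_id.encode("utf-8")).digest()
        # Roughly `percent` out of every 100 object ids fall below the cutoff.
        return int.from_bytes(digest[:8], "big") % 100 < percent
    return [req for req in trace if selected(req[1])]


# Toy trace of (client_id, object_id) pairs.
trace = [("c%d" % (i % 7), "photo%d" % (i % 50)) for i in range(1000)]
print(len(sample_requests(trace, 10)), len(sample_objects(trace, 10)))
```

Because the object-based decision depends only on the object id, every layer of the stack can apply it independently and still log the same set of photos.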

Page 19: An Analysis of Facebook Photo Caching

Sampling on Power-law

[Plot: x-axis Object rank]

Page 20: An Analysis of Facebook Photo Caching

Sampling on Power-law

• Request-based: biased toward popular content, inflates cache performance

[Plot: Req-based sample; x-axis Object rank]

Page 21: An Analysis of Facebook Photo Caching

Sampling on Power-law

• Object-based: fair coverage of unpopular content

[Plot: Object-based sample; x-axis Object rank]

Page 23: An Analysis of Facebook Photo Caching

Trace Collection

77.2M reqs (Desktop)

12.3M Browsers

12.3K Servers

1.4M photos, all reqs for each

2.6M photo objects, all reqs for each

[Diagram: Instrumentation Scope over the stack: Client with Browser Cache; PoP with Edge Cache; Data Center with Origin Cache, Resizer (R), and Backend (Haystack)]

Page 24: An Analysis of Facebook Photo Caching

Analysis

• Traffic sheltering effects of caches

• Photo popularity distribution

• Size, algorithm, collaborative Edge

• In paper

– Stack performance as a function of photo age

– Stack performance as a function of social connectivity

– Geographical traffic flow

Page 25: An Analysis of Facebook Photo Caching

Traffic Sheltering

Layer: Browser Cache | Edge Cache | Origin Cache | Backend (Haystack)

Requests received: 77.2M | 26.6M | 11.2M | 7.6M

Hit ratio: 65.5% | 58.0% | 31.8% | –

Traffic Share: 65.5% | 20.0% | 4.6% | 9.9%
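
The hit ratios and traffic shares follow from the request counts at each layer: a layer serves whatever it receives and does not pass on. The sketch below redoes that arithmetic from the rounded counts on the slide, so its results should agree with the slide's percentages only to within about half a percentage point. The function and variable names are illustrative.

```python
# Requests arriving at each layer, in millions, as read from the slide.
arrivals = [("Browser", 77.2), ("Edge", 26.6), ("Origin", 11.2), ("Backend", 7.6)]


def sheltering(arrivals):
    """Return (layer, hit ratio, share of all client requests served) per layer."""
    total = arrivals[0][1]
    rows = []
    for i, (layer, reqs) in enumerate(arrivals):
        # Requests a layer cannot serve arrive at the next layer down.
        passed_on = arrivals[i + 1][1] if i + 1 < len(arrivals) else 0.0
        served = reqs - passed_on
        rows.append((layer, served / reqs, served / total))
    return rows


for layer, hit_ratio, share in sheltering(arrivals):
    # The Backend serves everything it receives, so its "hit ratio" is 100%.
    print(f"{layer:8s} hit ratio {hit_ratio:6.1%}  traffic share {share:6.1%}")
```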

Page 26: An Analysis of Facebook Photo Caching

Photo popularity and its cache impact

Page 27: An Analysis of Facebook Photo Caching

Popularity Distribution

• Browser resembles a power-law distribution

2%

Page 28: An Analysis of Facebook Photo Caching

Popularity Distribution

• “Viral” photos become the head for Edge

Page 29: An Analysis of Facebook Photo Caching

Popularity Distribution

• Skewness is reduced after layers of cache

Page 30: An Analysis of Facebook Photo Caching

Popularity Distribution

• Backend resembles a stretched exponential distribution

Page 31: An Analysis of Facebook Photo Caching

Popularity with Absolute Traffic

• Storage/cache designers: pick a layer

Page 32: An Analysis of Facebook Photo Caching

Popularity Impact on Caches

[Plot: photos grouped by popularity into High, Medium, Low, and Lowest; each group has 25% of requests]
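
One way to form groups that each hold an equal share of requests is to rank objects by request count and cut the ranking wherever the running request total crosses a quarter of the trace. The sketch below assumes that approach; the function name and the toy request list are illustrative and not taken from the talk.

```python
from collections import Counter


def popularity_groups(requests, n_groups=4):
    """Rank objects by request count and cut the ranking into groups that
    each account for roughly an equal share of all requests."""
    counts = Counter(requests).most_common()   # most popular first
    total = sum(c for _, c in counts)
    groups = [[] for _ in range(n_groups)]
    seen = 0
    for obj, c in counts:
        # Group index implied by the requests already assigned.
        idx = min(seen * n_groups // total, n_groups - 1)
        groups[idx].append(obj)
        seen += c
    return groups


requests = ["a"] * 4 + ["b"] * 4 + ["c"] * 2 + ["d"] * 2 + ["e", "f", "g", "h"]
high, medium, low, lowest = popularity_groups(requests)
print(high, medium, low, lowest)
```

With real traces the split is only approximately even, because a single very popular object cannot be divided between two groups.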

Page 33: An Analysis of Facebook Photo Caching

Popularity Impact on Caches

• Browser traffic share decreases gradually

Page 34: An Analysis of Facebook Photo Caching

Popularity Impact on Caches

• Edge serves a consistent share (22~23%) except for the tail (7.8%)

Page 35: An Analysis of Facebook Photo Caching

Popularity Impact on Caches

• Origin contributes most for “low” group (9.3%)

Page 36: An Analysis of Facebook Photo Caching

Popularity Impact on Caches

• Backend serves the tail

70% Haystack

Page 37: An Analysis of Facebook Photo Caching

Can we make the cache better?

Page 38: An Analysis of Facebook Photo Caching

Simulation

• Replay the trace (25% warm up)

• Estimate the base cache size

• Evaluate two hit-ratios (object-wise, byte-wise)
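
The replay loop behind such a simulation can be sketched as follows. It assumes a FIFO cache (the policy the slides attribute to the Edge and Origin layers), a trace of (object_id, size_bytes) pairs, and a warm-up expressed as a fraction of the trace; these names and the trace layout are illustrative. Estimating the base cache size is not covered here.

```python
from collections import OrderedDict


def replay_fifo(trace, capacity_bytes, warmup_fraction=0.25):
    """Replay (object_id, size_bytes) requests through a FIFO cache and
    return (object hit ratio, byte hit ratio) measured after warm-up."""
    cache = OrderedDict()          # object_id -> size, oldest insertion first
    used = 0
    warmup = int(len(trace) * warmup_fraction)
    reqs = hits = req_bytes = hit_bytes = 0
    for i, (obj, size) in enumerate(trace):
        hit = obj in cache
        if i >= warmup:            # count only requests after the warm-up
            reqs += 1
            req_bytes += size
            if hit:
                hits += 1
                hit_bytes += size
        if not hit and size <= capacity_bytes:
            while used + size > capacity_bytes:
                _, old_size = cache.popitem(last=False)   # evict the oldest
                used -= old_size
            cache[obj] = size
            used += size
    return (hits / reqs if reqs else 0.0,
            hit_bytes / req_bytes if req_bytes else 0.0)


trace = [("a", 10), ("b", 30), ("a", 10), ("c", 30)]
print(replay_fifo(trace, capacity_bytes=100))
```

The two ratios diverge whenever hits fall mostly on small or mostly on large photos, which is why the slide lists both.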

Page 39: An Analysis of Facebook Photo Caching

Edge Cache with Different Sizes

• Picked San Jose edge (high traffic, median hit ratio)

[Plot annotation: 59%]

Page 40: An Analysis of Facebook Photo Caching

Edge Cache with Different Sizes

• “x” estimates current deployment size (59% hit ratio)

[Plot annotations: 59%, 65%, 68%]

Page 41: An Analysis of Facebook Photo Caching

Edge Cache with Different Sizes

• The “Infinite” cache hit ratio needs 45x the current capacity

[Plot annotations: Infinite Cache, 59%, 65%, 68%]

Page 42: An Analysis of Facebook Photo Caching

Edge Cache with Different Algos

• Both LRU and LFU outperform FIFO slightly

[Plot annotation: Infinite Cache]

Page 43: An Analysis of Facebook Photo Caching

S4LRU

[Diagram: Cache Space divided into four levels, L3, L2, L1, L0; More Recent items lead each level]

Page 44: An Analysis of Facebook Photo Caching

S4LRU

[Diagram: Cache Space with levels L3, L2, L1, L0; a Missed Object enters at L0]

Page 45: An Analysis of Facebook Photo Caching

S4LRU

[Diagram: Cache Space with levels L3, L2, L1, L0; Hit]

Page 46: An Analysis of Facebook Photo Caching

S4LRU

[Diagram: Cache Space with levels L3, L2, L1, L0; Evict]
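
The S4LRU slides can be read as a segmented LRU with four levels: a missed object enters L0, a hit moves an object up one level, and an object pushed out of a level drops to the level below until it leaves L0 and the cache. The sketch below implements that reading. For simplicity it counts capacity in objects rather than bytes and splits it evenly across levels; the class and method names are illustrative.

```python
from collections import OrderedDict


class S4LRU:
    """Segmented LRU with four levels, counting capacity in objects."""

    def __init__(self, capacity, levels=4):
        self.per_level = capacity // levels
        # Each level maps key -> None; the last entry is the most recent.
        self.queues = [OrderedDict() for _ in range(levels)]

    def _level_of(self, key):
        for i, q in enumerate(self.queues):
            if key in q:
                return i
        return None

    def _insert(self, level, key):
        self.queues[level][key] = None
        # Overflow cascades downward; overflow from level 0 leaves the cache.
        for i in range(level, -1, -1):
            while len(self.queues[i]) > self.per_level:
                victim, _ = self.queues[i].popitem(last=False)
                if i > 0:
                    self.queues[i - 1][victim] = None

    def access(self, key):
        """Return True on a hit, False on a miss; update the cache either way."""
        level = self._level_of(key)
        if level is None:
            self._insert(0, key)
            return False
        del self.queues[level][key]
        self._insert(min(level + 1, len(self.queues) - 1), key)
        return True


cache = S4LRU(capacity=8)
print([cache.access(k) for k in ["a", "a", "b", "c", "d", "b", "a"]])
```

In the toy run, "a" survives a burst of one-time requests because its earlier hit moved it out of L0, which is the behaviour that lets repeatedly requested photos outlast one-off ones.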

Page 47: An Analysis of Facebook Photo Caching

Edge Cache with Different Algos

• S4LRU improves the most

[Plot annotations: 59%, 68%, 1/3x, Infinite Cache]

Page 48: An Analysis of Facebook Photo Caching

Edge Cache with Different Algos

• Clairvoyant (Bélády) shows there is still much room for improvement

[Plot annotation: Infinite Cache]
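
A clairvoyant policy in the sense of Bélády knows the whole trace in advance and evicts the cached object whose next request lies farthest in the future. The sketch below assumes equal-sized objects, which keeps the code short; with photos of different sizes the same rule is no longer guaranteed to be optimal. The function name is illustrative.

```python
def clairvoyant_hits(trace, capacity):
    """Count hits for a cache of `capacity` equal-sized objects that always
    evicts the cached object whose next request is farthest in the future."""
    INF = float("inf")
    # next_use[i] = index of the next request for trace[i], or INF if none.
    next_use = [INF] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], INF)
        last_seen[trace[i]] = i
    cache = {}                     # object -> index of its next request
    hits = 0
    for i, obj in enumerate(trace):
        if obj in cache:
            hits += 1
        elif capacity > 0 and len(cache) >= capacity:
            victim = max(cache, key=cache.get)   # farthest next use
            del cache[victim]
        if capacity > 0:
            cache[obj] = next_use[i]
    return hits


print(clairvoyant_hits(list("abcabdab"), capacity=2))
```

Because it needs future knowledge, this policy is usable only in trace-driven simulation, where it serves as an upper reference for practical algorithms.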

Page 49: An Analysis of Facebook Photo Caching

Origin Cache

• S4LRU improves Origin more than Edge (14%)

[Plot annotation: Infinite Cache]

Page 50: An Analysis of Facebook Photo Caching

Which Photo to Cache

• Recency & frequency lead to S4LRU

• Do age and social factors also play a role?

Page 51: An Analysis of Facebook Photo Caching

Collaborative cache on the Edge

Page 52: An Analysis of Facebook Photo Caching

Geographic Coverage of Edge

Small working set

Page 53: An Analysis of Facebook Photo Caching

Geographic Coverage of Edge

9 Edges with high-volume traffic

Page 54: An Analysis of Facebook Photo Caching

Geographic Coverage of Edge

Do clients stay with local Edge?

Page 55: An Analysis of Facebook Photo Caching

Geographic Coverage of Edge

Atlanta

Page 56: An Analysis of Facebook Photo Caching

Geographic Coverage of Edge

Atlanta

20% local

5% Dallas

35% D.C.

5% NYC

20% Miami

5% California

10% Chicago

• Atlanta has 80% of its requests served by remote Edges

Page 57: An Analysis of Facebook Photo Caching

Geographic Coverage of Edge

Atlanta 20% local

Miami 35% local

Dallas 50% local

Chicago 60% local

LA 18% local

NYC 35% local

• Substantial remote traffic is normal

Page 58: An Analysis of Facebook Photo Caching

Geographic Coverage of Edge

Amplified working set

Page 59: An Analysis of Facebook Photo Caching

Collaborative Edge

Page 60: An Analysis of Facebook Photo Caching

Collaborative Edge

• “Independent” aggregates all high-volume Edges

Page 61: An Analysis of Facebook Photo Caching

Collaborative Edge

• “Collaborative” Edge increases hit ratio by 18%
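
The intuition can be shown with a toy comparison: when each Edge caches on its own, a photo requested at two Edges misses twice and occupies space twice, whereas one shared cache with the combined capacity misses once. The sketch below assumes FIFO caches, equal-sized objects, and a trace of (edge_name, object_id) pairs; the edge names are hypothetical and the numbers it produces say nothing about the 18% reported in the talk.

```python
from collections import OrderedDict


def fifo_hits(requests, capacity):
    """Count hits for a FIFO cache holding `capacity` equal-sized objects."""
    cache = OrderedDict()
    hits = 0
    for obj in requests:
        if obj in cache:
            hits += 1
            continue
        if capacity <= 0:
            continue
        if len(cache) >= capacity:
            cache.popitem(last=False)      # evict the oldest insertion
        cache[obj] = None
    return hits


def independent_vs_collaborative(trace, capacity_per_edge):
    """trace: list of (edge_name, object_id). Returns both hit ratios."""
    edges = sorted({edge for edge, _ in trace})
    independent = sum(
        fifo_hits([o for e, o in trace if e == edge], capacity_per_edge)
        for edge in edges)
    # One logical cache with the same total capacity, shared by all Edges.
    collaborative = fifo_hits([o for _, o in trace],
                              capacity_per_edge * len(edges))
    n = len(trace)
    return independent / n, collaborative / n


trace = [("sjc", "p1"), ("atl", "p1"), ("sjc", "p2"), ("atl", "p2")]
print(independent_vs_collaborative(trace, capacity_per_edge=2))
```

A real deployment would also pay a latency cost for fetching from a peer Edge, which this hit-ratio comparison ignores.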

Page 62: An Analysis of Facebook Photo Caching

Related Work

Storage Analysis: BSD file system (SOSP ’85), Sprite (SOSP ’91), NT (SOSP ’99), NetApp (SOSP ’11), iBench (SOSP ’11)

Content Distribution Analysis: Cooperative caching (SOSP ’99), CDN vs. P2P (OSDI ’02), P2P (SOSP ’03), CoralCDN (NSDI ’10), Flash crowds (IMC ’11)

Web Analysis: Zipfian (INFOCOM ’00), Flash crowds (WWW ’02), Modern web traffic (IMC ’11)

Page 63: An Analysis of Facebook Photo Caching

Conclusion

• Quantify caching performance

• Quantify popularity changes across layers of caches

• Recency, frequency, age, social factors impact cache

• Outline potential gain of collaborative caching