Workload Analysis of a Large-Scale Key-Value Store

Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, Mike Paleczny

Analyze Memcached at Facebook

+284,000,000,000 requests

5 different use cases

Workload characteristics, locality, cache effectiveness

Why Is Caching Important?

Diagram: Web Servers, Cache Servers, Database

Motivation

Understand workload characteristics

Identify factors affecting performance

Provide a benchmark for future studies

Memcached

Distributed memory caching system

Key-value store for small objects

Diagram: Key, Hash Function, Memcached Servers
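The diagram labels on this slide (Key, Hash Function, Memcached Servers) describe how a client chooses a server for a key. A minimal sketch of that idea, assuming a plain modulo scheme over an MD5 digest: the server names and the `pick_server` helper are hypothetical, and the slides do not say which hash or placement scheme Facebook uses.

```python
import hashlib

# Hypothetical server list, for illustration only.
SERVERS = ["cache-0:11211", "cache-1:11211", "cache-2:11211"]

def pick_server(key: str, servers=SERVERS) -> str:
    """Map a key to one server: hash the key, then take the digest modulo the pool size."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# The same key always maps to the same server.
chosen = pick_server("user:42")
```

With modulo placement, changing the number of servers remaps most keys, which is why many clients use consistent hashing instead.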

Tracing Methodology

Capture traces through a Linux Kernel Module (LKM)

Process traces with Hive

Diagram: Memcached, Transport (TCP/UDP), Network, Ethernet, LKM

Facebook Deployment

Pool  Size      Description
USR   Few       User-account status information
APP   Dozens    Object metadata of a popular application
SYS   Few       System data on service location
VAR   Dozens    Server-side browser information
ETC   Hundreds  Nonspecific, general purpose

Contains server related information

Anything that doesn’t belong to a specific pool goes to ETC

Analysis

Workload Characteristics

Locality, Cache Behavior

Request Composition

> 99.8% GET

GET:UPDATE = 30:1

Key Size Distribution

90% of VAR keys are 31B

USR keys are 16B or 21B

ETC is heterogeneous

Value Size Distribution

USR values are only 2B

90% of values are smaller than 500B

Value Size Dist. By Overall Weight

In every pool except ETC, values of 500B or smaller account for 90% of the data

In ETC, values of 10KB or smaller account for 90% of the data
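The gap between counting values and weighting them by size can be sketched as follows. The sample sizes are made up for illustration and are not taken from the traces; `fraction_at_most` is a hypothetical helper.

```python
def fraction_at_most(sizes, limit, by_weight=False):
    """Fraction of values (or of total bytes, if by_weight) with size <= limit."""
    weights = sizes if by_weight else [1] * len(sizes)
    total = sum(weights)
    covered = sum(w for s, w in zip(sizes, weights) if s <= limit)
    return covered / total

# Made-up value sizes in bytes: nine small values and one large one.
sizes = [100] * 9 + [10_000]

by_count = fraction_at_most(sizes, 500)                  # 9 of 10 values
by_bytes = fraction_at_most(sizes, 500, by_weight=True)  # 900 of 10,900 bytes
```

In this toy sample, small values dominate by count yet hold under a tenth of the bytes, which is why the two views of the distribution can look so different.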

Request Rate Over Time

All pools show diurnal pattern except SYS

Request Rate Over Time (ETC)

Nighttime in Western Hemisphere

North America starts its day

Analysis

Workload Characteristics

Locality, Cache Behavior

Repeating Keys

0.0003% of keys account for 10% of requests in ETC

1% of keys account for 55% of requests in ETC

The least frequent 50% of keys account for 1% of requests in ETC
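Figures of this kind can be derived from a trace by ranking keys by request count. A minimal sketch, assuming the trace is simply a list of requested keys; the trace below is made up, and `request_share_of_top_keys` is a hypothetical helper, not the authors' tooling.

```python
from collections import Counter

def request_share_of_top_keys(requested_keys, key_fraction):
    """Share of requests that go to the most popular `key_fraction` of distinct keys."""
    counts = sorted(Counter(requested_keys).values(), reverse=True)
    top_n = max(1, round(len(counts) * key_fraction))
    return sum(counts[:top_n]) / len(requested_keys)

# Made-up trace: one hot key and nine keys requested once each.
trace = ["hot"] * 11 + [f"cold{i}" for i in range(9)]

# The top 10% of keys (1 of 10) receives 11 of 20 requests.
share = request_share_of_top_keys(trace, 0.10)
```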

Locality Over Time

Figure: % of unique keys out of total in unit time (0 to 100), for USR, APP, ETC, VAR, SYS, at 5min and 60min
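One way to compute such a metric is sketched below. It assumes that "% of unique keys out of total" means distinct keys divided by total requests within each window; the slide does not define it precisely. The trace format and the `unique_key_percentages` helper are hypothetical.

```python
from collections import defaultdict

def unique_key_percentages(requests, window_seconds):
    """For each time window, distinct keys as a percentage of requests in that window."""
    keys_by_window = defaultdict(list)
    for timestamp, key in requests:
        keys_by_window[int(timestamp // window_seconds)].append(key)
    return {
        window: 100.0 * len(set(keys)) / len(keys)
        for window, keys in sorted(keys_by_window.items())
    }

# Made-up (timestamp in seconds, key) pairs.
requests = [(0, "a"), (10, "a"), (20, "b"), (30, "a"), (310, "c"), (320, "c")]

per_5min = unique_key_percentages(requests, 5 * 60)
per_60min = unique_key_percentages(requests, 60 * 60)
```

A lower percentage means more repeated keys within the window, that is, stronger locality.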

Reuse Period of Keys

99.9% of SYS keys are reused in 1hr

88.5% of ETC keys are reused in 1hr

96.4% of ETC keys are reused in 6hr
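A reuse-period figure can be approximated from a trace by measuring the gap between consecutive accesses to the same key. This sketch assumes that definition, which the slide does not spell out; the trace is made up and `fraction_reused_within` is a hypothetical helper.

```python
def fraction_reused_within(requests, period_seconds):
    """Among repeat accesses, the fraction occurring within period_seconds
    of the previous access to the same key."""
    last_seen = {}
    gaps = []
    for timestamp, key in sorted(requests):
        if key in last_seen:
            gaps.append(timestamp - last_seen[key])
        last_seen[key] = timestamp
    if not gaps:
        return 0.0
    return sum(gap <= period_seconds for gap in gaps) / len(gaps)

# Made-up (timestamp in seconds, key) pairs.
requests = [(0, "a"), (600, "a"), (1200, "b"), (9000, "b"), (9600, "a")]

within_1hr = fraction_reused_within(requests, 1 * 3600)
within_6hr = fraction_reused_within(requests, 6 * 3600)
```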

Hit Rate

98.2%, 92.9%, 81.4%, 93.7%, 98.7%

Why?

Causes of ETC Cache Misses

Outcome             % of misses  % of all requests
Hit                 n/a          81%
Miss: compulsory    70%          13%
Miss: capacity      22%          4%
Miss: invalidation  8%           2%
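The two sets of percentages on this slide appear to be related by simple arithmetic. This check assumes that 70%, 22%, and 8% are shares of misses, while 13%, 4%, and 2% are shares of all requests; the slide does not state this explicitly.

```python
hit_rate = 0.81  # ETC hit share shown on this slide
miss_share = {"compulsory": 0.70, "capacity": 0.22, "invalidation": 0.08}

# Each cause's share of all requests is the miss rate times its share of misses.
miss_rate = 1.0 - hit_rate
share_of_all_requests = {
    cause: round(100 * miss_rate * share)
    for cause, share in miss_share.items()
}
```

Rounded to whole percentages, the results match the 13%, 4%, and 2% shown next to the 81% hit share.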

Conclusion

Analyzed 5 different memcached use cases

Different applications of memcached have extreme variations in access patterns

Answered pertinent questions to improve Facebook’s memcached usage

Thank You

Questions?