Workload Analysis of a Large-Scale Key-Value Store

Post on 19-Mar-2016


Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, Mike Paleczny. Workload Analysis of a Large-Scale Key-Value Store. Analyze Memcached at Facebook: more than 284,000,000,000 requests across 5 different use cases, covering workload characteristics, locality, and cache effectiveness.


Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, Mike Paleczny

2

Analyze Memcached at Facebook

More than 284,000,000,000 requests

5 different use cases

Workload characteristics, locality, cache effectiveness

3

Why Is Caching Important?

[Diagram: Web Servers, Cache Servers, Database]

4

Motivation

Understand workload characteristics

Identify factors affecting performance

Provide a benchmark for future studies

5

Memcached

Distributed memory caching system

Key-value store for small objects

[Diagram: Key, Hash Function, Memcached Servers]
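The key-to-server step in the diagram can be sketched roughly as follows. This is a minimal illustration, not the placement scheme used in the traced deployment: the MD5-plus-modulo choice, the `pick_server` helper, and the server names are all assumptions made for the example.

```python
import hashlib
from typing import Sequence


def pick_server(key: str, servers: Sequence[str]) -> str:
    """Map a key to one server by hashing the key (simple modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical server names, for illustration only.
SERVERS = ["mc-01:11211", "mc-02:11211", "mc-03:11211"]

chosen = pick_server("user:42", SERVERS)
```

The same key always maps to the same server, so any client can locate a cached object without a central directory. Modulo placement reshuffles most keys when the server count changes, which is why real clients often prefer consistent hashing.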

6

Tracing Methodology

Capture traces through a Linux Kernel Module (LKM)

Process traces with Hive

[Diagram: Memcached, Transport (TCP/UDP), Network, Ethernet; LKM]

7

Facebook Deployment

Pool  Size      Description
USR   Few       User-account status information
APP   Dozens    Object metadata of a popular application
SYS   Few       System data on service location
VAR   Dozens    Server-side browser information
ETC   Hundreds  Nonspecific, general purpose

Contains server related information

Anything that doesn’t belong to a specific pool goes to ETC

8

Analysis

Workload Characteristics

Locality, Cache Behavior

9

Request Composition

> 99.8% GET

GET:UPDATE = 30:1

10

Key Size Distribution

90% of VAR keys are 31B

USR keys are 16B or 21B

ETC is heterogeneous

11

Value Size Distribution

USR values are only 2B

90% of values are smaller than 500B

12

Value Size Dist. By Overall Weight

In every pool except ETC, values of 500B or smaller account for 90% of the data by weight

In ETC, values of 10KB or smaller account for 90% of the data by weight
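The difference between counting values and weighting them by size can be sketched as below. The `size_fractions` helper and the sample sizes are made up for illustration and are not taken from the traces.

```python
def size_fractions(sizes, threshold):
    """Return (share of values <= threshold, share of bytes in values <= threshold)."""
    small = [s for s in sizes if s <= threshold]
    by_count = len(small) / len(sizes)
    by_weight = sum(small) / sum(sizes)
    return by_count, by_weight


# Made-up sample: 90 tiny values and 10 large ones.
sizes = [2] * 90 + [10_000] * 10
count_frac, weight_frac = size_fractions(sizes, 500)
```

In this sample, 90% of the values are under the threshold, yet they carry well under 1% of the bytes, which is why the by-count and by-weight distributions are reported separately.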

13

Request Rate Over Time

All pools show diurnal pattern except SYS

14

Request Rate Over Time (ETC)

Nighttime in the Western Hemisphere

North America starts its day

15

Analysis

Workload Characteristics

Locality, Cache Behavior

16

Repeating Keys

0.0003% of keys in 10% of requests in ETC

1% of keys in 55% of requests in ETC

Least frequent 50% of keys in 1% of requests in ETC
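A statistic of this kind (share of requests covered by the most popular keys) could be computed from a trace along the following lines. The helper name and the toy trace are assumptions for the example.

```python
from collections import Counter


def request_share_of_top_keys(keys, top_fraction):
    """Share of all requests that go to the most popular `top_fraction` of unique keys."""
    counts = Counter(keys)
    ranked = sorted(counts.values(), reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:n_top]) / len(keys)


# Toy trace: 4 unique keys, 10 requests.
trace = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
share = request_share_of_top_keys(trace, 0.25)
```

Here the top 25% of keys (the single key "a") receives 6 of the 10 requests.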

17

Locality Over Time

[Bar chart: % of unique keys out of total in unit time (0 to 100) for USR, APP, ETC, VAR, SYS; 5min and 60min bins]
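One way to compute the plotted metric, the percentage of unique keys among all requests in each time bin, is sketched here. The trace format of `(timestamp_seconds, key)` pairs and the sample data are assumptions.

```python
from collections import defaultdict


def unique_key_percent(trace, bin_seconds):
    """trace: iterable of (timestamp_seconds, key). Returns {bin_index: % unique keys}."""
    totals = defaultdict(int)
    uniques = defaultdict(set)
    for ts, key in trace:
        b = int(ts // bin_seconds)
        totals[b] += 1
        uniques[b].add(key)
    return {b: 100.0 * len(uniques[b]) / totals[b] for b in totals}


# Made-up trace spanning two 5-minute bins.
trace = [(0, "a"), (10, "a"), (20, "b"), (310, "a"), (320, "c")]
per_5min = unique_key_percent(trace, 300)
per_60min = unique_key_percent(trace, 3600)
```

A lower percentage means more repeated keys within the bin, that is, stronger locality. Longer bins can only pull the figure down, since repeats accumulate.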

18

Reuse Period of Keys

99.9% of SYS keys are reused in 1hr

88.5% of ETC keys are reused in 1hr

96.4% of ETC keys are reused in 6hr
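The slide does not spell out how the reuse period is measured. One plausible reading, taken here as an assumption, is the gap between consecutive accesses to the same key. Under that reading, the share of reuses falling within a window could be computed as:

```python
def reuse_fraction(trace, window_seconds):
    """Share of repeat accesses that occur within window_seconds of the
    previous access to the same key. trace: (timestamp_seconds, key) pairs
    in time order."""
    last_seen = {}
    within = 0
    reuses = 0
    for ts, key in trace:
        if key in last_seen:
            reuses += 1
            if ts - last_seen[key] <= window_seconds:
                within += 1
        last_seen[key] = ts
    return within / reuses if reuses else 0.0


# Made-up trace: three repeat accesses, one of them within an hour.
trace = [(0, "a"), (100, "a"), (200, "b"), (5000, "b"), (5100, "a")]
frac_1hr = reuse_fraction(trace, 3600)
```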

19

Hit Rate

98.2% 92.9% 81.4%

93.7% 98.7%

Why?

20

Causes of ETC Cache Misses

Share of misses: Compulsory 70%, Capacity 22%, Invalidation 8%

[Pie chart, share of requests: hit 81%, miss: compulsory 13%, miss: capacity 4%, miss: invalidation 2%]
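The three miss categories can be separated from a trace with a simple rule of thumb, sketched below. The rules are assumptions for illustration, not necessarily the study's method. A miss on a never-seen key is treated as compulsory, a miss following a DELETE of that key as invalidation, and any other miss as capacity. The `(op, key, hit)` event format is also hypothetical.

```python
from collections import Counter


def classify_misses(events):
    """events: iterable of (op, key, hit); op is "GET" or "DELETE",
    hit is a bool for GET and ignored for DELETE."""
    seen = set()         # keys that appeared earlier in the trace
    invalidated = set()  # keys deleted since their last GET
    counts = Counter()
    for op, key, hit in events:
        if op == "DELETE":
            invalidated.add(key)
            seen.add(key)
            continue
        if hit:
            counts["hit"] += 1
        elif key not in seen:
            counts["compulsory"] += 1
        elif key in invalidated:
            counts["invalidation"] += 1
        else:
            counts["capacity"] += 1
        seen.add(key)
        invalidated.discard(key)  # assume the miss refills the cache
    return counts


# Made-up event sequence.
events = [
    ("GET", "a", False), ("GET", "a", True), ("DELETE", "a", None),
    ("GET", "a", False), ("GET", "b", False), ("GET", "b", False),
]
result = classify_misses(events)
```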

21

Conclusion

Analyzed 5 different memcached use cases

Different applications of memcached have extreme variations in access patterns

Answered pertinent questions to improve Facebook’s memcached usage

22

Thank You

Questions?