Scaling Your Cache And Caching At Scale
-
Upload
alex-miller -
Category
Technology
-
view
9.330 -
download
1
description
Transcript of Scaling Your Cache And Caching At Scale
![Page 1: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/1.jpg)
Scaling Your Cache & Caching at Scale
Alex Miller@puredanger
![Page 2: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/2.jpg)
Mission• Why does caching work?
• What’s hard about caching? • How do we make choices as we
design a caching architecture?• How do we test a cache for
performance?
![Page 3: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/3.jpg)
What is caching?
![Page 4: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/4.jpg)
Lots of data
![Page 5: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/5.jpg)
Memory Hierarchy
Register
L1 cache
L2 cache
RAM
Disk
Remote disk
1E+00 1E+01 1E+02 1E+03 1E+04 1E+05 1E+06 1E+07 1E+08 1E+09
1000000000
10000000
200
15
3
1
Clock cycles to access
![Page 6: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/6.jpg)
Register
L1 Cache
L2 Cache
Main Memory
Local Disk
Remote Disk
Fast
Slow
Small Expensive
Big Cheap
Facts of Life
![Page 7: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/7.jpg)
Caching to the rescue!
![Page 8: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/8.jpg)
Temporal Locality
Stream:
Cache:
Hits: 0%
![Page 9: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/9.jpg)
Temporal Locality
Stream:
Cache:
Hits: 0%
Stream:
Cache:
Hits: 65%
![Page 10: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/10.jpg)
Non-uniform distribution
0
800
1600
2400
3200
0%
25%
50%
75%
100%Web page hits, ordered by rank
Page views, ordered by rank
Pageviews per rank% of total hits per rank
![Page 11: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/11.jpg)
Temporal locality+
Non-uniform distribution
![Page 12: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/12.jpg)
17000 pageviewsassume avg load = 250 ms
cache 17 pages / 80% of viewscached page load = 10 ms
new avg load = 58 ms
trade memory for latency reduction
![Page 13: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/13.jpg)
The hidden benefit:reduces database load
DatabaseMemory
line of over provisioning
![Page 14: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/14.jpg)
A brief aside...
• What is Ehcache?
• What is Terracotta?
![Page 15: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/15.jpg)
Ehcache Example
CacheManager manager = new CacheManager();Ehcache cache = manager.getEhcache("employees");cache.put(new Element(employee.getId(), employee));Element element = cache.get(employee.getId());
<cache name="employees" maxElementsInMemory="1000" memoryStoreEvictionPolicy="LRU" eternal="false" timeToIdleSeconds="600" timeToLiveSeconds="3600" overflowToDisk="false"/>
![Page 16: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/16.jpg)
Terracotta
App Node
TerracottaServer
TerracottaServer
App NodeApp NodeApp Node
App NodeApp NodeApp NodeApp Node
![Page 17: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/17.jpg)
But things are not always so simple...
![Page 18: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/18.jpg)
Pain of LargeData Sets
• How do I choose which elements stay in memory and which go to disk?
• How do I choose which elements to evict when I have too many?
• How do I balance cache size against other memory uses?
![Page 19: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/19.jpg)
Eviction
When cache memory is full, what do I do?
• Delete - Evict elements
• Overflow to disk - Move to slower, bigger storage
• Delete local - But keep remote data
![Page 20: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/20.jpg)
Eviction in Ehcache
<cache name="employees" maxElementsInMemory="1000" memoryStoreEvictionPolicy="LRU" eternal="false" timeToIdleSeconds="600" timeToLiveSeconds="3600" overflowToDisk="false"/>
Evict with “Least Recently Used” policy:
![Page 21: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/21.jpg)
Spill to Disk in Ehcache
<diskStore path="java.io.tmpdir"/>
<cache name="employees" maxElementsInMemory="1000" memoryStoreEvictionPolicy="LRU" eternal="false" timeToIdleSeconds="600" timeToLiveSeconds="3600"
overflowToDisk="true" maxElementsOnDisk="1000000" diskExpiryThreadIntervalSeconds="120" diskSpoolBufferSizeMB="30" />
Spill to disk:
![Page 22: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/22.jpg)
Terracotta Clustering
<terracottaConfig url="server1:9510,server2:9510"/>
<cache name="employees" maxElementsInMemory="1000" memoryStoreEvictionPolicy="LRU" eternal="false" timeToIdleSeconds="600" timeToLiveSeconds="3600" overflowToDisk="false">
<terracotta/> </cache>
Terracotta configuration:
![Page 23: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/23.jpg)
Pain of Stale Data
• How tolerant am I of seeing values changed on the underlying data source?
• How tolerant am I of seeing values changed by another node?
![Page 24: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/24.jpg)
Expiration
1 2 3 4 5 6 7 80 9
TTI=4
TTL=4
![Page 25: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/25.jpg)
TTI and TTL in Ehcache
<cache name="employees" maxElementsInMemory="1000" memoryStoreEvictionPolicy="LRU" eternal="false" timeToIdleSeconds="600" timeToLiveSeconds="3600" overflowToDisk="false"/>
![Page 26: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/26.jpg)
Replication in Ehcache<cacheManagerPeerProviderFactory class="net.sf.ehcache.distribution. RMICacheManagerPeerProviderFactory" properties="hostName=fully_qualified_hostname_or_ip, peerDiscovery=automatic, multicastGroupAddress=230.0.0.1, multicastGroupPort=4446, timeToLive=32"/>
<cache name="employees" ...> <cacheEventListenerFactory class="net.sf.ehcache.distribution.RMICacheReplicatorFactory” properties="replicateAsynchronously=true, replicatePuts=true, replicatePutsViaCopy=false, replicateUpdates=true, replicateUpdatesViaCopy=true, replicateRemovals=true asynchronousReplicationIntervalMillis=1000"/></cache>
![Page 27: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/27.jpg)
Terracotta Clustering
Still use TTI and TTL to manage stale data between cache and data source
Coherent by default but can relax with coherentReads=”false”
![Page 28: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/28.jpg)
Pain of Loading
• How do I pre-load the cache on startup?
• How do I avoid re-loading the data on every node?
![Page 29: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/29.jpg)
Persistent Disk Store
<diskStore path="java.io.tmpdir"/>
<cache name="employees" maxElementsInMemory="1000" memoryStoreEvictionPolicy="LRU" eternal="false" timeToIdleSeconds="600" timeToLiveSeconds="3600" overflowToDisk="true" maxElementsOnDisk="1000000" diskExpiryThreadIntervalSeconds="120" diskSpoolBufferSizeMB="30"
diskPersistent="true" />
![Page 30: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/30.jpg)
Bootstrap Cache Loader
<bootstrapCacheLoaderFactory class="net.sf.ehcache.distribution. RMIBootstrapCacheLoaderFactory" properties="bootstrapAsynchronously=true, maximumChunkSizeBytes=5000000" propertySeparator=",” />
Bootstrap a new cache node from a peer:
On startup, create background thread to pull the existing cache data from another peer.
![Page 31: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/31.jpg)
Terracotta Persistence
Nothing needed beyond setting up Terracotta clustering.
Terracotta will automatically bootstrap:- the cache key set on startup- cache values on demand
![Page 32: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/32.jpg)
Pain of Duplication
• How do I get failover capability while avoiding excessive duplication of data?
![Page 33: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/33.jpg)
Partitioning + Terracotta Virtual Memory
• Each node (mostly) holds data it has seen• Use load balancer to get app-level partitioning• Use fine-grained locking to get concurrency• Use memory flush/fault to handle memory
overflow and availability• Use causal ordering to guarantee coherency
![Page 34: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/34.jpg)
Scaling Your Cache
![Page 35: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/35.jpg)
21
1 JVM
Ehcache
2 or more JVMS2 or more
JVMs
EhcacheRMI
2 or more JVMS2 or more big JVMs
Ehcachedisk store
Terracotta OSS
2 or more JVMS2 or more
JVMs
2 or more JVMs2 or more
JVMs2 or more JVMslots of
JVMs
more scale
Terracotta FXEhcache FX
2 or more JVMS2 or more
JVMs2 or more JVMslots of
JVMs
Terracotta FXEhcache FX
NO NO YES YESYES
Scalability Continuumcausal
ordering
# JVMs
runtime
Ehcache DXmanagementand control
Ehcache EX and FXmanagementand control
![Page 36: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/36.jpg)
Caching at Scale
![Page 37: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/37.jpg)
Know Your Use Case• Is your data partitioned (sessions) or
not (reference data)?• Do you have a hot set or uniform
access distribution?• Do you have a very large data set?
• Do you have a high write rate (50%)?• How much data consistency do you
need?
![Page 38: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/38.jpg)
Types of caches
Name Communication Advantage
Broadcast invalidation
multicast low latency
Replicated multicast offloads db
Datagrid point-to-point scalable
Distributed 2-tier point-to-point
all of the above
![Page 39: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/39.jpg)
Common Data PatternsI/O pattern Locality Hot set Rate of
changeCatalog/customer
low low low
Inventory high high high
Conversations high high low
Catalogs/customers• warm all the
data into cache
• High TTL
Inventory• fine-grained
locking• write-behind to DB
Conversations• sticky load
balancer• disconnect
conversations from DB
![Page 40: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/40.jpg)
Build a Test
• As realistic as possible• Use real data (or good fake data)
• Verify test does what you think• Ideal test run is 15-20 minutes
![Page 41: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/41.jpg)
Cache Warming
![Page 42: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/42.jpg)
Cache Warming
• Explicitly record cache warming or loading as a testing phase
• Possibly multiple warming phases
![Page 43: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/43.jpg)
Lots o’ Knobs
![Page 44: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/44.jpg)
Things to Change• Cache size• Read / write / other mix• Key distribution• Hot set• Key / value size and structure• # of nodes
![Page 45: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/45.jpg)
Lots o’ Gauges
![Page 46: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/46.jpg)
Things to Measure
• Application throughput (TPS)• Application latency
• OS: CPU, Memory, Disk, Network• JVM: Heap, Threads, GC
![Page 47: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/47.jpg)
Benchmark and Tune
• Create a baseline• Run and modify parameters
• Test, observe, hypothesize, verify• Keep a run log
![Page 48: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/48.jpg)
Bottleneck Analysis
![Page 49: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/49.jpg)
Pushing It• If CPUs are not all busy...
• Can you push more load?• Waiting for I/O or resources
• If CPUs are all busy...• Latency analysis
![Page 50: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/50.jpg)
I/O Waiting• Database
• Connection pooling • Database tuning
• Lazy connections• Remote services
![Page 51: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/51.jpg)
Locking and Concurrency
Key Value1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
get 2get 2
put 8
put 1
2
Threads Locks
![Page 52: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/52.jpg)
Locking and Concurrency
Key Value1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
get 2get 2
put 8
put 12
Threads Locks
![Page 53: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/53.jpg)
Objects and GC
• Unnecessary object churn• Tune GC
• Concurrent vs parallel collectors• Max heap
• ...and so much more• Watch your GC pauses!!!
![Page 54: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/54.jpg)
Cache Efficiency
• Watch hit rates and latencies• Cache hit - should be fast
• Unless concurrency issue• Cache miss
• Miss local vs• Miss disk / cluster
![Page 55: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/55.jpg)
Cache Sizing• Expiration and eviction tuning
• TTI - manage moving hot set
• TTL - manage max staleness
• Max in memory - keep hot set resident
• Max on disk / cluster - manage total disk / clustered cache
![Page 56: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/56.jpg)
Cache Coherency
• No replication (fastest)
• RMI replication (loose coupling)• Terracotta replication (causal
ordering) - way faster than strict ordering
![Page 57: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/57.jpg)
Latency Analysis
• Profilers• Custom timers
• Tier timings• Tracer bullets
![Page 58: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/58.jpg)
mumble-mumble*
It’s time to add it to Terracotta.
* lawyers won’t let me say more
![Page 59: Scaling Your Cache And Caching At Scale](https://reader034.fdocuments.us/reader034/viewer/2022042623/540875768d7f724f088b4763/html5/thumbnails/59.jpg)
Thanks!
• Twitter - @puredanger• Blog - http://tech.puredanger.com
• Terracotta - http://terracotta.org• Ehcache - http://ehcache.org