Cassandra Summit 2014: Performance Tuning Cassandra in AWS
1
© 2014 by Intellectual Reserve, Inc. All rights reserved.
Performance Tuning Cassandra In AWS
Cassandra Summit 2014
Michael Nelson
2
Outline
• The App: FamilySearch Family Tree
• The Test: Borland Silk Performer
• The Findings:
  • Row Cache
  • Token Aware Driver
  • Networking Issues
  • Etc.
3
What Is FamilySearch?
• Familysearch.org Website
• Very Large Single Pedigree (Family Tree)
• Largest Collection of Free Genealogical Records
• Largest Genealogical Library
• The Church of Jesus Christ of Latter-day Saints (Mormons)
4
Why does FamilySearch exist?
Visit http://mormon.org/family-history/
5
Family Tree Data
Family Tree:
• 900M+ Person Records, Open-Edit
• 500M+ Relationships, Open-Edit
• 8.4B Change Log Entries, ~1M / day
• 7TB in Cassandra (13TB in Oracle)
• Dynamic OLTP system
• Data-dependent performance issues
6
Family Tree: Example 9 Gen Pedigree
• up to 511 person slots
• Dynamic content
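The 511 figure is what a full pedigree chart gives if the root person counts as generation 1 and every person has two parents: 1 + 2 + 4 + ... + 256 = 2^9 - 1. A small sketch of that arithmetic follows; the class and method names are illustrative and not from the talk.

```java
// Illustrative only: counts person slots in a full pedigree chart,
// assuming the root person is generation 1 and each person has two parents.
final class PedigreeSlots {
    static long slots(int generations) {
        // 1 + 2 + 4 + ... + 2^(generations-1) = 2^generations - 1
        return (1L << generations) - 1;
    }

    public static void main(String[] args) {
        System.out.println(slots(9)); // prints 511
    }
}
```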
7
Family Tree: Example Pedigree App
• 31+ persons per section
• Dynamic content
8
Family Tree: Example Ancestor Page
• 10+ persons in families
• 100-1000+ changes
• Dynamic content
9
Cassandra Reimplementation
• Event-Sourced Data Model – journal / views
• New Data Model – no indexes
• New Consistency Model – satisfies consistency
[Diagram labels: P1 with JE #8 and P1 Views A, B; P2 with JE #6 and P2 Views A, B]
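As a rough illustration of the journal / views idea: in an event-sourced model, changes are appended to a journal and read-optimized views are derived by replaying them. The sketch below is a generic in-memory version under that assumption. The `JournalEntry` fields, the `Journal` class, and the shape of the view are all hypothetical and are not the FamilySearch schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Generic event-sourcing sketch; every name and field here is hypothetical.
final class JournalSketch {
    // One immutable change to a person, identified by a sequence number.
    static final class JournalEntry {
        final int seq;
        final String field;
        final String value;

        JournalEntry(int seq, String field, String value) {
            this.seq = seq;
            this.field = field;
            this.value = value;
        }
    }

    // Append-only journal for a single person.
    static final class Journal {
        private final List<JournalEntry> entries = new ArrayList<>();

        void append(String field, String value) {
            entries.add(new JournalEntry(entries.size() + 1, field, value));
        }

        int lastSeq() {
            return entries.size();
        }

        // Rebuild a view by replaying entries in order; later entries win.
        Map<String, String> buildView() {
            Map<String, String> view = new LinkedHashMap<>();
            for (JournalEntry e : entries) {
                view.put(e.field, e.value);
            }
            return view;
        }
    }

    public static void main(String[] args) {
        Journal p1 = new Journal();
        p1.append("name", "John Smith");
        p1.append("birthYear", "1850");
        p1.append("name", "John A. Smith"); // correction supersedes entry 1
        System.out.println("JE #" + p1.lastSeq() + " view=" + p1.buildView());
    }
}
```

Reads only touch the derived view, while writes append to the journal and refresh the views, which fits the read-heavy workload described in the deck.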
10
77% Reads / 23% Writes
Reads:
• LOCAL_ONE
• Simple Queries
Writes:
• LOCAL_QUORUM
• Atomic Batches
• Multiple Tables
• Multiple Rows
• Business Logic
11
A Little Optimization Goes A Long Way
28 Node Cluster
• 250,000 op/sec
• Optimized App
8 Node Cluster
• 200,000 op/sec
• Optimized App
• Row Cache
• Token Aware Driver
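Dividing each cluster's throughput by its node count makes the gain concrete. The sketch below uses only the figures on this slide; the class and method names are illustrative, and the comparison ignores any hardware differences between the two clusters, which the slide does not state.

```java
// Illustrative arithmetic on the slide's figures; names are not from the talk.
final class PerNodeThroughput {
    static double opsPerNode(long opsPerSec, int nodes) {
        return (double) opsPerSec / nodes;
    }

    public static void main(String[] args) {
        double before = opsPerNode(250_000, 28); // about 8,929 op/sec per node
        double after = opsPerNode(200_000, 8);   // 25,000 op/sec per node
        System.out.println("per-node ratio: " + (after / before)); // about 2.8
    }
}
```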
12
Test System
Cassandra (Community Ed. 2.0.5): 8 hi1.4xlarge
• 16 CPU • 61 GB RAM • 2 TB SSD • 10 Gb net
Family Tree App Servers (Datastax 2.0.0): 60 m2.2xlarge
• 4 CPU • 34 GB RAM • “moderate” net
Silk Performer Load Agents: 25 m2.xlarge
• 2 CPU • 17 GB RAM • “moderate” net
13
2x Throughput Increase
[Chart: Reads and Writes in op / sec (axis 0 to 200,000) for Defaults, Row Cache, Token Aware, concurrent_reads]
14
Row Cache = 35% More Throughput
Default Key Cache:
• Cached Disk Location
• Data From Disk Cache
• ~11ms Reads
Row Cache:
• Cached Row Contents
• ~7ms Reads
15
Configuring Row Cache
cassandra.yaml:
# Maximum size of the row cache in memory.
# Default value is 0, to disable row caching.
row_cache_size_in_mb: 32768
Enable For Each Table Explicitly:
ALTER TABLE person_view WITH caching = 'ALL';
16
90% Row Cache Hit Rate
17
Token Aware = 50% More Throughput
Default Round Robin:
• Coordinator Middleman
• Adds Network Hops
• Load On Multiple Nodes
• ~7ms
Token Aware:
• Reads From Replicas
• No Network Hops
• ~2ms
18
Configuring Token Aware
Default Load Balancing Policy:
new RoundRobinPolicy()
Better:
new TokenAwarePolicy(new RoundRobinPolicy())
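To see why token awareness removes a hop, consider how often a round-robin coordinator happens to be a replica for the requested key. The sketch below is a simplified model and not driver code. It assumes 8 nodes (as in the test cluster), a replication factor of 3 (an assumption, since the talk does not state it), and replicas placed on consecutive nodes of the ring. All names are illustrative.

```java
import java.util.Random;

// Simplified routing model; not the DataStax driver's implementation.
final class CoordinatorHops {
    static final int NODES = 8; // size of the test cluster in the talk
    static final int RF = 3;    // ASSUMPTION: replication factor is not given in the talk

    // Simplified placement: a key's replicas are its owner and the next RF-1 nodes.
    static boolean isReplica(int node, int owner) {
        return Math.floorMod(node - owner, NODES) < RF;
    }

    // Fraction of requests whose coordinator already holds the requested row.
    static double localFraction(boolean tokenAware, int requests, long seed) {
        Random rnd = new Random(seed);
        int local = 0;
        for (int i = 0; i < requests; i++) {
            int owner = rnd.nextInt(NODES); // node that owns the key's token
            // Round robin ignores the key; token aware routes straight to the owner.
            int coordinator = tokenAware ? owner : i % NODES;
            if (isReplica(coordinator, owner)) {
                local++;
            }
        }
        return (double) local / requests;
    }

    public static void main(String[] args) {
        System.out.println("round robin: " + localFraction(false, 100_000, 42L));
        System.out.println("token aware: " + localFraction(true, 100_000, 42L));
    }
}
```

Under these assumptions round robin lands on a replica roughly 3 times in 8, so most reads pay an extra coordinator-to-replica hop, while the token-aware choice always reaches a replica directly.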
19
concurrent_reads = 5% More Throughput
Defaults:
concurrent_reads: 32
concurrent_writes: 32
native_transport_max_threads: 128
Improved:
concurrent_reads: 256
concurrent_writes: 256
native_transport_max_threads: 256
20
Now Where’s The Bottleneck?
• 181,000 reads/sec; 21,000 writes/sec
• CPU = 80%
• Network = 10%
• Disk < 5%
21
Network Mystery: C* ≤ 800Mb
C* Never Exceeded 800Mb On 10Gb Network!
22
Network Mystery: Cyclic Net Queues
• About 5 Second Cycle of Net Queues Backing Up
• Client Machines Seemed OK
• Tweaking Network Stack Had No Impact:
  • net.core.wmem_max
  • net.core.rmem_max
  • net.ipv4.tcp_wmem
  • net.ipv4.tcp_rmem
  • net.core.somaxconn
  • net.core.netdev_max_backlog
  • net.ipv4.tcp_tw_recycle
  • net.ipv4.tcp_max_syn_backlog
  • net.ipv4.ip_local_port_range
  • txqueuelen
23
Network Mystery: Cyclic Net Queues
Send-Qs Backup!
24
Network Mystery: Cyclic Net Queues
Recv-Qs Backup!
25
Network Mystery: Cyclic Net Queues
Somewhat Normal – Then Starts Again!
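One way to watch for this kind of cyclic backup is to sample socket queue depths at a short interval and total them. The sketch below sums the Recv-Q and Send-Q columns from `netstat -tn`-style lines. It assumes the common Linux column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), and the sample lines are made up for illustration, not taken from the talk.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative parser for netstat-style output; names and sample data are hypothetical.
final class QueueDepth {
    // Returns {totalRecvQ, totalSendQ} summed over all TCP socket rows.
    static long[] sumQueues(List<String> lines) {
        long recv = 0;
        long send = 0;
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            // Skip headers and anything that is not a tcp/tcp6 socket row.
            if (cols.length < 6 || !cols[0].startsWith("tcp")) {
                continue;
            }
            recv += Long.parseLong(cols[1]);
            send += Long.parseLong(cols[2]);
        }
        return new long[] {recv, send};
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "Proto Recv-Q Send-Q Local Address           Foreign Address         State",
            "tcp        0  52144 10.0.0.5:9042           10.0.1.17:51544         ESTABLISHED",
            "tcp     8192      0 10.0.0.5:9042           10.0.1.18:40122         ESTABLISHED");
        long[] totals = sumQueues(sample);
        System.out.println("Recv-Q=" + totals[0] + " Send-Q=" + totals[1]);
    }
}
```

Logging these two totals once per second would make a roughly 5 second cycle visible as alternating peaks in Send-Q and Recv-Q.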
26
2x Throughput Increase
[Chart: Reads and Writes in op / sec (axis 0 to 200,000) for Defaults, Row Cache, Token Aware, concurrent_reads]
27
Contact Info
Michael Nelson
Development [email protected]
Thanks to FamilySearch team!
Thanks to the awesome presenters & organizers at #CassandraSummit!