Cassandra Day NY 2014: From Proof of Concept to Production

description

This talk covers how to load test your Cassandra cluster against your application's schema, along with other best practices for gaining confidence in your Cassandra deployment before you run in production.

Transcript of Cassandra Day NY 2014: From Proof of Concept to Production

Page 1: Cassandra Day NY 2014: From Proof of Concept to Production

©2014 DataStax

@tjake

T Jake Luciani

Apache Cassandra Committer & PMC

Proof-of-Concept to Production

Page 2: Cassandra Day NY 2014: From Proof of Concept to Production

The way we build software

1. Proof of Concept
2. ??
3. Production
4. Profit!

Do Nothing!

Preparation!

Development

Testing Performance

Operations

Monitoring

Page 3: Cassandra Day NY 2014: From Proof of Concept to Production

©2013 DataStax Confidential. Do not distribute without consent.

Cassandra Preparation

• Going to production with C*, you must validate your assumptions and have a plan for when you:
  • lose nodes, disks, networks
  • have spikes of traffic
  • need to add more nodes
  • upgrade Cassandra, Java, hardware
  • …

Plan for all the nightmare scenarios. This gives you confidence in your system.

Page 4: Cassandra Day NY 2014: From Proof of Concept to Production

Before we begin

• Be comfortable on the command line!
  • When something is going wrong you need to be able to get to the problem quickly and ask the right questions. Provide diagnostic information.
  • cassandra: nodetool, cqlsh
  • disk: iostat
  • cpu: top/htop
  • network: iftop
  • java: jstatd, jstack, jmx, visualvm (ok, not command line)
• cssh (csshx on OS X)

Page 5: Cassandra Day NY 2014: From Proof of Concept to Production

Phase 1: Data Modeling

• You’ve modeled your application in Cassandra
• You’ve de-normalized based on queries
• Stop. Stress test time…
  • C* 2.1 native CQL stress tool (works with 2.0)
  • CASSANDRA-6164
  • https://github.com/tjake/cassandra/archive/6164.zip

Page 6: Cassandra Day NY 2014: From Proof of Concept to Production

CQL Stress tool

• Why? Because you can push your cluster to the limit and see how *your* queries run on *your* hardware
• cassandra-stress write -schema yaml=my.yaml
• cassandra-stress read -schema yaml=my.yaml query=simple1

Page 7: Cassandra Day NY 2014: From Proof of Concept to Production

CQL Stress

YAML File + Demo

Page 8: Cassandra Day NY 2014: From Proof of Concept to Production

Drain Dump

Page 9: Cassandra Day NY 2014: From Proof of Concept to Production

Hardware

• Currently C* isn’t well suited for > 1TB per node
  • Except DSE Hadoop nodes, which can be much larger
• Ideally 1U or smaller (blades)
  • separate network, power, disk
• If you have larger machines
  • VMs with disk per VM
  • Containers?
• On EC2, use I2 instances
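
The 1TB-per-node guideline above turns into a rough node count with a little arithmetic. The sketch below is only an illustration: the function name, the replication factor of 3, and the 50% headroom figure are assumptions, not numbers from the talk.

```python
import math


def nodes_needed(raw_data_tb, replication_factor=3, max_tb_per_node=1.0, headroom=0.5):
    """Estimate how many nodes keep each node under max_tb_per_node.

    headroom is the fraction of the per-node budget we plan to fill
    (an assumed safety margin for compaction and growth).
    """
    total_tb = raw_data_tb * replication_factor
    usable_tb_per_node = max_tb_per_node * headroom
    # Never size below the replication factor: each replica needs its own node.
    return max(replication_factor, math.ceil(total_tb / usable_tb_per_node))


# 4 TB of raw data, 3 replicas, nodes filled to half of the 1 TB guideline.
print(nodes_needed(4))
```

Treat the result as a starting point for a stress test, not a replacement for one.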

Page 10: Cassandra Day NY 2014: From Proof of Concept to Production

Unix level stuff

• turn off swap
• turn off cpuspeed
• switch to deadline kernel scheduler
• socket buffers resize
• install numactl
• raise limits.conf esp. (nofile and
• stress your disks using something like bonnie++ to get an idea of the raw limits
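
Two of the settings above (swap off, deadline scheduler) can be verified from files Linux already exposes. This is a minimal sketch, assuming you pass in the text of /proc/swaps and of /sys/block/<device>/queue/scheduler; the function names are hypothetical and the sample strings are made up for illustration.

```python
def swap_devices(proc_swaps_text):
    """Return the swap areas listed in the text of /proc/swaps."""
    lines = proc_swaps_text.strip().splitlines()
    # The first line is a column header; each remaining line is one swap area.
    return [line.split()[0] for line in lines[1:] if line.strip()]


def active_scheduler(scheduler_text):
    """Return the active I/O scheduler; the kernel marks it with [brackets]."""
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def check_host(proc_swaps_text, scheduler_text):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    devices = swap_devices(proc_swaps_text)
    if devices:
        problems.append("swap is on: " + ", ".join(devices))
    scheduler = active_scheduler(scheduler_text)
    if scheduler != "deadline":
        problems.append("I/O scheduler is %s, expected deadline" % scheduler)
    return problems


# Made-up sample contents, standing in for the real files on a node.
sample_swaps = "Filename Type Size Used Priority\n/dev/sda2 partition 2097148 0 -1\n"
sample_sched = "noop deadline [cfq]"
for problem in check_host(sample_swaps, sample_sched):
    print(problem)
```

On a real node you would read the two files and pass their contents in; keeping the parsing separate from the file access makes the check easy to run from a deployment tool.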

Page 11: Cassandra Day NY 2014: From Proof of Concept to Production

Deployment

• Chef/Puppet/Ansible/etc.
• Simpler rollout and rollback
• You should release your artifacts to a central location
• Do this for Cassandra too
  • Makes upgrades easier

Page 12: Cassandra Day NY 2014: From Proof of Concept to Production

Monitoring

• Stress your system and learn where it breaks down
  • Use that to create your alerts
• Know your SLAs
  • Define them at each layer of your architecture
• OpsCenter for all things C*
• You can also easily integrate C* metrics into other metrics systems
  • http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2

Page 13: Cassandra Day NY 2014: From Proof of Concept to Production

C* Monitoring

• C*-specific things to monitor
  • pending compactions
  • exception count
  • disk space
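
One way to turn the three metrics above into alerts is a plain threshold check. The sketch below is illustrative only: the metric names, the threshold values, and the function itself are assumptions, and real limits should come from where your own stress tests showed the system breaking down.

```python
def evaluate_alerts(metrics, thresholds):
    """Return one message per metric that is missing or at/above its limit."""
    alerts = []
    for name, limit in sorted(thresholds.items()):
        value = metrics.get(name)
        if value is None:
            # A metric that stopped reporting is itself worth an alert.
            alerts.append("%s: no data" % name)
        elif value >= limit:
            alerts.append("%s: %s >= %s" % (name, value, limit))
    return alerts


# Hypothetical limits; derive yours from stress testing.
thresholds = {
    "pending_compactions": 100,
    "exception_count": 1,
    "disk_used_fraction": 0.5,
}

# Hypothetical sample, e.g. values scraped from JMX or a metrics reporter.
sample = {"pending_compactions": 250, "exception_count": 0}

for alert in evaluate_alerts(sample, thresholds):
    print(alert)
```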

Page 14: Cassandra Day NY 2014: From Proof of Concept to Production

Cassandra Ops

• Understand operational basics like:
  • bootstrapping
  • repair
  • rebuild
  • scrub

Page 15: Cassandra Day NY 2014: From Proof of Concept to Production

Choose your own consistency

• When things go wrong you are in control
  • Build consistency controls into your application
  • In a pinch you can lower consistency and stay available
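
A consistency control in the application can be as simple as retrying at a lower level when the preferred one cannot be met. The sketch below is driver-agnostic: `Unavailable`, `execute_with_fallback`, and `fake_run_query` are hypothetical names, and a real client library raises its own exception types. Only the level names QUORUM and ONE are Cassandra's.

```python
class Unavailable(Exception):
    """Stand-in for a driver error meaning too few replicas answered."""


def execute_with_fallback(run_query, levels=("QUORUM", "ONE")):
    """Try each consistency level in order; return (level_used, result)."""
    if not levels:
        raise ValueError("at least one consistency level is required")
    last_error = None
    for level in levels:
        try:
            return level, run_query(level)
        except Unavailable as err:
            # Remember the failure and fall through to the next, weaker level.
            last_error = err
    raise last_error


def fake_run_query(level):
    """Simulated query against a cluster where only one replica is alive."""
    if level == "QUORUM":
        raise Unavailable("need 2 replicas, 1 alive")
    return ["row"]


print(execute_with_fallback(fake_run_query))
```

Returning the level that was actually used lets the caller log or flag the downgrade, since a read at ONE may be stale.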

Page 16: Cassandra Day NY 2014: From Proof of Concept to Production

Backups

• Backups in C* are primarily to avoid human error
• C* provides lightweight local snapshots
• Traditional full backup of data in C* is hard to do
  • Your data needs to be de-duped since each node's files contain data from many replicas
• If you need a full traditional backup, you are best off doing full machine backups
• At a minimum, back up system tables (in case you lose the entire box)

Page 17: Cassandra Day NY 2014: From Proof of Concept to Production

Cassandra upgrades

• Read the release notes! NEWS.txt
• Read the change log! CHANGES.txt
• Understand the changes and how they impact your system
• Do this even if you don’t plan on upgrading.
  • Someone else may have fixed a potential issue for you.
• Always snapshot your data before upgrading

Page 18: Cassandra Day NY 2014: From Proof of Concept to Production

Canary node

• When rolling out a new version of C* or your application, roll it out only to a single node and watch it
  • Quickly see if something is terribly wrong
  • Gives you the ability to verify new functionality before full rollout

Page 19: Cassandra Day NY 2014: From Proof of Concept to Production

Pre-Prod Environments

• Hard to do in large scale systems
• Requires work like replaying traffic to a second cluster
• Doesn’t need to be 1:1, but should offer a subset of real data to test with

Page 20: Cassandra Day NY 2014: From Proof of Concept to Production

C* level stuff

• cassandra.yaml
  • Use stress to size your write and read pools
  • internode_compression: dc
  • lower request timeouts (improves tail latency)
  • set concurrent compactors to 1/4 your cores
  • in 2.1 we have off-heap memtables
• Turn on Authentication
  • Keeps you/apps from accidentally connecting to prod
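
The "1/4 your cores" rule for compactors is easy to compute per host. A minimal sketch, assuming integer division and a floor of one compactor (both choices are assumptions, as is the function name); the output is formatted like the `concurrent_compactors` line in cassandra.yaml.

```python
import os


def suggested_concurrent_compactors(cores=None):
    """Return a quarter of the core count, but never less than one."""
    if cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        cores = os.cpu_count() or 1
    return max(1, cores // 4)


# Example for a 16-core machine.
print("concurrent_compactors: %d" % suggested_concurrent_compactors(16))
```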

Page 21: Cassandra Day NY 2014: From Proof of Concept to Production

Thanks!

Questions?

@tjake