Cassandra Day NY 2014: From Proof of Concept to Production

Transcript
Page 1: Cassandra Day NY 2014: From Proof of Concept to Production

©2014 DataStax

@tjake

T Jake Luciani
Apache Cassandra Committer & PMC

Proof-of-Concept to Production

Page 2: Cassandra Day NY 2014: From Proof of Concept to Production

The way we build software

1. Proof of Concept
2. ??
3. Production
4. Profit!


Do Nothing!

Preparation!

Development

Testing Performance

Operations

Monitoring

Page 3: Cassandra Day NY 2014: From Proof of Concept to Production

©2013 DataStax Confidential. Do not distribute without consent.

Cassandra Preparation

• Going to production with C*, you must validate your assumptions and have a plan for when you:
  • lose nodes, disks, networks
  • have spikes of traffic
  • need to add more nodes
  • upgrade Cassandra, Java, hardware
  • …


Plan for all the nightmare scenarios. This gives you confidence in your system

Page 4: Cassandra Day NY 2014: From Proof of Concept to Production


Before we begin

• Be comfortable on the command line!
• When something is going wrong, you need to be able to get to the problem quickly and ask the right questions. Provide diagnostic information.
  • cassandra: nodetool, cqlsh
  • disk: iostat
  • cpu: top/htop
  • network: iftop
  • java: jstatd, jstack, jmx, visualvm (ok, not command line)
• cssh (csshx on OS X)

Page 5: Cassandra Day NY 2014: From Proof of Concept to Production

Phase 1: Data Modeling

• You’ve modeled your application in Cassandra
• You’ve de-normalized based on queries

• Stop. Stress test time…
  • C* 2.1 native CQL stress tool (works with 2.0)
  • CASSANDRA-6164
  • https://github.com/tjake/cassandra/archive/6164.zip

Page 6: Cassandra Day NY 2014: From Proof of Concept to Production

CQL Stress tool

• Why? Because you can push your cluster to the limit and see how *your* queries run on *your* hardware
• cassandra-stress write -schema yaml=my.yaml
• cassandra-stress read -schema yaml=my.yaml query=simple1

Page 7: Cassandra Day NY 2014: From Proof of Concept to Production

CQL Stress


YAML File + Demo

Page 8: Cassandra Day NY 2014: From Proof of Concept to Production


Drain Dump

Page 9: Cassandra Day NY 2014: From Proof of Concept to Production


Hardware

• Currently C* isn’t well suited for > 1TB per node
  • Except DSE Hadoop nodes, which can be much larger
• Ideally 1U or smaller (blades)
  • separate network, power, disk
• If you have larger machines
  • VMs with disk per VM
  • Containers?
• EC2: use I2 instances

Page 10: Cassandra Day NY 2014: From Proof of Concept to Production


Unix level stuff

• turn off swap
• turn off cpuspeed
• switch to deadline kernel scheduler
• resize socket buffers
• install numactl
• raise limits.conf, esp. (nofile and
• stress your disks using something like bonnie++ to get an idea of the raw limits
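Some of the settings above can be verified from a script before a node goes live. What follows is a minimal Python sketch, assuming a Linux host that exposes /proc/swaps and /sys/block/<device>/queue/scheduler. The helper names, the default device `sda`, and the 100000 open-file floor are illustrative assumptions, not values from the talk.

```python
import resource
from pathlib import Path


def swap_enabled(proc_swaps_text: str) -> bool:
    """True if /proc/swaps lists any swap area after its header line."""
    lines = [ln for ln in proc_swaps_text.splitlines() if ln.strip()]
    return len(lines) > 1


def active_scheduler(scheduler_text: str) -> str:
    """Return the bracketed (active) entry, e.g. 'noop [deadline] cfq'."""
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return ""


def nofile_ok(minimum: int = 100_000) -> bool:
    """Check this process's soft open-file limit against an assumed floor."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= minimum


def preflight(device: str = "sda") -> dict:
    """Gather the checks for one block device (device name is an assumption)."""
    swaps = Path("/proc/swaps").read_text()
    sched = Path(f"/sys/block/{device}/queue/scheduler").read_text()
    return {
        "swap_off": not swap_enabled(swaps),
        "scheduler": active_scheduler(sched),
        "nofile_ok": nofile_ok(),
    }
```

Calling `preflight()` on each node, for example over cssh, gives a quick view of which hosts still need tuning. Note that `nofile_ok` reports the limit of the process running the script, which may differ from the limit applied to the Cassandra process.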

Page 11: Cassandra Day NY 2014: From Proof of Concept to Production

Deployment

• Chef/Puppet/Ansible/etc.
• Simpler rollout and rollback
• You should release your artifacts to a central location
• Do this for Cassandra too
  • Makes upgrades easier

Page 12: Cassandra Day NY 2014: From Proof of Concept to Production

Monitoring

• Stress your system and learn where it breaks down
  • Use that to create your alerts
• Know your SLAs
  • Define them at each layer of your architecture
• OpsCenter for all things C*
• You can also easily integrate C* metrics into other metrics systems
  • http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2

Page 13: Cassandra Day NY 2014: From Proof of Concept to Production

C* Monitoring

• C*-specific things to monitor:
  • pending compactions
  • exception count
  • disk space
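The three metrics above can feed a simple threshold check. This is a minimal Python sketch, assuming the metric values have already been collected elsewhere (for instance via JMX or a metrics reporter). The metric names, the `find_alerts` helper, and every threshold in `LIMITS` are hypothetical placeholders; real limits should come from your own stress tests.

```python
import shutil


def disk_used_fraction(path: str) -> float:
    """Fraction of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def find_alerts(metrics: dict, limits: dict) -> list:
    """Return the sorted names of metrics whose value exceeds its limit."""
    return sorted(
        name for name, limit in limits.items()
        if metrics.get(name, 0) > limit
    )


# Hypothetical thresholds for illustration only.
LIMITS = {
    "pending_compactions": 20,
    "exception_count": 0,
    "disk_used_fraction": 0.5,
}
```

A collector could build the `metrics` dict once per interval, using `disk_used_fraction` on the data directory for the disk entry, and page someone whenever `find_alerts` returns a non-empty list.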

Page 14: Cassandra Day NY 2014: From Proof of Concept to Production

Cassandra Ops

• Understand operational basics like:
  • bootstrapping
  • repair
  • rebuild
  • scrub

Page 15: Cassandra Day NY 2014: From Proof of Concept to Production

Choose your own consistency

• When things go wrong, you are in control
• Build consistency controls into your application
• In a pinch, you can lower consistency and stay available
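One way to build such a control into an application is to retry a request at a lower consistency level when the preferred one cannot be met. The sketch below is illustrative Python and does not use any real driver API: `UnavailableError` is a local stand-in for whatever exception your driver raises when too few replicas are alive, and `run_query` is a caller-supplied function that executes the request at a given level.

```python
class UnavailableError(Exception):
    """Stand-in for a driver's 'not enough live replicas' error (assumption)."""


def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM request."""
    return replication_factor // 2 + 1


def execute_with_fallback(run_query, levels=("QUORUM", "ONE")):
    """Try each consistency level in order; return (level_used, result)."""
    if not levels:
        raise ValueError("at least one consistency level is required")
    last_error = None
    for level in levels:
        try:
            return level, run_query(level)
        except UnavailableError as exc:
            # Not enough replicas for this level; try the next, weaker one.
            last_error = exc
    raise last_error
```

Returning the level that was actually used lets the application log or flag downgraded requests, so a weaker guarantee is never applied silently. Whether a downgrade is acceptable depends on the query, so the `levels` tuple is best chosen per use case.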

Page 16: Cassandra Day NY 2014: From Proof of Concept to Production

Backups

• Backups in C* are primarily to avoid human error
• C* provides lightweight local snapshots
• Traditional full backup of data in C* is hard to do
  • Your data needs to be de-duped, since each node’s files contain data from many replicas
• If you need a full traditional backup, you are best off doing full machine backups
• At a minimum, back up system tables (in case you lose the entire box)

Page 17: Cassandra Day NY 2014: From Proof of Concept to Production

Cassandra upgrades

• Read the release notes! NEWS.txt
• Read the change log! CHANGES.txt
• Understand the changes and how they impact your system
• Do this even if you don’t plan on upgrading.
  • Someone else may have fixed a potential issue for you.
• Always snapshot your data before upgrading
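The pre-upgrade snapshot can be scripted so that it is never skipped. This is a minimal Python sketch that wraps `nodetool snapshot -t <tag>`; it assumes `nodetool` is on the PATH of the node being upgraded, and the helper names and tag format are illustrative choices rather than anything prescribed by the talk.

```python
import subprocess
from datetime import datetime, timezone


def snapshot_command(tag: str, keyspaces=()) -> list:
    """Build a `nodetool snapshot` invocation with a named tag."""
    return ["nodetool", "snapshot", "-t", tag, *keyspaces]


def pre_upgrade_tag(target_version: str, now=None) -> str:
    """Name the snapshot after the version being installed and the UTC time."""
    now = now or datetime.now(timezone.utc)
    return f"pre-upgrade-{target_version}-{now:%Y%m%dT%H%M%S}"


def take_snapshot(target_version: str) -> None:
    """Run the snapshot on the local node; raises if nodetool fails."""
    command = snapshot_command(pre_upgrade_tag(target_version))
    subprocess.run(command, check=True)
```

A descriptive tag makes it easier to find and later clear the right snapshot once the upgrade has been verified.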

Page 18: Cassandra Day NY 2014: From Proof of Concept to Production

Canary node

• When rolling out a new version of C* or your application, roll it out to a single node first and watch it
  • Quickly see if something is terribly wrong
  • Gives you the ability to verify new functionality before full rollout

Page 19: Cassandra Day NY 2014: From Proof of Concept to Production

Pre-Prod Environments

• Hard to do in large-scale systems
• Requires work like replaying traffic to a second cluster
• Doesn’t need to be 1:1, but should offer a subset of real data to test with

Page 20: Cassandra Day NY 2014: From Proof of Concept to Production

C* level stuff

• cassandra.yaml
  • Use stress to size your write and read pools
  • internode_compression: dc
  • lower request timeouts (improves tail latency)
  • set concurrent compactors to 1/4 of your cores
  • in 2.1 we have off-heap memtables
• Turn on Authentication
  • Keeps you/apps from accidentally connecting to prod
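The "1/4 of your cores" rule for compactors is simple arithmetic, shown here as a small Python sketch. The function name is hypothetical, and the floor of one compactor on small machines is an added assumption; the result is intended for the `concurrent_compactors` setting in cassandra.yaml.

```python
import os


def suggested_concurrent_compactors(cores=None) -> int:
    """Quarter of the available cores, never less than one (assumed floor)."""
    cores = cores or os.cpu_count() or 1
    return max(1, cores // 4)


if __name__ == "__main__":
    # Print a line that could be pasted into cassandra.yaml.
    print(f"concurrent_compactors: {suggested_concurrent_compactors()}")
```

Treat the number as a starting point and confirm it under stress, since compaction competes with reads and writes for the same disks.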

Page 21: Cassandra Day NY 2014: From Proof of Concept to Production

Thanks!

Questions?

@tjake