Cassandra and docker
Who am I and what do I do?
• Ben Bromhead
• Co-founder and CTO of Instaclustr -> www.instaclustr.com
• Instaclustr provides Cassandra-as-a-Service in the cloud.
• Currently support AWS, Azure, Heroku and Softlayer with more to come.
• 700+ nodes
Objectives
• A quick intro on docker (for the Cassandra folk).
• Our docker story
• Working with Cassandra and docker.
• Running C* in a constrained env w/ docker
• Listen to my astonishment at all the progress docker has made since I last gave this talk
Why docker matters
• Finally Developers have a solution to build once and deploy anywhere
• Finally Ops/Admin has a solution to configure anywhere
• Finally DevOps is easy
• Dev == Test == Staging == Production
• Move with speed
Docker, how it works.
• Runs anywhere (Linux kernel 2.6.32+)
• Uses containers, which behave like lightweight VMs:
• Own process space (namespace)
• Process isolation and resource control (cgroups)
• Own network adapter
• Own filesystem (chroot)
• Linux Analog to Solaris Zones, *BSD jails
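On a Linux host, the namespaces and cgroups a process belongs to are visible under /proc. A minimal sketch, assuming a Linux /proc filesystem; show_isolation is a made-up helper name, not a Docker command:

```shell
#!/bin/sh
# Print the namespaces and cgroups of a process (default: this shell).
# show_isolation is a hypothetical helper, not part of Docker.
show_isolation() {
  pid="${1:-$$}"
  if [ -d "/proc/$pid/ns" ]; then
    echo "namespaces for pid $pid:"
    ls -l "/proc/$pid/ns"
    echo "cgroups for pid $pid:"
    cat "/proc/$pid/cgroup" 2>/dev/null || echo "(cgroup info unavailable)"
  else
    echo "no /proc/$pid/ns here (not Linux, or no such process)"
  fi
}
```

Running it on the host and again inside a container should show different namespace IDs for the same kinds of namespace.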
Docker, how it works.
• What about the packaging component?
• Uses Union filesystem to create a git like workflow around your deployed code:
[Diagram: Docker Engine pushes App A and App Δ (on top of Bins/Libs) to the Docker container image registry. A host running A wants to upgrade to A''. It requests the update and gets only the diffs. The host is now running A''.]
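The build-and-push side of that workflow might look roughly like this. It is a sketch only: the registry, repository name and tag are hypothetical, and the DOCKER variable exists just so the function can be dry-run with DOCKER=echo.

```shell
#!/bin/sh
# Hypothetical repository; substitute your own registry and image name.
IMAGE="${IMAGE:-registry.example.com/myorg/cassandra}"

# Build an image from the Dockerfile in the current directory, then push it.
# Layers the registry already holds are not uploaded again.
publish_image() {
  tag="$1"
  ${DOCKER:-docker} build -t "$IMAGE:$tag" . &&
  ${DOCKER:-docker} push "$IMAGE:$tag"
}
```

For example, `publish_image v2` would build and push the `v2` tag, and `DOCKER=echo publish_image v2` only prints the two commands.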
Why we started using Docker
• We are super duper big fans of the “Immutable server” concept
• Once it’s deployed you don’t touch it
• No config management, no chef, no puppet etc
• Seed at boot and be done with it
Why we started using Docker
• Before Docker, we built AMIs in Amazon
• A new AMI for every deploy, version etc
• This meant we cycled our entire fleet of instances constantly
• Which is fine for some, but we work with persistent data
• Sooo much time streaming from replicas/copying backups from S3
Why we started using Docker
• Docker images solved this for us
• Treat the host as a sterile environment
• Everything in a few docker containers which we can simply update
• Cycle the docker container instead of the AMI
• Yes… docker was primarily a package management tool for us
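Cycling a container rather than the whole instance could be sketched as below. This is not Instaclustr's actual tooling: the function name, the 120 second stop timeout and the DOCKER dry-run variable are assumptions for illustration.

```shell
#!/bin/sh
# Replace a running container with one started from a newer image.
# cycle_container is a hypothetical helper; the timeout is an arbitrary choice.
cycle_container() {
  name="$1"
  image="$2"
  # Fetch the new image first; only missing layers are downloaded.
  ${DOCKER:-docker} pull "$image" || return 1
  # stop sends SIGTERM, then SIGKILL once the timeout (seconds) expires.
  ${DOCKER:-docker} stop -t 120 "$name"
  ${DOCKER:-docker} rm "$name"
  ${DOCKER:-docker} run -d --name "$name" "$image"
}
```

The host and its data stay where they are, so nothing has to be streamed from replicas or restored from backups.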
Docker at Instaclustr
• So how do we get on board the hype train (an established devops practice) without killing performance or stability?
• Ran in dev to get comfortable with it, then non-critical systems.
• Talked to others who use it in production
• https://github.com/docker/docker/issues - https://docs.docker.com/ You will spend a lot of time here
Docker & Cassandra - Networking
• 1st trial, throughput dropped in half!
• Writes sucked, streaming sucked, what was going on?
• Quick check with iperf showed a 50% hit in throughput
Docker & Cassandra - Networking
• Docker uses Linux Ethernet bridges for basic software-defined routing. This will hose your network throughput (2014).
• Use the host network stack instead (--net=host), 0% impact on Cassandra throughput (iperf still showed minor overhead)
• Also solves NAT issues in an AWS like networking environment.
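A sketch of starting Cassandra on the host network stack. The image name is hypothetical, and DOCKER is only there so the command can be printed with DOCKER=echo instead of run.

```shell
#!/bin/sh
# Hypothetical image name; substitute your own Cassandra image.
IMAGE="${IMAGE:-example/cassandra:2.1}"

# --net=host shares the host's network stack, so traffic skips the
# docker0 bridge and Docker's NAT.
start_cassandra_host_net() {
  ${DOCKER:-docker} run -d --name cassandra --net=host "$IMAGE"
}

# To compare throughput yourself, run "iperf -s" on one node and
# "iperf -c <server-ip>" on another, with and without --net=host.
```

With host networking, Cassandra binds directly to the instance's own addresses, so no port mappings are needed.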
Docker & Cassandra + Filesystem
• The filesystems (AUFS, BTRFS etc) that bring great benefits to Docker's workflow around building and snapshotting containers are not very good for databases.
• You also need to keep your C* data, commitlogs & caches in a Docker volume mount for persistence.
• UnionFS (AUFS) is terrible for writing lots of big files.
• BTRFS is a pain to use from an ops point of view. Terrible
• Hooray volume mounts use the underlying filesystem. Put cassandra data dir on a volume mount with a decent fs (e.g. xfs)
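A sketch of putting the Cassandra data directory on a volume mount. The host path is hypothetical and assumed to sit on an xfs filesystem; /var/lib/cassandra is assumed to be where the image keeps data, commitlogs and saved caches, so check your own image.

```shell
#!/bin/sh
# Hypothetical image name; substitute your own Cassandra image.
IMAGE="${IMAGE:-example/cassandra:2.1}"

# Bind-mount a host directory over the container's Cassandra directory.
# Writes then go to the host filesystem, not the union filesystem.
start_cassandra_with_volume() {
  host_dir="${1:-/mnt/cassandra}"   # assumed to live on xfs
  ${DOCKER:-docker} run -d --name cassandra \
    -v "$host_dir:/var/lib/cassandra" "$IMAGE"
}
```

Because the data lives on the host, it survives the container being removed and replaced.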
Docker + Process Capabilities
• Docker by default drops all process capabilities except the minimum needed to start.
• https://github.com/docker/docker/blob/master/oci/defaults_linux.go#L64-L79
Docker + Process Capabilities
• Cassandra needs to pin files to memory using mlockall, otherwise things get sloooow.
• mlockall is a system call, and locking memory requires a process capability.
• A process needs CAP_IPC_LOCK & a suitable RLIMIT_MEMLOCK in order to perform this operation. By default docker doesn't assign these to a running container…
• Can use --privileged and be done with it. Kind of lazy though
• Use --cap-add instead
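A sketch of the narrower option: add only the IPC_LOCK capability and raise the memlock limit, without reaching for --privileged. The image name is hypothetical; an unlimited memlock (-1) is one possible choice, not a recommendation from the talk.

```shell
#!/bin/sh
# Hypothetical image name; substitute your own Cassandra image.
IMAGE="${IMAGE:-example/cassandra:2.1}"

# --cap-add=IPC_LOCK grants CAP_IPC_LOCK only.
# --ulimit memlock=-1:-1 sets soft and hard RLIMIT_MEMLOCK to unlimited.
start_cassandra_memlock() {
  ${DOCKER:-docker} run -d --name cassandra \
    --cap-add=IPC_LOCK --ulimit memlock=-1:-1 "$IMAGE"
}
```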
Docker + SIGTERM propagation
• When stopping the process docker will send a SIGTERM.
• Some interpreted languages treat PID 1 differently. E.g. Python/Bash do not have default signal handlers when running as PID 1.
• Bad if you use a bash script to launch Cassandra
• Java to the rescue!
• Make sure you run the cassandra bash script with -f (foreground)
• exec causes the JVM to replace the bash process… making the world a happier place
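An entrypoint along these lines keeps a shell from sitting between Docker and Cassandra. It is a sketch: CASSANDRA_BIN is a hypothetical override added for illustration, defaulting to whatever cassandra launcher is on the PATH.

```shell
#!/bin/sh
# Entrypoint sketch. CASSANDRA_BIN is a hypothetical override.
run_cassandra() {
  # -f keeps Cassandra in the foreground.
  # exec replaces this shell with the launcher instead of forking a child,
  # so the launched process keeps the shell's PID and gets signals directly.
  exec "${CASSANDRA_BIN:-cassandra}" -f "$@"
}
```

Calling `run_cassandra "$@"` as the last line of the container's entrypoint script hands the process over; nothing after that line would run.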
Docker + SIGTERM propagation
• Tools like OpsCenter Server will have trouble with this.
• Can be fixed using a wacky combination of trap and wait stanzas in your OpsCenter Server script (see http://veithen.github.io/2014/11/16/sigterm-propagation.html)
• But now you have a bash script that duplicates init/systemd/supervisord
• The debate rages on…
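The trap and wait combination looks roughly like the sketch below, written in the spirit of the linked article and not copied from it. run_with_forwarding is a hypothetical helper name.

```shell
#!/bin/sh
# Run a command as a child and relay SIGTERM/SIGINT to it as SIGTERM.
# run_with_forwarding is a hypothetical helper name.
run_with_forwarding() {
  "$@" &
  child=$!
  trap 'kill -TERM "$child" 2>/dev/null' TERM INT
  # wait returns early, with a status above 128, if a trapped signal arrives.
  wait "$child"
  exit_code=$?
  trap - TERM INT
  # If the child is still shutting down, wait again for its own exit status.
  if kill -0 "$child" 2>/dev/null; then
    wait "$child"
    exit_code=$?
  fi
  return "$exit_code"
}
```

Something like `run_with_forwarding /path/to/server --foreground` would be the last line of the wrapper script, which is exactly the init-like duplication the slide complains about.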
Docker + CoreOS
• Docker + fav OS + CM?, CoreOS + etcd, Swarm + Machine, Deis etc
• We chose CoreOS (Appeared to be sane, etcd is cool, systemd if you are into that kind of thing)
• Docker (the company) now does their own thing… did you know they now call Docker… Docker Engine… who’d have thunk.
Docker + CoreOS
• Disable automatic updates + restarts (seriously do this)
• Fix logging, otherwise you will log to 3 locations (/var/log/cassandra, journalctl and docker's JSON-based log)
• JVM will exit with error 143 (128 + 15 for SIGTERM). Need to ignore that in your systemd service definition.
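A sketch of a systemd service definition that treats exit code 143 as a clean stop, printed from a shell function so it can be piped into a unit file. The unit layout, container name and image name are assumptions, not the definition used at Instaclustr.

```shell
#!/bin/sh
# Print a systemd unit sketch. Container and image names are hypothetical.
print_cassandra_unit() {
  cat <<'EOF'
[Unit]
Description=Cassandra in Docker (sketch)
After=docker.service
Requires=docker.service

[Service]
# Remove any stale container; the leading "-" ignores a failure here.
ExecStartPre=-/usr/bin/docker rm -f cassandra
ExecStart=/usr/bin/docker run --name cassandra --net=host example/cassandra:2.1
ExecStop=/usr/bin/docker stop cassandra
# The JVM exits with 143 (128 + 15) on SIGTERM; count that as success.
SuccessExitStatus=143
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}
```

Without SuccessExitStatus=143, systemd would record every ordinary stop as a failure, and Restart=on-failure would bring the container straight back up.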
Docker + Dev Env
• Docker relies on Linux kernel capabilities… so no native docker in OS X
• We use OSX for dev, so we run vagrant and the CoreOS vagrant file
• Install Docker userland tools in OS X and forward ports to the vagrant box running CoreOS
• Our env is a little strange, we run a single cassandra instance on a single CoreOS vm.
• Docker for mac now uses a lighter weight virtualisation layer native to OSX.
• Look at https://github.com/tobert/cassandra-docker for full dockerisation!
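Pointing the OS X Docker client at the Vagrant box can be as small as setting DOCKER_HOST. This sketch assumes Vagrant forwards the Docker daemon's TCP port to localhost; 2375 is the conventional unencrypted port, but the port your Vagrantfile exposes is an assumption to check.

```shell
#!/bin/sh
# Point the local docker client at a daemon reachable over a forwarded port.
# use_vagrant_docker is a hypothetical helper; 2375 is an assumed default.
use_vagrant_docker() {
  port="${1:-2375}"
  export DOCKER_HOST="tcp://127.0.0.1:$port"
}
```

After `use_vagrant_docker`, commands such as `docker ps` in the same shell talk to the daemon inside the CoreOS VM.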
Docker + C* + Dev Env
• How do I run lots of C* instances on a VM or my dev laptop without it falling over?
• Backwards performance tuning!
• Make it run as slowly, but as stable as possible!
Docker + C* + Dev Env
• Set Memory to be super low (you can go higher than this), edit your cassandra-env.sh:
MAX_HEAP_SIZE="128M"
HEAP_NEWSIZE="24M"
Docker + C* + Dev Env
• Tune compaction to have free rein and to smash the disk
concurrent_compactors: 1
in_memory_compaction_limit_in_mb: 2
compaction_throughput_mb_per_sec: 0
Docker + C* + Dev Env
• Let’s use HSHA thrift server as it reduces the memory per thread used.
rpc_server_type: hsha
Docker + C* + Dev Env
• The HSHA server also lets us limit the number of threads serving in flight requests, but still have a large number of clients connected.
concurrent_reads: 4
concurrent_writes: 4
rpc_min_threads: 2
rpc_max_threads: 2
• You can play with these to get the right numbers based on how your clients connect, but keep them low.
Docker + C* + Dev Env
• This is Dev! Caches have no power here!
key_cache_size_in_mb: 0
reduce_cache_sizes_at: 0
reduce_cache_capacity_to: 0
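The cassandra.yaml settings from these slides could be applied to a dev image with a small helper like the one below. It is a sketch: set_yaml_option is a made-up name, it only handles simple top-level `key: value` lines, and it assumes neither key nor value contains a `|` character.

```shell
#!/bin/sh
# Set a top-level "key: value" line in a YAML file, appending it if absent.
# set_yaml_option is a hypothetical helper, not part of Cassandra.
set_yaml_option() {
  file="$1"
  key="$2"
  value="$3"
  if grep -q "^${key}:" "$file"; then
    # Rewrite the existing line through a temp file (portable, no sed -i).
    sed "s|^${key}:.*|${key}: ${value}|" "$file" > "${file}.tmp" &&
      mv "${file}.tmp" "$file"
  else
    printf '%s: %s\n' "$key" "$value" >> "$file"
  fi
}
```

For example, `set_yaml_option conf/cassandra.yaml concurrent_reads 4` followed by `set_yaml_option conf/cassandra.yaml rpc_server_type hsha`, where the path to cassandra.yaml depends on your install.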
Docker + C* + Dev Env
• How well does this work?!?!
• Will survive running the insane workload in the new C* 2.1 stress test tool.
• We run this on AWS t2.small instances
• Sign up at https://www.instaclustr.com and give our new Developer nodes a spin!