Sheep it

Deploying with Consul + Docker


http://docker.com

http://consul.io

Hello!
I am Lucas Fontes

Services Team Lead at Uken Studios

I take screenshots
Spirit Animal: Left Shark

internets: @lxfontes

Uken Studios

Rails shop
HTML5 and Unity3d
~ 150 servers (300 peak)
- ~ 40 Databases
- ~ 70 App servers
- SOA with nsq.io

Static server list
Lock-step commands on all servers
HAProxy config via Chef
Snowflake servers (DB operations / scheduled tasks)

~15 clusters
Shared HAProxy cluster

Bare Metal

DBs (cpu speed / memory / ssd)
Web (cpu count / nic bonding)

Tons of unused network transfer

Monthly billing upgrades

idle hardware (redundancy)

PowerUP ~ 2013

Build - Heroku buildpacks
lxfontes/slugbuilder

Orchestration - Redis / Rails (private)

Run - Docker/lxc hosts
lxfontes/slugrunner-rb

Dynamic LB - knuckles
uken/knuckles

Amazon happened

~4 - 10 hours provisioning time

Virtual Servers meh

Softlayer -> AWS

Switch - 4 hours cut-over

Amazon. yey.
what now?

What if auto-scaling kicks in mid-deploy?
What if an instance dies mid-deploy?
Trust chef nodes will register themselves!

Static server list
Lock-step commands on all servers
HAProxy config via Chef
Snowflake servers (DB operations / scheduled tasks)

HAProxy + Chef

Make Sensu remove dead nodes

move on.

PowerUP

much BuildPacks
such Scaling
wow Dynamic LB (knuckles)

Poor adoption
Manual Database migrations
No love for buildpacks
No more idle hardware

Ops Requirements

- Not tied to AWS
- Containers / packs
  - Either straight LXC, docker, rkt
- Dynamic Load Balancing
  - ELB only for SSL termination
- Easy to troubleshoot
  - Heavy logging, unique-ids everywhere
- Configuration Decoupling
  - Not tied to chef/ansible/puppet
  - DB location, worker count
- Deploy anything
  - Not just rails

Dev Requirements

- Run pre/post deploy scripts
  - slack notifications
  - close jira tickets
- CDN friendly
- GitHub
- 'Can haz ephemeral shell?'
  - run rake tasks manually
- 'Can haz production shell?'
  - troubleshooting
- One click deploy

Out there
or… “why you should not do it yourself”

Out there

- Clusters
  - deis, kubernetes, shipbuilder, coreos, mesos
- Config management
  - chef, ansible
- Deploy tools
  - capistrano, mina, fabric, packer
- Services
  - ECS, CodeDeploy, Elastic Beanstalk
  - shippable, ezautoscaling (!?)

They all work!
Not the way we want!

Workflow

BUILD

RUN

RELEASE

BUILD

How? Dockerfile on each repo
docker build and done

Store? Docker Registry

Target Image? repo + prod|staging
registry.uken.int/uken/titans_prod

Target Version? Git revision
registry.uken.int/uken/titans_prod:cc3e68a
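The naming rule on this slide (repo + environment for the image, Git revision for the tag) can be sketched as a small helper. This is only an illustration: the function name `image_ref` and its parameters are hypothetical, and the values are the ones shown on the slide.

```python
def image_ref(registry: str, org: str, repo: str, environment: str, revision: str) -> str:
    """Compose <registry>/<org>/<repo>_<environment>:<revision>."""
    # The slide lists prod and staging as the two target environments.
    if environment not in ("prod", "staging"):
        raise ValueError(f"unknown environment: {environment!r}")
    return f"{registry}/{org}/{repo}_{environment}:{revision}"


print(image_ref("registry.uken.int", "uken", "titans", "prod", "cc3e68a"))
```

Tagging by revision means a rollback is just a release of an older tag.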

RUN

Sealed container only
docker run --rm=true

Pre release
Database Migrations / Seeds
CDN Asset upload

Post release
Ephemeral shell sessions
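A pre-release task or an ephemeral shell is then one throwaway container per command. The sketch below only builds the argument list; the helper name `one_off_command` is hypothetical and the rake task is an assumed example of a migration command, not taken from the talk.

```python
import shlex


def one_off_command(image, command, env=None, interactive=False):
    """Build a `docker run --rm` argv for a one-off task in a sealed container."""
    argv = ["docker", "run", "--rm"]
    if interactive:
        # -i keeps stdin open and -t allocates a TTY, as needed for a shell.
        argv += ["-i", "-t"]
    for key, value in sorted((env or {}).items()):
        argv += ["-e", f"{key}={value}"]
    argv.append(image)
    argv += shlex.split(command)
    return argv


print(one_off_command(
    "registry.uken.int/uken/titans_prod:cc3e68a",
    "bundle exec rake db:migrate",  # assumed example task
    env={"DB_HOST": "db1.uken.int"},
))
```

Passing the list to `subprocess.run` would execute it; `--rm` removes the container when the command exits.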

RELEASE

Upload image to docker registry

Notify hosts
registry.uken.int/uken/titans_prod:cc3e68a
Go download this version!

Docker pull + restart on hosts
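One way to do the "notify hosts" step is a single write to Consul's key/value store over its HTTP API (`PUT /v1/kv/<key>`). The sketch below only constructs the request. The key layout `titans_prod/version` follows the key/value example later in the talk, the agent address is Consul's default local HTTP port, and the helper name `release_request` is hypothetical.

```python
import urllib.request


def release_request(app, image, consul="http://127.0.0.1:8500"):
    """Build the PUT that stores the new image reference under <app>/version."""
    url = f"{consul}/v1/kv/{app}/version"
    return urllib.request.Request(url, data=image.encode("utf-8"), method="PUT")


req = release_request("titans_prod", "registry.uken.int/uken/titans_prod:cc3e68a")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it to a running Consul agent.
```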

“But Lucas… really...”

How do hosts reload?

Load balancing?

This looks complicated

Distributed Quorum
Agents on all hosts / load balancers

Key Value Pairs
titans_prod/version: registry.uken.int/uken/titans_prod:cc3e68a
titans_prod/env/DB_HOST: db1.uken.int

DNS + HTTP API
kinda important

Service Registration + Health Checks
web1.node.consul => 10.28.3.1
titans_prod.service.consul => 10.28.3.1 port 5001
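Reading the `env/` keys back is what decouples configuration from the image. Consul's KV HTTP API returns a JSON list whose `Value` fields are base64-encoded, so a consumer has to decode them. A minimal sketch, with the helper name `env_from_kv` being hypothetical and the sample payload hand-built rather than captured from a real agent:

```python
import base64
import json


def env_from_kv(payload: str, prefix: str) -> dict:
    """Turn a Consul KV JSON listing into a {NAME: value} environment dict."""
    env = {}
    for entry in json.loads(payload):
        key = entry["Key"]
        # Skip keys outside the prefix and keys stored without a value.
        if not key.startswith(prefix) or entry.get("Value") is None:
            continue
        env[key[len(prefix):]] = base64.b64decode(entry["Value"]).decode("utf-8")
    return env


sample = json.dumps([
    {"Key": "titans_prod/env/DB_HOST",
     "Value": base64.b64encode(b"db1.uken.int").decode("ascii")},
])
print(env_from_kv(sample, "titans_prod/env/"))
```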

Consul

consul-template

[Diagram: Host running docker, consul and consul-template. Steps: Detect change -> Run Template -> Pull + Run Image -> Register Service]
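The host-side steps named in the diagram can be approximated as "compare the desired version with the running one, and if they differ, pull and restart". The sketch below returns the docker commands instead of running them. It is an assumption about what a rendered template might execute, not the talk's actual script: the helper `reload_plan`, the container naming and the `PORT` variable are hypothetical, and service registration is left out.

```python
def reload_plan(app, running, desired, port):
    """Return the docker commands needed to move a host to the desired image."""
    if running == desired:
        return []  # nothing changed, nothing to do
    plan = [["docker", "pull", desired]]
    if running is not None:
        # Remove the old container so its name can be reused.
        plan.append(["docker", "rm", "-f", app])
    plan.append([
        "docker", "run", "-d", "--name", app, "--net=host",
        "-e", f"PORT={port}", desired,
    ])
    return plan


for step in reload_plan(
    "titans_prod",
    running="registry.uken.int/uken/titans_prod:0000000",  # placeholder old tag
    desired="registry.uken.int/uken/titans_prod:cc3e68a",
    port=5001,
):
    print(" ".join(step))
```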

“But Lucas…”

How do hosts reload?

Load balancing?

This looks complicated

Use Consul DNS API
I said it was important

Lookup by repo + environment

titans_prod.service.consul
srv1.node.consul port 5100
srv2.node.consul port 4200
srv3.node.consul port 9000

proxy_pass http://$target;
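The transcript does not show how `$target` is filled in, so the following is only a guess at the idea: take the host and port pairs returned for the service name and pick one per request. The helper `pick_target` is hypothetical, and the records are the ones listed on the slide rather than a live DNS answer.

```python
import random


def pick_target(records, rng=random):
    """Choose one backend from (host, port) records and format it as host:port."""
    host, port = rng.choice(records)
    return f"{host}:{port}"


records = [
    ("srv1.node.consul", 5100),
    ("srv2.node.consul", 4200),
    ("srv3.node.consul", 9000),
]
print(pick_target(records))
```

Because each record carries its own port, backends can listen on whatever port the host hands out.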

gist

“But Lucas…”

How do hosts reload?

Load balancing?

This looks complicated

Not really

Build
Dockerfile
docker build

Run / shells
docker pull
docker run

Release
docker push
consul set kv pair

Frontend Provisioning
ec2 snapshots

Load Balancers
ec2 snapshots

Samson
by Zendesk

Tricks

Consul + DNS

Run a local cache + consul chain (unbound)
stub-zone:
  name: "consul"
  stub-addr: 127.0.0.1@8600

CNAME databases / caches on alt domain

Docker

Filesystem Drivers
overlay / aufs / btrfs / devicemapper

STDOUT logging
Bad for long lived containers - fixed in 1.6!

Container linking
Meh. --net=host

Need volumes during build?
docker run && docker commit

Dockerfiles: Base Image

ubuntu

uken/base

uken/ruby uken/django uken/go

shim

Dockerfiles: Passing parameters

# Inject SSH key

ADD http://127.0.0.1:9090/ssh_key /root/.ssh/id_rsa

RUN chmod 600 /root/.ssh/id_rsa

# Revision + deploy vars

ADD http://127.0.0.1:9090/env /etc/profile.d/custom_env.sh

No ‘official’ way
HTTP trick
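The HTTP trick needs something listening on 127.0.0.1:9090 while the build runs. A minimal sketch of such a throwaway server using only the standard library; the `serve` and `make_handler` helpers and the `REVISION` variable are hypothetical, and the demo binds an ephemeral port instead of 9090 so it can run anywhere.

```python
import http.client
import http.server
import threading


def make_handler(files):
    """Create a handler class that serves an in-memory {path: bytes} mapping."""
    class Handler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            body = files.get(self.path)
            if body is None:
                self.send_error(404)
                return
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep build output quiet

    return Handler


def serve(files, port=9090):
    """Start the server on localhost in a background thread and return it."""
    server = http.server.HTTPServer(("127.0.0.1", port), make_handler(files))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Demo: port 0 asks the OS for a free port; a real build would use 9090.
server = serve({"/env": b"export REVISION=cc3e68a\n"}, port=0)
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/env")
print(conn.getresponse().read().decode())
conn.close()
server.shutdown()
server.server_close()
```

Serving secrets this way keeps them out of the repository, though anything written by `ADD` still ends up in an image layer.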

Dockerfiles: Caching

FROM uken/ruby_20:latest

ENV RAILS_ENV design

ENV WORKERS 4

ADD ./Gemfile /app/Gemfile

ADD ./Gemfile.lock /app/Gemfile.lock

RUN cd /app && bundle install --jobs 4 --deployment

ADD ./ /app

RUN cd /app && bundle exec rake assets:precompile
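The ordering in this Dockerfile matters because a changed step invalidates every step after it. The sketch below is a deliberately simplified model of that behaviour, not Docker's actual cache algorithm: each layer key is a hash of the parent key, the instruction and the content it adds. The helpers `layer_keys` and `first_rebuilt` and the file contents are hypothetical.

```python
import hashlib


def layer_keys(steps):
    """Chain-hash (instruction, content) pairs so each key depends on its parent."""
    keys, parent = [], ""
    for instruction, content in steps:
        digest = hashlib.sha256(f"{parent}|{instruction}|{content}".encode()).hexdigest()[:12]
        keys.append(digest)
        parent = digest
    return keys


def first_rebuilt(before, after):
    """Index of the first step whose key changed, or None if all are cached."""
    for index, (old, new) in enumerate(zip(before, after)):
        if old != new:
            return index
    return None


def dockerfile(gemfile, lockfile, app_tree):
    return [
        ("ADD ./Gemfile /app/Gemfile", gemfile),
        ("ADD ./Gemfile.lock /app/Gemfile.lock", lockfile),
        ("RUN cd /app && bundle install --jobs 4 --deployment", ""),
        ("ADD ./ /app", app_tree),
        ("RUN cd /app && bundle exec rake assets:precompile", ""),
    ]


base = layer_keys(dockerfile("gems-v1", "lock-v1", "code-v1"))
code_change = layer_keys(dockerfile("gems-v1", "lock-v1", "code-v2"))
gem_change = layer_keys(dockerfile("gems-v2", "lock-v1", "code-v1"))
print(first_rebuilt(base, code_change))  # bundle install stays cached
print(first_rebuilt(base, gem_change))   # everything is rebuilt
```

In this model a code-only change first misses at `ADD ./ /app`, which is why the Gemfile is added and bundled before the rest of the app.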

Procfiles

Foreman on desktop
ddollar/foreman

Forego on servers
ddollar/forego

web: bundle exec unicorn -c config/unicorn.rb -p ${PORT:-3000} -E ${RAILS_ENV:-development}
realtime: ./realtime/realtime_server.linux -ws $PORT
sidekiq: bundle exec sidekiq -v -c 3
cron: cron -f
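A Procfile is just `name: command` per line, which is what lets the same file drive Foreman locally and Forego in the container. A minimal parser sketch (the function name `parse_procfile` is hypothetical; the real tools handle more edge cases):

```python
def parse_procfile(text):
    """Parse Procfile text into an ordered {process_name: command} dict."""
    processes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only; commands may contain colons themselves.
        name, sep, command = line.partition(":")
        if not sep:
            raise ValueError(f"not a Procfile entry: {line!r}")
        processes[name.strip()] = command.strip()
    return processes


PROCFILE = """\
web: bundle exec unicorn -c config/unicorn.rb -p ${PORT:-3000} -E ${RAILS_ENV:-development}
realtime: ./realtime/realtime_server.linux -ws $PORT
sidekiq: bundle exec sidekiq -v -c 3
cron: cron -f
"""

for name, command in parse_procfile(PROCFILE).items():
    print(f"{name:10s} {command}")
```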

LB tricks

HTTP/1.1 200 OK
...
X-U-Fe: 10.1.0.248:5003

add_header X-U-Fe $target;
proxy_pass http://$target;

LB tricks

error_page 500 = /errors/internal-server-error;

location ~ ^/errors/(?<code>[a-z-]+) { proxy_pass http://error.uken.com:80/titans/$code; }

Reload away.
No more redirects!
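In a pattern like this one, where the `+` sits relative to the named group decides what ends up in `$code`. The sketch below shows the difference using Python's `re` module (which spells named groups `(?P<name>...)`) as a stand-in for the PCRE engine nginx uses; a repeated group keeps only its last repetition in both.

```python
import re

path = "/errors/internal-server-error"

# Quantifier inside the group: the whole error name is captured.
inside = re.compile(r"^/errors/(?P<code>[a-z-]+)")
# Quantifier outside the group: only the last matched character is kept.
outside = re.compile(r"^/errors/(?P<code>[a-z-])+")

print(inside.match(path).group("code"))
print(outside.match(path).group("code"))
```

Only the first form yields a value usable in the proxied URL.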

Thanks!
@lxfontes

github.com/uken/autoscaling