Running docker in production
Roy Bauweraerts & Erwin de Keijzer
Hello! Mijndomein
● Webhosting company founded in 2003
● 572,870 domains
● 194,870 customers
md3: Behold the monolith
● Own iron
● 1 release every 4 weeks (+ hotfixes)
○ mostly night releases
● (mostly) manual process
● Releases by Operations
µd3: Behold the distributed monolith
● 3 dedicated AWS EC2 instances per microservice
● Multiple releases every week
● Releases by Developers
● Service discovery with Consul
Consul: Service discovery
● Service discovery
● Failure detection
● Multi datacenter
● Key/Value storage
µd3: Behold the distributed monolith
● Tightly coupled
● Expensive
● Complicated to introduce new services
○ write playbooks
○ add instances
○ add “service” to deploy server
○ create healthchecks
● High overhead
The goal: But why?!
To migrate to a platform that allows us to quickly add, change or remove functionality
with high confidence, without compromising the user experience or availability.
Docker
“Docker containers wrap a piece of software in a complete filesystem that contains everything needed to run: code, runtime, system tools, system libraries – anything that can be installed on a server. This guarantees that the software will always run the same, regardless of its environment.”
Their own words
Challenges: Do you accept?
● Running docker containers
● Environment consistency & configuration
● Service discovery
● Logging
● Request routing
● Monitoring
● Updates without downtime
Challenge #1: Running docker containers
What is the easiest and most reliable method of managing your containers (CRUD & scale) with minimal effort and without affecting your customers?
Challenge #1: Running docker containers
Kubernetes, Nomad, Docker Swarm, Amazon ECS
Challenge #1: Running docker containers
DC/OS
“DC/OS is an enterprise grade datacenter-scale operating system, providing a single platform for running containers, big data, and distributed apps in production.”
Challenge #1: Running docker containers: DC/OS
Pro:
● It keeps the containers running rather well
● Easy bootstrap
● CRUD web interface
● Logging possibilities
● Rolling updates based on health checks
Con:
● Lots of moving parts
● Distributed state
● No native Consul integration
● Web UI has flaws
● No internal namespacing
● No way of running services on all agents
Challenge #2: Environment
How can you guarantee that your code behaves the same here, and here?
Challenge #2: Environment
● Use the same artifact in all your environments
● Artifact combines all resources needed for running your code:
○ os
○ libraries
○ plugins
○ tooling
→ docker
● Configuration is injected during runtime
→ consul-template
Challenge #2: Environment (environment consistency)
➭ cat parameters.yml.ctmpl
---
{{ tree "config/mijndomein" | explode | toYAML }}
➭ consul-template -consul consul.service.consul:8500 -once -template "parameters.yml.ctmpl:parameters.yml"
➭ cat parameters.yml
---
example: data
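The effect of the template can also be illustrated outside consul-template. The sketch below is only an illustration in Python, not consul-template's implementation: it takes flat key/value pairs as they might be stored under config/mijndomein (the database keys are invented for this example), explodes the slash-separated keys into a nested mapping, and prints the result as simple YAML without any quoting rules.

```python
def explode(pairs):
    """Turn flat 'a/b/c' keys into nested dictionaries."""
    root = {}
    for key, value in pairs.items():
        node = root
        *parents, leaf = key.split("/")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


def to_yaml(tree, indent=0):
    """Render a nested dict of strings as block-style YAML (no quoting)."""
    lines = []
    pad = "  " * indent
    for key in sorted(tree):
        value = tree[key]
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


# Hypothetical keys, relative to the config/mijndomein prefix.
flat = {
    "example": "data",
    "database/host": "db.service.consul",
    "database/port": "3306",
}

if __name__ == "__main__":
    print("---")
    print(to_yaml(explode(flat)))
```

Because the same container image reads a generated parameters.yml in every environment, only the contents of the key/value store differ between Dev, Accept and Prod.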
Challenge #2: Environment
[Diagram: consul-template generating parameters.yml]
Challenge #3: Service discovery
How do you let your containers discover other containers in a continuously changing environment?
[Diagram: containers A to F spread over the hosts 10.0.0.1, 10.0.0.2 and 10.0.0.3, with a different placement at 13:00, 13:10 and 13:20]
Challenge #3: Service discovery
● Mesos DNS: Containers need to communicate with services outside of DC/OS.
● DC/OS service ports: Outside services also need to know the IP addresses.
● Consul DNS: DC/OS cannot communicate with Consul.
Challenge #3: Service discovery
Mesos consul
“Mesos to Consul bridge for service discovery.”
● Watches Mesos
● Registers tasks as applicationid.service.consul
○ (Marathon labels can be used to define your own service name)
● Registers Consul (HTTP) health checks based on Marathon labels
● Updates on a predefined interval
○ Not ideal: a compromise between consistency and performance
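The interval-based behaviour listed above can be sketched as a reconciliation step. This is a simplified illustration in Python, not the bridge's actual code: the label key NAME_LABEL, the way an application id is turned into a service name, and the data shapes are all assumptions made for the example.

```python
# Hypothetical Marathon label key used to override the service name.
NAME_LABEL = "consul-service-name"


def service_name(app_id, labels):
    """Pick the label override if present, else derive a name from the app id."""
    return labels.get(NAME_LABEL) or app_id.strip("/").replace("/", "-")


def dns_name(name):
    """DNS name under which Consul would answer for this service."""
    return f"{name}.service.consul"


def plan_sync(running_tasks, registered):
    """Compare running tasks with the registry and return what to change.

    running_tasks: iterable of (task_id, app_id, labels) tuples
    registered:    dict of task_id -> service name currently registered
    """
    wanted = {tid: service_name(app, labels) for tid, app, labels in running_tasks}
    to_register = {
        tid: name for tid, name in wanted.items() if registered.get(tid) != name
    }
    to_deregister = sorted(tid for tid in registered if tid not in wanted)
    return to_register, to_deregister


if __name__ == "__main__":
    tasks = [("t1", "/shop", {}), ("t2", "/auth", {NAME_LABEL: "login"})]
    print(plan_sync(tasks, {"t0": "old", "t1": "shop"}))
    print(dns_name("login"))
```

Running such a step on a timer explains the trade-off named in the last bullet: a short interval keeps the registry close to reality at the cost of more polling, while a long interval leaves stale entries around until the next run.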
Challenge #4: Logging
How do you determine what is happening
● at the application level
● at the domain level
with minimal effort?
Challenge #4: Logging
Application
Stdout & stderr available through web interface for realtime insights.
Also logged to Elasticsearch with rich metadata for statistics and historical insights.
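One common way to get metadata into a log store is to write one JSON object per line to stdout. The talk does not say how the logs are shipped or which fields are attached, so the sketch below is an assumption throughout: the field names and the build_logger helper are invented for illustration.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def __init__(self, static_fields):
        super().__init__()
        self.static_fields = dict(static_fields)

    def format(self, record):
        entry = {
            "message": record.getMessage(),
            "level": record.levelname,
            "logger": record.name,
        }
        # Metadata that is the same for every line of this container.
        entry.update(self.static_fields)
        return json.dumps(entry, sort_keys=True)


def build_logger(name, static_fields, stream=sys.stdout):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(static_fields))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("shop", {"service": "shop", "environment": "accept"})
    log.info("order %s created", 42)
```

Writing to stdout keeps the container unaware of the logging backend: the same output can be read in a web interface and indexed elsewhere.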
Challenge #4: Logging
Domain
Events that are sent through RabbitMQ also get stored in Elasticsearch.
Challenge #5: Request routing
How do you make sure your requests reach the correct containers, from the AWS ELB / ALB to the DC/OS agents?
Challenge #5: Request routing
AWS ELB / ALB → DC/OS agents
● GET / HTTP/1.1, Host: www.mijndomein.nl → 10.0.0.1:32001
● GET /producten HTTP/1.1, Host: www.mijndomein.nl → 10.0.0.2:32003
● GET /login HTTP/1.1, Host: auth.mijndomein.nl → 10.0.0.3:32005
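The three examples above amount to a routing table keyed on host and path. The sketch below shows one way such a lookup could work, using longest-prefix matching on path segments. The matching rule is an assumption for illustration; the talk does not describe how the proxy decides.

```python
ROUTES = [
    # (host, path prefix, backend) taken from the three example requests
    ("www.mijndomein.nl", "/", "10.0.0.1:32001"),
    ("www.mijndomein.nl", "/producten", "10.0.0.2:32003"),
    ("auth.mijndomein.nl", "/login", "10.0.0.3:32005"),
]


def matches(prefix, path):
    """True if path equals prefix or lies below it on a segment boundary."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def route(host, path, routes=ROUTES):
    """Return the backend with the longest matching prefix, or None."""
    candidates = [
        (len(prefix), backend)
        for route_host, prefix, backend in routes
        if route_host == host and matches(prefix, path)
    ]
    return max(candidates)[1] if candidates else None


if __name__ == "__main__":
    print(route("www.mijndomein.nl", "/producten"))
    print(route("auth.mijndomein.nl", "/login"))
```

The backend addresses are the hard part in practice: host ports such as 32001 change whenever a container is rescheduled, so a table like this has to be regenerated from service discovery rather than written by hand.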
Challenge #5: Request routing
● Register your containers in the AWS ALB: complex and mistake-prone
● Static proxy (NGINX, Apache2, HAProxy): large feature set but a lot of manual labour
● Dynamic proxy (Fabio/Traefik): easy but limited
Challenge #5: Request routing
[Diagram: AWS ELB / ALB → nginx proxy → DC/OS agents, with the example requests GET / and GET /producten and the hosts www.mijndomein.nl and auth.mijndomein.nl]
Challenge #6: Monitoring
How do you automatically check (and fix) the health of your containers?
● Marathon checks
● Consul health checks
● Alerting with Datadog
Challenge #6: Monitoring
Marathon
Challenge #6: Monitoring
Datadog
● Visualisation of all tracked metrics
● Alerting on predefined limits
○ hard thresholds (request rate == 0)
○ dynamic thresholds (disk usage suddenly grows faster than before)
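The difference between the two kinds of threshold can be shown with a small sketch. This is not how Datadog evaluates its monitors: the comparison rule and the factor of 3 are assumptions chosen to keep the example short.

```python
def hard_alert(request_rate):
    """Hard threshold: alert when no requests arrive at all."""
    return request_rate == 0


def growth_alert(samples, factor=3.0):
    """Dynamic threshold: alert when the latest increase in disk usage is
    more than `factor` times the average of the earlier increases.
    """
    if len(samples) < 3:
        return False  # not enough history to compare against
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    *history, latest = deltas
    baseline = max(sum(history) / len(history), 0.0)
    return latest > 0 and latest > factor * baseline


if __name__ == "__main__":
    print(hard_alert(0))
    print(growth_alert([10, 11, 12, 13]))
    print(growth_alert([10, 11, 12, 20]))
```

A hard threshold needs no history, while the dynamic one only makes sense relative to how the metric behaved before.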
Challenge #7: Rolling updates
How do you update applications and servers without affecting your customers?
Challenge #7: Rolling updates
Applications
Challenge #7: Rolling updates
Servers
Challenge #7: Rolling updates: DC/OS Agents
[Diagram in three steps: DC/OS agents in 1a, 1b and 1c]
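A rolling update driven by health checks, which the talk lists among the DC/OS pros, can be sketched as follows. This is a simplified model and not Marathon's implementation: instances are plain version strings and is_healthy is a stand-in for a real health check.

```python
def rolling_update(instances, new_version, is_healthy):
    """Replace instances one at a time, keeping each old instance until its
    replacement passes the health check.

    instances:  list of version strings currently running
    is_healthy: callable(version) -> bool, a stand-in for a real health check
    Returns (resulting instances, True if the update completed).
    """
    current = list(instances)
    for index in range(len(current)):
        if current[index] == new_version:
            continue  # already updated
        if not is_healthy(new_version):
            # Stop here: the remaining old instances keep serving traffic.
            return current, False
        current[index] = new_version
    return current, True


if __name__ == "__main__":
    print(rolling_update(["v1", "v1", "v1"], "v2", lambda version: True))
    print(rolling_update(["v1", "v1", "v1"], "v2", lambda version: False))
```

Because an old instance is only swapped out after its replacement is healthy, a broken release stops the rollout instead of taking the service down.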
Afterthoughts: Would we do it again?
○ Entire environment has become more complex than before.
○ DC/OS schedules single containers, which made us create multi-process containers.
○ Lack of namespacing forces us to separate accept and production environments and also allows more internal communication than necessary.
○ Secrets and ACL are not part of the free DC/OS.
Afterthoughts: Would we do it again?
○ Since setting up DC/OS we have had a 200% increase in microservices.
○ Because Dev, Accept and Prod are so similar, we have had nearly no bugs introduced by the environment.
○ Introducing new microservices to production can now be achieved in a few hours.
○ We now run over 40 unique microservices (about 75 containers) on 12 instances.
Components
● DC/OS Masters
● DC/OS Agents
● AWS ELB / ALB
● MySQL
● Redis
● RabbitMQ
● Elasticsearch
● Mesos consul
● Consul
Bye bye! That’s all folks
https://github.com/mijndomein/docker-in-production-talk