Burn down the silos! Helping dev and ops gel on high availability websites

Burn Down The Silos
Lindsay Holmwood

description

HA websites are where the rubber meets the road - at 200km/h. Traditional separation of dev and ops just doesn't cut it.

Everything is related to everything. Code relies on performant and resilient infrastructure, but highly performant infrastructure will only get a poorly written application so far. Worse still, root cause analysis in HA sites will more often than not identify problems that don't clearly belong to either devs or ops. The two options are collaborate or die.

This talk will introduce 3 core principles for improving collaboration between operations and development teams: consistency, repeatability, and visibility. These principles will be investigated with real world case studies and associated technologies audience members can start using now. In particular, there will be a focus on:

- fast provisioning of test environments with configuration management
- reliable and repeatable automated deployments
- application and infrastructure visibility with statistics collection, logging, and visualisation

Transcript of Burn down the silos! Helping dev and ops gel on high availability websites

Page 1: Burn down the silos! Helping dev and ops gel on high availability websites

Burn Down The Silos

Lindsay Holmwood

Page 2: Burn down the silos! Helping dev and ops gel on high availability websites

DevOps

Page 3: Burn down the silos! Helping dev and ops gel on high availability websites

Case study

Page 4: Burn down the silos! Helping dev and ops gel on high availability websites

Applying with technology

Page 5: Burn down the silos! Helping dev and ops gel on high availability websites

High profile fundraising website

Page 6: Burn down the silos! Helping dev and ops gel on high availability websites

Strong siloisation

Devs + Ops in different companies

Page 7: Burn down the silos! Helping dev and ops gel on high availability websites

100% uptime

Page 8: Burn down the silos! Helping dev and ops gel on high availability websites
Page 9: Burn down the silos! Helping dev and ops gel on high availability websites

3 Concepts

Page 10: Burn down the silos! Helping dev and ops gel on high availability websites

Consistency

Page 11: Burn down the silos! Helping dev and ops gel on high availability websites

Repeatability

Page 12: Burn down the silos! Helping dev and ops gel on high availability websites

Visibility

Page 13: Burn down the silos! Helping dev and ops gel on high availability websites
Page 14: Burn down the silos! Helping dev and ops gel on high availability websites

Consistency

Page 15: Burn down the silos! Helping dev and ops gel on high availability websites

ensuring identical behaviour

Page 16: Burn down the silos! Helping dev and ops gel on high availability websites

within an environment

Page 17: Burn down the silos! Helping dev and ops gel on high availability websites

or across multiple environments

Page 18: Burn down the silos! Helping dev and ops gel on high availability websites
Page 19: Burn down the silos! Helping dev and ops gel on high availability websites

configuration management

Page 20: Burn down the silos! Helping dev and ops gel on high availability websites

testing

Page 21: Burn down the silos! Helping dev and ops gel on high availability websites
Page 22: Burn down the silos! Helping dev and ops gel on high availability websites

Puppet: language, library, client/server

Page 23: Burn down the silos! Helping dev and ops gel on high availability websites

package { "apache2":
  ensure => installed
}

service { "apache2":
  enable => true,
  ensure => running
}

Page 24: Burn down the silos! Helping dev and ops gel on high availability websites

class apache2 {
  package { "apache2":
    ensure => installed
  }

  service { "apache2":
    enable => true,
    ensure => running
  }
}

Page 25: Burn down the silos! Helping dev and ops gel on high availability websites

Puppet workflow: write, apply, debug

Page 26: Burn down the silos! Helping dev and ops gel on high availability websites

class apache2 {
  package { "apache2":
    ensure => installed
  }

  service { "apache2":
    enable => true,
    ensure => running
  }
}

Page 27: Burn down the silos! Helping dev and ops gel on high availability websites

class apache2 {
  package { "apache2":
    ensure => installed
  }

  service { "apache2":
    enable  => true,
    ensure  => running,
    require => [ Package["apache2"] ]
  }
}
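The `require` metaparameter tells Puppet to apply the package before the service. The ordering idea can be sketched in plain Ruby with the standard library's `TSort`. This is only an illustration of dependency ordering, not Puppet's implementation; `ResourceGraph` and `apply_order` are hypothetical names.

```ruby
require "tsort"

# Hypothetical stand-in for a catalog: each key is a resource and each
# value lists the resources it requires.
class ResourceGraph
  include TSort

  def initialize(requirements)
    @requirements = requirements
  end

  # TSort hook: enumerate every resource.
  def tsort_each_node(&block)
    @requirements.each_key(&block)
  end

  # TSort hook: enumerate the resources a given resource depends on.
  def tsort_each_child(resource, &block)
    @requirements.fetch(resource, []).each(&block)
  end
end

# Returns the resources with every dependency placed before its dependants.
def apply_order(requirements)
  ResourceGraph.new(requirements).tsort
end

puts apply_order({
  "Service[apache2]" => ["Package[apache2]"],
  "Package[apache2]" => []
})
```

A dependency cycle makes `tsort` raise `TSort::Cyclic`, which is a rough analogue of a manifest whose resources require each other.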

Page 28: Burn down the silos! Helping dev and ops gel on high availability websites

Complex manifests

Page 29: Burn down the silos! Helping dev and ops gel on high availability websites

Proprietary software

Page 30: Burn down the silos! Helping dev and ops gel on high availability websites

Lots of debugging

Page 31: Burn down the silos! Helping dev and ops gel on high availability websites

VMware snapshots

Page 32: Burn down the silos! Helping dev and ops gel on high availability websites
Page 33: Burn down the silos! Helping dev and ops gel on high availability websites

Multiple deploy environments

Page 34: Burn down the silos! Helping dev and ops gel on high availability websites

Configuration drift

Page 35: Burn down the silos! Helping dev and ops gel on high availability websites

“roles”

Page 36: Burn down the silos! Helping dev and ops gel on high availability websites

define app_server($collectd_destination, $logging_destination) {
  server { $fqdn:
    logging_destination => $logging_destination
  }

  include apache2
  include mysql::client

  collectd::client { $fqdn:
    collection_destination => $collectd_destination
  }

  if $environment == "production" {
    include production::only::module
  }
}

Page 37: Burn down the silos! Helping dev and ops gel on high availability websites

node "app-01.stage.charity.com" {
  app_server { $fqdn:
    collectd_destination => "stats-01.stage.charity.com",
    logging_destination  => "log-01.stage.charity.com"
  }
}

node "app-01.production.charity.com" {
  app_server { $fqdn:
    collectd_destination => "stats-01.production.charity.com",
    logging_destination  => "log-01.production.charity.com"
  }
}

Page 38: Burn down the silos! Helping dev and ops gel on high availability websites

node /^app-\d+\.stage\.charity\.com$/ {
  app_server { $fqdn:
    collectd_destination => "stats-01.stage.charity.com",
    logging_destination  => "log-01.stage.charity.com"
  }
}

node /^app-\d+\.production\.charity\.com$/ {
  app_server { $fqdn:
    collectd_destination => "stats-01.production.charity.com",
    logging_destination  => "log-01.production.charity.com"
  }
}
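In a node regex, an unescaped `.` matches any character and an unanchored pattern matches anywhere inside the hostname. Puppet's regular expressions use Ruby's syntax, so the difference can be sketched in plain Ruby. The extra hostnames below are made up for illustration, and `matches?` is a hypothetical helper.

```ruby
# LOOSE leaves the dots unescaped and has no anchors; STRICT escapes the
# dots and anchors both ends of the hostname.
LOOSE  = /app-\d+.stage.charity.com/
STRICT = /\Aapp-\d+\.stage\.charity\.com\z/

# True when the pattern matches the hostname.
def matches?(pattern, hostname)
  !!(pattern =~ hostname)
end

hosts = [
  "app-01.stage.charity.com",       # the intended kind of host
  "app-01xstage.charity.com",       # "." in LOOSE also matches "x"
  "webapp-01.stage.charity.com.au"  # LOOSE matches a substring
]

hosts.each do |host|
  puts format("%-32s loose=%-5s strict=%s",
              host, matches?(LOOSE, host), matches?(STRICT, host))
end
```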

Page 39: Burn down the silos! Helping dev and ops gel on high availability websites

30 minute builds

Page 40: Burn down the silos! Helping dev and ops gel on high availability websites

Consistency

Page 41: Burn down the silos! Helping dev and ops gel on high availability websites
Page 42: Burn down the silos! Helping dev and ops gel on high availability websites

Repeatability

Page 43: Burn down the silos! Helping dev and ops gel on high availability websites

Function of consistency

Page 44: Burn down the silos! Helping dev and ops gel on high availability websites

automate, to remove human error

Page 45: Burn down the silos! Helping dev and ops gel on high availability websites

increase speed by shortening feedback loops

Page 46: Burn down the silos! Helping dev and ops gel on high availability websites
Page 47: Burn down the silos! Helping dev and ops gel on high availability websites

automated deployments

Page 48: Burn down the silos! Helping dev and ops gel on high availability websites

configuration management

Page 49: Burn down the silos! Helping dev and ops gel on high availability websites
Page 50: Burn down the silos! Helping dev and ops gel on high availability websites

Capistrano

Page 51: Burn down the silos! Helping dev and ops gel on high availability websites

Ruby DSL around SSH-in-a-for-loop

Page 52: Burn down the silos! Helping dev and ops gel on high availability websites

Simple, powerful, can blow your legs off

Page 53: Burn down the silos! Helping dev and ops gel on high availability websites

Not a substitute for configuration management

Page 54: Burn down the silos! Helping dev and ops gel on high availability websites

railsless-deploy

http://bit.ly/i56ra9

Page 55: Burn down the silos! Helping dev and ops gel on high availability websites

Removes Rails-ism

Page 56: Burn down the silos! Helping dev and ops gel on high availability websites

Great for PHP

Page 57: Burn down the silos! Helping dev and ops gel on high availability websites

capistrano-multistage (part of capistrano-ext)

http://bit.ly/i2moIp

Page 58: Burn down the silos! Helping dev and ops gel on high availability websites

# config/deploy.rb

set :user, "deploy"
set :application, "charity.com"
set :keep_releases, 10
set :deploy_to, "/srv/#{application}"

set :stages, %w(uat staging production)

Page 59: Burn down the silos! Helping dev and ops gel on high availability websites

# config/deploy/staging.rb

role :app, "app-01.stage.charity.com", "app-02.stage.charity.com"
role :static, "static-01.stage.charity.com"

Page 60: Burn down the silos! Helping dev and ops gel on high availability websites

# config/deploy/production.rb

role :app, "app-01.prod.charity.com", "app-02.prod.charity.com", "app-03.prod.charity.com"
role :static, "static-01.prod.charity.com"

Page 61: Burn down the silos! Helping dev and ops gel on high availability websites

cap staging deploy # deploy to staging
# test
cap production deploy # deploy to production

Page 62: Burn down the silos! Helping dev and ops gel on high availability websites

Capistrano bootstrap w/ Puppet

Page 63: Burn down the silos! Helping dev and ops gel on high availability websites

class capistrano::user {

  group { "deploy":
    gid => 499
  }

  user { "deploy":
    uid     => 499,
    gid     => 499,
    home    => "/home/deploy",
    shell   => "/bin/bash",
    require => Group["deploy"]
  }

  file { "/home/deploy/.ssh/authorized_keys":
    source  => "puppet:///modules/capistrano/authorized_keys",
    mode    => 644,
    owner   => "deploy",
    group   => "deploy",
    require => [ User["deploy"] ]
  }

}

Page 64: Burn down the silos! Helping dev and ops gel on high availability websites

define capistrano::site {

  include capistrano::user

  file { "/srv/$name":
    ensure  => directory,
    owner   => deploy,
    group   => deploy,
    require => [ User["deploy"] ]
  }

  file { "/srv/$name/releases":
    ensure  => directory,
    owner   => deploy,
    group   => deploy,
    require => [ File["/srv/$name"] ]
  }

...

Page 65: Burn down the silos! Helping dev and ops gel on high availability websites

...

  file { "/srv/$name/shared/log":
    ensure => directory,
    owner  => www-data,
    group  => www-data,
  }

  file { "/etc/$name":
    source  => "puppet:///modules/capistrano/etc/$name",
    recurse => true,
    mode    => "644",
    owner   => root,
    group   => root
  }

}

Page 66: Burn down the silos! Helping dev and ops gel on high availability websites

define app_server($collectd_destination, $logging_destination) {

  include apache2
  include mysql::client
  capistrano::site { "charity.com": }

}

Page 67: Burn down the silos! Helping dev and ops gel on high availability websites

Deploying to a new app server is as easy as:

Page 68: Burn down the silos! Helping dev and ops gel on high availability websites

# config/deploy/production.rb

role :app, "app-01.prod.charity.com", "app-02.prod.charity.com", "app-03.prod.charity.com"
role :static, "static-01.prod.charity.com"

Page 69: Burn down the silos! Helping dev and ops gel on high availability websites

# config/deploy/production.rb

role :app, "app-01.prod.charity.com", "app-02.prod.charity.com", "app-03.prod.charity.com", "app-04.prod.charity.com"
role :static, "static-01.prod.charity.com"

Page 70: Burn down the silos! Helping dev and ops gel on high availability websites

or...

Page 71: Burn down the silos! Helping dev and ops gel on high availability websites

# deploy/production.rb

role :app, "app-01.prod.charity.com", "app-02.prod.charity.com", "app-03.prod.charity.com", "app-04.prod.charity.com"
role :static, "static-01.prod.charity.com"

Page 72: Burn down the silos! Helping dev and ops gel on high availability websites

# deploy/production.rb

role :app, *(1..4).map { |n| "app-%.2d.prod.charity.com" % n }

role :static, "static-01.prod.charity.com"
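One Ruby detail matters when a block is passed inside an argument list: `{ }` binds to the nearest call (`map`), while `do ... end` binds to the outermost call (`role`), which leaves `map` without a block. A minimal sketch in plain Ruby follows. The `role` method here is a hypothetical stand-in that just returns the hosts it was given, not Capistrano's own, and `app_hosts` is an illustrative helper.

```ruby
# Build zero-padded, numbered hostnames for a given domain.
def app_hosts(count, domain = "prod.charity.com")
  (1..count).map { |n| "app-%.2d.%s" % [n, domain] }
end

# Hypothetical stand-in for Capistrano's role: returns the hosts it received.
def role(_name, *hosts)
  hosts
end

# Braces attach the block to map, so role receives hostnames.
WITH_BRACES = role(:app, *(1..2).map { |n| "app-%.2d" % n })

# do...end attaches the block to role; map returns an Enumerator whose
# splat is just the numbers in the range.
WITH_DO_END = role :app, *(1..2).map do |n|
  "app-%.2d" % n
end

puts app_hosts(4)
p WITH_BRACES
p WITH_DO_END
```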

Page 73: Burn down the silos! Helping dev and ops gel on high availability websites
Page 74: Burn down the silos! Helping dev and ops gel on high availability websites

git-svn mirror

Page 75: Burn down the silos! Helping dev and ops gel on high availability websites

182MB * 20 == PAIN

Page 76: Burn down the silos! Helping dev and ops gel on high availability websites

remote_cache

Page 77: Burn down the silos! Helping dev and ops gel on high availability websites

bad with svn tags

Page 78: Burn down the silos! Helping dev and ops gel on high availability websites

git-svn + cron

Page 79: Burn down the silos! Helping dev and ops gel on high availability websites

fast clones

commit access

21st century tech

Page 80: Burn down the silos! Helping dev and ops gel on high availability websites

Repeatability

Page 81: Burn down the silos! Helping dev and ops gel on high availability websites
Page 82: Burn down the silos! Helping dev and ops gel on high availability websites

Visibility

Page 83: Burn down the silos! Helping dev and ops gel on high availability websites

one eye on the past

Page 84: Burn down the silos! Helping dev and ops gel on high availability websites

one eye on the future

Page 85: Burn down the silos! Helping dev and ops gel on high availability websites
Page 86: Burn down the silos! Helping dev and ops gel on high availability websites

metric collection

Page 87: Burn down the silos! Helping dev and ops gel on high availability websites

code changes

Page 88: Burn down the silos! Helping dev and ops gel on high availability websites

monitoring

Page 89: Burn down the silos! Helping dev and ops gel on high availability websites

reports

Page 90: Burn down the silos! Helping dev and ops gel on high availability websites
Page 91: Burn down the silos! Helping dev and ops gel on high availability websites

metric collection

Page 92: Burn down the silos! Helping dev and ops gel on high availability websites

collectd

Page 93: Burn down the silos! Helping dev and ops gel on high availability websites

lightweight statistics collection daemon

Page 94: Burn down the silos! Helping dev and ops gel on high availability websites

platform for collecting time series data

Page 95: Burn down the silos! Helping dev and ops gel on high availability websites

plugin based

Page 96: Burn down the silos! Helping dev and ops gel on high availability websites

network aware

Page 97: Burn down the silos! Helping dev and ops gel on high availability websites

well defined APIs

Page 98: Burn down the silos! Helping dev and ops gel on high availability websites

curl_json

Page 99: Burn down the silos! Helping dev and ops gel on high availability websites

<Plugin curl_json>
  <URL "http://localhost:5984/_stats">
    Instance "httpd"
    <Key "httpd/requests/count">
      Type "http_requests"
    </Key>

    <Key "httpd_status_codes/*/count">
      Type "http_response_codes"
    </Key>
  </URL>
</Plugin>

Page 100: Burn down the silos! Helping dev and ops gel on high availability websites

/metrics
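The curl_json keys are slash-separated paths into a JSON document, with `*` standing for every key at that level. A rough Ruby sketch of that lookup follows, assuming an application serves a payload like the one below at /metrics. The payload, its numbers, and the `resolve` and `lookup` helpers are all hypothetical, and this is not collectd's implementation.

```ruby
require "json"

# Hypothetical payload; the field names echo the keys in the plugin
# configuration, the numbers are made up.
SAMPLE_METRICS = JSON.generate({
  "httpd" => { "requests" => { "count" => 1200 } },
  "httpd_status_codes" => {
    "200" => { "count" => 1150 },
    "500" => { "count" => 50 }
  }
})

# Walk the parsed JSON along the path segments. A "*" segment fans out
# over every key at that level. Returns { "full/path" => value }.
def resolve(node, segments, trail = [])
  return { trail.join("/") => node } if segments.empty?
  return {} unless node.is_a?(Hash)

  head, *rest = segments
  keys = head == "*" ? node.keys : [head]
  keys.each_with_object({}) do |key, found|
    found.merge!(resolve(node[key], rest, trail + [key])) if node.key?(key)
  end
end

# Parse the JSON text and resolve one slash-separated key path.
def lookup(json_text, path)
  resolve(JSON.parse(json_text), path.split("/"))
end

p lookup(SAMPLE_METRICS, "httpd/requests/count")
p lookup(SAMPLE_METRICS, "httpd_status_codes/*/count")
```

Exposing counters this way keeps the application side trivial: it only has to render a hash as JSON, and the collection side decides which paths become time series.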

Page 101: Burn down the silos! Helping dev and ops gel on high availability websites
Page 102: Burn down the silos! Helping dev and ops gel on high availability websites

code changes

Page 103: Burn down the silos! Helping dev and ops gel on high availability websites

application & config management

Page 104: Burn down the silos! Helping dev and ops gel on high availability websites
Page 105: Burn down the silos! Helping dev and ops gel on high availability websites

Your new best friend

Page 106: Burn down the silos! Helping dev and ops gel on high availability websites

monitoring

Page 107: Burn down the silos! Helping dev and ops gel on high availability websites

sudo mmm_control show # blocks under high IO
echo -en "show\nquit\n" | nc 127.1 9988 # instant

Page 108: Burn down the silos! Helping dev and ops gel on high availability websites

sudo mmm_control show # blocks under high IO
echo -en "show\nquit\n" | nc 127.1 9988 # instant

require "socket"

# Talk to the monitor's socket directly instead of shelling out
socket = ::TCPSocket.new("127.0.0.1", 9988)
socket.print("show\nquit\n")
output = socket.read.split("\n")
hosts = output.map do |line|
  parts = line.scan(/nasty regex/).flatten

  { "hostname"     => parts[0],
    "address"      => parts[1],
    "mode"         => parts[2],
    "state"        => parts[3],
    "role"         => parts[5],
    "role_address" => parts[6] }
end
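The parsing step can be sketched without a live socket. The sample line shape below is an assumption about what `show` returns, not something the slides confirm, and `HOST_LINE`, `ROLE`, and `parse_show` are hypothetical names; the regex on the slide stays unknown. Unlike the slide's hash, this sketch collects every role on a line into a list.

```ruby
# Assumed line shape (hypothetical, adjust to the real output):
#   db1(192.168.0.31) master/ONLINE. Roles: writer(192.168.0.50)
HOST_LINE = %r{\A\s*(?<hostname>[^\s(]+)\((?<address>[^)]+)\)\s+(?<mode>\w+)/(?<state>\w+)\.\s+Roles:\s*(?<roles>.*)\z}
ROLE = /(\w+)\(([^)]+)\)/

# Turn the text of a "show" response into one hash per host line,
# skipping any line that does not fit the assumed shape.
def parse_show(output)
  output.each_line.filter_map do |line|
    match = HOST_LINE.match(line.chomp)
    next unless match

    roles = match[:roles].scan(ROLE).map do |role, role_address|
      { "role" => role, "role_address" => role_address }
    end

    { "hostname" => match[:hostname],
      "address"  => match[:address],
      "mode"     => match[:mode],
      "state"    => match[:state],
      "roles"    => roles }
  end
end

sample = <<~SHOW
  db1(192.168.0.31) master/ONLINE. Roles: writer(192.168.0.50)
  db3(192.168.0.33) slave/REPLICATION_FAIL. Roles:
SHOW

parse_show(sample).each { |host| p host }
```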

Page 109: Burn down the silos! Helping dev and ops gel on high availability websites
Page 110: Burn down the silos! Helping dev and ops gel on high availability websites

reports

Page 111: Burn down the silos! Helping dev and ops gel on high availability websites

mk-query-digest & logrotate

Page 112: Burn down the silos! Helping dev and ops gel on high availability websites

# prerotate

SLOWLOG_FILENAME="/var/log/mysql/mysql-slow.log"
OPTIONS="--report --group-by distill --order-by Query_time:max --timeline --report-format query_report,profile"
DATE="$(date +%Y-%m-%dT%H:%M:%S%z)"
REPORT_FILENAME="/tmp/$(hostname)-mysql-slow-query-report-$DATE"
mk-query-digest $SLOWLOG_FILENAME $OPTIONS > $REPORT_FILENAME

SUBJECT="MySQL Slow Queries Report for $(hostname) as of $DATE"
RECIPIENTS="[email protected],[email protected]"
cat $REPORT_FILENAME | nail -n -E -s "$SUBJECT" "$RECIPIENTS"

Page 113: Burn down the silos! Helping dev and ops gel on high availability websites
Page 114: Burn down the silos! Helping dev and ops gel on high availability websites

Retrospectives

Page 115: Burn down the silos! Helping dev and ops gel on high availability websites

Slave explosion

Page 116: Burn down the silos! Helping dev and ops gel on high availability websites

Background:

Page 117: Burn down the silos! Helping dev and ops gel on high availability websites

Background: MySQL replication

Page 118: Burn down the silos! Helping dev and ops gel on high availability websites

Background: MySQL replication, MMM

Page 119: Burn down the silos! Helping dev and ops gel on high availability websites

Background: MySQL replication, MMM, 2 masters + 4 slaves

Page 120: Burn down the silos! Helping dev and ops gel on high availability websites

REPLICATION_FAIL on one slave

Page 121: Burn down the silos! Helping dev and ops gel on high availability websites

Down to 3 nodes

Page 122: Burn down the silos! Helping dev and ops gel on high availability websites

Increased cluster load

Page 123: Burn down the silos! Helping dev and ops gel on high availability websites

REPLICATION_DELAY on another slave

Page 124: Burn down the silos! Helping dev and ops gel on high availability websites

Down to 2 nodes

Page 125: Burn down the silos! Helping dev and ops gel on high availability websites

Inspection of REPLICATION_DELAY slave

Page 126: Burn down the silos! Helping dev and ops gel on high availability websites

Swapping like mad

Half the memory allocated

Page 127: Burn down the silos! Helping dev and ops gel on high availability websites

Shutdown, upgrade, boot

Page 128: Burn down the silos! Helping dev and ops gel on high availability websites

Visibility

Page 129: Burn down the silos! Helping dev and ops gel on high availability websites

metric collection

Page 130: Burn down the silos! Helping dev and ops gel on high availability websites

Consistency

Page 131: Burn down the silos! Helping dev and ops gel on high availability websites
Page 132: Burn down the silos! Helping dev and ops gel on high availability websites

Database connectivity

Page 133: Burn down the silos! Helping dev and ops gel on high availability websites

Soft launch

Page 134: Burn down the silos! Helping dev and ops gel on high availability websites

PHP connection errors

Page 135: Burn down the silos! Helping dev and ops gel on high availability websites

Config parses + loads

Page 136: Burn down the silos! Helping dev and ops gel on high availability websites

Add config dump URL

Page 137: Burn down the silos! Helping dev and ops gel on high availability websites

Visibility

Page 138: Burn down the silos! Helping dev and ops gel on high availability websites

curl_json

Page 139: Burn down the silos! Helping dev and ops gel on high availability websites

Redeploy

Page 140: Burn down the silos! Helping dev and ops gel on high availability websites

...

Page 141: Burn down the silos! Helping dev and ops gel on high availability websites

Typo

Page 142: Burn down the silos! Helping dev and ops gel on high availability websites

2 reviewers of config management

Page 143: Burn down the silos! Helping dev and ops gel on high availability websites

both in ops team

Page 144: Burn down the silos! Helping dev and ops gel on high availability websites

Visibility

Page 145: Burn down the silos! Helping dev and ops gel on high availability websites

code changes

Page 146: Burn down the silos! Helping dev and ops gel on high availability websites

Your new best friend

Page 147: Burn down the silos! Helping dev and ops gel on high availability websites

reviewer diversity

devs should have visibility of ops changes

Page 148: Burn down the silos! Helping dev and ops gel on high availability websites
Page 149: Burn down the silos! Helping dev and ops gel on high availability websites

Data consistency

Page 150: Burn down the silos! Helping dev and ops gel on high availability websites

New release

Page 151: Burn down the silos! Helping dev and ops gel on high availability websites

Database migrations

Page 152: Burn down the silos! Helping dev and ops gel on high availability websites

Release promotion

Page 153: Burn down the silos! Helping dev and ops gel on high availability websites

[Diagram: uat, stage, production]

Page 154: Burn down the silos! Helping dev and ops gel on high availability websites

[Diagram: uat, stage, production; one environment marked]

Page 155: Burn down the silos! Helping dev and ops gel on high availability websites

[Diagram: uat, stage, production; two environments marked]

Page 156: Burn down the silos! Helping dev and ops gel on high availability websites

[Diagram: uat, stage, production; two environments marked with one symbol, the third with a different symbol]

Page 157: Burn down the silos! Helping dev and ops gel on high availability websites

[Diagram: two rows of uat, stage, production; two marks of one symbol and one of a different symbol]

Page 158: Burn down the silos! Helping dev and ops gel on high availability websites

CREATE TABLE foo

should have been

CREATE TABLE IF NOT EXISTS foo

Page 159: Burn down the silos! Helping dev and ops gel on high availability websites

Consistency

Page 160: Burn down the silos! Helping dev and ops gel on high availability websites

[Diagram: two rows of uat, stage, production]

Page 161: Burn down the silos! Helping dev and ops gel on high availability websites

[Diagram: two rows of uat, stage, production]

Page 162: Burn down the silos! Helping dev and ops gel on high availability websites

[Diagram: two rows of uat, stage, production]

Page 163: Burn down the silos! Helping dev and ops gel on high availability websites

Repeatability

Page 164: Burn down the silos! Helping dev and ops gel on high availability websites
Page 165: Burn down the silos! Helping dev and ops gel on high availability websites

Consistency

Page 166: Burn down the silos! Helping dev and ops gel on high availability websites

ensuring identical behaviour

Page 167: Burn down the silos! Helping dev and ops gel on high availability websites

within an environment

Page 168: Burn down the silos! Helping dev and ops gel on high availability websites

or across multiple environments

Page 169: Burn down the silos! Helping dev and ops gel on high availability websites

Repeatability

Page 170: Burn down the silos! Helping dev and ops gel on high availability websites

Function of consistency

Page 171: Burn down the silos! Helping dev and ops gel on high availability websites

automate, to remove human error

Page 172: Burn down the silos! Helping dev and ops gel on high availability websites

increase speed by shortening feedback loops

Page 173: Burn down the silos! Helping dev and ops gel on high availability websites

Visibility

Page 174: Burn down the silos! Helping dev and ops gel on high availability websites

one eye on the past

Page 175: Burn down the silos! Helping dev and ops gel on high availability websites

one eye on the future

Page 176: Burn down the silos! Helping dev and ops gel on high availability websites

Communicate!

Page 177: Burn down the silos! Helping dev and ops gel on high availability websites

Thank you!

Credits:

http://www.flickr.com/photos/48722974@N07/4682302824/
http://www.flickr.com/photos/acediscovery/3030548744/
http://www.flickr.com/photos/andrew_wertheimer/5268407700/
http://www.flickr.com/photos/azrasta/4528604334/
http://www.flickr.com/photos/boliston/2351083198/
http://www.flickr.com/photos/brianwestcott/1497708345/
http://www.flickr.com/photos/brunogirin/73014722/
http://www.flickr.com/photos/eole/4500783172/
http://www.flickr.com/photos/jacockshaw/1811056252/
http://www.flickr.com/photos/jenny-pics/2719309611/
http://www.flickr.com/photos/ldsykora/2414497811/
http://www.flickr.com/photos/listed_crime/1342164481/
http://www.flickr.com/photos/localsurfer/369116556/
http://www.flickr.com/photos/lyza/4144764381/
http://www.flickr.com/photos/matchew/424026531/
http://www.flickr.com/photos/mrwoodnz/4289893182/
http://www.flickr.com/photos/myprofe/4396178084/
http://www.flickr.com/photos/nnova/4834954885/
http://www.flickr.com/photos/pjern/2150874047/
http://www.flickr.com/photos/rubodewig/5161937181/
http://www.flickr.com/photos/rutty/460520720/
http://www.flickr.com/photos/sarah_lincoln/4740037328/
http://www.flickr.com/photos/shindotv/3835365695/
http://www.flickr.com/photos/thalamus/306881919/
http://www.flickr.com/photos/traviscrawford/323366600/
http://www.flickr.com/photos/webtreatsetc/5303216304/