Experiences from DevOps production: Deployment, performance, failure.

Experiences from production: Deployment, performance, failure. David Mytton, All Your Base - Oct 2014. blog.serverdensity.com


In his All Your Base talk, David Mytton (founder of Server Density) walks through Server Density's experiences in handling large-scale MongoDB deployments.

Transcript of Experiences from DevOps production: Deployment, performance, failure.

Page 1: Experiences from DevOps production: Deployment, performance, failure.

Experiences from production: Deployment, performance, failure

David Mytton
All Your Base - Oct 2014

blog.serverdensity.com

Page 2: Experiences from DevOps production: Deployment, performance, failure.

David Mytton

Page 3: Experiences from DevOps production: Deployment, performance, failure.

serverdensity.com/allyourbase

Page 4: Experiences from DevOps production: Deployment, performance, failure.

Slides: twitter.com/davidmytton

Page 5: Experiences from DevOps production: Deployment, performance, failure.

Agenda

● Performance

● Architecture

● Downtime

● Preparation

● Where to host?

Page 6: Experiences from DevOps production: Deployment, performance, failure.

Server Density Architecture

Page 7: Experiences from DevOps production: Deployment, performance, failure.

Server Density Architecture

● ~100 servers - Ubuntu 12.04

Page 8: Experiences from DevOps production: Deployment, performance, failure.

Server Density Architecture

● ~100 servers - Ubuntu 12.04

● 50:50 virtual/dedicated

Page 9: Experiences from DevOps production: Deployment, performance, failure.

Server Density Architecture

● ~100 servers - Ubuntu 12.04

● 50:50 virtual/dedicated

● 200TB/month processed data

Page 10: Experiences from DevOps production: Deployment, performance, failure.

Server Density Architecture

● ~100 servers - Ubuntu 12.04

● 50:50 virtual/dedicated

● 200TB/month processed data

● Nginx, Python, MongoDB

Page 11: Experiences from DevOps production: Deployment, performance, failure.

Server Density Architecture

● ~100 servers - Ubuntu 12.04

● 50:50 virtual/dedicated

● 200TB/month processed data

● Nginx, Python, MongoDB

● SoftLayer: >1TB RAM, 5TB SSDs

Page 12: Experiences from DevOps production: Deployment, performance, failure.

Two choices for deployment

Page 13: Experiences from DevOps production: Deployment, performance, failure.

Two choices for deployment

● Virtualized

● Bare metal

Page 14: Experiences from DevOps production: Deployment, performance, failure.

Advantages of virtualization

● Easy to manage

Page 15: Experiences from DevOps production: Deployment, performance, failure.

Advantages of virtualization

● Easy to manage

● Fast boot

Page 16: Experiences from DevOps production: Deployment, performance, failure.

Advantages of virtualization

● Easy to manage

● Fast boot

● Easier to resize/migrate

Page 17: Experiences from DevOps production: Deployment, performance, failure.

Advantages of virtualization

● Easy to manage

● Fast boot

● Easier to resize/migrate

● Templating/snapshots

Page 18: Experiences from DevOps production: Deployment, performance, failure.

Advantages of virtualization

● Easy to manage

● Fast boot

● Easier to resize/migrate

● Templating/snapshots

● Containment

Page 19: Experiences from DevOps production: Deployment, performance, failure.

Disadvantages of virtualization

● Another layer

Page 20: Experiences from DevOps production: Deployment, performance, failure.

Disadvantages of virtualization

● Another layer

● Hypervisor overhead

Page 21: Experiences from DevOps production: Deployment, performance, failure.

Disadvantages of virtualization

● Another layer

● Hypervisor overhead

● Host contention

Page 22: Experiences from DevOps production: Deployment, performance, failure.

Disadvantages of virtualization

● Another layer

● Hypervisor overhead

● Host contention

● I/O performance

Page 23: Experiences from DevOps production: Deployment, performance, failure.

Advantages of bare metal

● Dedicated resources

Page 24: Experiences from DevOps production: Deployment, performance, failure.

Advantages of bare metal

● Dedicated resources

● Direct access to hardware

Page 25: Experiences from DevOps production: Deployment, performance, failure.

Advantages of bare metal

● Dedicated resources

● Direct access to hardware

● Customisable specs

Page 26: Experiences from DevOps production: Deployment, performance, failure.

Advantages of bare metal

● Dedicated resources

● Direct access to hardware

● Customisable specs

● Performance

Page 27: Experiences from DevOps production: Deployment, performance, failure.

Disadvantages of bare metal

● Build/deploy time

Page 28: Experiences from DevOps production: Deployment, performance, failure.

Disadvantages of bare metal

● Build/deploy time

● More difficult to resize

Page 29: Experiences from DevOps production: Deployment, performance, failure.

Disadvantages of bare metal

● Build/deploy time

● More difficult to resize

● Difficult to migrate/snapshot

Page 30: Experiences from DevOps production: Deployment, performance, failure.

Disadvantages of bare metal

● Build/deploy time

● More difficult to resize

● Capex/lifetime

● Difficult to migrate/snapshot

Page 31: Experiences from DevOps production: Deployment, performance, failure.

Performance problems?

Page 32: Experiences from DevOps production: Deployment, performance, failure.

Performance problems?

Easy answer: move to bare metal!

Page 33: Experiences from DevOps production: Deployment, performance, failure.

Key performance factors

● Network

Page 34: Experiences from DevOps production: Deployment, performance, failure.

Key performance factors

● Network

● EC2: Cluster compute, high memory, high I/O, high storage

● GCE: Higher CPU instances

Page 35: Experiences from DevOps production: Deployment, performance, failure.

Key performance factors

● Network

Page 36: Experiences from DevOps production: Deployment, performance, failure.

Key performance factors

● Network

Location          Ping RTT latency
Within USA        40-80ms
Trans-Atlantic    100ms
Trans-Pacific     150ms
Europe-Japan      300ms
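The round-trip times above add up quickly when requests are made one after another. A minimal sketch of that arithmetic, assuming the slide's figures are typical values; the names `RTT_MS` and `sequential_wait_ms` are hypothetical and not from the talk:

```python
# Approximate ping RTTs in milliseconds, copied from the slide above.
# "Within USA" is a range, so every route is stored as (low, high).
RTT_MS = {
    "Within USA": (40, 80),
    "Trans-Atlantic": (100, 100),
    "Trans-Pacific": (150, 150),
    "Europe-Japan": (300, 300),
}


def sequential_wait_ms(route, round_trips):
    """Return the (low, high) network wait for back-to-back round trips."""
    low, high = RTT_MS[route]
    return low * round_trips, high * round_trips


if __name__ == "__main__":
    # Ten sequential round trips, e.g. ten queries issued one after another.
    for route in RTT_MS:
        low, high = sequential_wait_ms(route, 10)
        print(f"{route}: {low}-{high} ms for 10 sequential round trips")
```

This counts network wait only; server processing time comes on top.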

Page 37: Experiences from DevOps production: Deployment, performance, failure.

Networking performance

AWS

GCE

bit.ly/googlevsamazon

Page 38: Experiences from DevOps production: Deployment, performance, failure.

Key performance factors

● Memory

Page 39: Experiences from DevOps production: Deployment, performance, failure.

http://blog.pythonisito.com/2011/12/mongodbs-write-lock.html

Page 40: Experiences from DevOps production: Deployment, performance, failure.

http://blog.pythonisito.com/2011/12/mongodbs-write-lock.html

Page 41: Experiences from DevOps production: Deployment, performance, failure.

Key performance factors

● Memory is expensive

Page 42: Experiences from DevOps production: Deployment, performance, failure.

Key performance factors

● Disk

● SSDs!

Page 43: Experiences from DevOps production: Deployment, performance, failure.

Key performance factors

● Disk

● SSDs!

GCE: 256GB = $83.20/month

EC2: 256GB = $35.32/month

SL: 200GB = $81/month
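The three quotes cover different capacities, so a per-GB figure makes them easier to compare. A small sketch using only the prices on the slide (2014 figures); the helper names are hypothetical:

```python
# (provider, capacity in GB, monthly price in USD), as listed on the slide.
SSD_PRICES = [
    ("GCE", 256, 83.20),
    ("EC2", 256, 35.32),
    ("SL", 200, 81.00),
]


def price_per_gb(capacity_gb, monthly_usd):
    """Monthly cost of one GB of SSD storage."""
    return monthly_usd / capacity_gb


def cheapest_first(prices):
    """Sort the quotes by monthly price per GB, lowest first."""
    return sorted(prices, key=lambda p: price_per_gb(p[1], p[2]))


if __name__ == "__main__":
    for provider, capacity_gb, monthly_usd in cheapest_first(SSD_PRICES):
        per_gb = price_per_gb(capacity_gb, monthly_usd)
        print(f"{provider}: ${per_gb:.3f} per GB per month")
```

Price per GB says nothing about IOPS or throughput, which the slide does not give.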

Page 44: Experiences from DevOps production: Deployment, performance, failure.

Why cloud?

● Flexible

Page 45: Experiences from DevOps production: Deployment, performance, failure.

Why cloud?

● Flexible

● Unlimited resources

Page 46: Experiences from DevOps production: Deployment, performance, failure.

Why cloud?

● Flexible

● Unlimited resources

● Cheap to get started

Page 47: Experiences from DevOps production: Deployment, performance, failure.

Why cloud?

● Flexible

● Unlimited resources

● Cheap to get started

● Other products

Page 48: Experiences from DevOps production: Deployment, performance, failure.

Why colo?

Page 49: Experiences from DevOps production: Deployment, performance, failure.

Why colo?

● Vastly cheaper

Page 50: Experiences from DevOps production: Deployment, performance, failure.

Why colo?

● Vastly cheaper

● Complete control

Page 51: Experiences from DevOps production: Deployment, performance, failure.

Let’s talk about downtime

Page 52: Experiences from DevOps production: Deployment, performance, failure.

2013 Spend: ~$5bn

Page 53: Experiences from DevOps production: Deployment, performance, failure.

2013 Spend: ~$6bn

Page 54: Experiences from DevOps production: Deployment, performance, failure.

2013 Spend: ~$4bn

Page 55: Experiences from DevOps production: Deployment, performance, failure.

You will have downtime

How much do you spend?

Page 56: Experiences from DevOps production: Deployment, performance, failure.

Preparation

Page 57: Experiences from DevOps production: Deployment, performance, failure.

Preparation - On Call

● Rotations

Page 58: Experiences from DevOps production: Deployment, performance, failure.

Preparation - On Call

● Off call

● Rotations

Page 59: Experiences from DevOps production: Deployment, performance, failure.

Preparation - On Call

● Off call

● Rotations

● Work the next day?

● Reachability - Train, 3G/4G (edge?!), Do Not Disturb mode, system updates

Page 60: Experiences from DevOps production: Deployment, performance, failure.

Preparation - On Call

● Off call

● Rotations

● Reachability - Train, 3G/4G (edge?!), Do Not Disturb mode, system updates

● Work the next day?
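A rotation with an explicit off-call slot can be sketched as below. The slides do not say how shifts are organised, so the weekly shift length, the reading of "off call" as the week after a shift, and the engineer names are all assumptions for illustration:

```python
from datetime import date, timedelta


def build_rotation(engineers, start, weeks):
    """Assign one engineer per week, cycling through the list in order.

    Assumption: whoever covered the previous week is marked off call,
    so they are not the first person pulled in as backup.
    """
    schedule = []
    for week in range(weeks):
        on_call = engineers[week % len(engineers)]
        off_call = engineers[(week - 1) % len(engineers)] if week > 0 else None
        schedule.append({
            "starts": start + timedelta(weeks=week),
            "on_call": on_call,
            "off_call": off_call,
        })
    return schedule


if __name__ == "__main__":
    # Hypothetical team and start date.
    for shift in build_rotation(["alice", "bob", "carol"], date(2014, 10, 6), 4):
        print(shift["starts"], "on call:", shift["on_call"],
              "| off call:", shift["off_call"])
```

Questions such as reachability or working the next day are policy decisions that a schedule like this cannot answer.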

Page 61: Experiences from DevOps production: Deployment, performance, failure.

Preparation - Documentation

Page 62: Experiences from DevOps production: Deployment, performance, failure.

Preparation - Documentation

● Searchable

Page 63: Experiences from DevOps production: Deployment, performance, failure.

Preparation - Documentation

● Searchable

● Easy to edit

Page 64: Experiences from DevOps production: Deployment, performance, failure.

Preparation - Documentation

● Searchable

● Easy to edit

● Independent of your infrastructure

Page 65: Experiences from DevOps production: Deployment, performance, failure.

Preparation - Documentation

● Searchable

● Easy to edit

● Independent of your infrastructure

● Up to date

Page 66: Experiences from DevOps production: Deployment, performance, failure.
Page 67: Experiences from DevOps production: Deployment, performance, failure.

Unexpected failures

Page 68: Experiences from DevOps production: Deployment, performance, failure.

Unexpected failures

● Communication systems

Page 69: Experiences from DevOps production: Deployment, performance, failure.

Unexpected failures

● Communication systems

● Network connectivity

Page 70: Experiences from DevOps production: Deployment, performance, failure.

Unexpected failures

● Communication systems

● Network connectivity

● Access to support

Page 71: Experiences from DevOps production: Deployment, performance, failure.

ALERT!

Page 72: Experiences from DevOps production: Deployment, performance, failure.

ALERT!

1. Load up incident response checklist

Page 73: Experiences from DevOps production: Deployment, performance, failure.

ALERT!

1. Load up incident response checklist

2. Log incident in JIRA

Page 74: Experiences from DevOps production: Deployment, performance, failure.

ALERT!

1. Load up incident response checklist

2. Log incident in JIRA

3. Log into Ops War Room

Page 75: Experiences from DevOps production: Deployment, performance, failure.

ALERT!

1. Load up incident response checklist

2. Log incident in JIRA

3. Log into Ops War Room

4. Public status post

Page 76: Experiences from DevOps production: Deployment, performance, failure.

ALERT!

1. Load up incident response checklist

2. Log incident in JIRA

3. Log into Ops War Room

4. Public status post

5. Initial investigation
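The response steps can be kept as an ordered checklist that records when each one was completed. A minimal sketch, assuming a hypothetical `IncidentChecklist` class; it only tracks the steps in memory and does not call JIRA or any other real service:

```python
from datetime import datetime, timezone

# The five steps from the slide, in numbered order.
CHECKLIST = [
    "Load up incident response checklist",
    "Log incident in JIRA",
    "Log into Ops War Room",
    "Public status post",
    "Initial investigation",
]


class IncidentChecklist:
    """Walks through the steps in order and timestamps each completion."""

    def __init__(self, steps=CHECKLIST):
        self.steps = list(steps)
        self.completed = []  # (step, UTC timestamp) in completion order

    def next_step(self):
        """Return the next outstanding step, or None when all are done."""
        if len(self.completed) < len(self.steps):
            return self.steps[len(self.completed)]
        return None

    def complete_next(self, now=None):
        """Mark the next step done and return its description."""
        step = self.next_step()
        if step is None:
            raise RuntimeError("all checklist steps are already complete")
        self.completed.append((step, now or datetime.now(timezone.utc)))
        return step


if __name__ == "__main__":
    checklist = IncidentChecklist()
    while checklist.next_step() is not None:
        print("done:", checklist.complete_next())
```

Forcing the steps to complete in order is a design choice of this sketch, so that the public status post is not skipped in the rush to investigate.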

Page 77: Experiences from DevOps production: Deployment, performance, failure.

Key response principles

Page 78: Experiences from DevOps production: Deployment, performance, failure.

Key response principles

● Log everything

Page 79: Experiences from DevOps production: Deployment, performance, failure.

Key response principles

● Log everything

● Frequent public status updates

Page 80: Experiences from DevOps production: Deployment, performance, failure.

Key response principles

● Log everything

● Frequent public status updates

● Gather the team

Page 81: Experiences from DevOps production: Deployment, performance, failure.

Key response principles

● Log everything

● Frequent public status updates

● Gather the team

● Escalate!
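"Log everything" and "frequent public status updates" can share one timestamped record. A sketch under stated assumptions: the `IncidentLog` class, its field names, and the idea of measuring minutes since the last public update are illustrative and not taken from the talk:

```python
from datetime import datetime, timezone


class IncidentLog:
    """Append-only, timestamped record of what happened during an incident."""

    def __init__(self):
        self.entries = []

    def record(self, author, message, public=False, when=None):
        """Add an entry; public=True marks it as a public status update."""
        entry = {
            "when": when or datetime.now(timezone.utc),
            "author": author,
            "message": message,
            "public": public,
        }
        self.entries.append(entry)
        return entry

    def public_updates(self):
        """Return only the entries that were posted publicly."""
        return [e for e in self.entries if e["public"]]

    def minutes_since_last_public_update(self, now=None):
        """Minutes since the last public update, or None if there is none."""
        updates = self.public_updates()
        if not updates:
            return None
        now = now or datetime.now(timezone.utc)
        return (now - updates[-1]["when"]).total_seconds() / 60


if __name__ == "__main__":
    log = IncidentLog()
    log.record("alice", "Alert received, starting checklist")
    log.record("alice", "Investigating elevated error rates", public=True)
    print(log.minutes_since_last_public_update())
```

A team could compare the minutes value against whatever update interval it has agreed on; the talk does not specify one.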

Page 82: Experiences from DevOps production: Deployment, performance, failure.

Summary

● Performance

● Architecture

● Downtime

● Preparation

● Where to host?

Page 83: Experiences from DevOps production: Deployment, performance, failure.

Thank you very much

@davidmytton


blog.serverdensity.com

serverdensity.com/allyourbase