Devops Workshop (Section 4)

John Willis @botchagalupe

Section 4 - The Second Way - Feedback

Accelerate Feedback

The Second Way - Amplify Feedback

“3% of the problems have figures, 97% of the problems do not”

- Dr Deming

▪ The Second Way - Goals

▪ Right to Left ▪ Find and Fix Fast ▪ Shorten and Amplify Feedback

The Second Way

▪ Accelerate Feedback

▪ Telemetry ▪ Fault Injection ▪ Safety Culture

▪ Accelerate Feedback

▪ Telemetry ▪ Fault Injection ▪ Collaboration ▪ Safety Culture

▪ Telemetry

▪ Monitoring ▪ Logging ▪ Analytics

Source: Gene Kim - itrevolution.com

monitorama.com: Jason Dixon, John Allspaw, Dr Neil Gunther, Mathias Meyer, John Vincent, Jordan Sissel, Sean Porter, Katherine Daniels, Lindsay Holmwood, Adrian Cockcroft, Bridget Kromhout, Kyle Kingsbury, James Turnbull

▪ Accelerate Feedback

▪ Telemetry ▪ Fault Injection ▪ Safety Culture

▪ Fault Injection

▪ Improve MTBF ▪ Reduce MTTR

▪ Fault Injection

▪ Game Day ▪ Netflix Simian Army ▪ Netflix FIT

▪ Game Day

▪ Improves MTBF ▪ Reduces MTTR

▪ Netflix Simian Army

▪ Chaos Monkey (Hosts) ▪ Chaos Gorilla (Data Center) ▪ Latency Monkey (Inject Latency) ▪ Conformity Monkey (Best Practice) ▪ Security Monkey (Security Violations)

▪ FIT : Failure Injection Testing

▪ Limit the blast radius of the failure ▪ Telemetry of path of the failure ▪ Dependency telemetry
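One way to picture limiting the scope of an injected failure is to fail only a small, deterministic fraction of requests. The sketch below is an illustration under assumptions, not Netflix's FIT implementation: `should_inject`, `call_dependency`, and the 1% default are hypothetical names and values chosen for the example.

```python
import hashlib


def should_inject(request_id: str, fraction: float) -> bool:
    """Return True for roughly `fraction` (0.0 to 1.0) of request ids.

    Hashing keeps the decision deterministic, so a given request id is
    always either inside or outside the experiment.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return bucket < fraction


def call_dependency(request_id: str, fraction: float = 0.01) -> str:
    """Hypothetical dependency call that fails only for selected requests."""
    if should_inject(request_id, fraction):
        raise RuntimeError(f"injected failure for request {request_id}")
    return "ok"
```

Keeping the selection deterministic makes it easier to trace the path of one injected failure through telemetry, because the same request ids fail on every hop that applies the same rule.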

▪ Accelerate Feedback

▪ Telemetry ▪ Fault Injection ▪ Safety Culture

“In a complex system, doing the same thing twice will not predictably or necessarily lead to the same result.”

Sidney Dekker

Views on Human Error

▪ The Second Way - Right to Left

▪ Creating a Service Reliability Culture ▪ Fast Feedback ▪ Understanding Monitoring ▪ Understanding Complexity

▪ The Second Way - Right to Left

▪ Creating a Service Reliability Culture ▪ Fast Feedback ▪ Understanding Monitoring ▪ Understanding Complexity

Creating a Service Reliability Culture

▪ Service Reliability Culture is Like a Team Sport

▪ Availability ▪ Latency ▪ Performance ▪ Change Management ▪ Monitoring ▪ Emergency Response ▪ Capacity Planning

▪ Core Conflict “Dev vs Ops”

▪ Operations don’t really know the code base ▪ The team that knows least about the code typically has the responsibility of its launch

▪ Understanding Service Levels

▪ Service Level Agreements ▪ Service Level Objectives (Targets) ▪ Service Level Indicators

▪ Service Level Agreements

▪ Between the business and the customer ▪ Typically a financial contract ▪ Can be MTTR or MTBF based ▪ Not all services have an explicit SLA

▪ Service Level Objectives

▪ Typically the basis for SLAs ▪ Between the service and the system ▪ Typically target based ▪ All services should have an SLO ▪ Determine actions to take on missed SLOs ▪ SLOs should be tracked historically

▪ Service Level Objectives - Picking Targets

▪ Try and keep them simple ▪ Don’t over design ▪ Let them evolve ▪ Will learn over time

▪ Service Level Indicators

▪ Quantitative measure of a service ▪ Used as indicators for the SLOs ▪ Monitor SLIs and compare to SLOs
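Comparing an indicator to its objective can be sketched in a few lines. This is a minimal illustration that assumes availability is measured as successful requests over total requests; the function names and the 99.9% target are hypothetical.

```python
def availability_sli(successful: int, total: int) -> float:
    """Availability indicator: fraction of requests that succeeded."""
    if total == 0:
        return 1.0  # assumption: no traffic counts as fully available
    return successful / total


def meets_slo(sli: float, slo_target: float) -> bool:
    """True when the measured indicator is at or above the objective."""
    return sli >= slo_target


sli = availability_sli(successful=99_950, total=100_000)
print(f"SLI={sli:.4%}, meets 99.9% SLO: {meets_slo(sli, 0.999)}")
```

Tracking the result of `meets_slo` per period is one simple way to keep the historical record of SLOs that the slides recommend.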

▪ Service Level Indicators (Examples)

▪ Latency ▪ Errors ▪ Availability ▪ Throughput

▪ Generalized Indicators

▪ Management By Objectives (MBO) ▪ Key Performance Indicators (KPI) ▪ Objectives and Key Results (OKR)

“Management is doing things right; leadership is doing the right things.” ― Peter F. Drucker

“A production line that never stopped was either extremely good or extremely bad”

- Taiichi Ohno

▪ Understanding Risk and Failure

▪ 100% reliability is a myth ▪ All systems go down ▪ Not all services are equal ▪ Manage risk and failure by service ▪ Managing reliability is about managing risk ▪ Managing risk is about cost

▪ Understanding the Cost of Reliability

▪ High availability systems ▪ Opportunity costs

▪ Understanding the Cost of Reliability

▪ Is it a free service? ▪ Is it a revenue based service?

▪ How Many 9s

▪ One (90%) - 36.5 days per year ▪ Two (99%) - 3.65 days per year ▪ Three (99.9%) - 8.76 hours per year ▪ Four (99.99%) - 52.56 minutes per year ▪ Five (99.999%) - 5.26 minutes per year ▪ Six (99.9999%) - 31.5 seconds per year
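The downtime figures follow from multiplying the length of a year by the allowed unavailability. A short sketch, assuming a 365-day year (the function name is illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def allowed_downtime_seconds(nines: int) -> float:
    """Yearly downtime budget for an availability of `nines` nines."""
    unavailability = 10 ** -nines  # e.g. 3 nines -> 0.001
    return SECONDS_PER_YEAR * unavailability


for n in range(1, 7):
    seconds = allowed_downtime_seconds(n)
    print(f"{n} nines: {seconds / 3600:.4f} hours per year")
```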

▪ Example: One Million Dollars Per Day

▪ Two (99%) - 3.65 days per year = $3.65M ▪ Three (99.9%) - 8.76 hours per year = $365k ▪ Four (99.99%) - 52.56 minutes per year = $36.5k ▪ Five (99.999%) - 5.26 minutes per year = $3.65k ▪ Six (99.9999%) - 31.5 seconds per year = $365
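The dollar amounts are the yearly downtime, expressed in days, multiplied by the daily figure. A sketch under the assumption that the cost of downtime scales linearly with time; `downtime_cost` and its default of one million per day are illustrative:

```python
def downtime_cost(nines: int, value_per_day: float = 1_000_000.0) -> float:
    """Yearly cost of the downtime allowed by `nines` nines of availability."""
    downtime_days = 365 * 10 ** -nines  # assumes a 365-day year
    return downtime_days * value_per_day


for n in range(2, 7):
    print(f"{n} nines: ${downtime_cost(n):,.2f} per year")
```

Each additional nine cuts the exposure by a factor of ten, which is the number to weigh against what that extra nine costs to build and operate.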

▪ Google Site Reliability Engineers

▪ Google defined the job title ▪ Google SRE was created in 2003 ▪ No NOC ▪ A team that focuses on reliability

▪ Focus on service ▪ Focus on engineering

▪ Benjamin Treynor Sloss

▪ The number one feature for a product is that it works.

▪ The second most important feature for a product is that it works.

▪ The third most important feature for a product is that it works.

Fast Feedback

“You build it, you run it”

- Werner Vogels

▪ The Second Way - Right to Left

▪ Creating a Service Reliability Culture ▪ Fast Feedback ▪ Understanding Monitoring ▪ Understanding Complexity

▪ Fast Feedback

▪ Design for failure ▪ Adaptive systems - Feedback loops ▪ Developer managed service ▪ Contingency, peer reviews and pairing ▪ Embedded engineers

▪ Design for Failure

▪ Software resiliency typically is better than hardware based

▪ Cost ▪ Easier to change (fix, upgrade, replace) ▪ Faster to fix ▪ Easier to experiment

▪ Design for Failure

▪ MTTR over MTBF ▪ Game Days ▪ Chaos Monkey(s) ▪ Fault Injection

▪ Fast Feedback

▪ A/B Testing ▪ Dark Deploys ▪ Inject Deployment Metrics in Monitoring ▪ Developers Wear Pagers ▪ Pair Programming ▪ Peer Reviews

▪ Deploys - Upgrading Live Services

▪ Rolling Upgrades ▪ Canary ▪ Blue-Green Deploys ▪ Feature Toggles
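A canary release sends a small share of traffic to the new version and compares it against the current one before going further. The sketch below shows one possible promotion check based on error rates; the function name, its parameters, and the one-percentage-point tolerance are assumptions for illustration, not a standard.

```python
def canary_is_healthy(
    baseline_errors: int,
    baseline_requests: int,
    canary_errors: int,
    canary_requests: int,
    tolerance: float = 0.01,
) -> bool:
    """True when the canary's error rate is within `tolerance` of baseline."""
    if baseline_requests == 0 or canary_requests == 0:
        return False  # not enough data to justify promoting the canary
    baseline_rate = baseline_errors / baseline_requests
    canary_rate = canary_errors / canary_requests
    return canary_rate <= baseline_rate + tolerance


print(canary_is_healthy(10, 1000, 1, 100))  # canary matches baseline
print(canary_is_healthy(10, 1000, 5, 100))  # canary is clearly worse
```

A real check would likely also look at latency and require a minimum sample size; this version only shows the shape of the feedback loop.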

▪ Fast Feedback

▪ A/B Testing ▪ Dark Deploys ▪ Inject Deployment Metrics in Monitoring

“Reality is made up of circles but we see straight lines”

- Peter Senge

▪ Peer Reviews - Guidelines

▪ All changes are peer reviewed ▪ Everyone monitors the commit logs ▪ High risk changes should include an SME ▪ Break up larger changes into smaller ones

▪ Pairing

▪ Pair programming for everything ▪ Pair programming is slower but decreases bugs by up to 70% to 80% ▪ Spreads knowledge ▪ Great for training ▪ Set up pair times ▪ Need a culture that values pair programming

▪ Embedded Engineers

▪ Operations in development ▪ Development in operations

▪ ChatOps

“Everyone is pairing all the time”

Jesse Newland (GitHub)

▪ ChatOps Definition (Atlassian)

▪ ChatOps is a collaboration model that connects people, tools, process, and automation into a transparent workflow. This flow connects the work needed, the work happening, and the work done in a persistent location staffed by the people, bots, and related tools.

Source: http://blogs.atlassian.com/2016/01/what-is-chatops-adoption-guide/

▪ ChatOps Origins

▪ Originally based on chat bots ▪ GitHub’s use of Hubot ▪ Jesse Newland - ChatOps at GitHub ▪ Putting tools in the middle of the conversation

▪ ChatOps Chat Tools

▪ Slack ▪ Campfire ▪ HipChat

▪ ChatOps Benefits

▪ It’s like a multiuser terminal where everyone can see the conversation and the commands interwoven.
▪ There is a historical record of the commands and the conversation.
▪ Provides a great training tool - teaching by doing.
▪ Great for tactical incident resolution - everyone gets to see the conversation and commands.
▪ Dynamically manage the on-call rotation.
▪ Can manage all aspects of the “devops” practices from one central place.
▪ Mobile operations tool for free.

▪ ChatOps Examples

▪ Run a command ▪ Deploy code ▪ Check logs ▪ Check status from GitHub or Jenkins ▪ Change the on call rotation ▪ Check Nagios alert ▪ Graph monitoring or alert data ▪ Take a system online or offline ▪ Kill a job or process ▪ Answer help desk questions (ML)
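The core of a chat bot that runs commands is a dispatcher that maps a message to a handler and posts the reply back into the room. The sketch below is not Hubot or any real chat API: the `!` prefix, the `command` decorator, and both handlers are hypothetical stubs that only return text.

```python
from typing import Callable, Dict

COMMANDS: Dict[str, Callable[[str], str]] = {}


def command(name: str):
    """Register a handler under a chat command name."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        COMMANDS[name] = func
        return func
    return register


@command("deploy")
def deploy(args: str) -> str:
    return f"deploying {args or 'default'} (stub: no real deploy happens)"


@command("status")
def status(args: str) -> str:
    return f"status of {args or 'all services'}: unknown (stub)"


def handle_message(text: str) -> str:
    """Return the bot's reply, or an empty string for ordinary chat."""
    if not text.startswith("!"):
        return ""
    name, _, args = text[1:].partition(" ")
    handler = COMMANDS.get(name)
    if handler is None:
        return f"unknown command: {name}"
    return handler(args.strip())


print(handle_message("!deploy web"))
```

Because both the command and the reply appear in the shared channel, the conversation itself becomes the record of who ran what and when.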

Understanding Monitoring

“It’s not the upfront capital that kills you, it’s the operations and maintenance on the back end.”

- Gene Kim

▪ The Second Way - Right to Left

▪ Creating a Service Reliability Culture ▪ Fast Feedback ▪ Understanding Monitoring ▪ Understanding Complexity

▪ The Visible Ops Handbook (Kim, Behr, Spafford)

▪ Culture of Causality

▪ 80% of all outages are caused by a change
▪ 80% of restoration time is spent trying to figure out what changed
▪ High performance organizations look for the most recent change first
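Looking for the most recent change first is easy to automate once changes are recorded with timestamps. A minimal sketch, assuming a change log held as `(timestamp, description)` tuples; the function name and the data shape are hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple

Change = Tuple[datetime, str]


def most_recent_change_before(
    changes: List[Change], outage_start: datetime
) -> Optional[Change]:
    """Return the latest change made at or before the outage began."""
    candidates = [c for c in changes if c[0] <= outage_start]
    return max(candidates, key=lambda c: c[0], default=None)


log = [
    (datetime(2024, 1, 1, 9, 0), "config push"),
    (datetime(2024, 1, 1, 11, 30), "deploy v42"),
    (datetime(2024, 1, 1, 13, 0), "schema migration"),
]
print(most_recent_change_before(log, datetime(2024, 1, 1, 12, 0)))
```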

▪ Advanced Application Monitoring Tools

▪ New Relic ▪ AppDynamics ▪ Dynatrace

▪ SaaS Monitoring Tools

▪ Datadog ▪ Honeycomb ▪ SignalFx

▪ Why Monitor

▪ Alerting ▪ Visualizing ▪ Collecting ▪ Trending ▪ Anomalies ▪ Learning

▪ Google’s Four Golden Signals

▪ Latency ▪ Traffic ▪ Errors ▪ Saturation

▪ Looking at the Service Stack

▪ Business Indicators ▪ Application Indicators ▪ Infrastructure Indicators ▪ User Based Indicators ▪ Deployment Indicators

▪ Other Examples

▪ Resolution times ▪ Abandoned shopping carts ▪ Sales transactions ▪ Churn rate ▪ Deployment promotions ▪ Lead time ▪ Forum posts

▪ Monitoring Deployments

Source: Mike Brittain - Etsy Code as Craft

▪ Werner Vogels - Monitoring Question

▪ We monitor a lot of stuff but there is only one metric we care about. Order rate. We have years of heuristics telling us its upper and lower limits.
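Checking a single business metric against known limits can be sketched as follows. The function name, the limits, and the sample values are hypothetical; how the limits are derived from history is outside this example.

```python
def check_order_rate(
    orders_per_minute: float, lower_limit: float, upper_limit: float
) -> str:
    """Classify the current order rate against its expected range."""
    if lower_limit > upper_limit:
        raise ValueError("lower_limit must not exceed upper_limit")
    if orders_per_minute < lower_limit:
        return "below expected range"
    if orders_per_minute > upper_limit:
        return "above expected range"
    return "within expected range"


print(check_order_rate(420.0, lower_limit=300.0, upper_limit=900.0))
```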

▪ Facebook

▪ Components of a monitoring system

▪ Sensing/Measuring ▪ Collecting ▪ Analysis/Computation ▪ Alerting ▪ Escalation ▪ Visualization

Source: Limoncelli - The Practice of Cloud System Administration V2

▪ Black Box vs White Box

▪ Black Box Monitoring ▪ Symptom based ▪ Active Problems ▪ User’s experience

▪ White Box Monitoring ▪ Agents ▪ Logs ▪ Instrumentation

▪ Types of Metrics (Raw)

▪ Gauges ▪ Counters ▪ Timers
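The three raw metric types differ in how their values change over time. The classes below are a minimal sketch for illustration and do not mirror any particular metrics library.

```python
import time
from typing import Any, Callable, List


class Counter:
    """A value that only increases, such as total requests served."""

    def __init__(self) -> None:
        self.value = 0

    def inc(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("counters cannot decrease")
        self.value += amount


class Gauge:
    """A point-in-time value that can go up or down, such as queue depth."""

    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = value


class Timer:
    """Records how long each timed call took, in seconds."""

    def __init__(self) -> None:
        self.durations: List[float] = []

    def time(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            self.durations.append(time.perf_counter() - start)
```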

▪ Types of Metrics (Derived)

▪ Delta ▪ Rates ▪ Ratios
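Derived metrics are computed from raw samples rather than measured directly. A short sketch with illustrative function names, assuming two readings of the same counter taken a known number of seconds apart:

```python
def delta(previous: float, current: float) -> float:
    """Change in a value between two samples."""
    return current - previous


def rate(previous: float, current: float, interval_seconds: float) -> float:
    """Change per second between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (current - previous) / interval_seconds


def ratio(numerator: float, denominator: float) -> float:
    """One quantity relative to another, such as errors per request."""
    return numerator / denominator if denominator else 0.0


# Request counter read 60 seconds apart, plus an error count.
print(delta(100, 160), rate(100, 160, 60), ratio(5, 200))
```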

▪ Analysis

▪ Real Time ▪ Correlation ▪ Historical ▪ Anomaly Detection ▪ Machine Learning

▪ Statistical Analysis

▪ Mean ▪ Median ▪ Percentiles ▪ Standard Deviation ▪ Median Absolute Deviation

Source: Wikipedia

68–95–99.7 Rule
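For normally distributed data, the fraction of values within k standard deviations of the mean is erf(k / sqrt(2)). The sketch below reproduces the three figures in the rule's name; the function name is illustrative.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviations: {fraction_within(k):.4f}")
```

The rule holds only when the data really is normally distributed.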

▪ Non-Gaussian Distribution Data

▪ Most IT operations and performance data doesn’t have a Gaussian distribution

▪ This can lead to over or under alerting

▪ Median ▪ Median Absolute Deviation
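The median and the median absolute deviation (MAD) are less sensitive to extreme values than the mean and standard deviation. The sketch below uses Python's `statistics` module; the threshold of 3 and the sample latencies are assumptions for illustration.

```python
import statistics
from typing import List


def median_absolute_deviation(values: List[float]) -> float:
    """Median of the absolute distances from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def mad_outliers(values: List[float], threshold: float = 3.0) -> List[float]:
    """Values more than `threshold` MADs away from the median."""
    med = statistics.median(values)
    mad = median_absolute_deviation(values)
    if mad == 0:
        return []  # no spread around the median, nothing to compare against
    return [v for v in values if abs(v - med) / mad > threshold]


latencies_ms = [100, 102, 98, 101, 99, 103, 97, 1000]
print(statistics.mean(latencies_ms), statistics.median(latencies_ms))
print(mad_outliers(latencies_ms))
```

In this sample the single slow request pulls the mean far from the typical latency, while the median barely moves.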

▪ Histograms

▪ Percentiles

▪ Inverse Quantiles

▪ Instead of measuring how slow the slowest transactions are (99th percentile)

▪ Measure how many transactions are too slow

▪ Modality Changes
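The difference between a percentile and an inverse quantile can be shown on the same data: the first answers "how slow is the 99th percentile?", the second answers "what fraction of transactions exceeded a threshold?". This is a sketch with illustrative function names, using the nearest-rank percentile definition, which is one of several in use.

```python
import math
from typing import List


def percentile(values: List[float], p: int) -> float:
    """Nearest-rank percentile: the value at or below which p% of samples fall."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def fraction_slower_than(values: List[float], threshold: float) -> float:
    """Inverse quantile: share of samples strictly above the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


latencies_ms = list(range(1, 101))  # synthetic data: 1 ms to 100 ms
print(percentile(latencies_ms, 99))
print(fraction_slower_than(latencies_ms, 90))
```

The inverse form maps directly onto a target such as "no more than 1% of requests slower than 200 ms", where the threshold is fixed and the fraction is what gets tracked.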

▪ Modality Changes

Source: Theo Schlossnagel http://www.slideshare.net/postwait/adaptive-availability

▪ Aggregate Graphs

Source: datadoghq.com

▪ Anomaly Detection

▪ Finding patterns in data that do not conform to expected behavior

▪ Can be used for noise reduction

Source: Chandola - Anomaly Detection : A Survey

▪ Anomaly Detection - Research Areas

▪ Statistics ▪ Machine Learning ▪ Information Theory ▪ Data Mining

Source: Chandola - Anomaly Detection : A Survey

▪ Anomaly Detection - Characteristics

▪ High Cardinality ▪ Minimizing False Positives ▪ Seasonality ▪ Non-Normal Distributions

Source: Chandola - Anomaly Detection : A Survey

▪ Anomaly Detection - Netflix

▪ Ebay Case Study

Source: http://www.ebaytechblog.com/2015/08/19/statistical-anomaly-detection/

Understanding Complexity

“I smile and start to count on my fingers: One, people are good. Two, every conflict can be removed. Three, every situation, no matter how complex it initially looks, is exceedingly simple. Four, every situation can be substantially improved; even the sky is not the limit. Five, every person can reach a full life. Six, there is always a win-win solution. Shall I continue to count?”

- Eliyahu M. Goldratt

▪ The Second Way - Right to Left

▪ Creating a Service Reliability Culture

▪ Fast Feedback

▪ Understanding Monitoring

▪ Understanding Complexity

Understanding Complexity

The Second Way

▪ Complexity

▪ In Search of Certainty

▪ Mark Burgess invented Desired State Configuration Management 20+ years ago

▪ Created Promise Theory 10+ years ago

▪ Draws on physics and biology to argue that uncertainty is an inescapable fact of technology.

Understanding Complexity

▪ Cybernetics

▪ Defined by Norbert Wiener in 1948

▪ Circular Causality

▪ Self-Steering Approach

▪ Listen, Calibrate, Change and Adapt

▪ Systemic Approach

Understanding Complexity

▪ Cynefin

▪ Defined by Dave Snowden

▪ Designed to describe the evolutionary nature of complex systems

▪ Draws on research from complex adaptive systems theory, cognitive science, anthropology and psychology

Understanding Complexity

Source: Wikipedia - Cynefin

▪ Cause and Effect is Obvious

▪ Sense: see what’s coming in

▪ Categorise: make it fit predetermined categories

▪ Respond: decide what to do

Understanding Complexity

Source: Wikipedia - Cynefin

▪ Cause and Effect Requires Analysis

▪ Sense: see what’s coming in

▪ Analyse: investigate or analyse, using expert knowledge

▪ Respond: decide what to do

Understanding Complexity

Source: Wikipedia - Cynefin

▪ Cause and Effect in Retrospect

▪ Probe: experimental input

▪ Sense: failures or successes

▪ Respond: decide what to do, amplify or dampen

Understanding Complexity

Source: Wikipedia - Cynefin

▪ Cause and Effect Undetermined

▪ Act: attempt to stabilize

▪ Sense: failures or successes

▪ Respond: decide what to do next

Understanding Complexity

Source: Wikipedia - Cynefin

Understanding Complexity

Source: old.cognitive-edge.com

▪ Circuit Breaker Patterns

▪ Wrap a protected function call in a circuit breaker object

▪ The breaker monitors the call for failures

▪ When a failure threshold is met, the circuit breaker trips

▪ Further calls then return immediately with an error
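
The steps above can be sketched as a small wrapper class. This is an illustrative sketch under stated assumptions, not a reference implementation: the names `CircuitBreaker` and `CircuitOpenError`, the `failure_threshold` and `reset_timeout` parameters, and the half-open retry after a timeout are assumptions added here for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(self, func, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self._func = func                      # the protected function call
        self._failure_threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock                    # injectable for testing
        self._failures = 0
        self._opened_at = None                 # None means the circuit is closed

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self._reset_timeout:
            return "half-open"                 # allow one trial call
        return "open"

    def call(self, *args, **kwargs):
        state = self.state
        if state == "open":
            # Tripped: return an error without calling the protected function.
            raise CircuitOpenError("circuit open; call rejected")
        try:
            result = self._func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if state == "half-open" or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()  # trip (or re-trip) the breaker
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap a remote call once, for example `breaker = CircuitBreaker(fetch_profile)` with a hypothetical `fetch_profile` function, and then use `breaker.call(user_id)` everywhere, handling `CircuitOpenError` as the fast-failure case.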

Understanding Complexity

▪ Circuit Breaker Patterns

Understanding Complexity

Source: Martin Fowler http://martinfowler.com/bliki/CircuitBreaker.html

▪ Netflix - Circuit Breaker - Hystrix

▪ Give protection from and control over latency and failure from dependencies accessed (typically over the network) via third-party client libraries.

▪ Stop cascading failures in a complex distributed system.

▪ Fail fast and rapidly recover.

▪ Fallback and gracefully degrade when possible.

▪ Enable near real-time monitoring, alerting, and operational control.

Understanding Complexity

▪ Netflix - Circuit Breaker - Hystrix

▪ Isolates access points between services

▪ Can set up triggers (trip if 10 calls within 10 seconds take longer than 5 seconds)

▪ Provides fallback options (error, default value, null value, or special error)
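
The trigger and fallback behaviour described above can be approximated in a few lines. This sketch does not use the Hystrix API (Hystrix is a Java library); the class name `SlowCallBreaker` and its parameters are hypothetical, with defaults that mirror the numbers on the slide. Unlike Hystrix, it does not interrupt a slow call; it only measures the call after it returns.

```python
import threading
import time
from collections import deque


class SlowCallBreaker:
    """Trip when too many slow calls finish within a sliding time window."""

    def __init__(self, func, fallback, max_slow_calls=10, window_seconds=10.0,
                 slow_threshold=5.0, clock=time.monotonic):
        self._func = func
        self._fallback = fallback              # supplies the degraded response
        self._max_slow_calls = max_slow_calls
        self._window_seconds = window_seconds
        self._slow_threshold = slow_threshold
        self._clock = clock
        self._slow_calls = deque()             # finish times of recent slow calls
        self._lock = threading.Lock()
        self.tripped = False

    def call(self, *args, **kwargs):
        if self.tripped:
            return self._fallback(*args, **kwargs)
        started = self._clock()
        try:
            result = self._func(*args, **kwargs)
        except Exception:
            # Degrade gracefully instead of propagating the failure.
            return self._fallback(*args, **kwargs)
        finished = self._clock()
        with self._lock:
            if finished - started > self._slow_threshold:
                self._slow_calls.append(finished)
            # Forget slow calls that have fallen out of the window.
            while (self._slow_calls
                   and finished - self._slow_calls[0] > self._window_seconds):
                self._slow_calls.popleft()
            if len(self._slow_calls) >= self._max_slow_calls:
                self.tripped = True
        return result
```

With the default numbers the breaker can only trip when calls overlap, since ten sequential calls of more than 5 seconds each cannot finish within 10 seconds; that is why the bookkeeping is guarded by a lock. The fallback can return a default value, `None`, or raise a specific error, matching the options listed on the slide.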

Understanding Complexity

Understanding Complexity

Source: https://github.com/netflix/hystrix/wiki

▪ Other Users of the Circuit Breaker Pattern

▪ Spring Boot

▪ Nginx Plus

▪ Envoy (Istio)

Understanding Complexity