Turning Human Capital into High Performance Organizational Capital

Transcript of Turning Human Capital into High Performance Organizational Capital

Page 1: Turning Human Capital into High Performance Organizational Capital

Devops: Turning Human Capital into High Performance Organizational Capital

John Willis @botchagalupe

Page 2: Turning Human Capital into High Performance Organizational Capital

About Me

• One of the founding members of “Devopsdays”
• Co-author of the “Devops Handbook”
• Author of the “Introduction to Devops” on Linux Foundation edX
• Podcaster at devopscafe.org
• Devops Enterprise Summit - Cofounder
• Ninth person in at Chef (VP of Customer Enablement)
• Formerly Director of Devops at Dell
• Founder of Socketplane (Acquired by Docker)
• 10 Startups over 25 years

https://github.com/botchagalupe/my-presentations

Page 6: Turning Human Capital into High Performance Organizational Capital

How would I describe Devops to a CEO?

Page 7: Turning Human Capital into High Performance Organizational Capital

How would you describe Devops to a CEO?

Page 8: Turning Human Capital into High Performance Organizational Capital

Exercise Time (Deep Breath)

Page 9: Turning Human Capital into High Performance Organizational Capital

The consequences of failure have never been greater…

Page 10: Turning Human Capital into High Performance Organizational Capital

Wanna know how?

Page 12: Turning Human Capital into High Performance Organizational Capital

Devops Practices and Patterns

• Continuous Delivery
  • Everything in version control
  • Small batch principle
  • Trunk based deployments
  • Manage flow (WIP)
  • Automate everything
• Culture
  • Everyone is responsible
  • Done means released
  • Stop the line when it breaks
  • Remove silos

itrevolution.com/devops-handbook

Page 13: Turning Human Capital into High Performance Organizational Capital

Human Capital and High Performance Organizations

Page 14: Turning Human Capital into High Performance Organizational Capital

Recent IT Performance Data is Compelling

High performers compared to their peers…

• 30x more frequent deployments
• 200x faster lead times
• 60x the change success rate
• 168x faster mean time to recover (MTTR)
• 2x more likely to exceed profitability, market share & productivity goals
• 50% higher market capitalization growth over 3 years*

Data from 2014/2015 State of DevOps Report - https://puppetlabs.com/2015-devops-report

Page 15: Turning Human Capital into High Performance Organizational Capital

Recent IT Performance Data is Compelling

High performers compared to their peers…

Faster
• 30x more frequent deployments
• 200x faster lead times

Higher Quality
• 60x the change success rate
• 168x faster mean time to recover (MTTR)

More Effective
• 2x more likely to exceed profitability, market share & productivity goals
• 50% higher market capitalization growth over 3 years*

2555x

Data from 2014/2015 State of DevOps Report - https://puppetlabs.com/2015-devops-report

Page 16: Turning Human Capital into High Performance Organizational Capital

Conventional Wisdom

Fast • Cheap • Good

“Pick Two!”

Page 17: Turning Human Capital into High Performance Organizational Capital

Faster, Better, and Cheaper?

Page 18: Turning Human Capital into High Performance Organizational Capital

Organizational culture was one of the strongest predictors of both IT performance and the overall performance of the organization

Page 19: Turning Human Capital into High Performance Organizational Capital

Devops is about Humans

Devops is a set of practices and patterns that turn human capital into high performance organizational capital.

Page 21: Turning Human Capital into High Performance Organizational Capital

Google

• Over 15,000 engineers in over 40 offices
• 4,000+ projects under active development
• 5500+ code submissions per day (20+ p/m)
• Over 75M test cases run daily
• 50% of code changes monthly
• Single source tree

Page 22: Turning Human Capital into High Performance Organizational Capital

Amazon

• 11.6 second mean time between deploys
• 1079 max deploys in a single hour
• 10,000 mean number of hosts simultaneously receiving a deploy
• 30,000 max number of hosts simultaneously receiving a deploy
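To put the first two figures on a common scale, here is a minimal Python sketch that converts them into deploys per day and seconds per deploy. The function names are illustrative; only the 11.6 second and 1079 per hour figures come from the slide.

```python
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def deploys_per_day(mean_seconds_between_deploys: float) -> float:
    """Average number of deploys in 24 hours for a given mean gap."""
    return SECONDS_PER_DAY / mean_seconds_between_deploys


def seconds_between_deploys(deploys_per_hour: float) -> float:
    """Average gap between deploys for a given hourly deploy count."""
    return SECONDS_PER_HOUR / deploys_per_hour


mean_gap_seconds = 11.6   # from the slide: mean time between deploys
peak_per_hour = 1079      # from the slide: max deploys in a single hour

print(f"{deploys_per_day(mean_gap_seconds):,.0f} deploys per day on average")
print(f"one deploy every {seconds_between_deploys(peak_per_hour):.2f} s in the peak hour")
```

Under these figures the average works out to roughly 7,400 deploys a day, and the peak hour to about one deploy every 3.3 seconds.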

Page 23: Turning Human Capital into High Performance Organizational Capital

Unicorns and Horses (Enterprises)

Unicorns

Enterprise

Shamelessly stolen and repurposed from: Pete Cheslock

Page 24: Turning Human Capital into High Performance Organizational Capital

Enterprise Organizations

• Ticketmaster - 98% reduction in MTTR
• Nordstrom - 20% shorter Lead Time
• Target - Full Stack Deploy 3 months to minutes
• USAA - Release from 28 days to 7 days
• ING - 500 application teams doing devops
• CSG - From 200 incidents per release to 18
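Two of these results are given as before/after pairs rather than percentages. A small Python sketch (the helper name is an illustration, not something from the talk) puts them on the same percentage scale as the Ticketmaster and Nordstrom figures:

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Before/after pairs taken from the slide.
results = {
    "USAA release cycle (days)": (28, 7),
    "CSG incidents per release": (200, 18),
}

for name, (before, after) in results.items():
    print(f"{name}: {percent_reduction(before, after):.0f}% reduction")
```

On the slide's numbers, USAA's release cycle is 75% shorter and CSG's incidents per release are down 91%.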

Page 25: Turning Human Capital into High Performance Organizational Capital

Faster, Better, and Cheaper. How?

Page 26: Turning Human Capital into High Performance Organizational Capital

Lean • Safety Culture • Learning Organization

Page 27: Turning Human Capital into High Performance Organizational Capital

Lean

Page 28: Turning Human Capital into High Performance Organizational Capital

Parts Unlimited - "Major Release 6" (Early 2014)

[Figure: value stream map of the release, from Project Initiation to Production Release. Phases shown: Initiation and Planning; Steering; Design; Server Requirements Gathering; Server Approval and Assignment; Provisioning; Dev Breakdown; Dev / Test; Staging Release; Production Release. Tools shown include ServiceNow, MS Project, MS Office, SharePoint, Jira, Confluence, and Test Link. Roles shown include the C-level Steering Committee, PM, Stakeholders (Tech and Biz), Tech Leads, Architects, the Architecture Review Board ("Bill" plus Architects), Delivery Manager ("Matt"), Delivery Engineer, Ops PM ("Linda"), Test Data Configuration Manager ("Jennifer"), Dev Leads, QA, Scrum teams, and Product Owners. Steps are annotated with waste-category codes (TS, D, M, W, EP, PD, H).]

Timings marked on the map: three 3-month planning steps; 1 week, 1 week, and 1 - 6 weeks around server requests and provisioning; 4 weeks in the ARB Queue; a 2 week development sprint cycle time; spans of 6 weeks (H/C: 6), 3 weeks (H/C: 8), 4 weeks (H/C: 8), and 3 weeks (H/C: 14); 16 weeks overall for one stretch.

S/R values marked on steps: 90%, 75%, 55%, 50%, 20%, 15%, ~10%.

Problems noted on the map:

• 9+ months of planning before implementation starts (and information / requirements still incorrect or incomplete!)
• Gaps in requirements: licenses; dependencies on 3rd party apps; capacity planning always seems low ("robbing Peter to pay Paul"); don't purchase in advance even though we know it's coming
• Duplicate info across different documents
• Procurement of physical servers can take months (lead times for procurement plus facilities groups)
• Dev and QA told to submit server request 6-8 weeks in advance (only done 50% of time)
• Sometimes submits server requests directly to delivery. Ad-hoc requests get lost, maybe 2-3 week delays
• Too many environments in one ticket causes audit confusion
• Piecemeal requests ("2 this week, 3 next week")
• 1 queue for delivery team with ~1,000 tickets at once
• Capacity issues cause delay
• Often told to stop everything and do something else
• No monitoring or backup for some environments
• 30% of delivery team's time spent "consulting" on performance and dealing with unfounded requests for more capacity
• 3-5 days to fix
• Create Change Tickets > 100
• Often skips CAB. What CAB reviews is often not what is built
• All manual setup. 1 person really knows how. Low data quality.
• Manual process with lots of back and forth.
• Many tickets with mismatched priorities
• Mostly manual testing
• Manual, per cluster. Frequently down.
• External service updates take offline. Lots of contention.
• RESET DELIVERY DATE! Fix Tickets!

Improvements proposed on the map:

1. Fully Automated Environment Provisioning
2. Visualization of flow of work and expected upcoming work
3. Standard product catalog ("Environments on Demand")
4. Shorten from Design to Implementation
5. New "white glove" engagement model
7. Small Batches
8. Write end-to-end customer func. tests
9. Service Verification test writing: shift left to Dev (test early)
10. Test data setup automation
11. Resolve interface to legacy
12. Remove Bottleneck and Environment Contention (test more)
13. Dev Deploy to Prod for legacy
14. Unify change management tools
15. Tool

• Make the work visible for all
• Manage flow and eliminate waste
• Build alignment and consensus across team boundaries
• Empower teams to find and fix what is getting in the way

Page 29: Turning Human Capital into High Performance Organizational Capital

• Small Batch
• Reduce Work in Process (WIP)
• 1x1 Flow
• Reduce Bottlenecks (TOC)
• Optimize Globally
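One way to see why reducing WIP matters is Little's Law: in a stable queue, average lead time equals average WIP divided by average throughput. The Python sketch below is an illustration under that assumption. The throughput of 50 tickets per week is a hypothetical number chosen for the example; the talk does not state one.

```python
def average_lead_time(wip_items: float, throughput_per_week: float) -> float:
    """Little's Law: average lead time (weeks) = WIP / throughput.

    Holds for long-run averages in a stable system, where work arrives
    about as fast as it is completed.
    """
    if throughput_per_week <= 0:
        raise ValueError("throughput must be positive")
    return wip_items / throughput_per_week


# Hypothetical throughput; not a figure from the talk.
THROUGHPUT_PER_WEEK = 50.0

for wip in (1000, 500, 100):
    weeks = average_lead_time(wip, THROUGHPUT_PER_WEEK)
    print(f"WIP {wip:>4} tickets -> average lead time {weeks:.0f} weeks")
```

With throughput held constant, cutting the queue from 1,000 tickets to 100 cuts the average wait by the same factor of ten, without anyone working faster.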

Page 30: Turning Human Capital into High Performance Organizational Capital

Where does lean come from?

Page 31: Turning Human Capital into High Performance Organizational Capital

Let’s talk Kata

Page 32: Turning Human Capital into High Performance Organizational Capital

I fear not the man who has practiced 10,000 kicks once, but I fear the man who has practiced one kick 10,000 times

- Bruce Lee

Page 33: Turning Human Capital into High Performance Organizational Capital

Toyota is not a story about techniques. It’s an organization defined primarily by the unique behavior routines it continually teaches to all its members.

- Mike Rother (Page 262-263)

Page 34: Turning Human Capital into High Performance Organizational Capital

Wanna see what Kata looks like in Devops?

Page 37: Turning Human Capital into High Performance Organizational Capital

I have no idea how to answer that question. It would literally never occur to me not to do it!

KATA

Page 38: Turning Human Capital into High Performance Organizational Capital

We are what we repeatedly do. Excellence, then, is not an act, but a habit.

- The Dude

Page 39: Turning Human Capital into High Performance Organizational Capital

Improvement Kata

Coaching Kata

Page 44: Turning Human Capital into High Performance Organizational Capital

• Capability 1: Seeing problems as they occur
  • Complex work is managed so that problems in design are revealed
  • They see problems as they occur, through relentless testing of assumptions
• Capability 2: Swarming and solving problems as they are seen to build new knowledge
  • Problems that are seen are solved so that new knowledge is built quickly
  • Improvement of daily work is prioritized above daily work
• Capability 3: Spreading new knowledge throughout the organization
  • New local discoveries and improvements are turned into global improvements, shared throughout the organization
  • Learning is fed back to prevent future failures
• Capability 4: Leading by developing
  • The job of leaders is not to command and control, but to create other capable leaders who can perpetuate this system of work

Page 45: Turning Human Capital into High Performance Organizational Capital

Safety Culture

Page 47: Turning Human Capital into High Performance Organizational Capital

Wanna See Another Video?

Page 50: Turning Human Capital into High Performance Organizational Capital

Views on Human Error

Page 51: Turning Human Capital into High Performance Organizational Capital

▪ Views on Human Error

▪ The old view of human error (First Story)

▪ Human error is the cause of accidents
▪ To explain failure, you must seek failure
▪ You must find people’s: inaccurate assessments, wrong decisions, bad judgments

Page 52: Turning Human Capital into High Performance Organizational Capital

▪ Views on Human Error

▪ The new view of human error (Second Story)

▪ Human error is a symptom of trouble deeper inside a system
▪ To explain failure, do not try to find where people went wrong
▪ Instead, find how people’s assessments and actions made sense at the time, given the circumstances that surrounded them

Page 53: Turning Human Capital into High Performance Organizational Capital

▪ Bad Apple Theory - Throw away the bad apples

▪ Complex systems are basically safe; they need to be protected from unreliable people (bad apples)
▪ Human errors cause accidents: humans are the dominant contributor to more than two thirds of mishaps
▪ Errors occur because of human loss of situation awareness, complacency, negligence
▪ Errors are introduced to the system only through the inherent unreliability of people

Page 54: Turning Human Capital into High Performance Organizational Capital

Murphy’s Law is Wrong!

What can go wrong usually goes right, but then we draw the wrong conclusion.

- Sidney Dekker, The Field Guide to Human Error

Page 55: Turning Human Capital into High Performance Organizational Capital

Blameless Culture

A blameless culture believes that systems are NOT inherently safe and humans do the best they can to keep them running.

Page 56: Turning Human Capital into High Performance Organizational Capital

Thematic Vagabonding

People jump from one topic to the next, treating all superficially, in certain cases picking up topics dealt with earlier at a later time; they don’t go beyond the surface with any topic and seldom finish with any. (Dörner, 1980)

Page 57: Turning Human Capital into High Performance Organizational Capital

Your organization must continually affirm that individuals are NEVER the ‘root cause’ of outages.

Page 58: Turning Human Capital into High Performance Organizational Capital

▪ Awesome Postmortems - Mindweather LLC

▪ In complex systems, there is no root cause, except…
▪ There are (multiple) conditions, some of which are unknowable, unfixable, outside our control
▪ People did what made sense at the time, given the information they had (no counterfactuals)
▪ Failure and success are both normal in complex systems
▪ Getting the full account* of what happened is more important than blame/punishment

Page 59: Turning Human Capital into High Performance Organizational Capital

▪ Hindsight bias: the "knew-it-all-along" effect; seeing the event as having been predictable; counterfactuals

▪ Outcome bias: evaluating the quality of a decision when the outcome of that decision is already known

▪ Availability bias: preference by decision makers for information and events that are more recent

▪ Fundamental attribution error: explaining behavior in terms of internal disposition, such as personality traits, abilities, motives, etc., as opposed to external situational factors

Page 60: Turning Human Capital into High Performance Organizational Capital

▪ Just Culture at Etsy (John Allspaw)

▪ Encourage learning by having these blameless Post-Mortems on outages and accidents

▪ Understand how an accident happens, in order to better equip ourselves to prevent it from happening in the future

▪ Gather details from multiple perspectives on failures, and we don’t punish people for making mistakes

▪ Enable and encourage people who do make mistakes to be the experts on educating the rest of the organization how not to make them in the future

Page 61: Turning Human Capital into High Performance Organizational Capital

▪ Just Culture at Etsy (John Allspaw)

▪ Accept that there is always a discretionary space where humans can decide to take actions or not, and that the judgement of those decisions lies in hindsight

▪ Accept that the Hindsight Bias will continue to cloud our assessment of past events, and work hard to eliminate it

▪ Accept that the Fundamental Attribution Error is also difficult to escape, so we focus on the environment and circumstances people are working in when investigating accidents

Page 63: Turning Human Capital into High Performance Organizational Capital

Learning Organization

Page 64: Turning Human Capital into High Performance Organizational Capital

That’s how it’s always been done around here!

Page 65: Turning Human Capital into High Performance Organizational Capital

You are either building a learning organization… or you will be losing to someone who is

- Walter Sobchak - Andrew Clay Shafer

Page 67: Turning Human Capital into High Performance Organizational Capital

▪Dr Deming

Page 68: Turning Human Capital into High Performance Organizational Capital

A learning organization is a place where people are continually discovering how they create their reality.

- Peter Senge

Page 69: Turning Human Capital into High Performance Organizational Capital

▪ Five Disciplines must be adopted to become a learning organization

▪ Systems Thinking

▪ Personal Mastery

▪ Mental Models

▪ Shared Vision

▪ Team Learning

Page 70: Turning Human Capital into High Performance Organizational Capital

Ladder of Inference (Chris Argyris)

• Action
• Beliefs
• Conclusions
• Assumptions
• Meanings
• Select
• Observe

Page 71: Turning Human Capital into High Performance Organizational Capital

Ladder of Inference

▪ Can create bad judgement
▪ Our assumptions can lead us to bad conclusions
▪ Question your assumptions and conclusions
▪ Seek contrary data
▪ Make your assumptions visible to others
▪ Invite others to test your assumptions and conclusions
▪ Inquire into other people’s assumptions and conclusions
▪ Move down the ladder instead of up

Page 72: Turning Human Capital into High Performance Organizational Capital

Ladder of Inference - Bad Judgement

▪ Observe - I notice people in the first row
▪ Select - Person in the front row keeps looking at their phone
▪ Meaning - Not listening to my presentation
▪ Assumption - They are not interested
▪ Conclusion - Doesn’t like my new idea
▪ Beliefs - Their team always blocks new ideas
▪ Action - I send a nasty email to their boss

Page 73: Turning Human Capital into High Performance Organizational Capital

Ladder of Inference - Alternative Assumption

▪ Observe - I notice people in the first row
▪ Select - Person in the front row keeps looking at their phone
▪ Meaning - Not listening to my presentation
▪ Assumption - Try to engage with a question (safely)
▪ Conclusion - Might find out that they are late for another meeting and really don’t want to miss this one, so they sent an email notifying the next meeting’s team that they will be late
▪ Beliefs - They are very excited about this new idea
▪ Action - Both teams set up another meeting to engage

Page 78: Turning Human Capital into High Performance Organizational Capital

Lean • Safety Culture • Learning Organization • Psychology

Page 79: Turning Human Capital into High Performance Organizational Capital

▪ Very interesting research…

▪ Christina Maslach - Organizational Burnout

▪ Geri Puleo - Burnout (BDOC)

▪ Carol Dweck - Mindsets

▪ Kelly McGonigal - Stress

https://github.com/botchagalupe/my-presentations

Page 80: Turning Human Capital into High Performance Organizational Capital

Bonus

Page 82: Turning Human Capital into High Performance Organizational Capital

▪ Anomaly Response

▪ Computers do not resolve outages… people do
▪ Trade-offs under pressure
▪ Cognition in the wild
▪ An outage is not a detective story
▪ With each step the story changes
▪ Need to see what’s happening with incomplete information
▪ Tools don’t always make things better

Page 83: Turning Human Capital into High Performance Organizational Capital

▪ Anomaly Response - Internet Services are Opaque

▪ Network layer abstractions
▪ Variability in network performance
▪ Interdependent and decoupled services
▪ Internet based distributed computing
▪ Geographically distributed communication
▪ Open internet facing interactions

Page 84: Turning Human Capital into High Performance Organizational Capital

▪ Anomaly Response - Challenges

▪ Teamwork
▪ Communication
▪ Diagnosis
▪ Decision Making
▪ Coordination
▪ Improvisation
▪ Tooling

Page 85: Turning Human Capital into High Performance Organizational Capital

▪ Anomaly Response - Dynamic Fault Management

▪ Cascading effects
▪ Tempo changes and time pressure
▪ Multiple interleaved tasks
▪ Multiple interacting goals
▪ Need to revise assessments as new evidence comes in

Page 86: Turning Human Capital into High Performance Organizational Capital

"In dynamic fault management, intervention precedes or is interwoven with diagnosis"

- Woods (1994)

Page 87: Turning Human Capital into High Performance Organizational Capital

Source: (Woods) John Allspaw - http://bit.ly/AllspawThesis