Leading A DevOps Transformation: Lessons Learned
Uploaded by: gene-kim
Category: Leadership & Management
Transcript of Leading A DevOps Transformation: Lessons Learned
@RealGeneKim
Dr. Steve Spear
Dr. Steven Spear
“As tempting as it seems, you cannot reorganize your way to continuous improvement and adaptiveness. What is decisive is not the form of the organization, but how people act and react.
“The roots of Toyota’s success lie not in its organizational structures, but in developing capability and habits in its people. It surprises many people, in fact, to find that Toyota is largely organized in a traditional, functional-department style.”
– Mike Rother
Dr. Steven Spear: “While designing perfectly safe systems is likely beyond our abilities, safe systems are close to achievable” when the following conditions are met…
Capability 1: See problems as they occur
Complex work is managed so that problems in design are revealed
They see problems as they occur, through relentless testing of assumptions
Automated testing in the deployment pipeline, proactive monitoring of the production environment, …
Source: Dr. Steven Spear
Capability 2: Swarming and solving problems as they are seen to build new knowledge
Problems that are seen are solved so that new knowledge is built quickly
Improvement of daily work is prioritized above daily work
Stopping work when builds, tests, deployments and services break, enabling fast feedback loops, especially to Dev…
Source: Dr. Steven Spear
Capability 3: Spreading new knowledge throughout the organization
Newly discovered local knowledge and improvements are turned into global improvements, shared throughout the organization
Learning is fed back into the system to prevent future failures
High trust culture, blameless post-mortems when things go wrong, single source code repositories enterprise-wide, …
Source: Dr. Steven Spear
Capability 4: Leading by developing
The job of leaders is not to command and control, but to create other capable leaders who can perpetuate this system of work
Encouraging experimentation and learning, coaching, removing obstacles, enabling
Source: Dr. Steven Spear
“Culture isn’t just touchy-feely kumbaya. Instead, it is the consistent response by a group of people to conditions. When we change culture, we fundamentally shift how people respond to a situation.”
– Dr. Steven Spear
“The most effective way is for senior leaders to change the conversation from ‘did you carry your orders out?’ to ‘what did you learn today?’ ”
– Dr. Steven Spear
The “Big Bang” Transformation Dream
Start
Finish
Source: Damon Edwards (@damonedwards)
The “Big Bang” Transformation Reality
Start
Finish
Fear
Panic
Abort
Maybe
People revert to legacy behaviors
Source: Damon Edwards (@damonedwards)
“Big J” vs “Little J’s”
Big Bang: Start, Finish
Continuous Improvement: Start, Finish
Source: Damon Edwards (@damonedwards)
Other Side Of Innovation
Breaking The Bottlenecks In The Flow
Environment creation
Code deployment
Test setup and run (mention @rohansingh)
Overly tight architecture
Development
Product management
Blackboard Learn: 2005-Present
Source: David Ashman, Chief Architect, Blackboard, Inc. (@davidbashman)
LoC
Commits
The Problem
Blackboard Learn Building Blocks
Source: David Ashman, Chief Architect, Blackboard, Inc. (@davidbashman)
Target
“stopping changes makes it worse”
“still working out how to apply this to legacy”
“still challenged to scale across thousands of people”
Make structural changes
Modernise technology
Connect important dots
Build an internal incubator
Develop learning service offerings
Prioritise demand based on constraints
Six internal DevOps conferences
Instead of waiting 3-6 months, an individual can build a full stack automatically
200 trained in DevOps
Source: Rob England (@theitskeptic)
Chivas Nambiar, Verizon
CSG bill printing
40 dev teams, 1000 staff
A release has been practiced 70 times
Phoenix servers not snowflakes
Improve work visibility
Single intake of work: dev, ops, requests
Go see, and role rotation
Change behaviour to change culture
Legacy test automation
Strangler pattern
Telemetry and shared understanding
Regression tests went from 20% to 5% of effort
Incidents per release
2013: 201
2015: 18
Source: Rob England (@theitskeptic)
1b. Dependency Visibility
Make your team and system dependencies visible. Leverage this to increase understanding, unwind handoffs and move towards feature teams.
41 Teams
7 Iterations
“Conway’s Board”
http://www.scaledagileframework.com/release-planning
Michael Hrenko, Blue Shield of CA
Nordstrom
“stopped optimizing for cost, started optimizing for speed”
“in 2015, 20% lead time reduction target across the board for customer facing properties”
Goal: make cycle time visible
Created internal Kata coaches and trainers to help internal teams
Experimenting with microservices for e-commerce
Cosmetic Business Office lead time: 7d to “nearly real time”
Source: Rob England (@theitskeptic)
TicketMaster
73 dev teams, 100% push their own code
3 days DevOps training = access to Prod
Metal-to-money deployment, no hands
Ego is a forcefield against learning
Blocked is an unacceptable state
Breaking bread together
Breathe customer air
4 in the box: mgr, ops, UX, process
Dev teams on call
Metrics: outcomes over outputs
98% reduction in MTTR
Source: Rob England (@theitskeptic)
USAA military insurance
If you have to rely on heroics your process is broken
It is requirements and testing that take the time
Elevator pitch: aspirational, same page
Have an internal brand
Daily regression test runs overnight
Leading indicators on a dashboard
Release 28 days -> 7 days with 40 years of legacy
Source: Rob England (@theitskeptic)
Sherwin-Williams
Never mind the technology: you need a salesman and a politician
Using SAFe Maturity model
Code, Environments, Data, Tests, Process
Use value mapping to find the pain points
46,000 code deploys a year
Provision an Oracle server in 15 minutes
Source: Rob England (@theitskeptic)
ING
TiTo (today in, today out): go home with a clean slate
Agile can learn from ITIL and ITIL can learn from Agile
Eliminate duplicate admin, make ITIL as lean as possible
Reserve 30% of sprint capacity for incidents
Problem management stories as backlog
Minimise tech debt: ThisSprintInNextSprintOut
Daily CAB
Permission to change from other team members
500 app teams doing DevOps
Source: Rob England (@theitskeptic)
Capital One
It’s never going to be perfect; it’s only going to get better.
All new software must justify why not open source.
Dev, QA and some prod on public cloud
Started with automated builds for one team
Developers are customers of the toolmakers
All code peer reviewed before merge to trunk
Building a server: 60 days, $25k -> on demand
Internal DevOps conference: 1200 attendees
Source: Rob England (@theitskeptic)
HP
DevOps is a parallel mode for us.
The war is over: the source control tool is git.
Trust but verify.
Minimum viable process.
CIO in the room
ChatOps integration
Lightweight peer reviews
Collaboration without playing with org charts
Vertical integration: “the future is not evenly distributed”
One change = one deploy
Source: Rob England (@theitskeptic)
IBM
JAT tool built on ANT to test CICS
RD&T tool on Intel/Linux
Mainframe test automation
Refactoring to callable services
Recompile optimisation
VISANET has been up for 19 years
Dev time reduced by 90%
Testing from weeks to hours
Source: Rob England (@theitskeptic)
2013: 15k Devs, 4k projects
The biggest obstacle is how we see the world
One version of code
Open repository
75 million test cases a day
5,500 code commits a day
CSC
If you ask people to change they don’t go straight to awesome, first they get worse.
You have to practice.
Measure baseline
Visualise system of work
Identify waste
Change the bureaucracy
Measure improvement
Deployments 12 hours -> 12 minutes
Source: Rob England (@theitskeptic)