Agile and Lean Support and Maintenance of IT Services
Jaroslav Procházka, Ph.D.
Tieto / University of Ostrava
Extended version of ISD 2010 presentation
www.differ.cz
Your feedback is invaluable! Please tell me why you were looking for the topic of Agile and Lean support and maintenance. I have an e-book on this topic in progress and would like to provide the best value ;)
Take a short survey: What exactly are you looking for here?
Terms definition
• Maintenance (IEEE 610.12 Glossary of Software Engineering Terminology):
  - “Software maintenance is the process of modifying a software system or component after delivery to correct faults, improve performance or other attributes, or adapt to a changed environment.”
  - Types: corrective, perfective, adaptive
• IT Service (ITIL v3):
  - “A service is a means of delivering value to customers by facilitating outcomes customers want to achieve without the ownership of specific costs and risks.”
• Support (ITIL v3):
  - Incident, Event and Problem management processes and the Service Desk function
Let’s start with a story of one team …
XXL team …
• Mark, Petr, Dan and Rachel work as the L2 support and maintenance team for a legacy batch processing system that generates customer invoices
• They struggle with incidents caused by the poor architecture, long processing times and outdated technologies of the legacy system
• Manual checks have to be done (physical files are copied into server folders to avoid batch failures)
• Insufficient hardware requires ongoing investment
• A person on call is needed in 8/5 mode without shifts (an unpopular rotating responsibility)
Current state (measurement perspective)

Category of measure                | Current value
Mean processing runtime (median)   | 04:49:21
Cumulated monthly runtime          | 154 h
Runtime per 1000 invoices          | 00:06:12
Invoices processed per second      | 2.69
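The last two rows of the table describe the same quantity from two angles, which can be checked with a few lines of Python. This is only a minimal sketch; the helper names `parse_hms` and `invoices_per_second` are illustrative and not part of the original material.

```python
def parse_hms(text: str) -> int:
    """Convert an 'HH:MM:SS' duration into seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def invoices_per_second(runtime_per_1000: str) -> float:
    """Throughput implied by the runtime needed for 1000 invoices."""
    return 1000 / parse_hms(runtime_per_1000)


# Values taken from the "Current value" column.
print(parse_hms("04:49:21"))                     # 17361 (seconds)
print(f"{invoices_per_second('00:06:12'):.2f}")  # 2.69
```

Keeping durations in seconds makes later comparisons (for example against a pilot solution) a matter of simple division.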
It causes many issues
• Millions of invoices per week
  - The total invoice count is still increasing
• The legacy technology, architecture and batch processing model cause:
  - frequent rework when processing fails
  - higher maintenance costs (because of rework)
  - decreasing team motivation (no creative work, manual checks in an error-prone environment, rework)
Current status of mainstream support and maintenance approaches (generalization based on our research, experience and observations)
Current state in IT Service Support and Maintenance
• Support and maintenance approaches nowadays exist in the form of rigorous process descriptions with required comprehensive (paper) documentation, e.g.
  - IEEE-STD-1219
  - ISO 12207 Maintenance sub-process
  - Mantema
  - ITIL v2 and v3
  - ISO 20000
Some numbers …
• Support and maintenance is the longest phase of the IT service lifecycle
  - Intended period of use: about 5-10 years (application software)
  - Real period (mainly telco, banking, government sector): 15+ years
• It consumes a large part of overall lifecycle costs
    - Bennett, Rajlich (2000) Software Maintenance and Evolution: A Roadmap
• A poor support and maintenance approach can increase the cost and decrease the quality of the operated and maintained service
  - Currently more than 50% of the IT budget is spent on maintenance:
    - Boehm B (1981) Software Engineering Economics
    - Hopkins R, Jenkins K (2008) Eating the IT Elephant
[Figure: IT budget by year, with the labels: administration, planning, etc.; WAN, voice, mainframe, help desk, end user support; operations and maintenance; new development; significant changes and improvements]
Source: Gartner, IT Spending and Staffing Report, 2007, 2008 and 2009
Statements and discussion I.
1. Process orientation of support and maintenance approaches (prescribed phases, activities, documentation, measuring the process over measuring value to the customer) omits the human aspect by assuming the same result with different people. Consequences:
   1. detailed procedures that reduce errors but also creativity, proactive behavior and learning (affects people's motivation)
   2. comprehensive (paper) documentation as the main form of collaboration and knowledge sharing, which makes maintenance more expensive; this approach ignores the different contexts, languages and personality types of the people involved in communication
   3. inability to change quickly in response to market changes
   4. low motivation of maintenance teams and high staff turnover
Statements and discussion II.
2. Measuring process performance (KPIs) rather than business value delivered to the customer/end users
  - measuring productivity with no respect to the nature of creative work -> lacking focus on solving root causes
  - utilization as the key measure hinders learning, innovative ideas and proactive behavior (lack of slack)
Business value: money saved or earned by delivering a new feature (new revenue, revenue increase or operational efficiency; Total Cost of Ownership; decreased maintenance costs) -> Bills M (2004) ROI: Bad Practice, Poor Results
Statements and discussion III.
3. Practical implementation guidance and answers to typical practical questions are often lacking
  - How to adjust the process/framework under our constraints?
  - Shall I use every activity and artifact?
4. The connection to the software (IS/IT service) development process is missing or, where it is mentioned, does not reflect best practices and the latest evolution in the SWD domain
  - E.g. ITIL recommending a waterfall approach
Statement evidence I.
• Traditional maintenance approaches are quite expensive
  - Currently more than 50% (even up to 95%) of the IT budget is spent on maintenance:
    - Boehm B (1981) Software Engineering Economics
    - Hopkins R, Jenkins K (2008) Eating the IT Elephant
    - Our and community experience: customer IT budget details
  - Documentation overhead as part of traditional approaches
    - IEEE-STD-1219 recommends 41 different IEEE standards to document the software maintenance lifecycle!
    - de Souza et al. (2005, 2007)
  - Low contribution to quality (documentation, process)
    - Prechelt, Unger-Lamprecht, Philippsen, Tichy (2002) Two controlled experiments…
    - Svensson & Host (2005) Introducing agile process in SW maintenance…
  - Error corrections consume around 21% of maintenance effort
    - Bennett, Rajlich (2000) Software Maintenance and Evolution: A Roadmap
Statement evidence II.
• Poor results of traditional approaches in various types of projects
    - Standish Chaos research 1999, 2001, 2004, 2006, 2009 (only 28-35% success rate)
• Inability to change software quickly, leading to lost business opportunities
    - Bennett, Rajlich (2000) Software Maintenance and Evolution: A Roadmap
    - Forrester (2010) Application Outsourcing Clients Are Satisfied, But Want More
• Low customer satisfaction (when considering innovativeness and connection to business goals)
    - Forrester (2010) Application Outsourcing Clients Are Satisfied, But Want More
    - Comments in our internal customer satisfaction surveys
• Low motivation and productivity of maintenance team members
    - Boehm (1981) Software Engineering Economics
    - Our internal team satisfaction survey (max results 2.8 out of 5)
Statement evidence III.
• Low control of the specialist over final product/service quality: only 5-20% of defects come from the specialist's sphere of control, the rest lies in the system or management sphere of control
    - Deming (1986) Out of the Crisis
    - Liker (2003) The Toyota Way
    - Juran, De Feo (2010) Juran's Quality Handbook, 6th edition
• Low motivation and productivity of maintenance team members
    - Boehm (1981) Software Engineering Economics
    - Our internal team satisfaction survey (max results 2.8 out of 5)
Good stuff in support and maintenance approaches
• There are many respected and widely used approaches
  - But this abundance also causes usability and relevance issues
• Some approaches are widely implemented
  - e.g. Materna research reports that about 60% of enterprises in Central Europe and more than 70% in Scandinavia follow ITIL®
• Visible attempts to connect IT and business, and also SWD with support and maintenance
  - See ITSM/ITIL, CobiT, Enterprise Unified Process
Evolution …
Milestones in SWD approaches
• From chaos or ad hoc approaches in the 1950s and 1960s
• To detailed rigorous processes in the 1980s
• To iterative & incremental (Agile and Lean) in the late 1990s
Ad hoc, chaotic -> Rigorous, process oriented -> Agile (iterative & incremental) and Lean
Agile widely accepted in SWD
• Agile has been widely accepted and used as a Software Development (SWD) approach for more than 10 years, even in complex and traditional environments; see adoption surveys
  - Ambler 2005-2009
  - VersionOne 2009, 2010
  - Forrester 2008, 2009
Source: VersionOne 2010
Source: Ambler 2008
And it works! Agile and Lean success evidence
• Agile has a better success rate than traditional approaches
    - Standish Chaos research 1999, 2001, 2004, 2006, 2009
    - Adoption surveys: Ambler 2007, 2009, VersionOne 2009
• Agile maintenance empirical studies, experiment evaluations
  - Simplified source code by over 40% and improved software quality (defect reduction) by 67%
    - Poole, Huisman (2001) Using XP in a maintenance environment
    - Poole, Murphy, Huisman, Higgins (2001) Extreme maintenance
    - Alshayeb & Li (2005) An empirical study of system design instability metric…
    - Dinh-Trong, Bieman (2005) A replication case study of Open Source development…
  - Reduced traditional maintenance stuff by 40%
    - Poole, Huisman (2001), Poole, Murphy, Huisman, Higgins (2001)
• Toyota (lean): people-centric approach to car manufacturing and software development
    - Liker (2003), Ohno (1988), Poppendieck (2006)
Agile Success Evidence
Source: VersionOne 2010
Let’s apply the lessons from SWD to support and maintenance
[Figure: Support & Maintenance status and Software Development status positioned on the scale Ad hoc -> Rigorous -> Agile/Lean]
Short Agile Introduction
• Agile manifesto
  1. communication and cooperation among all people (business and IT; within IT teams);
  2. focus on continuous delivery of valuable software;
  3. working software as the primary measure of progress;
  4. change tolerance understood as an advantage for the customer
• Agile approach:
  - Applies those common principles and values
Applicability of Agile
Agile Maintenance: State-of-the-art
• Maintenance in-built in Agile
  - if Agile development is followed: ongoing development and maintenance -> no need for separate maintenance processes, teams -> better architecture and business knowledge -> quicker ability to change, cheaper maintenance
• Two levels of application and research
  - Level 1: Methods (mostly XP, Scrum, Lean/Kanban)
    - XP, Scrum: Poole & Huisman (2001), Svensson & Host (2005), Alshayeb & Li (2005)
    - Open Source Agile methods: Berglund, Priestley (2001)
  - Level 2: Practices
    - Rudzki, Hammouda, Mikkola (2009)
    - de Souza et al. (2007)
  - Our approach: customizable framework, focus also on support, goes beyond software only
• No papers found on the application of Agile in support (Service Desk, Incident Management, Problem Management)
Let’s get back to our story …
• The team sits together and thinks during a regular retrospective
  - How to get more sophisticated development tasks, not just simple ones?
  - How to improve the current situation?
  - What should we do differently?
  - What can we do proactively, without a customer order?
Investments and proactive approach
Actions the team performed
• Measured existing and predicted trends
• Proposed a rewrite in Java
• Compared both solutions (technically: performance, reliability, HW fit, …)
• Presented to the customer
  - Failed!!!
  - The customer did not understand the value or the reason to invest in a rewrite
Investments and proactive approach (round 2)
Actions performed in the next round
• Better monitoring -> upcoming issues predicted based on real, more exact data
• Proof of Concept in Java (100 man-hours investment)
  - Real-time processing
  - No new HW needed, lower HW requirements for this technology
  - More error resistant
• Measured the current solution and the Java pilot
  - Comparison prepared
• Presented to the customer a second time, in financial terms
  - The customer finally got the point, but put the solution in the backlog
  - Accepted soon afterwards without any push, when the first predicted issue came true (in approximately 6 weeks)
Summary
• Proposal accepted during the deep financial crisis in 2009
• Benefits for the customer:
  - Customer's core business process not affected
  - Maintenance costs lowered
  - Fewer incidents within a few weeks
• Benefits for the team and vendor
  - 160,000 EUR new development project (higher vendor net sales)
  - Higher team motivation: they prepared the proposal and implemented the solution (creative work), and they maintain their own work
  - Incident decrease: less firefighting, no on-call disturbances (outside office hours)
Results
[Figure: development project cost (axis 0 to 200,000) and team motivation (axis 0 to 5) by month, March 2009 to September 2009. Service: telco]

Measure                    | Mar 2009 | Jun 2009 | Sep 2009
Development projects (EUR) | 8,300    | 7,500    | 160,000
People motivation          | 2.7      | 2.7      | 4.1
Category of measure                | Current value | Proof of Concept value
Mean processing runtime (median)   | 04:49:21      | 01:41:16
Cumulated monthly runtime          | 154 h         | 54 h
Runtime per 1000 invoices          | 00:06:12      | 00:02:10
Invoices processed per second      | 2.69          | 4.44
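To present such a comparison "in financial terms" or as a headline figure, the runtime rows can be reduced to a relative saving and a speed-up factor. A minimal sketch follows; the function names are illustrative and the inputs are the runtime values from the first three rows, converted to seconds or hours.

```python
def reduction_percent(current: float, new: float) -> float:
    """Relative reduction of a runtime, in percent of the current value."""
    return 100 * (current - new) / current


def speedup(current: float, new: float) -> float:
    """How many times faster the new solution is."""
    return current / new


# (current, proof of concept); runtimes in seconds, monthly runtime in hours.
rows = {
    "Processing runtime": (4 * 3600 + 49 * 60 + 21, 1 * 3600 + 41 * 60 + 16),
    "Cumulated monthly runtime": (154, 54),
    "Runtime per 1000 invoices": (6 * 60 + 12, 2 * 60 + 10),
}

for name, (current, poc) in rows.items():
    print(f"{name}: {reduction_percent(current, poc):.1f}% shorter, "
          f"{speedup(current, poc):.2f}x faster")
```

All three runtime rows come out at roughly a 65% reduction.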
Solution Summary: Control Framework of Agile and Lean Support and Maintenance
Framework goals and structure
• Holistic approach
  - Connected to Agile and non-agile development frameworks
    - OpenUP/RUP framework extended by a Production phase
    - Kroll, Kruchten (2003), Kroll, MacIsaac (2007)
  - Focus on maintenance and support
  - Full set of Agile and Lean practices as a baseline for the future process, not only Scrum or XP adjusted to fit maintenance and support needs
  - Respecting human nature
• Consists of:
  1. principles we defined based on existing research and empirical experience
  2. set of (common) practices helping to reach each principle in practice
  3. control framework with measuring principle
  4. implementation approach
Framework respecting human nature
• Cognitive, motivation and typology research as a basis
  1. Building block 1: Context is important
     - Leadership and communication: Tabaka (2006) Collaboration Explained; Adair (2009) Leadership and Motivation
  2. Building block 2: People are different
     - Typology: Myers-Briggs Type Indicator (MBTI); Keirsey, Bates (1984) Please Understand Me I. and II
  3. Building block 3: Internal motivation matters
     - Herzberg (2003) One More Time: How Do You Motivate Employees?
     - Hackman, Oldham (1980) Work Redesign
     - Melnik, Maurer (2006) Comparative Analysis of Job Satisfaction in Agile and Non-agile Software Development Teams
     - Melnik, Maurer (2007) Job Satisfaction and Motivation in a Large Agile Team
     - Whitworth, Biddle (2007) Motivation and Cohesion in Agile Teams
     - Nordström, Ridderstråle (2008) Funky Business Forever
Framework respecting human nature: key features
• Incremental adoption of Agile/Lean practices to avoid resistance to change
• Small Kaizen steps towards the ideal solution, to overcome the amygdala warning reaction
  - E.g. short pair work intervals, regular attempts to write source code documentation, helping others twice a day (leaving our comfort zone)
• Change covers the vendor's and the team's vision but also the personal one (the manager's role is to ensure this)
• Simple mechanism for an immediate stop and root cause identification to fix quality issues
  - E.g. pair work or 5 whys
• Automation as key support for learning new practices
  - E.g. continuous integration, test driven development, pair work
• Learning and feedback mechanisms to achieve mastery
  - E.g. retrospective, rotation, slack
Framework structure
Principle 1: More discipline, less bureaucracy
• Benefit: ability to react; commitment to rules; team's contribution; motivated people; fun at work
• Pattern: rules defined in cooperation with the teams; no detailed activities, room for creativity; automated basic rules; regular retrospectives and rule changes; flat hierarchical structure
• Anti-pattern: make detailed activities mandatory; leave the rules unchanged for years; add new rules continuously; give the team all the responsibility but no decision power
• Symptoms: long-lasting approvals; enterprise business system (processes) not followed; a lot of regular reports; process-oriented metrics; faked reports; unproductive meetings; decreasing motivation of the technical team
• Practices: iterative approach, kanban, retrospectives, rotation, visualization, continuous integration
Principle 2: Internal and cross team cooperation and systematic view
• Benefit: fulfilled needs; informed users; meeting business goals; contribution to the business value creation; fun at work
• Pattern: share goals; map goals to IT; define metrics from the business goals; rotate people; regular demonstration
• Anti-pattern: dismiss the business side because they do not know programming; withhold information; establish different teams for development and maintenance of the services
• Symptoms: role-oriented way of working; knowledge and information transfer via documents; business goal unknown to the teams; sub-optimization; unknown service or delivery status; 90% done syndrome; IT understood as a cost
• Practices: visualization, business scenarios, pair working, rotation, on-job learning, retrospective, fight with ambiguity
Measuring principle example

Business objective: Increase the turnover per customer by 10% by the end of this year

Contributing operational IT objective: Improve the quality
  Operational objective measures:
    - Number of critical defects (trend!)
    - Ratio open/closed incidents
    - Trend of repeating incidents
    - Test coverage in %
  Set of relevant agile practices:
    - Test driven approach
    - Iterative approach
    - Continuous integration
    - Pair working

Contributing operational IT objective: Increase productivity
  Operational objective measures:
    - Team velocity (trend, not exact number!)
    - Mean time to repair (MTTR)
  Set of relevant agile practices:
    - Retrospective
    - Rotation
    - Pair working
    - On-job learning
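The measuring principle is essentially a small mapping from a business objective through operational objectives to measures and practices. One possible way to hold it as data is sketched below; the class and method names are assumptions made for illustration, while the objective, measure and practice names are those from the example.

```python
from dataclasses import dataclass, field


@dataclass
class OperationalObjective:
    name: str
    measures: list[str]
    practices: list[str]


@dataclass
class BusinessObjective:
    statement: str
    objectives: list[OperationalObjective] = field(default_factory=list)

    def candidate_practices(self) -> list[str]:
        """All practices linked to this business objective,
        de-duplicated and kept in first-seen order."""
        seen: dict[str, None] = {}
        for objective in self.objectives:
            for practice in objective.practices:
                seen.setdefault(practice, None)
        return list(seen)


turnover = BusinessObjective(
    "Increase the turnover per customer by 10% by the end of this year",
    [
        OperationalObjective(
            "Improve the quality",
            ["Number of critical defects (trend)", "Ratio open/closed incidents",
             "Trend of repeating incidents", "Test coverage in %"],
            ["Test driven approach", "Iterative approach",
             "Continuous integration", "Pair working"],
        ),
        OperationalObjective(
            "Increase productivity",
            ["Team velocity (trend)", "Mean time to repair (MTTR)"],
            ["Retrospective", "Rotation", "Pair working", "On-job learning"],
        ),
    ],
)

print(turnover.candidate_practices())
```

Keeping the mapping explicit makes it easy to see which practices serve more than one objective; here "Pair working" appears under both.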
Principle 3: Proactive behavior and learning
• Benefit: satisfied customer; business supported by appropriate IT services; new business opportunities, new business orders from the customer
• Pattern: understand the business; propose future evolution according to the business scenarios; analyze, simulate and propose solutions; prototype and measure proposals
• Anti-pattern: only solve incidents and minor problems when they occur; do not ask about business evolution; do not propose solutions because the customer will not pay for them
• Symptoms: unfamiliarity with the customer; firefighting; no time to do things right; vendor does not solve (or propose solutions to) the customer's daily problems
• Practices: iterative approach, retrospective, rotation, visualization, business scenarios, on-job learning, fight with ambiguity
Principle 4: Risk driven approach
• Benefit: mitigate risks; plan the service according to the existing constraints
• Pattern: fact identification; risk identification with a proposed action for each risk, including assignee and deadline (actions performed as real work: integration, development, prototyping) instead of a detailed theoretical analysis or monitoring only
• Anti-pattern: ignore surroundings and changes; solve problems and risks only when they appear; conduct detailed analysis before real implementation, integration, simulation, tests; only monitor whether a risk appears
• Symptoms: reactively driven service; surprises during operations; facts interpreted as risks; no time to do things right
• Practices: iterative approach, retrospective, business scenarios
List of the practices to support implementation of principles
• Iterative approach/kanban
• Pair working
• Rotation between development and maintenance/support teams
• Test driven approach and unit testing
• Refactoring to incrementally clean and improve the code and architecture
• Daily meetings to synchronize in the team, share solutions, share service status and identify problems early
• On-job learning to gain knowledge and experience quickly
• Business scenario simulation to adapt IT services to evolving business and be ready for cost efficient future solutions
• Retrospective for learning
• Continuous integration to get quick feedback and improve the quality of the code/service
• Defensive programming to avoid common defects and have easier maintenance
• Visualization to uncover problems or trends, e.g. burn down chart, kanban, value stream analysis
• Fight with ambiguity (memory heuristic, keyword technique, fuzzy sets and fuzzy logic) to deal with different languages of business and IT people
Implementation approach
1. Analysis: kaizen workshop, Theory of Constraints Current Reality Tree (CRT), questioning
2. Selecting practices based on measuring principle
3. Roadmap definition based on priorities and risks
4. Hands-on daily support
5. Regular objectives check
Example results

Service X – forest
  Short background: Data warehousing and ETL processing (200 workflows) supporting the billing business process; 4 people in the service, 5 maintaining the Informatica platform (mixed teams in the Czech Republic and Finland)
  Perceived problems (by customer and teams):
    (1) Unknown SLA targets
    (2) Recurring incidents
    (3) Critical incidents caused by platform and database
  Achievements after applying our approach:
    (1) Identified dependencies and SLA times thanks to early piloting in the transition period
    (2) 75% decrease of incidents in 5 months, changed nature of incidents (to low impact and low priority ones)
    (3) Doubled end user satisfaction

Service Y – telecommunication (our story)
  Short background: L2 and L3 support and maintenance team, also developing small change requests; invoice processing in batches (XML and various scripts); 4 people in the Czech Republic, 2 people in Sweden; millions of invoices processed weekly
  Perceived problems (by customer and teams):
    (1) Invoices processed in batches, a failure required a restart of the whole batch
    (2) Processing time increasing geometrically, causing problems with invoicing
    (3) Very low motivation of the team
  Achievements after applying our approach:
    (1) New 160 000 EUR business for our company (development projects for rewriting to real time processing)
    (2) Customer's daily critical business problem solved
    (3) Increased team motivation (the team now also has development projects)
[Figure: incident count by month (axis 0 to 25), October 2009 to February 2010. Service: forest]

Measure        | Oct 2009 | Nov 2009 | Dec 2009 | Jan 2010 | Feb 2010
Incident count | 20       | 21       | 14       | 7        | 5
FTE            | 5        | 5        | 5        | 4        | 2.5
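The reported 75% decrease of incidents follows directly from the monthly counts for the forest service. A short sketch of the arithmetic, including incidents per FTE as an additional, illustrative view that is not part of the original slides:

```python
months = ["Oct 2009", "Nov 2009", "Dec 2009", "Jan 2010", "Feb 2010"]
incident_count = [20, 21, 14, 7, 5]
fte = [5, 5, 5, 4, 2.5]


def percent_decrease(first: float, last: float) -> float:
    """Decrease from first to last, in percent of first."""
    return 100 * (first - last) / first


print(percent_decrease(incident_count[0], incident_count[-1]))  # 75.0

# Incidents per FTE show that the drop is not just an effect of a smaller team.
for month, count, people in zip(months, incident_count, fte):
    print(f"{month}: {count / people:.2f} incidents per FTE")
```

Incidents per FTE fall from 4.00 in October 2009 to 2.00 in February 2010, so the load per person halves even though the team shrinks.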
[Figure: development project cost (axis 0 to 200,000) and team motivation (axis 0 to 5) by month, March 2009 to September 2009. Service: telco (our story)]

Measure                    | Mar 2009 | Jun 2009 | Sep 2009
Development projects (EUR) | 8,300    | 7,500    | 160,000
People motivation          | 2.7      | 2.7      | 4.1
Results – general survey
Conclusion I.
• Agile and Lean approach as a current cure for the experienced problems
  - Lack of proactive, innovative behavior
  - Customer and team motivation
  - Quality improvements
  - Responding to a changing environment
• Holistic approach
  - Connected to Agile and non-agile development frameworks
  - Focused on operations, maintenance and support
  - Full set of practices as building blocks for the future process
    - Chosen based on business goals
  - Respects human nature
    - Approach designed based on cognitive science, typology and motivation research
Conclusion II.
• Framework consists of:
  1. principles we defined based on existing research and empirical experience
  2. set of practices helping to reach each principle in practice
  3. control framework with measuring principle
  4. implementation approach
• Verified in different industry domains and with different customers
  - 13 services and projects in the past three years
  - 10 successful implementations
  - 3 unsuccessful implementations (milestones not achieved, agreed not to continue during a regular check)
  - Telecommunications, public, energy, forest industry
e-book to be published
• “Operate IT differently. Agile and Lean operations, support and maintenance of information systems and IT services” was published in Czech in 2011
• An e-book on this topic in English is currently in progress
• Help me include the topics you need by answering this simple form.
About author
• Freelance Agile and Lean mentor
• 13 years in IT (developer, support, project manager, consultant)
• Teaching and research experience at the University of Ostrava
• Presenting at international conferences
  - E.g. ISD, Lean IT Summit, ICGSE
• Blogs and free e-books at www.differ.cz (in Czech and English)
• LinkedIn contact