Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

65
IS YOUR TEAM INSTRUMENT RATED? (OR DEPLOYING 125,000 TIMES A DAY) J. Paul Reed Principal Consultant

description

J. Paul Reed's DevOpsDays Silicon Valley 2013 presentation "Is Your Team Instrument Rated?" The presentation discusses the operational model similarities between the National Airspace System and a well-run software development shop that employs DevOps methodologies.

Transcript of Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

Page 1: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

ISYOURTEAM INSTRUMENTRATED?(OR DEPLOYING 125,000 TIMES A DAY)

J. Paul ReedPrincipal Consultant

Page 2: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

J. PAUL REED

• “Sober Build Engineer”

•@SoberBuildEng

• Fifteen years as a build/release engineer

Page 4: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

IN PREPARATION FOR OUR FLIGHT...

Page 5: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

“CULTURE?”

Set of shared mental assumptions that guide interpretation and action in organizations by defining appropriate

behavior for various situations.– Ravisi & Schultz, via Wikipedia

(via Damon’s talk)

Page 6: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

“CULTURE” FOR TODAY

Incentives

+

Human Factors

Page 7: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

“CULTURE” FOR TODAY

Incentives

+

Human Factors

(organizational, behavioral, and economic)

(methods for facilitating and fostering those incentives)

Page 8: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

WHY AVIATION?

CraftProvided unique requirements, individuals perfecting their own

methods & techniques

TradeGroups of “craftspeople” sharing

domain knowledge

ScienceProcesses consistently repeatable

by others, under different environments/conditions

IndustryReduce/combine processes to optimize for specific business requirements or outcomes

“Progress”

Page 9: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

WHY AVIATION?

Page 10: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

WHY AVIATION?

But... when we’re talking incident response,the house is already on fire

Page 11: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

WHY AVIATION?

Dev Ops

Page 12: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

WHY AVIATION?

Dev Ops

Page 13: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

WHY AVIATION?

Scale much?

Page 14: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

VISUAL FLIGHT RULES

Page 15: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

VISUAL FLIGHT RULES

Page 16: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

VISUAL FLIGHT RULES

Page 17: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

INSTRUMENT FLIGHT RULES

Page 18: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

INSTRUMENT FLIGHT RULES

Page 19: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

INSTRUMENT FLIGHT RULES

Page 21: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Standardization

• Communication

• Expectations

• Remediation

WHAT IT IS

Page 22: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

STANDARDIZATION

A set of operational primitives based on your organizational and business requirements.

Page 23: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

STANDARDIZATION

Leveraged to define youroperational procedures.

Page 24: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

STANDARDIZATION

Leveraged to defineoperational procedures.

Leveraged to define youroperational procedures.

Page 25: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Standardization

• Communication

• Expectations

• Remediation

WHAT IT IS

Page 26: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

COMMUNICATION“We're cleared to New York's JFK Airport via the SAN FRANCISCO EIGHT, radar vectors to Linden, direct JSICA, direct Wilson Creek, Jet 80, Kansas City, Jet 24, Saint Louis, direct Brickyard, direct Rosewood, Jet 29, Jamestown, Jet 70, Wilkes Barre, to the LENDY FIVE arrival

into JFK; climb and maintain fifteen, one-five-thousand; expect three-five-zero in ten; squawk six-three-seven-seven.”

– Redwood Flight 34’s Inaugural Clearance

Page 27: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

COMMUNICATIONSFO SFO8

LIN JSICA ILC J80 MCI J24 STL VHP ROD

J29 JMS J70 LVZ LENDY5

– Redwood Flight 34’s Inaugural Clearance

Page 28: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

COMMUNICATIONSFO SFO8

LIN JSICA ILC J80 MCI J24 STL VHP ROD

J29 JMS J70 LVZ LENDY5

– Redwood Flight 34’s Inaugural Clearance

Page 30: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Hesitance to use appropriate terms to communicate the situation

COMMUNICATION

Page 31: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Hesitance to use appropriate terms to communicate the situation

• A transcontinental 707 arrives to bad weather & low on fuel; pilots do not use the single phrase necessary to indicate their situation, which would’ve activated emergency services (Avianca Flight 52)

COMMUNICATION

Page 32: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Hesitance to use appropriate terms to communicate the situation

• A transcontinental 707 arrives to bad weather & low on fuel; pilots do not use the single phrase necessary to indicate their situation, which would’ve activated emergency services (Avianca Flight 52)

•Misuse of defined terminology

COMMUNICATION

Page 33: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Hesitance to use appropriate terms to communicate the situation

• A transcontinental 707 arrives to bad weather & low on fuel; pilots do not use the single phrase necessary to indicate their situation, which would’ve activated emergency services (Avianca Flight 52)

•Misuse of defined terminology

• In 1995, a controller clears a 757 “directly” to the airport, setting off an accident chain (American Airlines Flight 965)

COMMUNICATION

Page 34: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Standardization

• Communication

• Expectations

• Remediation

WHAT IT IS

Page 35: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

EXPECTATIONS

Once standards are established and requirements/intentions communicated, expectations and responsibilities can be derived.

Page 36: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Standardization

• Communication

• Expectations

• Remediation

WHAT IT IS

Page 37: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

WHAT IT IS

With expectations and responsibilities clarified, remediation processes can be integrated into processes & automation, not tacked on or “invented on the fly.”

Page 38: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Static

• Blind reliance on automation, tooling, or process

• “Fun-Verboten”

WHAT IT IS NOT

Page 39: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

NOT STATIC

Page 41: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Static

• Blind reliance on automation, tooling, or process

• “Fun-Verboten”

WHAT IT IS NOT

Page 42: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

NOT AUTOMATION?!

Page 43: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

NOT AUTOMATION?!

Page 44: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•Misreading/looking at the wrong metrics

NOT AUTOMATION?!

Page 45: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•Misreading/looking at the wrong metrics

• A 737 suffers an engine “disturbance”; after not looking at all the appropriate instruments, pilots shut down the good engine; the remaining (bad) engine eventually fails fully (British Midlands Flight 92)

NOT AUTOMATION?!

Page 46: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•Misreading/looking at the wrong metrics

• A 737 suffers an engine “disturbance”; after not looking at all the appropriate instruments, pilots shut down the good engine; the remaining (bad) engine eventually fails fully (British Midlands Flight 92)

• Partial automation failure and resulting confusion

NOT AUTOMATION?!

Page 47: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•Misreading/looking at the wrong metrics

• A 737 suffers an engine “disturbance”; after not looking at all the appropriate instruments, pilots shut down the good engine; the remaining (bad) engine eventually fails fully (British Midlands Flight 92)

• Partial automation failure and resulting confusion

• After a series of instrument failures in the highly-automated A-330, the junior pilot pulls the plane into a prolonged stall (Air France 447)

NOT AUTOMATION?!

Page 48: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

NOT AUTOMATION?!

Page 49: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Static

• Blind reliance on automation, tooling, or process

• “Fun-Verboten”

WHAT IT IS NOT

Page 50: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

NOT “FUN VERBOTEN”

Page 51: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

NOT “FUN VERBOTEN”

Page 52: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

• Define organizational vocabulary & process primitives

• Formalize roles, responsibilities, and priorities

• Understand (current) limitations

• Investigate outcomes

GETTING “INSTRUMENT RATED”

Page 53: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

DEFINE YOUR APPROACHES

•Define what you do today, focusing on the “operational requirements”

•Derive (or define) primitives

•Define your operational dictionary

•Make sure they’re owned!

Page 54: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•Define organizational vocabulary & process primitives

• Formalize roles, responsibilities, and priorities

• Understand (current) limitations

• Investigate outcomes

GETTING RATED

Page 55: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

FORMALIZE “2R+P”

• Be able to answer “Who is responsible for that?”

•Drill/train or delegate

•Determine “priority classes”

Page 56: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•Define organizational vocabulary & process primitives

• Formalize roles, responsibilities, and priorities

• Understand (current) limitations

• Investigate outcomes

GETTING RATED

Page 57: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

LINE UP AND WAIT

For organizational change to even be a possibility,the current limitations need to be internalized.

Page 58: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

KNOW WHEN TO HOLD ‘EM

Page 59: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•Define organizational vocabulary & process primitives

• Formalize roles & responsibilities

• Understand (current) limitations

• Investigate outcomes

GETTING RATED

Page 61: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•NASA Aviation Safety Reporting System

AFTER THE “OOPS”

Page 62: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•NASA Aviation Safety Reporting System

• Separation of investigation roles

•National Transportation Safety Board

AFTER THE “OOPS”

Page 63: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

•NASA Aviation Safety Reporting System

• Separation of investigation roles

•National Transportation Safety Board

• “No Blame” postmortems

• (Though not for the reason you might think!)

AFTER THE “OOPS”

Page 64: Is Your Team Instrument Rated? (Or Deploying 125,000 Times a Day)

OPERATIONAL MODELS

Incentives

+

Human Factors