Escalating complexity: DevOps learnings from Air France 447


description

On June 1, 2009, Air France 447 crashed into the Atlantic Ocean, killing all 228 passengers and crew. The 15 minutes leading up to the impact were a terrifying demonstration of how thick the fog of war is in complex systems. Mainstream reports of the incident put the blame on the pilots, a common motif in incident reports that conveniently ignore a simple fact: people are just actors within a complex system, doing their best based on the information at hand. While the systems you build and operate likely don't control the fate of people's lives, they share many of the same complexity characteristics. Dev and Ops can learn a great deal from how the feedback loops between these aviation systems are designed and how these systems are operated. In this talk Lindsay will cover what happened on the flight, why the mainstream explanation doesn't add up, how design assumptions can impact people's ability to respond to rapidly developing situations, and how to improve your operational effectiveness when dealing with rapidly developing failure scenarios.

Transcript of Escalating complexity: DevOps learnings from Air France 447

Page 1: Escalating complexity: DevOps learnings from Air France 447

Escalating complexity:

DevOps learnings from Air France 447

Page 2: Escalating complexity: DevOps learnings from Air France 447

Lindsay Holmwood @auxesis

Page 3: Escalating complexity: DevOps learnings from Air France 447

Engineering Manager @ Bulletproof Networks

Page 5: Escalating complexity: DevOps learnings from Air France 447

cucumber-nagios

Visage

Flapjack

Page 7: Escalating complexity: DevOps learnings from Air France 447

•On 31 May 2009, Air France 447 departed from Rio de Janeiro-Galeão International Airport at 22:29 UTC. It was scheduled to arrive at Paris-Charles de Gaulle International Airport 11 hours later.

•3 hours and 45 minutes later, it crashed into the Atlantic Ocean, killing 216 passengers and 12 aircrew.

•There were no survivors.

Page 11: Escalating complexity: DevOps learnings from Air France 447

•This is AF447’s flight path:

Page 13: Escalating complexity: DevOps learnings from Air France 447

•AF447 charted a course through a band of equatorial thunderstorms

Page 15: Escalating complexity: DevOps learnings from Air France 447

•This is what happened in the last 15 minutes of the flight.

Page 17: Escalating complexity: DevOps learnings from Air France 447

02:03:44 (Bonin) The inter-tropical convergence... look, we're in it, between 'Salpu' and 'Tasil.' And then, look, we're right in it...

02:05:55 (Robert) Yes, let's call them in the back, to let them know...

02:05:59 (FA) Yes? Marilyn.

02:06:04 (Bonin) Yes, Marilyn, it's Pierre up front... Listen, in 2 minutes, we're going to be getting into an area where things are going to be moving around a little bit more than now. You'll want to take care.

02:06:13 (FA) Okay, we should sit down then?

02:06:15 (Bonin) Well, I think that's not a bad idea. Give your friends a heads-up.

02:06:18 (FA) Yeah, okay, I'll tell the others in the back. Thanks a lot.

02:06:19 (Bonin) I'll call you back as soon as we're out of it.

02:06:20 (FA) Okay.

Page 20: Escalating complexity: DevOps learnings from Air France 447

02:06:50 (Bonin) Let's go for the anti-icing system. It's better than nothing.

02:07:00 (Bonin) We seem to be at the end of the cloud layer, it might be okay.

02:08:03 (Robert) You can possibly pull it a little to the left.

02:08:05 (Bonin) You can possibly pull it a little to the left. We're agreed that we're in manual, yeah?

Page 22: Escalating complexity: DevOps learnings from Air France 447

•Captain Marc Dubois hands control to Robert + Bonin, and takes the second mandatory rest break.

Page 24: Escalating complexity: DevOps learnings from Air France 447

02:10:06 (Bonin) I have the controls.

02:10:07 (Robert) Okay.

02:10:07 (Robert) What's this?

02:10:15 (Bonin) There's no good... there's no good speed indication.

02:10:16 (Robert) We've lost the, the, the speeds, then?

02:10:27 (Robert) Pay attention to your speed. Pay attention to your speed.

02:10:28 (Bonin) Okay, okay, I'm descending.

02:10:30 (Robert) Stabilize.

02:10:31 (Bonin) Yeah.

02:10:31 (Robert) Descend... It says we're going up... It says we're going up, so descend.

02:10:36 (Robert) Descend!

Page 29: Escalating complexity: DevOps learnings from Air France 447

02:10:37 (Bonin) Here we go, we're descending.

02:10:38 (Robert) Gently!

02:10:41 (Bonin) We're... yeah, we're in a climb.

02:10:49 (Robert) Damn it, where is he?

02:10:55 (Robert) Damn it!

02:11:03 (Bonin) I'm in TOGA, huh?

02:11:06 (Robert) Damn it, is he coming or not?

02:11:21 (Robert) We still have the engines! What the hell is happening? I don't understand what's happening.

02:11:21 (Robert) But we’ve got the engines. What's happening? Do you understand what’s happening or not?

02:11:32 (Bonin) I don't have control of the plane any more now. I don't have control of the plane at all!

Page 34: Escalating complexity: DevOps learnings from Air France 447

02:11:37 (Robert) Controls to the left!

02:11:41 (Robert) ...what is that?

02:11:41 (Bonin) I have the impression (we have) the speed

02:11:42 (Captain) Er, what are you doing?

02:11:43 (Robert) What’s happening? I don’t know, I don’t know what’s happening

02:11:45 (Bonin) We’re losing control of the aeroplane there

02:11:46 (Robert) We lost all control of the aeroplane. We don’t understand anything. We’ve tried everything.

02:11:51 (Captain) So take that, take that

02:11:55 (Robert) Take that, take that

02:11:57 (Robert) Try to take that

Page 38: Escalating complexity: DevOps learnings from Air France 447

02:11:58 (Bonin) I have a problem - it’s that I don’t have vertical speed indication

02:12:01 (Captain) Alright

02:12:01 (Bonin) I have no more displays

02:12:02 (Robert) We have no valid displays

02:12:04 (Bonin) I have the impression that we have some crazy speed, no? What do you think?

02:12:06 (Robert) No.

02:12:07 (Bonin) No?

02:12:07 (Robert) No, above all don't extend the

02:12:07 (Bonin) Okay

02:12:09 (Robert) Don't extend

Page 41: Escalating complexity: DevOps learnings from Air France 447

02:12:11 (Bonin) So we’re still going down

02:12:12 (Robert) We’re pulling

02:12:14 (Robert) What do you think about it? What do you think? What do we need to do?

02:12:15 (Captain) There - I don’t know. There - it’s going down.

02:12:19 (Bonin) There you are.

02:12:20 (Bonin) That’s good we should be wings level, no it won’t

02:12:23 (Captain) The wings to flat horizon, the standby horizon

02:12:25 (Robert) The horizon!

02:12:26 (Bonin) Okay

02:12:26 (Robert) Speed?

Page 44: Escalating complexity: DevOps learnings from Air France 447

02:12:27 (Robert) You're climbing

02:12:28 (Robert) You're going down down down

02:12:28 (Captain) Going down

02:12:30 (Bonin) Am I going down now?

02:12:31 (Robert) Go down

02:12:32 (Captain) No you climb there

02:12:32 (Bonin) I'm climbing okay so we're going down

02:12:34 (Captain) You're climbing

02:12:39 (Bonin) Okay, we're in TOGA

02:12:41 (Bonin) What are we here?

02:12:41 (Bonin) On alti what do we have here?

Page 48: Escalating complexity: DevOps learnings from Air France 447

02:12:43 (Captain) It's impossible

02:12:45 (Bonin) In alti what do we have?

02:12:47 (Robert) What do you mean on altitude?

02:12:48 (Bonin) Yeah yeah yeah, I'm going down, no?

02:12:50 (Robert) You're going down, yes

02:12:52 (Captain) Hey you

02:12:53 (Captain) You're in

02:12:54 (Captain) Get the wings horizontal

02:12:56 (Robert) Get the wings horizontal

02:12:56 (Bonin) That's what I'm trying to do

02:12:57 (Captain) Get the wings horizontal!

Page 52: Escalating complexity: DevOps learnings from Air France 447

02:12:58 (Bonin) I'm at the limit... with the roll

02:13:00 (Captain) The rudder bar

02:13:05 (Captain) Wings horizontal.. go... gently, gently

02:13:11 (Captain) Hey er...

02:13:11 (Robert) We lost it all at the left

02:13:13 (Robert) I've got nothing there

02:13:15 (Captain) What do you have?

02:13:17 (Captain) No wait

02:13:18 (Bonin) We're there, we're there, we're passing level one hundred

02:13:19 (Robert) Wait, me, I have, I have the controls, eh?

02:13:25 (Bonin) What is... how come we're continuing to go down right now?

Page 56: Escalating complexity: DevOps learnings from Air France 447

02:13:28 (Robert) Try to find what you can do with your controls up there

02:13:30 (Robert) The primaries and so on

02:13:30 (Captain) It won't do anything

02:13:31 (Captain) It won't do anything

02:13:31 (Bonin) At level one hundred

02:13:36 (Bonin) Nine thousand feet

02:13:38 (Captain) Careful with the rudder bar there

02:13:39 (Robert) Climb, climb, climb, climb

02:13:40 (Bonin) But I've been at maxi nose-up for a while

02:13:42 (Captain) No, no, no... don't climb

02:13:43 (Robert) Descend, then... Give me the controls... Give me the controls!

Page 60: Escalating complexity: DevOps learnings from Air France 447

02:13:46 (Bonin) Go ahead, you have the controls. We are still in TOGA, eh?

02:13:53 (Captain) AP OFF

02:13:59 (Bonin) Gentlemen

02:14:05 (Captain) Watch out, you're pitching up there

02:14:05 (Robert) I'm pitching up

02:14:06 (Captain) You're pitching up

02:14:07 (Robert) I'm pitching up

02:14:07 (Bonin) Well, we need to, we are at four thousand feet...

Page 63: Escalating complexity: DevOps learnings from Air France 447

02:14:10 (Captain) You're pitching up

02:14:18 (Captain) Go on pull

02:14:19 (Bonin) Let's go, pull up, pull up, pull up

02:14:23 (Bonin) Damn it, we're going to crash... This can't be happening!

02:14:25 (Bonin) But what's happening?

02:14:26 (Captain) Ten degrees of pitch...

02:14:28 End of recording

Page 68: Escalating complexity: DevOps learnings from Air France 447

02:14:23 (Bonin) Damn it, we're going to crash... This can't be happening!

Page 72: Escalating complexity: DevOps learnings from Air France 447

•Final Air France 447 Report: Pilots misunderstood their situation

•Poorly-trained pilots to blame for Air France crash that killed 228

•Final Air France crash report says pilots failed to react swiftly

•Air France 447 downed as crew ignored alarms

•Air France 447 crash a result of crew ignoring alarms

Page 73: Escalating complexity: DevOps learnings from Air France 447

•The Atlantic

•Daily Mail

•CNN

•New Scientist

•Gizmodo

Page 74: Escalating complexity: DevOps learnings from Air France 447

Convenient narrative

Page 75: Escalating complexity: DevOps learnings from Air France 447

“root cause”

Page 76: Escalating complexity: DevOps learnings from Air France 447

“human error”

Page 77: Escalating complexity: DevOps learnings from Air France 447

Bad apples

Page 78: Escalating complexity: DevOps learnings from Air France 447

“if we weed out the bad apples, the system will equalise”

Page 79: Escalating complexity: DevOps learnings from Air France 447

What you call "root cause" is simply the place you stop looking any further

-- Sidney Dekker, Professor of Human Factors & Flight Safety, Lund University

Page 80: Escalating complexity: DevOps learnings from Air France 447

Dubois

total: 10,988 flying hours, of which 6,258 as Captain

hours on type: 1,747, all as Captain

in the previous six months: 346 hours, 18 landings, 15 take-offs

in the previous three months: 168 hours, 8 landings, 6 take-offs

in the previous 30 days: 57 hours, 3 landings, 2 take-offs

Page 81: Escalating complexity: DevOps learnings from Air France 447

Robert

total: 6,547 flying hours

hours on type: 4,479 flying hours

in the previous six months: 204 hours, 9 landings, 11 take-offs

in the previous three months: 99 hours, 6 landings, 5 take-offs

in the previous 30 days: 39 hours, 2 landings, 2 take-offs

Page 82: Escalating complexity: DevOps learnings from Air France 447

Bonin

total: 2,936 flying hours

hours on type: 807

in the previous six months: 368 hours, 16 landings, 18 take-offs

in the previous three months: 191 hours, 7 landings, 8 take-offs

in the previous 30 days: 61 hours, 1 landing, 2 take-offs

Page 84: Escalating complexity: DevOps learnings from Air France 447

Critical flaw: How would other pilots react under the same circumstances?

Page 85: Escalating complexity: DevOps learnings from Air France 447

What appears in the crew behavior is that most probably, a different crew should have done the same action. So, we cannot blame this crew. What we can say is that most probably this crew and most crews were not prepared to face such an event.

-- Jean-Paul Troadec, Bureau d'Enquêtes et d'Analyses pour la Sécurité de l'Aviation Civile

Page 87: Escalating complexity: DevOps learnings from Air France 447

Actors in a complex system

Page 88: Escalating complexity: DevOps learnings from Air France 447

Systems in a complex system

Page 89: Escalating complexity: DevOps learnings from Air France 447

Systems in a series of nested complex systems

Page 90: Escalating complexity: DevOps learnings from Air France 447

“root cause”

Page 91: Escalating complexity: DevOps learnings from Air France 447

Cartesian-Newtonian worldview

Page 93: Escalating complexity: DevOps learnings from Air France 447

hindsight != foresight

Page 94: Escalating complexity: DevOps learnings from Air France 447

[hindsight] converts a once vague, unlikely future into an immediate, certain past

-- Sidney Dekker, Professor of Human Factors & Flight Safety, Lund University

Page 95: Escalating complexity: DevOps learnings from Air France 447

We have all the facts

Page 107: Escalating complexity: DevOps learnings from Air France 447

Investigation took 3 years

Page 108: Escalating complexity: DevOps learnings from Air France 447

Event unfolded in 10 minutes

Page 109: Escalating complexity: DevOps learnings from Air France 447

Fog of War

Page 110: Escalating complexity: DevOps learnings from Air France 447

Limited facts at hand in a rapidly developing situation

Page 111: Escalating complexity: DevOps learnings from Air France 447

Local rationality

Page 112: Escalating complexity: DevOps learnings from Air France 447

“people make what they think are best decisions based on data at hand”

Page 113: Escalating complexity: DevOps learnings from Air France 447

Hindsight affords us global rationality

Page 115: Escalating complexity: DevOps learnings from Air France 447

What systems were in play?

Page 116: Escalating complexity: DevOps learnings from Air France 447

Modes of operation

Page 117: Escalating complexity: DevOps learnings from Air France 447

Flight control modes

Page 118: Escalating complexity: DevOps learnings from Air France 447

Normal law: ground, flight, flare modes

Alternate law: alternate law 1, alternate law 2
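
The two families of control law named on this slide can be written down as a small data structure. The sketch below is only an illustration in Python, not how the aircraft's flight computers are implemented: the `ControlLaw` enum mirrors the slide, while `ModeTracker`, its `reconfigure` method, and the example reason string are hypothetical names chosen here. It shows one way a system could make a change of operating mode explicit to its operators instead of switching silently.

```python
from dataclasses import dataclass, field
from enum import Enum


class ControlLaw(Enum):
    # Law names and sub-modes as listed on the slide.
    NORMAL = ("ground", "flight", "flare")
    ALTERNATE = ("alternate law 1", "alternate law 2")


@dataclass
class ModeTracker:
    """Hypothetical helper: records every law change together with a reason."""
    law: ControlLaw = ControlLaw.NORMAL
    history: list = field(default_factory=list)

    def reconfigure(self, new_law: ControlLaw, reason: str) -> str:
        # Build the operator-facing message before switching, so it names
        # both the law being left and the law being entered.
        message = f"{self.law.name} -> {new_law.name}: {reason}"
        self.history.append(message)
        self.law = new_law
        return message


tracker = ModeTracker()
# The reason text is an assumption made for illustration only.
print(tracker.reconfigure(ControlLaw.ALTERNATE, "airspeed sources disagree"))
```

Keeping the transition history next to the current mode means the question "which mode are we in, and why?" can be answered from one place.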

Page 119: Escalating complexity: DevOps learnings from Air France 447
Page 120: Escalating complexity: DevOps learnings from Air France 447
Page 121: Escalating complexity: DevOps learnings from Air France 447
Page 122: Escalating complexity: DevOps learnings from Air France 447
Page 123: Escalating complexity: DevOps learnings from Air France 447

Law reconfiguration

feedback

Page 124: Escalating complexity: DevOps learnings from Air France 447
Page 125: Escalating complexity: DevOps learnings from Air France 447

There are three major categories of message that can be transmitted:

• non-vocal (ATC) communication messages with an air traffic control centre

• operational communication messages (AOC) with the operator’s operations centre

• maintenance messages, exclusively from the aircraft to the maintenance centre

Page 126: Escalating complexity: DevOps learnings from Air France 447

Time of Reception Message

02:10 WRN/WN0906010210 221002006AUTO FLT AP OFF

02:10 WRN/WN0906010210 226201006AUTO FLT REAC W/S DET FAULT

02:10 WRN/WN0906010210 279100506F/CTL ALTN LAW

02:10 WRN/WN0906010210 228300206FLAG ON CAPT PFD SPD LIMIT

02:10 #0210/+2.98-30.59

02:10 WRN/WN0906010210 228301206FLAG ON F/O PFD SPD LIMIT

02:10 WRN/WN0906010210 223002506AUTO FLT A/THR OFF

02:10 WRN/WN0906010210 344300506NAV TCAS FAULT

02:11 WRN/WN0906010210 228300106FLAG ON CAPT PFD FD

02:11 WRN/WN0906010210 228301106FLAG ON F/O PFD FD

02:11 WRN/WN0906010210 272302006F/CTL RUD TRV LIM FAULT

02:11 WRN/WN0906010210 279045506MAINTENANCE STATUS EFCS 2

02:11 WRN/WN0906010210 279045006MAINTENANCE STATUS EFCS 1

02:11 FLR/FR0906010210 34111506EFCS2 1,EFCS1,AFS,,,,,PROBE-PITOT 1X2 / 2X3 / 1X3 (9DA),HARD

02:11 FLR/FR0906010210 27933406EFCS1 X2,EFCS2X,,,,,,FCPC2 (2CE2) /WRG:ADIRU1 BUS ADR1-2 TO FCPC2,HARD

02:12 WRN/WN0906010211 341200106FLAG ON CAPT PFD FPV

02:12 WRN/WN0906010211 341201106FLAG ON F/O PFD FPV

02:12 WRN/WN0906010212 341040006NAV ADR DISAGREE

02:13 FLR/FR0906010211 34220006ISIS 1,,,,,,,ISIS(22FN-10FC) SPEED OR MACH FUNCTION,HARD

02:13 FLR/FR0906010211 34123406IR2 1,EFCS1X,IR1,IR3,,,,ADIRU2 (1FP2),HARD

2:13:16 ~ 2:13:41 Possible "Loss of Signal" with satellite

02:13 WRN/WN0906010213 279002506F/CTL PRIM 1 FAULT

02:13 WRN/WN0906010213 279004006F/CTL SEC 1 FAULT

02:14 WRN/WN0906010214 341036006MAINTENANCE STATUS ADR 2

02:14 FLR/FR0906010213 22833406AFS 1,,,,,,,FMGEC1(1CA1),INTERMITTENT

02:14 WRN/WN0906010214 213100206ADVISORY CABIN VERTICAL SPEED
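
To get a feel for how quickly these messages arrived, the listing above can be grouped by reception minute. The following is a minimal sketch, not a real ACARS decoder: it assumes only the layout visible on the slide (a reception time, a `WRN/` or `FLR/` header, then free text), reads WRN as a warning and FLR as a fault report (an assumption), and skips any line that does not fit, such as the position report. The names `parse` and `per_minute` are hypothetical.

```python
import re
from collections import Counter

# A handful of lines copied verbatim from the listing above.
LINES = [
    "02:10 WRN/WN0906010210 221002006AUTO FLT AP OFF",
    "02:10 WRN/WN0906010210 279100506F/CTL ALTN LAW",
    "02:10 #0210/+2.98-30.59",
    "02:11 FLR/FR0906010210 34111506EFCS2 1,EFCS1,AFS,,,,,PROBE-PITOT 1X2 / 2X3 / 1X3 (9DA),HARD",
    "02:12 WRN/WN0906010212 341040006NAV ADR DISAGREE",
]

# Assumed layout: "HH:MM KIND/header free text". Not an official format definition.
PATTERN = re.compile(r"^(\d{2}:\d{2}) (WRN|FLR)/\S+ (.+)$")


def parse(line):
    """Return (reception_time, kind, text), or None if the line does not fit."""
    match = PATTERN.match(line)
    return match.groups() if match else None


def per_minute(lines):
    """Count messages per (reception minute, kind), skipping unparsed lines."""
    parsed = (parse(line) for line in lines)
    return Counter((p[0], p[1]) for p in parsed if p is not None)


counts = per_minute(LINES)
for (minute, kind), n in sorted(counts.items()):
    print(minute, kind, n)
```

Run against the full listing, the same grouping shows how many distinct messages land within each minute, which is the volume a human has to absorb.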

Page 127: Escalating complexity: DevOps learnings from Air France 447

02:10 WRN/WN0906010210 279100506F/CTL ALTN LAW

Page 128: Escalating complexity: DevOps learnings from Air France 447

02:10 WRN/WN0906010210 279100506F/CTL ALTN LAW

Page 129: Escalating complexity: DevOps learnings from Air France 447

small textual warning here

Page 130: Escalating complexity: DevOps learnings from Air France 447

Realisation

Page 131: Escalating complexity: DevOps learnings from Air France 447

02:13:40 (Bonin) But I've been at maxi nose-up for a while

02:13:42 (Captain) No, no, no... don't climb

02:13:43 (Robert) Descend, then... Give me the controls... Give me the controls!

...

02:14:28 End of recording

Page 132: Escalating complexity: DevOps learnings from Air France 447
Page 133: Escalating complexity: DevOps learnings from Air France 447

How does your HA provide feedback?

Page 134: Escalating complexity: DevOps learnings from Air France 447

Reconfiguration feedback?

Page 135: Escalating complexity: DevOps learnings from Air France 447

How do these modes behave differently?

Page 136: Escalating complexity: DevOps learnings from Air France 447

What about modes you haven’t seen?

Page 137: Escalating complexity: DevOps learnings from Air France 447
Page 138: Escalating complexity: DevOps learnings from Air France 447

Sensory feedback

Page 139: Escalating complexity: DevOps learnings from Air France 447

Obvious change in

color, size, font

Page 140: Escalating complexity: DevOps learnings from Air France 447

Know how colour is processed

by the brain

Page 141: Escalating complexity: DevOps learnings from Air France 447

Familiarise yourself with type

Page 142: Escalating complexity: DevOps learnings from Air France 447

Familiarise yourself with type

Page 143: Escalating complexity: DevOps learnings from Air France 447

Familiarise yourself with type

Page 144: Escalating complexity: DevOps learnings from Air France 447

Optimise for 3am you

Page 145: Escalating complexity: DevOps learnings from Air France 447
Page 146: Escalating complexity: DevOps learnings from Air France 447

Input control

Page 147: Escalating complexity: DevOps learnings from Air France 447

Co-pilot feedback

Page 148: Escalating complexity: DevOps learnings from Air France 447

Averaged input

Page 149: Escalating complexity: DevOps learnings from Air France 447

dual input feedback

Page 150: Escalating complexity: DevOps learnings from Air France 447

CRM

Page 151: Escalating complexity: DevOps learnings from Air France 447

Startling effect

Page 152: Escalating complexity: DevOps learnings from Air France 447
Page 153: Escalating complexity: DevOps learnings from Air France 447

Are your inputs averaged?

Page 154: Escalating complexity: DevOps learnings from Air France 447

How do your engineers troubleshoot

during incidents?

Page 155: Escalating complexity: DevOps learnings from Air France 447

Every man for himself?

Page 156: Escalating complexity: DevOps learnings from Air France 447

How do you co-ordinate change?

Page 157: Escalating complexity: DevOps learnings from Air France 447

Does someone have overview?

Page 158: Escalating complexity: DevOps learnings from Air France 447

How is that responsibility

assigned?

Page 159: Escalating complexity: DevOps learnings from Air France 447

How is information disseminated?

Page 160: Escalating complexity: DevOps learnings from Air France 447

How does the business know what is happening?

Page 161: Escalating complexity: DevOps learnings from Air France 447

Do you have a process?

Page 162: Escalating complexity: DevOps learnings from Air France 447

Do you practice this?

Page 163: Escalating complexity: DevOps learnings from Air France 447

What data do you rely on?

Page 164: Escalating complexity: DevOps learnings from Air France 447
Page 165: Escalating complexity: DevOps learnings from Air France 447

Pair

Page 166: Escalating complexity: DevOps learnings from Air France 447

Vocalise

Page 167: Escalating complexity: DevOps learnings from Air France 447

Minimise &

Compartmentalise

Page 168: Escalating complexity: DevOps learnings from Air France 447

Record

Page 169: Escalating complexity: DevOps learnings from Air France 447

Timeline

Page 170: Escalating complexity: DevOps learnings from Air France 447

maintained by co-ordinator
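
A timeline kept by the co-ordinator can be as simple as an append-only list of who did what, and when. The sketch below is one hypothetical way to do that; the `Timeline` class, its method names, and the sample entries are all illustrative assumptions, not a description of any particular tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Timeline:
    """Append-only incident log, maintained by a single co-ordinator."""
    entries: list = field(default_factory=list)

    def record(self, who, what, when=None):
        # Default to the current UTC time so entries from different
        # engineers are comparable regardless of their local timezone.
        when = when or datetime.now(timezone.utc)
        self.entries.append((when, who, what))

    def render(self):
        # Sort on the timestamp only, so late entries slot into place.
        ordered = sorted(self.entries, key=lambda entry: entry[0])
        return [f"{when:%H:%M:%S} ({who}) {what}" for when, who, what in ordered]


# Illustrative entries only; the names and actions are made up.
timeline = Timeline()
timeline.record("alice", "restarting the app tier",
                datetime(2024, 1, 1, 2, 13, 40, tzinfo=timezone.utc))
timeline.record("bob", "paged for database latency",
                datetime(2024, 1, 1, 2, 10, 5, tzinfo=timezone.utc))
print("\n".join(timeline.render()))
```

The rendered lines deliberately use a timestamp, speaker, statement layout, so the record reads the same way during the incident and in the review afterwards.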

Page 171: Escalating complexity: DevOps learnings from Air France 447
Page 172: Escalating complexity: DevOps learnings from Air France 447

HUD

Page 173: Escalating complexity: DevOps learnings from Air France 447

isolated sensors

Page 174: Escalating complexity: DevOps learnings from Air France 447
Page 175: Escalating complexity: DevOps learnings from Air France 447

different values

Page 176: Escalating complexity: DevOps learnings from Air France 447

discrepancies
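
When isolated sensors report different values, the disagreement itself is worth surfacing. The sketch below is a hedged illustration using a median vote: the function name `check_sensors`, the agent names, the readings, and the tolerance are all invented for the example.

```python
from statistics import median


def check_sensors(readings, tolerance):
    """Compare redundant readings of one quantity.

    Returns (consensus, outliers): the median reading, and the sorted
    names of sensors that deviate from it by more than `tolerance`.
    """
    consensus = median(readings.values())
    outliers = sorted(
        name for name, value in readings.items()
        if abs(value - consensus) > tolerance
    )
    return consensus, outliers


# Hypothetical readings of the same metric from three monitoring agents.
readings = {"agent_a": 12.0, "agent_b": 11.5, "agent_c": 48.0}
consensus, outliers = check_sensors(readings, tolerance=2.0)
print(consensus, outliers)
```

A median vote only helps while most sensors are right. If a majority fail in the same way at the same time, the consensus is wrong and the one correct sensor is reported as the outlier, so the discrepancy should be shown to the operator rather than silently resolved.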

Page 177: Escalating complexity: DevOps learnings from Air France 447

02:12:27 (Robert) You're climbing

02:12:28 (Robert) You're going down down down

02:12:28 (Captain) Going down

02:12:30 (Bonin) Am I going down now?

02:12:31 (Robert) Go down

02:12:32 (Captain) No you climb there

02:12:32 (Bonin) I'm climbing okay so we're going down

02:12:34 (Captain) You're climbing

02:12:39 (Bonin) Okay, we're in TOGA

02:12:41 (Bonin) What are we here?

02:12:41 (Bonin) On alti what do we have here?

Page 178: Escalating complexity: DevOps learnings from Air France 447

CRM

Page 179: Escalating complexity: DevOps learnings from Air France 447
Page 180: Escalating complexity: DevOps learnings from Air France 447

Contextual navigation

Page 181: Escalating complexity: DevOps learnings from Air France 447

different navigation

requirements

Page 182: Escalating complexity: DevOps learnings from Air France 447

dashboards

Page 183: Escalating complexity: DevOps learnings from Air France 447

deep dive on details

Page 184: Escalating complexity: DevOps learnings from Air France 447

test a theory

Page 185: Escalating complexity: DevOps learnings from Air France 447

scientific method: improvised

Page 186: Escalating complexity: DevOps learnings from Air France 447

linkable

Page 187: Escalating complexity: DevOps learnings from Air France 447

correlation

Page 188: Escalating complexity: DevOps learnings from Air France 447

human pattern recognition

Page 189: Escalating complexity: DevOps learnings from Air France 447

human pattern recognition

(provided there is enough adaptive capacity)

Page 190: Escalating complexity: DevOps learnings from Air France 447
Page 191: Escalating complexity: DevOps learnings from Air France 447

Stream of alerts

Page 192: Escalating complexity: DevOps learnings from Air France 447

70 stall warnings

Page 193: Escalating complexity: DevOps learnings from Air France 447

•Final Air France 447 Report: Pilots misunderstood their situation

•Poorly-trained pilots to blame for Air France crash that killed 228

•Final Air France crash report says pilots failed to react swiftly

•Air France 447 downed as crew ignored alarms

•Air France 447 crash a result of crew ignoring alarms

Page 194: Escalating complexity: DevOps learnings from Air France 447

“They should have reacted!”

Page 195: Escalating complexity: DevOps learnings from Air France 447

autopilot disconnect

Page 196: Escalating complexity: DevOps learnings from Air France 447

alternate law reconfiguration

Page 197: Escalating complexity: DevOps learnings from Air France 447

alert priority level?

Page 198: Escalating complexity: DevOps learnings from Air France 447

overwhelmed by feedback

Page 199: Escalating complexity: DevOps learnings from Air France 447

Alert Fatigue

Page 200: Escalating complexity: DevOps learnings from Air France 447

startling effect

Page 201: Escalating complexity: DevOps learnings from Air France 447

reduced adaptive capacity

Page 202: Escalating complexity: DevOps learnings from Air France 447
Page 203: Escalating complexity: DevOps learnings from Air France 447

dampening

Page 204: Escalating complexity: DevOps learnings from Air France 447
Page 205: Escalating complexity: DevOps learnings from Air France 447

Brute force:

manual silence

Page 206: Escalating complexity: DevOps learnings from Air France 447

limit # of engineers who

watch alerts & graphs

Page 207: Escalating complexity: DevOps learnings from Air France 447

smarter alert aggregation?
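
One simple form of aggregation is to collapse repeats of the same alert into a single summary with a count. The sketch below assumes alerts arrive as `(timestamp, key)` pairs sorted by time; the `Summary` and `aggregate` names, the alert keys, and the window are hypothetical, and this is not a description of how PagerDuty or Flapjack work.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    key: str
    first_seen: float
    last_seen: float
    count: int = 1


def aggregate(alerts, window):
    """Collapse repeated alerts into summaries.

    `alerts` is an iterable of (timestamp_seconds, key), sorted by time.
    A repeat that arrives within `window` seconds of the previous
    occurrence of the same key extends the open summary; otherwise a
    new summary is started.
    """
    open_by_key = {}
    summaries = []
    for ts, key in alerts:
        current = open_by_key.get(key)
        if current is not None and ts - current.last_seen <= window:
            current.last_seen = ts
            current.count += 1
        else:
            current = Summary(key, ts, ts)
            open_by_key[key] = current
            summaries.append(current)
    return summaries


# Illustrative stream: one noisy check, one unrelated alert.
alerts = [
    (0, "db-1 disk full"),
    (5, "db-1 disk full"),
    (8, "web-2 http 500"),
    (12, "db-1 disk full"),
    (100, "db-1 disk full"),
]
for summary in aggregate(alerts, window=30):
    print(summary)
```

Five raw alerts become three summaries here, and the count on each summary keeps the information that the condition repeated without demanding attention each time.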

Page 208: Escalating complexity: DevOps learnings from Air France 447

PagerDuty

Page 209: Escalating complexity: DevOps learnings from Air France 447

Flapjack

Page 210: Escalating complexity: DevOps learnings from Air France 447
Page 211: Escalating complexity: DevOps learnings from Air France 447

Systems thinking

Page 212: Escalating complexity: DevOps learnings from Air France 447

System capable of failure

Page 213: Escalating complexity: DevOps learnings from Air France 447

System capable of success

Page 214: Escalating complexity: DevOps learnings from Air France 447
Page 215: Escalating complexity: DevOps learnings from Air France 447

•System that enables communication

Page 216: Escalating complexity: DevOps learnings from Air France 447

•System that enables communication

•System that exposes secrets

Page 217: Escalating complexity: DevOps learnings from Air France 447

•System that enables communication

•System that exposes secrets

•System that robs us

Page 218: Escalating complexity: DevOps learnings from Air France 447

•System that enables communication

•System that exposes secrets

•System that robs us

•System that funds innovation

Page 219: Escalating complexity: DevOps learnings from Air France 447

•System that enables communication

•System that exposes secrets

•System that robs us

•System that funds innovation

•System that kills us

Page 220: Escalating complexity: DevOps learnings from Air France 447

•System that enables communication

•System that exposes secrets

•System that robs us

•System that funds innovation

•System that kills us

•System that allows us to fly across the world

Page 221: Escalating complexity: DevOps learnings from Air France 447

Failure is pervasive

Page 222: Escalating complexity: DevOps learnings from Air France 447

Failure is complex

Page 223: Escalating complexity: DevOps learnings from Air France 447

Failure is just another mode of operation

Page 224: Escalating complexity: DevOps learnings from Air France 447

Your system may not

control fate of

people’s lives

Page 225: Escalating complexity: DevOps learnings from Air France 447

But people may depend on it

Page 226: Escalating complexity: DevOps learnings from Air France 447
Page 227: Escalating complexity: DevOps learnings from Air France 447

Anthropocentrism

Page 228: Escalating complexity: DevOps learnings from Air France 447

Technocentrism

Page 229: Escalating complexity: DevOps learnings from Air France 447

The squishy middle ground

Page 230: Escalating complexity: DevOps learnings from Air France 447

Operable Systems

man + machine

Page 231: Escalating complexity: DevOps learnings from Air France 447

No amoral actors

Page 232: Escalating complexity: DevOps learnings from Air France 447

We need to look at it from a systems approach, a human/technology system that has to work together. This involves aircraft design and certification, training and human factors. If you look at the human factors alone, then you're missing half or two-thirds of the total system failure.

-- Chesley Sullenberger, Pilot, US 1549, Hudson River Ditching

Page 233: Escalating complexity: DevOps learnings from Air France 447

We need to look at it from a systems approach, a human/technology system that has to work together. This involves aircraft design and certification, training and human factors. If you look at the human factors alone, then you're missing half or two-thirds of the total system failure.

-- Chesley Sullenberger, Pilot, US 1549, Hudson River Ditching

Page 234: Escalating complexity: DevOps learnings from Air France 447

• Damn it, we're going to crash... This can't be happening!

Page 235: Escalating complexity: DevOps learnings from Air France 447

•Final Air France 447 Report: Pilots misunderstood their situation

•Poorly-trained pilots to blame for Air France crash that killed 228

•Final Air France crash report says pilots failed to react swiftly

•Air France 447 downed as crew ignored alarms

•Air France 447 crash a result of crew ignoring alarms

Page 236: Escalating complexity: DevOps learnings from Air France 447

Thank you

Page 237: Escalating complexity: DevOps learnings from Air France 447

Thank you

Liked the talk? Let @auxesis know!