Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

May 9, 2008, IPA Lentedagen, Rhenen. Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains. Hichem Boudali (1), Pepijn Crouzen (2), and Mariëlle Stoelinga (1). (1) Formal Methods and Tools group, CS, University of Twente, NL. (2) Dependable Systems and Software group, CS, Saarland University, Germany.


Transcript of Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Page 1: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

May 9, 2008 IPA Lentedagen, Rhenen 1

Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Hichem Boudali (1), Pepijn Crouzen (2), and Mariëlle Stoelinga (1)

(1) Formal Methods and Tools group, CS, University of Twente, NL

(2) Dependable Systems and Software group, CS, Saarland University, Germany

Page 2: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Introduction: Dependability

Dependability: The trustworthiness of a computer system such that reliance can justifiably be placed upon the service it delivers.

Reliability: The probability that a computer system does not fail within a given time bound.

Page 3: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Introduction: Formal dependability

Continuous-time Markov chains (CTMC)

States and Markovian transitions
Probability of traversing a λ-transition within t time-units is: $1 - e^{-\lambda t}$

Tools: Reachability analysis (among others)

[Figure: example CTMC with transitions labeled λ and μ]
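The formula for a λ-transition can be checked numerically. The snippet below is a minimal sketch; the function name and the example rate of 0.2 per hour are illustrative choices, not part of the slides.

```python
import math

def prob_transition_within(rate: float, t: float) -> float:
    """Probability that an exponentially distributed delay with the
    given rate has elapsed within t time-units: 1 - e^(-rate * t)."""
    return 1.0 - math.exp(-rate * t)

# Illustrative values (assumed): rate 0.2 per hour, horizon 5 hours.
p = prob_transition_within(0.2, 5.0)
print(f"P(transition within 5 h) = {p:.4f}")
```

The probability is 0 at t = 0 and tends to 1 as t grows, as expected for a cumulative distribution function.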
Page 4: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Introduction: CTMC characteristics

CTMCs describe probability distributions (phase-type distributions)

Phase-type distributions can approximate any distribution on the non-negative reals arbitrarily closely

Goal: Find a CTMC which describes the probability of system failure within t time-units (i.e. the unreliability of the system)

Problem: It is difficult to find the CTMC that models a large system

[Figure: example CTMC with transitions labeled λ and μ]
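As a small illustration of the approximation claim, consider the Erlang distribution, a simple phase-type distribution made of k exponential phases in sequence. The sketch below is not from the slides: it assumes a fixed delay of 1 time-unit as the target and shows that an Erlang distribution with the same mean concentrates around that delay as k grows. The function names are hypothetical.

```python
import math

def erlang_cdf(k: int, rate: float, t: float) -> float:
    """CDF of the sum of k independent exponential phases, each with
    the given rate: 1 - e^(-x) * sum_{n<k} x^n / n!, with x = rate * t."""
    x = rate * t
    term, partial_sum = 1.0, 0.0
    for n in range(k):
        partial_sum += term      # term equals x^n / n!
        term *= x / (n + 1)
    return 1.0 - math.exp(-x) * partial_sum

def erlang_cdf_with_mean(k: int, mean: float, t: float) -> float:
    """Erlang-k CDF whose mean is fixed by choosing rate = k / mean."""
    return erlang_cdf(k, k / mean, t)

# With more phases, the CDF approaches a step at the mean (here 1.0).
for k in (1, 5, 50):
    before = erlang_cdf_with_mean(k, 1.0, 0.5)
    after = erlang_cdf_with_mean(k, 1.0, 1.5)
    print(f"k={k:2d}  F(0.5)={before:.4f}  F(1.5)={after:.4f}")
```

With k = 1 this is the plain exponential distribution; larger k gives a sharper rise around the mean at the cost of a larger CTMC.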
Page 5: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Introduction: Engineering dependability

Fault Trees (1960s)
  Graphical
  Easy to use
  Syntax: Basic events, Gates
  Semantics: logical formula
  Problem: Not expressive enough

[Figure: fault tree with top event "Workstation fails", an OR gate, an AND gate, and basic events "CPU fails" and "Mem1 fails"]
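The "logical formula" semantics of a static fault tree can be sketched in a few lines. The tree shape used here is an assumption: the extracted figure only shows the labels "Workstation fails", OR, AND, "CPU fails" and "Mem1 fails", so the second memory (`mem2`) and all probabilities below are hypothetical.

```python
def workstation_fails(cpu_failed: bool, mem1_failed: bool, mem2_failed: bool) -> bool:
    """Assumed tree: OR(CPU fails, AND(Mem1 fails, Mem2 fails))."""
    return cpu_failed or (mem1_failed and mem2_failed)

def workstation_failure_probability(p_cpu: float, p_mem1: float, p_mem2: float) -> float:
    """Failure probability of the assumed tree for independent basic events."""
    p_and = p_mem1 * p_mem2                    # AND gate: both inputs failed
    return 1.0 - (1.0 - p_cpu) * (1.0 - p_and) # OR gate: at least one input failed

# Hypothetical basic-event failure probabilities.
print(workstation_failure_probability(0.1, 0.5, 0.5))
```

A formula like this has no notion of time order or spare activation, which is the expressiveness gap that dynamic fault trees address.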

Page 6: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Introduction: Engineering dependability

Dynamic Fault Trees (1992)
  Extension of classic fault trees
  Additions: Use of spares, Dependencies, Order-based failure
  Tools: Convert to CTMC

[Figure: dynamic fault tree with top event "System failure", gates labeled OR and SPARE, components P1 and P2, and spare S]

Page 7: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

But… DFT Drawbacks

Scalability
Ambiguous syntax and semantics
Lack of modularity:
  Dynamic modules cannot be reused
  Restrictions on spares and dependencies
Existing analysis technique is hard to extend or modify

Page 8: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Outline

Case study: FTPP system
DFT approach
Formalizing DFTs
  DFT semantics in I/O-IMCs
  Deep compositionality
  Extending the DFT formalism
Conclusion
Future work

Page 9: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Case study: FTPP

[Figure: FTPP architecture with processors labeled A, B, C and D and network elements NE1 to NE4]

16 processors divided into 4 groups
4 network elements connect the processors
Per group 2 processors must be operational
Different configurations are possible

Page 10: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Case study: FTPP

[Figure: FTPP architecture with processors labeled A, B, C, D and S and network elements NE1 to NE4]

16 processors divided into 4 groups
4 network elements connect the processors
Per group 2 processors must be operational
Different configurations are possible
Dynamic redundancy management is possible
How reliable is each configuration?

Page 11: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

FTPP DFT

[Figure: DFT of the FTPP system. Top event "System Failure" (OR gate) over "Group 1 Failure" to "Group 4 Failure"; each group has a 2/3 gate with processors A, B, C and spare S. Four FDEP gates: NE1 with the four A processors, NE2 with the four B processors, NE3 with the four C processors, NE4 with the four S processors. Shown next to the FTPP architecture diagram.]

Page 12: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Existing DFT analysis [Dugan et al. 1992]

For static fault trees binary decision diagrams can be used!
Otherwise: Convert the DFT into a CTMC. Analyze the CTMC using standard solution techniques.

Example: AND-gate C with inputs A (failure rate: 0.2 f/h) and B (failure rate: 0.4 f/h)

CTMC states:
  Starting state: A is operational, B is operational
  A has failed, B is operational
  A is operational, B has failed
  A has failed, B has failed

Pr(A fails in T hours) = $1 - e^{-0.2 \cdot T}$

A's mean time to failure = 1/0.2 = 5 hours

Unreliability = Prob[reaching the state "A has failed, B has failed" within time T]

But… State space explosion:
  CTMC grows exponentially
  FTPP difficult to analyze

[Figure: AND gate C over A and B, and the corresponding four-state CTMC with rates 0.2 and 0.4]
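The AND-gate example is small enough to analyze by hand and by machine. The sketch below builds the four-state CTMC with the rates from the slide (0.2 f/h for A, 0.4 f/h for B) and computes the probability of reaching the "both failed" state within T hours. Uniformization is used as one standard transient-analysis technique; the slides do not say which solver the original tools use, and all function names here are illustrative.

```python
import math

# States: 0 = A up, B up; 1 = A failed, B up; 2 = A up, B failed; 3 = both failed.
RATE_A, RATE_B = 0.2, 0.4  # failures per hour, as on the slide
Q = [
    [-(RATE_A + RATE_B), RATE_A, RATE_B, 0.0],
    [0.0, -RATE_B, 0.0, RATE_B],
    [0.0, 0.0, -RATE_A, RATE_A],
    [0.0, 0.0, 0.0, 0.0],   # "both failed" is absorbing
]

def transient(Q, p0, t, eps=1e-12, max_terms=100_000):
    """State probabilities at time t via uniformization:
    pi(t) = sum_k Poisson(rate*t; k) * p0 * P^k, with P = I + Q/rate."""
    n = len(Q)
    rate = max(-Q[i][i] for i in range(n))
    if rate == 0.0 or t == 0.0:
        return list(p0)
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rate * t)          # Poisson weight for k = 0
    vec = list(p0)
    result = [weight * v for v in vec]
    mass, k = weight, 0
    while 1.0 - mass > eps and k < max_terms:
        k += 1
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k            # next Poisson weight
        result = [r + weight * v for r, v in zip(result, vec)]
        mass += weight
    return result

def unreliability(t):
    """Probability that the AND gate has failed within t hours."""
    return transient(Q, [1.0, 0.0, 0.0, 0.0], t)[3]

def unreliability_closed_form(t):
    """Independent exponential failures: both A and B failed by time t."""
    return (1 - math.exp(-RATE_A * t)) * (1 - math.exp(-RATE_B * t))

print(unreliability(10.0), unreliability_closed_form(10.0))
```

For two independent basic events the closed form is available as a cross-check; for dynamic gates such as spares no such product form exists in general, which is why the CTMC route is needed.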

Page 13: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

FTPP Results

[Figure: the FTPP DFT and the FTPP architecture diagram, as on the previous slides]

Analysis method   Max number of states   Max number of transitions   Unreliability (T=10)
Standard          32757                  426826                      2.55479 · 10^-8
Compositional     1325                   14153                       2.55479 · 10^-8

Page 14: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

What's behind it? Input/Output Interactive Markov Chains (I/O-IMC)

Model local behavior
We need compositional Markov chains
Combination of LTS and CTMC, with I/O automata features
  Markovian transitions (CTMC)
  Interactive transitions (LTS)
  Action signature (IOA):
    ? - Input actions
    ! - Output actions
    ; - Internal actions

[Figure: I/O-IMC for a basic event: a Markovian transition with rate λ followed by the output action failed!]
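The ingredients listed above (Markovian transitions, interactive transitions, and an action signature split into inputs, outputs and internal actions) can be captured in a small data structure. This is a hypothetical encoding for illustration only, not the representation used by the authors' tools; the class name `IOIMC`, the state names, and the action name `f(A)` (borrowed from later slides) are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class IOIMC:
    """Hypothetical container for an I/O-IMC."""
    initial: str
    inputs: set = field(default_factory=set)      # actions written a?
    outputs: set = field(default_factory=set)     # actions written a!
    internals: set = field(default_factory=set)   # actions written a;
    interactive: dict = field(default_factory=dict)  # (state, action) -> next state
    markovian: dict = field(default_factory=dict)    # state -> [(rate, next state)]

def basic_event(name: str, rate: float) -> IOIMC:
    """Basic event: wait an exponential delay, then announce the failure."""
    failed_signal = f"f({name})"
    return IOIMC(
        initial="operational",
        outputs={failed_signal},
        interactive={("failing", failed_signal): "failed"},
        markovian={"operational": [(rate, "failing")]},
    )

event_a = basic_event("A", 0.2)
print(event_a.outputs, event_a.markovian)
```

Keeping the delay (Markovian) and the announcement (output action) as separate transitions is what lets other components react to the failure by synchronizing on the action.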

Page 15: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Input/Output Interactive Markov Chains

Properties of IMCs:
  Combine stochastic behavior and interactive behavior orthogonally
  CSP-style synchronization + interleaving semantics
  Maximal progress for internal transitions

Properties of I/O-IMCs:
  Unique outputs
  Input enabledness
  Outputs cannot be blocked!
  Maximal progress for output transitions

[Figure: transitions labeled λ and τ]

Page 16: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

DFT semantics: DFT gate to I/O-IMC

[Figure: a DFT gate C and its I/O-IMC, with input actions f(A)? and f(B)? and output action f(C)!]
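To make the gate-to-I/O-IMC idea concrete, the sketch below encodes a two-input AND gate C as a transition table. The state numbering 1 to 5 is an assumption inferred from the state labels on the later composition slides, and the handling of unlisted inputs as self-loops follows the "input enabledness" property; none of the names come from the authors' tools.

```python
# Interactive transitions of the AND gate C: (state, action) -> next state.
# 1 = nothing failed, 2 = A failed, 3 = B failed,
# 4 = both failed (about to signal), 5 = failure of C signalled.
AND_GATE_C = {
    (1, "f(A)?"): 2,
    (1, "f(B)?"): 3,
    (2, "f(B)?"): 4,
    (3, "f(A)?"): 4,
    (4, "f(C)!"): 5,
}

def step(state: int, action: str) -> int:
    """Take one interactive step. Inputs are always accepted: an input
    without an explicit transition leaves the state unchanged."""
    if action.endswith("?"):
        return AND_GATE_C.get((state, action), state)
    return AND_GATE_C[(state, action)]  # KeyError if the output is not enabled

def run(actions, state: int = 1) -> int:
    """Apply a sequence of actions starting from the given state."""
    for action in actions:
        state = step(state, action)
    return state

print(run(["f(B)?", "f(A)?", "f(C)!"]))
```

The gate only emits f(C)! after both inputs have been received, in either order, which is the AND semantics.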

Page 17: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

What is deep compositionality?

The semantics of a DFT arises naturally as the composition of the semantics of its building blocks

But: This may lead to huge models.

[Figure: the "Group 1 Failure" 2/3 gate with processors A, B, C and spare S, next to models with signals f(NE1), f(NE2), f(NE3), f(NE4) and f(G1)]

Page 18: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Why use deep compositionality?

Formally define semantics
Many useful techniques
  Combining models: Composition
  Refining models: Hiding
  Minimizing models: Bisimulation
  Reusing models: Renaming
Well supported by CADP toolset (VASY/INRIA)

Combat state-space explosion

Page 19: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Compositional Aggregation

Translation
Composition + Abstraction
Aggregation (minimization)
Repeat
Aggregated system CTMC (CTMDP)
Analysis
Result: System failure probability

Page 20: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

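Compositional Aggregation: Example

[Figure: I/O-IMC of gate C with input actions f(A)? and f(B)? and output action f(C)!. Basic event A, failure rate 0.2 f/h: a Markovian transition with rate 0.2 followed by f(A)!. Basic event B, failure rate 0.4 f/h: a Markovian transition with rate 0.4 followed by f(B)!]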

Page 21: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Compositional Aggregation: Parallel Composition

C: Inputs: f(A)? and f(B)?; Outputs: f(C)!
A: Inputs: none; Outputs: f(A)!
C||A: Synchronize on f(A)

[Figure: I/O-IMC C (states 1 to 5), I/O-IMC A (states 1 to 3), and their parallel composition C||A with states 1||1, 1||2, 2||3, 3||1, 3||2, 4||3 and 5||3]
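The composition of gate C with basic event A can be reproduced with a short reachability construction. This is a simplified sketch under stated assumptions: the dictionary encoding and the function `compose` are hypothetical, the state numbers follow the labels in the figure, unlisted inputs are treated as implicit self-loops (input enabledness), and maximal progress is not applied.

```python
from collections import deque

# "inter" maps (state, action) -> next state; "markov" maps state -> [(rate, next)].
GATE_C = {
    "init": 1, "inputs": {"f(A)", "f(B)"}, "outputs": {"f(C)"},
    "inter": {(1, "f(A)"): 2, (1, "f(B)"): 3, (2, "f(B)"): 4,
              (3, "f(A)"): 4, (4, "f(C)"): 5},
    "markov": {},
}
EVENT_A = {
    "init": 1, "inputs": set(), "outputs": {"f(A)"},
    "inter": {(2, "f(A)"): 3},
    "markov": {1: [(0.2, 2)]},
}

def compose(left, right):
    """Parallel composition restricted to reachable states. Shared actions
    synchronize; an output of one side stays an output of the composite."""
    outputs = left["outputs"] | right["outputs"]
    inputs = (left["inputs"] | right["inputs"]) - outputs
    init = (left["init"], right["init"])
    inter, markov = {}, {}
    seen, queue = {init}, deque([init])
    while queue:
        state = queue.popleft()
        s, t = state
        # Markovian transitions interleave: only one side moves.
        rates = [(r, (s2, t)) for r, s2 in left["markov"].get(s, [])]
        rates += [(r, (s, t2)) for r, t2 in right["markov"].get(t, [])]
        targets = [target for _, target in rates]
        if rates:
            markov[state] = rates
        for action in sorted(inputs | outputs):
            # An output can only occur where its owner enables it.
            if action in left["outputs"] and (s, action) not in left["inter"]:
                continue
            if action in right["outputs"] and (t, action) not in right["inter"]:
                continue
            # A side that does not react to the action stays where it is.
            target = (left["inter"].get((s, action), s),
                      right["inter"].get((t, action), t))
            if action in inputs and target == state:
                continue  # implicit input self-loop, not stored
            inter[(state, action)] = target
            targets.append(target)
        for target in targets:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return {"init": init, "inputs": inputs, "outputs": outputs,
            "inter": inter, "markov": markov, "states": seen}

C_A = compose(GATE_C, EVENT_A)
print(sorted(C_A["states"]))
```

Under these assumptions the construction reaches exactly the seven composite states named in the figure, and f(A) is no longer an input of the composite because A now provides it.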

Page 22: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Compositional Aggregation: Abstraction (hiding)

Abstraction (hiding): Makes signal internal

[Figure: C||A with states 1||1, 1||2, 2||3, 3||1, 3||2, 4||3 and 5||3, in which the output action f(A)! becomes the internal action f(A);]
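Hiding only changes the action signature: the action moves from the set of outputs to the set of internal actions, while states and transitions stay the same. The sketch below shows this on a bare signature; the dictionary layout and the function `hide` are hypothetical and not taken from the authors' tools.

```python
def hide(model: dict, action: str) -> dict:
    """Return a copy of the model in which an output action is internal."""
    if action not in model["outputs"]:
        raise ValueError(f"{action} is not an output action")
    hidden = dict(model)  # shallow copy; the original model is not modified
    hidden["outputs"] = model["outputs"] - {action}
    hidden["internals"] = model.get("internals", set()) | {action}
    return hidden

# Signature of C||A as on the slides: f(B) is still an input,
# f(A) and f(C) are outputs.
C_A = {"inputs": {"f(B)"}, "outputs": {"f(A)", "f(C)"}, "internals": set()}
C_A_hidden = hide(C_A, "f(A)")
print(C_A_hidden["outputs"], C_A_hidden["internals"])
```

Once f(A) is internal, no other component can observe it, which is what allows the later aggregation step to disregard it.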

Page 23: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Compositional Aggregation: Aggregation (weak bisimulation)

Weak bisimulation: Disregard internal steps

Aggregation: Finding a smaller model equivalent (behaviorally) to the original

[Figure: C||A with internal actions f(A); and states 1||1, 1||2, 2||3, 3||1, 3||2, 4||3 and 5||3]

Page 24: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Compositional Aggregation: Example (continued)

[Figure: the aggregated C||A composed with B into C||A||B; transition labels include 0.2, 0.4, f(B)?, f(B)! and f(C)!, and composite states include 1||1, 2||1, 1||2, 2||2, 3||3, 4||3 and 5||3]

Page 25: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Compositional Aggregation: Example (continued)

[Figure: aggregated model of C||A||B with Markovian transitions labeled 0.2 and 0.4 and output action f(C)!]

Page 26: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

DFT extensions

Extensions:
  Inhibition
  Repair-policies
  Complex spares
  Complex dependencies
  …

Adding extensions in the compositional framework is easy:
  Modify translation of DFT building blocks
  Compositional aggregation algorithm is unaltered

Free!

DSN07

Page 27: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

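Extension: Repair

[Figure: I/O-IMC of AND-gate C extended with repair, with input actions f(A)?, f(B)?, r(A)? and r(B)? and output actions f(C)! and r(C)!. I/O-IMC of basic event A: a Markovian transition with rate λ followed by f(A)!, and a Markovian transition with rate µ followed by r(A)!]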

Page 28: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Conclusion: How we tackled drawbacks

State-space explosion: Compositional Aggregation
Ambiguous syntax and semantics: DAG; I/O-IMC; Formal translation
Lack of modularity:
  Dynamic modules cannot be reused: Renaming!
  Restrictions on spares and dependencies: Lifted!
Existing analysis technique is hard to extend and/or modify: Extensions at the lowest level

Page 29: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains

Future work

Fully automated tool (CORAL)
More aggressive state reduction
  Recent work: specialized acyclic algorithm
Apply deep compositionality to more advanced engineering formalisms! (see Boudali et al., DSN08)
Extend DFT formalism
  Repair
  Failure modes
  Non-exponential failure distributions
  Sophisticated dependencies

Page 30: Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains


The end!

Questions?