Dimensions of Formal Verification and Validation

Doron Drusinsky, Bret Michael, Mantak Shing

Naval Postgraduate School

Contents

1. Tradeoffs: MC vs. TP vs. EMC

[Diagram: Model/Program, Specification, Verification]

The Role of Specification: “Have we built the right product?”

[Diagram: Model/Program, Specification, Verification]

Customer cognitive requirements, e.g.:

“If pump pressure is turned Low, then High, and then Low again, all within 10 seconds, then the pump should not be High for at least 20 additional seconds.”

Spec. = formal representation

[Timeline diagram: 10 sec, 20 sec, x]

Model/Program, e.g.:

class PumpCtl {
    int x;

    void pumpOn() {
        …
    }
}

The Role of Verification: “Have we built the product right?”

[Diagram: Model/Program, Specification, Verification]

Verification = the bridge between specification and implementation

E.g., specification:

“If pump pressure is turned Low, then High, and then Low again, all within 10 seconds, then the pump should not be High for at least 20 additional seconds.”

[Timeline diagram: 10 sec, 20 sec, x]

Implementation:

class PumpCtl {
    int x;

    void pumpOn() {
        …
    }
}

Verification vs. Validation Emphasis

[Diagram: Model/Program, Specification, Verification]

Most academic work is verification-centered.

We care about modeling, programming, and validation just as much.

Background: Primary Verification Techniques
“True” Model-Checking: automatic verification

[Diagram; recoverable labels:]

• System; test suite = many input sequences
• FM promise: no need to execute the system
• (Finite-state) model of the system, obtained by manual modeling (e.g., in Promela) or via an abstraction tool
• Formal spec, obtained from the cognitive/NL requirement by formalization and validation
• Question checked: model ==? formal spec

Background: Primary Verification Techniques
“True” Model-Checking:

[Diagram: system, formal spec; manual translation (e.g., Promela) or via abstraction tool]

Limitations:
(1) Limited validation (weak, hard-to-use specification languages). A significant limitation in our opinion.
(2) State explosion. The focus of academic interest.

Background: Primary Verification Techniques
Theorem Proving:

[Diagram; recoverable labels:]

• System; test suite = many input sequences
• FM promise: no need to execute the system
• Question checked: (infinite-state) system ==? formal spec

Limitations:
(1) Limited validation (even weaker…, hard-to-use specification languages)
(2) Requires a Ph.D.-level driver

Background: Primary Verification Techniques
Manual Testing:

[Diagram: cognitive requirement, expressed as NL]

Limitations:
(1) Slow, poor verification coverage, expensive, hard to repeat
(2) Requires many human testers (slow and expensive)
(3) Validation is missing (effectively, the tester does both V&V)

Background: Primary Verification Techniques
Execution-based Model-Checking (EMC) = Run-time Execution Monitoring (REM/RV) + Automatic Test Generation (ATG)

[Diagram: Monitor (REM), ATG]

Limitations:
(1) No absolute coverage (but more can be specified…)

Pro: Better validation: easy-to-use, expressive languages, with simulation.

The Coverage Cube (more is better)

[Diagram: cube with axes Program Coverage, Specification Coverage, Verification Coverage]

• Specification coverage (validation related): how well are requirements covered?
• Program coverage: how well does the model under verification match the actual program?
• Verification coverage: are all possible spec violations detectable?

The Coverage Cube (more is better)

MC/TP: 100% verification coverage (only for those requirements that can be specified…), much weaker in the other dimensions.

EMC: restricted verification coverage, good in the other two dimensions.

[Diagram: cube with axes Program Coverage, Specification Coverage, Verification Coverage; EMC and MC, TP plotted]

The Cost Cube (more is worse)

• Cost of writing specifications: how easy is it to write them and to get them right?
• Cost of modeling: is a special modeling language required? Is guided abstraction required?
• Cost of verification

[Diagram: cube with axes Modeling Cost, Specification Cost, Verification Cost]

The Cost Cube (more is worse)

MC:
· Requires a special modeling language or abstraction
· Uses academic, relatively weak spec languages
· Automatic verification

EMC:
· Program == model → zero cost of modeling
· UML-based specification with simulation
· Automatic test generation and monitoring

TP:
· Requires a special modeling language
· Uses academic, relatively weak spec languages; supports limited patterns of requirements
· Requires a Ph.D.-level verification driver

[Diagram: cube with axes Modeling Cost, Specification Cost, Verification Cost; EMC, TP, and MC plotted]

Example #1 of a Validation Issue: Weak Specification Coverage

Customer cognitive requirement:

“If pump pressure is repeatedly turned Low then High N or more times (N>1) within 10 seconds, then the pump should not be Low for at least 20 additional seconds.”

Example #1 (cont.)

Customer cognitive requirement, expressed as a statechart-assertion for RV and EMC.

[Statechart-assertion diagram; recoverable elements:]

/* Local Variables */
static final int N=3;
TRTimeout timer10 = new TRTimeout(10);
TRTimeout timer20 = new TRTimeout(20);
int nCnt = 0;

States:
• Init
• Low
• High
• Verify_Low_for_20_seconds: on entry/ timer20.restart();
• Error: on entry/ System.err.println("Assertion 22 failed!"); bSuccess=false;

Transition labels and conditions:
• []
• pumpLow[]/timer10.restart();
• pumpHigh[]/nCnt++;
• nCnt>=N, with [true] and [false] branches
• pumpLow[]
• timeoutFire[]
• PumpLow[]

Example #1 (cont.)

[Diagram: Model/Program, Specification, Verification]

Outside the scope of MC/TP. They do not support:

• Real-time constraints (10 sec, 20 sec…)
• Counting (N times…)
• In general, they support at most ω-regular properties.

[Statechart-assertion diagram, as on the previous slide: states Init, Low, High, Verify_Low_for_20_seconds, Error; timers timer10 and timer20; counter nCnt compared against N=3; final transition label pumpHigh[]]
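
The counting and real-time aspects of this requirement can be made concrete in plain Java. What follows is a minimal sketch, not StateRover output: the class name `RepeatedToggleMonitor`, the explicit millisecond timestamps (used in place of the `TRTimeout` timers, whose API is not shown on the slide), and the decision to follow the natural-language wording (Low is forbidden during the 20 seconds) are all assumptions made for illustration.

```java
/** Hypothetical hand-written monitor; names and timing model are illustrative. */
final class RepeatedToggleMonitor {
    private enum State { INIT, COUNTING, QUIET }

    private static final long WINDOW_MS = 10_000; // "within 10 seconds"
    private static final long QUIET_MS = 20_000;  // "20 additional seconds"

    private final int n;              // N from the requirement (N > 1)
    private State state = State.INIT;
    private long windowStart;         // time of the Low that opened the window
    private long quietStart;          // time the N-th Low-then-High completed
    private boolean lastWasLow;
    private int nCnt;
    private boolean success = true;

    RepeatedToggleMonitor(int n) {
        if (n <= 1) throw new IllegalArgumentException("N must be > 1");
        this.n = n;
    }

    void pumpLow(long nowMs) {
        expire(nowMs);
        if (state == State.QUIET) {   // Low during the quiet period: violation
            success = false;
            return;
        }
        if (state == State.INIT) {    // first Low opens a 10-second window
            state = State.COUNTING;
            windowStart = nowMs;
            nCnt = 0;
        }
        lastWasLow = true;
    }

    void pumpHigh(long nowMs) {
        expire(nowMs);
        if (state == State.COUNTING && lastWasLow) {
            lastWasLow = false;
            nCnt++;                   // one more Low-then-High toggle
            if (nCnt >= n) {
                state = State.QUIET;
                quietStart = nowMs;
            }
        }
    }

    /** Applies any timeout that elapsed before the current event. */
    private void expire(long nowMs) {
        if (state == State.COUNTING && nowMs - windowStart > WINDOW_MS) {
            state = State.INIT;
        } else if (state == State.QUIET && nowMs - quietStart >= QUIET_MS) {
            state = State.INIT;
        }
    }

    boolean isSuccess() { return success; }
}
```

Feeding the monitor three Low-then-High toggles within five seconds and then a Low five seconds later makes `isSuccess()` return false; the same Low arriving more than 20 seconds after the third High is accepted. Whether the 10-second window should start at the first Low, as assumed here, is exactly the kind of question that validation has to settle.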

Example #2 of Poor Specification Coverage

Customer cognitive requirement, NL (time-series):

Whenever the track count (cnt) Average Arrival Rate (ART) exceeds 80% of MAX_COUNT_PER_MIN, the cnt ART must be reduced back to 50% of MAX_COUNT_PER_MIN within 2 minutes, and the cnt ART must remain below 60% of MAX_COUNT_PER_MIN for at least 10 minutes.

Statechart-assertion for RV and EMC:

[Timeline diagram: if ART > 80%, then ART ≤ 50% within 2 min, and ART < 60% for 10 min]
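
A time-series requirement of this kind can be checked by a small sampling monitor. The sketch below is one possible reading, under stated assumptions: the class name `ArrivalRateMonitor` and the `sample` method are hypothetical, the ART is assumed to be computed elsewhere and passed in as a fraction of MAX_COUNT_PER_MIN, and the 10-minute period is assumed to start at the moment the rate is back at or below 50%.

```java
/** Hypothetical monitor for the ART requirement; the sampling model is assumed. */
final class ArrivalRateMonitor {
    private enum State { NORMAL, RECOVERING, HOLDING }

    private static final long RECOVER_MS = 2 * 60_000L; // "within 2 minutes"
    private static final long HOLD_MS = 10 * 60_000L;   // "at least 10 minutes"

    private State state = State.NORMAL;
    private long phaseStart;          // start time of the current phase
    private boolean success = true;

    /**
     * @param nowMs time of the sample in milliseconds
     * @param ratio ART divided by MAX_COUNT_PER_MIN (0.8 means 80%)
     */
    void sample(long nowMs, double ratio) {
        switch (state) {
            case NORMAL:
                if (ratio > 0.80) {               // trigger: ART exceeds 80%
                    state = State.RECOVERING;
                    phaseStart = nowMs;
                }
                break;
            case RECOVERING:
                if (nowMs - phaseStart > RECOVER_MS) {
                    fail();                       // not back at 50% in time
                } else if (ratio <= 0.50) {
                    state = State.HOLDING;
                    phaseStart = nowMs;
                }
                break;
            case HOLDING:
                if (nowMs - phaseStart >= HOLD_MS) {
                    state = State.NORMAL;         // obligation discharged
                    sample(nowMs, ratio);         // re-check sample as NORMAL
                } else if (ratio >= 0.60) {
                    fail();                       // rose to 60% too early
                }
                break;
        }
    }

    private void fail() {
        success = false;
        state = State.NORMAL;
    }

    boolean isSuccess() { return success; }
}
```

The monitor is conservative about sampling granularity: a sample that arrives after the 2-minute deadline counts as a failure even if it shows a low rate, because the monitor cannot tell when the rate actually dropped.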

More about Specification Languages: LTL or Büchi Automata vs. Statechart-Assertions

LTL and Büchi automata have lower specification coverage and are more expensive to use. A partial list of reasons:

1. Theoretical: weak descriptive power (ω-regular at best).
2. Hard to use; the National Team can attest w.r.t. LTL.
3. Lack of support for most basic constraints (real-time).
4. Infinite-sequence semantics.
5. They are propositional (e.g., Always P Eventually Q), while real systems are both conditional (propositional) and event-driven (see the UML standard).

Example of Poor Program Coverage

[Diagram: Model/Program, Specification, Verification]

Program: InfusionPump.java

Can we verify the property in the context of the REAL code?

Validation using JUnit or MSC

Validation. The StateRover uses JUnit-based simulation for validation.

Requirement: “No more than N (e.g., 2) Q events can follow a P event”

[Statechart-assertion diagram; recoverable elements:]

/* Local Variables */
static final int N=2;
int nCnt;

• Init: on entry/ nCnt = 0;
• Error: on entry/ bSuccess = false; System.err.println("Assertion failed");
• Other state labels: T, A
• Transition labels: [], P[], Q/nCnt++;, P[]/nCnt=0;
• Condition nCnt>N, with [true] and [false] branches

JUnit-based scenario:

assertion.P();
assertion.Q();
assertion.Q();
assertion.P();
assertion.Q();
assertTrue(assertion.isSuccess());

3 Q’s after the 1st P. Is that OK? Depends on cognitive expectation.
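
The question “Is that OK?” can be reproduced with a few lines of plain Java. The sketch below is hand-written for illustration and is not generated StateRover code: the class name `AtMostNQAfterP`, the `resetOnP` flag, and the roles given to the labels T and A are assumptions. The flag selects between the two readings of the requirement: a later P restarts the count (as the `P[]/nCnt=0;` transition suggests), or all Q events after the first P are counted.

```java
/** Hypothetical plain-Java rendering of the assertion; not generated code. */
final class AtMostNQAfterP {
    // T: no P seen yet; A: counting Q events (roles assumed from the diagram)
    private enum State { T, A, ERROR }

    private final int n;            // N from the requirement
    private final boolean resetOnP; // true: every P restarts the count
    private State state = State.T;
    private int nCnt = 0;

    AtMostNQAfterP(int n, boolean resetOnP) {
        this.n = n;
        this.resetOnP = resetOnP;
    }

    void P() {
        if (state == State.T) {
            state = State.A;
            nCnt = 0;
        } else if (state == State.A && resetOnP) {
            nCnt = 0;               // mirrors the P[]/nCnt=0; transition
        }
    }

    void Q() {
        if (state == State.A) {
            nCnt++;
            if (nCnt > n) {
                state = State.ERROR;
            }
        }
    }

    boolean isSuccess() { return state != State.ERROR; }

    /** Replays the slide's scenario: P, Q, Q, P, Q. */
    static boolean runSlideScenario(AtMostNQAfterP assertion) {
        assertion.P();
        assertion.Q();
        assertion.Q();
        assertion.P();
        assertion.Q();
        return assertion.isSuccess();
    }
}
```

With N = 2, `runSlideScenario(new AtMostNQAfterP(2, true))` succeeds, while `runSlideScenario(new AtMostNQAfterP(2, false))` fails on the third Q. Both versions are defensible readings of the same sentence, which is why the scenario has to be checked against the customer's cognitive expectation.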

Validation: What Can Go Wrong?

[Statechart-assertion diagram from the previous slide (labels T, A, P[], Q/nCnt++;, nCnt>N), with the question “Is that OK?”]

1. Assertion is incorrect (usually where blame is assigned).
2. Natural language is ambiguous.
3. NL was written for the main scenario and doesn’t work as well for other scenarios.
4. Validation scenario is not what we think it is…


Thank you

Blunt User Questions

Q1. A property says “light must be on for at least 5 seconds after door opens”. My program already implements that, so why write a spec property for it?

A.
• Indeed, if everything we implemented was always correct, the world would be a nice place…
• When the implementation changes, who is the “lobbyist” for this requirement? We need a separate representative for each requirement.

Blunt User Questions

Q2. Why not write a specification in Java (or in the language of the model)?

A. We write specs as statechart-assertions. The motivation for not writing them in Java is the same motivation that applies to using a code generator in general.

“No more than N newCar events can follow a newTruck event”

[Statechart-assertion diagram; recoverable labels: T, A, newTruck[], newCar/nCnt++;, condition nCnt>N with [true] and [false] branches, newTruck[]/nCnt=0;]

Q3. What’s the difference between a model and a program?

A. Abstraction. Once the model has sufficient detail to be used as source code, it is a program. That’s how StateRover statechart models/programs are used.

Q4. Who says the spec is correct?

A. Validation. The StateRover uses JUnit-based simulation for validation.

[Statechart-assertion diagram (labels T, A, P[], Q/nCnt++;, nCnt>N), with the question “Is that OK?”]



Comments of IV&V Director Dr. Caffal

1. Natural language requirements are typically vague, inconsistent, and incomplete.

2. Natural language requirements frequently have counter-examples to the expressed logic. The counter-examples are not easily observed by reading the requirement.

3. Unlike other disciplines, software developers oftentimes do not employ tools to describe behavior and elicit requirements.

4. Behavior specification comes in three flavors: what we want the system to do, what we do not want the system to do, and what we want the system to do under adverse conditions.

5. Nearly impossible to detect missing requirements.

6. Natural language requirements typically express constraints and limitations and rarely express behavior. The Team was hard pressed to come up with behavioral requirements.

7. Without specifying behavior, developers implicitly allow programmers to define behaviors. As such, system behaviors emerge without design and structure. Thus, emergent behaviors of systems are frequently an unhappy surprise to developers.

Behavioral specifications about: what we want the system to do

Whenever a stop command is received, the vehicle should reach a complete stop within 30 seconds.

[Statechart-assertion diagram; recoverable elements:]

/* Local Variables */
TRTimeoutFireSimulatedTime timer = new TRTimeoutFireSimulatedTime(30, this);

• States: Init, Stop
• Error: on entry/ bSuccess=false; System.err.println("Assertion for Req.213 failed");
• Transition labels: [], stopCommand[], timeoutFire[]
• Condition Primary.getSpeed() < 1, with [true] and [false] branches
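
The same positive-behavior check can be sketched in plain Java. This is an illustration under stated assumptions, not the StateRover implementation: the class name `StopWithin30sMonitor` and its `tick` method are hypothetical, time is passed in explicitly as milliseconds instead of using `TRTimeoutFireSimulatedTime`, and a `DoubleSupplier` stands in for the `getSpeed()` call on the primary statechart. As in the diagram, the speed is sampled once, when the 30-second timer fires.

```java
/** Hypothetical monitor: speed must be below 1 when the 30 s timer fires. */
final class StopWithin30sMonitor {
    private static final long LIMIT_MS = 30_000;

    // Stands in for the getSpeed() query on the monitored system (assumed).
    private final java.util.function.DoubleSupplier speed;
    private boolean waiting;      // true between stopCommand and timer expiry
    private long commandTime;
    private boolean success = true;

    StopWithin30sMonitor(java.util.function.DoubleSupplier speed) {
        this.speed = speed;
    }

    void stopCommand(long nowMs) {
        if (!waiting) {           // Init -> Stop: start the 30 s timer
            waiting = true;
            commandTime = nowMs;
        }
    }

    /** Call periodically with the current (simulated) time. */
    void tick(long nowMs) {
        if (waiting && nowMs - commandTime >= LIMIT_MS) {
            waiting = false;      // timer fired: back to Init or to Error
            if (!(speed.getAsDouble() < 1.0)) {
                success = false;
            }
        }
    }

    boolean isSuccess() { return success; }
}
```

A test can drive it with a mutable speed value, for example `double[] v = {20.0}; new StopWithin30sMonitor(() -> v[0])`, lowering `v[0]` before calling `tick(30_000)`.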

Behavioral specifications about: what we do not want the system to do (“negative behavior”)

Pump should never operate until at least two seconds after valve-shut.

[Statechart-assertion diagram; recoverable elements:]

/* Local Variables */
TRTimeoutFireSimulatedTime timer = new TRTimeoutFireSimulatedTime(2, this);

• Init
• Count_2_sec: on entry/ timer.restart();
• Error: on entry/ bSuccess=false; System.err.println("Assertion for Req.213 failed");
• Transition labels: [], valveShut[], timeoutFire[], pumpStarted[]

This is where the end user says: “I’ve already implemented this behavior the positive way, why do I need a negative behavior assertion?”
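
A negative-behavior assertion is small enough to write by hand, which helps show what it adds over the positive implementation. The following is a minimal sketch with assumed names (`PumpAfterValveShutMonitor`, explicit millisecond timestamps); it follows the diagram's reading, in which only a pump start within two seconds after a valve-shut event is a violation.

```java
/** Hypothetical monitor: no pumpStarted within 2 s after valveShut. */
final class PumpAfterValveShutMonitor {
    private static final long DELAY_MS = 2_000;

    private boolean counting;     // true while in the Count_2_sec state
    private long shutTime;
    private boolean success = true;

    void valveShut(long nowMs) {
        counting = true;          // (re)start the 2-second count
        shutTime = nowMs;
    }

    void pumpStarted(long nowMs) {
        if (counting) {
            if (nowMs - shutTime < DELAY_MS) {
                success = false;  // pump started too early
            }
            counting = false;     // either violated or the timer has expired
        }
    }

    boolean isSuccess() { return success; }
}
```

The monitor stays valid when the pump controller is later rewritten: it knows nothing about how the delay is implemented, only that a start at 1.5 seconds after valve-shut is an error and a start at 2 seconds or later is not.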

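Behavioral specifications about: what the system will do under adverse conditions (recovery)

[Statechart diagram: a primary statechart and its assertion statechart; recoverable elements:]

Primary statechart:
• State labels: Red, Camera, Count Cars, C_0, On, Off, Shoot, CriticalRegion
• Count (BREAK): on entry/ nCnt = 1;
• Transition labels: [], start[], newCar[], newTruck[], increment, newCar(Car obj)[isRolls(obj)]/nCnt = 4;
• Condition bTest(), with [true] and [false] branches

Assertion:
• State labels: Init, T
• Error (BREAK): on entry/ bSuccess = false;
• Transition labels: [], newCar[]/timeout.restart(); // restart timer when newCar fires
• Transition labels: primaryEntered("Off")[], newTruck[], timeoutFire()[]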

Doing More for Validation

[Statechart-assertion diagram (labels T, A, P[], Q/nCnt++;, nCnt>N)]

Under development: a tool that points out missing assertion simulation/validation scenarios.


Backup

Behavioral specifications about: what we do not want the system to do (“negative behavior”)

As of cruiseSet, speed should not change by more than 2% unless incline is more than 5% for more than 10 seconds.

[Timeline diagram: V stable, 5 sec, speed instability, 5 sec]

Ambiguous:
a. Incline after speed instability
b. Incline during speed instability

[Statechart-assertion diagram; recoverable elements:]

/* Local Variables */
TRTimeoutFireSimulatedTime timer_1sec = new TRTimeoutFireSimulatedTime(1, this);
int speedAtCruiseSetTime = 0;
int speedNow;
int inclineDuration = 0;

States:
• Init
• DoorClosed: on entry/ timer_1sec.restart();
• (unnamed): on entry/ speedNow = primary.getSpeed(); if (primary.getIncline() > 5) inclineDuration++; else inclineDuration = 0;
• checkIncline: on entry/ timer_1sec.restart();
• Error: on entry/ bSuccess=false; System.err.println("Assertion for Req.213 failed");

Transition labels and conditions:
• []
• cruiseSet[]/speedAtCruiseSetTime = primary.getSpeed();
• cruiseOff[]
• timeoutFire[] (twice)
• abs(speedNow-speedAtCruiseSetTime)<0.02, with [true] and [false] branches
• primary.getIncline() > 5, with [true]/inclineDuration++; and [false] branches
• inclineDuration == 0, with [true] and [false] branches
• inclineDuration > 10, with a [false] branch

Behavioral specifications about: what we do not want the system to do (“negative behavior”)

Negative statement: As of cruiseSet, speed should not change by more than 2% unless incline is more than 5% for more than 10 seconds.

Positive statement: As of cruiseSet, speed should be 98% stable unless incline is more than 5% for more than 10 seconds.

The key about negative behavior is not the way it’s phrased. It’s the fact that a system is built to do the positive, so it is assumed the negative is