Dimensions of Formal Verification and Validation
Doron Drusinsky, Bret Michael, Mantak Shing
Naval Postgraduate School
Contents
1. Tradeoffs: model checking (MC) vs. theorem proving (TP) vs. execution-based model checking (EMC)
[Figure: diagram relating Model/Program, Specification, and Verification]
The Role of Specification: “Have we built the right product?”
E.g.,
“if pump pressure is turned Low then High and then Low again all within 10 seconds then pump should not be High for at least 20 additional seconds”
class PumpCtl {
    int x;
    void pumpOn() {
        …
    }
}
Customer cognitive requirements
Spec. = formal representation
[Figure: timeline marked with 10 sec and 20 sec intervals]
The Role of Verification: “Have we built the product right?”
Verification = the bridge between specification and implementation
Verification vs. Validation Emphasis
Most academic work is verification-centered.
We care about modeling, programming, and validation just as much.
Background: Primary Verification Techniques
“True” Model-Checking: automatic verification
[Figure: flow diagram with the following elements]
• system → (finite-state) model of system: manual modeling (e.g., in Promela) or via abstraction tool
• cognitive/NL requirement → formal spec: formalization and validation
• (finite-state) model of system ==? formal spec
• Test suite = many input sequences. FM promise: no need to execute the system.
Background: Primary Verification Techniques
“True” Model-Checking:
Limitations:
(1) limited validation (weak, hard-to-use spec. languages)
(2) state explosion
Focus of academic interest
A significant limitation in our opinion
Background: Primary Verification Techniques
Theorem Proving:
[Figure: flow diagram with the following elements]
• cognitive/NL requirement → formal spec: formalization and validation
• (infinite-state) system ==? formal spec
• Test suite = many input sequences. FM promise: no need to execute the system.
Background: Primary Verification Techniques
Theorem Proving:
Limitations:
(1) limited validation (even weaker…, hard-to-use spec. languages)
(2) requires a Ph.D.-level driver
Background: Primary Verification Techniques
Manual Testing:
Limitations: (1) Slow, poor verification coverage, expensive, hard to repeat
(2) Requires many human testers (slow and expensive)
(3) Validation is missing (effectively tester does both V&V)
Cognitive requirement
Express as NL
Background: Primary Verification Techniques
Execution-based Model-Checking (EMC) = Run-time Execution Monitoring (REM/RV) + Automatic Test Generation (ATG):
[Figure: cognitive/NL requirement → (formalization and validation) → Monitor (REM); ATG]
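The combination of monitoring and test generation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `CarAfterTruckMonitor` is a hand-written stand-in for a generated monitor (it borrows the deck's own "No more than N newCar events can follow a newTruck event" property), `BuggyGate` is a hypothetical system under test with a deliberate off-by-one bug, and `EmcSketch.failingRuns` plays the role of a very naive random test generator. It is not the StateRover API.

```java
import java.util.Random;

/** Monitor (REM) for "No more than N newCar events can follow a newTruck event". */
class CarAfterTruckMonitor {
    static final int N = 3;
    private boolean truckSeen = false; // counting starts at the first newTruck
    private int nCnt = 0;
    private boolean bSuccess = true;

    void newTruck() { truckSeen = true; nCnt = 0; }

    void newCar() {
        if (truckSeen && ++nCnt > N) {
            bSuccess = false;
        }
    }

    boolean isSuccess() { return bSuccess; }
}

/** Hypothetical system under test with a deliberate off-by-one bug. */
class BuggyGate {
    private final CarAfterTruckMonitor monitor;
    private int carsSinceTruck = 0;

    BuggyGate(CarAfterTruckMonitor monitor) { this.monitor = monitor; }

    void truckArrives() {
        carsSinceTruck = 0;
        monitor.newTruck();
    }

    void carArrives() {
        // Bug: the guard should be "<"; "<=" admits one car too many.
        if (carsSinceTruck <= CarAfterTruckMonitor.N) {
            carsSinceTruck++;
            monitor.newCar();
        }
    }
}

class EmcSketch {
    /** ATG: drives fresh gates with random arrivals and counts the runs the monitor flags. */
    static int failingRuns(int tests, int length, long seed) {
        Random rnd = new Random(seed);
        int failures = 0;
        for (int i = 0; i < tests; i++) {
            CarAfterTruckMonitor monitor = new CarAfterTruckMonitor();
            BuggyGate gate = new BuggyGate(monitor);
            for (int j = 0; j < length; j++) {
                if (rnd.nextBoolean()) gate.truckArrives(); else gate.carArrives();
            }
            if (!monitor.isSuccess()) failures++;
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failing runs: " + failingRuns(200, 12, 42L));
    }
}
```

The monitor observes the real program rather than an abstraction of it, which is where the program coverage comes from. The generator only samples input sequences, which is why the coverage is not absolute.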
Background: Primary Verification Techniques
Execution-based Model-Checking (EMC):
Limitations: (1) No absolute coverage (but more can be specified…)
PRO: Better validation: easy-to-use, expressive languages, with simulation.
The Coverage Cube (more is better)
• Specification coverage (validation related): How well are requirements covered?
• Program coverage: How well does the model under verification match the actual program?
• Verification coverage: Are all possible spec violations detectable?
The Coverage Cube (more is better)
MC/TP: 100% verification coverage (only for those requirements that can be specified…), much weaker in the other dimensions
EMC: restricted verification coverage, good in the other two dimensions
[Figure: cube with axes Program Coverage, Specification Coverage, and Verification Coverage, locating EMC and MC/TP]
The Cost Cube (more is worse)
• Cost of writing specifications: how easy is it to write them and to get them right?
• Cost of modeling: is a special modeling language required, is guided abstraction required?
• Cost of verification
[Figure: cube with axes Modeling Cost, Specification Cost, and Verification Cost]
The Cost Cube (more is worse)
MC:
· requires special modeling language or abstraction
· uses academic, relatively weak spec languages
· automatic verification
EMC:
· program == model → zero cost of modeling
· UML-based specification with simulation
· automatic test generation and monitoring
TP:
· requires special modeling language
· uses academic, relatively weak spec languages; supports limited patterns of requirements
· requires a Ph.D.-level verification driver
[Figure: cube with axes Modeling Cost, Specification Cost, and Verification Cost, locating EMC, TP, and MC]
Example #1 of a Validation Issue: Weak Specification Coverage
Customer cognitive requirement:
“if pump pressure is repeatedly turned Low then High N or more times (N>1) within 10 seconds then pump should not be Low for at least 20 additional seconds”
Example #1 (cont.)
Customer cognitive requirement → statechart-assertion for RV and EMC
[Figure: statechart-assertion; recoverable labels follow]
/*Local Variables*/
static final int N=3;
TRTimeout timer10 = new TRTimeout(10);
TRTimeout timer20 = new TRTimeout(20);
int nCnt = 0;
States:
  Init
  Low
  High
  Verify_Low_for_20_seconds, on entry/timer20.restart();
  Error, on entry/System.err.println("Assertion 22 failed!");bSuccess=false;
Transition labels:
  Init[]
  pumpLow[]/timer10.restart();
  pumpHigh[]/nCnt++;
  nCnt>=N, with [true] and [false] branches
  pumpLow[]
  timeoutFire[]
  PumpLow[]
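For readers without the diagram, the statechart-assertion above can be approximated in plain Java. This is a minimal sketch under stated assumptions, not generated StateRover code: the class name `RepeatedToggleMonitor` is hypothetical, explicit timestamps in seconds replace the `TRTimeout` timers, the 10-second window is anchored at the first `pumpLow` and reset only when it expires, and a `pumpLow` during the 20-second interval is treated as the violation, following the wording of the requirement.

```java
/** Hand-written approximation of the Example #1 assertion; times are in seconds. */
class RepeatedToggleMonitor {
    static final int N = 3;

    private enum State { INIT, LOW, HIGH, VERIFY }

    private State state = State.INIT;
    private double timer10Start; // start of the 10-second counting window
    private double timer20Start; // start of the 20-second "must not be Low" interval
    private int nCnt = 0;
    private boolean bSuccess = true;

    /** Emulates timeoutFire: expire whichever timer is running at time t. */
    private void checkTimers(double t) {
        if ((state == State.LOW || state == State.HIGH) && t - timer10Start > 10) {
            state = State.INIT; // window elapsed before N toggles were seen
            nCnt = 0;
        } else if (state == State.VERIFY && t - timer20Start >= 20) {
            state = State.INIT; // the 20 seconds passed without a violation
            nCnt = 0;
        }
    }

    void pumpLow(double t) {
        checkTimers(t);
        switch (state) {
            case INIT:
                state = State.LOW;
                timer10Start = t;
                break;
            case HIGH:
                state = State.LOW;
                break;
            case VERIFY:
                bSuccess = false;
                System.err.println("Assertion 22 failed!");
                break;
            default:
                break;
        }
    }

    void pumpHigh(double t) {
        checkTimers(t);
        if (state == State.LOW) {
            nCnt++; // one more Low-then-High toggle inside the window
            if (nCnt >= N) {
                state = State.VERIFY;
                timer20Start = t;
            } else {
                state = State.HIGH;
            }
        }
    }

    boolean isSuccess() { return bSuccess; }
}
```

Both the counter and the two timers are ordinary variables here, which is the point of the example: counting and real-time bounds are direct to express in an executable assertion.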
Example #1 (cont.)
Outside the scope of MC/TP. They do not support:
• Real-time constraints (10 sec, 20 sec…)
• Counting (N times…)
• In general, they support at most ω-regular properties.
Example #2 of Poor Specification Coverage
Customer cognitive requirement, NL (time-series):
Whenever the track count (cnt) Average Arrival Rate (ART) exceeds 80% of the MAX_COUNT_PER_MIN, cnt ART must be reduced back to 50% of the MAX_COUNT_PER_MIN within 2 minutes, and cnt ART must remain below 60% of the MAX_COUNT_PER_MIN for at least 10 minutes.
Statechart-assertion for RV and EMC
[Figure: timeline labeled “If ART>80%”, “Then ART>50%” (2 min), and “And ART>60%” (10 min)]
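One way to read the time-series requirement operationally is as a monitor over sampled ART values. The sketch below is an assumption-laden illustration, not the statechart-assertion from the slide: `ArtMonitor` and `sample` are hypothetical names, time is in minutes, ART is passed as a fraction of MAX_COUNT_PER_MIN, and the 10-minute interval is assumed to start when ART first drops back to 50%.

```java
/** Hand-written monitor for the ART requirement; t is in minutes, art is a fraction of the maximum. */
class ArtMonitor {
    private enum State { NORMAL, REDUCING, HOLDING }

    private State state = State.NORMAL;
    private double deadline; // end of the 2-minute reduction window
    private double holdEnd;  // end of the 10-minute "below 60%" interval
    private boolean bSuccess = true;

    private void fail() {
        bSuccess = false;
        state = State.NORMAL;
        System.err.println("ART assertion failed");
    }

    /** Feed one ART sample taken at time t. */
    void sample(double t, double art) {
        if (state == State.HOLDING && t >= holdEnd) {
            state = State.NORMAL; // hold interval completed without a violation
        }
        switch (state) {
            case NORMAL:
                if (art > 0.80) {
                    state = State.REDUCING;
                    deadline = t + 2;
                }
                break;
            case REDUCING:
                if (t > deadline) {
                    fail(); // ART was not back at 50% within 2 minutes
                } else if (art <= 0.50) {
                    state = State.HOLDING;
                    holdEnd = t + 10;
                }
                break;
            case HOLDING:
                if (art >= 0.60) {
                    fail(); // ART did not stay below 60%
                }
                break;
        }
    }

    boolean isSuccess() { return bSuccess; }
}
```

Writing the monitor forces decisions the natural-language text leaves open, such as when the 10 minutes begin. That is the kind of question validation is meant to surface.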
More about Specification Languages
LTL or Büchi Automata vs. Statechart-Assertions
LTL and Büchi automata have lower specification coverage and are more expensive to use. A partial list of reasons:
1. Theoretical: weak descriptive power (ω-regular at best).
2. Hard to use: the National Team can attest w.r.t. LTL.
3. Lack of support for most basic constraints (real-time).
4. Infinite-sequence semantics.
5. They are propositional (e.g., Always P Eventually Q), while real systems are both conditional (propositional) and event-driven (see UML standard).
Example of Poor Program Coverage
Program: InfusionPump.java
Can we verify the property in the context of the REAL code?
Validation using JUnit or MSC
Validation. The StateRover uses JUnit-based simulation for validation.
“No more than N (e.g., 2) Q events can follow a P event”
[Figure: statechart-assertion; recoverable labels follow]
/*Local Variables*/
static final int N=2;
int nCnt;
States:
  Init, on entry/ nCnt = 0;
  A
  Error, on entry/bSuccess = false; System.err.println("Assertion failed");
Other labels:
  T
  []
  P[]
  Q/nCnt++;
  nCnt>N, with [true] and [false] branches
  P[]/nCnt=0;
JUnit-based scenario:
assertion.P();
assertion.Q();
assertion.Q();
assertion.P();
assertion.Q();
assertTrue(assertion.isSuccess());
3 Q’s after 1’st P. Is that OK?
Depends on cognitive expectation.
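The question raised by the scenario above can be reproduced with a small stand-in for the assertion. This is a sketch under assumptions: `NoMoreThanNQAssertion` is a hand-written class following one plausible reading of the statechart (P resets the counter, Q increments it, and `nCnt > N` leads to Error), and a plain `main` replaces the JUnit test so the example has no dependencies.

```java
/** Hand-written stand-in for the "No more than N Q events can follow a P event" assertion. */
class NoMoreThanNQAssertion {
    static final int N = 2;

    private enum State { INIT, A, ERROR }

    private State state = State.INIT;
    private int nCnt = 0;
    private boolean bSuccess = true;

    void P() {
        if (state != State.ERROR) {
            state = State.A;
            nCnt = 0; // every P restarts the count
        }
    }

    void Q() {
        if (state != State.A) {
            return; // Q before any P, or after an error, is ignored
        }
        nCnt++;
        if (nCnt > N) {
            state = State.ERROR;
            bSuccess = false;
            System.err.println("Assertion failed");
        }
    }

    boolean isSuccess() { return bSuccess; }
}

class ValidationScenario {
    public static void main(String[] args) {
        NoMoreThanNQAssertion assertion = new NoMoreThanNQAssertion();
        assertion.P();
        assertion.Q();
        assertion.Q();
        assertion.P();
        assertion.Q();
        System.out.println(assertion.isSuccess()); // prints "true"
    }
}
```

The run succeeds because the second P resets the counter, even though three Q events follow the first P. Whether that matches the customer's intent is exactly what the validation step has to settle.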
Validation: What can go Wrong?
1. Assertion is incorrect (usually where blame is assigned).
2. Natural language is ambiguous.
3. NL was written for the main scenario and doesn’t work as well for other scenarios.
4. Validation scenario is not what we think it is…
Thank you
Blunt User Questions
Q1. A property says “light must be on for at least 5 seconds after door opens”. My program already implements that, why write a spec.-property for that?
A.
• Indeed, if everything we implemented was always correct the world would be a nice place…
• When the implementation changes, who is the “lobbyist” for this requirement? We need a separate representative for each requirement.
Blunt User Questions
Q2. Why not write a specification in Java (or in the language of the model)?
A. We write specs as statechart-assertions. The motivation for not writing in Java is the same motivation that applies to using a code generator in general.
“No more than N newCar events can follow a newTruck event”
[Figure: statechart-assertion; recoverable labels follow]
/*Local Variables*/
static final int N=3;
int nCnt;
States:
  Init, on entry/ nCnt = 0;
  A
  Error, on entry/bSuccess = false; System.err.println("Assertion failed");
Other labels:
  T
  []
  newTruck[]
  newCar/nCnt++;
  nCnt>N, with [true] and [false] branches
  newTruck[]/nCnt=0;
Blunt User Questions
Q3. What’s the difference between a model and a program?
A. Abstraction. Once the model has sufficient detail to be used as source code, it’s a program. That’s how StateRover statechart models/programs are used.
Blunt User Questions
Q4. Who says the spec. is correct?
A. Validation. The StateRover uses JUnit-based simulation for validation.
Comments of IV&V Director Dr. Caffal
1. Natural language requirements are typically vague, inconsistent, and incomplete.
2. Natural language requirements frequently have counter-examples to the expressed logic. The counter-examples are not easily observed by reading the requirement.
3. Unlike other disciplines, software developers oftentimes do not employ tools to describe behavior and elicit requirements.
4. Behavior specification comes in three flavors: what we want the system to do, what we do not want the system to do, and what we want the system to do under adverse conditions.
5. It is nearly impossible to detect missing requirements.
6. Natural language requirements typically express constraints and limitations and rarely express behavior. The Team was hard pressed to come up with behavioral requirements.
7. Without specifying behavior, developers implicitly allow programmers to define behaviors. As such, system behaviors emerge without design and structure. Thus, emergent behaviors of systems are frequently an unhappy surprise to developers.
Behavioral specifications about: What we want the system to do
Whenever stop command is received then vehicle should reach complete stop within 30 seconds
[Figure: statechart-assertion; recoverable labels follow]
/*Local Variables*/
TRTimeoutFireSimulatedTime timer = new TRTimeoutFireSimulatedTime(30, this);
States:
  Init
  Stop
  Error, on entry/bSuccess=false;System.err.println("Assertion for Req.213 failed");
Transition labels:
  []
  stopCommand[]
  timeoutFire[]
  Primary.getSpeed() < 1, with [true] and [false] branches
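The stop-command assertion above can be sketched in plain Java as follows. The class `StopWithin30Monitor` and its `tick` method are hypothetical: simulated time is passed in explicitly instead of using `TRTimeoutFireSimulatedTime`, and the `speed` argument stands in for `Primary.getSpeed()`. As in the statechart, the speed is checked once, when the 30-second timer fires.

```java
/** Hand-written approximation of the stop-command assertion; times are in seconds. */
class StopWithin30Monitor {
    private static final double LIMIT = 30.0;

    private boolean stopping = false; // true between stopCommand and the timeout
    private double deadline;
    private boolean bSuccess = true;

    void stopCommand(double t) {
        if (!stopping) {
            stopping = true;
            deadline = t + LIMIT;
        }
    }

    /** Report the simulated time and the current vehicle speed. */
    void tick(double t, double speed) {
        if (stopping && t >= deadline) {
            stopping = false;
            if (!(speed < 1)) {
                bSuccess = false;
                System.err.println("Assertion for Req.213 failed");
            }
        }
    }

    boolean isSuccess() { return bSuccess; }
}
```

A typical use is to call `tick` from the simulation loop after every time step, so that the check fires on the first step at or past the deadline.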
Behavioral specifications about: what we do not want the system to do (“negative behavior”)
Pump should never operate until at least two seconds after valve-shut.
[Figure: statechart-assertion; recoverable labels follow]
/*Local Variables*/
TRTimeoutFireSimulatedTime timer = new TRTimeoutFireSimulatedTime(2, this);
States:
  Init
  Count_2_sec, on entry/timer.restart();
  Error, on entry/bSuccess=false;System.err.println("Assertion for Req.213 failed");
Transition labels:
  []
  valveShut[]
  timeoutFire[]
  pumpStarted[]
This is where the end user says:
“I’ve already implemented this behavior the positive way, why do I need a negative behavior assertion?”
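A negative-behavior assertion of this kind is small when written out. The sketch below is illustrative only: `PumpAfterValveShutMonitor` is a hypothetical class, timestamps in seconds replace the simulated-time timer, and a repeated `valveShut` is assumed to restart the two-second count.

```java
/** Hand-written approximation of the valve-shut assertion; times are in seconds. */
class PumpAfterValveShutMonitor {
    private static final double DELAY = 2.0;

    private boolean counting = false; // true while the 2-second interval may still be running
    private double shutTime;
    private boolean bSuccess = true;

    void valveShut(double t) {
        counting = true;
        shutTime = t;
    }

    void pumpStarted(double t) {
        if (!counting) {
            return; // no valveShut pending, nothing to check
        }
        if (t - shutTime < DELAY) {
            bSuccess = false; // pump operated too soon after valve-shut
            System.err.println("Assertion for Req.213 failed");
        }
        counting = false;
    }

    boolean isSuccess() { return bSuccess; }
}
```

The monitor stays valid when the pump controller is rewritten, which is the answer to the end user's question: the implementation encodes the positive behavior, while the assertion keeps watching for the negative one.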
Behavioral specifications about: what the system will do under adverse conditions (recovery)
[Figure: primary statechart with an attached assertion statechart; recoverable labels follow]
Primary statechart labels:
  Red
  Camera
  Count Cars
  C_0 []
  Count (BREAK), on entry/nCnt = 1;
  On
  Off
  Shoot
  bTest(), with [true] and [false] branches
  increment
  newCar(Car obj)[isRolls(obj)]/nCnt = 4;
  CriticalRegion
  newCar[]
  newTruck[]
  start[]
  []
Assertion statechart labels:
  Init
  T
  Error (BREAK), on entry/bSuccess = false; System.err.println("Assertion failed");
  newCar[]/timeout.restart(); // restart timer newCar fires
  primaryEntered("Off")[]
  newTruck[]
  timeoutFire()[]
  []
Doing More for Validation
Under development: a tool that points out missing assertion simulation/validation scenarios
Thank You
backup
Behavioral specifications about: what we do not want the system to do (“negative behavior”)
As of cruiseSet speed should not change by more than 2% unless incline is more than 5% for more than 10 seconds.
[Figure: speed V over time, stable and then unstable, with 5 sec intervals marked]
Ambiguous:
a. Incline after speed instability
b. Incline during speed instability
[Figure: statechart-assertion; recoverable labels follow]
/*Local Variables*/
TRTimeoutFireSimulatedTime timer_1sec = new TRTimeoutFireSimulatedTime(1, this);
int speedAtCruiseSetTime = 0;
int speedNow;
int inclineDuration = 0;
States:
  Init
  DoorClosed, on entry/timer_1sec.restart();
  checkIncline, on entry/timer_1sec.restart();
  Error, on entry/bSuccess=false;System.err.println("Assertion for Req.213 failed");
  (unnamed), on entry/speedNow= primary.getSpeed();if (primary.getIncline() > 5) inclineDuration++;else inclineDuration = 0;
Transition labels:
  []
  cruiseSet[]/speedAtCruiseSetTime =primary.getSpeed();
  timeoutFire[]
  cruiseOff[]
Conditions, each with [true] and [false] branches:
  abs(speedNow-speedAtCruiseSetTime)<0.02
  primary.getIncline() > 5, [true]/inclineDuration++;
  inclineDuration == 0
  inclineDuration > 10
Behavioral specifications about: what we do not want the system to do (“negative behavior”)
Negative statement: As of cruiseSet speed should not change by more than 2% unless incline is more than 5% for more than 10 seconds.
Positive statement: As of cruiseSet speed should be 98% stable unless incline is more than 5% for more than 10 seconds.
The key about negative behavior is not the way it’s phrased. It’s the fact that a system is built to do the positive, so it is assumed the negative is