CSE 2102

Chapter 6: Software Verification

Prof. Steven A. Demurjian
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Road, Box U-2155
Storrs, CT 06269-2155

[email protected]
http://www.engr.uconn.edu/~steve
(860) 486 – 4818
(860) 486 – 3719 (office)

Overview of Chapter 6
- Motivation: Goals and Requirements of Verification
- Approaches to Verification
- Testing
  - Goals
  - Theoretical Foundations
  - Empirical Testing Principles
  - Testing in the Small/Large
  - Separation of Concerns and Testing
  - Concurrent and Real-Time Systems
  - Object-Oriented Systems
- Informal Analysis Techniques
- Debugging/Role of Source Code Control
- Verifying Software Properties

Motivation: Goals and Requirements
- What Kind of Assurance do we get through Testing?
  - Information Assurance (Info Used as Expected)
  - Security Assurance (Info not Misused)
- What Happens in Other Engineering Fields?
  - Civil Engineering: Design and Build a Bridge
    - Requirements (Mathematical Formulas)
    - Modeling (Wind Tunnels + Prototypes)
    - Practical (Tensile Strength of Steel, Weight Bearing of Concrete/Cement)
  - When the Bridge is Built and Loaded with (Worst Case) Semis Filled with Cargo in both Directions, it Must Not Fail
- Verify Product (Bridge) and Process (Construction)
- Reality: All Parties in the Process are Fallible!!

Motivation
- Verification in Computing is Typically Accomplished by Running Test Cases
  - Not All Possible Executions of Software are Tested
  - Evaluation of Documentation, User Friendliness, and Other Software Characteristics is Often Ignored
  - Verification of Running Deployed Code is Difficult (If Not Impossible)
- State of CT Insurance Dept. Project
  - Most Divisions Do Alpha/Beta Testing
  - One Division Wants to Just Jump Right to the System without any Testing
  - Their Current System is Full of Holes and Allows Incorrect/Inconsistent Data to be Entered
  - Is this Reasonable in Some Situations?

Motivation
- Verification Process Itself Must be Verified
  - This Means that the Software Process Itself Must be Verified
- Consider CMU’s Software Engineering Institute (www.sei.cmu.edu/)
  - Capability Maturity Model Integration (CMMI)
  - Many Companies Strive to Attain a Certain SEI Level in their Software Management/Development
- In Addition, Recall Software Qualities
  - Correctness, Performance, Robustness, Portability, Visibility, etc.
  - How are These Verified? Empirically? Qualitatively?
  - Consider Portability: For a Web App, How do you Make Sure it Works for all Browser/OS Platforms?

Motivation
- Results of Verification are Likely NOT Binary
  - Don’t get a 0 or 1 Result: Often Must Assess the Result
  - Errors in Large Systems are Unavoidable
  - Some Errors are Left and are “Tolerable”
  - Others are Critical and Must be Repaired
- Correctness is a Relative Term
  - Consider the Application, Market, Cost, Impact, etc.
- Verification is Subjective and Objective
  - Subjective: Reusable, Portable, etc.
  - Objective:
    - Correctness (Perform Tests)
    - Performance (Response Time, Resource Usage, etc.)
    - Portable (Can you try all Compiler/OS Combos?)
    - Mobile: Does it Work on Every Device?

Approaches to Verification
- Testing: Experimenting with Product Behavior
  - Explore the Dynamic Behavior
  - Execute SW Under Different Conditions
  - Seek Counter-examples/Potential Failure Cases
  - Detail Scenarios of Usage/Test Scenarios
  - Involve the Customer, Who Knows the Domain, Business Logic, etc., to Formulate Test Cases
- Analysis: Examining Product w.r.t. Design, Implementation, Testing Process, etc.
  - Deduce Correct SW Operation as a Logical Consequence of Design Decisions, Input from Customer, etc.
  - Static Technique, but Impossible to Verify if SW Engineers Correctly and Precisely Translated Design to Working Error-Free Code

Testing
- Brief Motivation: Content and Techniques
- Four Goals of Testing
- Theoretical Foundations
  - Formalizing Program Behavior
  - Testing as Input that Produces Output
- Empirical Testing Principles
  - How Does Programming Language Influence Testing
- Testing in the Small/Large
- Separation of Concerns and Testing
- Concurrent and Real-Time Systems
- Object-Oriented Systems
- Users and Domain Specialists: Role in Testing
  - Case Study of CT Insurance Dept. Project

Motivating Testing
- Testing Can Never Consider All Possible Operating Conditions
- Approaches Focus on Identifying Test Cases and Scenarios of Access for Likely Behavior
  - If Bridge OK at 1 Ton, OK < 1 Ton
  - What is an Analogy in Software?
    - If a System Works with 100,000 Data Items, it may be Reasonable to Assume it Works for < 100,000 Items
- Problems with Software?
  - Hard to Identify Other Scenarios that Completely Cover All Possible Interactions and Behavior
  - Software Doesn’t have Continuity of Behavior
  - Software can Exhibit Correct Behavior in Infinitely Many Cases, but still be Incorrect in some Cases
  - Ex: C and bitwise or in If Statement Story

Motivating Testing
- What’s a Realistic Example?

procedure binary-search (key: in element;
                         table: in elementTable;
                         found: out Boolean) is
begin
    bottom := table'first; top := table'last;
    while bottom < top loop
        if (bottom + top) rem 2 ≠ 0 then
            middle := (bottom + top - 1) / 2;
        else
            middle := (bottom + top) / 2;
        end if;
        if key ≤ table (middle) then
            top := middle;
        else
            bottom := middle + 1;
        end if;
    end loop;
    found := key = table (top);
end binary-search

- If we omit the first else branch (middle := (bottom + top) / 2), the routine still works whenever that else is never hit (i.e., if the size of the table is a power of 2)!
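The power-of-2 remark can be checked with a small sketch. It is a Python transcription of the routine above that keeps the 1-based bounds and counts how often the else branch of the parity test runs; the counter `else_hits` and the tuple return value are additions made only for this illustration.

```python
def binary_search(key, table):
    """Return (found, else_hits) for a sorted list, using 1-based bounds as on the slide."""
    if not table:
        return False, 0
    bottom, top = 1, len(table)
    else_hits = 0  # illustrative counter, not part of the original routine
    while bottom < top:
        if (bottom + top) % 2 != 0:
            middle = (bottom + top - 1) // 2
        else:
            else_hits += 1  # the branch the slide proposes to omit
            middle = (bottom + top) // 2
        if key <= table[middle - 1]:
            top = middle
        else:
            bottom = middle + 1
    return key == table[top - 1], else_hits


if __name__ == "__main__":
    for size in (3, 4, 8):
        table = list(range(0, 10 * size, 10))
        hits = sum(binary_search(k, table)[1] for k in table)
        print(size, hits)
```

For table sizes that are powers of 2 the counter stays at zero for every key, so a test suite restricted to such tables would never notice a missing else branch.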

Four Goals of Testing
- Dijkstra: “Program testing can be used to show the presence of bugs, but never to show their absence.”
  - “Notes on Structured Programming,” 1970
  - www.cs.utexas.edu/users/EWD/ewd02xx/EWD249.PDF
  - Classic Article Still True Today
  - Simply Point to any Major Software Release from OS to Gameboy Games
- Pure Testing Cannot Absolutely Prove Correctness
- Testing Must be Used in Conjunction with Other Techniques
- Need Sound and Systematic Principles

Four Goals of Testing
- Goal 1: Testing Must be Based on Sound and Systematic Techniques
  - Test Different Execution Paths
  - Provides Understanding of Product Reliability
  - May Require the Insertion of Testing Code (e.g., Timing, Conditional Compilation, etc.)
- Goal 2: Testing Must Help Locate Errors
  - Test Cases Must be Organized to Assist in the Isolation of Errors
  - This Facilitates Debugging
  - This Requires a Well Thought Out Testing Paradigm as Software is Developed

What is Conditional Compilation?
- Use the Preprocessor Directives #define and #ifdef
- If debugflag is Defined, the Prints are in the Executable
- Prints in Every Procedure/Function to Track Flow and Show Where an Error Occurs (Enter w/o Exit)

#include <stdio.h>

#define debugflag 1

int main(void)
{
#ifdef debugflag
    /* compiled in only when debugflag is defined */
    printf("Entering main code\n"); fflush(stdout);
#endif

    /* CODE FOR MAIN */

#ifdef debugflag
    printf("Exiting main code\n"); fflush(stdout);
#endif
    return 0;
}

Four Goals of Testing
- Goal 3: Testing Must be Repeatable
  - Same Input Produces the Same Output
  - Execution Environment Influences Repeatability
    - if x=0 then return true else return false;
    - If x is not Initialized, Behavior Can’t be Predicted
    - In C, the Memory Location is Accessed for a Value (Which Could have Data from a Prior Execution)
- Goal 4: Testing Must be Accurate
  - Depends on the Precision of SW Specification
  - Test Cases Created from Scenarios of Usage
  - Mathematical Formulas for Expressing SW can Greatly Assist in Testing Process
  - Remember Logic Specs in Chapter 5…

Theoretical Foundations of Testing
- Let P be a Program with Domain D and Range R
  - P: D → R (may be partial)
  - P is a Function
- Let OR Denote Output Requirements as Stated in P’s Specification
  - Correctness Defined by: OR ⊆ D × R
  - Let d be an Element in D
  - P(d) correct if <d, P(d)> ∈ OR
  - P correct if all P(d) are correct
- Failure: For some d, P(d) does not Satisfy OR, which Indicates an Error or Defect (P returns Wrong Result)
- Fault: Incorrect Intermediate State of Program Execution (P Fails/Crashes)

Theoretical Foundations of Testing
- Failure: For some d, P(d) does not Satisfy OR, which Indicates an Error or Defect. Possibilities Include:
  - Undefined Result (Error State)
  - Wrong/Incorrect Result
- Error: There is a Defect that Causes a Failure
  - Typing Mistake (x typed instead of y)
  - Programmer Forgot a Condition (x=0)
- Fault: Incorrect Intermediate State of Program Execution

Theoretical Foundations of Testing
- Test Cases
  - Key Issue is to Identify the Individual Test Cases and the Set of Tests that are in Domain D
  - Tests can be Designed to both:
    - Test Successful Outcomes (Expected Positives)
    - Test Incorrect Outcomes (Expected Failures for d in D)
- Test Case t is an Element of D
- Test Set T is a Finite Subset of D
- Test t is Successful if P(t) is Correct
- Test Set T is Successful if P is Correct for all t in T
- Test Set T is Ideal if, whenever P is Incorrect, there Exists d in T such that P(d) is Incorrect

Theoretical Foundations of Testing
- Test Criterion C Defines Finite Subsets of D: C ⊆ 2^D
- Test Set T Satisfies C if it is an Element of C, e.g.:

  C = {<x1, x2, ..., xn> | n ≥ 3 ∧ ∃ i, j, k (xi < 0 ∧ xj = 0 ∧ xk > 0)}

- Two Test Sets that Satisfy C:
  - <-5, 0, 22>
  - <-10, 2, 8, 33, 0, -19>
- <1, 3, 99> Does not Satisfy C. Why?
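A criterion of this kind is easy to check mechanically. The sketch below assumes the criterion reads "at least three values, with at least one negative, one zero, and one positive"; the function name `satisfies_c` is made up for the illustration.

```python
def satisfies_c(test_set):
    """True if the tuple has >= 3 values including a negative, a zero, and a positive one."""
    return (
        len(test_set) >= 3
        and any(x < 0 for x in test_set)
        and any(x == 0 for x in test_set)
        and any(x > 0 for x in test_set)
    )


if __name__ == "__main__":
    for candidate in [(-5, 0, 22), (-10, 2, 8, 33, 0, -19), (1, 3, 99)]:
        print(candidate, satisfies_c(candidate))
```

The third candidate is rejected because it contains neither a negative value nor a zero.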

Theoretical Foundations of Testing
- Properties of Criteria
  - C is Consistent
    - For any Pair T1, T2 Satisfying C, T1 is Successful iff T2 is Successful
    - Either of them Provides the “same” Information
  - C is Complete
    - If P is Incorrect, there is a Test Set T of C that is not Successful
  - C is Complete and Consistent
    - Identifies an Ideal Test Set
    - Allows Correctness to be Proved!
    - Very Difficult to Achieve in Practice for “Reasonably” Sized Complex Applications
    - May be Required for Some Domains

Theoretical Foundations of Testing
- What are Potential Problems/Roadblocks?
  - Impossible to Derive an Algorithm that States Whether a Program, Test Set, or Criterion Satisfies the Prior Definitions
  - Impossible to Determine if d is in a Test Set T
  - Impossible to Determine an Ideal Test Set
  - Not Decidable (CSE3500) Whether a Test Set Satisfies Criteria or Not
- As a Result, Full Mechanization (Automation) of Testing is Not Possible
- Instead, Testing Requires Common Sense, Ingenuity, Domain Knowledge (User Community), and Most Critically: A Methodological Approach!

Empirical Testing Principles
- Leverage Theoretical Concepts for a Systematic Methodological Approach to Empirical Testing
- Only Exhaustive Testing can Prove Correctness with Absolute Certainty (Usually Infeasible)
- Overall Goal: Run a “Sufficient” Number of Tests that have the Potential to Uncover Errors

if X > Y then
    max := X;
else
    max := Y;
endif;

/* Test Set Below Detects the Error */
{<x = 3, y = 2>, <x = 2, y = 3>}
/* Test Set Below Does Not */
{<x = 3, y = 2>, <x = 4, y = 3>, <x = 5, y = 1>}
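The slide does not say which error the two test sets refer to. The sketch below assumes, purely for illustration, that the else branch was mistyped to assign X instead of Y. Under that assumption it shows why the first test set exposes the error and the second does not.

```python
def faulty_max(x, y):
    """Hypothetical faulty version: the else branch wrongly returns x (assumed bug)."""
    if x > y:
        return x
    else:
        return x  # assumed typo; the correct code would return y


def is_successful(test_set):
    """A test set is successful if the program is correct on every test case in it."""
    return all(faulty_max(x, y) == max(x, y) for x, y in test_set)


detecting = [(3, 2), (2, 3)]
not_detecting = [(3, 2), (4, 3), (5, 1)]

if __name__ == "__main__":
    print(is_successful(detecting), is_successful(not_detecting))
```

Every case in the second set has x > y, so the else branch never runs and the fault stays hidden.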

Empirical Testing Principles
- Testing Criteria are Needed in Practice to Define Significant Test Cases
  - Group Domain Elements by Expected Behavior
  - Representative Test Case from Each Group
- Complete Coverage Principle: Choose Test Cases so that the Union of all Test Sets Covers D

n is the input value:
    n < 0            print error message
    0 ≤ n < 20       print n!
    20 ≤ n ≤ 200     print n! in FP format
    n > 200          print error message

Complete Coverage Principle
- Try to Group Elements of D into Subdomains D1, D2, …, Dn where any Element of each Di is Likely to have Similar Behavior
  - D = D1 ∪ D2 ∪ … ∪ Dn
- Select one Test as a Representative of the Subdomain
  - If Dj ∩ Dk = ∅ for all j ≠ k (Partition), any Element can be Chosen from each Subdomain
  - Otherwise Choose Representatives to Minimize the Number of Tests, yet Fulfilling the Principle
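The grouping step can be sketched in code, assuming the four factorial subdomains from the earlier example (n < 0, 0 ≤ n < 20, 20 ≤ n ≤ 200, n > 200). The subdomain names and helper functions are invented for this illustration.

```python
# Subdomains of the factorial example, written as membership predicates.
SUBDOMAINS = {
    "negative": lambda n: n < 0,
    "exact": lambda n: 0 <= n < 20,
    "floating point": lambda n: 20 <= n <= 200,
    "too large": lambda n: n > 200,
}


def matching_subdomains(n):
    """Names of all subdomains that contain n."""
    return [name for name, contains in SUBDOMAINS.items() if contains(n)]


def is_partition(sample):
    """On the sample, every value lies in exactly one subdomain (disjoint and covering)."""
    return all(len(matching_subdomains(n)) == 1 for n in sample)


def representatives(sample):
    """Pick the first value seen in each subdomain as its representative test case."""
    chosen = {}
    for n in sample:
        for name in matching_subdomains(n):
            chosen.setdefault(name, n)
    return chosen


if __name__ == "__main__":
    sample = range(-50, 301)
    print(is_partition(sample), representatives(sample))
```

Because the subdomains form a partition here, any element of each would do as its representative; the sketch simply takes the first one it meets.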

Empirical Testing Focus
- Testing in the Small: Test Individual “Small” Pieces
  - Testing a “Small” Module, a Class, or Methods of a Class
- Testing in the Large: Test Larger Scale Modules (Collections of Pieces)
  - Testing an Inheritance Hierarchy or Set of Related Classes or a “Java Bean” or a Component
- Both are Achieved via:
  - BLACK BOX (Functional) Testing
    - Partition based on the Module’s Specification
    - Tests what the program is supposed to do
  - WHITE BOX (Structural) Testing
    - Partition based on Module’s Internal Code
    - Tests what the program does

Testing in the Small
- WHITE BOX (Structural) Testing
  - Testing Software Using Information about Internal Structure; May Ignore the Specification
  - Tests what the Software Actually Does
  - Performed Incrementally During Creation of Code
  - Relies on Statements, Structure of Code Itself
- BLACK BOX (Functional) Testing
  - Testing without Relying on the Way the Code was Designed and Coded: User/Domain Testing
  - Evaluated Against the Specification
  - Tests what the Software is Supposed to Do
  - Tests by Software Engineers and Domain Users
  - May be Performed as Part of Verification

Testing in the Small – White Box Testing
- Focus on Structural Coverage: Testing of the Code Itself w.r.t. Various Statements and Execution Paths
- Consider the Code:

begin
    read (x); read (y);
    while x ≠ y loop
        if x > y then
            x := x - y;
        else
            y := y - x;
        end if;
    end loop;
    gcd := x;
end;

- Testing Must Consider
  - While Loop
  - Conditional
  - Each May Require Different Test Cases
- We’ll Consider Control Flow Coverage Criteria
  - Statement coverage
  - Edge coverage
  - Condition coverage
  - Path coverage

Statement Coverage Criterion
- Test Software to Execute All Statements at Least Once
- Formally:
  - Select a test set T s.t. every Elementary Stmt. in P is Executed at least once by some d in T
- Objective: Try to Minimize the Number of Test Cases while still Preserving the Desired Coverage

read (x); read (y);
if x > 0 then
    write ("1");
else
    write ("2");
end if;
if y > 0 then
    write ("3");
else
    write ("4");
end if;

{<x = 2, y = 3>, <x = -13, y = 51>, <x = 97, y = 17>, <x = -1, y = -1>} covers all statements

{<x = -13, y = 51>, <x = 2, y = -3>} is minimal
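Statement coverage of this fragment can be measured with a small sketch. It assumes the four write statements are the statements of interest and records which of them each test case executes; the function names are illustrative.

```python
ALL_WRITES = {"1", "2", "3", "4"}


def run_fragment(x, y):
    """Execute the fragment and return the set of write statements that ran."""
    executed = set()
    if x > 0:
        executed.add("1")
    else:
        executed.add("2")
    if y > 0:
        executed.add("3")
    else:
        executed.add("4")
    return executed


def covers_all_statements(test_set):
    """True if the union of executed writes over the test set is all four writes."""
    covered = set()
    for x, y in test_set:
        covered |= run_fragment(x, y)
    return covered == ALL_WRITES


if __name__ == "__main__":
    print(covers_all_statements([(2, 3), (-13, 51), (97, 17), (-1, -1)]))
    print(covers_all_statements([(-13, 51), (2, -3)]))
```

No single test case can reach all four writes, since each run executes exactly one branch of each conditional, which is why two cases is the minimum.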

Problem with Statement Coverage
- A Particular Test Case, While Covering All Statements, May not Fully Test the Software

if x < 0 then
    x := -x;
end if;
z := x;

- {x = -3} covers all statements but does not exercise the case when x is not negative and the then branch is not entered
- Solution: Rewrite the Code as:

if x < 0 then
    x := -x;
else
    null;
end if;
z := x;

- Coverage now requires you to test both x < 0 and x >= 0 for completeness.

Edge Coverage Criterion
- Informally:
  - Consider Each Program as a Control Flow Graph that Represents the Overall Program Structure
    - Edges Represent Statements
    - Nodes at the Ends of an Edge Represent Entry into the Statement and Exit
  - Intent: Examine Various Execution Paths to Make Sure that Every Edge is Visited at Least Once
- Formally:
  - Select Test Set T such that Every Edge (Branch) of the Control Flow is Exercised at least once by some d in T
- Overall: Edge Coverage is Finer Grained than Statement Coverage

Edge Coverage Criterion
- Control Flow Graphs for the Basic Constructs (figure not reproduced; G1 and G2 denote subgraphs):
  - I/O, assignment, or procedure call
  - if-then-else
  - if-then
  - while loop
  - two sequential statements

Simplification Possible
- A Sequence of Edges can be Collapsed into just One Edge
- (Figure not reproduced: the chain of nodes n1, n2, n3, ..., nk-1, nk is collapsed into a single edge from n1 to nk)

Example: Euclid’s Algorithm
- Code and its Corresponding Control Flow Graph (graph figure not reproduced)

begin
    read (x); read (y);
    while x ≠ y loop
        if x > y then
            x := x - y;
        else
            y := y - x;
        end if;
    end loop;
    gcd := x;
end;
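To make edge coverage concrete for Euclid's algorithm, the sketch below instruments a Python transcription so that it reports which branches a run takes. The edge labels are invented names for the four decision outcomes (loop entered, loop exited, then branch, else branch), and positive integer inputs are assumed.

```python
ALL_EDGES = {"loop-enter", "loop-exit", "then", "else"}


def gcd_with_edges(x, y):
    """Subtraction-based gcd for positive integers; also returns the branches taken."""
    edges = set()
    while True:
        if x != y:
            edges.add("loop-enter")
        else:
            edges.add("loop-exit")
            break
        if x > y:
            edges.add("then")
            x = x - y
        else:
            edges.add("else")
            y = y - x
    return x, edges


if __name__ == "__main__":
    print(gcd_with_edges(4, 6))
    print(gcd_with_edges(5, 5))
```

A single input such as (4, 6) already exercises every branch, while (5, 5) only takes the loop exit.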

Problem with Edge Coverage

found := false; counter := 1;
while (not found) and counter < number_of_items loop
    if table (counter) = desired_element then
        found := true;
    end if;
    counter := counter + 1;
end loop;
if found then
    write ("the desired element is in the table");
else
    write ("the desired element is not in the table");
end if;

- Test Cases that Satisfy Edge Coverage:
  (1) empty table
  (2) table with 3 items, second of which is the item to find
- DOES NOT DISCOVER ERROR OF (< instead of ≤)
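A Python transcription of the search, kept 1-based like the slide, shows the two edge-covering tests passing while the off-by-one error survives. The function name and the list-based table are assumptions made for the sketch.

```python
def search(table, desired_element):
    """Transcription of the slide's loop, including its '<' instead of '<=' error."""
    found = False
    counter = 1
    number_of_items = len(table)
    while (not found) and counter < number_of_items:  # bug: should be <=
        if table[counter - 1] == desired_element:
            found = True
        counter = counter + 1
    return found


if __name__ == "__main__":
    print(search([], 7))         # edge-covering test 1: passes
    print(search([1, 7, 3], 7))  # edge-covering test 2: passes
    print(search([1, 3, 7], 7))  # last position: the error shows up
```

The fault only appears when the desired element sits in the last position, a case that edge coverage does not force the tester to try.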

Condition Coverage Criterion
- Informally:
  - Utilize the Control Flow Graph Expanded with Testing of Boolean Expressions in Conditionals
  - Intent: Expand Execution Paths with Values
- Formally:
  - Select a test set T s.t. every edge of P’s Control Flow is traversed (Edge Coverage) and
  - All Possible Values of the Constituents of Compound Conditions are Exercised at Least Once
- Overall: Condition Coverage is Finer Grained than Edge Coverage

Condition Coverage
- Expand with Test Cases Related to found, counter < number_of_items, etc.
  (1) counter less than number_of_items, equal to, greater than
  (2) if equality satisfied or not
  (3) etc.

Problem with Condition Coverage

if x ≠ 0 then
    y := 5;
else
    z := z - x;
end if;
if z > 1 then
    z := z / x;
else
    z := 0;
end if;

- {<x = 0, z = 1>, <x = 1, z = 3>} causes the execution of all edges for each condition, but fails to expose the risk of a division by zero

Path Coverage Criterion
- Informally:
  - Utilize the Control Flow Graph Expanded with Testing of Boolean Expressions in Conditionals
  - Expanded to Include All Possible “Paths” of Execution through Control Flow Graph
  - Don’t Just Cover Every Edge, but Explore all Alternate Paths from Start to Finish
- Formally:
  - Select a Test Set T which Traverses all Paths from the Initial to the Final Node of P’s Control Flow
- Overall: Path Coverage is Finer than All Others so far… But:
  - Number of Possibilities is Prohibitively Large
  - Impossible to Check All Possibilities: Exponential

Example of Path Coverage

if x ≠ 0 then
    y := 5;
else
    z := z - x;
end if;
if z > 1 then
    z := z / x;
else
    z := 0;
end if;

- {<x = 0, z = 1>, <x = 1, z = 3>} Covers Edges but Not All Paths
- {<x = 0, z = 3>, <x = 1, z = 1>} Covers the Two Remaining Paths; Together the Four Cases Test all Execution Paths
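The difference between the two test sets can be reproduced directly. The sketch below transcribes the fragment into Python, where dividing by zero raises `ZeroDivisionError`; the helper `run_safely` is an illustrative addition.

```python
def fragment(x, z):
    """Python transcription of the two-conditional fragment; returns the final z."""
    if x != 0:
        y = 5  # kept to mirror the original; y is not used afterwards
    else:
        z = z - x
    if z > 1:
        z = z / x
    else:
        z = 0
    return z


def run_safely(x, z):
    """Return the result, or the string 'division by zero' if that path fails."""
    try:
        return fragment(x, z)
    except ZeroDivisionError:
        return "division by zero"


if __name__ == "__main__":
    for case in [(0, 1), (1, 3), (0, 3), (1, 1)]:
        print(case, run_safely(*case))
```

Only the path that takes the first else and then the second then (x = 0, z = 3) triggers the failure, so a test set that merely covers every edge can miss it.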

What is a Strategy for a Search of a Table?
- Skip the Loop: number_of_items = 0
- Execute the Loop once or twice (find element early)
- Execute the Loop to search the entire table



Guidelines for White-Box Testing
- Testing Loops
  - Execute Loop Zero Times
  - Execute Loop Maximum Number of Times
  - Execute Loop Average Number of Times
  - Think about Individual Loops, How they Work (test condition at top or bottom), and How they are Used
- Testing Conditionals (If and Case Statements)
  - Always Cover all Edges
  - Expand to Test Critical Paths of Interest
- Other Considerations
  - Choose Criterion and Then Select Input Values
  - Select Criterion Based on The Code Itself
  - Different Criteria may be Applied to Different Portions of Code

Summary: Problems with White-Box Testing
- Syntactically Indicated Behaviors (Statements, Edges, Etc.) are Often Impossible
  - Unreachable Code, Infeasible Edges, Paths, Etc.
  - An Unreachable Statement Means 100% Coverage is Never Attained!
- Adequacy Criteria May be Impossible to Satisfy
  - Manual Justification for Omitting Each Impossible Test Case
  - Adequacy “Scores” Based on Coverage
  - Example: 95% Statement Coverage
- Other Possibilities:
  - What if Code Omits Implementation of Some Part of the Specification?
  - White Box Test Cases Derived from the Code Will Ignore that Part of the Specification!

Module and Unit Testing Tools
- Myriad of Products Available
  - JUnit: junit.sourceforge.net/
  - Objective-C OCUnit: cocoadev.com/wiki/OCUnit
  - Check for C: check.sourceforge.net/
  - C/C++: Google Test
  - MS Test for Visual Studio
- Mobile Platforms
  - Titanium Recommends jsunity (JavaScript)
  - Sencha Touch (Selenium, CasperJS, Siesta)

Testing in the Small – Black Box Testing
- Treat Class, Procedure, Function, as a Black Box
  - Given “What” the Box is Supposed to Do
  - Understand its Inputs and Expected Outputs
  - Execute Tests and Assess Results
- Formulate Test Cases Based on What the Program is Supposed to Do without Knowing
  - Programming Paradigm (OO, Functional, etc.)
  - Code Structure (Modularity, Inheritance, etc.)
- (Figure not reproduced: Inputs enter the Class/Procedure/Function box, which produces the Expected Outputs)

Consider Sample Specification
- The program receives as input a record describing an invoice. (A detailed description of the format of the record is given.)
- The invoice must be inserted into a file of invoices that is sorted by date.
- The invoice must be inserted in the appropriate position.
- If other invoices exist in the file with the same date, then the invoice should be inserted after the last one.
- Also, some consistency checks must be performed.
- The program should verify whether the customer is already in a corresponding file of customers, whether the customer’s data in the two files match, etc.

What are the Potential Test Cases?
- An Invoice Whose Date is the Current Date
- An Invoice Whose Date is Before the Current Date (This Might Even Be Forbidden By Law)
  - Possible Sub-test Cases
    - An Invoice Whose Date is the Same as Some Existing Invoice
    - An Invoice Whose Date Does Not Exist in Any Previously Recorded Invoice
- Several Incorrect Invoices, Checking Different Types of Inconsistencies

Test Scenarios and Cases
- Participants in Testing
  - User/Domain Specialists to Formulate Test Cases
  - Software Engineers Involved in Specification and Design
  - Software Developers
  - Software Testers
- Sample Testing: State of CT Insurance Department Project
  - Constant Renewal of Agents and Agencies
  - Renewal Scenarios to Process “Batches”
    - Single vs. Multiple Renewals
    - Scan Slip of Paper (1/3 sheet with Bar Code) + Check
  - Develop Scenarios, Testing Procedures, and Cases
  - See Course Web Page…

Four Types of Black Box Testing
- Testing Driven by Logic Specifications
  - Utilizes Pre and Post Conditions
- Syntax-Driven Testing
  - Assumes Presence of Underlying Grammar that Describes What’s in the Box
  - Focus on Testing Based on Grammar
- Decision Table Based Testing
  - Input/Output or Input/Action Combinations Known in Advance: Outcome Based
- Cause-Effect Graph Based Testing
  - If X and Y and … then A and B and …
  - Advance Knowledge of Expected Behavior in Combination

Logic-Specification Based Testing
- Consider Logic Specification of Inserting Invoice Record into a File
- As Written, Difficult to Discern What to Test

for all x in Invoices, f in Invoice_Files
{sorted_by_date(f) and not exist j, k (j ≠ k and f(j) = f(k))}

insert(x, f)

{sorted_by_date(f) and
 for all k (old_f(k) = z implies exists j (f(j) = z)) and
 for all k (f(k) = z and z ≠ x) implies exists j (old_f(j) = z) and
 exists j (f(j).date = x.date and f(j) ≠ x) implies j < pos(x, f) and
 result ≡ x.customer belongs_to customer_file and
 warning ≡ (x belongs_to old_f or x.date < current_date or ....)}

Logic-Specification Based Testing
- Apply Coverage Criterion to Post Condition
- Rewrite as Below: Easier to Formulate Tests

TRUE implies
    sorted_by_date(f) and
    for all k old_f(k) = z implies exists j (f(j) = z) and
    for all k (f(k) = z and z ≠ x) implies exists j (old_f(j) = z)
and
(x.customer belongs_to customer_file) implies result
and
not (x.customer belongs_to customer_file and ...) implies not result
and
x belongs_to old_f implies warning
and
x.date < current_date implies warning
and
....
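One way to turn part of such a postcondition into an executable check is sketched below. It models the invoice file as a Python list of `(date, id)` tuples with ISO date strings, covers only the sortedness, preservation, and same-date-ordering clauses (not `result` or `warning`), and uses `bisect_right` as a stand-in implementation of `insert`. All names are assumptions for the illustration.

```python
import bisect


def insert(x, f):
    """Return a new file with invoice x placed after all invoices of the same date."""
    dates = [invoice[0] for invoice in f]
    pos = bisect.bisect_right(dates, x[0])
    return f[:pos] + [x] + f[pos:]


def postcondition_holds(old_f, f, x):
    """Check the clauses about ordering and content of the file after insert(x, old_f)."""
    sorted_by_date = all(a[0] <= b[0] for a, b in zip(f, f[1:]))
    keeps_old = all(z in f for z in old_f)
    adds_only_x = all(z == x or z in old_f for z in f)
    pos = f.index(x)
    same_date_before = all(
        j < pos for j, z in enumerate(f) if z[0] == x[0] and z != x
    )
    return sorted_by_date and keeps_old and adds_only_x and same_date_before


if __name__ == "__main__":
    old = [("2024-01-01", "A"), ("2024-01-05", "B"), ("2024-02-01", "C")]
    new_invoice = ("2024-01-05", "D")
    print(postcondition_holds(old, insert(new_invoice, old), new_invoice))
```

Each clause of the rewritten postcondition becomes one boolean, which suggests one group of test cases per clause.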

Syntax-Driven Testing
- Applicable to Software Whose Input is Formally Described by a Grammar
  - Compiler, ATM Machine, etc.
  - Recall State Machines: Know Allowable Combinations in Advance
- Requires a Complete Formal Specification of Language Syntax
- Specification is Utilized to Generate Test Sets
- Consider ATM Machine with Formal Steps:

<validate pin> ::= <scan card> <enter pin>
<withdraw> ::= <enter amt> <check balance> <dispense>
             | <enter amt> <check balance> <deny>

Syntax-Driven Testing
- Consider the Following Expression Interpreter

<expression> ::= <expression> + <term> | <expression> - <term> | <term>
<term> ::= <term> * <factor> | <term> / <factor> | <factor>
<factor> ::= ident | ( <expression> )

- What can be Implied?

<expression> ⇒ <expression> + <term>
<expression> ⇒ <expression> + <term> * <factor>

- Minimal Test Set:
  - Programs that Execute All Statements of the Language with Minimal or No Repetition
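Whether a candidate test input exercises every production of the expression grammar can be checked with a small recognizer. The sketch below is one possible approach: a recursive-descent parser that handles the left-recursive rules with loops and records each production it uses. The production labels and function names are invented, and characters outside the grammar are silently skipped by the tokenizer.

```python
import re

ALL_PRODUCTIONS = {
    "E -> E + T", "E -> E - T", "E -> T",
    "T -> T * F", "T -> T / F", "T -> F",
    "F -> ident", "F -> ( E )",
}


def covered_productions(text):
    """Parse text and return the set of grammar productions used in its derivation."""
    tokens = re.findall(r"[A-Za-z_]\w*|[-+*/()]", text)
    used = set()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        token = peek()
        if token == "(":
            used.add("F -> ( E )")
            advance()
            expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif token is not None and re.fullmatch(r"[A-Za-z_]\w*", token):
            used.add("F -> ident")
            advance()
        else:
            raise SyntaxError("expected identifier or '('")

    def term():
        used.add("T -> F")
        factor()
        while peek() in ("*", "/"):
            used.add("T -> T * F" if peek() == "*" else "T -> T / F")
            advance()
            factor()

    def expression():
        used.add("E -> T")
        term()
        while peek() in ("+", "-"):
            used.add("E -> E + T" if peek() == "+" else "E -> E - T")
            advance()
            term()

    expression()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return used


if __name__ == "__main__":
    print(covered_productions("a + b - c * d / (e)") == ALL_PRODUCTIONS)
```

A single expression such as `a + b - c * d / (e)` already touches all eight productions, which is the sense in which a minimal test set avoids repetition.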

Decision Table-Based Testing
- Applicable when Specification is Describable by a Decision Table
- Table Enumerates Combinations of Inputs that Generate Combinations of Outputs (Actions)
- Advantages
  - Table Behavior Clearly Outlines Inputs and Expected Outcomes of “Black Box”
  - Tests can be Systematically Derived Based on Table
    - Test for Each Input
    - Verify Each Test Generates Expected Output

Consider Following Specification …
- The word-processor may present portions of text in three different formats:
  - plain text (p), boldface (b), italics (i)
- The following commands may be applied to each portion of text:
  - make text plain (P), make boldface (B), make italics (I), emphasize (E), super emphasize (SE).
- Commands are available to dynamically set E to mean either B or I
  - denote commands as E=B and E=I, respectively
- Similarly, SE can be dynamically set to mean either B (command SE=B) or I (command SE=I), or B and I (command SE=B+I).

… and Associated Decision Table

P            *
B                 *                                  *
I                      *                             *
E                           *    *
SE                                    *    *    *
E = B                       *
E = I                            *
SE = I                                     *
SE = B                                *
SE = B + I                                      *
action       p    b    i    b    i    b    i    b,i  b,i
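Tests can be derived from a decision table by running one case per rule. The sketch below encodes nine rules inferred from the word-processor specification (the exact column layout of the slide's table is an assumption), together with a hypothetical implementation `apply_commands` standing in for the black box under test.

```python
# Each rule: (set of commands/settings in effect, expected set of formats).
# The rules are inferred from the specification; treat them as an assumption.
DECISION_TABLE = [
    ({"P"}, {"p"}),
    ({"B"}, {"b"}),
    ({"I"}, {"i"}),
    ({"E", "E=B"}, {"b"}),
    ({"E", "E=I"}, {"i"}),
    ({"SE", "SE=B"}, {"b"}),
    ({"SE", "SE=I"}, {"i"}),
    ({"SE", "SE=B+I"}, {"b", "i"}),
    ({"B", "I"}, {"b", "i"}),
]


def apply_commands(inputs):
    """Hypothetical implementation under test: maps commands to resulting formats."""
    formats = set()
    if "P" in inputs:
        formats.add("p")
    if "B" in inputs:
        formats.add("b")
    if "I" in inputs:
        formats.add("i")
    if "E" in inputs:
        if "E=B" in inputs:
            formats.add("b")
        if "E=I" in inputs:
            formats.add("i")
    if "SE" in inputs:
        if "SE=B" in inputs or "SE=B+I" in inputs:
            formats.add("b")
        if "SE=I" in inputs or "SE=B+I" in inputs:
            formats.add("i")
    return formats


def failing_rules(table, implementation):
    """Run one test per rule and return the rules whose output differs from the table."""
    return [
        (inputs, expected)
        for inputs, expected in table
        if implementation(inputs) != expected
    ]


if __name__ == "__main__":
    print(failing_rules(DECISION_TABLE, apply_commands))
```

An empty result means every rule of the table produced its expected action; any entry that comes back points at the rule to debug.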

Cause-Effect Graphs
- Alternative to Decision Table that Structures the Input and Expected Outputs in Graph Form
- Program Transformation is Essentially a Correspondence Between Causes (Inputs) and Effects (Outputs)
- (Figure not reproduced: graph connecting the causes B, I, P, E, E = B, SE, E = I, SE = B, SE = I, SE = B + I through AND and OR nodes to the effects b, i, p)

Cause-Effect Graphs
- Additional Constraints can be Modeled using Dashed Lines to Indicate Dependencies and Limitations
- Constraint Types (figure not reproduced):
  - e: at most one of a, b, c
  - i: at least one of a, b, c
  - o: one and only one of a, b, c
  - r: a requires b
  - m: a masks b

Cause-Effect Graphs
- “Both B and I exclude P (i.e., one cannot ask both for plain text and, say, italics for the same portion of text.) E and SE are mutually exclusive.”
- (Figure not reproduced: the graph over the causes B, I, P, E, E = B, SE, E = I, SE = B, SE = I, SE = B + I and the effects b, i, p, now annotated with m (masks) constraints)
- X m Y = X implies not Y

Cause-Effect Graphs
- Complete Coverage Principle Applied by Generating all Possible Combos of Inputs and Verifying Outputs
- Outcomes
  - Input Violates Graph’s Consistency Constraints
  - Input Generates Incorrect Output
  - No Input Possible for Particular Output
- Another Strategy: Reduce Number of Tests by Going Backward from Outputs to Inputs
  - OR node with true output:
    - Use input combinations with only one true input
  - AND node with false output:
    - Use input combinations with only one false input
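The backward strategy can be sketched as two small generators: for an OR node whose output must be true, produce only the combinations with exactly one true input; for an AND node whose output must be false, only those with exactly one false input. The function names are illustrative.

```python
def or_true_inputs(n):
    """Input combinations for an n-input OR node with exactly one input true."""
    return [tuple(i == j for j in range(n)) for i in range(n)]


def and_false_inputs(n):
    """Input combinations for an n-input AND node with exactly one input false."""
    return [tuple(i != j for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    print(or_true_inputs(3))
    print(and_false_inputs(3))
```

For a node with n inputs this yields n combinations instead of the up to 2^n − 1 that produce the same output, which is where the reduction in tests comes from.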

Testing Boundary Conditions
- Testing Criteria Partition Input Domain into Groups Corresponding to Output Behavior
- Typical Programming Errors often Occur at the Boundary Between Different Groups
  - Must Test Program Using Input Values “Inside” the Groups and At their Boundaries
  - Applies to White-box And Black-box Techniques
- Employ an “Oracle” to Inspect Results of Test Executions to Reveal Failures
  - Oracles are Required at each Stage of Testing
  - Automated Test Oracles are Required for Running Large Amounts of Tests
  - Oracles are Difficult to Design: no Single Solution
  - May be a Person that Interprets Results
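The value of boundary tests and of an automated oracle can be shown with a sketch based on the factorial ranges from earlier in the chapter. Here `spec_group` plays the oracle, and `buggy_group` is a hypothetical implementation with an assumed off-by-one error at n = 20.

```python
def spec_group(n):
    """Oracle: expected behavior group according to the factorial specification."""
    if n < 0:
        return "error"
    if n < 20:
        return "exact"
    if n <= 200:
        return "floating point"
    return "error"


def buggy_group(n):
    """Hypothetical implementation with an off-by-one error at the 19/20 boundary."""
    if n < 0:
        return "error"
    if n <= 20:  # assumed bug: should be n < 20
        return "exact"
    if n <= 200:
        return "floating point"
    return "error"


def failures(values):
    """Inputs on which the implementation disagrees with the oracle."""
    return [n for n in values if buggy_group(n) != spec_group(n)]


interior_values = [-7, 5, 100, 500]
boundary_values = [-1, 0, 19, 20, 200, 201]

if __name__ == "__main__":
    print(failures(interior_values), failures(boundary_values))
```

The values taken from inside each group all pass, and only the boundary value 20 reveals the fault.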

Testing in the Large
- Testing in the Small Techniques Apply to Program Segments, Methods, Classes (Limited)
- Now we Scale up to Consider the Verification of Large System Behavior Prior to Deployment
- What can be Leveraged in this Regard?
  - Modular Structure of System to Test Modules Individually and then in Combination etc.
  - Class Structure (Inheritance and Relationships)
  - Organization of Source Code in Repository
  - Software Architecture that Maps Modules/Classes to Various Architectural Components
- Objectives:
  - Localize Errors, Reuse Tested Modules with Confidence, Classify Errors, Find Errors Early, etc.

Testing in the Large
- Module Testing
  - Test a Module to Verify Whether its Implementation Matches External Behavior
  - Not all Modules can be Tested Independently
    - May Require Access to Other Modules (Imports)
    - May Utilize Other Modules (Composed Of)
- Integration Testing
  - Testing that is the Gradual Integration of Modules and Subsystems
- System Testing
  - Testing the System as a Whole, Prior to Delivery
- Acceptance Testing
  - Performed By the Customer

Module Testing
- Scaffolding Utilized to Create an Environment in Which the Module Can be Tested. Facilitated by:
  - Stub: Procedure that has the Same I/O Parameters (Method Signature) as the Missing Procedure but with Simplified Behavior
  - Driver: Code that Simulates the Use of the Module being Tested by Other Modules in the System
- (Figure not reproduced: the DRIVER calls the PROCEDURE UNDER TEST, which calls the STUB; all have ACCESS TO NONLOCAL VARIABLES)
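A stub and a driver might look like the sketch below. The module under test (`invoice_total`), the missing tax module it depends on, and the fixed 10% rate are all hypothetical; they only illustrate the roles of the scaffolding pieces.

```python
def tax_rate_stub(region):
    """Stub: same signature as the missing tax-rate module, but trivially simple."""
    return 0.10  # fixed rate, regardless of region


def invoice_total(amount, region, tax_rate=tax_rate_stub):
    """Hypothetical module under test; it depends on a tax-rate lookup."""
    return round(amount * (1 + tax_rate(region)), 2)


def driver():
    """Driver: simulates callers of the module and returns the failed cases."""
    cases = [
        (100.0, "CT", 110.0),
        (0.0, "CT", 0.0),
        (19.99, "NY", 21.99),
    ]
    return [
        (amount, region)
        for amount, region, expected in cases
        if invoice_total(amount, region) != expected
    ]


if __name__ == "__main__":
    print(driver())
```

Passing the lookup as a parameter is one design choice that lets the real tax module replace the stub later without touching the module under test.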

Integration Testing
- Big Bang Testing: Test Individual Modules and then the Entire System. What are the Problems?
  - All Inter-Module Dependencies Tested at Once
  - Many Interactions Highly Complex/Difficult
- Incremental Testing: Preferred Approach
  - Applies Incrementality Principle to Testing
  - Advantages
    - Modules Developed & Tested Incrementally
    - Easier to Locate and Fix Errors
    - Partial Aggregation of Modules May Constitute Critical Subsystems
    - Reduces Need for Stubs and Drivers
      - Once a Few Modules are Tested, they are “Stubs” and “Drivers” for Other Modules
      - Testing Working Behavior rather than Simulated

Integration Testing
- Utilizes the USES Relationship Among Modules
- Can be Accomplished either Top-Down (needs Stubs) or Bottom-Up (needs Drivers)
- Bottom-Up:
  - Lowest Level Modules First, That Don’t Use Others
  - From Leaves to Root
- Top-Down:
  - Root Level Modules May Fan Out to Multiple Children
  - Requires Many Stubs
- (Figure not reproduced: module hierarchy with A at the root, B and C below it, and D and E at the leaves)

Example of Module Testing
- M1 USES M2 and M2 IS_COMPOSED_OF {M2,1, M2,2}
- (Figure not reproduced: M1 points to M2, which contains M2,1 and M2,2)
- CASE 1
  - Test M1, providing a stub for M2 and a driver for M1
  - Then provide an implementation for M2,1 and a stub for M2,2
- CASE 2
  - Implement M2,2 and test it by using a driver
  - Implement M2,1 and test the combination of M2,1 and M2,2 (i.e., M2) by using a driver
  - Last, implement M1 and test it with M2, using a driver for M1
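A bottom-up order like the one in CASE 2 can be computed from the module relations with a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+) and, as a simplifying assumption, treats IS_COMPOSED_OF like USES, i.e., as "must be tested before".

```python
from graphlib import TopologicalSorter

# Each module maps to the modules that must be tested before it.
# Treating IS_COMPOSED_OF like USES is an assumption made for this sketch.
DEPENDS_ON = {
    "M1": {"M2"},
    "M2": {"M2,1", "M2,2"},
}


def bottom_up_order(depends_on):
    """Return modules ordered so that each appears after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(bottom_up_order(DEPENDS_ON))
```

The relative order of M2,1 and M2,2 is not fixed by the relations, so either may come first; reversing the list gives a top-down order, where stubs rather than drivers are needed.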

Testing Object-Oriented Systems
- Testing in the Small
  - Test Each Class by Calling All of its Methods (Public, Private, and Protected)
  - Verify Inability to Access Private Data
  - Verify Visibility of Methods
  - Employ Stubs and Drivers as Necessary
- Testing in the Large
  - Organize Test Structure According to Class Structure & Interactions (Inheritance, Component)
  - Separation of Public Interface vs. Private Impl. May Make Testing Execution Paths Difficult
- Both Must Consider New OO Issues:
  - Inheritance, Genericity, Polymorphism, Dynamic Binding, etc.

Testing Classes in Hierarchy
- “Flattening” the Entire Hierarchy and Considering every Class as a Totally Independent Component
  - Does not Exploit Incrementality
  - Doesn’t Take Advantage of Inheritance Links
- Is there a Strategy that can Leverage the Hierarchy and its Meaning to Facilitate the Testing Process?
- (Figure not reproduced: class hierarchy with Personnel at the root; Consultant and Employee on the second level; Manager, Admin_Staff, and Technical_Staff on the third level)

One Inheritance Testing Strategy
- Consider Identifying Series of Potential Tests
  - Tests that Don’t Have to be Repeated for Any Heir
    - Done at Highest Level, Not Repeated for Descendants
  - Tests that Must be Performed for Class X and All of its Descendants
  - Tests that Must be Redone by Applying the Same Input Data to Verify if Output is Not (or is) Changed
  - Tests that Must Be Modified by Adding Other Input Parameters and Verifying that the Output Changes Accordingly
- What about Higher Level Testing Issues?
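One way to reuse tests along an inheritance link is to let the test classes mirror the class hierarchy. The sketch below uses Python's `unittest`; the `Employee` and `Manager` classes, their attributes, and the assumption that Manager inherits from Employee are all hypothetical.

```python
import unittest


class Employee:
    """Hypothetical base class."""

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def pay(self):
        return self.salary


class Manager(Employee):
    """Hypothetical heir that overrides pay()."""

    def __init__(self, name, salary, bonus):
        super().__init__(name, salary)
        self.bonus = bonus

    def pay(self):
        return self.salary + self.bonus


class EmployeeTests(unittest.TestCase):
    expected_pay = 1000

    def make(self):
        return Employee("Ann", 1000)

    def test_name_is_kept(self):
        # Inherited behavior: the same test is rerun unchanged for heirs.
        self.assertEqual(self.make().name, self.make().name)
        self.assertTrue(self.make().name)

    def test_pay(self):
        # Overridden behavior: same test logic, modified expected output.
        self.assertEqual(self.make().pay(), self.expected_pay)


class ManagerTests(EmployeeTests):
    expected_pay = 1200

    def make(self):
        return Manager("Bob", 1000, 200)


def run_all():
    """Run both test classes and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(EmployeeTests))
    suite.addTests(loader.loadTestsFromTestCase(ManagerTests))
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_all()
```

`ManagerTests` inherits both tests and changes only the object factory and the expected pay, which corresponds to the "redo with same input" and "modify and verify changed output" categories above.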

Black & White Box Testing for OO
- Black Box
  - Test the Public Interface: Expected Usage
  - Does the Class Provide Required Functionality?
  - Does Every Method Work as Expected?
  - Is Each Method’s Signature Correctly Defined?
  - Ignore Details of the Implementation
- White Box
  - Test Correctness of Hidden Implementation
  - For Public, Protected, and Private Methods
  - Use Previous Criteria (Edge, Condition, etc.) as Necessary
  - Ignore if Public Interface Matches Specification

What Other Testing is Required?
- What are the Dangers of Limiting yourself to White or Black Box Testing?
- What are Effects of the Following on OO Testing?
  - Incorrect Information Hiding
  - Too Much Functionality in Class
  - Unused Functionality in Class
  - Classes without Functionality
- How do we Deal with:
  - Genericity
  - Polymorphism
  - Dynamic Binding
- Suggestions???

71

Separation of Concerns in Testing
Testing Involves Multiple Phases and Different Stakeholders, all of Whom have Different Goals
Apply Separation of Concerns to
  Test for Different Qualities: Performance, Robustness, Portability, User Friendliness
  Test Different Aspects: Database Interactions GUI Security Code Logging etc.

72

Separation of Concerns in Testing
In Addition – Need to Step Back and Consider Large Class Issues w.r.t. Testing Concerns
Three Popular Types are:
  Overload Testing
    Test System Behavior/Performance During Peaks
    Saturday AM at Supermarket, 10000 License Renewals
  Robustness Testing
    Test System Under Unexpected Conditions
    Power Failure, Erroneous User Commands, etc.
  Regression Testing
    Test to Detect Degradations of Correctness or Other Qualities as Software Evolves/Changes over Time

73

Performance Testing
Stress Testing for Performance under Load
Simultaneous Users' Impact on DB, Server, etc.
Testing Options:
  JMeter: Sends Multiple HTTP Requests to Test Traffic Capacity
  Webperformance: Tests Loads
  DBUnit for Searching Large Repositories
  SQL Injection Testing to Prohibit Passing of SQL Queries in String Inputs of GUIs
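The SQL injection item can be illustrated with a small test fixture. This is a minimal sketch using Python's standard `sqlite3` module; the `agents` table, its columns, and the sample rows are hypothetical and do not come from the Insurance Department project. The unsafe lookup pastes the GUI string into the SQL text, while the safe lookup passes it as a bound parameter.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a small, made-up agents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE agents (name TEXT, license TEXT)")
    conn.executemany(
        "INSERT INTO agents VALUES (?, ?)",
        [("alice", "A-100"), ("bob", "B-200")],
    )
    return conn


def find_unsafe(conn, name):
    # Vulnerable: user input is concatenated directly into the SQL text.
    query = "SELECT license FROM agents WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_safe(conn, name):
    # Parameterized: the driver treats the input strictly as a value.
    return conn.execute(
        "SELECT license FROM agents WHERE name = ?", (name,)
    ).fetchall()


# A classic injection string that turns the WHERE clause into a tautology.
ATTACK = "x' OR '1'='1"
```

An injection test then asserts that `find_safe(conn, ATTACK)` returns no rows, whereas `find_unsafe(conn, ATTACK)` leaks every row in the table.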

74

Testing Concurrent Systems
Concurrent (and Multi-Tier) Systems more Difficult to Specify, Design, and Test than Sequential Systems
Issues that Arise Include:
  Repeatability w.r.t. Time for Test Cases
  Non-determinism Inherent in Concurrent Apps
  Same Input Will Not Always Produce Same Output (e.g., Consider a Database Search)
  Test May Identify Error in One Case that is Not Found in Subsequent Cases
For Insurance Department Project:
  Difficulty Replicating Errors (Two Locations)
  Their Database and Ours out of Sync
  Difference of Production vs. Prototype Environment

75

Testing Real-Time Systems
Verifying Real-Time Systems Must also take into Account Implementation Details
  Scheduling Policies
  Machine Architecture
  CPU/Memory Speeds/Capacities
Completeness Must be Attained from Processing Speed Perspective
Test Cases Consist not only of Input Data, but also of the Times when such Data are Supplied
  Remember – Real-Time Systems can have Real-Time Data Feeds
  Sensor and Other Data Arriving at Regular (or Unknown) Intervals
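To make the point about arrival times concrete, the sketch below models a test case as a list of timed inputs and checks them against a response deadline. Everything here is an assumption for illustration: the single first-come-first-served processor, the fixed service time, and the names `TimedInput` and `missed_deadlines` are hypothetical, and a real system would be measured rather than simulated this way.

```python
from dataclasses import dataclass


@dataclass
class TimedInput:
    arrival_ms: int   # when the datum is supplied
    value: float      # the datum itself


def missed_deadlines(inputs, service_ms, deadline_ms):
    """Return arrival times of inputs whose response time exceeds the deadline.

    Assumes one processor that serves inputs in arrival order and needs
    service_ms to handle each one.
    """
    free_at = 0
    missed = []
    for item in sorted(inputs, key=lambda entry: entry.arrival_ms):
        start = max(free_at, item.arrival_ms)
        finish = start + service_ms
        free_at = finish
        if finish - item.arrival_ms > deadline_ms:
            missed.append(item.arrival_ms)
    return missed


# Same input values, different supply times: two distinct test cases.
REGULAR = [TimedInput(0, 1.0), TimedInput(100, 2.0), TimedInput(200, 3.0)]
BURST = [TimedInput(0, 1.0), TimedInput(10, 2.0), TimedInput(20, 3.0)]
```

With a 40 ms service time and a 50 ms deadline, the regularly spaced feed meets every deadline while the burst does not, even though the data values are identical.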

76

Complementing Testing with Analysis
Towards Software Verification, Analysis is Employed to Assess the Outcomes of Test Cases
Differentiate Between Checking
  Single Test Case (One Execution)
  Collection of Test Cases (Multiple Executions)
After TestCases.pdf Did we:
  Process 61 Checks Totaling $16,444 with 409 Renewal Requests of 48 Agencies and 362 Agents (see 36)
In Theory, Rigorous Formal Analysis Should Result in Proving a Theorem
In Practice
  Difficult (Impossible) to Achieve for Large-Scale, General Purpose, Single/Multi Tier Systems
  Required in Some Cases (DoD, Avionics, etc.)

77

Analysis and Testing
Analysis Can
  Address Verification of Highly Desired Software Properties
  Be Performed by People or Instruments (Software)
  Apply Formal Methods or Intuition/Experience
  Be Applied from Specification Thru Deployment
We’ll Consider Both Informal and Formal Techniques
Informally
  Code (Design) Walkthroughs
  Code (Design) Inspections
Formally
  Proving Correctness
  Pre- and Post-Conditions of Design/Code/Program

78

Informal Analysis – Walkthroughs
Organized Activity with Group Participants
Participants Mimic “Computer” – Paper Execution
  Test Cases are Selected in Advance
  Execution by Hand – Simulate Test
  Record State (Results) on Paper, Whiteboard, Computer File, etc.
Key Personnel
  Designer (Design Walkthrough)
  Developer (Code Walkthrough)
  Attendees (Technical – Designers/Developers)
  Moderator (Control Meeting)
  Secretary
    Take Notes for Changes
    Subsequent Walkthroughs Verified Against Notes

79

Informal Analysis – Walkthroughs
Recommended Guidelines
  Small Number of People (Three to Five)
  Participants Receive Written Documentation From the Designer a Few Days Before the Meeting
  Predefined Duration of Meeting (A Few Hours)
  Focus On the Discovery of Errors, Not on Fixing Them
  Participants: Designer, Moderator, and a Secretary
  Foster Cooperation; No Evaluation of People
Experience Shows that Most Errors are Discovered By the Designer During the Presentation, While Trying to Explain the Design to Other People.

80

Code Inspection
Similar to Walkthroughs but Differs in Goal to Identify Commonly Made Errors
Code Checked by Reviewing for Classic Errors
  Uninitialized Variables
  Jumps into Loops
  Incompatible Assignment (char = int)
  Nonterminating Loops
  Array Indexes out of Bounds
  Improper Storage Allocation
  Mismatch of Actual to Formal Parameters
  Equality Comparison for Floating Point Values
What do Today’s Compilers Offer in Regard to this?
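The last classic error in the list, equality comparison for floating point values, is easy to demonstrate. The sketch below uses Python's standard `math.isclose`; the helper names and the tolerance values are illustrative choices, not a recommendation from the course.

```python
import math


def naive_equal(a, b):
    # The inspected error: exact equality on floating point values.
    return a == b


def tolerant_equal(a, b, rel_tol=1e-9, abs_tol=1e-12):
    # Compare within a tolerance instead of bit-for-bit.
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)


# 0.1 + 0.2 is not exactly representable in binary floating point.
total = 0.1 + 0.2
```

Here `naive_equal(total, 0.3)` is false while `tolerant_equal(total, 0.3)` is true, which is the kind of defect an inspection checklist is meant to catch.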

81

Debugging
Debugging: Locate and Correct Errors
Previously – Least Understood Process in D & D
Today – Marked Improvement in Debugging Process through Modern Programming Languages
Debugging Requires:
  Understanding of Specification when Failures Don’t Match Expectations
  Location of Error Not Always Apparent
  Modularity and Incremental Testing can Help
Where Did “Debugging” Come from?
  Rear Admiral Grace Hopper (COBOL Pioneer)
  1947 Harvard Mark II Machine – Insect on Relay
  The text of the log entry (from September 9, 1947) reads "1545 Relay #70 Panel F (moth) in relay. First actual case of bug being found".
  In Smithsonian

82

Debugging
Closing the Gap Between Faults and Failures
  Fault: Incorrect State During Program Execution
  Failure: Program Halts in Incorrect State
  Fault Does Not Imply Failure!
Techniques to Expose Faults
  Memory Dumps – Complete Content of Memory
  Inspection of Intermediate Code (Assembly)
  Leveraging Modern Compilers/Checkpointing
Faults and Failures Require Precise Recording of Steps and State of Test Case from Start to End

83

Debugging in CT Insurance Project
Utilization of Track Files to Track Program State in the Event of Error
  File Located on each User’s Machine and can be Tracked and Forwarded when Errors Occur
  Key Users Identified
  File Sent by User or Retrieved by Administrator
  File Overwrites Periodically
Software Developers can Leverage Extensive Source Code Infrastructure for Tracking Errors
  Framework Extensible
  Developers can Include their Own Tracking
  Ability to Turn off the Code – but Minimal Overhead
For Code – see Course Web Page…
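The project's actual tracking code is on the course web page and is not reproduced here. As a rough analogue only, the sketch below shows how a track file with periodic overwrite and an on/off switch might look using Python's standard `logging` module. The function names, file name, size limit, and log messages are all hypothetical.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def make_tracker(path, enabled=True, max_bytes=10_000, backups=1):
    """Return a logger that appends program-state records to a track file.

    The file rolls over once it exceeds max_bytes, so old records are
    overwritten periodically. Passing enabled=False silences all tracking.
    """
    logger = logging.getLogger("track." + path)
    logger.handlers.clear()
    logger.propagate = False
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG if enabled else logging.CRITICAL + 1)
    return logger


def demo():
    """Write two state records, then return the lines of the track file."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "track.log")
        log = make_tracker(path)
        log.debug("renewal screen opened, agent_id=%s", 42)
        log.error("save failed, state=%s", {"step": "validate"})
        for handler in log.handlers:
            handler.close()
        with open(path) as handle:
            return handle.read().splitlines()
```

The file produced this way is what a user would forward, or an administrator would retrieve, after an error.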

84

Sample File Output

85

Verifying Other Properties - Performance
Measurable Software Quality with its Own Techniques
Worst Case Analysis
  Prove that the System Response Time is Bounded by some Function of the External Requests
  If Every Register is Used, do Scanners Still Work Quickly?
Average (Expected) Case Analysis
  On a Daily Basis, Does SW Perform as Needed?
Statistical Analysis: What is Standard Deviation in Response Time?
  Recall Queueing Models, Simulation, etc.
All Require an Understanding of Requirements
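The three views of performance above can be computed from a set of measured response times. The sketch below is a minimal illustration using Python's standard `statistics` module; the sample values and the 150 ms bound are made up, and an empirical maximum is only evidence about the worst case, not a proof of a bound.

```python
import statistics


def summarize(response_ms, bound_ms):
    """Summarize measured response times against a required bound."""
    worst = max(response_ms)
    return {
        "mean": statistics.mean(response_ms),      # average (expected) case
        "stdev": statistics.stdev(response_ms),    # sample standard deviation
        "worst": worst,                            # worst case observed
        "within_bound": worst <= bound_ms,         # requirement check
    }


# Hypothetical response times in milliseconds.
SAMPLES = [120, 130, 125, 135, 140]
```

Calling `summarize(SAMPLES, 150)` reports the mean, the spread, and whether the slowest observed response still meets the stated requirement.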

86

Verifying Other Properties - Reliability
Measure the Probability of Occurrence of Failure
  Utilize Statistical and Probabilistic Techniques
  Predict the Probability of Failure – What can be Tolerated if Application Fails?
Meaningful Parameters Include:
  Average Total Number of Failures Observed at Time t: AF(t)
  Failure Intensity: FI(t) = AF'(t)
  Mean Time to Failure at Time t: MTTF(t) = 1/FI(t)
Time in the Model can be Execution or Clock or Calendar Time
MTTF – Used for Both Software and Hardware

87

Basic Reliability Model
Different Models of Reliability
  Basic
    Assumes Decrement of Failure Intensity per Failure is Constant
    Failures Decrease Over Time (Errors Fixed)
    Finite Number of Failures
    Time/Ability to Remove Errors Does Not Vary
  Logarithmic
    Decrement of Failure Intensity per Failure Decreases Exponentially (Is this OK?)
    Assumes Infinite Number of Failures
Different Models Used at Different Times Based on Application, Observed Behavior, etc.
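The slides do not give formulas for the two models. The sketch below assumes the forms commonly attributed to Musa: a failure intensity that falls linearly with failures experienced for the basic model, and one that decays exponentially for the logarithmic model. The parameter names `fi0` (initial intensity), `af_total` (total expected failures), and `theta` (decay rate) are assumptions for illustration, as are the numbers in the usage note.

```python
import math


def fi_basic(af, fi0, af_total):
    """Basic model (assumed form): FI drops by a constant amount per failure.

    FI reaches zero once af_total failures have been experienced, which
    matches the finite-number-of-failures assumption.
    """
    return fi0 * (1.0 - af / af_total)


def fi_logarithmic(af, fi0, theta):
    """Logarithmic model (assumed form): FI decays exponentially with failures.

    FI never reaches zero, matching the infinite-number-of-failures assumption.
    """
    return fi0 * math.exp(-theta * af)


def mttf(fi):
    """MTTF = 1 / FI, as defined earlier; infinite when FI is zero."""
    return math.inf if fi == 0 else 1.0 / fi
```

For example, with `fi0 = 10` failures per hour and `af_total = 100`, the basic model gives an intensity of 5 after 50 failures, so the mean time to failure has doubled from 0.1 to 0.2 hours.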

88

Comparing Models
[Figure: two plots of FI against AF, one for the Basic model and one for the Logarithmic model; labels shown: FI, AF, AF0]

89

Verifying Other Subjective Qualities
Consider Notions like Simplicity, Reusability, Understandability, User Friendliness, etc.
Software Metrics Allow an Assessment of the Degree that each Quality is Attained
For Example, Halstead’s Metric tries to Measure Abstraction Level, Effort, etc., by Measuring
  η1, number of distinct operators in the program
  η2, number of distinct operands in the program
  N1, number of occurrences of operators in the program
  N2, number of occurrences of operands in the program
Overall, Most Tools (Compilers/UML) Offer 30-40 Different Metrics
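The four Halstead counts can be computed directly once a program has been split into operators and operands. The sketch below takes the two token lists as given; which tokens count as operators is a convention, and the classification used for the sample statement is an assumption. Vocabulary, length, and volume are Halstead's standard derived measures.

```python
import math
from collections import Counter


def halstead(operators, operands):
    """Compute Halstead's four base counts and three derived measures."""
    operator_counts = Counter(operators)
    operand_counts = Counter(operands)
    eta1 = len(operator_counts)            # distinct operators
    eta2 = len(operand_counts)             # distinct operands
    n1 = sum(operator_counts.values())     # N1: operator occurrences
    n2 = sum(operand_counts.values())      # N2: operand occurrences
    vocabulary = eta1 + eta2
    length = n1 + n2
    volume = length * math.log2(vocabulary) if vocabulary > 0 else 0.0
    return {
        "eta1": eta1,
        "eta2": eta2,
        "N1": n1,
        "N2": n2,
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
    }


# Tokens of the fragment:  x := a + b; write (x);
OPERATORS = [":=", "+", ";", "write", ";"]
OPERANDS = ["x", "a", "b", "x"]
```

For this fragment the counts are η1 = 4, η2 = 3, N1 = 5, and N2 = 4, giving a vocabulary of 7 and a length of 9.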

90

Metrics in Together Architect

93

Audits in Together Architect

94

Formal Verification – Towards Correctness
Axiomatic Semantics is a Field of Computer Science that Seeks to Prove the Correctness of Software
Technique Utilizes
  Pre Condition: True Before Code Executes
  Post Condition: True After Code Executes
  Proof Rules: Programming Language Statements
Axiomatic Semantics Can be Applied at: Program, Module, Method, and Code Segments
Notationally: If Claim 1 and Claim 2 have been Proven, then you Can Deduce Claim 3

  Claim1, Claim2
  --------------
      Claim3

95

Consider the Following Example
Before the Execution – the Pre-Condition is True
Other Possibilities are:
  {true and a = 0 and b = 0}
  {true and x = 0}

{true} – pre-condition
begin
  read (a); read (b);
  x := a + b;
  write (x);
end
{output = input1 + input2} – post-condition

96

Proof Rules for a Programming Language
Consider the Rule for Sequence
What Does it Say?
  For S1, F1 is the pre and F2 is the post
  For S2, F2 is the pre and F3 is the post
  Below Line, we can Collapse and Remove the Intermediate Condition

  {F1} S1 {F2}, {F2} S2 {F3}
  --------------------------  sequence
       {F1} S1; S2 {F3}

  {x=0} x:=25 {x=25}, {x=25} x:=x+10 {x=35}
  -----------------------------------------
       {x=0} x:=25; x:=x+10 {x=35}
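The sequence rule can be exercised, though not proven, by checking triples over a finite sample of states. In the sketch below a state is a dictionary, a statement is a function on states, and an assertion is a predicate on states. The helper names `holds` and `seq` and the chosen range of sample states are assumptions for illustration; passing such a check is testing, not a proof.

```python
def holds(pre, stmt, post, states):
    """True if post holds after stmt for every sampled state satisfying pre."""
    return all(post(stmt(dict(state))) for state in states if pre(state))


def seq(first, second):
    """Sequential composition S1; S2."""
    return lambda state: second(first(state))


def s1(state):
    # x := 25
    state["x"] = 25
    return state


def s2(state):
    # x := x + 10
    state["x"] = state["x"] + 10
    return state


def f1(state):
    return state["x"] == 0


def f2(state):
    return state["x"] == 25


def f3(state):
    return state["x"] == 35


# A finite sample of states that includes x = 0 and x = 25.
STATES = [{"x": value} for value in range(-5, 41)]
```

Both premises, `holds(f1, s1, f2, STATES)` and `holds(f2, s2, f3, STATES)`, succeed, and so does the conclusion `holds(f1, seq(s1, s2), f3, STATES)`, with the intermediate condition `f2` no longer mentioned.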

97

Proof Rules for a Programming Language
Conditional Statement – How Does it Work?
  For S1 (If portion): If cond is True, then Use Post that Follows S1
  For S2 (Else portion): If cond is False, then Use Post that Follows S2
While Statement Works Similarly:

  {Pre and cond} S1 {Post}, {Pre and not cond} S2 {Post}
  ------------------------------------------------------  if-then-else
  {Pre} if cond then S1; else S2; end if; {Post}

  {I and cond} S {I}
  -------------------------------------------------  while-do
  {I} while cond loop S; end loop; {I and not cond}

  I: loop invariant
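A loop invariant can be checked at run time with assertions placed where the while-do rule requires I to hold: on entry, after each execution of the body, and on exit together with the negated condition. The summation loop below is a hypothetical example chosen for illustration; it does not appear in the slides.

```python
def sum_to(n):
    """Return 1 + 2 + ... + n while checking a loop invariant.

    Invariant I: total == i * (i + 1) // 2 and i <= n
    """
    assert n >= 0                                     # Pre
    i, total = 0, 0
    assert total == i * (i + 1) // 2 and i <= n       # I holds on entry
    while i < n:                                      # cond
        i += 1
        total += i
        assert total == i * (i + 1) // 2 and i <= n   # {I and cond} S {I}
    assert not (i < n)                                # exit: I and not cond
    return total
```

On exit, I and not cond together give i = n and total = n(n + 1)/2, which is the post-condition one would state for this loop.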

98

Proving Correctness
Partial Correctness
  Validity of {Pre} Program {Post} Guarantees that if Pre Holds Before Execution of Program, and if Program Terminates, then Post will be Achieved
Total Correctness
  Pre Guarantees Program’s Termination and the Truth of Post
These problems are undecidable!!!

99

Consider an Example

{n ≥ 1}
i := 1; j := 1;
found := false;
while i ≤ n loop
  if table (i) = x then
    found := true;
    i := i + 1
  else
    table (j) := table (i);
    i := i + 1; j := j + 1;
  end if;
end loop;
n := j - 1;
{not exists m (1 ≤ m ≤ n and table (m) = x) and
 found ≡ exists m (1 ≤ m ≤ old_n and old_table (m) = x)}

old_table, old_n: constants denoting the values of table and of n before execution of the program fragment
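The fragment translates almost line for line into executable code, with the pre- and post-conditions as assertions. The sketch below is an assumed Python rendering: indexes are 0-based rather than 1-based, so the final `n := j - 1` becomes `n = j`, and the function name `remove_all` is hypothetical.

```python
def remove_all(table, x):
    """Compact table in place, dropping every x; return (kept, found)."""
    old_table = list(table)            # plays the role of old_table
    n = len(table)
    assert n >= 1                      # pre-condition {n >= 1}
    i, j, found = 0, 0, False          # 0-based version of i := 1; j := 1
    while i < n:
        if table[i] == x:
            found = True
            i += 1
        else:
            table[j] = table[i]
            i += 1
            j += 1
    n = j                              # 0-based version of n := j - 1
    kept = table[:n]
    # Post-condition: x no longer occurs in the first n entries, and
    # found is true exactly when x occurred in the original table.
    assert x not in kept
    assert found == (x in old_table)
    return kept, found
```

Running it on sample tables checks the post-condition for those inputs only; the axiomatic proof is what establishes it for every table satisfying the pre-condition.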

100

Correctness at Module Level
Prove Correctness of Individual Operations and Use this to Prove Correctness of Overall Module

module TABLE;
exports
  type Table_Type (max_size: NATURAL): ?;
  -- no more than max_size entries may be stored in a table; user modules must guarantee this
  procedure Insert (Table: in out Table_Type; ELEMENT: in ElementType);
  procedure Delete (Table: in out Table_Type; ELEMENT: in ElementType);
  function Size (Table: in Table_Type) return NATURAL;
  -- provides the current size of a table
  …
end TABLE

101

Module Correctness
Suppose that the Following have been Proven:

  {true} Delete (Table, Element); {Element ∉ Table};
  {Size (Table) < max_size} Insert (Table, Element) {Element ∈ Table};

Using these Two, we can Prove Properties of Table Management/Manipulation
Consider the Sequence:
  Insert (T, x);
  Delete (T, x);
After the Delete, we Guarantee x not Present in T
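The two proven triples can be mirrored as run-time contracts on a small class. The sketch below is a hypothetical Python stand-in for the TABLE module, not its actual implementation: the list-based storage and method names are assumptions, and the assertions only check the contracts on the calls that are actually made.

```python
class Table:
    """Hypothetical bounded table whose operations check their contracts."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._items = []

    def size(self):
        return len(self._items)

    def __contains__(self, element):
        return element in self._items

    def insert(self, element):
        assert self.size() < self.max_size   # pre: Size(Table) < max_size
        self._items.append(element)
        assert element in self               # post: Element is in Table

    def delete(self, element):
        # pre: true, so there is nothing to check on entry
        self._items = [item for item in self._items if item != element]
        assert element not in self           # post: Element is not in Table
```

After `insert(x)` followed by `delete(x)`, the post-condition of `delete` alone guarantees that `x` is absent, which is the property derived for the sequence above.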

102

Chapter 6 Summary
Verification can Span Many Aspects of the D&D Process from the Specification Through Deployment
Testing Ideas/Concepts from Different Perspectives
  Testing in the Large vs. Testing in the Small
  White Box Testing vs. Black Box Testing
  Condition Criteria
  Informal vs. Formal Techniques
  Debugging/Walkthroughs/Inspections
  User vs. Domain Specialist vs. SW Designer vs. …
Objective of All is to Yield Programs that Work Correctly from Execution Standpoint (no Errors) and from a User Standpoint (behaves as Expected)
