CS4723 Software Validation and Quality Assurance Lecture 11 Static Bug Detection and Verification.


Transcript of CS4723 Software Validation and Quality Assurance Lecture 11 Static Bug Detection and Verification.


Static bug detection

Static bug detection is a minor approach to software quality assurance, compared with testing.

Compared to testing:
  Works only for specific kinds of bugs
  Sometimes not scalable
  Generates false positives
  Easy to start (no build, no setup, no install …)
  Can sometimes guarantee that the software is free of certain kinds of bugs
  No need for debugging

State of the art: static bug detection

Type-specific detection (fixed specification and improvement is provided)
  Targets major or important types of bugs: null pointer dereference, memory leak, unsafe cast, injection, buffer overflow, dynamic SQL error, race condition, deadlock, infinite loop, HTML error, UI inconsistency, i18n bugs, …
  A large number of techniques exist for each kind of bug
  Most of them have severe limitations that prevent practical use

Specification-based detection
  Model checking, symbolic execution, theorem proving

Specification

A description of the correct behavior of software

We must have a formal specification to do static bug detection

Three main types of specifications:
  Value
  Temporal
  Data flow

Value Specification

The value(s) of one or several variable(s) must satisfy a certain constraint

Examples:
  Final Exam Score <= 100
  sortedlist(0) >= sortedlist(1)
  http_url.startsWith("http")
  sql_query belongs to Language_SQL

Temporal Specification

Two events (or a series of events) must happen in a certain order

Examples:
  lock() -> unlock()
  file.open() -> file.close() and file.open() -> file.read()
  They are different, right? The first says that a close must eventually follow an open; the second says that a read must not happen before an open.

Temporal logic:
  lock() -> F(unlock())
  (!read()) U (open())

Data Flow Specification

Data from a certain source must / must not flow to a certain sink

Examples:
  ! Contact Info -> Internet
  Password -> encryption -> Internet

Data flow specifications are mainly used for security

General Specifications

Common behaviors of all software, and the kind of bug each one rules out:

  Specification                                  Bug
  a/b -> b!=0                                    Divide by 0
  a.field -> a!=null                             Null pointer reference
  a[x] -> x<a.length()                           Buffer overflow
  p.malloc() -> p.free()                         Memory leak
  lock(s) -> unlock(s)                           Deadlock
  while(Condition) -> F(!Condition)              Infinite loop
  <script> xxx </script> -> ! User_input -> xxx  XSS
  ! Hard-coded string -> User Interface          I18n error

Checking Specifications: Basic ways

Value specifications
  Symbolic execution
  Abstract interpretation

Temporal specifications
  Model checking

Data flow specifications
  Graph traversal (data dependence graph)

Static symbolic execution

Basic example

  y = read();
  y = 2 * y;
  if (y <= 12)
    y = 3;
  else
    y = y + 1;
  print("OK");

Each statement is annotated as "condition (state)". Here T (true) is the condition for the statement to be executed, and the state, e.g. (y = s), is the relationship of all variables to the inputs after the statement is executed. s is a symbolic variable for the input.

  y = read();      T (y = s)
  y = 2 * y;       T (y = 2*s)
  if (y <= 12)     T (y = 2*s)
  y = 3;           T ^ 2*s <= 12 (y = 3)
  y = y + 1;       T ^ !(2*s <= 12) (y = 2*s + 1)
  print("OK");     T ^ 2*s <= 12 (y = 3) | T ^ !(2*s <= 12) (y = 2*s + 1)

Prove y > 0? Add the negated property y <= 0 to each path and check satisfiability:
  (2*s <= 12) & (y = 3) & (y <= 0): not satisfiable
  !(2*s <= 12) & (y = 2*s + 1) & (y <= 0): not satisfiable
Neither path can violate the property, so y > 0 holds at print.

Static symbolic execution

Complex example

  y = read();
  p = 1;
  while (y < 10) {
    y = y + 1;
    if (y > 2) p = p + 1;
    else p = p + 2;
  }
  print(p);

T is the path condition; s is a symbolic variable for the input.

  y = read();            T (y = s)
  p = 1;                 T (p = 1, y = s)
  while (y < 10)         T (p = 1, y = s)
  y = y + 1;             T ^ s < 10 (y = s + 1, p = 1)
  p = p + 1;             T ^ s < 10 ^ 2 < s + 1 (y = s + 1, p = 2)
  p = p + 2;             T ^ s < 10 ^ s + 1 <= 2 (y = s + 1, p = 3)
  y = y + 1; (2nd pass)  T ^ 2 < s + 1 < 10 (y = s + 2, p = 2) | T ^ s + 1 <= 2 (y = s + 2, p = 3)

Prove p > 0? Every further loop iteration adds more paths to record.

Abstract Interpretation

Symbolic execution tries to record all changes and relations in memory with symbolic values
  Too many things to record, so it is not scalable
  Usually only a small part of the data is useful

Abstract interpretation
  Works in a similar way to symbolic execution
  Uses abstract values instead of symbolic values

Abstract Interpretation

Abstract domains
  A map from concrete values to abstract values
  Examples:
    Integer -> +, -, 0
    String -> [0…9]*, other
    Pointer -> null / not null

Abstract operations
  +, -, *, /, concatenation, …
  Join: applied when two branches merge, or when a statement is executed for the second time
  OP: Dom * Dom -> Dom

Abstract Operations

An example on integers, with Integer -> +, -, 0:
  + (+) + = +
  - (+) - = -
  + (+) - = ?

Two special abstract values in abstract domains:
  ⊤ (top): means all possible values
  ⊥ (bottom): means no value
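The sign domain above can be sketched in Java as follows. This is only an illustration under assumptions: the names SignDomain, Sign, add and join are hypothetical, and the lattice is taken to be flat (⊥ below +, -, 0, which are below ⊤). In this particular domain, + (+) - has no precise answer and is mapped to ⊤.

```java
class SignDomain {
    // BOTTOM = no value, TOP = all possible values (hypothetical names).
    enum Sign { BOTTOM, NEG, ZERO, POS, TOP }

    // Abstraction: map a concrete integer to its abstract value.
    static Sign of(int n) {
        return n > 0 ? Sign.POS : n < 0 ? Sign.NEG : Sign.ZERO;
    }

    // Abstract addition: Dom * Dom -> Dom.
    static Sign add(Sign a, Sign b) {
        if (a == Sign.BOTTOM || b == Sign.BOTTOM) return Sign.BOTTOM;
        if (a == Sign.TOP || b == Sign.TOP) return Sign.TOP;
        if (a == Sign.ZERO) return b;
        if (b == Sign.ZERO) return a;
        // Same sign is preserved; + (+) - can be anything, so it becomes TOP.
        return a == b ? a : Sign.TOP;
    }

    // Join in the flat lattice: least upper bound of two abstract values.
    static Sign join(Sign a, Sign b) {
        if (a == b) return a;
        if (a == Sign.BOTTOM) return b;
        if (b == Sign.BOTTOM) return a;
        return Sign.TOP;
    }
}
```

For instance, SignDomain.add(SignDomain.of(3), SignDomain.of(-5)) loses the sign information, because the concrete result depends on the magnitudes that the abstraction has thrown away.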

Abstract Interpretation

Complex example

  y = read();
  p = 1;
  while (y < 10) {
    y = y + 1;
    if (y > 2) p = p + 1;
    else p = p + 2;
  }
  print(p);

Prove p > 0?

Abstract value of p, first pass:
  y = read();        p = ⊥
  p = 1;             p > 0
  while (y < 10)     p > 0
  p = p + 1;         p > 0 (+) 1 -> p > 0
  p = p + 2;         p > 0 (+) 2 -> p > 0
  after the if       p > 0 (join) p > 0 -> p > 0

Second pass over the loop:
  while (y < 10)     p > 0
  p = p + 1;         p > 0 (+) 1 -> p > 0
  p = p + 2;         p > 0 (+) 2 -> p > 0
  after the if       p > 0 (join) p > 0 -> p > 0

The abstract values no longer change. It is called a fixed point!

Abstract Interpretation

Can we make sure there is a fixed point?

  y = read();
  p = 1;
  while (y < 10) {
    y = y + 1;
    p = p * (-1);
  }
  print(p);

Prove p != 0?

Abstract value of p:
  y = read();        p = ⊥
  p = 1;             p > 0
  while (y < 10)     p > 0
  p = p * (-1);      p > 0 (*) -1 -> p < 0
  back at the loop   p > 0 and p < 0

We should try to join p < 0 and p > 0

Abstract Interpretation

First trial, use the later value: Join(a, b) = b

  y = read();
  p = 1;
  while (y < 10) {
    y = y + 1;
    p = p * (-1);
  }
  print(p);

Prove p != 0?

Abstract value of p:
  y = read();                p = ⊥
  p = 1;                     p > 0
  while (y < 10)             p > 0
  p = p * (-1);              p > 0 (*) -1 -> p < 0
  back at the loop           Join(p > 0, p < 0) = p < 0
  p = p * (-1); (2nd pass)   p < 0 (*) -1 -> p > 0

The value keeps flipping between p > 0 and p < 0. Cannot reach a fixed point!

Abstract Domain

How to make sure we reach a fixed point?
  Order all abstract values in a partial order
  Join operations should all be monotonic

[Figure: two example lattices over the sign values +, -, 0; the second one also contains the values >=0, <=0 and !=0]

Abstract Interpretation

Can we make sure there is a fixed point?

  y = read();
  p = 1;
  while (y < 10) {
    y = y + 1;
    p = p * (-1);
  }
  print(p);

Prove p != 0?

We should make sure the value of p stays the same or goes up. Otherwise, we cannot reach a fixed point…

Abstract value of p:
  y = read();                p = ⊥
  p = 1;                     p > 0
  while (y < 10)             p > 0
  p = p * (-1);              p > 0 (*) -1 -> p < 0
  back at the loop           Join(p > 0, p < 0): p != 0
  p = p * (-1); (2nd pass)   p != 0 (*) -1 -> p != 0
  back at the loop           Join: p != 0
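The iteration on this slide can be sketched as a small fixed-point loop. The code below is an assumption-laden illustration, not the lecture's implementation: the class SignFixpoint and its methods are hypothetical, and the domain is simplified to ⊥, -, 0, +, !=0 and ⊤ (the values >=0 and <=0 are left out). The join only ever moves up in the order, so the loop must stop.

```java
class SignFixpoint {
    // Simplified sign lattice: BOTTOM < {NEG, ZERO, POS}; NEG, POS < NONZERO < TOP; ZERO < TOP.
    enum V { BOTTOM, NEG, ZERO, POS, NONZERO, TOP }

    // Abstract version of p * (-1).
    static V negate(V v) {
        switch (v) {
            case NEG: return V.POS;
            case POS: return V.NEG;
            default:  return v; // BOTTOM, ZERO, NONZERO and TOP are unchanged
        }
    }

    // Least upper bound of two abstract values in the lattice above.
    static V join(V a, V b) {
        if (a == b) return a;
        if (a == V.BOTTOM) return b;
        if (b == V.BOTTOM) return a;
        boolean bothNonZero = a != V.ZERO && b != V.ZERO && a != V.TOP && b != V.TOP;
        return bothNonZero ? V.NONZERO : V.TOP;
    }

    // Abstract value of p at the loop head of: p = 1; while (...) { p = p * (-1); }
    static V analyze() {
        V head = V.POS; // after p = 1
        while (true) {
            V afterBody = negate(head);
            V next = join(head, afterBody); // the value stays or goes up
            if (next == head) return head;  // nothing changed: fixed point
            head = next;
        }
    }
}
```

Starting from p > 0, the first join yields p != 0 and the second pass leaves it unchanged, which matches the trace on the slide.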

Abstract Domain

So the idea is:
  The abstract value of a variable can only stay the same or go up, until it reaches ⊤

Designing an appropriate domain is important
  Can we use the same domain as in the !=0 example? No!
  Consider a = a(>0) + b(>0)
  For the first domain: + (+) + = +, join(+, +) = +
  For the second domain: !=0 (+) !=0 = ⊤, join(!=0, ⊤) = ⊤

[Figure: the two domains compared, one over +, -, 0 and one containing !=0]

Abstract Domain

Common abstract domains:
  Numeric values
  Regular expressions
  Sets

It is relatively easy to define an order on them
Their operations are monotonic

Numeric Value Domains

The most widely used domains are just about '+', '-', '0'

It is also possible to have a number of value ranges, ordered from the most precise (first row) to the least precise (last row). Here N- means below N and N+ means N and above:

  0-    0-60    60-90    90-100    100+
  60-    0-90    60-100    90+
  90-    0-100    60+
  100-    0+

String Value Domains

Usually useful for proving string formats
  All URLs start with "http"?
  All file names end with ".php"?
  …

Prefix domains, ordered from the most precise (first row) to the least precise (last row):

  abcde   abcdd   abce   abl
  abcd*
  abc*
  ab*
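A join for the prefix domain can be sketched as the longest common prefix of the two values. This is a minimal illustration with assumed names (PrefixDomain, join); as a simplification, an abstract value such as abc* is stored as its prefix string "abc", so a concrete string and a prefix are not distinguished.

```java
class PrefixDomain {
    // Join of two prefix values: the longest prefix shared by both.
    static String join(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return a.substring(0, i);
    }
}
```

Joining the strings from the slide one by one walks down the rows of the lattice: abcde with abcdd gives abcd*, adding abce gives abc*, and adding abl gives ab*.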

Set Domains

Useful for determining the possible constant values of a variable

Join represents the relation of subsume and merge

Ordered from the most precise (first row) to the least precise (last row):

  abcde   abcdd   abce   abl
  {abcde, abcdd}   {abcdd, abce}   {abce, abl}
  {abcde, abcdd, abce}   {abcdd, abce, abl}
  {abcde, abcdd, abce, abl}

Checking Specifications: Basic ways

Value specifications
  Symbolic execution
  Abstract interpretation

Temporal specifications
  Model checking

Data flow specifications
  Graph traversal (data dependence graph)

Model Checking

Basic idea
  Transform the program into an automaton
  Program states are the states of the automaton, and statements are its transitions / edges
  Check temporal properties on the automaton by traversing it

Model Checking: Model Building

Basic approaches:
  Use the control flow graph
    View all program states after a statement as ONE state
  Use abstract values
    View all program states after a statement with the same abstract values as ONE state
  Use concrete values
    View all program states after a statement with the same concrete values as ONE state: usually impossible

An example with CFG-model

Checking whether a file is closed in all cases

  boolean load() {
    f.open();
    line = f.read();
    while (line != null) {
      if (line.contains("key")) {
        f.close();
        return true;
      } else if (line.contains("value")) {
        f.close();
      }
      line = f.read();
    }
    return false;
  }

[Figure: CFG model of load(), with states Start, opened, new line read, key, value, none, closed (twice) and ret, edges labeled !=null and ==null, and the annotation "f is not open"]

An example with CFG-model

Traversing the model to find counterexamples

[Figure: CFG model of load(), with states Start, opened, new line read, key, value, none, closed (twice) and ret, edges labeled !=null and ==null, and the annotation "f is not open"]
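The traversal can be sketched as a depth-first search over a hand-written model of load(). Everything here is an assumption for illustration: the node names (start, open, read, isKey, …) and the class CfgModelCheck are hypothetical, and the model tracks a single abstract fact, whether f is open. The search returns the first path that reaches a return while f is still open.

```java
import java.util.*;

class CfgModelCheck {
    // Hypothetical CFG model of load(): each node maps to its successors.
    static final Map<String, List<String>> EDGES = new HashMap<>();
    static {
        EDGES.put("start", List.of("open"));
        EDGES.put("open", List.of("read"));
        EDGES.put("read", List.of("isKey", "retFalse"));     // line != null / line == null
        EDGES.put("isKey", List.of("closeKey", "isValue"));
        EDGES.put("closeKey", List.of("retTrue"));
        EDGES.put("isValue", List.of("closeValue", "read"));
        EDGES.put("closeValue", List.of("read"));
        EDGES.put("retTrue", List.of());
        EDGES.put("retFalse", List.of());
    }

    // Returns a counterexample path that returns with f still open, or null if none exists.
    static List<String> findLeak() {
        return search("start", false, new HashSet<>(), new ArrayDeque<>());
    }

    private static List<String> search(String node, boolean open,
                                       Set<String> seen, Deque<String> path) {
        // Update the abstract state for the event at this node.
        if (node.equals("open")) open = true;
        if (node.startsWith("close")) open = false;
        // Each (node, open) pair is one model state; visit it only once.
        if (!seen.add(node + "/" + open)) return null;
        path.addLast(node);
        List<String> result = null;
        if (node.startsWith("ret")) {
            if (open) result = new ArrayList<>(path); // returned without closing
        } else {
            for (String next : EDGES.get(node)) {
                result = search(next, open, seen, path);
                if (result != null) break;
            }
        }
        path.removeLast();
        return result;
    }
}
```

On this model the search reports the path start, open, read, retFalse: the file is opened, the first read returns null, and the method returns false without closing f.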

An example with CFG-model

Read must happen before close

[Figure: CFG model of load(), with states Start, opened, new line read, key, value, none, closed (twice) and ret, edges labeled !=null and ==null, and the annotation "f is not open"]

Temporal Logic

The basic idea of model checking is to find a path in the model that violates the specification

Temporal logic describes the sequential relationship among a number of events: the specification
  Any specification can then simply be read by a path-finding tool
  There is no need to write a path-finding tool for each proof

Usage of Temporal Logic

Describe the sequential relationship among a number of events

U: until. P U Q means that P has to be true until Q is true
  !read(f) U open(f)
  !close(f) U open(f)

F: future. F P means that P will be true at some time in the future
  open(f) -> F close(f)
  close(f) -> !F read(f)
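To make the two operators concrete, the sketch below checks them on a single finite trace of event names. It is an illustration only: TraceCheck, notUntil and eventually are hypothetical names, a real model checker works on all paths of the model rather than one trace, and the until is treated as a weak until (a trace where neither event occurs passes), which is an assumption.

```java
import java.util.List;

class TraceCheck {
    // (!p) U q on a finite trace: p must not occur before the first q.
    // Assumption: weak until, so a trace without any q passes if p never occurs.
    static boolean notUntil(List<String> trace, String p, String q) {
        for (String event : trace) {
            if (event.equals(q)) return true;
            if (event.equals(p)) return false;
        }
        return true;
    }

    // p -> F q on a finite trace: every p is followed later by some q.
    static boolean eventually(List<String> trace, String p, String q) {
        boolean pending = false;
        for (String event : trace) {
            if (event.equals(p)) pending = true;
            else if (event.equals(q)) pending = false;
        }
        return !pending;
    }
}
```

For example, the trace open, read satisfies !read(f) U open(f) but violates open(f) -> F close(f), because the file is never closed.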

Checking Specifications: Basic ways

Value specifications
  Symbolic execution
  Abstract interpretation

Temporal specifications
  Model checking

Data flow specifications
  Graph traversal (data dependence graph)

Some Simple Checks with Graph Traversal

Check x flows to w

Check (!z used as divisor) U (z is written)
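A check such as "x flows to w" reduces to reachability in the data dependence graph. The sketch below assumes the graph is given as a map from each variable to the variables that directly depend on it; the class DataFlowCheck, the method flowsTo and the example edges are hypothetical, since the graph from the slide is not available in this transcript.

```java
import java.util.*;

class DataFlowCheck {
    // Breadth-first search: does data flow from source to sink along dependence edges?
    static boolean flowsTo(Map<String, List<String>> deps, String source, String sink) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.addLast(source);
        seen.add(source);
        while (!work.isEmpty()) {
            String v = work.removeFirst();
            if (v.equals(sink)) return true;
            for (String next : deps.getOrDefault(v, List.of())) {
                if (seen.add(next)) work.addLast(next); // visit each node once
            }
        }
        return false;
    }
}
```

With the made-up edges x -> y, y -> w and z -> w, flowsTo reports that x flows to w but w does not flow back to x.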

Program slicing for sum = 0 -> sum = 1

[Figure: dependence graph of a main procedure (sum=0; i=1; while i<11, with two call sites of add assigning sum=add$0 and i=add$1) and the procedure add (add$result=a+b), connected through actual-in / actual-out and formal-in / formal-out nodes]

Sensitivity of graph traversal

Context sensitivity
  Example:
    int f(int i) { return i; }
    x = f(x); y = f(y);
  A context-insensitive traversal merges the two calls of f, so x appears to flow to y.

Flow sensitivity
  Example:
    x = 2; x = 3; y = x;
  A flow-insensitive traversal ignores the statement order, so y appears to receive 2 as well as 3.

Problems of static bug detection

Lack of specifications
  Project-specific formal specifications are very rare
  Solutions:
    General specifications (for typical bugs)
    Mining specifications (for API-specific, project-specific specifications)

False positives vs. efficiency
  More sensitivities -> higher cost
  Path sensitivity is rarely achieved
  Combination of all sensitivities -> incomputable problems

State of the practice: static bug detection

Findbugs
  A tool developed by researchers from UMD
  Widely used in industry for code checking before commit
  The idea actually comes from Lint

Lint
  A code style enforcing tool for the C language
  Finds bad coding styles and raises warnings
    Bad naming
    Hard-coded strings
    …

Idea: do it in reverse

Most static bug detection tools
  Set up a specification (either from users or a well-defined one)
    E.g., a divisor should not be 0, a null pointer should not be dereferenced, the salary of a person cannot be negative
  Check all possible cases to guarantee that the specification holds
  Otherwise provide counterexamples

Findbugs
  Detects code patterns for bugs
    E.g., a = null; b = a.field;
    str.replace(" ", "");

Characteristics of Findbugs

Based on existing concrete code patterns

Checks code patterns locally: only intra-procedural analysis
  What are the advantages and disadvantages of doing so?

Performs bug ranking according to the probability and potential severity of bugs
  Probability: how likely the reported bug is to be a true bug
  Severity: the bug may cause severe consequences if not fixed

Application of Findbugs-like tools

Findbugs is adopted by a number of large companies such as Google
  Usually only the issues with the highest confidence/severity are reported as issues

Statistics from Google in 2009:
  More than 4000 issues were identified, of which 1700 bugs were confirmed and 1100 were fixed

The software department of USAA is using PMD, an alternative to Findbugs

Patterns to be checked

404 bug patterns in 6 major categories:
  Bad Practice / Dodgy code
  Correctness
  Internationalization
  Vulnerability / Security
  Multithread correctness
  Performance

Bad Practice / Dodgy code

Hackish code that is not stable and may harm future maintenance

Examples:
  The equals method should not assume the type of its object argument
    boolean equals(Object o) {
      Myclass my = (Myclass) o;
      return my.id == this.id;
    }
  Abstract class defines a covariant compareTo() method
    int compareTo(Myclass obj) { … }
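For the first pattern above, one common way to avoid assuming the argument type is to test it with instanceof before casting. The sketch below is illustrative only: the class MyClass and its id field are hypothetical stand-ins for the Myclass of the slide, and hashCode is overridden as well because equal objects must have equal hash codes.

```java
class MyClass {
    private final int id;

    MyClass(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // Check the type first: a null or foreign object is simply "not equal".
        if (!(o instanceof MyClass)) return false;
        MyClass other = (MyClass) o;
        return id == other.id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}
```

With this version, comparing a MyClass with a String or with null returns false instead of throwing a ClassCastException or NullPointerException.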

Correctness

The code pattern may result in incorrect behavior of the software

Examples:
  DMI: Collections should not contain themselves
    List s = new …; …
    if (s.contains(s)) { … }
  DMI: Invocation of hashCode on an array
    int[] x = new int[10];
    x.hashCode();

Internationalization

A code pattern that will hinder future i18n of the software

Examples:
  Use toUpperCase, toLowerCase on localized strings
    String s = getLocale(key);
    s.toUpperCase();
  Perform getBytes() on localized strings
    String s = getLocale(key);
    s.getBytes();

Multi-thread correctness

A code pattern that may cause incorrectness in multi-thread execution

Examples:
  Synchronization on boxed primitive
    private static Boolean inited = Boolean.FALSE;
    ...
    synchronized (inited) {
      if (!inited) {
        init();
        inited = Boolean.TRUE;
      }
    }
    ...
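The problem with the pattern above is that the lock object changes when inited is reassigned, and Boolean.FALSE is an instance shared with unrelated code. One possible repair, sketched here under assumptions (the class LazyInit, the LOCK field and the initCount counter are hypothetical and exist only for demonstration), is to synchronize on a dedicated object that is never reassigned.

```java
class LazyInit {
    // Dedicated lock object: private, final, and never shared or reassigned.
    private static final Object LOCK = new Object();
    private static boolean inited = false;

    // Counts how often init() ran; only here so the behavior can be observed.
    static int initCount = 0;

    static void ensureInit() {
        synchronized (LOCK) {
            if (!inited) {
                init();
                inited = true;
            }
        }
    }

    private static void init() {
        initCount++;
    }
}
```

Every caller now contends for the same monitor, so init() runs at most once no matter how many times ensureInit() is called.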

Vulnerability / Security

The code pattern may result in vulnerability or security issues

Examples:
  SQL: A SQL query is generated from a non-constant String
    String str = "select" + bb + " ddd" + …
    server.execute(str);
  XSS: This code directly writes an HTTP parameter to JSP output, which allows for a cross-site scripting vulnerability
    Para = request.getParameter(key);
    out.print(Para);

Performance

The code pattern may harm the performance of the software

Examples:
  SBSC: Method concatenates strings using + in a loop
    String s = "";
    for (int i = 0; i < field.length; ++i) {
      s = s + field[i];
    }
  Better: append to a StringBuffer and convert it once at the end
    StringBuffer buf = new StringBuffer();
    for (int i = 0; i < field.length; ++i) {
      buf.append(field[i]);
    }
    String s = buf.toString();

Major problem: False positives

Overall precision
  5% to 10% on open source and industry projects

Developers want to make sure they do not waste effort on a false positive

Usually more bugs than developers can fix

Solution: Bug ranking

Ranking bug categories
  Some categories are more likely to be bugs than others
  How to give a score to each category?
    Check a large number of issues in the history of the software
    How large a proportion is fixed?

Raises precision to about 30% in the top-ranked 25% of bugs
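The scoring idea can be sketched as follows: score each category by the fraction of its past warnings that developers fixed, then sort categories by that score. This is an assumed, simplified reading of the slide, not the actual Findbugs ranking; the class BugRanking, its methods and the input format (category name mapped to {reported, fixed}) are hypothetical.

```java
import java.util.*;

class BugRanking {
    // Score of a category: fraction of its historical warnings that were fixed.
    // history value: int[]{reported, fixed} (hypothetical input format).
    static double score(int[] h) {
        return h[0] == 0 ? 0.0 : (double) h[1] / h[0];
    }

    // Categories ordered from the highest fix ratio to the lowest.
    static List<String> rank(Map<String, int[]> history) {
        List<String> categories = new ArrayList<>(history.keySet());
        categories.sort(
            Comparator.comparingDouble((String c) -> score(history.get(c))).reversed());
        return categories;
    }
}
```

With made-up counts such as Correctness {100, 60}, Bad Practice {50, 10} and Performance {200, 20}, Correctness is ranked first because 60% of its past warnings were fixed.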

Findbugs

Disadvantages
  Cannot guarantee the software to be free of certain bugs
  Still involves many false positives

Advantages
  Easy to start
  Scalable
  Relatively fewer false positives
  Somewhat like testing

Has become the most popular and practical static bug detection technique

Findbugs

Demo
  Install as plugin
  Run Findbugs
  Review issues

Review of Static Bug Detection

Specification-based static bug detection
  Value specifications: symbolic execution, abstract interpretation
  Temporal specifications: model checking
  Data flow specifications: dependence graph, traversing

Pattern-based static bug detection
  Findbugs
  Bug ranking


CS4723 Course Project

Writing Test Scripts for Android Apps

Description

Learn about user-interface testing and testing Android apps

Write test scripts for the stock app: Contact Manager

Requirements

All features of the contact manager must be covered.
  For example, the contact manager also affects the names shown in incoming and outgoing calls and messages.

Once the emulator is started, the test scripts should be able to run fully automatically.

Requirements

The test script should set up all data by itself and clean up the data at the end, so that the script can be executed repeatedly.

The test script should automatically open and close the emulator logging system for recording test results.

Deliverables

Test scripts
System logs
Due: Apr 28th

Evaluation

Covered features (10 points)
No runtime error (4 points)
Running normally multiple times (4 points)
Logging (2 points)

Demo

Download and install the Android SDK
Create and start an emulator
Download and install Python
Simple monkeyrunner scripts
Mouse click recorder