Use of Models in Analysis and Design


Page 1: Use of Models in Analysis and Design

Use of Models in Analysis and Design

Sriram K. Rajamani

Rigorous Software Engineering

Microsoft Research, India

Page 2: Use of Models in Analysis and Design

Models

• Abstractions of reality

• All branches of science and engineering use models. Some examples:
  – Differential equations
  – State machines

• Models enable conquering complexity
  – Allow focus on one issue at a time, while ignoring others

Page 3: Use of Models in Analysis and Design

Models in software engineering

• Mainstream mantra:
  – “Code is truth, and only truth”

• Models are used, but not widely
  – Requirements capturing (UML):
    • Used in specialized domains: telecom, automotive, embedded systems
  – Development tools:
    • Testing and verification
  – Design
    • Model driven development

Page 4: Use of Models in Analysis and Design

This talk

• Use of models in analysis and design
  – Personal experience

• Analysis:
  – Extracting analyzable models from source code using iterative refinement

• Design:
  – My assessment of state of the art and important research problems

Page 5: Use of Models in Analysis and Design

Models in Analysis

Page 6: Use of Models in Analysis and Design

Software Validation

• Large scale reliable software is hard to build and test.

• Different groups of programmers write different components.

• Integration testing is a nightmare.

Page 7: Use of Models in Analysis and Design

Property Checking

• Programmer provides redundant partial specifications

• Code is automatically checked for consistency

• Different from proving whole program correctness
  – Specifications are not complete

Page 8: Use of Models in Analysis and Design

Interface Usage Rules

• Rules in documentation
  – Incomplete, unenforced, wordy
  – Order of operations & data access
  – Resource management

• Disobeying rules causes bad behavior
  – System crash or deadlock
  – Unexpected exceptions
  – Failed runtime checks

Page 9: Use of Models in Analysis and Design

Does a given usage rule hold?

• Checking this is computationally impossible!

• Equivalent to solving Turing’s halting problem (undecidable)

• Even restricted computable versions of the problem (finite state programs) are prohibitively expensive

Page 10: Use of Models in Analysis and Design

Why bother?

Just because a problem is undecidable, it doesn’t go away!

Page 11: Use of Models in Analysis and Design

Automatic property checking = Study of tradeoffs

• Soundness vs completeness
  – Missing errors vs reporting false alarms

• Annotation burden on the programmer

• Complexity of the analysis
  – Local vs Global
  – Precision vs Efficiency
  – Space vs Time

Page 12: Use of Models in Analysis and Design

Broad classification

• Underapproximations
  – Testing
    • After passing testing, a program may still violate a given property

• Overapproximations
  – Type checking
    • Even if a program satisfies a property, the type checker for the property could still reject it

Page 13: Use of Models in Analysis and Design

Current trend

• Confluence of techniques from different fields:
  – Model checking
  – Automatic theorem proving
  – Program analysis

• Significant emphasis on practicality

• Several new projects in academia and industry

Page 14: Use of Models in Analysis and Design

Model Checking

• Algorithmic exploration of state space of the system

• Several advances in the past decade:
  – symbolic model checking
  – symmetry reductions
  – partial order reductions
  – compositional model checking
  – bounded model checking using SAT solvers

• Most hardware companies use a model checker in the validation cycle

Page 15: Use of Models in Analysis and Design

enum {N, T, C} state[1..2]
int turn

init
  state[1] = N; state[2] = N
  turn = 0

trans
  state[i] = N & turn = 0 -> state[i] = T; turn = i
  state[i] = N & turn != 0 -> state[i] = T
  state[i] = T & turn = i -> state[i] = C
  state[i] = C & state[3-i] = N -> state[i] = N
  state[i] = C & state[3-i] != N -> state[i] = N; turn = 3-i

Page 16: Use of Models in Analysis and Design

[State graph: states of the model, written as process 1, process 2, turn]

N1,N2 turn=0
T1,N2 turn=1
T1,T2 turn=1
C1,N2 turn=1
C1,T2 turn=1
N1,T2 turn=2
T1,T2 turn=2
N1,C2 turn=2
T1,C2 turn=2

N = noncritical, T = trying, C = critical
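The "algorithmic exploration of state space" can be illustrated with a small explicit-state search over this two-process model. The sketch below is not taken from the talk: the function names are hypothetical, and it assumes the index of the "other" process is 3 - i, since the processes are numbered 1 and 2. Read literally, the transition rules may reach more states than the diagram shows.

```python
from collections import deque

N, T, C = "N", "T", "C"  # noncritical, trying, critical


def successors(state):
    """Yield every state reachable in one step of the guarded-command model."""
    procs = {1: state[0], 2: state[1]}
    turn = state[2]
    for i in (1, 2):
        other = 3 - i  # assumption: the other process of i in {1, 2}
        if procs[i] == N:
            new_i, new_turn = T, (i if turn == 0 else turn)
        elif procs[i] == T and turn == i:
            new_i, new_turn = C, turn
        elif procs[i] == C:
            new_i, new_turn = N, (turn if procs[other] == N else other)
        else:
            continue  # no rule is enabled for process i
        nxt = dict(procs)
        nxt[i] = new_i
        yield (nxt[1], nxt[2], new_turn)


def reachable(init=(N, N, 0)):
    """Breadth-first exploration of the full reachable state space."""
    seen = {init}
    frontier = deque([init])
    while frontier:
        for nxt in successors(frontier.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


states = reachable()
# Safety property: the two processes are never critical at the same time.
mutual_exclusion = all(not (s1 == C and s2 == C) for s1, s2, _ in states)
print(len(states), "reachable states; mutual exclusion holds:", mutual_exclusion)
```

Because the state space is finite, the search terminates, and the set it returns is closed under the transition relation, which is what makes it an inductive invariant of the model.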

Page 17: Use of Models in Analysis and Design

Model Checking

• Strengths
  – Fully automatic (when it works)
  – Computes inductive invariants
    • I such that $F(I) \subseteq I$
  – Provides error traces

• Weaknesses
  – Scale
  – Operates only on models
    • How do you get from the program to the model?

Page 18: Use of Models in Analysis and Design

Theorem proving

– Early theorem provers were proof checkers
  • They were built to support assertional reasoning in the Hoare-Dijkstra style
  • Cumbersome and hard to use

– Greg Nelson’s thesis in early 80s paved the way for automatic theorem provers
  • Theory of equality with uninterpreted functions
  • Theory of lists
  • Theory of linear arithmetic
  • Combination of the above!

– Automatic theorem provers based on Nelson’s work are widely used
  • ESC
  • Proof Carrying Code

Page 19: Use of Models in Analysis and Design

Theory of Equality

• Symbols: $=$, $\neq$, $f$, $g$, …

• Axiomatically defined:

$$\frac{}{E = E} \qquad \frac{E_2 = E_1}{E_1 = E_2} \qquad \frac{E_1 = E_2 \quad E_2 = E_3}{E_1 = E_3} \qquad \frac{E_1 = E_2}{f(E_1) = f(E_2)}$$

• Example of a satisfiability problem:

$$g(g(g(x))) = x \;\wedge\; g(g(g(g(g(x))))) = x \;\wedge\; g(x) \neq x$$

• Satisfiability problem decidable in $O(n \log n)$
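To make the decision procedure concrete, here is a minimal congruence-closure sketch applied to the example above. It is a naive fixpoint loop over a union-find structure, not the $O(n \log n)$ algorithm the slide refers to, and the term encoding and all function names are assumptions made for illustration.

```python
def subterms(term, acc):
    """Collect a term and all of its subterms; terms are 'x' or ('g', subterm)."""
    acc.add(term)
    if isinstance(term, tuple):
        for arg in term[1:]:
            subterms(arg, acc)
    return acc


def satisfiable(equalities, disequalities):
    """Decide a conjunction of equalities and disequalities over uninterpreted functions."""
    terms = set()
    for a, b in equalities + disequalities:
        subterms(a, terms)
        subterms(b, terms)
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    def union(a, b):
        parent[find(a)] = find(b)

    for a, b in equalities:
        union(a, b)

    # Congruence rule: equal arguments force equal applications. Repeat to a fixpoint.
    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:
        changed = False
        for s in apps:
            for t in apps:
                same_symbol = s[0] == t[0] and len(s) == len(t)
                if (same_symbol and find(s) != find(t)
                        and all(find(p) == find(q) for p, q in zip(s[1:], t[1:]))):
                    union(s, t)
                    changed = True

    # Satisfiable iff no disequality relates two terms in the same class.
    return all(find(a) != find(b) for a, b in disequalities)


def g(term, times=1):
    """Build g(g(...g(term)...)) with `times` applications."""
    for _ in range(times):
        term = ("g", term)
    return term


# The slide's example: g^3(x) = x, g^5(x) = x, g(x) != x.
print(satisfiable([(g("x", 3), "x"), (g("x", 5), "x")], [(g("x"), "x")]))
```

The two equalities force g(x) = x by congruence (g^5(x) = x and g^3(x) = x give g^2(x) = x, and then g^3(x) = g(x)), so the disequality cannot hold and the formula is unsatisfiable.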

Page 20: Use of Models in Analysis and Design

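a : array [1..len] of int;

int max := -MAXINT;
i := 1;
{ $\forall\, 1 \le j < i.\ a[j] \le max$ }
while (i <= len)
  if (a[i] > max) max := a[i];
  i := i+1;
endwhile
{ $\forall\, 1 \le j \le len.\ a[j] \le max$ }

$(\forall\, 1 \le j < i.\ a[j] \le max) \wedge (i > len) \Rightarrow (\forall\, 1 \le j \le len.\ a[j] \le max)$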
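The annotated loop can be exercised with runtime assertions. This is only a dynamic check of the invariant on concrete inputs, not a proof of the kind a theorem prover would produce. The function name is hypothetical, and `float("-inf")` stands in for -MAXINT.

```python
def max_with_invariant(a):
    """Run the annotated loop on list `a`, asserting the invariant at the loop head."""
    length = len(a)
    maximum = float("-inf")  # stands in for -MAXINT
    i = 1                    # 1-based index, as in the pseudocode
    while True:
        # Loop invariant: for all 1 <= j < i, a[j] <= max
        assert all(a[j - 1] <= maximum for j in range(1, i))
        if not (i <= length):
            break
        if a[i - 1] > maximum:
            maximum = a[i - 1]
        i += 1
    # Invariant plus exit condition (i > len) gives the postcondition.
    assert all(a[j - 1] <= maximum for j in range(1, length + 1))
    return maximum


print(max_with_invariant([3, 1, 4, 1, 5]))
```

The invariant holds vacuously on entry (no j satisfies 1 <= j < 1) and each iteration preserves it. Those are the two proof obligations a verifier would discharge symbolically.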

Page 21: Use of Models in Analysis and Design

Automatic theorem proving

• Strengths
  – Handles unbounded domains naturally
  – Good implementations for
    • equality with uninterpreted functions
    • linear inequalities
    • combination of theories

• Weaknesses
  – Hard to compute fixpoints
  – Requires inductive invariants
    • Pre and post conditions
    • Loop invariants

Page 22: Use of Models in Analysis and Design

Program analysis

• Originated in optimizing compilers
  – constant propagation
  – live variable analysis
  – dead code elimination
  – loop index optimization

• Type systems use similar analysis
  • Are the type annotations consistent?

Page 23: Use of Models in Analysis and Design

Program analysis

• Strengths
  – Works on code
  – Pointer aware
  – Integrated into compilers
  – Precision efficiency tradeoffs well studied
    • flow (in)sensitive
    • context (in)sensitive

• Weaknesses
  – Abstraction is hardwired and done by the designer of the analysis
  – Not targeted at property checking (traditionally)

Page 24: Use of Models in Analysis and Design

Model Checking, Theorem Proving and Program Analysis

• Very related to each other

• Different histories
  – different emphasis
  – different tradeoffs

• Complementary, in some ways

• Combination can be extremely powerful

Page 25: Use of Models in Analysis and Design

What is the key design challenge in a model checker for software?

It is the model!

Page 26: Use of Models in Analysis and Design

Model Checking Hardware

Primitive values are booleans

States are boolean vectors of fixed size

Models are finite state machines !!

Page 27: Use of Models in Analysis and Design

Characteristics of Software

Primitive values are more complicated
  – Pointers
  – Objects

Control flow (transition relation) is more complicated
  – Functions
  – Function pointers
  – Exceptions

States are more complicated
  – Unbounded graphs over values

Variables are scoped
  – Locals
  – Shared scopes

Much richer modularity constructs
  – Functions
  – Classes

Page 28: Use of Models in Analysis and Design

Traditional approach

[Diagram: Source code (sequential C program) → FSM (finite state machines) → model checker]

Page 29: Use of Models in Analysis and Design

SLAM

[Diagram: Source code (sequential C program: C data structures, pointers, procedure calls, parameter passing, scoping, control flow) → automatic abstraction → Boolean program (push down model) → model checker (data flow analysis implemented using BDDs). Shown for contrast: the FSM (finite state machines) route through abstraction.]

Page 30: Use of Models in Analysis and Design

An optimizing compiler doubles performance every 18 years

-Todd Proebsting

Computing power doubles every 18 months

-Gordon Moore

Page 31: Use of Models in Analysis and Design

When I use a model checker, it runs and runs for ever and never comes back… when I use a static analysis tool, it comes back immediately and says “I don’t know”

- Patrick Cousot

Page 32: Use of Models in Analysis and Design

Static Driver Verifier

[Diagram labels: Source Code; Development; Testing; Precise API Usage Rules (SLIC); Rules; Software Model Checking; Defects; 100% path coverage; Read for understanding; New API rules; Drive testing tools]

Page 33: Use of Models in Analysis and Design

SLAM – Software Model Checking

• SLAM innovations
  – boolean programs: a new model for software
  – model creation (c2bp)
  – model checking (bebop)
  – model refinement (newton)

• SLAM toolkit
  – built on MSR program analysis infrastructure

Page 34: Use of Models in Analysis and Design

SLIC

• Finite state language for stating rules
  – monitors behavior of C code
  – temporal safety properties
  – familiar C syntax

• Suitable for expressing control-dominated properties
  – e.g. proper sequence of events
  – can encode data values inside state

Page 35: Use of Models in Analysis and Design

State Machine for Locking

[State machine: Unlocked --Acq--> Locked; Locked --Rel--> Unlocked; Locked --Acq--> Error; Unlocked --Rel--> Error]

Locking Rule in SLIC

state {
  enum {Locked, Unlocked} s = Unlocked;
}

KeAcquireSpinLock.entry {
  if (s==Locked) abort;
  else s = Locked;
}

KeReleaseSpinLock.entry {
  if (s==Unlocked) abort;
  else s = Unlocked;
}
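The locking rule is a two-state safety monitor, so it can be mimicked by a small runtime monitor. The sketch below is only an analogy: SLIC rules are instrumented into the C program and checked statically by SLAM. The class and function names are hypothetical, and "Acq" and "Rel" stand for calls to KeAcquireSpinLock and KeReleaseSpinLock.

```python
class LockRuleViolation(Exception):
    """Raised where the SLIC rule would execute `abort`."""


class LockMonitor:
    def __init__(self):
        self.state = "Unlocked"  # initial value of s in the rule

    def acquire(self):  # mirrors KeAcquireSpinLock.entry
        if self.state == "Locked":
            raise LockRuleViolation("lock acquired twice")
        self.state = "Locked"

    def release(self):  # mirrors KeReleaseSpinLock.entry
        if self.state == "Unlocked":
            raise LockRuleViolation("lock released while unlocked")
        self.state = "Unlocked"


def obeys_locking_rule(events):
    """Replay a finite sequence of 'Acq'/'Rel' events against the monitor."""
    monitor = LockMonitor()
    handlers = {"Acq": monitor.acquire, "Rel": monitor.release}
    try:
        for event in events:
            handlers[event]()
    except LockRuleViolation:
        return False
    return True


print(obeys_locking_rule(["Acq", "Rel", "Acq", "Rel"]))
print(obeys_locking_rule(["Acq", "Acq"]))
```

A violation is always witnessed by a finite event sequence, which is what makes this a safety property and lets the check reduce to reachability of the abort.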

Page 36: Use of Models in Analysis and Design

The SLAM Process

[Diagram: prog. P + SLIC rule → slic → prog. P’ → c2bp → boolean program → bebop → path → newton → predicates → back to c2bp]

Page 37: Use of Models in Analysis and Design

Example: Does this code obey the locking rule?

do {
  KeAcquireSpinLock();

  nPacketsOld = nPackets;

  if (request) {
    request = request->Next;
    KeReleaseSpinLock();
    nPackets++;
  }
} while (nPackets != nPacketsOld);

KeReleaseSpinLock();

Page 38: Use of Models in Analysis and Design

Example: Model checking boolean program (bebop)

do {
  KeAcquireSpinLock();

  if (*) {
    KeReleaseSpinLock();
  }
} while (*);

KeReleaseSpinLock();

[Figure annotations: lock state at each program point (U = unlocked, L = locked, E = error)]
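A reachability search over this boolean program shows how an error trace is found. The sketch below is a plain breadth-first search over (program counter, lock state) pairs, not bebop's BDD-based algorithm. The control-flow encoding, the program-counter numbering and the function names are all assumptions made for illustration.

```python
from collections import deque

ACQ, REL, SKIP = "acquire", "release", "skip"

# Control-flow graph: pc -> list of (action, next_pc).
# Two edges out of one pc model the nondeterministic choice `*`.
CFG = {
    0: [(ACQ, 1)],              # do { KeAcquireSpinLock();
    1: [(SKIP, 2), (SKIP, 3)],  #   if (*) {
    2: [(REL, 3)],              #     KeReleaseSpinLock(); }
    3: [(SKIP, 0), (SKIP, 4)],  # } while (*);
    4: [(REL, 5)],              # KeReleaseSpinLock();
    5: [],                      # exit
}


def step_lock(locked, action):
    """Apply the locking rule; return the new lock state, or None on a violation."""
    if action == ACQ:
        return None if locked else True
    if action == REL:
        return False if locked else None
    return locked


def find_error_trace(cfg, start=(0, False)):
    """Return a shortest list of (pc, action) steps that violates the rule, or None."""
    parent = {start: None}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        pc, locked = state
        for action, next_pc in cfg[pc]:
            new_locked = step_lock(locked, action)
            if new_locked is None:
                # Rebuild the path by walking parent links back to the start state.
                trace = [(pc, action)]
                cur = state
                while parent[cur] is not None:
                    cur, edge = parent[cur]
                    trace.append(edge)
                return trace[::-1]
            nxt = (next_pc, new_locked)
            if nxt not in parent:
                parent[nxt] = (state, (pc, action))
                frontier.append(nxt)
    return None


print(find_error_trace(CFG))
```

The trace reported here acquires the lock, skips the `if`, goes around the loop and acquires again. Whether that path can occur in the original C program is a separate question, because the abstraction has dropped all data.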

Page 39: Use of Models in Analysis and Design

Example: Is error path feasible in C program? (newton)

do {
  KeAcquireSpinLock();

  nPacketsOld = nPackets;

  if (request) {
    request = request->Next;
    KeReleaseSpinLock();
    nPackets++;
  }
} while (nPackets != nPacketsOld);

KeReleaseSpinLock();

[Figure annotations: lock state at each program point (U = unlocked, L = locked, E = error)]
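The feasibility question can be illustrated on one candidate error path: acquire, skip the `if`, go around the loop, acquire again. Newton answers this by symbolic execution; the sketch below swaps in brute-force enumeration over a small value domain, which is an assumption that happens to suffice for these constraints. The step functions are hypothetical, and `request` is modelled as a 0/1 flag rather than a pointer.

```python
from itertools import product


def assign_old(env):            # nPacketsOld = nPackets;
    env["nPacketsOld"] = env["nPackets"]
    return True


def increment(env):             # nPackets++;
    env["nPackets"] += 1
    return True


def assume_request(env):        # branch taken: if (request)
    return env["request"] != 0


def assume_no_request(env):     # branch not taken
    return env["request"] == 0


def assume_loop_again(env):     # loop condition holds: nPackets != nPacketsOld
    return env["nPackets"] != env["nPacketsOld"]


def path_is_feasible(path, domain=range(-3, 4)):
    """Return True if some initial valuation can execute every step of `path`."""
    for n_packets, n_old, request in product(domain, domain, (0, 1)):
        env = {"nPackets": n_packets, "nPacketsOld": n_old, "request": request}
        if all(step(env) for step in path):  # steps run in order, stop at first failure
            return True
    return False


# Error path from the abstraction: skip the `if`, then take the back edge.
ERROR_PATH = [assign_old, assume_no_request, assume_loop_again]
# A genuine path: take the `if`, increment, then take the back edge.
LOOP_AGAIN_PATH = [assign_old, assume_request, increment, assume_loop_again]

print(path_is_feasible(ERROR_PATH), path_is_feasible(LOOP_AGAIN_PATH))
```

The error path is infeasible because `nPacketsOld == nPackets` holds after the assignment and nothing on the path changes either variable. That equality is the kind of fact refinement turns into a new predicate.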

Page 40: Use of Models in Analysis and Design

Example: Add new predicate to boolean program (c2bp)

b : (nPacketsOld == nPackets)

(right-hand column: the corresponding boolean program statement)

do {
  KeAcquireSpinLock();

  nPacketsOld = nPackets;              b = true;

  if (request) {
    request = request->Next;
    KeReleaseSpinLock();
    nPackets++;                        b = b ? false : *;
  }
} while (nPackets != nPacketsOld);     !b

KeReleaseSpinLock();

[Figure annotations: lock state at each program point (U = unlocked, L = locked, E = error)]
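The abstract statements for b can be recovered by asking, for each C statement, what values b may take afterwards given its value before. c2bp does this with weakest preconditions and a theorem prover; the sketch below substitutes enumeration over a small integer domain, and its function names are hypothetical.

```python
def b(env):
    """The predicate tracked by the boolean variable b."""
    return env["nPacketsOld"] == env["nPackets"]


def save_old(env):   # nPacketsOld = nPackets;
    env["nPacketsOld"] = env["nPackets"]


def increment(env):  # nPackets++;
    env["nPackets"] += 1


def abstract_update(statement, predicate, domain=range(-3, 4)):
    """Map each pre-state value of the predicate to its possible post-state values."""
    outcomes = {True: set(), False: set()}
    for n_packets in domain:
        for n_old in domain:
            env = {"nPackets": n_packets, "nPacketsOld": n_old}
            before = predicate(env)
            statement(env)
            outcomes[before].add(predicate(env))
    return outcomes


print(abstract_update(save_old, b))   # b is always true afterwards: b = true
print(abstract_update(increment, b))  # true -> false, false -> either: b = b ? false : *
```

When b is false before the increment, the result depends on information the abstraction does not track, so the boolean program has to use the nondeterministic value `*`.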

Page 41: Use of Models in Analysis and Design

Example: Model checking refined boolean program (bebop)

b : (nPacketsOld == nPackets)

do {
  KeAcquireSpinLock();

  b = true;

  if (*) {
    KeReleaseSpinLock();
    b = b ? false : *;
  }
} while ( !b );

KeReleaseSpinLock();

[Figure annotations: lock state (U = unlocked, L = locked, E = error) and value of b (b, !b) at each program point]

Page 42: Use of Models in Analysis and Design

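Example: Model checking refined boolean program (bebop)

b : (nPacketsOld == nPackets)

do {
  KeAcquireSpinLock();

  b = true;

  if (*) {
    KeReleaseSpinLock();
    b = b ? false : *;
  }
} while ( !b );

KeReleaseSpinLock();

[Figure annotations: lock state (U = unlocked, L = locked) and value of b (b, !b) at each program point; the error state is no longer reachable]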
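Exploring the refined boolean program confirms that the error state is no longer reachable once b is tracked. As before, this is an explicit-state sketch rather than bebop's algorithm, and the program-counter numbering and function names are assumptions made for illustration.

```python
ERROR = "ERROR"


def successors(state):
    """Yield successor states of (pc, locked, b); yield ERROR on a rule violation."""
    pc, locked, b = state
    if pc == 0:                      # KeAcquireSpinLock();
        yield ERROR if locked else (1, True, b)
    elif pc == 1:                    # b = true;
        yield (2, locked, True)
    elif pc == 2:                    # if (*)
        yield (3, locked, b)
        yield (4, locked, b)
    elif pc == 3:                    # KeReleaseSpinLock(); b = b ? false : *;
        if not locked:
            yield ERROR
        else:
            for new_b in ([False] if b else [False, True]):
                yield (4, False, new_b)
    elif pc == 4:                    # } while (!b);
        yield (5, locked, b) if b else (0, locked, b)
    elif pc == 5:                    # KeReleaseSpinLock();
        yield ERROR if not locked else (6, False, b)
    # pc == 6 is the exit and has no successors


def error_reachable(initial_states):
    """Depth-first search for ERROR from any of the given initial states."""
    seen = set(initial_states)
    stack = list(seen)
    while stack:
        for nxt in successors(stack.pop()):
            if nxt == ERROR:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


# Start unlocked, with b unconstrained.
print(error_reachable([(0, False, False), (0, False, True)]))
```

The loop can only repeat when b is false, which in this program happens only after the release inside the `if`, so every re-acquire starts from the unlocked state.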

Page 43: Use of Models in Analysis and Design

Observations about SLAM

• Automatic discovery of invariants
  – driven by property and a finite set of (false) execution paths
  – predicates are not invariants, but observations
  – abstraction + model checking computes inductive invariants (boolean combinations of observations)

• A hybrid dynamic/static analysis
  – newton executes path through C code symbolically
  – c2bp+bebop explore all paths through abstraction

• A new form of program slicing
  – program code and data not relevant to property are dropped
  – non-determinism allows slices to have more behaviors

Page 44: Use of Models in Analysis and Design

Current status of SDV

• Runs on 100s of Windows drivers

• Finds several bugs, proves several properties

• SDV now transferred from MSR to Windows division

• Used to check several DDK and inbox drivers

• Beta released at WinHEC 2005!

Page 45: Use of Models in Analysis and Design

Static Driver Verifier

Page 46: Use of Models in Analysis and Design

Static Driver Verifier

• Driver: Parallel port device driver

• Rule: Checks that driver dispatch routines do not call IoCompleteRequest(…) twice on the I/O request packet passed to it by the OS or another driver

Page 47: Use of Models in Analysis and Design
Page 48: Use of Models in Analysis and Design
Page 49: Use of Models in Analysis and Design
Page 50: Use of Models in Analysis and Design
Page 51: Use of Models in Analysis and Design
Page 52: Use of Models in Analysis and Design
Page 53: Use of Models in Analysis and Design
Page 54: Use of Models in Analysis and Design
Page 55: Use of Models in Analysis and Design
Page 56: Use of Models in Analysis and Design

Call #1

Page 57: Use of Models in Analysis and Design
Page 58: Use of Models in Analysis and Design
Page 59: Use of Models in Analysis and Design
Page 60: Use of Models in Analysis and Design
Page 61: Use of Models in Analysis and Design
Page 62: Use of Models in Analysis and Design
Page 63: Use of Models in Analysis and Design

Call #2

Page 64: Use of Models in Analysis and Design

SLAM/SDV History (with Tom Ball)

• 1999-2001
  – foundations, algorithms, prototyping
  – papers in CAV, PLDI, POPL, SPIN, TACAS

• March 2002
  – Bill Gates review

• May 2002
  – Windows committed to hire two Ph.D.s in model checking to support Static Driver Verifier

• July 2002
  – running SLAM on 100+ drivers, 20+ properties

• September 3, 2002
  – made initial release of SDV to Windows (friends and family)

• April 1, 2003
  – made wide release of SDV to Windows (any internal driver developer)

• September, 2003
  – team of six in Windows working on SDV
  – researchers moving into “consultant” role

• November, 2003
  – demonstration at Driver Developer Conference

• May, 2005
  – Beta ships at WinHEC 2005!

Page 65: Use of Models in Analysis and Design
Page 66: Use of Models in Analysis and Design

SLAM

• Boolean program model has proved itself

• Successful for domain of device drivers
  – control-dominated safety properties
  – few boolean variables needed to do proof or find real counterexamples

• Counterexample-driven refinement
  – terminates in practice
  – incompleteness of theorem prover not an issue

Page 67: Use of Models in Analysis and Design

What is hard?

• Abstracting
  – from a language with pointers (C)
  – to one without pointers (boolean programs)

• All side effects need to be modeled by copying (as in dataflow)

• Open environment problem

Page 68: Use of Models in Analysis and Design

What stayed fixed?

• Boolean program model

• Basic tool flow

• Repercussions:
  – newton has to copy between scopes
  – c2bp has to model side-effects by value-result
  – finite depth precision on the heap is all boolean programs can handle

Page 69: Use of Models in Analysis and Design

What changed?

• Interface between newton and c2bp

• We now use predicates for doing more things
  • refine alias precision via aliasing predicates
  • newton helps resolve pointer aliasing imprecision in c2bp

Page 70: Use of Models in Analysis and Design

Model Checking, Theorem Proving and Program Analysis

• Very related to each other

• Different histories
  – different emphasis
  – different tradeoffs

• Complementary, in some ways

• Combination can be extremely powerful

Page 71: Use of Models in Analysis and Design

What worked well?

• Specific domain problem

• Safety properties

• Shoulders & synergies

• Separation of concerns

• Summer interns & visitors

• Strategic partnership with Windows

Page 72: Use of Models in Analysis and Design

Predictions

• The holy grail of full program verification has been abandoned. It will probably remain abandoned

• Less ambitious tools like powerful type checkers will emerge and become more widely used

• These tools will exploit ideas from various analysis disciplines

• Tools will alleviate the “chicken-and-egg” problem of writing specifications

Page 73: Use of Models in Analysis and Design

Further Reading

See papers, slides from:

http://research.microsoft.com/slam

http://research.microsoft.com/~sriram

Page 74: Use of Models in Analysis and Design

Glossary

Model checking: Checking properties by systematic exploration of the state-space of a model. Properties are usually specified as state machines, or using temporal logics.

Safety properties: Properties whose violation can be witnessed by a finite run of the system. The most common safety properties are invariants.

Reachability: Specialization of model checking to invariant checking. Properties are specified as invariants. Most common use of model checking. Safety properties can be reduced to reachability.

Boolean programs: “C”-like programs with only boolean variables. Invariant checking and reachability are decidable for boolean programs.

Predicate: A Boolean expression over the state-space of the program, e.g. (x < 5).

Predicate abstraction: A technique to construct a boolean model from a system using a given set of predicates. Each predicate is represented by a boolean variable in the model.

Weakest precondition: The weakest precondition of a set of states S with respect to a statement T is the largest set of states from which executing T, when terminating, always results in a state in S.
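On a finite state space the weakest precondition can be computed directly from this definition. In the sketch below a statement is modelled as a function from a state to the list of its possible successor states. That encoding, the function names and the example statements are assumptions made for illustration.

```python
def weakest_precondition(statement, post, states):
    """Largest subset of `states` from which every outcome of `statement` lies in `post`."""
    post = set(post)
    return {s for s in states if all(t in post for t in statement(s))}


def increment(x):
    """x := x + 1"""
    return [x + 1]


def increment_by_one_or_two(x):
    """Nondeterministic choice between x := x + 1 and x := x + 2."""
    return [x + 1, x + 2]


states = range(10)
x_less_than_5 = {x for x in states if x < 5}  # the predicate (x < 5) as a set of states

print(sorted(weakest_precondition(increment, x_less_than_5, states)))
print(sorted(weakest_precondition(increment_by_one_or_two, x_less_than_5, states)))
```

A state with no successors satisfies the condition vacuously, which matches the "when terminating" clause of the definition.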