
An Overview of the Saturn Project

The Three-Way Trade-Off

• Precision
– Modeling programs accurately enough to be useful

• Scalability
– Saying anything at all about large programs

• Human Effort
– How much work must the user do?
– Either giving specifications, or interpreting results

Today’s focus

Not so much about this . . .


Precision

Intraprocedural analysis with minimal abstraction.

Primary abstraction is done at function boundaries.

[Figure: the body of int f(int x) { . . . } is translated to a formula Ff; the functions g and h are represented by their abstractions A(Fg) and A(Fh); f itself is abstracted to A(Ff), giving the set of abstractions [A(Ff), A(Fg), A(Fh)].]

Scalability

• Design constraint: SAT formula size ~ function size

• Analyze one function at a time

• Parallel implementation
– Server sends functions to clients to analyze
– Typically use 50-100 cores to analyze Linux

Summaries

• Abstract at function boundaries
– Compute a summary for function’s behavior

• Summaries should be small
– Ideally linear in the size of the function’s interface

• Summaries are our primary form of abstraction
– Saturn delays abstraction to function boundaries

Slogan: Analysis design is summary design!

Expressiveness

• Analyses written in Calypso

• Logic programs
– Express traversals of the program
– E.g., backwards/forwards propagation

• Constraints
– For when we don’t know traversal order

• Written ~40,000 lines of Calypso code

Availability

• An open source project
– BSD license

• All Calypso code available for published experiments

saturn.stanford.edu

People

• Alex Aiken
• Suhabe Bugrara
• Isil Dillig
• Thomas Dillig
• Brian Hackett
• Peter Hawkins
• Yichen Xie (past)

Outline

• Saturn overview

• An example analysis
– Intraprocedural
– Interprocedural

• What else can you do?

• Survey of results

Saturn Architecture

[Architecture diagram. Components shown: C Program; C Frontend; C Syntax Databases; Calypso Interpreter; Calypso analyses; Constraint Solvers; Summary Databases; Summary Reports; UI]

Parsing and C Frontend

[Pipeline diagram: Source Code → Build Interceptor → Preprocessed Source Code → CIL frontend → Abstract Syntax Tree Databases. Other frontends are possible in place of CIL.]

Calypso

• General purpose logic programming language
– Pure
– Prolog-like syntax

• Bottom-up evaluation
– Magic sets transformation

• Also a (minor) moon of Saturn

Helpful Features

• Strong static type and mode checking

• Permanent data (sessions)
– Stored as Berkeley DB databases
– Sessions are just a named bundle of predicates

• Support for unit-at-a-time analysis

Extensible Interpreter

[Diagram: the Logic Program Interpreter with its extension packages: SAT Solver (#sat predicate, …), LP Solver, DOT graph package, UI package]

Scalability

• Interpreter is not very efficient

• OK, it’s slow

• But can run distributed analyses
– 50-100 CPUs

• Scalability is more important than raw speed
– Can run intensive analyses of the entire Linux kernel (>6 MLOC) in a few hours

Cluster Architecture

[Cluster diagram. Components shown: Master Node; Worker Node 1 through Worker Node 100; Calypso DB instances; Databases]

Job Scheduling

Dynamically track dependencies between jobs

Rerun jobs if new dependencies found

• Optimistic concurrency control

Job = a function body

Iterate to fixpoint for circular dependencies
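
The scheduling policy on this slide can be sketched as a small worklist loop. This is a single-process illustration under stated assumptions, not Saturn's scheduler: `run_to_fixpoint`, the `analyze` callback, and the toy call graph `CALLS` are hypothetical names introduced here, and a "summary" is simply the set of functions reachable through calls.

```python
from collections import deque

def run_to_fixpoint(functions, analyze):
    """Worklist scheduler: one job per function body.

    `analyze(fn, summaries)` is a hypothetical callback returning
    (new_summary, functions_whose_summaries_it_read).
    """
    summaries = {fn: frozenset() for fn in functions}
    readers = {fn: set() for fn in functions}  # fn -> jobs that read fn's summary
    queue = deque(functions)
    queued = set(functions)
    while queue:
        fn = queue.popleft()
        queued.discard(fn)
        new_summary, deps = analyze(fn, summaries)
        for dep in deps:                        # dependencies are discovered dynamically
            readers[dep].add(fn)
        if new_summary != summaries[fn]:
            summaries[fn] = new_summary
            for reader in readers[fn]:          # rerun jobs that used the stale summary
                if reader not in queued:
                    queue.append(reader)
                    queued.add(reader)
    return summaries

# Toy call graph (illustrative): f and g are mutually recursive.
CALLS = {"main": ["f"], "f": ["g"], "g": ["f"], "h": []}

def reachable_callees(fn, summaries):
    """Toy analysis: a function's summary is everything it can transitively call."""
    callees = CALLS[fn]
    result = set(callees)
    for callee in callees:
        result |= summaries[callee]
    return frozenset(result), callees

result = run_to_fixpoint(list(CALLS), reachable_callees)
```

Because jobs run against whatever summaries exist at the time and are rerun when those summaries change, the circular dependency between `f` and `g` settles after a few iterations instead of requiring a fixed analysis order.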

Calypso Analyses

[Diagram of the analysis stack. Components shown: C Syntax Predicates; CFG Construction; Memory Model; Alias Analysis; Function Pointer Analysis; NULL checker; Typestate verifier; Constraint Solvers]

The Paradigmatic Locking Analysis

Check that a thread does not:
– acquire the same lock twice
– release the lock twice

Otherwise the application may deadlock or crash.

Specification

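[State machine diagram. States: unlocked, locked, error. Transitions: unlocked –lock→ locked; locked –unlock→ unlocked; locked –lock→ error; unlocked –unlock→ error]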

Basic Setup

• We assume
– one locking function lock(l)
– one unlocking function unlock(l)

• We analyze one function at a time
– produce a locking summary describing the FSM transitions associated with a given lock
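
The locking specification can be written down as a small transition table. The sketch below is illustrative only: the names `TRANSITIONS`, `step`, and `run` are introduced here, and treating `error` as an absorbing state is an assumption the slides do not state.

```python
LOCKED, UNLOCKED, ERROR = "locked", "unlocked", "error"

# (state, operation) -> next state
TRANSITIONS = {
    (UNLOCKED, "lock"): LOCKED,
    (LOCKED, "unlock"): UNLOCKED,
    (LOCKED, "lock"): ERROR,      # acquiring the same lock twice
    (UNLOCKED, "unlock"): ERROR,  # releasing the lock twice
}

def step(state, op):
    """Apply one lock/unlock operation (assumption: error is absorbing)."""
    if state == ERROR:
        return ERROR
    return TRANSITIONS[(state, op)]

def run(state, ops):
    """Run a sequence of operations on a single lock from a given start state."""
    for op in ops:
        state = step(state, op)
    return state
```

For example, `run(UNLOCKED, ["lock", "unlock"])` ends in `unlocked`, while starting the same sequence from `locked` ends in `error`.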

An Example Function & Summary

f( . . ., lock *L, . . .) {
  lock(L);
  . . .
  unlock(L);
}

L: unlocked -> unlocked
   locked -> error

• Summaries are input state -> output state
– The net effect of the function on the lock

• Summary size is independent of function size
– Bounded by the square of the number of states
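
As a rough illustration of why the summary for f comes out as shown, the sketch below composes the effects of `lock` and `unlock` for a straight-line body. It ignores branches and guards entirely, treats `error` as absorbing (an assumption), and uses names (`PRIMITIVES`, `compose`, `summarize`) invented for this example.

```python
# Net effect of each primitive on a lock, as input state -> output state.
PRIMITIVES = {
    "lock":   {"unlocked": "locked",   "locked": "error"},
    "unlock": {"locked":   "unlocked", "unlocked": "error"},
}

def compose(first, second):
    """Summary of running `first` then `second` (assumption: error is absorbing)."""
    return {s0: ("error" if s1 == "error" else second[s1])
            for s0, s1 in first.items()}

def summarize(ops):
    """Summarize a straight-line sequence of lock/unlock calls on one lock."""
    summary = {"locked": "locked", "unlocked": "unlocked"}  # empty body: identity
    for op in ops:
        summary = compose(summary, PRIMITIVES[op])
    return summary

f_summary = summarize(["lock", "unlock"])
```

However long the body is, the result has at most one entry per input state, which is why the summary stays small.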

Lock States

type lockstate ::= locked | unlocked | error.

• Predicates to describe lock states on nodes and edges of the CFG:

predicate node_state(P:pp,L:t_trace,S:lockstate,G:g_guard).

predicate edge_state(P:pp,L:t_trace,S:lockstate,G:g_guard).

– Program point pp is a unique id for each point in the program
– Trace t_trace is a unique name for a memory location
– Guard g_guard is a boolean constraint

The Intraprocedural Analysis

1. Initialize lock states at function entry

2. Join operator:
– Combine edges to produce successor’s node_state

3. Transfer functions for every primitive:
– assignments
– tests
– function calls

Initializing a Lock

• Use a fresh boolean variable LG

• Interpretation:
– LG is true ⇒ L is locked
– ¬LG is true ⇒ L is unlocked

• Enforces that L cannot be both locked and unlocked simultaneously

Notation

(lock, state, guard) attached to program point P

At program point P, the lock is in the given state if the guard is true.

Initialization Rules

node_state(P0,L,locked,LG) :-
    entry(P0), is_lock(L), fresh_variable(L, LG).

node_state(P0,L,unlocked,UG) :-
    entry(P0), node_state(P0,L,locked,LG),
    #not(LG, UG).

fresh_variable allocates a new boolean variable associated with lock L.

[Figure: function f( . . ., lock *L, . . .) { . . . } with the facts (L, locked, LG) and (L, unlocked, UG) at entry point P0]


Joins

node_state(P,L,S,G) :-
    edge_state(P,L,S,_),
    \/edge_state(P,L,S,EG):#or_all(EG,G).

Note: There is no abstraction in the join . . .

[Figure: the two branches of if (…) carry (L, locked, F1) and (L, locked, F2); after the join the fact is (L, locked, F1 ∨ F2)]
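
The join can be mimicked outside Calypso by grouping edge facts by lock and state and taking the disjunction of their guards. In this sketch guards are modeled as Python predicates over an assignment of boolean variables, which is a simplification of Saturn's guard representation; `join`, `f1`, and `f2` are names made up for the example.

```python
from collections import defaultdict

def join(edge_facts):
    """Combine (lock, state, guard) facts on incoming edges into node facts.

    Facts for the same lock and state are merged by OR-ing their guards;
    nothing is dropped or approximated.
    """
    grouped = defaultdict(list)
    for lock, state, guard in edge_facts:
        grouped[(lock, state)].append(guard)
    return {
        key: (lambda env, gs=tuple(guards): any(g(env) for g in gs))
        for key, guards in grouped.items()
    }

f1 = lambda env: env["a"]  # guard F1 on one incoming edge
f2 = lambda env: env["b"]  # guard F2 on the other incoming edge
node = join([("L", "locked", f1), ("L", "locked", f2)])
```

The joined guard for `("L", "locked")` holds exactly when F1 or F2 holds, so path information from both branches survives the merge.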


Assignments

Assignments do not affect lock state:

edge_state(P1,L,S,G) :-
    assign(P0,P1,_), node_state(P0,L,S,G).

[Figure: the statement X = E; lies between program points P0 and P1; the fact (L, S, G) at P0 is carried unchanged to P1]

Interprocedural Analysis Basics

• Function summaries are the building blocks of interprocedural analysis.

• Generating a function summary requires:
– Predicates encoding relevant facts
– A session to store these predicates

Interprocedural Analysis Outline

1. Generating function summaries

2. Using function summaries
– How do we retrieve the summary of a callee?
– How do we map facts associated with a callee to the namespace of the currently analyzed function?

Summary Declaration

session sum_locking(FN:string) containing[lock_trans].

predicate lock_trans(L: t_trace, S0: lockstate, S1: lockstate).

The session declaration creates a persistent database sum_locking, keyed by function name, holding lock_trans facts.

Summary Generation: Primitives

Summaries for lock and unlock:

sum_locking("lock")->lock_trans(*arg0,locked,error) :- .

sum_locking("lock")->lock_trans(*arg0,unlocked,locked) :- .

sum_locking("unlock")->lock_trans(*arg0,unlocked,error) :- .

sum_locking("unlock")->lock_trans(*arg0,locked,unlocked) :- .

*arg0 is the memory location modified by lock and unlock.

Summary Generation: Other Functions

sum_locking(F)->lock_trans(L, S0, S1) :-
    current_function(F),
    entry(P0), node_state(P0, L, S0, G0),
    exit(P1), node_state(P1, L, S1, G1),
    #and(G0, G1, G), guard_satisfiable(G).

[Figure: function F( . . ., lock *L, . . .) { . . . } with the fact (L, S0, G0) at entry point P0 and the fact (L, S1, G1) at exit point P1. If SAT(G0 ∧ G1), then the summary records F: S0 → S1.]
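
The summary-generation rule pairs each entry fact with each exit fact and keeps the pair only when the two guards are jointly satisfiable. The sketch below imitates that with a brute-force satisfiability check standing in for the SAT solver; guards are Python predicates, and `satisfiable`, `lock_summary`, and the example facts are hypothetical, not Saturn code.

```python
from itertools import product

def satisfiable(guard, variables):
    """Brute-force stand-in for a SAT solver call (exponential; illustration only)."""
    return any(guard(dict(zip(variables, values)))
               for values in product([False, True], repeat=len(variables)))

def lock_summary(entry_facts, exit_facts, variables):
    """Emit (lock, S0, S1) whenever the entry and exit guards can hold together."""
    summary = set()
    for lock0, s0, g0 in entry_facts:
        for lock1, s1, g1 in exit_facts:
            if lock0 == lock1 and satisfiable(
                    lambda env: g0(env) and g1(env), variables):
                summary.add((lock0, s0, s1))
    return summary

# Facts for a function whose body is lock(L); unlock(L);
# LG is the fresh variable meaning "L is locked at entry".
entry = [("L", "locked",   lambda env: env["LG"]),
         ("L", "unlocked", lambda env: not env["LG"])]
exit_ = [("L", "error",    lambda env: env["LG"]),
         ("L", "unlocked", lambda env: not env["LG"])]

summary = lock_summary(entry, exit_, ["LG"])
```

The pairs locked → unlocked and unlocked → error are rejected because their combined guards, LG ∧ ¬LG, are unsatisfiable.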

Summary Application Rule

call_transfer(I, L, S0, S1, G) :-
    direct_call(I, F), call(P0, _, I),
    sum_locking(F)->lock_trans(CL, S0, S1),
    instantiate(s_call{I}, P0, CL, L, G).

[Figure: caller G( . . .) { F(. . .) } with callee summary F: S0 → S1. The fact (S0, L, G) at P0 before the call becomes (S1, L, G) after the call.]
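
Applying a summary at a call site amounts to renaming the callee's lock into the caller's namespace and replacing each matching input state with the summarized output state. The sketch below is a deliberately simplified model: `arg_map` is a plain dictionary standing in for `instantiate`, the guard that `instantiate` produces is ignored, guards are opaque strings, and facts with no matching transition are passed through unchanged (an assumption made for this example).

```python
def apply_summary(callee_summary, arg_map, caller_facts):
    """Transfer caller facts across a call using the callee's lock_trans summary.

    callee_summary: set of (callee_lock, s0, s1)
    arg_map: callee lock name -> caller lock name (stand-in for `instantiate`)
    caller_facts: list of (lock, state, guard) holding before the call
    """
    after = []
    for lock, state, guard in caller_facts:
        outputs = [s1 for callee_lock, s0, s1 in callee_summary
                   if arg_map.get(callee_lock) == lock and s0 == state]
        if outputs:
            after.extend((lock, s1, guard) for s1 in outputs)
        else:
            after.append((lock, state, guard))  # assumption: unaffected by the call
    return after

f_summary = {("*arg0", "locked", "error"), ("*arg0", "unlocked", "unlocked")}
before = [("m", "locked", "G1"), ("m", "unlocked", "G2"), ("n", "locked", "G3")]
after_call = apply_summary(f_summary, {"*arg0": "m"}, before)
```

Here the caller passes its lock `m` as the callee's `*arg0`, so facts about `m` are rewritten while the unrelated lock `n` keeps its state.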

Applications

• Bug finding

• Verification

• Software Understanding


Saturn Bug Finding

• Early work
– Locking: Scalable Error Detection using Boolean Satisfiability. POPL 2005
– Memory leaks: Context- and Path-Sensitive Memory Leak Detection. FSE 2005
– Scripting languages: Static Detection of Security Vulnerabilities in Scripting Languages. 15th USENIX Security Symposium, 2006

• Recent work
– Inconsistency checking: Static Error Detection Using Semantic Inconsistency Inference. PLDI 2007

Examples: Null pointer dereferences

Application       KLOC   Warnings   Bugs   False Alarms   FA Rate
Openssl-0.9.8b     339         55     47              6     11.3%
Samba-3.0.23b      516         68     46             19     29.2%
Openssh-4.3p2      155          9      8              1     11.1%
Pine-4.64          372        150    119             28     19.0%
Mplayer-1.0pre8    762        119     89             28     23.9%
Sendmail-8.13.8    365          9      8              1     11.1%
Linux-2.6.17.1    6200        373    299             66     18.1%
Total             8793        783    616            149     19.5%
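
The slides do not define the FA Rate column. The reported percentages are consistent with false alarms divided by the sum of bugs and false alarms, rather than by the number of warnings; that reading is an inference from the numbers. The sketch below recomputes the column under that assumption, with `fa_rate` and `ROWS` being names introduced here.

```python
def fa_rate(bugs, false_alarms):
    """False-alarm rate in percent, assuming FA / (bugs + FA)."""
    return 100.0 * false_alarms / (bugs + false_alarms)

# (bugs, false alarms, FA rate reported on the slide)
ROWS = {
    "Openssl-0.9.8b":  (47, 6, 11.3),
    "Samba-3.0.23b":   (46, 19, 29.2),
    "Openssh-4.3p2":   (8, 1, 11.1),
    "Pine-4.64":       (119, 28, 19.0),
    "Mplayer-1.0pre8": (89, 28, 23.9),
    "Sendmail-8.13.8": (8, 1, 11.1),
    "Linux-2.6.17.1":  (299, 66, 18.1),
    "Total":           (616, 149, 19.5),
}

recomputed = {name: round(fa_rate(bugs, fa), 1)
              for name, (bugs, fa, _) in ROWS.items()}
```

Under this definition every recomputed value matches the reported one to one decimal place.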

Lessons Learned

• Saturn-based tools improve bug-finding
– Multiple times more bugs than previous results
– Lower false positive rate

• Why?
– “Sounder” than previous bug finding tools: bit-level modeling, handling casts, aliasing, etc.
– Precise: fully intraprocedurally path-sensitive, partially interprocedurally path-sensitive

Lessons Learned (Cont.)

• Design of function summary is key to scalability and precision

• Summary-based analysis only looks at the relevant parts of the heap for a given function

• Programmers write functions with simple interfaces


Saturn Verification

• Unchecked user pointer dereferences
– Important OS security property
– Also called “probing” or “user/kernel pointers”

• Precision requirements
– Context-sensitive
– Flow-sensitive
– Field-sensitive
– Intraprocedurally path-sensitive

Current Results for Linux-2.6.1

• 6.2 MLOC with 91,543 functions

• Verified 616 / 627 system call arguments
– 98.2%
– 11 false alarms

• Verified 851,686 / 852,092 dereferences
– 99.95%
– 406 false alarms

Preliminary Lessons Learned

• Bug finders can be sloppy: ignore functions or points-to edges that inhibit scalability or precision

• Soundness is substantially more difficult than finding bugs

• Lightweight, sparsely placed annotations
– Have programmers add some information
– Makes verification tractable
– Only 22 annotations needed for user pointer analysis

Saturn for Software Understanding

• A program analysis is a code search engine

• Generic question: Do programmers ever do X?
– Write an analysis to find out
– Run it on lots of code
– Classify the results
– Write a paper . . .

Examples

• Aliasing is used in very stylized ways, at least in C
– Cursors into data structures
– Parent/child pointers
– And 7 other idioms
– How is Aliasing Used in Systems Software? FSE 2006

• Do programmers take the address of function pointers?
– Answer: Almost never.
– Allows simpler analysis of function pointers

Other Things We’ve Thought About

• Shape analysis
– We notice the lack of shape information

• Interprocedural path-sensitivity
– Needed for some common programming patterns

• Proving correctness of Saturn analyses

Related Work

• Lots
– All bug finding and verification tools of the last 10 years

• Particularly, though
– Systems using logic programming (bddbddb)
– ESP
– Metal
– CQual
– Blast

saturn.stanford.edu
