1

The logic-automaton connection and applications

Mona project

Initiated at the University of Aarhus (BRICS)

Google: Mona

Michael I. Schwartzbach Anders Møller

Nils Klarlund

2

Overview

• Introduction
• Pointer reasoning and start of project
• Verification of protocols
• Fast parsing with declarative constraints over trees
• What is WS1S?
• The Mona tool in use
• What else has been accomplished

3

Automata

• Regular expressions and automata
  – Useful, right?
  – Solves problems such as
    • Expressing text patterns
    • Expressing paths in graphs
• Are regular languages limiting our use of automata?
  – With the complement operator, regular expressions emulate propositional logic!
  – But no quantification with REs?

4

QPL

• Quantified Propositional Logic is fundamental in verification

• Boil your problem down to QPL, then solve
• A compositional framework for “modeling” phenomena (albeit of limited expressive power)
• How to solve?
  – Use BDDs
  – But they are automata
    • Albeit not general ones: they are acyclic

5

Mona in Essence: Extend QPL, Tie to Automata, Solve More Problems!

• WS1S is the answer
  – It ties the class of all automata to a logic
  – It becomes a vehicle for the operations

• Cross product,

• Determinization,

• Subset construction,

• Projection,

• Complementation

6

That Verification Problems or Data Types or Invariants Can Be Expressed As Regular Languages: Not a New Idea

• E.W. Dijkstra (parameterized verification)

• N.D. Jones & S. Muchnick (tree grammars)

• A. Gupta (parameterized hardware)

• Early 1960s: logic and automata for describing the temporal behavior of sequential circuits

7

Motivation I: Pointers

• Pointer manipulation in programs is very difficult to get right
• It shouldn’t be too difficult to verify that shapes in the heap stay invariant over a few operations?
• No dangling pointers, all allocated memory accessible, no sharing of structures supposed to be separate
• WF(store)
  – Let X be the nodes reachable from x and Y the nodes reachable from y
  – Intersection of X and Y is empty
  – Union of X and Y is all of the store

[Figure: a store with program variables x and y and their reachable node sets]
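The WF(store) condition above can be checked directly on a concrete store. The following is a minimal sketch, assuming a store is a Python dict from cell names to their single `next` cell (`None` for nil); the names `reachable` and `well_formed` are illustrative and do not come from the Mona project.

```python
def reachable(store, start):
    """Return the set of cells reachable from `start` by following next-pointers."""
    seen = set()
    cell = start
    while cell is not None and cell not in seen:
        seen.add(cell)
        cell = store[cell]
    return seen


def well_formed(store, x, y):
    """WF(store): X and Y are disjoint and together cover the whole store."""
    X = reachable(store, x)
    Y = reachable(store, y)
    return X.isdisjoint(Y) and (X | Y) == set(store)


separate = {1: 2, 2: None, 3: 4, 4: None}   # two disjoint lists
shared = {1: 2, 2: 4, 3: 4, 4: None}        # cell 4 is shared by both lists
leaked = {1: None, 2: None, 3: None}        # cell 3 is unreachable

print(well_formed(separate, 1, 3))
print(well_formed(shared, 1, 3))
print(well_formed(leaked, 1, 2))
```

This only tests one store at a time. The point of the WS1S encoding is to establish the property for all stores of any size.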

8

Floyd-Hoare Logic of Pointers

• (WF(S) & S → S’) => WF(S’)
  – Where → is the transition relation that reflects pointer surgery
• Is this even decidable?
• Yes, because we can formulate it in WS1S through predicate transformations
• So, let’s build (1994)
  – A decision procedure for WS1S
  – A tool for translating WF predicates to WS1S
  – Checking that the implication holds takes 4 hours to calculate!

9

Additional work

• Automatic Verification of Pointer Programs using Monadic Second-order Logic [PLDI ’97]

• Pointer Assertion Logic Engine [PLDI ’01]

• Related work on shape analysis

• We still didn’t explain WS1S

10

Motivation II: Parameterized Verification: Sliding Window Protocol (with Mark A. Smith)

• A sequence number is used as an acknowledgement

• The window size is the max. number of messages in transit

• We model
  – Unbounded queues
  – Unbounded channels
  – Dynamic window size

11

Sliding window protocol

12

We must prove: What goes out is what comes in

• Variables (D is a finite domain)

  SendBuf: Seq[D] := {},
  hSendBuf: Int := 1,
  W: Int := choose n where (n > 0),
  RetranBuf: Seq[D] := {},
  hRetranBuf: Int := 1,
  readyToSend: Bool := false,

• Some variables flex in one, some even in two dimensions

  segment: D,
  seqNum: Int := 0,
  RcvBuf: Seq[D] := {},
  hRcvB: Int := 1,
  sendAck: Bool := false,
  temp: D,
  transitSR: Map[Int, Mset[D]] := empty,
  transitRS: Map[Int, Mset[A]] := empty

13

What kind of code?

internal prepareNewSeg(d)
  pre readyToSend = false
      /\ hSendBuf <= len(SendBuf)
      /\ d = SendBuf[hSendBuf]
      /\ len(RetranBuf) < W
  eff RetranBuf := RetranBuf |- d;
      seqNum := hSendBuf;
      hSendBuf := hSendBuf + 1;
      readyToSend := true;
      segment := d

internal sendpktSR(d)
  pre readyToSend = true /\ d = segment
  eff readyToSend := false;
      transitSR := update(transitSR, seqNum, insert(d, transitSR[seqNum]))
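To make the two transitions concrete, here is a hedged Python rendering of the sender side only. It assumes 1-based head indices as in the slide, models `Mset[D]` with `collections.Counter`, and fixes the window size `W` by a constructor argument instead of a nondeterministic choice. The class and method names are illustrative, not part of the original specification.

```python
from collections import Counter, defaultdict


class Sender:
    def __init__(self, send_buf, window):
        self.send_buf = list(send_buf)           # SendBuf
        self.h_send_buf = 1                      # hSendBuf (1-based head index)
        self.w = window                          # W, assumed > 0
        self.retran_buf = []                     # RetranBuf
        self.ready_to_send = False               # readyToSend
        self.segment = None                      # segment
        self.seq_num = 0                         # seqNum
        self.transit_sr = defaultdict(Counter)   # transitSR: Map[Int, Mset[D]]

    def prepare_new_seg_enabled(self):
        """Precondition of prepareNewSeg (d is determined by hSendBuf)."""
        return (not self.ready_to_send
                and self.h_send_buf <= len(self.send_buf)
                and len(self.retran_buf) < self.w)

    def prepare_new_seg(self):
        if not self.prepare_new_seg_enabled():
            raise ValueError("prepareNewSeg is not enabled")
        d = self.send_buf[self.h_send_buf - 1]   # d = SendBuf[hSendBuf]
        self.retran_buf.append(d)                # RetranBuf := RetranBuf |- d
        self.seq_num = self.h_send_buf
        self.h_send_buf += 1
        self.ready_to_send = True
        self.segment = d

    def sendpkt_sr(self):
        if not self.ready_to_send:
            raise ValueError("sendpktSR is not enabled")
        self.ready_to_send = False
        # insert the segment into the multiset in transit under seqNum
        self.transit_sr[self.seq_num][self.segment] += 1


sender = Sender("ab", window=1)
sender.prepare_new_seg()
sender.sendpkt_sr()
print(dict(sender.transit_sr[1]), sender.prepare_new_seg_enabled())
```

With a window of 1, the second segment cannot be prepared until an acknowledgement (not modeled here) shrinks the retransmission buffer.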

14

We Note

• Operations work on both ends of linear lists

• We maintain pointers and length information

• The rest is nitty-gritty, boring stuff

• How is this related to regular languages?

• The system as it evolves over time is not a regular language!

• Are configurations regular languages?
  – Yes, if everything stretches in one dimension and indexing operations are not ‘too complicated’
  – WS1S will make this precise
• Do changes to configurations, that is, operations, preserve regularity?
  – WS1S again can help us understand

15

Motivation III: YakYak: A Fast Parser With Constraints on Parse Trees

• Logic notations for parsing

• > 69 different Yacc-like parsers available…

• So what’s new: a concise, declarative way of specifying constraints on parse trees

• That also yields a fast parser

16

YakYak

• Consider HTML
  – An a element denotes a text anchor
  – Text is in bold if inside a bold element
• Here are two constraints
  – “For all positions p with p an a element there is not a position q below p that is an a element”
  – “If any part of a text within an a element is in bold then all anchor texts must be in bold”
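The two constraints can be read as first-order statements over tree positions. The sketch below checks them by direct traversal of a parse tree. It assumes a hypothetical `Node` type with a `tag` and a list of children, the tag `"b"` for bold elements, and `"#text"` for text leaves. None of these names come from YakYak, which compiles such constraints into a tree automaton instead of traversing the tree like this.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)


def descendants(node):
    """Yield every position strictly below `node`."""
    for child in node.children:
        yield child
        yield from descendants(child)


def all_nodes(root):
    yield root
    yield from descendants(root)


def no_nested_anchors(root):
    """For all p that are a elements, no q below p is an a element."""
    return all(q.tag != "a"
               for p in all_nodes(root) if p.tag == "a"
               for q in descendants(p))


def anchor_texts(node, in_a=False, in_b=False):
    """Yield, for each text node inside an a element, whether it is in bold."""
    in_a = in_a or node.tag == "a"
    in_b = in_b or node.tag == "b"
    if node.tag == "#text" and in_a:
        yield in_b
    for child in node.children:
        yield from anchor_texts(child, in_a, in_b)


def bold_anchors_consistent(root):
    """If any anchor text is bold, then all anchor texts are bold."""
    flags = list(anchor_texts(root))
    return not any(flags) or all(flags)


good = Node("p", [Node("a", [Node("b", [Node("#text")])]),
                  Node("b", [Node("a", [Node("#text")])])])
nested = Node("a", [Node("b", [Node("a", [Node("#text")])])])
mixed = Node("p", [Node("a", [Node("b", [Node("#text")])]),
                   Node("a", [Node("#text")])])

print(no_nested_anchors(good), bold_anchors_consistent(good))
print(no_nested_anchors(nested), bold_anchors_consistent(mixed))
```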

17

How to Turn Such Constraints Into Automata?

• Note we need tree automata
• XPath formulation possible (if parse tree was XML)
• XML parse tree would be slow
• XPath query evaluation would be slow
• Goal: one transition per production per constraint in a pre-computed automaton that works bottom-up

• We need to go from formulas to tree automata!

18

What Is WS1S

• Weak Second-order theory of 1 Successor
• First-order terms t
  – 0, p, t’ + 1
• Second-order terms T
  – Empty, P, T union T’, T intersection T’
• Formulas φ
  – ~φ, φ & φ’, φ v φ’
  – t = t’, t < t’, t in T, b
  – ex2 P: φ
  – ex1 p: φ
  – ex0 b: φ

19

A. Meyer’s Result

• Deciding WS1S is non-elementary
• No finite stack of exponentials can limit the growth
• Each quantifier bumps you up one exponential
• Recall: first experiment for very simple example took 4 hours to complete
• Some people have suggested that we should have given up at this point
• For more on this viewpoint
  – Google: Klarlund madman

20

Example

var2 P,Q;
P\Q = {0,4} union {1,2};

var1 x;
var0 A;

ex2 Q: x in Q
  & (all1 q: (0 < q & q <= x) =>
       (q in Q => q - 1 notin Q)
     & (q notin Q => q - 1 in Q))
  & 0 in Q;

A & x notin P;

21

Mona Output

A counter-example of least length (1) is:

P    X X
Q    X X
x    X 1
A    0 X

P = {}, Q = {}, x = 0, A = false

A satisfying example of least length (7) is:

P    X 1110100
Q    X 000X0XX
x    X 0000001
A    1 XXXXXXX

P = {0,1,2,4}, Q = {}, x = 6, A = true
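The reported examples can be cross-checked by brute force. The sketch below evaluates the three formulas of the example directly in Python. It assumes that the bound variable Q in `ex2 Q` only needs to range over subsets of {0, ..., x}, which is sound here because the quantified formula mentions no position above x. The function names are illustrative. Mona itself works by automaton construction, not enumeration.

```python
from itertools import combinations


def subsets(universe):
    """Enumerate all subsets of a finite collection."""
    items = list(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)


def alternates_to(x):
    """ex2 Q: x in Q & 0 in Q & membership in Q alternates on 1..x."""
    for Q in subsets(range(x + 1)):
        if x in Q and 0 in Q and all(
            (q in Q) != (q - 1 in Q) for q in range(1, x + 1)
        ):
            return True
    return False


def satisfies(P, Q, x, A):
    """Conjunction of the three top-level formulas of the example."""
    return (P - Q == {0, 4} | {1, 2}) and alternates_to(x) and A and x not in P


P, Q = {0, 1, 2, 4}, set()
smallest = next(x for x in range(20) if satisfies(P, Q, x, True))
print(smallest)                          # 6
print(satisfies(set(), set(), 0, False)) # False, matching the counter-example
```

The quantified formula holds exactly when x is even, so with P = {0,1,2,4} the least even position outside P is 6, in agreement with the satisfying example.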

22

What Was Calculated

23

A BDD Represents a Boolean Function of Boolean Variables

• BDD = Binary Decision Diagram
• x1 or (x2 iff x3)
• Often the diagram is very sparse
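As a rough illustration of why the diagram is small, the following sketch builds a reduced ordered BDD for x1 or (x2 iff x3) under the variable order x1, x2, x3. It assumes a naive construction by full Shannon expansion, which is exponential and only suitable for tiny examples, and it uses node ids 0 and 1 for the terminals. Real BDD packages, including the one inside Mona, use more refined algorithms.

```python
def build_bdd(f, num_vars):
    """Return (root, nodes) where nodes maps id -> (var, low, high)."""
    unique = {}   # (var, low, high) -> node id, so equal subgraphs are shared
    nodes = {}

    def mk(var, low, high):
        if low == high:                # redundant test: skip the node
            return low
        key = (var, low, high)
        if key not in unique:
            node_id = len(unique) + 2  # ids 0 and 1 are the terminals
            unique[key] = node_id
            nodes[node_id] = key
        return unique[key]

    def expand(assignment):
        index = len(assignment)
        if index == num_vars:
            return int(bool(f(*assignment)))
        low = expand(assignment + (False,))
        high = expand(assignment + (True,))
        return mk(index + 1, low, high)

    return expand(()), nodes


def evaluate(root, nodes, assignment):
    """Follow one path from the root to a terminal."""
    node = root
    while node > 1:
        var, low, high = nodes[node]
        node = high if assignment[var - 1] else low
    return bool(node)


def formula(x1, x2, x3):
    return x1 or (x2 == x3)


root, nodes = build_bdd(formula, 3)
print(len(nodes))  # 4 internal nodes instead of a full tree with 7
```

The branch x1 = true collapses to the terminal 1 at once, which is the kind of sparsity the slide refers to.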

24

The Central Trick in Mona Is Similar

25

Now Formulate Algorithms

• Keep automata determinized and minimal

• Cross product (for & and v)

• Projection for existential quantification

• Subset construction for determinization

• Minimization
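The operations listed above can be sketched on explicit DFAs whose letters are tuples of bits, one bit per variable track. This is a simplified illustration under several assumptions: automata are complete, transitions are stored in a plain dict instead of BDDs, minimization is omitted, and the projection skips the extra adjustment for trailing zero letters that a real WS1S procedure needs. The class and function names are illustrative.

```python
from itertools import product as cartesian


class DFA:
    def __init__(self, start, accepting, delta, alphabet):
        self.start = start
        self.accepting = frozenset(accepting)
        self.delta = delta                 # (state, letter) -> state
        self.alphabet = list(alphabet)

    def states(self):
        return ({self.start} | {s for s, _ in self.delta}
                | set(self.delta.values()))

    def accepts(self, word):
        state = self.start
        for letter in word:
            state = self.delta[state, letter]
        return state in self.accepting


def complement(a):
    """Negation: swap accepting and non-accepting states."""
    return DFA(a.start, a.states() - a.accepting, a.delta, a.alphabet)


def intersect(a, b):
    """Conjunction: cross product restricted to reachable state pairs."""
    start = (a.start, b.start)
    delta, todo, seen = {}, [start], {start}
    while todo:
        p, q = todo.pop()
        for letter in a.alphabet:
            target = (a.delta[p, letter], b.delta[q, letter])
            delta[(p, q), letter] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    accepting = {(p, q) for p, q in seen
                 if p in a.accepting and q in b.accepting}
    return DFA(start, accepting, delta, a.alphabet)


def project(a, track):
    """Existential quantification: drop one track, then subset construction."""
    def drop(letter):
        return letter[:track] + letter[track + 1:]

    new_alphabet = sorted({drop(letter) for letter in a.alphabet})
    start = frozenset([a.start])
    delta, todo, seen = {}, [start], {start}
    while todo:
        subset = todo.pop()
        for short in new_alphabet:
            target = frozenset(a.delta[s, letter] for s in subset
                               for letter in a.alphabet
                               if drop(letter) == short)
            delta[subset, short] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    accepting = {s for s in seen if s & a.accepting}
    return DFA(start, accepting, delta, new_alphabet)


letters = list(cartesian([0, 1], repeat=2))
# Track 0 equals track 1 at every position.
equal = DFA("ok", {"ok"},
            {(s, l): ("ok" if s == "ok" and l[0] == l[1] else "bad")
             for s in ("ok", "bad") for l in letters}, letters)
# Track 1 contains at least one 1.
has_one = DFA(0, {1},
              {(s, l): (1 if s == 1 or l[1] == 1 else 0)
               for s in (0, 1) for l in letters}, letters)

exists_y = project(intersect(equal, has_one), 1)
print(exists_y.accepts([(0,), (1,)]), exists_y.accepts([(0,), (0,)]))
```

After projecting away track 1, the result accepts exactly the one-track words containing a 1, since a matching second track exists only in that case.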

26

Three and Six-valued Logic

• To really make Mona work, we had to overcome spurious state space explosions

• They were a direct consequence of working in a logic with only two truth values!

• The problem: say you want to model {green, blue, red}. You need two bits, say X and Y

• 00=green, 01=blue, 10=red. Then, what is the truth status of formulas when XY=11?

• For more, see [J. HOSC, to appear]
• Many more tricks in [IJFCS 2002]

27

Applications of Mona

• Debian/GNU package; also AIX package
• Integrated with PVS, a leading theorem proving environment, SRI
• Used as essential tool in Ph.D. theses and other research such as

• duration calculus verifier (Mumbai)

• Mona as decision procedure for description logics (Dresden)

• verification of parameterized systems (Kiel)

• verification and reachability (Uppsala)

• multimedia applications (Kent)

• automata-based representations for arithmetic (Santa Barbara)

• Presburger arithmetic (Synopsys)

• automata in control synthesis (Aarhus)

• acceleration of counter automata (Cachan)

• verification of structures in imperative programs (Tel Aviv)

• high-level language for verification (Toulouse)

• a WS2S specification language (Freiburg)

• YakYak parser generator (Aarhus)

• PALE pointer engine (Aarhus)

• Google Mona for home page with many papers online

28

Explain Automatic Pointer Reasoning I

• points-to(a,b) iff cell at a contains a pointer to b

• This predicate is definable for a well-formed (WF) store (because of list/tree assumptions)

• Assume we want to verify {P}S{Q}
• S is straight-line code, say p^.next := x
• The store after is the same as before except that the predicate points-to(a,b) has changed for a=p

29

Explain Automatic Pointer Reasoning II

• Let Q’ be Q, rewritten to account for a=p situation

• The WF property can be expressed using least-fixed points in WS1S (or WS2S) based on the points-to predicate

• WF is assumed in initial store by storage layout model

• So, we need to verify P => Q’ & WF’

30

Explain Automatic Pointer Reasoning III

• Sometimes we need an invariant

• x is only empty if y is empty, and p points to the last element of z

• (x=nil => y=nil)
  & z<next*>p
  & (z <> nil => p^.next=nil)
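The invariant can be evaluated on a concrete store to see what each conjunct demands. This is a minimal sketch, assuming the store is a dict from cell names to their `next` cell with `None` standing for nil, and reading `z<next*>p` as "p is reachable from z in zero or more next steps". The helper names are illustrative and not taken from the pointer verification tools mentioned in the slides, which decide the invariant symbolically for all stores.

```python
NIL = None


def reaches(next_of, start, target):
    """z<next*>p: target is reachable from start via zero or more next steps."""
    seen = set()
    cell = start
    while cell is not NIL and cell not in seen:
        if cell == target:
            return True
        seen.add(cell)
        cell = next_of[cell]
    return cell == target   # covers the case where target is nil


def invariant(next_of, x, y, z, p):
    x_empty_implies_y_empty = (x is not NIL) or (y is NIL)
    p_on_z_list = reaches(next_of, z, p)
    # A nil p cannot be dereferenced, so the last conjunct fails in that case.
    p_is_last = (z is NIL) or (p is not NIL and next_of[p] is NIL)
    return x_empty_implies_y_empty and p_on_z_list and p_is_last


next_of = {"a": "b", "b": NIL, "c": NIL}
print(invariant(next_of, x=NIL, y=NIL, z="a", p="b"))  # p is the last cell of z
print(invariant(next_of, x=NIL, y=NIL, z="a", p="a"))  # p is not the last cell
print(invariant(next_of, x=NIL, y="c", z="a", p="b"))  # x empty but y is not
```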