Verifying Interactive Web Programs Daniel R. Licata Shriram Krishnamurthi Brown University.

66
Verifying Interactive Web Programs Daniel R. Licata Shriram Krishnamurthi Brown University

Transcript of Verifying Interactive Web Programs Daniel R. Licata Shriram Krishnamurthi Brown University.

Verifying Interactive Web Programs

Daniel R. Licata Shriram Krishnamurthi

Brown University

Popular Press

Quote:

But when I clicked on the National [car rental] price […], the site responded with this message: “You have back-buttoned too far”. This was my first experience with “back-button” as a verb. […] Since that was patently untrue, I decoded its true meaning: “We ran out [of cars]”.

–M. Slatalla, New York Times, 2003-07-17

A Headache for Companies

● Minor problem: Users might get booked into the wrong hotels, onto the wrong flights, etc.

● Major problem: People might embarass you in newspapers and in public talks

The Orbitz Property

● Orbitz Property: the user should receive a reservation at the hotel that was displayed on the page he submitted

● In other words, the result does not depend on the page on which you did not click “Reserve”

Should all sites have this property?

Question

What would Amazon want?

The Amazon Property

Amazon property: at the end, every book the user added to his shopping cart is actually in his shopping cart

These properties are● not fixed in number● temporal in nature

model checking

Model Checking

1. From the source code of a program, generate a model that captures the behaviors of interest

2. Consume properties written by the developer

3. Automatically check whether or not the model satisfies the properties

Model Checking

1. From the source code of a program, generate a model that captures the behaviors of interest

2. Consume properties written by the developer

3. Automatically check whether or not the model satisfies the properties

Modelling Web Programs

Model = control-flow graph (CFG)

What would a model of Orbitz look like?

Modelling Orbitz

use chosen

to compute

reserved

display

hotel list

display

reservation

set chosen

use chosen

to compute

displayed

display details

for displayed

User Operations add Control Flow

● The browser's back-button introduced control flow not present in the original CFG

● Other browser operations do the same

How many operations do today's browsers provide?

One Browser

How can we model all of these operations?

Alt+Tab

User Operation Calculus

● Express all browser operations in terms of primitive user operations:

● submit form to server

● switch to previously-visited page

[Graunke et al., 2003]

● Only need to account for these two operations' control flow

Our Model: the WebCFG

● submit corresponds to program's control flow

Already in the CFG

● switch permits returning to any previously-visited Web-interaction point

Add edges from each Web-interaction node to the successors of all the others (WebCFG)

The Orbitz CFG

use chosen

to compute

reserved

display

hotel list

display

reservation

set chosen

use chosen

to compute

displayed

display details

for displayed

The Orbitz WebCFG

use chosen

to compute

reserved

display

hotel list

display

reservation

set chosen

use chosen

to compute

displayed

display details

for displayed

Model Checking

1. From the source code of a program, generate a model that captures the behaviors of interest

2. Consume properties written by the developer

3. Automatically check whether or not the model satisfies the properties

Properties

We want to state properties about Web

pages

Properties

<html>

<body bgcolor=yellow>

<table>

<td>

<p>

Residence Inn by

Marriot Charleston

Downtown

...

</html>

Web pages are written as HTML source

Properties

We want to reason about Web page texts

<html>

<body bgcolor=yellow>

<table>

<td>

<p>

Residence Inn by

Marriot Charleston

Downtown

...

</html>

Properties

How can we associate these texts with

the corresponding HTML source?<html>

<body bgcolor=yellow>

<table>

<td>

<p>

Residence Inn by

Marriot Charleston

Downtown

...

</html>

Relating Web Page Content to Source

● Parse the text?

Too hard

● Static-distance coordinates?

Too brittle

What else can we do?

Relating Web Page Content to Source

<html>

<body bgcolor=yellow>

<table>

<td>

<p id=”reserved”>

Residence Inn by

Marriot Charleston

Downtown

...

</html>

Capitalize on Cascading Style Sheet (CSS) ID

tags!

Relating Web Page Content to Source

● If the tag is in the HTML, it must be present in the source of the program that generates the page

● This relates Web page text to the Web program source expression that generates it

Annotating the WebCFG

<html>

<body bgcolor=yellow>

<table>

<td>

<p id=”reserved”>

Residence Inn by

Marriot Charleston

Downtown

...

</html>

tag=reserved

generate reservation page

generate reservation text

Annotate each WebCFG state with the

propositions true in that state

Defining our Property Language

● The annotated WebCFG describes the set of traces that potentially occur

● The developer writes an automaton accepting the set of traces that should occur

● Verification is containment of the former in the latter

[Vardi and Wolper, 1986]

Example Property

Password-Page Property: Before reaching an access-controlled page, the user must go through a password page

12

violation

tag=password-entry

tag=access-controlled

Note: In properties,

tags label transitions

Expressing the Orbitz Property

Orbitz Property: the user should receive a reservation at the hotel that was displayed on the page he submitted

Divide and conquer!

Orbitz Subproperty 1

use chosen

to compute

reserved

display

hotel list

display

reservation

set chosen use chosen

to compute

displayed

display details

for displayed

Property:

chosen does not change between the computation of displayed and the computation of reserved

We need additional propositions to express this property!

Orbitz Subproperty 1

use chosen

to compute

reserved

display

hotel list

display

reservation

set chosen use chosen

to compute

displayed

display details

for displayed

Property:

chosen does not change between the computation of displayed and the computation of reserved

set and join enable reasoning about data

Orbitz Subproperty 1

Property:

chosen does not change between the computation of displayed and the computation of reserved

1 2

violation

(set,chosen)

(join,chosen)

tag=reserved

set and join enable reasoning about data

Orbitz Subproperty 2

Property:

the value of

reserved comes

from the value of

displayed

We need additional propositions to express this property!

use chosen

to compute

reserved

display

hotel list

display

reservation

set chosen use chosen

to compute

displayed

display details

for displayed

Orbitz Subproperty 2

Augment CSS tagged propositions with additional information for reasoning about value flow

Property:

the value of

reserved comes

from the value of

displayed

(tagged,displayed,X)

(tagged,reserved,X)

(tagged,reserved,X)

1 2

violation

Property Idioms

● Writing these automata correctly is tricky

● The two Orbitz subproperties and the Amazon property occur repeatedly

● We provide abstractions of these properties as idioms in our property language

Model Checking

1. From the source code of a program, generate a model that captures the behaviors of interest

2. Consume properties written by the developer

3. Automatically check whether or not the model satisfies the properties

Verification Process

The model and properties we have described

are checkable by language containment

The Orbitz WebCFG

use chosen

to compute

reserved

display

hotel list

display

reservation

set chosen

use chosen

to compute

displayed

display details

for displayed

The Orbitz WebCFG

use chosen

to compute

reserved

display

hotel list

display

reservation

set chosen

use chosen

to compute

displayed

display details

for displayed

Verification Process

● The model and properties we have described are compatible with the FLAVERS algorithms

[Cobleigh, Naumovich, Clarke, and Osterweil, 2001-2002]

● FLAVERS supports “constraint” automata

● We can automatically generate constraints that rule out all the infeasible forward paths

Status

We have begun to apply our model checker to CONTINUE, a Web-based conference management application

● Written in Scheme; send/suspend primitive creates Web-interaction points

● MrFlow implements SBA

[Heintze, 1994; Flanagan and Felleisen, 1996; Meunier, 2001]

Minimization

● Some WebCFG states are not labeled

● We remove these from the model without affecting results

● CONTINUE: from ~17,000 to ~300 states

Future Work

● Better data reasoning (verification conditions)

● Concurrency

● Case studies and more idioms

Perspective

● Work encompasses traditional verification

● Structure of Web source programs matters

● Nature of environment models changes