Toshihiro YOSHINO (D1, Yonezawa Lab.) < [email protected] >
Secure Compiler Seminar 11/7
Survey: Modular Development of Certified Program Verifiers with a Proof Assistant
Toshihiro YOSHINO (D1, Yonezawa Lab.)
Today’s Paper
A. Chlipala (UC Berkeley). Modular Development of Certified Program Verifiers with a Proof Assistant. ICFP ’06.
Implementation can be downloaded from the web site below:
⇒ http://proofos.sourceforge.net/
Overview of the Paper
Case study: developing a certified program verifier with Coq
  Verifies memory safety of x86 machine code
  Its soundness is machine-checked
Modular development with reusable functors
  A new verifier based on another type system can be created at low cost
Constructing Certified Verifiers
Design and implement the verifier in Coq
Use Coq’s “extraction” feature to obtain a working verifier
A verifier can be formalized as:
load: program -> state loads a program
  The type program represents the binary file format
safe: state -> Prop is the safety property we wish to verify for programs
[[P]] is notation for poption P
  The counterpart of option (O’Caml) or Maybe (Haskell) for the domain Prop
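Once proofs are erased, a verifier built from load, safe and [[P]] behaves like an ordinary function from programs to a yes/no answer. The following is only a minimal OCaml sketch of that shape: the toy program and state types and the check inside verify are assumptions made for illustration, not definitions from the paper.

```ocaml
(* Toy stand-ins; none of these definitions come from the paper. *)
type program = string list           (* hypothetical "binary": a list of mnemonics *)
type state = { code : string list }  (* hypothetical machine state *)

(* load : program -> state *)
let load (p : program) : state = { code = p }

(* Proof-erased view of [[safe (load p)]]:
   true  = "safe, and a proof was found"
   false = "not sure".
   The toy check accepts programs that contain no ERROR mnemonic. *)
let verify (p : program) : bool =
  let s = load p in
  not (List.mem "ERROR" s.code)

let () =
  Printf.printf "%b %b\n" (verify [ "mov"; "jcc" ]) (verify [ "mov"; "ERROR" ])
```

A false answer here only means the toy check could not establish safety, mirroring the “I am not sure” reading of a failed verification.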
Constructing Certified Verifiers
Abstraction refinement in multiple stages
  Each stage (component) is a functor which transforms target states into source states
  Later components reason at higher levels of abstraction
Coq’s module system is used to implement this modular design
Formalization of x86 Instruction Set
PCC-style formalization
Subset of the x86 instruction set + an ERROR instruction
  mov, jcc, …
  Safety ≡ ERROR is unreachable
  In combination with assertions, many properties can be proven
Can be formalized coinductively
  Copes with infinite derivations
Types and Extraction in Coq
Coq basically manipulates terms of a dependently typed lambda calculus
  A proposition is represented as a type, and its proof as a term of that type
  Well known as the Curry-Howard isomorphism
A proving step corresponds to type inference
  Given a goal, refine it interactively into subgoals and eliminate the holes
  The rules used for these steps are called tactics
Types and Extraction in Coq
Program extraction from Coq code
  In short, extraction erases terms of sorts other than Set
Brief example: isEven
Definition isEven : forall (n:nat), poption (even n).
  refine (fix isEven (n:nat) : poption (even n) :=
    match n return … with
    | O => PSome _ _
    | S (S n) => …
    | _ => PNone _
    end); auto.
Qed.
(* extracted O'Caml; nat is the extracted unary natural-number type *)
type nat = O | S of nat
let rec isEven (n:nat) =
  match n with
  | O -> true
  | S (S n) -> isEven n
  | _ -> false
poption: “option” for Domain “Prop”
Two constructors: PNone and PSome
  PSome is given a proof of P
  Literally, PSome means “P holds and I have a proof of it”, and PNone means “I am not sure”
Can be used as a failure monad
  PNone >>= _ = PNone
  PSome p >>= f = f p
In extraction, PSome corresponds to true, and PNone to false
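The failure-monad reading of poption can be sketched in OCaml roughly as follows. This is a proof-erased model written for illustration: the constructors carry no proof, the continuation is a thunk for that reason, and the checks check_nonneg and check_small are hypothetical.

```ocaml
(* Proof-erased model of poption: the proof carried by PSome is dropped,
   much as extraction would drop it. *)
type poption = PNone | PSome

(* bind: PNone short-circuits, PSome runs the continuation.
   The continuation takes unit because no proof term survives erasure. *)
let ( >>= ) (m : poption) (f : unit -> poption) : poption =
  match m with
  | PNone -> PNone
  | PSome -> f ()

(* PSome corresponds to true, PNone to false. *)
let to_bool = function PSome -> true | PNone -> false

(* Hypothetical checks, chained so that the first failure wins. *)
let check_nonneg n = if n >= 0 then PSome else PNone
let check_small n = if n < 100 then PSome else PNone
let check n = check_nonneg n >>= fun () -> check_small n

let () = Printf.printf "%b %b\n" (to_bool (check 5)) (to_bool (check 500))
```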
soption
soption extends poption with a parameter
  The proposition is about a term of domain T (of sort Set)
soption, too, can be used as a failure monad
In the paper’s theoretical part, it is written as {{ x : T | P }}
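With the proof component erased, soption would look much like an ordinary option type carrying the term of T. The sketch below assumes that reading; the constructor names SNone and SSome and the safe_div example are hypothetical and chosen only to show the monadic chaining.

```ocaml
(* Proof-erased model of soption: either a value of type T, or failure. *)
type 'a soption = SNone | SSome of 'a

(* bind: failure propagates, success passes the value on. *)
let ( >>= ) m f =
  match m with
  | SNone -> SNone
  | SSome x -> f x

(* Hypothetical partial operation: division that fails on a zero divisor. *)
let safe_div a b = if b = 0 then SNone else SSome (a / b)

(* Two dependent steps; the second runs only if the first succeeded. *)
let calc a b c = safe_div a b >>= fun q -> safe_div q c
```

In this model calc 100 5 2 yields SSome 10, while a zero divisor at either step yields SNone.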
Coq’s Module System
Used to build re-usable verification components
Frequent pattern:

Module Type MACHINE.
  Parameter mstate : Set.
  Parameter minitState : mstate -> Prop.
  …
End MACHINE.

Record state : Set := {
  stRegs32 : regs32;
  …
}.
Inductive instr : Set :=
  | Arith : …
  | ….
Inductive exec : … := ….

Module M86 <: MACHINE.
  Definition mstate := state.
  Definition minstr := instr.
  …
End M86.
Module ModelCheck
Provides fundamental methods of model checking
  Methods to prove theorems about infinite-state systems through exhaustive exploration
Refine the model in each of the following stages
[Figure: stages, labeled Abstract and Concrete]
Module ModelCheck
Introduced Elements
absState: a set of abstract states
  An abstract state is managed together with “hypotheses”, states that are known to be safe
  A hypothesis is used, for example, to formalize the return pointer of a function
A relation describes the correspondence between machine states and abstract states
  The context (Γ) is deleted when a verifier is extracted
init is a set (actually a list) of initial states
  It must be a set because one real machine state may correspond to multiple abstract states
  There must be some elements in init that have no hypotheses
Module ModelCheck
Introduced Elements
step describes an execution step
  Executes an instruction from the specified state
  soption is used because the execution may get stuck
Progress and Preservation must hold
Progress
Preservation
MACHINE: Input to the module
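The idea of exhaustive exploration from init using step can be sketched as a small worklist loop. This is an illustration under strong assumptions, not the paper's ModelCheck module: abstract states are plain integers, step returns None when execution may get stuck, and the explored list simply remembers states that were already handled.

```ocaml
(* Hypothetical abstract machine: states are ints; [step s] returns the
   successor states of s, or None when execution may get stuck there. *)
let explore (step : int -> int list option) (init : int list) : bool =
  let rec go explored = function
    | [] -> true                          (* every reachable state was stepped *)
    | s :: rest when List.mem s explored -> go explored rest
    | s :: rest -> (
        match step s with
        | None -> false                   (* possibly stuck: reject the program *)
        | Some succs -> go (s :: explored) (succs @ rest))
  in
  go [] init

(* Two toy transition systems over the states 0..3. *)
let step_ok s = Some [ (s + 1) mod 4 ]                    (* cycles forever *)
let step_bad s = if s = 3 then None else Some [ s + 1 ]   (* stuck at state 3 *)

let () = Printf.printf "%b %b\n" (explore step_ok [ 0 ]) (explore step_bad [ 0 ])
```

The loop terminates only because the toy state space is finite; the abstraction in absState is what would make such an exploration feasible for real machine states.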
Module ModelCheck
The Concept Illustrated
[Figure: the state space of a real machine, with absState, step, and the initial states marked]
Module Reduction
Translates x86 machine language into a simpler RISC-style instruction set (SAL)
x86 machine language is too complex and not suitable for verification purposes
  One instruction may perform several basic operations
  The same basic operations show up in the workings of many instructions
The Reduction module also provides a model checking layer for SAL programs
Module Reduction
SAL: Simplified Assembly Language
  Named after the language used in Proof-Carrying Code [Necula 1997]
RISC-style instruction set
  Arithmetic is extended to allow expressions with parentheses and infix operators
  Additional temporary registers TMPi
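To make the idea of reducing one complex instruction to several basic operations concrete, here is a toy OCaml sketch. The exp, sal and x86 types and the translate function are hypothetical and much smaller than the real instruction sets; flags and operand sizes are ignored.

```ocaml
(* Hypothetical register file, including temporaries in the style of TMPi. *)
type reg = EAX | EBX | ESP | TMP of int

(* Expressions: arithmetic may be nested, as in SAL. *)
type exp = Reg of reg | Const of int | Add of exp * exp | Sub of exp * exp

(* A toy RISC-style target; not the paper's actual SAL definition. *)
type sal =
  | Set of reg * exp      (* r := e        *)
  | Load of reg * exp     (* r := mem[e]   *)
  | Store of exp * exp    (* mem[e1] := e2 *)

(* A toy source language with three x86-like instructions. *)
type x86 = Push of reg | Pop of reg | AddRR of reg * reg

(* One source instruction may expand into several basic operations. *)
let translate : x86 -> sal list = function
  | Push r ->
      [ Set (TMP 0, Sub (Reg ESP, Const 4));
        Store (Reg (TMP 0), Reg r);
        Set (ESP, Reg (TMP 0)) ]
  | Pop r -> [ Load (r, Reg ESP); Set (ESP, Add (Reg ESP, Const 4)) ]
  | AddRR (d, s) -> [ Set (d, Add (Reg d, Reg s)) ]
```

Push and Pop share the same stack-pointer arithmetic, which is the kind of repeated basic operation that a simpler target language lets a verifier handle once.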
Module FixedCode
Ensures that the code region is not overwritten by the code itself
  To simplify the verification framework
The definition takes the form of a ModelCheck
  An additional check is performed only on stores to memory
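The extra check on stores can be pictured with a small OCaml sketch. It assumes, purely for illustration, that every store targets a known constant address and that the code region is a single contiguous range; the names code_region, instr and check_program are hypothetical.

```ocaml
(* Hypothetical code region: addresses base .. base + size - 1. *)
type code_region = { base : int; size : int }

let in_code (r : code_region) (addr : int) : bool =
  addr >= r.base && addr < r.base + r.size

(* Toy instructions: a store to a known address, or anything else. *)
type instr = Store of int | Other

(* Only stores need the additional check; other instructions pass through. *)
let check_instr (r : code_region) : instr -> bool = function
  | Store addr -> not (in_code r addr)
  | Other -> true

let check_program (r : code_region) (prog : instr list) : bool =
  List.for_all (check_instr r) prog
```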
Module TypeSystem
Support for a standard approach to type systems
  A set of types is introduced and typing rules for values are described
  A subtype relation is also introduced
The definition in the figure suffices because Coq takes care of that part
Each register is also associated with a type
Module TypeSystem
viewShift represents a shift in the view of types
  Occurs at places where a program crosses an abstraction boundary
  For example, in function calls, when the stack frame changes
  Introducing an existential is also a kind of view shift
Module WeakUpdate
Introduces a type system with weak update
  Each memory cell has an associated type, and this type does not change during a run
  A cell can be overwritten only with a value of its type
Dynamic memory management is out of scope
  In a real setting, memory is frequently reclaimed and reused
  Garbage collector or malloc/free
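The weak-update discipline can be illustrated with a small OCaml sketch. It assumes a fixed association list from addresses to types and uses plain type equality where a real system would consult the subtype relation; the ty type and store_ok function are hypothetical.

```ocaml
(* Hypothetical value types. *)
type ty = Int | Ptr of ty

(* Memory typing: fixed for the whole run, so it is never updated below. *)
type typing = (int * ty) list

(* A store is accepted only if the cell exists and the value has its type.
   Equality stands in for the subtype check of a fuller type system. *)
let store_ok (t : typing) (addr : int) (value_ty : ty) : bool =
  match List.assoc_opt addr t with
  | Some cell_ty -> cell_ty = value_ty
  | None -> false

let mem_typing : typing = [ (0, Int); (4, Ptr Int) ]
```

Because the typing never changes, a cell cannot be reused at a different type, which is why reclaiming memory through a garbage collector or malloc/free falls outside this discipline.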
The Rest of Modules
Module StackTypes
  Keeps track of the types of stack slots
Module SimpleFlags
  Keeps track of flag values
  In x86 (too), there is no single instruction that performs a conditional test and a jump at once
  Crucial for assuring that a pointer is valid (not null) or for checking array bounds
Case Study: A Verifier for Algebraic Datatypes
Implemented the library and a sample verifier with Coq
  http://proofos.sourceforge.net/
  Approx. 20K (+α) LoC
    Main implementation consists of only 600 LoC
    7,000 LoC for implementing library components
    10,000 LoC for generic utilities
      • 1,000 for bitvectors and fixed-precision arithmetic
      • 1,000 for a subset of x86 machine code
  Auxiliary library from an O’Caml implementation (not counted here)
    x86 binary parsing, etc.
Related Work
Foundational PCC [Appel 2001]
  Reduces the TCB and also improves the flexibility of PCC by constructing the system on a logical framework
  However, efficiency is sacrificed for generality
  Theoretical issues seem to take priority over pragmatics
Epigram [McBride, McKinna 2004], ATS [Chen, Xi 2005], RSP [Westbrook et al. 2005] and GADTs [Sheard 2004]
  Incorporate dependent types into programming languages
  But the foundations of Coq’s implementation and metatheory are simpler than theirs
Summary (of the Paper)
Designed a structure for modular certified verifiers
  Components are reusable functors
  Pipeline-style design
Implemented library components with Coq
  As a case study, a memory safety verifier for x86 machine code is constructed
Relevance to My Research
I have been studying a framework to build verifiers for low-level languages
  First formalize the common language ADL
  Verification is done on the translated program (in ADL)
Trying to prove the correctness of the translation
  Currently ongoing with Coq
Relevance to My Research
The two approaches are very similar
  ADL and SAL are both designed under a minimalist criterion
  Verification logic is built on top of the common language’s semantics
  To achieve high portability and flexibility
From this viewpoint, my project is covered by his… (x_x)
  Correctness of translation is also proven with Coq in proofos
Thinking positively, my direction was not so wrong
Relevance to My Research
Comparison of two projects…
                   proofos [Chlipala 06]    L3Cover [Yoshino 06]
Common Language    SAL                      ADL
Implementation     Coq                      Java
Parametrization    ML-style module          OO-style (inheritance)