Post on 18-Dec-2015
Formal Methods and Computer Security
John Mitchell, Stanford University
Invitation
• I'd like to invite you to speak about the role of formal methods in computer security.
• This audience is … on the systems end …
• If you're interested, let me know and we can work out the details.
Outline
What’s a “formal method”?
Java bytecode verification
Protocol analysis
• Model checking
• Protocol logic
Trust management
• Access control policy language
Big Picture
Biggest problem in CS
• Produce good software efficiently
Best tool
• The computer
Therefore
• Future improvements in computer science/industry depend on our ability to automate software design, development, and quality control processes
Formal method
Analyze a system from its description
• Executable code
• Specification (possibly not executable)
Analysis based on correspondence between system description and properties of interest
• Semantics of code
• Semantics of specification language
Example: TCAS [Leveson, Dill, …]
Specification
• Many pages of logical formulas specifying how TCAS responds to sensor inputs
Analysis
• If module satisfies specification, and aircraft proceeds as directed, then no collisions will occur
Method
• Logical deduction, based on formal rules
Formal methods: good and bad
Strengths
• Formal rules capture years of experience
• Precise, can be automated
Weaknesses
• Some subtleties are hard to formalize
• Methods cumbersome, time consuming
Formal methods sweet spot
[Figure: chart with axes "System complexity × Property complexity" and "Users × Importance", divided into regions "Worthwhile", "Not worth the effort", and "Not feasible". Example points: multiplier, parity, OS verification.]
Target areas
Hardware verification
Program verification
• Prove properties of programs
• Requirements capture and analysis
• Type checking and “semantic analysis”
Computer security
• Mobile code security
• Protocol analysis
• Access control policy languages, analysis
Computer Security
[Figure labels: Access control, Network security, OS security, Web browser/server, Database/application, …; Crypto; Security]
Goal: protect computer systems and digital information
Current formal methods use abstract view of cryptography
Mobile code: Java Applet
Local window
Download
• Seat map
• Airline data
Local data
• User profile
• Credit card
Transmission
• Select seat
• Encrypted msg
Java Virtual Machine Architecture
[Figure: compile source code: A.java → Java Compiler → A.class. Inside the Java Virtual Machine: Loader, Verifier, Linker, Bytecode Interpreter. B.class and the network also appear.]
Java Sandbox
Four complementary mechanisms
• Class loader
  – Separate namespaces for separate class loaders
  – Associates protection domain with each class
• Verifier and JVM run-time tests
  – NO unchecked casts or other type errors, NO array overflow
  – Preserves private, protected visibility levels
• Security Manager
  – Called by library functions to decide if request is allowed
  – Uses protection domain associated with code, user policy
  – Enforcement uses stack inspection
Verifier
Bytecode may not come from standard compiler
• Evil hacker may write dangerous bytecode
Verifier checks correctness of bytecode
• Every instruction must have a valid operation code
• Every branch instruction must branch to the start of some other instruction, not middle of instruction
• Every method must have a structurally correct signature
• Every instruction obeys the Java type discipline
Last condition is fairly complicated.
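The two structural checks (valid opcodes, branches land on instruction starts) can be sketched on a toy instruction set. The opcode names, the instruction lengths, and the list encoding below are assumptions for illustration only; they are not the real JVM encoding or Sun's verifier.

```python
# Hypothetical toy instruction set: opcode -> total length (opcode plus operands).
LENGTHS = {"nop": 1, "iconst": 2, "goto": 2, "return": 1}

def instruction_starts(code):
    """Return the offsets at which instructions begin; raise on a bad opcode."""
    starts, pc = set(), 0
    while pc < len(code):
        op = code[pc]
        if op not in LENGTHS:
            raise ValueError(f"invalid opcode at offset {pc}: {op!r}")
        starts.add(pc)
        pc += LENGTHS[op]
    if pc != len(code):
        raise ValueError("truncated final instruction")
    return starts

def check_structure(code):
    """True if every goto targets the start of an instruction."""
    starts = instruction_starts(code)
    for pc in starts:
        if code[pc] == "goto" and code[pc + 1] not in starts:
            return False  # branch into the middle of an instruction
    return True

good = ["iconst", 5, "goto", 0, "return"]
bad = ["iconst", 5, "goto", 1, "return"]   # offset 1 is an operand, not an opcode
print(check_structure(good), check_structure(bad))
```

The type-discipline check is the hard part and is not covered by this sketch.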
How do we know verifier is correct?
Many attacks based on verifier errors
Formal studies prove correctness
• Abadi and Stata
• Freund and Mitchell
• Nipkow and others …
A type system for object initialization in the Java bytecode language
Stephen Freund, John Mitchell, Stanford University
(Raymie Stata and Martín Abadi, DEC SRC)
Bytecode/Verifier Specification
Specifications from Sun/JavaSoft:
• 30 page text description [Lindholm, Yellin]
• Reference implementation (~3500 lines of C code)
These are vague and inconsistent
Difficult to reason about:
• safety and security properties
• correctness of implementation
Type system provides formal spec
JVM uses stack machine
Java
    class A extends Object {
        int i;
        void f(int val) { i = val + 1; }
    }
Bytecode
    Method void f(int)
        aload 0      ; object ref this
        iload 1      ; int val
        iconst 1
        iadd         ; add val + 1
        putfield #4 <Field int i>
        return
JVM Activation Record
[Figure: data area holding local variables, operand stack, and return addr, exception info, const pool res.; annotation "refers to const pool"]
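A minimal sketch of how the operand stack evolves for the bytecode of f(int). The tuple encoding of instructions and the use of a Python dict for the object are assumptions made for illustration; constant pool lookup is replaced by a field name.

```python
def run_f(this, val):
    """Interpret the bytecode of f(int) against a toy activation record."""
    local_vars = [this, val]          # local 0 is `this`, local 1 is `val`
    stack = []                        # operand stack
    program = [("aload", 0), ("iload", 1), ("iconst", 1),
               ("iadd",), ("putfield", "i"), ("return",)]
    for instr in program:
        op = instr[0]
        if op in ("aload", "iload"):
            stack.append(local_vars[instr[1]])
        elif op == "iconst":
            stack.append(instr[1])
        elif op == "iadd":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "putfield":
            value, obj = stack.pop(), stack.pop()
            obj[instr[1]] = value     # stands in for the const pool reference #4
        elif op == "return":
            break
    return stack

obj = {"i": 0}
leftover = run_f(obj, 41)
print(obj["i"], leftover)
```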
Java Object Initialization
No easy pattern to match
Multiple refs to same uninitialized object
Java
    Point p = new Point(3);
    p.print();
Bytecode
    1: new Point
    2: dup
    3: iconst 3
    4: invokespecial <method Point(int)>
    5: invokevirtual <method print()>
JVMLi Instructions
Abstract instructions:
• new: allocate memory for object
• init: initialize object
• use: use initialized object
Goal
• Prove that no object can be used before it has been initialized
Typing Rules
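For program P, compute for each $i \in \mathrm{Dom}(P)$:
• $F_i : \mathrm{Var} \to \mathrm{type}$, the type of each variable
• $S_i$ : stack of types, the type of each stack location
Example: static semantics of inc
$$\frac{P[i] = \mathtt{inc} \qquad F_{i+1} = F_i \qquad S_{i+1} = S_i = \mathrm{Int} \cdot \alpha \qquad i+1 \in \mathrm{Dom}(P)}{F, S, i \vdash P}$$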
Typing Rules
Each rule constrains successors of instruction:
Well-typed = Accepted by Verifier
Alias Analysis
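Other situations:
[Figure: two control-flow fragments, joined by "or", built from "new P" and "init P" instructions]
    1: new P
    2: new P
    3: init P
Equivalence classes based on line where object was created.
• $\hat{\sigma}_i$ : uninitialized object of type $\sigma$ allocated on line $i$.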
The new Instruction
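Uninitialized object type placed on stack of types:
$$\frac{P[i] = \mathtt{new}\ \sigma \qquad F_{i+1} = F_i \qquad S_{i+1} = \hat{\sigma}_i \cdot S_i \qquad \hat{\sigma}_i \notin S_i \qquad \hat{\sigma}_i \notin \mathrm{Range}(F_i) \qquad i+1 \in \mathrm{Dom}(P)}{F, S, i \vdash P}$$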
The init Instruction
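Substitution of initialized object type for uninitialized object type:
$$\frac{P[i] = \mathtt{init}\ \sigma \qquad S_i = \hat{\sigma}_j \cdot \alpha,\ j \in \mathrm{Dom}(P) \qquad S_{i+1} = [\sigma / \hat{\sigma}_j]\,\alpha \qquad F_{i+1} = [\sigma / \hat{\sigma}_j]\,F_i \qquad i+1 \in \mathrm{Dom}(P)}{F, S, i \vdash P}$$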
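The new and init rules can be sketched as a checker over an abstract stack of types. This is a simplified sketch, not the type system of the paper: it handles straight-line code only, omits local variables (the map F), and adds a `dup` instruction (borrowed from the Point example) so that a reference survives `init`. The tuple encodings are assumptions.

```python
def check(program):
    """True if the straight-line program never uses an uninitialized object."""
    stack = []  # abstract stack of types, top of stack at the end
    for line, instr in enumerate(program, start=1):
        op = instr[0]
        if op == "new":
            uninit = ("uninit", instr[1], line)    # type tagged with allocation line
            if uninit in stack:                     # side condition of the new rule;
                return False                        # cannot fire in straight-line code
            stack.append(uninit)
        elif op == "dup":
            if not stack:
                return False
            stack.append(stack[-1])
        elif op == "init":
            if not stack or stack[-1][0] != "uninit" or stack[-1][1] != instr[1]:
                return False
            target = stack.pop()
            # substitute the initialized type for every alias of the same object
            stack = [("init", instr[1]) if t == target else t for t in stack]
        elif op == "use":
            if not stack or stack[-1] != ("init", instr[1]):
                return False
            stack.pop()
        else:
            return False
    return True

ok = [("new", "P"), ("dup",), ("init", "P"), ("use", "P")]
bad = [("new", "P"), ("use", "P")]
print(check(ok), check(bad))
```

Tagging the uninitialized type with the allocation line is what lets `init` update all aliases at once.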
Soundness
Theorem: A well-typed program will not generate a run-time error when executed
Invariant:
• During program execution, there is never more than one value of type $\hat{\sigma}_i$ present.
• If this is violated, we could initialize one object and mistakenly believe that a different object was also initialized.
Extensions
Constructors
• constructor must call superclass constructor
Primitive Types and Basic Operations
Subroutines [Stata, Abadi]
• jsr L: jump to L and push return address on stack
• ret x: jump to address stored in x
• polymorphic over untouched variables
  – Dom(F_L) restricted to variables used by subroutine
Bug in Sun JDK 1.1.4
    1: jsr 10
    2: store 1
    3: jsr 10
    4: store 2
    5: load 2
    6: init P
    7: load 1
    8: use P
    9: halt
    10: store 0
    11: new P
    12: ret 0
• Variables 1 and 2 contain references to two different objects with type $\hat{P}_{11}$.
• Verifier allows use of uninitialized object
Related Work
Java type systems
• Java Language [DE 97], [Syme 97], ...
• JVML [SA 98], [Qian 98], [HT 98], ...
Other approaches
• Concurrent constraint programs [Saraswat 97]
• defensive-JVM [Cohen 97]
• data flow analysis frameworks [Goldberg 97]
• Experimental tests [SMB 97]
TIL / TAL [Harper, Morrisett, et al.]
Protocol Security
Cryptographic Protocol
• Program distributed over network
• Use cryptography to achieve goal
Attacker
• Read, intercept, replace messages, remember their contents
Correctness
• Attacker cannot learn protected secret or cause incorrect protocol completion
Example Protocols
Authentication Protocols
• Clark-Jacob report: >35 examples (1997)
• ISO/IEC 9798, Needham-Schroeder, Denning-Sacco, Otway-Rees, Woo-Lam, Kerberos
Handshake and data transfer
• SSL, SSH, SFTP, FTPS, …
Contract signing, funds transfer, …
Many others
Characteristics
Relatively simple distributed programs
• 5-7 steps, 3-10 fields per message, …
Mission critical
• Security of data, credit card numbers, …
Subtle
• Attack may combine data from many sessions
Good target for formal methods
However: crypto is hard to model
Run of protocol
[Figure: principals A, B, C, D exchange Initiate and Respond messages, with the Attacker on the network]
Correct if no security violation in any run
Protocol Analysis Methods
Non-formal approaches (useful, but no tools…)
• Some crypto-based proofs [Bellare, Rogaway]
• Communicating Turing Machines [Canetti]
BAN and related logics
• Axiomatic semantics of protocol steps
Methods based on operational semantics
• Intruder model derived from Dolev-Yao
• Protocol gives rise to set of traces
– Denotation of protocol = set of runs involving arbitrary number of principals plus intruder
Example projects and tools
Prove protocol correct
• Paulson’s “Inductive method”, others in HOL, PVS
• MITRE: Strand spaces
• Process calculus approach: Abadi-Gordon spi-calculus
Search using symbolic representation of states
• Meadows: NRL Analyzer, Millen: CAPSL
Exhaustive finite-state analysis
• FDR, based on CSP [Lowe, Roscoe, Schneider, …]
• Clarke et al.: search with axiomatic intruder model
Protocol analysis spectrum
[Figure: protocol analysis methods placed by protocol complexity (low to high) and sophistication of attacks (low to high). Labels: Murφ, FDR, NRL, Athena, hand proofs, Paulson, Bolignano, BAN logic, spi-calculus, poly-time calculus, model checking, multiset rewriting with ∃, protocol logic.]
Important Modeling Decisions
How powerful is the adversary?
• Simple replay of previous messages
• Block messages; decompose, reassemble, resend
• Statistical analysis, traffic analysis
• Timing attacks
How much detail in underlying data types?
• Plaintext, ciphertext and keys
  – atomic data or bit sequences
• Encryption and hash functions
  – “perfect” cryptography
  – algebraic properties: encr(x*y) = encr(x) * encr(y) for RSA, where encrypt(k,msg) = msg^k mod N
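The algebraic property of RSA mentioned above can be checked numerically. The modulus and exponent below are tiny textbook values chosen for illustration (an assumption, far too small for real use).

```python
N = 3233          # 61 * 53, toy modulus
k = 17            # toy public exponent

def encrypt(k, msg):
    """Textbook RSA without padding: msg^k mod N."""
    return pow(msg, k, N)

x, y = 42, 55
lhs = encrypt(k, (x * y) % N)
rhs = (encrypt(k, x) * encrypt(k, y)) % N
print(lhs == rhs)
```

A model that treats encryption as “perfect” cannot express this equation, so attacks that rely on it fall outside such a model.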
Four efforts (w/various collaborators)
Finite-state analysis
• Case studies: find errors, debug specifications
Logic based model: Multiset rewriting
• Identify basic assumptions
• Study optimizations, prove correctness
• Complexity results
Framework with probability and complexity
• More realistic intruder model
• Interaction between protocol and cryptography
• Significant mathematical issues, similar to hybrid systems (Panangaden, Jagadeesan, Alur, Henzinger, de Alfaro, …)
Protocol logic
Rest of talk
Model checking
• Contract signing
MSR
• Overview, complexity results
PPoly
• Key definitions, concepts
Protocol logic
• Short overview
Likely to run out of time …
Contract-signing protocols
John Mitchell, Vitaly Shmatikov, Stanford University
Subsequent work by Chadha, Kanovich, Scedrov
Other analysis by Kremer, Raskin
Example
Both parties want to sign the contract
Neither wants to commit first
Immunity deal
General protocol outline
Trusted third party can force contract
• Third party can declare contract binding if presented with first two messages.
[Figure: message sequence between A and B. Each party in turn sends “I am going to sign the contract”, then each in turn sends “Here is my signature”.]
Assumptions
Cannot trust communication channel
• Messages may be lost
• Attacker may insert additional messages
Cannot trust other party in protocol
Third party is generally reliable
• Use only if something goes wrong
• Want TTP accountability
Desirable properties
Fair
• If one can get contract, so can other
Accountability
• If someone cheats, message trace shows who cheated
Abuse free
• No party can show that they can determine outcome of the protocol
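Asokan-Shoup-Waidner protocol
[Figure: three subprotocols among A, B, the network, and trusted third party T]
• Agree: m1 = sign(A, c, hash(r_A)); sign(B, m1, hash(r_B)); r_A; r_B
• Abort: a1 sent to T; if not already resolved, T answers sigT(a1, abort)
• Resolve: m1 and m2 sent to T; T answers sigT(m1, m2)
• Lost messages are marked “???”; the Resolve diagram is annotated “Attack?”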
Results
Exhaustive finite-state analysis
• Two signing parties, third party
• Attacker tries to subvert protocol
Two attacks
• Replay attack
  – Restart A’s conversation to fool B
• Inconsistent signatures
  – Both get contracts, but with different IDs
Repair
• Add data to m3, m4; prevent both attacks
Related protocol
Designed to be “abuse free”
• B cannot take msg from A and show to C
• Uses special cryptographic primitive
• T converts signatures, does not use own
Finite-state analysis
• Attack gives A both contract and abort
• T colludes weakly, not shown accountable
• Simple repair using same crypto primitive
[Garay, Jakobsson, MacKenzie]
Garay, Jakobsson, MacKenzie
[Figure: three subprotocols among A, B, the network, and T]
• Agree: PCS_A(text,B,T); PCS_B(text,A,T); sig_A(text); sig_B(text)
• Abort and Resolve, with attack: labels m1 = PCS_A(text,B,T), PCS_B(text,A,T), sig_B(text), sig_T(abort), “???”, “abort AND sig_B(text)”, “abort”, “Leaked by T”
Modeling Abuse-Freeness
Abuse = ability to determine the outcome + ability to prove it
• Not a trace property!
• Depends on set of traces through a state
Approximation for finite-state analysis
• Nondet. challenge A to resolve or abort
• If ∃ trace s.t. outcome ≠ challenge, then A cannot determine the outcome
Conclusions
Online contract signing is subtle
• Fairness
• Abuse-freeness
• Accountability
Several interdependent subprotocols
• Many cases and interleavings
Finite-state tool great for case analysis!
• Find bugs in protocols proved correct
Multiset Rewriting and Security Protocol Analysis
John Mitchell, Stanford University
I. Cervesato, N. Durgin, P. Lincoln, A. Scedrov
A notation for inf-state systems
• Many previous models are buried in tools
• Define common model in tool-independent formalism
[Figure: multiset rewriting shown in relation to linear logic, process calculus, finite automata, and proof search (Horn clause)]
Modeling Requirements
Express properties of protocols
• Initialization
  – Principals and their private/shared data
• Nonces
  – Generate fresh random data
Model attacker
• Characterize possible messages by attacker
• Cryptography
Set of runs of protocol under attack
Notation commonly found in literature
    A → B : { A, Nonce_a }_Kb
    B → A : { Nonce_a, Nonce_b }_Ka
    A → B : { Nonce_b }_Kb
• The notation describes protocol traces
• Does not
  – specify initial conditions
  – define response to arbitrary messages
  – characterize possible behaviors of attacker
Rewriting Notation
Non-deterministic infinite-state systems
Facts
    F ::= P(t1, …, tn)
    t ::= x | c | f(t1, …, tn)
• Multi-sorted first-order atomic formulas
States { F1, ..., Fn }
• Multiset of facts
  – Includes network messages, private state
  – Intruder will see messages, not private state
Rewrite rules
Transition
• F1, …, Fk → ∃x1 … ∃xm. G1, …, Gn
What this means
• If F1, …, Fk in state σ, then a next state σ’ has
  – Facts F1, …, Fk removed
  – G1, …, Gn added, with x1 … xm replaced by new symbols
  – Other facts in state σ carry over to σ’
• Free variables in rule universally quantified
Note
• Pattern matching in F1, …, Fk can invert functions
• Linear Logic: F1 ⊗ … ⊗ Fk ⊸ ∃x1 … ∃xm (G1 ⊗ … ⊗ Gn)
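One rewrite step can be sketched with a multiset of ground facts. This is a minimal sketch under simplifying assumptions: rules are already instantiated (no pattern matching on variables), facts are Python tuples, and fresh symbols are generated as strings n0, n1, …; all of these encodings are illustrative.

```python
from collections import Counter
from itertools import count

_fresh = count()   # source of new symbols for the existential variables

def apply_rule(state, lhs, rhs, n_fresh=0):
    """Apply lhs -> exists x1..xm. rhs to a multiset state.

    `rhs` is a function from the m fresh symbols to the list of added facts.
    Returns the next state, or None if the left-hand side is not present.
    """
    need = Counter(lhs)
    if any(state[fact] < k for fact, k in need.items()):
        return None
    new_symbols = [f"n{next(_fresh)}" for _ in range(n_fresh)]
    # remove F1..Fk, add G1..Gn, everything else carries over
    return state - need + Counter(rhs(*new_symbols))

state = Counter([("A0",), ("Other",)])
nxt = apply_rule(state, [("A0",)], lambda x: [("A1", x)], n_fresh=1)
print(sorted(nxt.elements()))
```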
Finite-State Example
• Predicates: State, Input
• Function: · (infix)
• Constants: q0, q1, q2, q3, a, b, nil
• Transitions:
    State(q0), Input(a · x) → State(q1), Input(x)
    State(q0), Input(b · x) → State(q2), Input(x)
    ...
• Set of rewrite transition sequences = set of runs of automaton
[Figure: automaton with states q0, q1, q2, q3 and transitions labeled a and b; sample input b a b]
Simplified Needham-Schroeder
Predicates
• Ai, Bi, Ni: Alice, Bob, Network in state i
Transitions
    → ∃x. A1(x)
    A1(x) → N1(x), A2(x)
    N1(x) → ∃y. B1(x,y)
    B1(x,y) → N2(x,y), B2(x,y)
    A2(x), N2(x,y) → A3(x,y)
    A3(x,y) → N3(y), A4(x,y)
    B2(x,y), N3(y) → B3(x,y)
Protocol (picture next slide)
    A → B: {na, A}Kb
    B → A: {na, nb}Ka
    A → B: {nb}Kb
Authentication
• A4(x,y) ∧ B3(x,y’) ⊃ y = y’
Sample Trace
    A → B: {na, A}Kb
    B → A: {na, nb}Ka
    A → B: {nb}Kb
Rule applied                        Resulting state
→ ∃x. A1(x)                         A1(na)
A1(x) → A2(x), N1(x)                A2(na), N1(na)
N1(x) → ∃y. B1(x,y)                 A2(na), B1(na, nb)
B1(x,y) → N2(x,y), B2(x,y)          A2(na), N2(na, nb), B2(na, nb)
A2(x), N2(x,y) → A3(x,y)            A3(na, nb), B2(na, nb)
A3(x,y) → N3(y), A4(x,y)            A4(na, nb), N3(nb), B2(na, nb)
B2(x,y), N3(y) → B3(x,y)            A4(na, nb), B3(na, nb)
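The honest run can be replayed mechanically. The sketch below applies the seven rewrite rules in order to a list of facts and then checks the authentication property; the fact encoding and the fixed nonce names "na" and "nb" are assumptions for illustration.

```python
def run_honest(na="na", nb="nb"):
    """Replay the simplified Needham-Schroeder rules; return the list of states."""
    state, trace = [], []

    def step(remove, add):
        for fact in remove:
            state.remove(fact)        # ValueError if the rule is not applicable
        state.extend(add)
        trace.append(sorted(state))

    step([], [("A1", na)])                                    # -> ∃x. A1(x)
    step([("A1", na)], [("A2", na), ("N1", na)])
    step([("N1", na)], [("B1", na, nb)])                      # ∃y picks nb
    step([("B1", na, nb)], [("N2", na, nb), ("B2", na, nb)])
    step([("A2", na), ("N2", na, nb)], [("A3", na, nb)])
    step([("A3", na, nb)], [("N3", nb), ("A4", na, nb)])
    step([("B2", na, nb), ("N3", nb)], [("B3", na, nb)])
    return trace

def authenticated(state):
    """A4(x,y) and B3(x,y') imply y = y'."""
    a4 = [f for f in state if f[0] == "A4"]
    b3 = [f for f in state if f[0] == "B3"]
    return all(a[2] == b[2] for a in a4 for b in b3 if a[1] == b[1])

trace = run_honest()
print(trace[-1], authenticated(trace[-1]))
```

This replays a single interleaving; a model checker would explore all of them, including those with intruder rules.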
Common Intruder Model
Derived from Dolev-Yao model
• Adversary is nondeterministic process
• Adversary can
  – Block network traffic
  – Read any message, decompose into parts
  – Decrypt if key is known to adversary
  – Insert new message from data it has observed
• Adversary cannot
  – Gain partial knowledge
  – Guess part of a key
  – Perform statistical tests, …
Formalize Intruder Model
Intercept, decompose and remember messages
    N1(x) → M(x)
    N2(x,y) → M(x), M(y)
    N3(x) → M(x)
Decrypt if key is known
    M(enc(k,x)), M(k) → M(x)
Compose and send messages from “known” data
    M(x) → N1(x), M(x)
    M(x), M(y) → N2(x,y), M(x), M(y)
    M(x) → N3(x), M(x)
Generate new data as needed
    → ∃x. M(x)
Highly nondeterministic, same for any protocol
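The decompose and decrypt rules amount to closing the intruder's memory under analysis. A minimal sketch, assuming terms are either atoms (strings), ("pair", a, b), or ("enc", k, x), and that decryption uses the same key k as encryption; unlike the multiset rule above, this set-based version keeps a ciphertext and key after using them.

```python
def closure(observed):
    """Everything the intruder can extract from the observed terms."""
    known = set(observed)
    changed = True
    while changed:
        changed = False
        for term in list(known):
            parts = []
            if isinstance(term, tuple) and term[0] == "pair":
                parts = [term[1], term[2]]            # decompose
            elif isinstance(term, tuple) and term[0] == "enc" and term[1] in known:
                parts = [term[2]]                     # decrypt only with a known key
            for p in parts:
                if p not in known:
                    known.add(p)
                    changed = True
    return known

msg = ("enc", "k", ("pair", "na", "A"))
print("na" in closure({msg, "k"}), "na" in closure({msg}))
```

Composition of new messages is unbounded, so tools usually generate composed terms on demand rather than closing under it.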
Attack on Simplified Protocol
Rule applied                        Resulting state
→ ∃x. A1(x)                         A1(na)
A1(x) → A2(x), N1(x)                A2(na), N1(na)
N1(x) → M(x)                        A2(na), M(na)
→ ∃x. M(x)                          A2(na), M(na), M(na’)
M(x) → N1(x), M(x)                  A2(na), M(na), M(na’), N1(na’)
N1(x) → ∃y. B1(x,y)                 A2(na), M(na), M(na’), B1(na’, nb)
Continue “man-in-the-middle” to violate specification
Protocols vs Rewrite rules
Can axiomatize any computational system
But protocols are not arbitrary programs
[Figure: choose principals (Client); select roles (Client, TGS, Server)]
Thesis: MSR Model is accurate
Captures “Dolev-Yao-Needham-Millen-Meadows- …” model
• MSR defines set of traces of protocol and attacker
• Connections with approach in other formalisms
Useful for protocol analysis
• Errors shown by model are errors in protocol
• If no error appears, then no attack can be carried out using only the actions allowed by the model
Complexity results using MSR
Key insight: existential quantification (∃) captures cryptographic nonce; main source of complexity
[Durgin, Lincoln, Mitchell, Scedrov]
[Table: cases by bounded # of roles, bounded use of ∃, and unbounded use of ∃, each for intruder without ∃ and intruder with ∃; entries include NP-complete, DExp-time, Undecidable, and ??]
All: Finite number of different roles, each role of finite length, bounded message size
Additional decidable cases
Bounded role instances, unbounded msg size
• Huima 99: decidable
• Amadio, Lugiez: NP w/ atomic keys
• Rusinowitch, Turuani: NP-complete, composite keys
• Other studies, e.g., Küsters: unbounded # data fields
Constraint systems
• Cortier, Comon: Limited equality test
• Millen, Shmatikov: Finite-length runs
All: bound number of role instances
Probabilistic Polynomial-Time Process Calculus for Security Protocol Analysis
J. Mitchell, A. Ramanathan, A. Scedrov, V. Teague
P. Lincoln, M. Mitchell
Limitations of Standard Model
Can find some attacks
• Successful analysis of industrial protocols
Other attacks are outside model
• Interaction between protocol and encryption
Some protocols cannot be modeled
• Probabilistic protocols
• Steps that require specific property of encryption
Possible to “OK” an erroneous protocol
Non-formal state of the art
Turing-machine-based analysis
• Canetti
• Bellare, Rogaway
• Bellare, Canetti, Krawczyk
• others …
Prove correctness of protocol transformations
• Example: secure channel -> insecure channel
Language Approach
Write protocol in process calculus
Express security using observational equivalence
• Standard relation from programming language theory: P ≈ Q iff for all contexts C[ ], same observations about C[P] and C[Q]
• Context (environment) represents adversary
Use proof rules for ≈ to prove security
• Protocol is secure if no adversary can distinguish it from some idealized version of the protocol
[Abadi, Gordon]
Probabilistic Poly-time Analysis
Adopt spi-calculus approach, add probability
Probabilistic polynomial-time process calculus
• Protocols use probabilistic primitives
  – Key generation, nonce, probabilistic encryption, ...
• Adversary may be probabilistic
• Modal type system guarantees complexity bounds
Express protocol and specification in calculus
Study security using observational equivalence
• Use probabilistic form of process equivalence
[Lincoln, Mitchell, Mitchell, Scedrov]
Needham-Schroeder Private Key
Analyze part of the protocol P
    A → B : { i }K
    B → A : { f(i) }K
“Obviously” secret protocol Q (zero knowledge)
    A → B : { random_number }K
    B → A : { random_number }K
Analysis: P ≈ Q reduces to crypto condition related to non-malleability [Dolev, Dwork, Naor]
– Fails for RSA encryption, f(i) = 2i
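Why f(i) = 2i fails for RSA: without knowing i or the decryption key, an observer can turn {i}K into {2i}K by multiplying ciphertexts, so the second message of P is predictable from the first. A sketch with toy textbook-RSA parameters (an assumption for illustration; unpadded and far too small for real use):

```python
N, e, d = 3233, 17, 2753     # toy key: N = 61 * 53, e * d = 1 mod 3120

def enc(m):
    return pow(m, e, N)

def dec(c):
    return pow(c, d, N)

i = 1234                      # A's secret value
first_msg = enc(i)            # what the attacker observes
forged = (enc(2) * first_msg) % N   # computed from public data only
print(dec(forged) == 2 * i)
```

In Q the two ciphertexts are unrelated, so a context that compares `forged` with the second message distinguishes P from Q.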
Technical Challenges
Language for prob. poly-time functions
• Extend Hofmann language with rand
Replace nondeterminism with probability
• Otherwise adversary is too strong ...
Define probabilistic equivalence
• Related to poly-time statistical tests ...
Develop specification by equivalence
• Several examples carried out
Proof systems for probabilistic equivalence
• Work in progress
Basic example
Sequence generated from random seed
    Pn: let b = nk-bit sequence generated from n random bits in PUBLIC b end
Truly random sequence
    Qn: let b = sequence of nk random bits in PUBLIC b end
P is crypto strong pseudo-random generator
    P ≈ Q
Equivalence is asymptotic in security parameter n
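The two processes can be sketched as Python functions. SHAKE-256 is used here purely as a stand-in expander (an assumption; this sketch does not claim it is a proven pseudo-random generator), and the output length is left as a parameter `out_bits` rather than fixed by n.

```python
import hashlib
import secrets

def expand(seed, out_bits):
    """Deterministically stretch a seed to out_bits bits (stand-in generator)."""
    return hashlib.shake_256(seed).digest(out_bits // 8)

def P(n_bits, out_bits):
    """Publish a sequence generated from n random bits."""
    seed = secrets.token_bytes(n_bits // 8)
    return expand(seed, out_bits)

def Q(out_bits):
    """Publish a truly random sequence of the same length."""
    return secrets.token_bytes(out_bits // 8)

print(len(P(128, 1024)), len(Q(1024)))
```

P ≈ Q says that no poly-time context can tell the two published values apart, except with probability negligible in n.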
Compositionality
Property of observational equiv
$$\frac{A \approx B \qquad C \approx D}{A \mid C \approx B \mid D}$$
similarly for other process forms
Current State of Project
New framework for protocol analysis
• Determine crypto requirements of protocols!
• Precise definition of crypto primitives
Probabilistic ptime language
Pi-calculus-like process framework
• replaced nondeterminism with rand
• equivalence based on ptime statistical tests
Proof methods for establishing equivalence
Future: tool development
Protocol logic
Alice’s information
• Protocol
• Private data
• Sends and receives
[Figure: Alice, holding Protocol and Private Data, exchanges Send and Receive events with “Honest Principals, Attacker”]
Intuition
Reason about local information
• I chose a new number
• I sent it out encrypted
• I received it decrypted
• Therefore: someone decrypted it
Incorporate knowledge about protocol
• Protocol: Server only sends m if it got m’
• If server not corrupt and I receive m signed by server, then server received m’
Bidding conventions (motivation)
Blackwood response to 4NT
– 5♣ : 0 or 4 aces
– 5♦ : 1 ace
– 5♥ : 2 aces
– 5♠ : 3 aces
Reasoning
• If my partner is following Blackwood, then if she bid 5♥, she must have 2 aces
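The bidding analogy is an inference conditional on honesty: a conclusion follows from an observed action only if the other party follows the convention. A small sketch using the standard Blackwood responses (5♣, 5♦, 5♥, 5♠); the function name and the honesty flag are illustrative assumptions.

```python
BLACKWOOD = {"5♣": {0, 4}, "5♦": {1}, "5♥": {2}, "5♠": {3}}

def infer_aces(bid, partner_follows_convention):
    """Possible ace counts for partner, given her response to 4NT."""
    if not partner_follows_convention:
        return {0, 1, 2, 3, 4}        # nothing can be concluded
    return BLACKWOOD[bid]

print(infer_aces("5♥", True), infer_aces("5♥", False))
```

The protocol logic uses Honest(X) in the same way: conclusions about X's past actions hold only under the assumption that X follows the protocol.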
Logical assertions
Modal operator
• [ actions ]_P φ - after actions, P reasons φ
Predicates in φ
• Sent(X,m) - principal X sent message m
• Created(X,m) - X assembled m from parts
• Decrypts(X,m) - X has m and key to decrypt m
• Knows(X,m) - X created m or received msg containing m and has keys to extract m from msg
• Source(m, X, S) - Y ≠ X can only learn m from set S
• Honest(X) - X follows rules of protocol
Correctness of NSL
Bob knows he’s talking to Alice
    [ recv encrypt( Key(B), A, m );
      new n;
      send encrypt( Key(A), m, B, n );
      recv encrypt( Key(B), n ) ]_B
    Honest(A) ⊃ Csent(A, msg1) ∧ Csent(A, msg3)
where Csent(A, …) ≡ Created(A, …) ∧ Sent(A, …)
(msg1 and msg3 label the first and last messages received by B)
Honesty rule (rule scheme)
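$$\frac{\forall\ \text{roles } R \text{ of } Q.\ \ \forall\ \text{initial segments } A \subseteq R.\quad Q \vdash [A]_X\, \phi}{Q \vdash \mathrm{Honest}(X) \supset \phi}$$
• This is a finitary rule:
  – Typical protocol has 2-3 roles
  – Typical role has 1-3 receives
  – Only need to consider A waiting to receive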
Conclusions
Security Protocols
• Subtle, mission critical, prone to error
Analysis methods
• Model checking
  – Practically useful; brute force is a good thing
  – Limitation: find errors in small configurations
• Proof methods
  – Time-consuming to use general logics
  – Special-purpose logics can be sound, useful
Room for another 5+ years of work
Access Control / Trust Mgmt
Conference Registration
• Regular $1000
• Academic $500
• Student $100
[Figure: certificate chain Root CA → Stanford → Mitchell → Chander]
• Root CA: Stanford is accred university
• Stanford: Mitchell is regular faculty
• Mitchell: Chander is my student
Registration message
• Root signs: Stanford is accred university
• Stanford signs: Mitchell is regular faculty; Faculty can ident students
• Mitchell signs: Chander is my student
Certification
• Faculty can ident students
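The registration decision can be sketched as a walk over signed statements. This is a minimal sketch, not a real policy language: certificates are plain (issuer, claim, subject) tuples with no cryptographic checks, the delegation “faculty can ident students” is hard-coded as a rule, and the default fee for unverified registrants is an assumption.

```python
FEES = {"regular": 1000, "academic": 500, "student": 100}

def is_student(person, certs, root="Root"):
    """True if a chain root -> university -> faculty -> person backs the claim."""
    for issuer, claim, subject in certs:
        if claim == "student" and subject == person:
            faculty = issuer
            for univ, claim2, subject2 in certs:
                if (claim2 == "regular_faculty" and subject2 == faculty
                        and (root, "accredited_university", univ) in certs):
                    return True
    return False

def fee(person, certs):
    # Assumption: anyone without a verified student chain pays the regular rate.
    return FEES["student"] if is_student(person, certs) else FEES["regular"]

certs = [
    ("Root", "accredited_university", "Stanford"),
    ("Stanford", "regular_faculty", "Mitchell"),
    ("Mitchell", "student", "Chander"),
]
print(fee("Chander", certs), fee("Eve", certs))
```

Dropping any link in the chain, for example the root's statement about Stanford, makes the student claim unverifiable.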
Formal methods
[Figure: chart with axes "System complexity × Property complexity" and "Users × Importance", divided into regions "Worthwhile", "Not worth the effort", and "Not feasible"]
System-Property Tradeoff
[Figure: protocol analysis methods placed by protocol complexity (low to high) and sophistication of attacks (low to high). Labels: Murφ, FDR, NRL, Athena, hand proofs, Paulson, Bolignano, BAN logic, spi-calculus, poly-time calculus, model checking, multiset rewriting with ∃, protocol logic.]
The Wedge of Formal Verification
[Figure: value to design plotted against effort invested, with labels Verify, Refute, Abstract, and Invisible FM]
Big Picture
Biggest problem in CS
• Produce good software efficiently
Best tool
• The computer
Therefore
• Future improvements in computer science/industry depend on our ability to automate software design, development, and quality control processes