Post on 28-Dec-2015
Designing Programming Languages to Improve Software Quality
David J. Pearce
Software Quality New Zealand, August 2013
@whileydavehttp://whiley.org
http://github.com/DavePearce
Null Pointers
“I call it my billion-dollar mistake. It was the invention of the null reference in 1965”
-- Tony Hoare
“Of all the exceptions a Java programmer might encounter, the null-pointer exception is among the most dreaded, and for good reason: it is one of the least informative exceptions that a program can signal.”
-- Eric Allen
(from java/lang/String.java)
Experiences with Null Pointers
/**
* …
* @throws NullPointerException if sb is
* null
*/
boolean contentEquals(StringBuffer sb) {
synchronized(sb) { … }
}
(also from java/lang/String.java)
/**
* Tests if this string starts with the
* specified prefix.
*
* @param prefix the prefix.
* @return true if the character sequence…
*/
public boolean startsWith(String prefix) {
…
char pa[] = prefix.value;
…
}
/**
* Tests if this string starts with the
* specified prefix.
*
* @param prefix the prefix.
* @return true if the character sequence…
*/
public boolean startsWith(String prefix) {
…
char pa[] = prefix.value;
…
}
Found 83/1101 java.lang methods were misdocumented!
Thoughts
• Why is documentation bad?– Because programmers write it– Programmers have deadlines– Documentation doesn’t help meet deadlines
• When is documentation not bad?– Think about types in Java– Method types automatically documented!– Compiler ensures they are accurate
@NonNull Types
void f(@NonNull Integer x) {
x.toString(); // safe!
}
void g(Integer x) {
x.toString(); // Syntax Error
}
void h(Integer x) {
if(x != null) {
x.toString(); // Safe!
} }
JML Specifications
public class IntVec { private int data[];
private int length = 0;
//@ invariant data != null;
//@ invariant length >= 0;
//@ invariant length <= data.length;
//@ requires size > 0;
public IntVec(int size) { data = new int[size]; }
//@ ensures \result >= 0;
public int size() { return length; }
…
More JML Specifications
/*@ public normal_behavior
@ requires s1 != null && s2 != null;
@ {|
@ requires s2.length==0 && s1.length==0; ensures !\result;
…
@ ensures \result == (
@ (\exists int i; 0<=i && i<s1.length && i<s2.length;
@ s1[i]<s2[i] && equal(s1,0,s2,0,i))
@ ||
@ (s1.length<s2.length && equal(s1,0,s2,0,s1.length)));
@ |}
@*/ public static pure model boolean lessThan(char[] s1, char[] s2);
(from JML spec of java/lang/String.java)
So … ?
• PROBLEM: modern languages make compile-time verification unnecessarily hard…
• ANSWER: design a language to simplify verification:– Pure Functions vs Impure Methods– Value Semantics– Structural subtyping & Flow Typing– Unbound Integers and Rationals– First-class Collections (sets, lists, maps)– Strict Concurrency Model (e.g. non-reentrancy)
Overview
• What is Whiley?– Hybrid functional / imperative language– Designed specifically for verification– Compiles to JVM (also prototype C backend)
• Why another language?– Verification is really hard– Many features of Java it even harder!– I think it’s basically impossible for Java– See ESC/Java and JML as good efforts here
A Zoo of Unusual Types!
• Primitives:– e.g.
• Collections (lists, sets, maps):– e.g.
• Records and Tuples:– e.g.
• Unions and Intersections:– e.g.
• Negations– e.g.
[int] {string} {int=>string}
!int
any int real charnull bool
{int x, int y} (int,int)
int|null int&null
Flow Typing
• A flow-sensitive approach to type checking• Types declared only for parameters and returns• Variables can have different types!• Conditionals and/or assignments cause retyping
int sum([int] items):
r = 0
for item in items:
r = r + item
return r
Flow Typing
• Type tests automatically retype variables!– (even on the false branch)
define Circle as {int x, int y, int r} define Rect as {int x, int y, int w, int h}
define Shape as Circle | Rect
real area(Shape s):
if s is Circle:
return PI * s.r * s.r
else:
return s.w * s.h
Flow Typing & Unions
• Cannot treat null|int like an int• Must distinguish cases by explicit type testing
null|int indexOf(string str, char c): …
[string] split(string str, char c): idx = indexOf(str,c) if idx is int: below = str[0..idx] above = str[idx..] return [below,above] else: return [str]
Can safely treat x as int here
Verification
• Function f():– Accepts an arbitrary integer …– Should return a natural number …– But, this implementation is broken!
define nat as int where $ >= 0
nat f(int x):
return x A compile time error!
define nat as int where $ >= 0
nat f(int x):
if x >= 0:
return x
else:
return 0
• Function f():– Accepts an arbitrary integer …– Returns a natural number …– This implementation satisfies the spec!
OK, because x implicitly a nat
Verification
• Function g():– Accepts a positive number …– And returns a natural number …– But, how to know pos subtypes nat ?
define nat as int where $ >= 0
define pos as int where $ > 0
nat g(pos x):
return xOK, because pos
implies nat
Verification
• Function h():– Accepts a natural number …– And returns a positive number …– But, how to know nat+1 gives pos ?
define nat as int where $ >= 0
define pos as int where $ > 0
pos h(nat x):
return x + 1OK, because
nat+1 gives pos
Verification
• Function h1() and h2() are identical
define nat as int where $ >= 0
define pos as int where $ > 0
pos h1(nat x):
return x + 1
int h2(int x) requires x>=0, ensures $>0:
return x + 1
Verification
• Function sum():– Accepts a list of natural numbers …– Then adds them together …– And returns a natural number.
define nat as int where $ >= 0
nat sum([nat] list): r = 0 for x in list where r >= 0: r = r + x return r
Ok, because adding nat to nat gives nat
Compiler Overview
• Developed 2009 – present• 304 classes (80KLOC) with 934 end-end tests• Also, 133 verifier tests and 15178 type system unit tests