Two techniques for programming by sketching (Stanford, November 2004) Rastislav Bodik, David...

Two techniques for programming by sketching

(Stanford, November 2004)

Rastislav Bodik, David Mandelin, Armando Solar-Lezama, Lin Xu (UC Berkeley)
Rodric Rabbah (MIT)
Kemal Ebcioglu, Doug Kimelman, Vivek Sarkar (IBM)

Synthesis

• Program synthesis
– given a specification, synthesize a program meeting this spec
– synthesis is the inverse of verification
– most work is in reactive systems (Pnueli, Kupferman, …)

• Synthesis vs. compilation
– synthesis involves a search for the desired program

• Benefits
– “less coding, more correctness”

Programming by sketching

sketch → program = completed sketch

• Our approach
– apply synthesis to software
– “sketching”: specification is partial (underspecified)

Two sketching techniques

Sketch:
– partial implementation, provided by programmer

Sketch resolution:
– completing the sketch into a full implementation
– which one? (a sketch completes into many implementations!)

1. StreamBit:
– behavioral spec + sketch → full implementation

2. Prospector:
– sketch → several full implementations
– user selects the implementation with the desired behavior

StreamBit: Sketching high-performance implementations of bitstream programs

Project lead: Armando Solar-Lezama

Bitstream Programs

• Bitstream programs: a growing domain
– crypto: DES, Serpent, Rijndael, …
– coding in general, NSA/BitTwiddle

• Bitstream programs operate under strict constraints
– performance is very important
  • up to 95% of server cycles spent in security-related processing
– correctness is crucial
  • a subtle bug in a Blowfish implementation allowed over half the keys to be cracked in less than 10 minutes

Example

• “Drop every third bit in the bit stream.”
• exhibits many features of complicated permutations
– exponentially many choices
– greedy choice is suboptimal
• fast implementation can be sketched
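As a point of reference, the behavior in this example can be written down directly. The Python sketch below is an illustration only (it is not StreamIt, and the function name `drop_third_slow` is hypothetical); it walks the stream one bit at a time, which corresponds to the SLOW, O(w) style of implementation.

```python
def drop_third_slow(bits):
    """Keep every bit whose 1-based position is not a multiple of 3."""
    return [b for position, b in enumerate(bits, start=1) if position % 3 != 0]


if __name__ == "__main__":
    # Letters stand in for bits so the dropped positions are easy to see.
    print(drop_third_slow(list("abcdefg")))
```

A fast implementation has to produce the same output while moving whole groups of bits with word-level shifts and masks, which is where the sketch comes in.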

[Figure: “drop every third bit” implemented two ways: SLOW, O(w), and FAST, O(log w). The FAST implementation is obtained from the functionality plus a sketch in which the shift amounts are left open (“?”).]

Full sketch (13 lines of code)

WSIZE=16;
subsequence = Unroll[WSIZE](subsequence);
subsequence = PermutFactor[
    [shift(1:2 by 0), shift(17:18 by 0), shift(33:34 by 0)],
    [shift(1:16 by ?), shift(17:32 by ?), shift(33:48 by ?)]
  ] ( subsequence );

subsequence.subsequence_1=DiagSplit[WSIZE](subsequence);

for(i=0; i<3; ++i) {
  subsequence.subsequence_1.filter(i) =
    PermutFactor[
      [shift(1:16 by 0 || 1)],
      [shift(1:16 by 0 || 2)],
      [shift(1:16 by 0 || 4)]
    ]( subsequence.subsequence_1.filter(i) );
}

Size: 13 lines

Compare with 100+ lines of such FORTRAN code (from BitTwiddle)

...

DATA MASKB2 /Z'FFC003FF000FFC00', Z'3FF000FFC003FF00', Z'0FFC003FF000FFC0', Z'03FF000FFC003FC0',...

c Compress 5-bit groups together

TB = IAND(TB + ISHFT(TB, SKIPBC), MASKB2(J))
TC = IAND(TC + ISHFT(TC, SKIPBC), MASKC2(J))

...

What you gain

• DropThird benchmark:
– speedups over naïve code with a 14-line sketch:
  • 32 bit on a Pentium IV: 83.8%
  • 64 bit on an Itanium II: 233%

• DES benchmark:
– 32 bit on a Pentium IV with a 30-line sketch:
  • 634% speedup over naïve
  • within 11% of hand-optimized libDES
– 64 bit IA64 and IBM SP2
  • we beat libDES by 8%

What is sketching

• Key idea: separation of concerns
– specify behavior without concern for performance
– create implementation without concern for bugs

• domain expert:
– writes a behavioral specification of her crypto algorithm
– as clean as possible, no optimizations

• performance expert:
– describes an efficient implementation of the clean algorithm
– neither reimplements nor describes in full
– he only sketches an outline of the implementation; the compiler fills in details
– if the sketch is wrong, the compiler complains → no bugs can be introduced

Compilation strategy

A sketch overrides a naïve compiler:

– the naïve compiler translates the clean algorithm into target code,
  • with a simple sequence of semantics-preserving transformations:
    (1) make all filters word-size (unroll and split)
    (2) decompose word-size filters into machine instructions

– the sketch “inserts” a step into the naïve sequence
  • Ex.: the sketch decomposes a filter into a pipeline of filters
  • after the sketch is applied, the naïve compiler continues

The behavioral spec (StreamIt)

• StreamIt
– synchronous dataflow language
– filters represented internally as matrices

Example: a 2×3 matrix that consumes a 3-bit chunk of input and produces 2 bits of output:

[ 1 0 0 ]   [ x ]   [ x ]
[ 0 1 0 ] × [ y ] = [ y ]
            [ z ]
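The matrix view of a filter can be made concrete with a few lines of Python. This is an illustrative sketch, not StreamIt's internal representation; `apply_filter` is a hypothetical helper, and it assumes sums are taken modulo 2 (for matrices with at most one 1 per row, as here, OR and XOR give the same result).

```python
def apply_filter(matrix, chunk):
    """Multiply a 0/1 matrix by a vector of bits; each row yields one output bit."""
    out = []
    for row in matrix:
        acc = 0
        for coefficient, bit in zip(row, chunk):
            acc ^= coefficient & bit  # AND as product, XOR as sum
        out.append(acc)
    return out


# The 2x3 filter from the slide: consumes (x, y, z), produces (x, y).
DROP_THIRD = [[1, 0, 0],
              [0, 1, 0]]

if __name__ == "__main__":
    print(apply_filter(DROP_THIRD, [1, 0, 1]))
```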

Naïve compilation

• Example: Drop Third Bit (word size W = 4 bits)
– unroll the filter
– decompose into filters operating on W=4 bits of input
– decompose into filters producing W=4 bits of output

[Figure: the 2×3 filter matrix (3 bits in, 2 bits out) is unrolled into an 8×12 matrix (12 bits in, 8 bits out). A round-robin splitter (rrobin 4,4,4) feeds three 8×4 filters whose outputs are combined with “or”. Each of these is then split, using duplicate and cat, into 4×4 filters that produce 4 bits of output.]
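The first two steps can be mimicked on plain 0/1 matrices. The Python below is a minimal sketch under the assumption that unrolling builds a block-diagonal matrix and that the round-robin split cuts it into column blocks of W inputs each; `unroll` and `split_columns` are hypothetical names, not StreamBit functions.

```python
def unroll(matrix, copies):
    """Place `copies` copies of `matrix` along the diagonal of a larger matrix."""
    rows, cols = len(matrix), len(matrix[0])
    big = [[0] * (cols * copies) for _ in range(rows * copies)]
    for k in range(copies):
        for r in range(rows):
            for c in range(cols):
                big[k * rows + r][k * cols + c] = matrix[r][c]
    return big


def split_columns(matrix, width):
    """Cut a matrix into column blocks, each consuming `width` input bits."""
    total = len(matrix[0])
    return [[row[start:start + width] for row in matrix]
            for start in range(0, total, width)]


if __name__ == "__main__":
    drop_third = [[1, 0, 0], [0, 1, 0]]
    unrolled = unroll(drop_third, 4)        # 8 rows, 12 columns
    blocks = split_columns(unrolled, 4)     # three 8x4 filters
    for block in blocks:
        print(block)
```

Because every input bit is handled by exactly one column block, combining the three block outputs with “or” gives the same 8 output bits as the unrolled matrix.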

Naïve compilation (cont.)

• Make each filter correspond to one basic operation available in the hardware

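[Figure: a 4×4 filter is decomposed, using duplicate and or, into a mask filter (keep the first two bits) and a shift-by-one filter followed by a mask filter (keep the third output bit). The corresponding instruction sequence is:]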

t1 = in AND 1100

t2 = in SHIFTL 1

t3 = t2 AND 0010

out = t1 OR t3
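The four instructions above can be checked exhaustively for W = 4. The Python sketch below assumes the leftmost bit of the word is the first bit of the stream and that SHIFTL is a logical left shift truncated to W bits; the function name is hypothetical.

```python
W = 4
WORD_MASK = (1 << W) - 1


def drop_third_word(word):
    """Apply the slide's instruction sequence to one 4-bit word 'abcd'."""
    t1 = word & 0b1100                # keep a and b in place
    t2 = (word << 1) & WORD_MASK      # shift left by one: 'bcd0'
    t3 = t2 & 0b0010                  # keep d, now in the third position
    return t1 | t3                    # 'abd0'


if __name__ == "__main__":
    for word in range(1 << W):
        print(f"{word:04b} -> {drop_third_word(word):04b}")
```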


The Full Picture

[Figure: level of abstraction (high → low), from the task description down to implementations.]

Decomposition without sketching

• User provides a high-level decomposition of F into F.F_i
• System takes care of compiling F.F_i
• Correctness is guaranteed as long as [F.F_3] [F.F_2] [F.F_1] = F
• Avoid spelling out the decomposition: Sketch It!
  [some properties] [some properties] [some properties] = F
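The correctness condition is a matrix identity, so a proposed decomposition can be checked mechanically. The Python sketch below uses a small hypothetical 9-bit example with two stages rather than the 16-bit, three-stage decomposition of the slides; `stage`, `matmul` and `is_valid_decomposition` are illustrative helpers, not part of StreamBit.

```python
def matmul(a, b):
    """Boolean matrix product: result[i][j] = OR over k of a[i][k] AND b[k][j]."""
    return [[int(any(a[i][k] and b[k][j] for k in range(len(b))))
             for j in range(len(b[0]))] for i in range(len(a))]


def stage(n, moves):
    """n x n matrix moving bit `src` to `src - amount`; unlisted bits are dropped."""
    mat = [[0] * n for _ in range(n)]
    for src, amount in moves.items():
        mat[src - amount][src] = 1
    return mat


def is_valid_decomposition(stages, target):
    """`stages` are in application order (F_1 first), so the product is ... F_2 F_1."""
    product = stages[0]
    for nxt in stages[1:]:
        product = matmul(nxt, product)
    return product == target


N = 9
# Target F: drop every third bit of a 9-bit chunk and pack the rest to the front.
F = stage(N, {0: 0, 1: 0, 3: 1, 4: 1, 6: 2, 7: 2})
# Stage 1 moves bits by 0 or 1; stage 2 moves bits by 0 or 2.
F_1 = stage(N, {0: 0, 1: 0, 3: 1, 4: 1, 6: 0, 7: 0})
F_2 = stage(N, {0: 0, 1: 0, 2: 0, 3: 0, 6: 2, 7: 2})

if __name__ == "__main__":
    print(is_valid_decomposition([F_1, F_2], F))
```

Applying the stages in the wrong order fails the check, which is the kind of mistake the compiler can report instead of letting it become a bug.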

[Figure: specifying the FAST bit-shifting algorithm without sketching. The 16×16 matrix F is shown together with the explicitly written matrices F.F_1, F.F_2 and F.F_3.]

Sketching: another example

A permutation from the DES cipher (64 bits → 64 bits)

[Figure labels: 32 bits, 32 bits]

Problem: when implemented as a table lookup, the table is very large

Idea: decompose into a pipeline of two permutations:
1. provided by the programmer: an inexpensive permutation
2. automatically derived from the sketch: two identical permutations (to be implemented as one smaller table)

Sketch:

  shift(1:64 by 0 || 33 || -33), shift(1:2:31 by -33), shift(34:2:64 by 33),
  [] // unspecified; filled in by compiler

Sketching: How it works

• Start with a sketch:

  SketchDecomp[
    [shift(1:32 by 0 || 1)],
    [shift(1:32 by 0 || 2)],
    [shift(1:32 by 0 || 4)],
    [shift(1:32 by 0 || 8)]
  ]( Filter );

• Define x_{i,j} as the amount bit i will move on step j
• Semantic equivalence imposes linear constraints on the x_{i,j}
• Many of the constraints in the sketch also impose linear constraints on the x_{i,j}
• Solving the linear constraints produces a space of possible solutions
• Map the nonlinear constraints to this solution space
• Search
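The steps above can be illustrated on a DropThird-style log shifter. The Python below is a deliberately simplified stand-in for sketch resolution, not StreamBit's algorithm: it assumes stage j may move a bit by 0 or 2^j (as in the sketch), takes the semantic-equivalence constraint to be "the moves of bit i sum to its total shift", and treats "no two bits collide after any stage" as the nonlinear constraint. With these choices the linear system has a single solution, so no search is needed; the function name is hypothetical.

```python
def resolve_log_shifter(total_shift, num_stages):
    """Return x[i][j], the amount bit i moves on step j, or raise ValueError.

    total_shift maps a bit's start position to the distance it must move
    toward position 0.
    """
    x = {}
    for i, total in total_shift.items():
        if total >= (1 << num_stages):
            raise ValueError(f"bit {i} must move {total}: too far for {num_stages} stages")
        # Linear constraint: sum_j x[i][j] == total, with x[i][j] in {0, 2**j}.
        x[i] = [total & (1 << j) for j in range(num_stages)]

    # Nonlinear constraint: after every stage all bits occupy distinct positions.
    position = {i: i for i in total_shift}
    for j in range(num_stages):
        position = {i: p - x[i][j] for i, p in position.items()}
        if len(set(position.values())) != len(position):
            raise ValueError(f"two bits collide after stage {j}")
    return x


if __name__ == "__main__":
    # Drop every third bit of a 16-bit word (0-based positions 2, 5, 8, ...).
    shifts = {i: i // 3 for i in range(16) if i % 3 != 2}
    solution = resolve_log_shifter(shifts, num_stages=3)
    for bit, moves in sorted(solution.items()):
        print(bit, moves)
```

In a sketch with more freedom (for example unknown stage order, or the `?` amounts in PermutFactor), the linear constraints leave several candidate solutions, and the collision check becomes the filter applied during the search.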

User Study (time to first solution)

[Figure: “First Solution Performance”. x-axis: hours (0.00 to 6.00); y-axis: words per microsecond (0 to 5); series: C 1 to C 5 (C) and SBit 1 to SBit 4 (StreamBit).]

User Study (developing a good implementation)

[Figure: “Performance over Time”. x-axis: 0.00 to 10.00; y-axis: words per microsecond (0 to 12); series: C 1 to C 5, SBit 1 to SBit 4, SBit w sketching, Cref.]

Implementing the fastest DES

[Figure: bar chart, y-axis 0 to 1.2; groups: fulltable, notable, sometable; series: PIV 2.5GHz, PIII 700 MHz, PIII 496.82, IA64, Solaris, IBMSP.]

• How fast can we match the fastest DES implementation?
– 6 different implementations in 4 hours
– includes all but one trick used in libDES
– so fast partly because sketching avoids bugs

Concluding Remarks

• StreamBit allows for
– task specification oblivious to performance
– implementation specification without bugs

• The same idea may apply in other domains
– if people currently resort to very low-level coding
– if some algebraic structure can be imposed on the task
– then it may be amenable to implementation sketching

Mining Jungloids: Helping to Navigate the API Jungle

Project lead: David Mandelin

A software reuse problem

• big components are reusable [Lampson’99]
– OS, DBMS, browser

• small components are challenging
– flexibility: functionality cut finely, for fine control
– size: in J2SE, 21,000 methods in 1000s of classes

• cost to understand and use
– one of three obstacles to reuse [Lampson’99]

• searching for information
– nearly ¼ of developer time [metallect.com]

→ developers often give up on reuse and reimplement

Example

programming task: parse a Java file into an AST

IFile file = …
ASTNode node = ?

Solution:

IFile file = …
ICompilationUnit cu = JavaCore.createCompilationUnitFrom(file);
ASTNode node = AST.parseCompilationUnit(cu, false);

Why so hard to find? (productivity: 2 LOC/hour)
1. class member browsers? two unknown classes used
2. follow expected design? two levels of file handlers
3. grep? method returns a subclass

The moral?

• type signatures
– not very useful in finding the desired code
– but once found, can be used to verify

• so why not search an existing code base?
– somebody must have written these two lines before!
– yes, but not in the same method
  • for software engineering reasons
– or even the same program
  • e.g.: parse an editor buffer, not a file

• still, sample code is useful, as we will see …

Our goal

• We want a programmer’s “search engine” that
– doesn’t merely find example code
– instead, synthesizes the desired code
– from two favorite sources:
  • type signatures
  • existing code examples

More precisely

• mining input:
– the API (type signatures from class definitions)
– corpus of API client code

• search input:
– a query specifying the programmer’s intent

• output:
– synthesized code
– ready for insertion into the user program
– give several candidates (user selects one)

Formulating the code search problem

We must decide on the structure of:

– input query (coding intent)
  • easy to express for the user
  • yet specific enough for the search engine

– output code (synthesized code)
  • easy to understand and validate (by reading docs)
  • code should complete the program under construction

The query: from ‘have’ to ‘want’

• 1st observation
– Reuse problems can usually be described with a have-one-want-one query q=(h,w):

“What code will transform a (single) object of (static) type h into a (single) object of (static) type w?”

• Our parsing example: q = (IFile, ASTNode)

IFile file = …
ICompilationUnit cu = JavaCore.createCompilationUnitFrom(file);
ASTNode node = AST.parseCompilationUnit(cu, false);

Output code: jungloid

• 2nd observation:
– most queries can be answered with a jungloid

• jungloid:
– a unary expression composed of unary expressions:
  • field access
  • call to an instance method with 0 arguments
  • call to a static method or constructor with 1 argument
  • conversion to supertype
  • (multi-argument methods decomposed into unary ones)

IFile file = …
ICompilationUnit cu = JavaCore.createCompilationUnitFrom(file);
ASTNode node = AST.parseCompilationUnit(cu, false);

Coverage

An informal experiment:
– using 16 coding headaches, collected by us

• Can the query express interesting problems?
– yes, for 12 out of 16 coding problems

• Can queries be answered with a jungloid?
– yes, all 12 queries answered with jungloids
  • 9 of them are simple jungloids
  • 3 of them use some multi-argument methods

Prospector: our prototype

• Eclipse plugin
– integrated with “code completion assist”
    var.[CTRL+SPACE]
    [Figure: completion popup listing field, foo(), bar(int len, Object key)]
– the “want” type w
    WantType x = [CTRL+SPACE]
– a set H of “has” types obtained from context
  • local variables, arguments, class fields, globals
– issue queries (h,w) for each h ∈ H

Type signature graph

Any path from h to w is a (h,w)-jungloid

• 3rd observation:
– desired jungloid typically among the k shortest paths (k=5)

[Figure: type signature graph. Nodes: IFile, ICompilationUnit, IClassFile, CompilationUnit, ASTNode, IJavaElement, IResource, IContainer. Edge labels: JavaCore.createCompilationUnitFrom(), JavaCore.createClassFileFrom(), AST.parseCompilationUnit() (twice), getResource(), getParent(), supertype (twice).]
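Answering a query then amounts to enumerating short paths in this graph. The Python below is a minimal sketch, not Prospector's implementation: the edge list is a hand-written, partial reading of the figure (the exact endpoints of each edge are an assumption), and `jungloids` is a hypothetical function that returns up to k acyclic paths in order of increasing length, rendered as code.

```python
from collections import deque

# (have type, want type, code template); "{}" is the expression built so far.
# Illustrative subset of the figure; edge endpoints are assumed.
EDGES = [
    ("IFile", "ICompilationUnit", "JavaCore.createCompilationUnitFrom({})"),
    ("IFile", "IClassFile", "JavaCore.createClassFileFrom({})"),
    ("ICompilationUnit", "CompilationUnit", "AST.parseCompilationUnit({}, false)"),
    ("IClassFile", "CompilationUnit", "AST.parseCompilationUnit({}, false)"),
    ("CompilationUnit", "ASTNode", "{}"),  # conversion to supertype: no code
]


def jungloids(edges, have, want, variable, k=5):
    """Breadth-first enumeration of up to k shortest (have, want) paths."""
    outgoing = {}
    for src, dst, template in edges:
        outgoing.setdefault(src, []).append((dst, template))

    results = []
    queue = deque([(have, (have,), variable)])
    while queue and len(results) < k:
        node, path, code = queue.popleft()
        if node == want:
            results.append(code)
            continue
        for dst, template in outgoing.get(node, []):
            if dst not in path:  # keep paths acyclic
                queue.append((dst, path + (dst,), template.format(code)))
    return results


if __name__ == "__main__":
    for candidate in jungloids(EDGES, "IFile", "ASTNode", "file"):
        print(candidate)
```

In the plugin setting described earlier, one such query would be issued for each "has" type h in the context, and the candidates shown to the user for selection.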

Jungloids with downcasts

IDebugView debugger = ...
Viewer viewer = debugger.getViewer();
IStructuredSelection sel = (IStructuredSelection) viewer.getSelection();
JavaInspectExpression expr = (JavaInspectExpression) sel.getFirstElement();

[Figure: signature graph with nodes IDebugView, Viewer, ISelection, IStructuredSelection, Object, JavaInspectExpression; edges getViewer(), getSelection(), getFirstElement(), getInput(); two downcast edges (to IStructuredSelection and to JavaInspectExpression).]

Our solution

• Besides downcasts, this problem appears in
– method arguments of type Object (that only accept a JavaBean)
– String objects (strings are highly polymorphic)

• Potential solutions
– parametric type inference, alias analysis

• Our solution
– mine a corpus of API uses for legal downcasts

Mining jungloids with downcasts

• Ideally, only correct jungloids are synthesized
– correct = it must be possible to write client code in which the jungloid’s downcast succeeds, for at least one input

• This ideal can be approximated (overview):
– use a corpus of API client code
– extract jungloids with downcasts
– use them to extend the signature graph

• In the limit, we meet the ideal
– limit = infinitely large, bug-free corpus

• bug-free corpus
– weak requirement: jungloids in the corpus need to succeed for one input
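One way to picture "extend the signature graph" is sketched below in Python. It is an assumption-laden illustration, not the mining algorithm itself: the example jungloid is the one from the downcast slide, intermediate types get fresh primed copies so that a downcast is reachable only through the context observed in the corpus, and the whole prefix is kept as context (the simplest, longest choice). All names are hypothetical.

```python
# Signature edges without downcasts: (from type, step, to type).
SIGNATURES = [
    ("IDebugView", "getViewer()", "Viewer"),
    ("Viewer", "getSelection()", "ISelection"),
    ("IStructuredSelection", "getFirstElement()", "Object"),
]

# One jungloid with downcasts, as it might be extracted from client code.
EXAMPLE = [
    ("IDebugView", "getViewer()", "Viewer"),
    ("Viewer", "getSelection()", "ISelection"),
    ("ISelection", "(IStructuredSelection)", "IStructuredSelection"),
    ("IStructuredSelection", "getFirstElement()", "Object"),
    ("Object", "(JavaInspectExpression)", "JavaInspectExpression"),
]


def add_mined_jungloid(edges, chain):
    """Add the chain using primed copies of its intermediate types."""
    extended = list(edges)
    last = len(chain) - 1
    for index, (src, step, dst) in enumerate(chain):
        new_src = src if index == 0 else src + "'"
        new_dst = dst if index == last else dst + "'"
        extended.append((new_src, step, new_dst))
    return extended


def reachable(edges, have, want):
    """Depth-first search: is there any path from `have` to `want`?"""
    seen, stack = {have}, [have]
    while stack:
        node = stack.pop()
        if node == want:
            return True
        for src, _step, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return False


if __name__ == "__main__":
    graph = add_mined_jungloid(SIGNATURES, EXAMPLE)
    print(reachable(graph, "IDebugView", "JavaInspectExpression"))
    print(reachable(graph, "Object", "JavaInspectExpression"))
```

The second query fails on purpose: an arbitrary Object cannot be cast to JavaInspectExpression, only one obtained through the mined context. A real miner would also need distinct copies per corpus example, which this sketch omits.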

Mining jungloids with downcasts (example)

[Figure: the signature graph from the downcast example (IDebugView, Viewer, ISelection, IStructuredSelection, Object, JavaInspectExpression; edges getViewer(), getSelection(), getFirstElement(), getInput()) extended with primed copies Viewer’, ISelection’, IStructuredSelection’ and Object’, connected by getViewer(), getSelection(), getFirstElement() and downcast edges.]

protected IJavaObject getObjectContext() {
  IWorkbenchPage page = …
  IWorkbenchPart part = page.getActivePart();
  IDebugView view = (IDebugView) part.getAdapter();
  ISelection s = view.getViewer().getSelection();
  IStructuredSelection sel = (IStructuredSelection)s;
  Object selection = sel.getFirstElement();
  JavaInspectExpression exp = (JavaInspectExpression) selection;
  ...
}


IStructuredSelection<JavaInspectExpression>

Viewer<IStructuredSelection<JavaInspectExpression>>

The jungloid mining algorithm (key idea)

When extracting jungloids, how do we determine the necessary downcast context (i.e., the jungloid suffix)?

  w.x.a.(T)    s.y.a.(S)

What if the context is too short?
– unsound: a query may synthesize a jungloid that will throw an exception in any client code

What if the context is too long?
– incomplete: a query may fail to synthesize the jungloid even though the corpus contains the necessary example

  x.a.(T)    y.a.(S)

Experiment 1 (ranking test)

• hypothesis:
– to find the desired code, the user needs to examine only the top 5 candidate jungloids

• result:
– desired code in “top 5” 17 out of 20 times (10 out of 20 in “top 1”)
– remaining three fixable

• methodology:
– used 20 real-world coding tasks
– collected from FAQs, newsgroups, our practice, emails to us

Experiment 2 (user study)

• hypothesis:
– Prospector-equipped programmers are better at solving API programming problems than other programmers

• methodology:
– 6 problems, each user did 3 with Prospector and 3 without
– problems formulated not to reveal the query
– sample problem:
  “The new Java channel IO system represents files as channels. How do I get a channel that represents a String filename?”
– somewhat sparse data (10 users), surveys still trickling in

Experiment 2 (user study). Results.

• Prospector shortens development time
– some problems solved only by Prospector users
– when both groups succeeded, Prospector users were 30% faster

• Prospector may help enable reuse
– non-Prospector users sometimes reimplemented

• Prospector may help avoid making mistakes
– mistakes in applying code found on the internet to one’s own code

• We expect even stronger results on a more robust infrastructure.

Future work

• Coding task we currently can’t handle:
– print an AST as Java source

• The limitation:
– task is expressible as a (have,want) query
– but the result is not a jungloid (as defined in this talk)

ASTNode ast = ...
ASTFlattener visitor = new ASTFlattener();
ast.accept(visitor);
String result = visitor.getResult();

ASTNode ast = ...
ASTFlattener visitor = new ASTFlattener();
ASTFlattener visitor2 = ast.accept(visitor);
String result = visitor2.getResult();

Try it!

• Web demo
– snobol.cs.berkeley.edu

• Eclipse plugin
– coming soon
– want to alpha test it?

Conclusion

Sketch:
– partial implementation, provided by programmer

1. StreamBit:
– behavioral spec + sketch → full implementation
– goal: total correctness and performance

2. Prospector:
– sketch → several full implementations
– user selects the implementation with the desired behavior
– goal: software reuse

Backup slides

Programming with jungloids

NodeItem node = (NodeItem) getModel();
GraphNodeFigure f = (GraphNodeFigure) getFigure();
f.getLabel().setName(node.getNodeName());
Rectangle r = new Rectangle(node.x, node.y, -1, -1);
GraphicalEditPart parent = (GraphicalEditPart) getParent();
parent.setLayoutConstraint(this, f, r);
