Dimensions in Synthesis Part 2: Applications (Intelligent Tutoring Systems). Sumit Gulwani, Microsoft Research, Redmond, May 2012

Domain Insight

Bit-vector Algorithms

Geometry Constructions

Testing & Verification

Symbolic verification runs in reasonable time

Testing is probabilistically sound

Synthesis Strategy

Counter-example guided inductive synthesis

Brute-force search: Generate and Test

2

Recap

 

Students and Teachers

End-Users

Algorithm Designers

Software Developers

Most Transformational Target

Potential Users of Synthesis Technology

3

Most Useful Target

• Vision for End-users: Enable people to have (automated) personal assistants.

• Vision for Education: Enable every student to have access to free & high-quality education.

• Motivation
  – Online learning sites: Khan Academy, edX, Udacity, Coursera
    • Increasing class sizes with even less personal attention
  – New technologies: Tablets/Smartphones, NUI, Cloud

• Various Aspects
  – Solution Generation
  – Problem Generation
  – Automated Grading/Feedback
  – Content Entry

• Various Domains
  – K-12: Mathematics, Physics, Chemistry
  – Undergraduate: Introductory Programming, Automata Theory
  – Language Learning

4

Intelligent Tutoring Systems

5

Intelligent Tutoring Systems

• Aspects
  – Solution Generation
  – Problem Generation
  – Automated Grading
  – Content Entry

• Domains
  – Geometry
  – Algebra
  – Introductory Programming
  – Automata Theory
  – Physics
  – Chemistry

Joint work with: Cerny, Henzinger, Radhakrishna, Zufferey

6

Classic Problem in Automata Theory Course

Solution Generation Engine

Let L be the language containing all strings over {a,b} that have the same number of occurrences of “ab” as occurrences of “ba”. Construct an automaton that accepts L, or prove that L is non-regular.

“Regular”, <Automata for L>
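The "Regular" answer can be checked by hand on small inputs. The sketch below is not the engine's output; it is a hypothetical 5-state DFA (state names are made up here) that tracks the first and the most recent letter, compared against the literal specification on all short strings. It relies on the observation that a string over {a,b} has as many "ab" as "ba" exactly when it is empty or starts and ends with the same letter.

```python
from itertools import product

def count_occurrences(word, pattern):
    """Count (possibly overlapping) occurrences of pattern in word."""
    return sum(word.startswith(pattern, i) for i in range(len(word)))

def in_language(word):
    """Specification from the slide: as many 'ab' as 'ba'."""
    return count_occurrences(word, "ab") == count_occurrences(word, "ba")

# Hypothetical DFA: a state "x..y" means "first letter x, latest letter y".
START = "start"
TRANSITIONS = {
    (START, "a"): "a..a", (START, "b"): "b..b",
    ("a..a", "a"): "a..a", ("a..a", "b"): "a..b",
    ("a..b", "a"): "a..a", ("a..b", "b"): "a..b",
    ("b..b", "b"): "b..b", ("b..b", "a"): "b..a",
    ("b..a", "b"): "b..b", ("b..a", "a"): "b..a",
}
ACCEPTING = {START, "a..a", "b..b"}

def dfa_accepts(word):
    state = START
    for letter in word:
        state = TRANSITIONS[(state, letter)]
    return state in ACCEPTING

def agree_up_to(max_len):
    """Compare DFA and specification on every string of length <= max_len."""
    return all(
        dfa_accepts("".join(w)) == in_language("".join(w))
        for n in range(max_len + 1)
        for w in product("ab", repeat=n)
    )
```

Calling `agree_up_to(10)` compares the two on all 2047 strings of length at most 10; this is testing, not a proof of equivalence.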

7

Classic Problem in Automata Theory Course

Solution Generation Engine

Let L be the language containing all strings over {a,b} that have the same number of occurrences of “a” as occurrences of “b”. Construct an automaton that accepts L, or prove that L is non-regular.

“Non-regular”, <Proof of non-regularity>

• Formal Description of Input

• Regular Languages
  – Algorithm for automata synthesis.

• Non-regular Languages
  – Formal description of non-regularity proof.
  – Algorithm for proof synthesis.

8

Outline

Problem Description Languages

Examples
• User-friendly logic
• Context-free grammar

Interfaces
• Membership Test
  – Required for inductive synthesis of automata or non-regularity proof.
• Symbolic Membership Test
  – Required for verification of non-regularity proof.

9

Formal Description of Input

• same number of occurrences of “ab” as that of “ba”.

• all occurrences of “ab” start on an even position.

• consists of n a’s followed by n b’s.

10

User-friendly Logic
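To make the membership-test interface concrete, here is a minimal sketch of the three example languages above written as plain Python predicates. The function names and the `MEMBERSHIP` table are hypothetical, positions are assumed to be 0-based (the slide does not say), and n = 0 is assumed to be allowed in the third language.

```python
def count_occurrences(word, pattern):
    """Count (possibly overlapping) occurrences of pattern in word."""
    return sum(word.startswith(pattern, i) for i in range(len(word)))

def same_ab_ba(word):
    """Same number of occurrences of 'ab' as of 'ba'."""
    return count_occurrences(word, "ab") == count_occurrences(word, "ba")

def ab_on_even_positions(word):
    """Every occurrence of 'ab' starts at an even index (0-based: an assumption)."""
    return all(i % 2 == 0 for i in range(len(word)) if word.startswith("ab", i))

def a_n_b_n(word):
    """n a's followed by n b's (n = 0 allowed: an assumption)."""
    n = len(word) // 2
    return word == "a" * n + "b" * n

# A membership oracle is then just a lookup of the predicate for a problem.
MEMBERSHIP = {
    "same_ab_ba": same_ab_ba,
    "ab_on_even_positions": ab_on_even_positions,
    "a_n_b_n": a_n_b_n,
}
```

For example, `MEMBERSHIP["a_n_b_n"]("aabb")` answers one membership query. A symbolic membership test would need more than this, since it has to reason about strings with unknown lengths.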

• Formal Description of Input

• Regular Languages
  – Algorithm for automata synthesis.

• Non-regular Languages
  – Formal description of non-regularity proof.
  – Algorithm for proof synthesis.

11

Outline

We use Angluin’s L* algorithm.

• Membership Oracle
  – Provided by the PDL interface.

• Equivalence Oracle
  – Can be simulated using the Membership Oracle on bit-strings of size at most 2n, where n is the # of automata states.
  – Theorem: Let $A$ be a DFA with $n_1$ states and let $B$ be a DFA with $n_2$ states. If $L(A) \neq L(B)$, then there exists a word $w$ in $L(A) \,\triangle\, L(B)$ such that the length of $w$ is at most $n_1 + n_2$.
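The simulated equivalence oracle can be sketched as a bounded search: run the hypothesis DFA and the membership oracle on every string of length at most 2n and report the first disagreement. The function names, the DFA encoding (a dict of transitions), and the toy target language below are all assumptions made for illustration. The bound is only meaningful under the slide's premise that the target automaton has at most n states.

```python
from itertools import product

def run_dfa(transitions, accepting, start, word):
    """Run a DFA given as {(state, letter): state} on word."""
    state = start
    for letter in word:
        state = transitions[(state, letter)]
    return state in accepting

def find_counterexample(member, transitions, accepting, start, alphabet, num_states):
    """Search words of length <= 2 * num_states, shortest first.

    Returns a word on which the hypothesis DFA and the membership
    oracle disagree, or None if no such word exists within the bound.
    """
    for length in range(2 * num_states + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if run_dfa(transitions, accepting, start, word) != member(word):
                return word
    return None

# Hypothetical target language: an even number of a's.
def member(word):
    return word.count("a") % 2 == 0

# Two hypotheses L* might propose: accept everything, then the right DFA.
accept_all = {("q", "a"): "q", ("q", "b"): "q"}
parity = {("even", "a"): "odd", ("even", "b"): "even",
          ("odd", "a"): "even", ("odd", "b"): "odd"}
```

Here `find_counterexample(member, accept_all, {"q"}, "q", "ab", 1)` returns `"a"`, which L* would use to refine its hypothesis, while the two-state `parity` hypothesis yields `None`.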

12

Automata Generation for Regular Languages

[Figure: chart of the number of problems per automata size. x-axis: Automata Size (0 to 16); y-axis: # of problems (0 to 35).]

13

Distribution of Automata sizes

The automata size of educational problems is small!

[Figure: chart of the number of problems per value of (counterexample length – automata size). x-axis: Counterexample length – automata size (-11 to 2); y-axis: # of problems (0 to 40).]

14

Distribution of counterexample lengths (relative to automata size)

The counterexample lengths are much smaller than the worst-case bound of twice the automata size.

• Formal Description of Input

• Regular Languages
  – Algorithm for automata synthesis.

• Non-regular Languages
  – Formal description of non-regularity proof.
  – Algorithm for proof synthesis.

15

Outline

A language L is non-regular iff there exist functions F and G such that:

The above characterization can be shown equivalent to:

This allows for automation!

We refer to F/G as congruence/witness functions.

16

Myhill-Nerode Theorem: Non-regularity Condition

• Same number of occurrences of “a” as that of “b”.

Congruence Function: Witness Function:

• Same number of occurrences of “a” as that of “b”.

Congruence Function: Witness Function:

• .Congruence Function: Witness Function:

17

Examples of Congruence/Witness Functions
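The formulas on these slides did not survive extraction, so the sketch below works under a stated assumption: the non-regularity condition has the usual Myhill-Nerode form, namely that for all i ≠ j exactly one of F(i)·G(i,j) and F(j)·G(i,j) belongs to the language. For "same number of a's as b's", one pair that satisfies this is F(i) = a^i and G(i,j) = b^i; this pair is chosen here for illustration and is not claimed to be the one shown on the slide.

```python
def in_language(word):
    """Same number of occurrences of 'a' as of 'b'."""
    return word.count("a") == word.count("b")

def congruence(i):
    """Candidate congruence function F(i) = a^i (illustrative choice)."""
    return "a" * i

def witness(i, j):
    """Candidate witness function G(i, j) = b^i (illustrative choice)."""
    return "b" * i

def passes_test(member, F, G, bound):
    """Check the assumed condition for all i != j with 0 <= i, j <= bound.

    G(i, j) must separate F(i) from F(j): exactly one of the two
    concatenations may be in the language.
    """
    for i in range(bound + 1):
        for j in range(bound + 1):
            if i == j:
                continue
            suffix = G(i, j)
            if member(F(i) + suffix) == member(F(j) + suffix):
                return False
    return True
```

`passes_test(in_language, congruence, witness, 8)` succeeds because a^i b^i is in the language while a^j b^i is not. Passing such a bounded test only makes a candidate plausible; a proof for all i and j needs symbolic reasoning.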

The following language is expressive enough to represent Congruence function F(i) and Witness Function G(i,j) for several classroom problems.

where c is any alphabet symbol.

18

Language for Congruence/Witness Functions

• Formal Description of Input

• Regular Languages
  – Algorithm for automata synthesis.

• Non-regular Languages
  – Formal description of non-regularity proof.
  – Algorithm for proof synthesis.

19

Outline

1. Approximation
   – Run L* algorithm using counterexamples of size at most k.

2. Generation (using Inductive Synthesis)
   – For each equivalence class, generate all congruence functions that evaluate to a string in that class. Intersect pairs of such sets to generate candidates.
   – For each candidate congruence function, use a similar methodology to generate candidate witness functions.

3. Testing-based Validation
   – Test correctness of candidate functions on …

4. Verification
   – Use symbolic inference rules to verify correctness.

5. Refinement
   – If any of steps 2, 3, 4 fail, repeat with a larger value of k.

20

Algorithm for constructing Congruence/Witness Fns.

21

Experimental Results

Id  PDL  Generation  Congruence Fn.  Witness Fn.  Test  Verify

1   UFPDL   5   0.1   0.1  0.1
2   UFPDL   5   0.2   0.1  0.1
3   UFPDL   5   0.9   0.1  0.1
3   CFG     5   0.4   0.1  0.1
4   UFPDL   5   1.6   0.1  0.2
5   CFG    10   0.2   0.1  0.1
6   CFG     5   0.1   0.1  0.1
9   CFG    10   0.1   0.1  0.1
11  UFPDL   8  12.1   0.4  0.1
12  UFPDL   5  10.1   0.4  0.2
13  UFPDL   5   6.4   0.2  0.1
15  UFPDL   7   7.8   1.1  0.1
15  CFG     7   0.1   0.1  0.1
16  CFG     5   0.2   0.1  0.1
17  UFPDL   5   1.6   0.4  0.1
18  CFG     5   0.3   0.1  0.1
19  UFPDL  10  15.8   4.5  1.7
21  CFG     5   0.5   0.1  0.1

22

Intelligent Tutoring Systems

• Aspects
  – Solution Generation
  – Problem Generation
  – Automated Grading
  – Content Entry

• Domains
  – Geometry
  – Algebra
  – Introductory Programming
  – Automata Theory
  – Physics
  – Chemistry

AAAI 2012: Singh, Gulwani, Rajamani.

New problems generated:

23

Trigonometry Problem

24

Algebra Problem Generation

[Flowchart: Example Problem → Query Generation → Query → Query Execution → New Problems → Results OK? If No: Query Refinement → Refined Query → Query Execution. If Yes: Similar Problems.]
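The query-execution step of this loop can be illustrated with a toy example. Everything below is hypothetical: the query template F1(x)^2 - F2(x)^2 = 1 is made up for illustration (the slide's own example formulas are not recoverable), and candidates are validated numerically at a few fixed sample points rather than proved.

```python
import math

# Candidate functions that may fill the holes F1 and F2 of the query.
FUNCS = {
    "sin": math.sin,
    "cos": math.cos,
    "tan": math.tan,
    "cot": lambda x: 1 / math.tan(x),
    "sec": lambda x: 1 / math.cos(x),
    "csc": lambda x: 1 / math.sin(x),
}

# Fixed sample points, chosen away from the singularities of all six functions.
SAMPLE_POINTS = [0.3, 0.7, 1.1]

def holds(f1, f2):
    """Numerically test the hypothetical query F1(x)^2 - F2(x)^2 == 1."""
    return all(
        math.isclose(FUNCS[f1](x) ** 2 - FUNCS[f2](x) ** 2, 1.0,
                     rel_tol=1e-9, abs_tol=1e-9)
        for x in SAMPLE_POINTS
    )

def execute_query():
    """Enumerate all instantiations of the holes and keep the valid ones."""
    return sorted((f1, f2) for f1 in FUNCS for f2 in FUNCS if holds(f1, f2))
```

With this template, `execute_query()` keeps the instantiations (csc, cot) and (sec, tan). If the results were too few or too many, a refinement step would loosen or tighten the template and run the query again.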

New problems generated:

25

Limits/Series Problem

New problems generated:

26

Integration Problem

New problems generated:

27

Determinant Problem

28

Intelligent Tutoring Systems

• Aspects
  – Solution Generation
  – Problem Generation
  – Automated Grading
  – Content Entry

• Domains
  – Geometry
  – Algebra
  – Introductory Programming
  – Automata Theory
  – Physics
  – Chemistry

arXiv TR 2012: Rishabh Singh, Sumit Gulwani, Armando Solar-Lezama.

29

Background: PexForFun

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i < a.Length; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

30

Buggy Program for Array Reverse

6:28:50 AM

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length-1; i < a.Length-1; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

31

Buggy Program for Array Reverse

6:32:01 AM

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length-1; i < a.Length-1; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

32

Buggy Program for Array Reverse

6:32:32 AM

No change! Sign of frustration?

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i <= a.Length; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

33

Buggy Program for Array Reverse

6:33:19 AM

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i < a.Length; i--){
            Console.Writeline(i);
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

34

Buggy Program for Array Reverse

6:33:55 AM

Same as initial attempt except Console.Writeline!

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i < a.Length; i--){
            Console.Writeline(i);
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

35

Buggy Program for Array Reverse

6:34:06 AM

No change! Sign of frustration?

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i <= a.Length; i--){
            Console.Writeline(i);
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

36

Buggy Program for Array Reverse

6:34:56 AM

The student has tried this before!

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i < a.Length; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

37

Buggy Program for Array Reverse

6:36:24 AM

Same as initial attempt!

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length-1; i < a.Length-1; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

38

Buggy Program for Array Reverse

6:37:39 AM

The student has tried this before!

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i > 0; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

39

Buggy Program for Array Reverse

6:38:11 AM

Almost correct! (a[i-1] instead of a[i] in loop body)

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i >= 0; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

40

Buggy Program for Array Reverse

6:38:44 AM

Student going in wrong direction!

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i < a.Length; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

41

Buggy Program for Array Reverse

6:39:33 AM

Back to bigger error!

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i < a.Length; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

42

Buggy Program for Array Reverse

6:39:45 AM

No change! Frustration!

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i < a.Length; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

43

Buggy Program for Array Reverse

6:40:27 AM

No change! More frustration!!

using System;
public class Program {
    public static int[] Puzzle(int[] a) {
        int[] b = new int[a.Length];
        int count = 0;
        for(int i=a.Length; i < a.Length; i--){
            b[count] = a[i];
            count++;
        }
        return b;
    }
}

44

Buggy Program for Array Reverse

6:40:57 AM

No change! Too frustrated now!!! Gives up.

Provides additional value over counterexample feedback.

• More friendly feedback.
  – Helpful for students who give up after several tries (with only counterexample feedback).

• Grading
  – Counterexample feedback does not distinguish between a slightly incorrect solution and one that is very far off from being correct.

45

Proposal: Semantic Grading

46

Demo

Simplifying Assumptions
• Correct solution is known.
• Errors are predictable.
• Programs are small.

Challenging Aspects
• No logical specification.
  – Instead program equivalence.
• Higher density of errors than production code.
• Generate multiple fixes
  – To remain faithful to the student’s thought process.

47

Relation with Automated Bug Fixing

• Teacher provides:
  – A reference implementation
  – Model of errors that students make

• Sketch encoding:
  – Fuzz the program using error model.
  – Use a counter to keep track of number of changes.

• Sketch solving:
  – To generate minimal fixes, assert (counter = i) for increasing values of i.
  – Use off-the-shelf SAT solver to explore the state space of possible corrections.

48

Technique

Array Index Fuzzing: v[a] -> v[{a+1, a-1, v.Length-a-1}]

Initialization Fuzzing: v=n -> v={n+1, n-1, 0}

Increment Fuzzing: v++ -> { ++v, v--, --v }

Return Value Fuzzing: return v -> return ?v

Conditional Fuzzing: a op b -> a’ ops { a+1, a-1, 0 }
    where ops = { <, >, <=, >=, ==, != }

49

Error Model for Array Reverse Problem
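The technique and error model can be illustrated with a small stand-in. This sketch replaces the Sketch encoding and SAT solver with plain brute-force enumeration, and replaces program equivalence with agreement on a handful of test arrays. The student's first attempt is modelled as four slots of the loop `for (i = init; i op bound; i--) b[count++] = a[index]`; the slot names, the expression table, and the reading of the conditional rule (replace the operator, and replace the bound by n+1, n-1 or 0) are assumptions made for this example.

```python
import operator
from itertools import combinations, product

OPS = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

# Expressions over the loop variable i and n = a.Length.
EXPRS = {
    "n": lambda i, n: n, "n+1": lambda i, n: n + 1,
    "n-1": lambda i, n: n - 1, "0": lambda i, n: 0,
    "i": lambda i, n: i, "i+1": lambda i, n: i + 1,
    "i-1": lambda i, n: i - 1, "n-i-1": lambda i, n: n - i - 1,
}

# The student's first attempt: for (i = n; i < n; i--) b[count++] = a[i].
STUDENT = {"init": "n", "op": "<", "bound": "n", "index": "i"}

# Error model: the rewrites allowed for each slot of STUDENT.
ALTERNATIVES = {
    "init": ["n+1", "n-1", "0"],
    "op": [">", "<=", ">=", "==", "!="],
    "bound": ["n+1", "n-1", "0"],
    "index": ["i+1", "i-1", "n-i-1"],
}

def run(prog, a):
    """Interpret the loop; None models an out-of-range array access."""
    n = len(a)
    b = [0] * n
    count = 0
    i = EXPRS[prog["init"]](0, n)
    while OPS[prog["op"]](i, EXPRS[prog["bound"]](i, n)):
        j = EXPRS[prog["index"]](i, n)
        if not (0 <= j < n and count < n):
            return None
        b[count] = a[j]
        count += 1
        i -= 1
    return b

def minimal_fixes(student, tests):
    """Try 0, 1, 2, ... changes; return the first cost that yields fixes."""
    slots = list(ALTERNATIVES)
    for cost in range(len(slots) + 1):
        fixes = []
        for changed in combinations(slots, cost):
            for values in product(*(ALTERNATIVES[s] for s in changed)):
                fix = dict(zip(changed, values))
                candidate = dict(student, **fix)
                # The reference implementation is Python's own a[::-1].
                if all(run(candidate, a) == a[::-1] for a in tests):
                    fixes.append(fix)
        if fixes:
            return cost, fixes
    return None, []

TESTS = [[], [7], [1, 2], [1, 2, 3], [4, 5, 6, 7]]
```

On these tests, `minimal_fixes(STUDENT, TESTS)` reports a minimal cost of 3, and the fixes include both `i = n-1; i >= 0` with `a[i]` and `i = n; i > 0` with `a[i-1]`. Returning every fix at the minimal cost, rather than a single one, mirrors the goal of staying faithful to the student's thought process.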

50

Effectiveness of Error Models

[Figure: chart. x-axis: Error Models (F1 to F5); y-axis: Fraction of Programs fixed (0 to 0.9); series: Array Reverse, Palindrome, Max, Factorial, isIncreasing, Sort.]

51

Efficiency of Error Models

[Figure: chart. x-axis: Error Models (F1 to F5); y-axis: Running Time (in s) (0 to 35); series: Array Reverse, Palindrome, Max, isIncreasing, Factorial, Sort.]

[Figure: chart. x-axis: Palindrome, Max, isIncreasing, Sort; y-axis: Number of Programs Fixed (0 to 80); series: Specific Error Model, Array Reverse Error Model.]

52

Generality of Error Models

Benchmark             Total   Fixed   Changes   Time (s)
Array Reverse           305     254      1.73       2.69
String Palindrome        86      64      1.28       3.52
Array Maximum            99      68      1.25       7.47
Is Increasing Order      51      38      1.72       3.56
Array Sort               74      35      1.17      32.46
Factorial                70      30      1.25       4.99
Friday Rush              17       4      1         12.42

53

Experimental Results

54

Potential Workflow

Teacher grades an ungraded answer-script.

System generalizes corrections into error models.

System performs automated grading by considering all possible combinations and instantiations of all error models.

Any ungraded answer scripts?

Yes

No

55

Intelligent Tutoring Systems

• Aspects
  – Solution Generation
  – Problem Generation
  – Automated Grading
  – Content Entry

• Domains
  – Geometry
  – Algebra
  – Introductory Programming
  – Automata Theory
  – Physics
  – Chemistry

Joint work with: Alex Polozov and Sriram Rajamani

State-of-the-art Mathematical Editors
• Text editors like LaTeX
  – Unreadable text in prefix notation
• WYSIWYG editors like Microsoft Word
  – Change of cursor positions multiple times.
  – Back and forth switching between mouse & keyboard

Our proposal: An intelligent predictive editor.
• Mathematical text has low entropy and is hence amenable to prediction!

56

Mathematical Intellisense

Terms connected by the same AC operator can be thought of as terms belonging to a sequence.

There are 2 opportunities for predicting such terms.

Sequence Creation: T_1, T_2, T_3, …

• Learn a function F such that F(T_i) = T_{i+1}

Sequence Transformation: T_1, T_2, T_3, T_4 -> S_1, S_2, …

• Learn a function F such that F(T_i) = S_i

57

Reducing (Term) Prediction to Learning-By-Examples
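Sequence creation can be sketched as a very small learning-by-examples problem. The function class below is an assumption chosen for brevity, not the editor's actual hypothesis space: terms are plain strings, and F may only add a fixed increment to each integer that appears in a term, leaving everything else unchanged. All names are hypothetical.

```python
import re

# Split a term into maximal runs of digits and runs of non-digits.
TOKEN = re.compile(r"[0-9]+|[^0-9]+")

def split_term(term):
    return TOKEN.findall(term)

def is_number(token):
    return token[0] in "0123456789"

def learn_step(terms):
    """Learn per-number increments with F(T_i) = T_{i+1} for all given terms.

    Returns a list with one entry per token (an int increment for numbers,
    None for fixed text), or None if no such F fits the examples.
    """
    tokenized = [split_term(t) for t in terms]
    deltas = None
    for prev, cur in zip(tokenized, tokenized[1:]):
        if len(prev) != len(cur):
            return None
        step = []
        for p, c in zip(prev, cur):
            if is_number(p) and is_number(c):
                step.append(int(c) - int(p))
            elif p == c:
                step.append(None)
            else:
                return None
        if deltas is None:
            deltas = step
        elif deltas != step:
            return None
    return deltas

def predict_next(terms):
    """Apply the learned F to the last term, or return None if learning fails."""
    deltas = learn_step(terms)
    if deltas is None:
        return None
    return "".join(
        tok if d is None else str(int(tok) + d)
        for tok, d in zip(split_term(terms[-1]), deltas)
    )
```

Given the first two terms of a sum, `predict_next(["x_1*y_2", "x_2*y_3"])` proposes `"x_3*y_4"`. Sequence transformation would fit the same shape, with F learned from (T_i, S_i) pairs instead of consecutive terms.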

58

Mathematical (Syntactic) Intellisense

Prove

= 1

59

Mathematical (Semantic) Intellisense

Long-term Goals

• Ultra-intelligent computer

• Model of human mind

• Inter-stellar travel

60