Introduction to Theoretical Computer Science
Julian Bradfield, jcb@inf.ed.ac.uk, IF–4.07

Transcript of lecture slides.

Page 1:

Introduction to Theoretical Computer Science

Julian Bradfield

[email protected]

IF–4.07


Page 2:

Course Aims

- understanding of computability, complexity and intractability;
- knowledge of lambda calculus, types, and type safety.

Page 3:

Course Outcomes

By the end of the course you should be able to

- Explain decidability, undecidability and the halting problem.
- Demonstrate the use of reductions for undecidability proofs.
- Explain the notions of P, NP, NP-complete.
- Use reductions to show problems to be NP-hard.
- Write short programs in lambda-calculus.
- Explain and demonstrate type-inference for simple programs.


Page 9:

Course Outline

- Introduction. The meaning of computation.
- Register machines and their programming.
- Universal machines and the halting problem.
- Decision problems and reductions.
- Undecidability and semi-decidability.
- Complexity of algorithms and problems.
- The class P.
- Non-determinism and NP.
- NP-completeness.
- Beyond NP.
- Lambda-calculus.
- Recursion.
- Types.
- Polymorphism.
- Type inference.

Page 24:

Assessment

The course is assessed by a written examination (70%) and two coursework exercises (15% each).

Coursework deadlines: 16:00 on 14 February and 20 March (end of weeks 5 and 9; remember the gap week isn't numbered).

The courseworks will, roughly, contain one question for each week of the course. Please do the questions in/following the week to which they apply, rather than waiting till the deadline to do them all!

Page 25:

Lectures

I see the purpose of lectures as providing the key concepts on to which you can hook the rest of the material.

Slides are few in number, and text-dense: I will spend a lot of time talking about things and using the board, with the slides providing the reminder of what we're doing.

Interaction in lectures is encouraged!

From time to time, we'll do relatively detailed proofs, but mostly this will be left to you in exercises and independent study.

Note: The last examinable lecture is (as currently planned) the Monday of week 8, but there are two further lectures offered.

Page 26:

Textbooks

It will be useful, but not absolutely necessary, to have access to:

- Michael Sipser, Introduction to the Theory of Computation, PWS Publishing (International Thomson Publishing)
- Benjamin C. Pierce, Types and Programming Languages, MIT Press

There is also much information on the Web, and in particular Wikipedia articles are generally fairly good in this area.

Generally I will refer to textbooks for the detail of material I discuss on slides.

Page 27:

What is computation? What are computers?

Some computing devices:

The abacus, some millennia BP.

[Association pour le musée international du calcul de l'informatique et de l'automatique de Valbonne Sophia Antipolis (AMISA)]

Page 28:

First mechanical digital calculator, 1642, Pascal

[original source unknown]

Page 29:

The Difference Engine, [The Analytical Engine], 1812, 1832, Babbage / Lovelace.

[ Science Museum ?? ]

The Analytical Engine (never built) anticipated many modern aspects of computers. See http://www.fourmilab.ch/babbage/.

Page 30:

ENIAC, 1945, Eckert & Mauchly

[University of Pennsylvania]

Page 31:

What do computers manipulate?

Symbols? Numbers? Bits? Does it matter?

What about real numbers? Physical quantities? Proofs? Emotions?

Do we buy that numbers are enough? If we buy that, are bits enough?

How much memory do we need?


Page 35:

What can we compute?

If we can cast a problem in terms that our computers manipulate, can we solve it? Always? Sometimes? With how much time? With how much memory?

Page 39:

Register Machines

At the end of Inf2A, you saw Turing machines and undecidability. We'll reprise this using a different model a bit closer to (y)our ideas of what a computer is.

The simplest Register Machine has:

- a fixed number of registers R0, …, Rm−1, which each hold a natural number;
- a fixed program that is a sequence P = I0, I1, …, In−1 of instructions;
- each instruction is one of:
  inc(i): add 1 to Ri, or
  decjz(i, j): if Ri = 0 then goto Ij, else subtract 1 from Ri;
- the machine executes instructions in order, except when decjz causes a jump.

CLAIM: Register Machines can compute anything any other computer can.

What is unrealistic about these machines?
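The model above is concrete enough to execute directly. As an illustration (Python is not part of the course material, and the tuple encoding of instructions is an assumption made here for the sketch), a minimal interpreter might look like:

```python
def run_rm(program, registers, max_steps=100_000):
    """Interpret a register machine.

    program: list of instructions, each ("inc", i) or ("decjz", i, j).
    registers: dict from register index to natural number.
    Halts when the program counter leaves the program; the step limit
    guards against non-terminating machines.
    """
    pc = 0
    for _ in range(max_steps):
        if not 0 <= pc < len(program):
            return registers                      # halted
        op = program[pc]
        if op[0] == "inc":
            registers[op[1]] = registers.get(op[1], 0) + 1
            pc += 1
        else:                                     # ("decjz", i, j)
            _, i, j = op
            if registers.get(i, 0) == 0:
                pc = j                            # jump on zero
            else:
                registers[i] -= 1
                pc += 1
    raise RuntimeError("step limit exceeded; machine may not halt")

# Example: move R0 into R1, destroying R0. Instruction 2 is an
# unconditional 'goto 0', exploiting the always-zero register R-1.
prog = [("decjz", 0, 3), ("inc", 1), ("decjz", -1, 0)]
print(run_rm(prog, {0: 3}))                       # prints {0: 0, 1: 3}
```

Running a few programs by hand in such an interpreter is a good way to get a feel for how painful raw RM programming is, which motivates the macros introduced next.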

Page 46:

Programming RMs

RMs are very simple, so programs become very large.

First of all, we'll define 'macros' to make programs more understandable.

These are not functions or subroutines as in C or Java: they are like #define in C, human abbreviations for longer bits of code.

We'll write them in English, e.g. 'add Ri to Rj clearing Ri'.

When we define a macro, we'll number instructions from zero. When the macro is used, the numbers must be updated appropriately ('relocated').

We'll also use symbolic labels for instructions, and use them in jumps.

We can then build macros into the hardware by adding new instructions that may have access to special registers: we'll use negative indices for registers that are not to be used by normal programs.

Convention: macros must leave all special registers they use at zero, and hence the special registers can be assumed to be zero on entry.
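Relocation is a purely mechanical process: when a macro numbered from zero is pasted into a larger program starting at instruction number k, every internal jump target shifts by k. A hedged sketch (the tuple instruction encoding is an illustrative assumption, not course notation):

```python
def relocate(macro, base):
    """Shift a macro's internal jump targets when the macro is pasted
    into a larger program starting at instruction number `base`.

    Instructions are ("inc", i) or ("decjz", i, j), where j is an
    index within the macro, numbered from zero.
    """
    out = []
    for op in macro:
        if op[0] == "decjz":
            _, i, j = op
            out.append(("decjz", i, j + base))    # renumber the target
        else:
            out.append(op)                        # inc needs no change
    return out

# The 'clear R5' macro (test/decrement, then an unconditional jump
# back via the always-zero R-1), pasted in at slot 10:
clear_r5 = [("decjz", 5, 2), ("decjz", -1, 0)]
print(relocate(clear_r5, 10))   # [('decjz', 5, 12), ('decjz', -1, 10)]
```

Note that the exit target 2 becomes 12: the relocated macro occupies slots 10 and 11, so control falls out to slot 12, exactly as intended.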

Page 50:

Easy RM programs

- 'goto Ij' using R−1 as temp:
  0: decjz(−1, j)

- 'clear Ri':
  0: decjz(i, 2)
  1: goto 0

- 'copy Ri to Rj' using R−2 as temp:
  0: clear Rj
  loop1:
  2: decjz(i, loop2)
  3: inc(j)
  4: inc(−2)
  5: goto loop1
  loop2:
  6: decjz(−2, end)
  7: inc(i)
  8: goto loop2
  end: 9
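The copy routine works in two phases: loop1 drains Ri into both Rj and the temporary R−2, and loop2 drains the temporary back into Ri, restoring it. A sketch of the same control flow in Python, using only unit increments and test-for-zero-else-decrement steps to mirror inc/decjz (the function name is hypothetical):

```python
def copy_via_temp(r, i, j, t=-2):
    """Copy r[i] into r[j], preserving r[i], with the two-loop scheme
    from the slide: loop1 drains Ri into Rj and a temp; loop2 drains
    the temp back into Ri. Only unit steps are used, mirroring the
    inc/decjz instructions.
    """
    r[j] = 0                      # 'clear Rj'
    r.setdefault(t, 0)
    while r[i] != 0:              # loop1: decjz(i, loop2)
        r[i] -= 1
        r[j] += 1                 # inc(j)
        r[t] += 1                 # inc(-2)
    while r[t] != 0:              # loop2: decjz(-2, end)
        r[t] -= 1
        r[i] += 1                 # inc(i)
    return r

print(copy_via_temp({0: 4, 1: 7}, 0, 1))   # prints {0: 4, 1: 4, -2: 0}
```

Observe that the special register R−2 ends at zero, as the macro convention requires.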

Page 53:

RM programming exercises

- Addition/subtraction of registers
- Comparison of registers
- Multiplication of registers
- (integer) Division/remainder of registers
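As a hint at the flavour of these exercises (a sketch under the same inc/decjz discipline, not an official solution), multiplication can be done by repeated addition, with a temporary register to restore the multiplicand after each pass:

```python
def multiply(r, i, j, k, t=-3):
    """Sketch for the multiplication exercise: set Rk := Ri * Rj by
    repeated addition, consuming Ri. Each pass adds Rj into Rk while
    parking Rj's value in temp register t, then restores Rj from t.
    Only unit steps are used, mirroring inc/decjz.
    """
    r[k] = 0
    r.setdefault(t, 0)
    while r[i] != 0:              # one addition pass per unit of Ri
        r[i] -= 1
        while r[j] != 0:          # add Rj into Rk, parking it in t
            r[j] -= 1
            r[k] += 1
            r[t] += 1
        while r[t] != 0:          # restore Rj from the temp
            r[t] -= 1
            r[j] += 1
    return r

print(multiply({0: 3, 1: 4, 2: 0}, 0, 1, 2))   # R2 ends at 12
```

Translating each while-loop into a decjz/goto pattern, as in the copy routine, turns this into a genuine RM program.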

Page 57:

How many registers?

So far, we have introduced registers as needed. Do we need machines with arbitrarily many registers?

A pairing function is an injective function N × N → N.

An easy mathematical example is (x, y) ↦ 2^x 3^y.

Given a pairing function f, write ⟨x, y⟩2 for f(x, y). If z = ⟨x, y⟩2, let z0 = x and z1 = y.

Exercise: program pairing and unpairing functions in an RM.
Exercise: design (or look up) a surjective pairing function.

Can generalize to ⟨…⟩n : N^n → N. (Exercise: give two different ways.)

And thence to ⟨…⟩ : N* → N.

Thus we can compute with arbitrarily many numbers with a fixed number of registers.

With a couple of extra instructions ('goto' and 'clear'), just two user registers are enough. (Think about what this means.)
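The example pairing function and its inverse are easy to check on small numbers (an illustrative sketch; note that unpairing is only meaningful on numbers of the form 2^x 3^y, since this pairing is injective but not surjective):

```python
def pair(x, y):
    """The slide's injective pairing: (x, y) -> 2^x * 3^y."""
    return 2**x * 3**y

def unpair(z):
    """Recover (x, y) from z = 2^x * 3^y by dividing out the 2s and
    then the 3s. Only valid on numbers actually of that form."""
    x = 0
    while z % 2 == 0:
        z //= 2
        x += 1
    y = 0
    while z % 3 == 0:
        z //= 3
        y += 1
    return x, y

print(pair(2, 1))     # 2^2 * 3^1 = 12
print(unpair(12))     # (2, 1)
```

The divide-out loops are exactly the kind of thing the RM exercise asks you to implement with the division/remainder routines from the previous slide.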

Page 64: Julian Brad eld jcb@inf.ed.ac.uk IF{4

Towards a universal RM

Programs are data.

Can we build an RM that can interpret any other RM?

Need to encode RMs as numbers. What do we need for a machine M?

We need the contents R of the registers R0, . . . ,Rm−1, the programP = I0 . . . In−1 itself, and the ‘program counter’ C giving the currentinstruction.

Let p. . .q be the coding function given thus:

pinc(i)q = 〈0, i〉pdecjz(i , j)q = 〈1, i , j〉

pPq = 〈pI0q, . . . , pIn−1q〉pRq = 〈R0, . . . ,Rm−1〉pMq = 〈pPq, pRq,C 〉

Routine but very tedious Exercise: design an RM that takes an RMcoding in R0, simulates it, and leaves the final state (if any) in R0.

19


and on to the halting problem

“the final state (if any)” . . .

Can we tell computationally whether (any) simulated machine halts, by some clever analysis of the program?

▶ Suppose H is an RM (PH, R₀, …) which takes a machine coding ⌜M⌝ in R₀, and terminates with 1 in R₀ if M halts, and 0 in R₀ if M runs forever.

▶ It’s then easy to construct L = (PL, R₀, …), which takes a program (code) ⌜P⌝ and terminates with 1 if H returns 0 on the machine (P, ⌜P⌝), and itself goes into an infinite loop if H returns 1 on (P, ⌜P⌝).

▶ (L takes a program, and runs the halting test on the program with itself as input (diagonalization), and loops iff it halts.)

▶ What happens if we run L with input ⌜PL⌝?

▶ If L halts on ⌜PL⌝, that means that H says that (PL, ⌜PL⌝) loops; and if L loops on ⌜PL⌝, that means that H says that (PL, ⌜PL⌝) halts. So either way, contradiction.
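The construction of L can be rendered in Python, with the hypothetical decider left unimplemented (no such decider can exist, which is the point):

```python
# `halts` stands for the hypothetical decider H: it is supposed to return
# True iff the given program halts on the given input. It cannot exist.
def halts(prog_source, input_source):
    raise NotImplementedError("no such decider can exist")

# L runs the halting test on its input applied to itself (diagonalization),
# then does the opposite of the prediction:
def L(prog_source):
    if halts(prog_source, prog_source):   # H predicts halting...
        while True:                       # ...so loop forever
            pass
    return 1                              # H predicts looping, so halt

# Feeding L its own source contradicts either answer H could give:
# if halts says True, L loops; if halts says False, L halts.
```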

20


remarks

That (sketch) proof assumed very little – doesn’t actually need RMs, just a function of two inputs that can be understood as program and data. So why is it such a big deal?

Because we convinced ourselves (?) that RMs can compute anything that is computable in any reasonable sense. So we’ve shown that some things can’t be computed in any reasonable sense – which was not obvious.

But are Register Machines the right thing? Maybe other machines can do more?

21


Turing machines

(Reprise from Inf2A) Turing’s original paper (quite readable – read it!) used what we now call Turing machines.

A Turing machine comprises:

▶ a tape t = t₀t₁… with an unlimited number of cells, each of which holds a symbol tᵢ from a finite alphabet A;

▶ a head which can read and write one cell of the tape, and be moved one step left or right; let h be the position of the head;

▶ a finite set S of control states;

▶ a next function n : S × A → (S × A × {+1, −1}) ∪ {halt}.

The execution of the machine is:

▶ Suppose the current state is s, and the head position is h; then let (s′, a′, k) = n(s, tₕ): then cell h is written with the symbol a′, the head moves to position h + k, and the next state is s′;

▶ if n(s, tₕ) = halt, or h + k < 0, the machine halts.
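The definition above translates almost line for line into a small simulator (a sketch, not course code; the step budget is an added assumption to keep a demo run finite):

```python
# A Turing machine per the slide: next_fn maps (state, symbol) to
# (state', symbol', +1 or -1), or "halt". The tape is semi-infinite;
# unwritten cells read as the blank symbol "_".

def run_tm(next_fn, tape, state, steps=10_000):
    tape = dict(enumerate(tape))
    h = 0
    for _ in range(steps):
        out = next_fn(state, tape.get(h, "_"))
        if out == "halt":
            return tape, h
        state, a, k = out
        tape[h] = a                # write cell h
        if h + k < 0:              # moving off the left end halts too
            return tape, h
        h += k                     # move the head
    raise RuntimeError("step budget exhausted")

# Example machine: flip bits until the first blank, then halt.
def flip(s, a):
    if a == "0": return (s, "1", +1)
    if a == "1": return (s, "0", +1)
    return "halt"
```

Running `run_tm(flip, list("0110"), "s")` leaves the complemented string 1001 on the tape.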

22


Programming Turing machines

How to represent numbers? In binary, with 0 and 1? In unary, with just 1?

Typically use binary. Usually convenient to have a ‘blank’ symbol, and a few marker symbols such as $.

For further information and examples, see Sipser ch. 3.

Exercise: Design a TM (with any convenient alphabet) to interpret an RM program.

Exercise: Design an RM to interpret a TM specification.

(Don’t really do these – just think about how they can be done.)

A predecessor course

http://www.inf.ed.ac.uk/teaching/courses/ci/

provides a Java TM simulator and a bunch of TMs, including a universal TM!

23


TM variations

Bi-infinite tapes, multiple tapes, etc. may make programming easier, but aren’t necessary.

One semi-infinite tape and two symbols are all that is needed.

24

Computable functions

Much of what we say henceforth could be about any reasonable ‘domain’: graphs, integers, rationals, trees. We will use N as the canonical domain, and rely on the (usually obvious) fact that any reasonable domain can be encoded into N.

A (total) function f : N → N is computable if there is an RM/TM which computes f (e.g. by taking x in R₀ and leaving f(x) in R₀) and always terminates.

Note that predicates (e.g. ‘does the machine halt?’) are functions, viewing 0 as ‘no’ and 1 as ‘yes’.

You will also see the term ‘recursive function’. This is confusing, because it doesn’t mean ‘recursive’ in the sense you know, though there is a strong connection. This terminology is slowly going out of use. We may come to it . . .

25


Proving (un)decidability

Is the halting problem the only natural (?) undecidable problem?

How can we show that a ‘real’ problem is decidable or undecidable?

To show a problem decidable: write a program to solve it, prove the program terminates. Alternatively, translate it to a problem you already know how to decide.

Why doesn’t this contradict the halting theorem?

To show a problem undecidable: prove that if you could solve it, you could also solve the halting problem.

26


Decision problems and oracles

A decision problem is a set D (domain) and a subset Q (query) of D. The problem is ‘is d ∈ Q?’, or ‘is Q(d) 0 or 1?’. The problem is computable or decidable iff the predicate Q is computable.

For example:

▶ the domain N and the subset Primes;

▶ the domain RM of register machine (encodings), and the subset H of those that halt.

Given a decision problem (D, Q), an oracle for Q is a ‘magic’ RM instruction oracleQ(i) which assumes that Rᵢ contains (an encoding of) d ∈ D, and sets Rᵢ to contain Q(d).

If Q is itself decidable, oracleQ(i) can be replaced by a call to an RM which computes Q – thus a decidable oracle adds nothing.

27


Reductions

Reductions are a key technique. Simple concept: turn one problem into another. But there are subtleties.

A Turing transducer is an RM that takes an instance d of a problem (D, Q) in R₀ and halts with an instance d′ = f(d) of (D′, Q′) in R₀. (Thus f is a computable function D → D′.)

A mapping reduction or many–one reduction from Q to Q′ is a Turing transducer f as above such that d ∈ Q iff f(d) ∈ Q′.

Equivalently, an m-reduction is a Turing transducer which after putting f(d) in R₀ then calls oracleQ′(0) and halts.

NOTE that there is no post-processing of the oracle’s answer!

Is this intuitively reasonable?

If you allow the transducer to call the oracle freely, you get a Turing reduction. These are also useful; but they’re more powerful, so don’t make such fine distinctions of computing power.
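The shape of an m-reduction can be shown with an invented toy pair of problems (not from the slides): EVEN m-reduces to ‘divisible by 4’ via the transducer f(x) = 2x, since x is even iff 2x is divisible by 4.

```python
# A toy mapping reduction. Note the shape: compute f, ask the target
# question once, and return its answer unchanged -- no post-processing.

def is_mult4(n):          # the target problem Q'
    return n % 4 == 0

def f(x):                 # the (computable) Turing transducer
    return 2 * x

def is_even(x):           # EVEN decided via the reduction: x in Q iff f(x) in Q'
    return is_mult4(f(x))
```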

28


Page 111: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Showing undecidability

Reductions are the key (really the only) tool for showing undecidability, thus:

I Given (D,Q), construct a reduction Red(H,Q) from (RM/TM,H) to (D,Q).

I Suppose Q is decidable. Then given M ∈ RM, feed it to Red(H,Q) to decide M ∈ H. But if Q is decidable, we can replace the oracle, and Red(H,Q) is just an ordinary RM. Hence H is decidable – contradiction. So Q must be undecidable.

Constructing Red is usually either very easy, or rather long and difficult!

For a relatively simple example of a long and complicated reduction, see https://hal.archives-ouvertes.fr/hal-00204625v2, a simplified proof by Nicolas Ollinger of the famous ‘tiling problem’ due to Hao Wang.

29
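The schema above can be sketched in code. This is a hypothetical shape only: for an undecidable Q no decider exists, which is exactly the point of the contradiction; the names and the decidable stand-in problems are ours.

```python
# Hypothetical shape of the undecidability argument (names are ours).
# If Q had a decider, composing it with the transducer Red would give a
# decider for the source problem -- a contradiction when the source is H.

def compose_decider(red, decide_Q):
    # red : source instance -> target instance, with  m in source  iff
    #       red(m) in Q. If decide_Q existed, this would decide the source.
    def decide_source(m):
        return decide_Q(red(m))
    return decide_source

# Demonstrated on decidable stand-ins: reduce "even" to "divisible by 4"
# via m |-> 2m, then recover an evenness decider from a div-by-4 decider.
decide_even = compose_decider(lambda m: 2 * m, lambda n: n % 4 == 0)
assert decide_even(10) and not decide_even(7)
```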

Page 114: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

The Uniform Halting Problem

A simple example:

The Uniform Halting Problem (UH) asks, given a machine M, does M halt on all inputs?

‘Clearly’ at least as hard as H, so must be undecidable. Proof:

I We need an m-reduction from H to UH:

I given machine M, input R, build a machine M′ which ignores its input and overwrites it with R, then behaves as M.

I Then M′ halts on any input iff M halts on R.

Check that this really is a transducer.

30
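The construction in the bullet points can be sketched directly (our own modelling, not the course's: a ‘machine’ is a Python function of one argument, and M′ is a closure over M and R).

```python
# Sketch of the H-to-UH transducer (our modelling: machines are Python
# functions; halting = the call returning). Given M and a fixed input R,
# build M' that throws away its own input and behaves as M on R.

def build_M_prime(M, R):
    def M_prime(_ignored_input):
        return M(R)           # overwrite the input with R, then run M
    return M_prime

# By construction, M' halts on *every* input iff M halts on R.
M = lambda r: r + 1           # a toy machine that always halts
M_prime = build_M_prime(M, 41)
assert M_prime(0) == M_prime(999) == 42
```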

Page 119: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

The Looping Problem

Let L be the subset of RMs (or TMs) that go into an infinite loop. Show that L is undecidable.

Since L is just the complement of H, this seems equally easy: just flip the answer. But m-reductions can’t post-process the answer.

Can you devise an m-reduction from H to L?

No! You can’t.

Can you devise a Turing reduction from H to L?

31
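A Turing reduction from H to L is, by contrast, easy to sketch: the reducing machine may post-process the oracle's answer. (The modelling below is ours: machines are opaque tokens and the oracle is an assumed black box.)

```python
# Turing reduction from H to L, sketched: call the L-oracle and *negate*
# its answer -- exactly the post-processing an m-reduction is denied.
# (Machines are opaque tokens here; the oracle is an assumed black box.)

def decide_H_via_L_oracle(machine, oracle_L):
    return not oracle_L(machine)   # M halts  iff  M does not loop

# Toy oracle over labelled stand-in machines:
oracle_L = lambda m: m == "looper"
assert decide_H_via_L_oracle("halter", oracle_L) is True
assert decide_H_via_L_oracle("looper", oracle_L) is False
```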

Page 124: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

When ‘yes’ is easier than ‘no’

The halting problem is not symmetrical:

I if M ∈ H, then we can determine this: run M, and when it halts, say ‘yes’;

I if M ∉ H, then we can’t determine this: run M, but it never stops, and we can never say ‘no’.

Such problems are called semi-decidable.

The problem L is the opposite: we can determine ‘no’, but not ‘yes’. It is called co-semi-decidable.

There are two ways of exploiting this asymmetry – we’ll see them quickly now, and in more detail in the second section of the course.

32
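The asymmetry can be sketched with generators standing in for machines (our own modelling): one next() is one computation step, and exhaustion means halting.

```python
# Semi-deciding H, sketched (our modelling: a machine is a Python
# generator; one next() = one step; StopIteration = halting).
# On a halting machine this says 'yes'; on a looper it runs forever,
# so it can never say 'no' -- the asymmetry described above.

def semi_decide_halts(machine):
    for _ in machine:         # may run forever if the machine loops
        pass
    return True               # reached only once the machine has halted

halting_machine = iter(range(5))   # a toy machine that halts after 5 steps
assert semi_decide_halts(halting_machine) is True
```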

Page 128: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Enumeration

It’s convenient to talk of RMs that ‘output an infinite list’ – as in i = 0; while ( true ) print i++; – which can be formalized in several ways. (Exercise: think of a few.)

Suppose (N,Q) is decidable. Then we can write a machine which outputs the set Q:

I for each n ∈ {0, 1, 2, . . .}, compute Q(n) and output n iff Q(n);

I thus every n ∈ Q will eventually be output.

This is an enumeration of Q, and we say that Q is computably enumerable (c.e. for short).

H is not decidable – but can we still enumerate it?

33
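The enumeration of a decidable Q can be sketched directly (the primes are our own arbitrary stand-in for Q, and the cut-off is only for demonstration, since a real enumeration never stops).

```python
# Enumerating a decidable set Q: test 0, 1, 2, ... in turn and output n
# iff Q(n). Every member is eventually output. (Q = primes here is just
# a stand-in; 'limit' truncates the infinite enumeration for the demo.)

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def enumerate_Q(Q, limit):
    for n in range(limit):
        if Q(n):
            yield n

assert list(enumerate_Q(is_prime, 20)) == [2, 3, 5, 7, 11, 13, 17, 19]
```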

Page 132: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Enumeration by interleaving

Observe that the set of valid register machine encodings is decidable: given n, we can check whether n is ⌜M⌝ for a syntactically valid machine M.

Therefore we can enumerate ⌜M0⌝, ⌜M1⌝, . . .

Now consider the following pseudo-code:

machineList := ⟨⟩            (a sequence storing a list of machine codes)
for (i := 0; true; i++):
    add ⌜Mi⌝ to machineList
    foreach ⌜M⌝ in machineList:
        run M for one step and update it in machineList
        if M has halted:
            output ⌜M⌝
            delete ⌜M⌝ from machineList

This program outputs H: every halting machine is eventually output.

H is computably enumerable.

34
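The pseudo-code above can be sketched as runnable Python, with generators standing in for register machines (one next() per step; this modelling and all names are ours, and a round bound truncates the genuinely infinite loop for the demo).

```python
# Dovetailing enumerator for H, sketched (generators model machines:
# next() = one step, StopIteration = halting). Each round adds one new
# machine and steps every live machine once, so no looper can block the
# rest; every halting machine is eventually output.

def enumerate_halters(machine_makers, rounds):
    halted, running = [], []
    for i in range(rounds):                 # 'rounds' bounds the demo
        if i < len(machine_makers):
            running.append((i, machine_makers[i]()))
        still_running = []
        for idx, m in running:
            try:
                next(m)                     # run machine idx for one step
                still_running.append((idx, m))
            except StopIteration:
                halted.append(idx)          # it halted: output its index
        running = still_running
    return halted

def halts_after(k):
    def machine():
        for _ in range(k):
            yield
    return machine

def looper():
    while True:
        yield

# Machine 1 loops forever, yet machines 0 and 2 are still enumerated.
assert enumerate_halters([halts_after(2), looper, halts_after(0)], 10) == [0, 2]
```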

Page 135: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Semi-decidability and enumerability

The interleaving technique lets us see that

I Any semi-decidable problem is also computably enumerable

Exercise: Show that the converse is true:

I Any c.e. problem is semi-decidable.

Now we’ve shown that semi-decidability is the same as c.e.-ness.

Note that if (D,Q) is semi-decidable, then (D,D\Q) is co-semi-decidable.

35

Page 137: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Back to Looping

Suppose that (D,Q) is both semi-decidable and co-semi-decidable.

I So Q and D \ Q are both semi-decidable.

I So run a semi-deciding machine for Q in parallel with a semi-decider for D \ Q, using interleaving.

I Whatever instance d we look at, one of the two will halt – so stop then.

So (D,Q) is decidable.

Now we know that L cannot be semi-decidable, as it’s the complement of semi-decidable H.

(Exercise: When I say the complement of L is H, what is the domain D? Do I need to take care?)

36
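The parallel run in the bullets can be sketched with interleaving (generators as semi-deciders; the modelling is ours): step the two machines alternately, and answer according to whichever halts first.

```python
# Decider from a semi-decider and a co-semi-decider, by interleaving
# (our modelling: each semi-decider is a generator that halts -- raises
# StopIteration -- exactly when its answer applies to the instance).

def decide_by_interleaving(semi_yes, semi_no):
    while True:
        try:
            next(semi_yes)            # one step of the 'yes' machine
        except StopIteration:
            return True
        try:
            next(semi_no)             # one step of the 'no' machine
        except StopIteration:
            return False

def looper():
    while True:
        yield

# Exactly one of the pair halts, so the interleaving always terminates.
assert decide_by_interleaving(iter(range(3)), looper()) is True
assert decide_by_interleaving(looper(), iter(range(3))) is False
```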

Page 140: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Uniform Halting again

We showed, easily, that UH was undecidable. Is it semi-decidable?

We used reductions from H to show un-decidability. Do reductions from L show un-semi-decidability? Yes!

Suppose f reduces Y to X, and suppose that X is semi-decidable by machine MX. Then since Y(y) = X(f(y)), if Y(y) = 1 we can translate y to f(y), run MX, and get 1. If Y(y) = 0, then MX applied to f(y) either gives 0, or doesn’t halt.

So if X is semi-decidable, so is Y; or if Y is not semi-decidable, then X can’t be.

37

Page 144: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Reducing L to UH

We want a transducer f(M,R) such that f(M,R) halts on all inputs iff M loops on R. Not obvious . . . we could ignore the inputs to f(M,R), but how do we make it stop?

By a clever trick: f(M,R) will measure how long M runs for, with a timeout as its input.

I Let M′ = f(M,R) be a machine which takes a number n as input, and then simulates M on R for (at most) n steps.

I If M halts on R before n steps, then M′ goes into a loop;

I while if M hasn’t halted, M′ just halts.

I Hence if M loops on R, M′ halts on all inputs n; and if M halts, then M′ loops on all sufficiently large n, and so doesn’t halt on all inputs.

38
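The timeout trick can be sketched as follows (generators as machines again, our own modelling; rather than letting M′ genuinely loop, the sketch just reports whether M′ would halt on the bound n).

```python
# The timeout trick for reducing L to UH, sketched (generators model
# machines; make_M(R) starts a fresh run of M on input R). M' takes a
# bound n, simulates M on R for at most n steps, and inverts the outcome.
# Rather than actually looping, this sketch *reports* whether M' halts.

def reduce_L_to_UH(make_M, R):
    def M_prime_halts_on(n):
        m = make_M(R)
        for _ in range(n):
            try:
                next(m)
            except StopIteration:
                return False      # M halted within n steps: M' loops
        return True               # M still running after n steps: M' halts
    return M_prime_halts_on

halter = lambda R: iter(range(3))     # a toy M that halts on R after 3 steps
looper = lambda R: iter(int, 1)       # a toy M that loops on R forever

# If M loops on R, M' halts on every input; if M halts, M' eventually loops.
assert all(reduce_L_to_UH(looper, 0)(n) for n in range(50))
f = reduce_L_to_UH(halter, 0)
assert f(2) is True and f(10) is False
```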

Page 154: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

So UH is not . . .

The reduction from L shows that UH is not semi-decidable.

Is UH co-semi-decidable (like L)?

No! The (trivial) reduction from H to UH also shows that UH is not co-semi-decidable.

                          UH                               Π⁰₂
                           ↑
    semi-decidable Σ⁰₁     H       L     co-semi-decidable Π⁰₁
                        decidable ∆⁰₁

H has form ‘∃n.M halts after n’; L has form ‘∀n.M doesn’t halt after n’; UH has form ‘∀R.∃n.M halts after n on R’. Really the diagram should be . . .

39

Page 155: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Harder and harder

Suppose we consider RMᴴ, the class of register machines with an oracle for H.

We can easily adapt our universal machine to produce a universalᴴ machine for this class.

So we can apply diagonalization, and show that:

I There is no RMᴴ machine that computes the halting problem for RMᴴ.

We can keep doing this, and generate harder and harder problems – the arithmetical hierarchy. Sadly, we don’t have (much) time to explore this fascinating topic.

40

Page 159: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Complexity of algorithms and problems

Recall that in Inf2B (and in ADS if you did it) we calculated the running cost of algorithms: typically as a function f(n) of the size n of the input.

Recall big-O notation: f ∈ O(g) means that f is asymptotically bounded above by g, or formally

∃c, n0. ∀n > n0. f(n) ≤ c · g(n)

Similarly f ∈ Ω(g) means f is bounded below by g:

∃c, n0. ∀n > n0. f(n) ≥ c · g(n)

Recall that for some problems, we could show that any algorithm must take at least a certain time. E.g. any sorting algorithm must be Ω(n lg n), and we have an O(n lg n) algorithm.

41
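The quantifier definition can be checked mechanically for a claimed witness pair (c, n0) over a finite range. This gives evidence for membership in O(g), not a proof; the function choices below are our own examples.

```python
# Checking a claimed big-O witness (c, n0) against the formal definition
# f(n) <= c * g(n) for all n > n0 -- over a finite range only, so this
# is supporting evidence, not a proof.

def check_big_O_witness(f, g, c, n0, upto=10_000):
    return all(f(n) <= c * g(n) for n in range(n0 + 1, upto))

# 3n + 7 is in O(n): c = 4, n0 = 7 works, since 3n + 7 <= 4n  iff  n >= 7.
assert check_big_O_witness(lambda n: 3 * n + 7, lambda n: n, c=4, n0=7)

# n^2 is not in O(n): no fixed (c, n0) survives a long enough range.
assert not check_big_O_witness(lambda n: n * n, lambda n: n, c=100, n0=1)
```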

Page 163: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Variations in models of computing

In Inf2/ADS, there was the question of what counts as a step of computation.

E.g. in sorting, we counted the number of comparisons – and assumed that it costs as much to compare a small number as a big number.

Reasonable, for numbers that fit in a machine word!

What about the cost of control flow etc.?

Do we care about whether we’re using Turing machines, register machines, random access machines, or whatever?

On a register machine, adding two numbers is an O(n) operation (because of the decrement/increment loop). But if we add addition as a primitive, it’s O(1). On a Turing machine, it’s O(lg n) (work in binary, and do primary school long addition).

It would be nice to ignore these differences.

42
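The gap can be made concrete with a toy step count (the cost model here is our own simplification, not the course's formal one):

```python
# Unary vs binary addition cost, in a toy step-count model of our own:
# the basic RM moves b into a one unit at a time (n steps), while binary
# long addition processes only about lg n bit-columns.

def unary_add(a, b):
    steps = 0
    while b > 0:
        a, b = a + 1, b - 1       # one decrement/increment pair per step
        steps += 1
    return a, steps

def binary_add(a, b):
    steps = max(a.bit_length(), b.bit_length()) + 1   # columns + carry
    return a + b, steps

assert unary_add(5, 1000) == (1005, 1000)   # O(n) steps
assert binary_add(5, 1000) == (1005, 11)    # O(lg n) "steps"
```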

Page 167: Julian Brad eld jcb@inf.ed.ac.uk IF{4

Time to extend our RMs a bit

Our very basic RMs essentially work in unary: that’s why addition is O(n) instead of O(lg n).

This is too big a slow-down to ignore – it’s exponential in the size (number of bits = lg n) of the input!

So from now on we assume instructions add(i, j) and sub(i, j) which add/truncated-subtract Rj to/from Ri, putting the result in Ri. (This is enough to do everything else.)

Now our machines are too fast, because we’re adding in time 1, instead of in time O(lg n) = O(bits). For our purposes now, this is just a linear factor, so doesn’t matter. If you prefer, you can refine the machine to have registers implemented as bit sequences, with bit manipulation instructions.
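As a hint of why add/sub are "enough to do everything else", here is a sketch of multiplication built from addition and truncated subtraction by repeated doubling, using only O(lg b) additions (the bit-test and halving steps would themselves be small loops on a real RM; this is an illustration, not the course's formal construction):

```python
# Multiplication from the add/sub primitives, "Russian peasant" style.

def truncated_sub(a, b):
    return a - b if a > b else 0      # RM-style subtraction: never negative

def multiply(a, b):
    result = 0
    while b > 0:
        if b & 1:                     # low-bit test (expressible via sub loops)
            result += a               # add(result, a)
        a += a                        # add(a, a): doubling
        b >>= 1                       # halving (itself a small loop on an RM)
    return result                     # O(lg b) passes, O(1) adds per pass

print(multiply(123, 45))              # 5535
```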

43


What counts as seriously different?

Can we separate easy problems from hard problems?

If a problem is O(n) on some model, it’s surely easy on any model.

Actually, O(n) is not ‘easy’ when n is a petabyte. There’s a whole field of ‘sub-linear’ algorithms.

If a problem is Ω(2^n) on some (non-unary) model, it’s surely hard on any model.

Actually, there are problems that are much worse than Ω(2^n), but still quite solvable for real examples. We’ll see one later.

What about something that is O(n^10) or Ω(n^10)?

An n^10 problem is probably practically insoluble – but maybe there’s a trick or a fancy new model that makes it O(n^2)?

There’s also that c . . . if f(n) ≥ 10^100 lg n, that’s not very tractable. But this doesn’t often happen.

44


P – the class of polynomial-time problems

A problem (decision problem or function) or algorithm is in P or PTime if it can be computed in time f(n) ∈ O(n^k) for some k.

The class P is robust: any reasonable change of model doesn’t change it, and moreover any reasonable translation of one problem into another preserves membership in P.

Problems in P are called tractable.

This is somewhat bogus (see previous slide).

Nonetheless, a problem in P is generally easy-ish in practice, and a problem not in P is generally hard in practice.

Most of the algorithms you’ve seen are in P.

Note that if a problem is not in P, it’s Ω(n^k) for every k (and so ω(n^k) for every k). E.g. 2^n, or 2^√n.

45


Polynomial-time reductions

Just as with computability, reductions are the key tool for comparing the complexity of problems.

A PTime reduction or polynomial reduction from (D,Q) to (D ′,Q ′) is:

I a PTime-computable function r : D → D ′, such that

I d ∈ Q ⇔ r(d) ∈ Q ′

We say Q ≤P Q′.

Clearly, if Q′ ∈ P and Q ≤P Q′, then also Q ∈ P.
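A standard worked example of such a reduction (a textbook illustration, not one from the slides): Independent Set reduces to Vertex Cover, because S is an independent set iff V \ S is a vertex cover, so "G has an independent set of size ≥ k" maps to "G has a vertex cover of size ≤ |V| − k", and the map r is clearly PTime-computable.

```python
# A concrete PTime (many-one) reduction: Independent Set <=_P Vertex Cover.

def reduce_is_to_vc(vertices, edges, k):
    """r : (G, k) -> (G, |V| - k), computable in linear time."""
    return vertices, edges, len(vertices) - k

def is_independent_set(edges, s):
    return not any(u in s and v in s for (u, v) in edges)

def is_vertex_cover(edges, c):
    return all(u in c or v in c for (u, v) in edges)

# Sanity check on a triangle: max independent set 1, min vertex cover 2.
V, E = {1, 2, 3}, [(1, 2), (2, 3), (1, 3)]
_, _, k2 = reduce_is_to_vc(V, E, 1)
print(k2)                                                       # 2
print(is_independent_set(E, {1}), is_vertex_cover(E, V - {1}))  # True True
```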

Exercise: If we replace ‘PTime-computable’ by ‘computable’, do we get the notion of many–one reduction we used before, or the notion of Turing reduction? Does it matter?

For function problems, it makes sense to use a Turing reduction. Exercise: Express this in terms of oracles.

46


Outside P

What does it mean to say that a problem cannot be computed by a polynomial algorithm? What is an algorithm?

A polynomially-bounded RM is an RM together with a polynomial (wlog n^k for some k), such that if the size of the input (e.g. the number of bits of the number in R0) is n0, then the machine halts after executing n0^k instructions.
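The discipline can be sketched as a bounded interpreter: rather than analysing the machine's running time, just cut it off after n^k steps. (Here `step` is an assumed one-step transition function and the toy machine is hypothetical, for illustration only.)

```python
# Run a machine under a hard step budget n^k, timing out if it exceeds it.

def run_bounded(step, state, input_bits, k):
    budget = input_bits ** k          # the machine's declared bound n^k
    for _ in range(budget):
        state = step(state)
        if state.get("halted"):
            return state["answer"]
    return "timeout"                  # exceeded the bound

# Toy machine: decrements R0 once per step, halting when it hits zero.
toy = lambda s: {"r0": s["r0"] - 1, "halted": s["r0"] - 1 == 0, "answer": True}
print(run_bounded(toy, {"r0": 8, "halted": False}, input_bits=4, k=2))    # True
print(run_bounded(toy, {"r0": 100, "halted": False}, input_bits=4, k=2))  # timeout
```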

Then formally a problem Q is in P iff it can be computed by some polynomially-bounded RM.

This avoids the problem of having to analyse the running time of the machine (which we know is impossible in general!).

To show that Q ∉ P, we therefore have to show that every polynomially-bounded RM claiming to solve Q will time out on some input.

This is not all that easy . . .

47


Some apparently intractable problems

Practical problems that are hard to solve typically seem to need exploring many (exponentially many) possible answers.

The Hamiltonian Path Problem takes a graph G = (V, E), and asks if there is a Hamiltonian path, i.e. a path that visits every vertex exactly once.

Naively, the only way to solve it is to explore all paths, and there could be up to |V|! of them. Factorial grows slightly faster than exponential, so this is bad . . .
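The naive algorithm is easy to write down: try all |V|! vertex orders (a sketch, hopeless beyond small graphs).

```python
# Brute-force Hamiltonian Path: test every permutation of the vertices.
from itertools import permutations

def has_hamiltonian_path(vertices, edges):
    adj = {frozenset(e) for e in edges}       # undirected edge lookup
    for order in permutations(vertices):      # up to |V|! candidates
        if all(frozenset((a, b)) in adj for a, b in zip(order, order[1:])):
            return True
    return False

# The path graph 1-2-3-4 has one; the star K_{1,3} does not.
print(has_hamiltonian_path([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)]))  # True
print(has_hamiltonian_path([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)]))  # False
```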

Timetabling is the following: given students taking exams, and timetable slots for exams, is it possible to schedule the exams so that there are no clashes? It also apparently requires looking at exponentially many possible assignments. (That’s why Registry starts timetabling exams 9 weeks in advance. . . )

48


Checking vs solving

Both HPP and Timetabling have the property that they are easy to check: if you have a claimed solution, it’s quick to verify that it is a solution.

This is a feature of many practical intractable problems.

Can we study and explore this property?

49


Non-deterministic register machines

In Inf2A, you saw non-deterministic finite automata: multiple possible transitions from each (state, input) pair. These were equivalent to deterministic automata, albeit with an exponential blow-up in states.

We can do the same with RMs (or TMs). For RMs, it’s probably easiest to add a new instruction maybe(j), which non-deterministically either does nothing or jumps to Ij:

      clear R0
beg:  maybe(end)
      inc(0)
      goto beg
end:

puts a non-deterministically chosen number in R0.

There is a path which just loops for ever – this doesn’t matter. Usually we want to choose from a finite set, so not an issue anyway.

There is no notion of probability here. Probabilistic RMs (e.g. maybe chooses with equal prob.) are another extension.

50


An NRM accepts if some run (sequence of instructions through the choices) halts and accepts.

‘run accepts’ could mean: halts; halts with 1 in R0; or whatever.

Exercise: explore a few possibilities – does it matter?

Thus infinite runs are ignored.

I sometimes prefer to think of maybe as fork: the machine forks a copy of itself which takes the jump. If any copy accepts, it signals the OS, which kills off all the others.

How does this work for problems with multiple correct answers?

We can use an RM to simulate all the runs of an NRM using the interleaving technique. Thus:

I NRMs have the same deciding power as RMs.
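The interleaving idea can be sketched as a breadth-first search over configurations: advance all branches fairly, so a looping run cannot starve an accepting one. (Here `step`, `accepting` and the toy machine are assumptions for illustration; `step` returns the successor configurations, two of them at a `maybe`.)

```python
# Deterministic simulation of a non-deterministic machine by interleaving.
from collections import deque

def nrm_accepts(step, accepting, start, max_steps=10_000):
    frontier = deque([start])
    for _ in range(max_steps):
        if not frontier:
            return False              # every run halted without accepting
        config = frontier.popleft()
        if accepting(config):
            return True               # some run accepts
        frontier.extend(step(config)) # advance this branch one step
    return False                      # undecided within the step budget

# Toy NRM configurations are (counter, halted): `maybe` either halts with
# the current counter or increments it and loops.
step = lambda c: [] if c[1] else [(c[0], True), (c[0] + 1, False)]
accepting = lambda c: c[1] and c[0] == 3   # accept iff the run chose 3
print(nrm_accepts(step, accepting, (0, False)))   # True
```

Note that the machine has an infinite run (keep incrementing for ever), but breadth-first interleaving still finds the accepting run, matching the point that infinite runs are ignored.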

Exercise: Read the details of the TM version in Sipser.

51


NPTime

NRMs are, however, potentially exponentially faster than RMs: in time n, an NRM can fork/explore 2^O(n) possibilities.

We say Q ∈ NP if there is a polynomially-bounded NRM that computes Q.

I HPP is in NP: guess the Hamiltonian path (if there is one), and check it (in linear time).

I Timetabling is in NP: guess a timetable, and check it for clashes (in maybe O(n^2) time?).
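The "check it" half of these arguments is an ordinary deterministic program. A sketch of the linear-time Hamiltonian-path verifier (names are assumptions for illustration):

```python
# Verify a claimed Hamiltonian-path certificate in linear time.

def verify_hpp(vertices, edges, path):
    adj = {frozenset(e) for e in edges}
    return (sorted(path) == sorted(vertices)              # every vertex, once
            and all(frozenset((a, b)) in adj
                    for a, b in zip(path, path[1:])))     # consecutive edges

print(verify_hpp([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)], [1, 2, 3, 4]))  # True
print(verify_hpp([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)], [1, 3, 2, 4]))  # False
```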

Is this just theory? We can’t do non-determinism in practice . . . can we? Quantum computers can achieve a similar effect: an n-qubit computer computes on all 2^n values simultaneously. But it’s hard to get many qubits; and there are subtleties – not every NP algorithm is quantum-computable (as far as we know).

52

Page 211: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Is NP all we need?

Is every apparently exponential problem really an NP problem?

No! Or at least, it will be very surprising if they are.

I Consider the problem of determining whether an exponentially bounded RM accepts.

I There is no shortcut, and nothing to guess. You just have to run it and see.

I Alas, this isn’t a proof – we don’t have a proof that there isn’t an NP algorithm to do it, but (almost) nobody believes there is.

To say nothing of doubly-exponential, non-elementary, etc. (See later.)

53

Page 213: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Are all NP problems equally hard?

What does this mean? How do we measure hardness?

We’ll measure hardness relative to ≤P.

I Q is easier than (or as hard as) Q ′ iff Q ≤P Q ′.

I We say Q is NP-hard iff Q ≥P Q ′ for every Q ′ ∈ NP.

Truly exponential problems are NP-hard, but that’s boring.

I We say Q is NP-complete iff Q is NP-hard and Q ∈ NP.

I Obviously if Q is NP-complete, and NP ∋ Q ′ ≥P Q, then Q ′ is also NP-complete.

Do NP-complete problems necessarily exist?

They do exist, and there are many, including HPP and Timetabling. In fact, almost all NP problems encountered in practice are either in P, or NP-complete.

54

Page 222: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

The Cook–Levin theorem

states that a particular NP problem, SAT, is NP-complete.

The theorem is usually proved for TMs; we shall do it for RMs.

The concept is simple, but there are details. Most details are left for you.

Why ‘Cook–Levin’? The notion of NP-completeness, and the theorem, were due to Stephen Cook (and partly Richard Karp) – in the West. But as with many major mathematical results of the mid-20th century, they were discovered independently in the Soviet Union, by Leonid Levin. Since the fall of the Iron Curtain made Soviet maths more accessible, we try to attribute results to both discoverers.

55

Page 225: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

SAT

SAT is a simple problem about boolean logic:

Given a boolean formula φ over a set of boolean variables Xi, is there an assignment of values to the Xi which satisfies φ (i.e. makes φ true)?

Equivalently, is φ non-contradictory?

I (A ∨ B) ∧ (¬B ∨ C) ∧ (A ∨ C) is satisfiable, e.g. by making A and C true.

I (A ∧ B) ∧ (¬B ∧ C) ∧ (A ∧ C) is not satisfiable.

The size of a SAT problem is simply the number of symbols in φ.

SAT is obviously in NP: just guess an assignment and check it.

It’s also apparently exponential in reality: no obvious way to avoid checking all possible assignments (the truth table method).
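The truth-table method can be written down directly: try all 2^n assignments. Here a formula is modelled as a Python predicate over an assignment dict (an illustrative encoding, not a standard SAT input format):

```python
from itertools import product

def sat_by_truth_table(variables, phi):
    """The truth-table method: try all 2^n assignments in turn.
    Exponential in the number of variables -- exactly the cost the
    slide says we have no obvious way to avoid."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if phi(assignment):
            return assignment         # a satisfying assignment
    return None                       # unsatisfiable

# The satisfiable example: (A or B) and (not B or C) and (A or C).
phi = lambda a: ((a['A'] or a['B']) and (not a['B'] or a['C'])
                 and (a['A'] or a['C']))
print(sat_by_truth_table(['A', 'B', 'C'], phi))
```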

56

Page 231: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

SAT is NP-complete

Suppose (D,Q) ∈ NP. We shall construct a reduction Q ≤P SAT.

Outline of proof:

Given an instance d ∈ D, design a boolean formula φd whose satisfying assignments are exactly the descriptions of successful executions of an NRM checking Q. The machine can be polynomially bounded, so the size of φd is polynomial in the size of d.

The rest is detail.

57

Page 234: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Detail

We have an instance d ∈ D.

Since (D,Q) ∈ NP, there is a polynomially bounded NRM M = (R0, . . . , Rm−1, I0 . . . In−1) that computes Q. Let p(x) be the polynomial bound, and let s = p(|d |). (s for ‘number of steps’.) (Note: the empty instruction In is the halting state.)

So M takes at most s steps.

How big can the values in the registers get? Worst case: start with 2^|d|, and execute s add(0, 0) instructions, giving 2^(|d|+s). I.e. at most |d | + s bits. Assume wlog s ≥ |d |, so at most 2s bits.

We now define a large, but O(poly(s)), number of variables, with intended interpretations.

‘Intended’ because we will design φd so that these interpretations do what we want.

58

Page 237: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

Variables

Name   Meaning                               How many?
Ctj    program counter at step t is on Ij    s · n
Rtik   kth bit of Ri at step t               s · m · 2s

Formulas

Name   Meaning                           How big?
χone   prog cntr in one place            s · n^2
χt     step t + 1 follows from step t    s^2 · m
ρinit  initial register values           m · n
α      machine accepts                   s

The formula C00 ∧ ρinit ∧ χone ∧ (∧t χt) ∧ α is then satisfied by assignments corresponding to valid executions of the machine.

The χone formula is easy: ∧t ∨j (Ctj ∧ ∧j′≠j ¬Ctj′)
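As a sanity check, the literals of χone can be enumerated mechanically for small s and n, confirming the s · n^2 size in the table (variables are represented as illustrative tuples):

```python
def chi_one_literals(s, n):
    """Enumerate the literals of chi_one: for each step t, some C[t][j]
    holds while every other C[t][j'] is false. Each (t, j) disjunct
    contributes n literals (one positive, n-1 negated), so the total
    is s * n * n."""
    lits = []
    for t in range(s):
        for j in range(n):
            lits.append(('C', t, j, True))           # C_{t,j}
            lits.extend(('C', t, jp, False)          # not C_{t,j'}
                        for jp in range(n) if jp != j)
    return lits

print(len(chi_one_literals(4, 3)))  # 4 * 3^2 = 36
```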

59

Page 247: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

more details

Coding χt is a bit tedious . . . step t + 1 follows from step t if the control flow is correct and if the registers have changed correctly.

For control flow: the formula is ∨j (Ctj ∧ νtj), where νtj is:

Ct+1,j+1                                            if Ij is inc, add or sub
Ct+1,j+1 ∨ Ct+1,j′                                  if Ij is maybe(j′)
((∨k Rtik) ∧ Ct+1,j+1) ∨ ((∧k ¬Rtik) ∧ Ct+1,j′)     if Ij is dec(i, j′)

To code the register changes, we need to express the addition of (2s)-bit numbers by boolean operations on the bits. See your hardware course!

Exercise: write the formula ρ+tii′ which says that at step t + 1, the value of Ri is the sum of the values of Ri and Ri′ at step t. (Answer in the notes.)
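The addition-by-boolean-operations the slide appeals to is an ordinary ripple-carry construction. A sketch evaluating it on concrete bits, with Python booleans standing in for the propositional variables (the same idea as ρ+, not the notes' actual formula):

```python
def ripple_add(a_bits, b_bits):
    """w-bit addition using boolean operations only, as rho+ must:
    sum bit = a XOR b XOR carry-in,
    carry-out = (a AND b) OR (carry-in AND (a XOR b)).
    Bits are least-significant first; the final carry is dropped, so
    the result wraps like a fixed-width (e.g. 2s-bit) register."""
    carry = False
    out = []
    for a, b in zip(a_bits, b_bits):
        axb = a != b                  # XOR on booleans
        out.append(axb != carry)      # sum bit
        carry = (a and b) or (carry and axb)
    return out

def to_bits(x, w):
    return [bool((x >> k) & 1) for k in range(w)]

def from_bits(bits):
    return sum(1 << k for k, b in enumerate(bits) if b)

print(from_bits(ripple_add(to_bits(13, 8), to_bits(29, 8))))  # 42
```

Replacing each boolean operator here by the corresponding connective, on variables Rtik, gives a formula of size linear in the word width for each step.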

Despite the very tedious nature of these formulas, we can see that the whole formula is O(s^4).

And so the theorem is proved.

60

Page 255: Julian Bradfield jcb@inf.ed.ac.uk IF–4.07

More NP-complete problems

There are many hundreds of natural problems that are NP-complete. Look at the Wikipedia article ‘List of NP-complete problems’ for a range of examples, or Sipser for more details on a smaller range.

Starting from SAT, we use reductions to build a library of NP-complete problems, for use in tackling new potential problems.

Sometimes, the reductions from SAT (or another known problem) require considerable ingenuity. E.g., showing NP-completeness of HPP is quite tricky.

We’ll do a couple of examples.

61

Page 258

3SAT

There are special forms of SAT that turn out to be particularly handy.

A literal is either a propositional variable P or the negation ¬P of one.

φ is in conjunctive normal form (CNF) if it is of the form ∧i ∨j pij, where each pij is a literal.

φ is in k-CNF if each clause ∨j pij has at most k literals.

3SAT is the problem of whether a satisfying assignment exists for a formula in 3-CNF.

The reduction from unrestricted SAT to 3SAT is also a bit tricky, mostly because the usual boolean-logic conversion to CNF introduces exponential blowup. The Tseitin encoding is used to produce a CNF formula that is not equivalent but equisatisfiable. (See Note 3.)
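A hedged sketch of the Tseitin idea (the integer-literal representation is mine): for each connective, introduce a fresh variable and a constant number of clauses of at most three literals each, so the output is 3-CNF and equisatisfiable with the input:

```python
# Hedged sketch (representation mine): a literal is a nonzero int, and
# negation is unary minus.  For each gate we add clauses forcing the
# fresh variable y to agree with the gate's value; each clause has at
# most 3 literals, so the resulting formula is in 3-CNF.

def tseitin_and(y, a, b):
    """Clauses equivalent to y <-> (a AND b)."""
    return [[-y, a], [-y, b], [y, -a, -b]]

def tseitin_or(y, a, b):
    """Clauses equivalent to y <-> (a OR b)."""
    return [[y, -a], [y, -b], [-y, a, b]]
```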

We are paying a price here for having done Cook–Levin with RMs. The original version with TMs manages to produce a CNF formula describing machine executions. In fact, SAT is often defined to mean CNF-SAT.



Page 263

CLIQUE

Given a graph G = (V, E) and a number k, a k-clique is a k-sized subset C of V, such that every vertex in C has an edge to every other. (C forms a complete subgraph.) The CLIQUE problem is to decide whether G has a k-clique.

The problem has applications in chemistry and biology (and of course sociology).

It is NP-complete, by the following reduction from 3SAT:

Let φ = ∧1≤i≤k (xi1 ∨ xi2 ∨ xi3) be an instance of 3SAT, so each xij is a literal. Construct a graph thus: each xij is a vertex. Make an edge between xij and xi′j′ just in case i ≠ i′ and xi′j′ is not the negation of xij. (I.e., we connect literals in different clauses just when they are not inconsistent with each other.)

Since the vertices in one clause are disconnected, finding a k-clique amounts to finding one literal for each clause, such that they are all consistent – and so represent a satisfying assignment. Conversely, any satisfying assignment generates a k-clique.
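The construction can be sketched as follows (a hedged illustration; the integer-literal representation and names are mine):

```python
# Hedged sketch of the 3SAT -> CLIQUE reduction (representation mine):
# literals are nonzero ints, a formula is a list of clauses, and the
# vertices of the graph are (clause_index, literal) pairs.

def clique_graph(clauses):
    """Connect literals in different clauses unless one is the
    negation of the other (i < j avoids double-counting pairs)."""
    vertices = [(i, l) for i, clause in enumerate(clauses) for l in clause]
    edges = [(u, v)
             for u in vertices for v in vertices
             if u[0] < v[0] and u[1] != -v[1]]
    return vertices, edges
```

The formula is then satisfiable exactly when the graph has a clique with one vertex per clause.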



Page 267

P ?= NP

We have said that NP-complete problems are effectively exponential.

But do we know this? Is it possible that we’re just stupid, and there’s a PTime algorithm for SAT?

We don’t know.

It is possible that NP = P. If you find a polynomial algorithm for SAT or any other NP-complete problem . . . hire bodyguards, because most web/banking security depends on such problems being hard.

Solving the problem is worth a million US dollars: it is one of the seven problems chosen by the Clay Institute for their Millennium Prizes.

Many of the results in complexity theory are therefore conditional: ‘if P ≠ NP, then very difficult theorem’.

E.g. if P ≠ NP, then there are NP problems that are neither in P nor NP-complete. There are very few candidates: best known is the Graph Isomorphism Problem.



Page 272

Handling NP

As far as we know, NP problems are just hard: they need exponential search, so O(poly(n) · 2ⁿ). So how do we solve them in practice? Huge industry . . .

Randomized algorithms are often useful. Allow algorithms to toss a coin. Surprisingly, one can get randomized algorithms that solve e.g. 3SAT in time O(poly(n) · (4/3)ⁿ). (Why is this useful? 2¹⁰⁰ ≈ 10³⁰, while 1.33¹⁰⁰ ≈ 10¹².) Catch: (really) small probability of error!
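As a hedged illustration of the random-walk idea behind such algorithms (all names mine; Schöning’s actual algorithm restarts from a fresh random assignment after about 3n flips):

```python
# Hedged sketch of a Schöning-style random walk for 3SAT (names mine):
# repeatedly pick an unsatisfied clause and flip one of its variables
# at random.  This demo just caps the number of flips instead of
# doing proper restarts.
import random

def lit_true(asg, l):
    """Is literal l (a nonzero int) true under assignment asg?"""
    return asg[abs(l) - 1] if l > 0 else not asg[abs(l) - 1]

def walk_sat(clauses, n_vars, max_flips=1000, seed=0):
    rng = random.Random(seed)
    asg = [rng.random() < 0.5 for _ in range(n_vars)]
    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(lit_true(asg, l) for l in c)]
        if not unsat:
            return asg                     # satisfying assignment found
        l = rng.choice(rng.choice(unsat))  # random literal of a random unsat clause
        asg[abs(l) - 1] = not asg[abs(l) - 1]
    return None                            # give up (input may still be satisfiable)
```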

In many special classes (e.g. sparse graphs, or almost-complete graphs), heuristics lead to fast results.

See http://satcompetition.org/ for the state of the art.



Page 275

Beyond NP

So what do we know about hard problems?

As noted earlier, ExpTime is strictly harder than P (and, we believe, NP): a polynomially bounded machine simply doesn’t have time or memory to explore 2ⁿ steps. Formal proof via the Time Hierarchy Theorem.

Similarly 2-ExpTime ⊋ ExpTime.

Also applies to the nondeterministic (N-) versions.


Page 276

Space-bounded machines

An RM/TM is f (n)-space-bounded if it may use only f (input size) space. For TMs, space means cells on tape; for RMs, number of bits in registers.

PSpace is the class of problems solvable by polynomially-space-bounded machines.

The following are obvious (Exercise: why?):

I PSpace ⊇ PTime

I PSpace ⊇ NPTime

I PSpace ⊆ ExpTime



Page 279

The polynomial hierarchy

Recall the notion of oracle. Suppose we have an oracle for NP problems. What then can we compute?

The polynomial hierarchy is defined thus:

I Let ∆P0 = ΣP0 = ΠP0 = P.

I Let ∆Pn+1 = P^(ΣPn), ΣPn+1 = NP^(ΣPn), ΠPn+1 = co-NP^(ΣPn).

where co-X is the class of problems whose complement is in X.

So ΣP1 = NP^P = NP, and ΣP2 = NP^NP. Why is NP^NP not just NP?

This hierarchy could collapse at any level – a generalization of P ?= NP.

We don’t know.

The whole hierarchy is inside PSpace.



Page 284

Lambda-Calculus

While Turing was working with machines, Alonzo Church was computing differently – with the precursor of Haskell!

You saw Haskell as a programming language for a computer: Church saw it as the computer.

Computation with lambda-calculus works by reduction (syntactic manipulation) of expressions or terms.



Page 287

Lambda-calculus syntax

The terms of lambda-calculus are:

I There is a set of variables x, y, . . . . Every variable is a term.

I If x is a variable and t is a term, then (λx.t) is a term. (abstraction)

I If s and t are terms, then (st) is a term. (application)

I Nothing else is a term.

To avoid writing Lots of Irritating Spurious Parentheses, we adopt the conventions:

I Omit outermost parentheses. (E.g. λx.t)

I Application associates to the left: stu means (st)u.

I The body of an abstraction extends as far to the right as possible: λx.s λy.xty means λx.(s(λy.((xt)y)))
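The grammar and the parenthesization conventions can be made concrete with a small sketch (the tuple representation and names are mine):

```python
# Hedged sketch (representation mine): a term is ('var', x),
# ('lam', x, body), or ('app', s, t), mirroring the three grammar rules.

def pretty(t):
    """Fully parenthesized printer, making the conventions explicit."""
    kind = t[0]
    if kind == 'var':
        return t[1]
    if kind == 'lam':
        return "(\\" + t[1] + "." + pretty(t[2]) + ")"
    return "(" + pretty(t[1]) + pretty(t[2]) + ")"

# The convention 'stu means (st)u' in this representation:
stu = ('app', ('app', ('var', 's'), ('var', 't')), ('var', 'u'))
```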



Page 294

Intuition

Variables are . . . variables. Abstraction makes anonymous functions: λx.t is a function that takes an argument (s, say), binds s to the variable x, and evaluates (?) t.

Application is application of a function to an argument.

Examples:

I λx.x is the identity function

I (λx.x)(λx.x) is the identity function applied to itself – which is (will turn out to be) the identity function

I λx.λy.wxy is a function that takes an argument x, and gives back a function that takes an argument y and applies w to x, giving a result that is applied to y. A curried function of two arguments.

This should all look completely familiar from Haskell – except there are no types here! Also no integers, booleans, data types, pattern matching, . . .



Page 299

Free and bound variables; α-conversion

If a variable x occurs inside a term t, the occurrence is free iff x is not inside a subterm λx.t′ of t.

If x is free in t′, it is bound in λx.t′.

This is just like variable binding in all civilized programming languages.

Exercise: Write a formal definition of ‘free’ and ‘bound’ by induction on the structure of terms.

The terms λx.x and λy.y are not interestingly different.

Changing the name of a bound variable is α-conversion (historical reasons. . . ).

Care is needed: we can’t just change x to y in λx.λy.xy, or in λx.yx.

Exercise: Write a precise and safe definition of α-conversion.

We write s →α t if s can be α-converted to t.
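One possible answer to the exercise on ‘free’ and ‘bound’, hedged as a sketch (the tuple representation is mine):

```python
# Hedged sketch (representation mine): terms are ('var', x),
# ('lam', x, body), or ('app', s, t) tuples, and the free variables are
# computed by induction on the structure of terms.

def free_vars(t):
    kind = t[0]
    if kind == 'var':
        return {t[1]}                         # a variable occurs free in itself
    if kind == 'lam':
        return free_vars(t[2]) - {t[1]}       # lambda x binds x in the body
    return free_vars(t[1]) | free_vars(t[2])  # application: union of both sides
```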



Page 304: Julian Bradfield [email protected] IF–4.07

Evaluating terms: β-reduction

The main computation is done by β-reduction (historical reasons...), which ‘evaluates’ a function application:

(λx.t)(s) →β t[s/x]

where t[s/x] is the result of (appropriately) substituting s for every free occurrence of x in t.

Examples:

(λx.x)(λy.y) →β λy.y

(λx.xx)(λy.y)w →β (λy.y)(λy.y)w →β (λy.y)w →β w

73

Page 309: Julian Bradfield [email protected] IF–4.07

Variable capture and substitution

The lambda calculus is purely text (symbol) manipulation – naive substitution can go wrong:

(λx.λy.xy)yw → (λy.yy)w → ww

because the free y was captured by λy.

So we define →β to do the right thing, by (e.g.) renaming bound variables to avoid capture:

(λx.λy.xy)yw →β (λy′.yy′)w →β yw

Surprisingly tricky to get right. See Pierce §5.3 for details.

74
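The ‘rename the binder first’ idea can be made concrete. Below is a minimal Python sketch of capture-avoiding substitution; the tuple representation of terms and the helper names (free_vars, fresh, subst) are illustrative choices, not from the lecture:

```python
# Terms as tagged tuples: ('var', x), ('lam', x, body), ('app', f, a).

def free_vars(t):
    tag = t[0]
    if tag == 'var':
        return {t[1]}
    if tag == 'lam':
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def fresh(avoid):
    # pick a variable name not in `avoid` (naming scheme is arbitrary)
    i = 0
    while f"v{i}" in avoid:
        i += 1
    return f"v{i}"

def subst(t, x, s):
    """t[s/x]: substitute s for the free occurrences of x in t."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'app':
        return ('app', subst(t[1], x, s), subst(t[2], x, s))
    y, body = t[1], t[2]
    if y == x:                    # x is bound here: no free occurrences inside
        return t
    if y in free_vars(s):         # naive substitution would capture y:
        y2 = fresh(free_vars(s) | free_vars(body))
        body = subst(body, y, ('var', y2))   # alpha-convert the binder first
        y = y2
    return ('lam', y, subst(body, x, s))

# Substituting y for x in λy.xy must rename the binder to avoid capture:
t = ('lam', 'y', ('app', ('var', 'x'), ('var', 'y')))
print(subst(t, 'x', ('var', 'y')))
# ('lam', 'v0', ('app', ('var', 'y'), ('var', 'v0')))
```

The example is exactly the slide's danger case: without the renaming step, the free y would be captured and we would wrongly get λy.yy.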

Page 311: Julian Bradfield [email protected] IF–4.07

Eta conversion

Sometimes it’s useful to wrap a function up in another abstraction (e.g. to delay its evaluation). η-conversion allows us to do this: provided x is not free in f, then

(λx.fx) ↔η f

75

Page 312: Julian Bradfield [email protected] IF–4.07

Evaluation order

Does it matter what order we evaluate things in?

(λz.w)((λx.xx)(λx.xx)) →β w

or

(λz.w)((λx.xx)(λx.xx))
→β (λz.w)((λx.xx)(λx.xx))
→β (λz.w)((λx.xx)(λx.xx))
. . .

‘Call by name’ (nearly but not quite what Haskell does) is the first: reduce the outer left redex first, and don’t go inside lambda abstractions.

‘Call by value’ is the second: reduce function arguments before reducing the application.

For others, see Pierce §5.1.

76
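The difference can be simulated in a strict language by passing arguments as thunks (zero-argument functions). A small Python sketch, with illustrative names; omega stands in for the diverging term (λx.xx)(λx.xx):

```python
def omega():
    # never returns, like the self-applying term (λx.xx)(λx.xx)
    while True:
        pass

def apply_by_name(f, arg_thunk):
    # call-by-name: pass the argument unevaluated; f forces it only if needed
    return f(arg_thunk)

const_w = lambda _thunk: "w"   # (λz.w): ignores its argument entirely

print(apply_by_name(const_w, lambda: omega()))  # w

# Call-by-value would compute omega() *before* the call and loop forever:
# const_w(omega())   # <- never terminates
```

Since const_w never forces its thunk, the diverging argument is simply discarded, mirroring the first reduction sequence on the slide.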

Page 316: Julian Bradfield [email protected] IF–4.07

Adding stuff

So far, this doesn’t look like much of a computer.

We could imagine adding basic numerals 0, 1, 2, . . . , and arithmetic operations with rules like

+ n1 n2 →β n where n = n1 + n2

and booleans true, false and conditionals with rules like

if b s t →β s if b is true
if b s t →β t if b is false

but we still need some way to achieve iteration or looping.

It’s not actually necessary to add numerical primitives: we can define λ-terms to represent them. Look up Church numerals in the textbook.

77
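Following that pointer: a Church numeral represents n as the function that applies its argument n times. A brief Python sketch (the helper names plus and to_int are illustrative):

```python
# Church numerals: n is λf.λx.f(f(...(f x))), with n applications of f.
zero = lambda f: lambda x: x
suc = lambda n: lambda f: lambda x: f(n(f)(x))

def plus(m, n):
    # m applications of f on top of n applications of f
    return lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    # observe a numeral by applying "add 1" to 0
    return n(lambda k: k + 1)(0)

three = suc(suc(suc(zero)))
print(to_int(three), to_int(plus(three, three)))  # 3 6
```

This is why the numeral primitives are, strictly speaking, unnecessary: arithmetic becomes function composition.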

Page 319: Julian Bradfield [email protected] IF–4.07

Recursion in lambda calculus

Haskellers will expect that recursion is the trick. But without names for functions it’s . . . tricky.

Example: the factorial function is

(λF.(λX.F(XX))(λX.F(XX))) (λf.λx.if (= 0 x) 1 (× x (f (− x 1))))

If you can understand that . . .

The λf.… part is basically factorial, but we have to find a way for the embedded f to refer to itself.

Exercise: work through the evaluation applied to 3. (Use call by name.)

At this point, you may wish to find a lambda-calculus evaluator. (There are many. Explore!)

Exercise: can you do this in Haskell? If not, why not?

78

Page 322: Julian Bradfield [email protected] IF–4.07

Y?

The magic term that made λf.… recursive was

Y ≝ λF.(λX.F(XX))(λX.F(XX))

Why does it work?

Suppose we give Y an argument G, a function that takes two arguments f and x. Then, in call-by-name,

YG →β (λX.G(XX))(λX.G(XX))
   →β G((λX.G(XX))(λX.G(XX)))

Observe that the last term is just (one reduction of) G(YG).

Now if G itself applies its first argument (YG here) to its second argument, YG will get expanded to G(YG) again. If G doesn’t use its first argument, the expansion stops.

79
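Under strict (call-by-value) evaluation a direct transcription of Y loops, because the self-application is evaluated eagerly. A standard workaround, sketched here in Python, is the call-by-value variant of Y (often called the Z combinator), which η-expands the self-application to delay it; the names are illustrative:

```python
# Z is Y with the self-application hidden behind λv.…, so it is only
# unfolded when the recursive call is actually made.
Z = lambda F: (lambda X: F(lambda v: X(X)(v)))(lambda X: F(lambda v: X(X)(v)))

# G in the style of the lecture's factorial body:
G = lambda f: lambda x: 1 if x == 0 else x * f(x - 1)

fact = Z(G)
print(fact(3))  # 6
```

Note that fact is defined without naming itself anywhere: all the recursion comes from Z's self-application.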

Page 326: Julian Bradfield [email protected] IF–4.07

Putting G ≝ λf.λx.if (= 0 x) 1 (× x (f (− x 1))), and evaluating Y G 3, say, we get:

Y G 3
→β G (YG) 3
→β (λx.if (= 0 x) 1 (× x (YG (− x 1)))) 3
→β if (= 0 3) 1 (× 3 (YG (− 3 1)))
→β × 3 (YG (− 3 1))
. . .
→β 6

80

Page 332: Julian Bradfield [email protected] IF–4.07

Types

In λ-calculus, we can write crazy things like

(λx.xxx)(0)

or (+ (λx.xx) 42)

Simply typed λ-calculus adds simple (!) types to the basic λ-calculus syntax, to prevent such nonsense.

I Fix some set of base types, e.g. nat, bool, for the constants.

I If σ, τ are types, then σ → τ is a type, the function type of abstractions that take an argument of type σ and return a value of type τ. Convention: → associates to the right, so σ → τ → υ means σ → (τ → υ).

I The syntax of abstractions is changed to (λx:σ.t), giving the type of the argument.

Note that σ, τ are meta-variables, not symbols in the language itself.

81

Page 337: Julian Bradfield [email protected] IF–4.07

Typing

A legitimate simply typed λ-calculus term must be well typed. This is determined by inferring types.

For a term t and type τ, we write t : τ for ‘t is well typed with type τ’.

I Constants (individuals or functions) are well typed with their assigned type.

I If t : τ under the assumption that x : σ, then (λx:σ.t) : σ → τ.

I If t : σ → τ and s : σ, then (ts) : τ.

Examples:

I A function that takes a function f on integers and returns a function that applies f twice:

(λf:nat → nat.λx:nat.f(fx)) : (nat → nat) → (nat → nat)

Exercise: Work it through.

I λx:nat.xx cannot be well typed.

82
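The three rules translate almost directly into a checker. A minimal Python sketch (the tuple encodings of types and terms, and the name typeof, are illustrative choices, not from the lecture):

```python
# Types: 'nat', 'bool', or ('arrow', sigma, tau).
# Terms: ('var', x), ('lam', x, sigma, body), ('app', f, a).

def typeof(t, env):
    tag = t[0]
    if tag == 'var':              # a variable has the type the context gives it
        return env[t[1]]
    if tag == 'lam':              # if body : tau assuming x : sigma,
        _, x, sigma, body = t     # then λx:sigma.body : sigma -> tau
        tau = typeof(body, {**env, x: sigma})
        return ('arrow', sigma, tau)
    _, f, a = t                   # if f : sigma -> tau and a : sigma, then fa : tau
    ft, at = typeof(f, env), typeof(a, env)
    if ft[0] != 'arrow' or ft[1] != at:
        raise TypeError("ill-typed application")
    return ft[2]

# The slide's example: λf:nat→nat. λx:nat. f (f x)
twice = ('lam', 'f', ('arrow', 'nat', 'nat'),
         ('lam', 'x', 'nat',
          ('app', ('var', 'f'), ('app', ('var', 'f'), ('var', 'x')))))
print(typeof(twice, {}))
# ('arrow', ('arrow', 'nat', 'nat'), ('arrow', 'nat', 'nat'))
```

Running it on λx:nat.xx raises TypeError: the application rule demands that x have an arrow type, but the context only gives it nat.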

Page 343: Julian Bradfield [email protected] IF–4.07

Proofs for types

It’s common to present type inferences as proof trees, as with proofs in propositional or predicate logic.

We assume that all formulae are α-converted to avoid clashes of bound variables.

Γ ⊢ t : τ is a judgement that t has type τ under the set of assumptions Γ. The following rules (premises before conclusion) prove judgements:

I Γ ⊢ c : τ for each constant c of base type τ.

I If x : σ ∈ Γ, then Γ ⊢ x : σ.

I If Γ, x : σ ⊢ t : τ, then Γ ⊢ λx:σ.t : σ → τ.

I If Γ ⊢ t : σ → τ and Γ ⊢ s : σ, then Γ ⊢ ts : τ.

Then t : τ iff ⊢ t : τ is provable.

83

Page 353: Julian Bradfield [email protected] IF–4.07

Example: to show that λx:nat.(+ 1 x) : nat → nat:

1. x : nat ⊢ + : nat → (nat → nat)   (constant)
2. x : nat ⊢ 1 : nat                 (constant)
3. x : nat ⊢ (+ 1) : nat → nat       (application, from 1 and 2)
4. x : nat ⊢ x : nat                 (variable, since x : nat ∈ {x : nat})
5. x : nat ⊢ ((+ 1) x) : nat         (application, from 3 and 4)
6. ⊢ λx:nat.((+ 1) x) : nat → nat    (abstraction, from 5)

What if we try to type λx:nat.xx?

To conclude ⊢ λx:nat.xx : nat → τ we would need x : nat ⊢ xx : τ, and hence both x : nat ⊢ x : nat → τ and x : nat ⊢ x : nat. The variable rule gives x : nat ⊢ x : nat, but x : nat ⊬ x : nat → τ for any τ, so the term is untypable.

It should be reasonably obvious that we can quickly well-type (or prove untypable) any simply typed term. (Did the stage when σ turned into nat remind you of anything?)

84

Page 355: Julian Bradfield [email protected] IF–4.07

Properties of simple types

I Uniqueness of types. In a given context (types for free variables), any simply typed lambda term has at most one type (and this is decidable, in small polynomial time). (This should be fairly obvious from what we’ve just done; formal proof takes a little work, but not much.)

I Type safety. The type of a term remains unchanged under α-, β-, η-conversion. (Also straightforward, though with some lemmas (e.g. to handle substitution).)

I Strong normalization. A well typed term evaluates under β-reduction in finitely many steps to a unique irreducible term. If the type is a base type, then the irreducible term is a constant. (A bit harder to prove.)

I Corollary: Simply typed lambda calculus cannot be Turing-complete! What have we lost?

85

Page 359: Julian Bradfield [email protected] IF–4.07

Recursion lost

Just as λx.xx can’t be typed, nor can Y, nor any other potentially non-terminating term. (Exercise: convince yourself that Y can’t be given a simple type.)

What can we do? General recursion is bound to allow non-termination.

86

Page 360: Julian Bradfield [email protected] IF–4.07

Recursion re-gained

If your tool-kit doesn’t have a gadget: make one and put it in the kit!

I We add the term constructor fix to the language: if t is a term, then (fix t) is a term.

I We extend β-reduction to say that:

fix (λx:τ.t) →β t[fix (λx:τ.t)/x]

(often called the unfolding rule).

I We add a new typing rule: if Γ ⊢ t : τ → τ, then Γ ⊢ fix t : τ.

Now we can use fix just as we used Y.

Exercise: Define addition of (base type nat) numbers using fix, the successor and predecessor functions suc, prd : nat → nat, and the equality of numbers function = : nat → nat → bool.

87
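The unfolding rule behaves like a general recursion operator in any language. As a hint at the shape of the exercise (not the λ-calculus answer to it): a Python sketch in which suc, prd, and an iszero test stand in for the base-type primitives, and fix delays the unfolding behind a lambda because Python is strict:

```python
def fix(t):
    # fix t  →β  t (fix t), with the unfolding delayed for strict evaluation
    return t(lambda *args: fix(t)(*args))

suc = lambda n: n + 1        # stand-ins for the primitives suc, prd
prd = lambda n: n - 1
iszero = lambda n: n == 0    # illustrative test, in place of equality with 0

# addition by repeatedly peeling successors off the first argument
add = fix(lambda f: lambda m: lambda n:
          n if iszero(m) else suc(f(prd(m))(n)))

print(add(2)(3))  # 5
```

The typing rule is what makes this legitimate: the functional passed to fix has type (nat → nat → nat) → (nat → nat → nat), so fix of it has type nat → nat → nat.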

Page 364: Julian Brad eld jcb@inf.ed.ac.uk IF{4

More bells and whistles

We can further extend the syntax of simply typed lambda with constructors and types for the familiar apparatus of programming languages:

I tuples (products)

I records (named products)

I sums (union types)

I lists, trees etc. (N.B. one list type for each base type!)

See Pierce ch. 11 for gory (rather uninteresting) details.

More interesting is the let statement. Simple (non-recursive) lets can be done by

let x:τ = t in t′ ≡ (λx:τ.t′)t

and fully recursive let definitions by

letrec x:τ = t in t′ ≡ let x:τ = fix(λx:τ.t) in t′

Exercise: Think carefully about the last! How does it work?

88
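The letrec desugaring can be exercised concretely: the recursive binding becomes fix applied to a function abstracted over its own name. A sketch in Python (again a stand-in for the calculus; the factorial definition is my own example, with an eta-expanded fix):

```python
def fix(f):
    # Eta-expanded fixed point, so a strict evaluator does not loop.
    return lambda n: f(fix(f))(n)

# letrec fact = \n. if n = 0 then 1 else n * fact(n-1) in fact 5
# desugars to:
# let fact = fix(\fact. \n. if n = 0 then 1 else n * fact(n-1)) in fact 5
fact = fix(lambda fact: lambda n: 1 if n == 0 else n * fact(n - 1))

print(fact(5))  # 120
```

The point of the exercise on the slide: inside fix, the bound variable fact refers to the whole fixed point, so every recursive call unfolds one more copy of the body.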


Type reconstruction

It’s rather boring having to type all these types! Can we leave them out, write in the untyped language, and let the computer work out the types?

Essentially we program up the inference we did ‘in our heads’ to make type proofs.

Give every (ordinary) variable x a different type variable, so x : αx. Then see what relations have to hold between the α.

Before, we were using meta-variables σ, τ to do human reasoning; now we’re going to promote these to actual variables ranging over types.

This is a constraint-solving problem: e.g. we have α, β and we know that α = β → β.

Where do the constraints come from? From the typing rules.

First, some formalities . . .

89
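One way to see where the constraints come from is to walk the term, handing each λ-bound variable a fresh type variable and emitting one equation per application. A toy sketch in Python (the AST encoding and the variable-naming scheme are my own, not from the course):

```python
import itertools

fresh = (f"a{i}" for i in itertools.count())  # fresh type variables a0, a1, ...

def collect(term, env, constraints):
    """Assign a type to `term`, recording equations instead of solving them.
    Terms are a toy AST: ('var', x), ('lam', x, body), ('app', f, a).
    Types are strings (variables) or ('->', dom, cod)."""
    kind = term[0]
    if kind == 'var':
        return env[term[1]]
    if kind == 'lam':
        a = next(fresh)
        body_ty = collect(term[2], {**env, term[1]: a}, constraints)
        return ('->', a, body_ty)
    # application: the function's type must equal  arg_type -> fresh_result
    f_ty = collect(term[1], env, constraints)
    a_ty = collect(term[2], env, constraints)
    b = next(fresh)
    constraints.append((f_ty, ('->', a_ty, b)))
    return b

cs = []
ty = collect(('lam', 'x', ('var', 'x')), {}, cs)        # \x. x
print(ty, cs)   # ('->', 'a0', 'a0') []

cs2 = []
ty2 = collect(('lam', 'f', ('lam', 'x',
               ('app', ('var', 'f'), ('var', 'x')))), {}, cs2)  # \f. \x. f x
print(ty2, cs2)
```

The identity produces no equations at all; the application produces exactly the α = β → γ shape discussed on the slide.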


Type variables

We add type variables to our typed language. Convention: I use the start of the Greek alphabet for type variables. Now types include:

I every type variable α is a type.

Eventually, type variables need to turn into actual types: consider (λx:α.x)1.

A type substitution gives a substitution of types for type variables, e.g. [nat/α, bool/β, (α → β)/γ]. We’ll write type substitution before a term, e.g. [nat/α](λx:α.x) = λx:nat.x.

Substitutions apply to terms, contexts, etc. in the obvious way.

I Theorem: type substitution preserves typing: if Γ ⊢ t : τ, and T is a type substitution, then TΓ ⊢ Tt : Tτ.

I The reverse is not true: x : α ⊬ x : nat, but [nat/α](x : α) ⊢ [nat/α]x : [nat/α]nat.

90
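Type substitution on this toy representation is a one-screen function. A sketch in Python, assuming types are either strings (variables or base types like 'nat') or ('->', dom, cod) pairs — the encoding is mine, not from the course:

```python
def subst(T, ty):
    """Apply type substitution T (a dict: variable -> type) to a type."""
    if isinstance(ty, str):
        # A variable in T's domain is replaced; anything else is unchanged
        # (base types like 'nat' simply never appear as keys of T).
        return T.get(ty, ty)
    # Arrow type: substitute structurally in domain and codomain.
    return ('->', subst(T, ty[1]), subst(T, ty[2]))

T = {'a': 'nat', 'b': 'bool'}
print(subst(T, ('->', 'a', ('->', 'b', 'g'))))
# ('->', 'nat', ('->', 'bool', 'g'))
```

Note that 'g' stays put: substitution only touches variables in the substitution's domain, which is exactly why it preserves typing but not the converse.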


Reconstruction example

Find types for λf.λx.λy.suc(f (+ x y)).

First annotate: λf:γ.λx:α.λy:β.suc(f (+ x y)).

Now make the proof tree, using additional type variables and recording equations instead of working things out:

. . . ⊢ suc : nat → nat
. . . ⊢ f : γ, used as f : η → nat                    [γ = η → nat]
. . . ⊢ + : nat → nat → nat, used as + : α → (θ → η)  [α → θ → η = nat → nat → nat]
. . . ⊢ x : α, so . . . ⊢ (+ x) : θ → η
. . . ⊢ y : β, used as y : θ                          [θ = β]
. . . ⊢ (+ x y) : η
. . . ⊢ f (+ x y) : nat                               [ζ = nat]
f : γ, x : α, y : β ⊢ suc(f (+ x y)) : ζ
f : γ, x : α ⊢ λy:β.suc(f (+ x y)) : β → ζ            [ε = β → ζ]
f : γ ⊢ λx:α.λy:β.suc(f (+ x y)) : α → ε              [δ = α → ε]
⊢ λf:γ.λx:α.λy:β.suc(f (+ x y)) : γ → δ

Solve the equations to get α = β = nat, and γ = nat → nat.

91


Solving the type equations

The equations are very simple: just equalities involving variables and →.

We can solve them by unification (a technique from logic programming). Standard algorithm: look it up in Pierce for this case, or anywhere for the general case.

Theorem: if an untyped λ-term can be simply typed, this procedure will find a simple type for it; otherwise (if the equations have no solution) it returns ‘fail’.

Proof by a somewhat technically involved induction on terms. (Full details in Pierce.)

So type inference is decidable (with reasonable polynomial complexity – in all practical cases, it’s even linear).

92
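First-order unification specialised to these simple types fits in a few lines. Below is a sketch in Python (the representation — strings for variables and base types, ('->', dom, cod) for arrows — is my own, not Pierce's); applying it to the equations from the reconstruction example recovers exactly the solution found by hand.

```python
def unify(eqs):
    """Unify a list of type equations; return a resolver for final types.
    Raises TypeError if there is no solution (clash or occurs check)."""
    sub = {}
    def walk(t):                       # chase variable bindings
        while isinstance(t, str) and t in sub:
            t = sub[t]
        return t
    def is_var(t):
        return isinstance(t, str) and t not in ('nat', 'bool')
    def occurs(v, t):
        t = walk(t)
        if isinstance(t, tuple):
            return occurs(v, t[1]) or occurs(v, t[2])
        return t == v
    work = list(eqs)
    while work:
        s, t = work.pop()
        s, t = walk(s), walk(t)
        if s == t:
            continue
        if is_var(s):
            if occurs(s, t):
                raise TypeError('occurs check: no simple type')
            sub[s] = t
        elif is_var(t):
            work.append((t, s))
        elif isinstance(s, tuple) and isinstance(t, tuple):
            work.append((s[1], t[1]))  # unify domains
            work.append((s[2], t[2]))  # unify codomains
        else:
            raise TypeError('clash: no simple type')
    def resolve(t):
        t = walk(t)
        return ('->', resolve(t[1]), resolve(t[2])) if isinstance(t, tuple) else t
    return resolve

arrow = lambda a, b: ('->', a, b)
# The equations recorded in the reconstruction example:
eqs = [('γ', arrow('η', 'nat')),
       (arrow('α', arrow('θ', 'η')), arrow('nat', arrow('nat', 'nat'))),
       ('θ', 'β'), ('ζ', 'nat'),
       ('ε', arrow('β', 'ζ')), ('δ', arrow('α', 'ε'))]
resolve = unify(eqs)
print(resolve('α'), resolve('γ'))  # nat ('->', 'nat', 'nat')
```

The occurs check is what rejects self-application: unifying α with α → β has no finite solution, so λx.xx (and hence Y) fails here, matching the earlier slides.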


General solutions

I Consider the identity function λx.x. Type reconstruction gives (λx:α.x) : α → α with no further equations!

I What does an unsolved variable type mean? We can type λx.x with any substitution [τ/α].

I In general, the reconstruction algorithm gives us the most general type, leaving free variables.

I What about λf:α.λx:β.f (fx)? We get α = β → β, so type (β → β) → β → β.

I What about λf:α.λx:β.f (x)? We get α = β → γ, so type (β → γ) → β → γ.

I But still, to get a real simply typed term, we must substitute.

93


Polymorphism

In the simply-typed λ-calculus, we have infinitely many distinct identity functions: λx:nat.x, λx:bool.x, λx:nat→nat→nat.x and so on.

Wouldn’t it be nice if we could write a polymorphic λx.x that was actually the identity for all the types at once?

How did we avoid giving a type to the recursion ‘combinator’ fix?

Note that if we write it out each time, we can do this, as type reconstruction works separately on each instance:

〈(λx.x)(1), (λx.x)(true)〉

What we can’t do is:

let id = λx.x in 〈id(1), id(true)〉

Why not? Un-sugar the let into a λ-application, and work through the reconstruction. How does it fail?

94
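To see how the reconstruction fails, note that the un-sugared term (λid.〈id(1), id(true)〉)(λx.x) hands id a single type variable, which both uses then constrain. A minimal illustration in Python (the two equations and the clash check are my own rendering, not a full unifier):

```python
# id gets one monomorphic type variable α, constrained by both uses:
#   α = nat  -> β   (from id(1))
#   α = bool -> γ   (from id(true))
eq1 = ('α', ('->', 'nat', 'β'))
eq2 = ('α', ('->', 'bool', 'γ'))

def clash(e1, e2):
    # Both equations bind the same variable, so their right-hand sides
    # must unify; two different base types in the domain never do.
    (_, t1), (_, t2) = e1, e2
    return t1[1] != t2[1]

print(clash(eq1, eq2))  # True: unification would demand nat = bool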


Polymorphism more formally

We have several possibilities for types with free type variables:

I some terms, like λx:α.x, are well-typed ‘with the variables themselves’, and so are well-typed for any type substitution;

I some terms, like λx:α.(+ 1 x), are ill-typed, and only become well-typed for a specific substitution [nat/α];

I and some become well-typed for many substitutions: λf:α.λx:β.f (fx) well-types under T whenever T(α) = T(β) → T(β).

So what is the best type for λx.x? Our typing rules tell us that for any α, we have ⊢ (λx:α.x) : α → α. Perhaps we could say that λx.x has the type scheme ∀α.α → α?

λx.(+ 1 x), however, must have type nat → nat.

And λf.λx.f (fx) could have the type (scheme) ∀β.(β → β) → β → β?

The universal quantifier binds the free type variables.

Now go back to our original simply typed proof system, when we tried to type λx:nat.xx – did anything happen in there that looked like type substitution?

95
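Forming a type scheme is just a matter of collecting the free type variables of the reconstructed type and binding them with ∀. A sketch in Python, on the same toy type representation (the encoding and function names are mine):

```python
def free_type_vars(ty):
    """Collect the type variables occurring in a toy type: strings are
    variables unless they are base types; arrows are ('->', dom, cod)."""
    if isinstance(ty, str):
        return {ty} if ty not in ('nat', 'bool') else set()
    return free_type_vars(ty[1]) | free_type_vars(ty[2])

def generalize(ty):
    # Bind every free type variable: ty becomes the scheme ∀vars. ty
    return ('forall', sorted(free_type_vars(ty)), ty)

print(generalize(('->', 'a', 'a')))
# ('forall', ['a'], ('->', 'a', 'a'))   i.e. the scheme ∀α. α → α
```

A fully ground type like nat → nat generalizes to a scheme binding nothing, matching the λx.(+ 1 x) case on the slide.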

Page 399: Julian Brad eld jcb@inf.ed.ac.uk IF{4

Polymorphism more formally

We have several possibilities for types with free type variables:

I some terms, like λx :α.x are well-typed ‘with the variablesthemselves’, and so are well-typed for any type substitution;

I some terms, like λx :α.(+ 1 x) are ill-typed, and only becomewell-typed for a specific substitution [nat/α];

I and some become well-typed for many substitutions:λf :α.λx :β.f (fx) well types under T wheneverT (α) = T (β)→T (β).

So what is the best type for λx .x? Our typing rules tell us that for anyα, we have ` (λx :α.x) : α→α. Perhaps we could say that λx .x has thetype scheme ∀α.α→ α?

λx .(+ 1 x), however, must have type nat→ nat.

And λf .λx .f (fx) could have the type (scheme) ∀β.(β→ β)→ β→ β?

The universal quantifier binds the free type variables.

Now go back to our original simply typed proof system, when we tried to type

λx :nat.xx – did anything happen in there that looked like type substitution?

95

Page 400: Julian Brad eld jcb@inf.ed.ac.uk IF{4

Polymorphism more formally

We have several possibilities for types with free type variables:

I some terms, like λx :α.x are well-typed ‘with the variablesthemselves’, and so are well-typed for any type substitution;

I some terms, like λx :α.(+ 1 x) are ill-typed, and only becomewell-typed for a specific substitution [nat/α];

I and some become well-typed for many substitutions:λf :α.λx :β.f (fx) well types under T wheneverT (α) = T (β)→T (β).

So what is the best type for λx .x? Our typing rules tell us that for anyα, we have ` (λx :α.x) : α→α. Perhaps we could say that λx .x has thetype scheme ∀α.α→ α?

λx .(+ 1 x), however, must have type nat→ nat.

And λf .λx .f (fx) could have the type (scheme) ∀β.(β→ β)→ β→ β?

The universal quantifier binds the free type variables.

Now go back to our original simply typed proof system, when we tried to type

λx :nat.xx – did anything happen in there that looked like type substitution?

95

Page 401: Julian Brad eld jcb@inf.ed.ac.uk IF{4

Polymorphism more formally

We have several possibilities for types with free type variables:

I some terms, like λx :α.x are well-typed ‘with the variablesthemselves’, and so are well-typed for any type substitution;

I some terms, like λx :α.(+ 1 x) are ill-typed, and only becomewell-typed for a specific substitution [nat/α];

I and some become well-typed for many substitutions:λf :α.λx :β.f (fx) well types under T wheneverT (α) = T (β)→T (β).

So what is the best type for λx .x? Our typing rules tell us that for anyα, we have ` (λx :α.x) : α→α. Perhaps we could say that λx .x has thetype scheme ∀α.α→ α?

λx .(+ 1 x), however, must have type nat→ nat.

And λf .λx .f (fx) could have the type (scheme) ∀β.(β→ β)→ β→ β?

The universal quantifier binds the free type variables.

Now go back to our original simply typed proof system, when we tried to type

λx :nat.xx – did anything happen in there that looked like type substitution?

95


Page 404: Julian Bradfield [email protected] IF–4.07

Let-polymorphism

Allowing type schemes seems natural, but . . .

I could we still do type inference?

I are we going to get more complicated types?

We’ll come back to this, but just think about trying to type

(λx .x)(λx .x)

It turns out that there is a natural ‘sweet spot’ if we allow type schemes in let-bindings, but not in λ-abstractions. (N.B. This means let has to be promoted from syntactic sugar to a primitive.) Then

let id = λx .x in 〈id(1), id(true)〉 will work as desired.

The following will not work!

let id = λx .x in (λf .〈f (1), f (true)〉)(id)

Exercise: Verify that indeed it fails in Haskell.

96
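The exercise above can be checked directly in GHC. The names `ok` and `bad` below are my own, not from the slides; this is a minimal sketch in plain Haskell with no extensions:

```haskell
-- A let-bound identifier is generalized to the scheme forall a. a -> a,
-- so its two uses may instantiate a at Int and at Bool independently:
ok :: (Int, Bool)
ok = let idf = \x -> x in (idf 1, idf True)

-- A lambda-bound f gets only a monotype, so both uses must agree.
-- GHC rejects this definition if you uncomment it:
-- bad = (\f -> (f 1, f True)) (\x -> x)

main :: IO ()
main = print ok   -- prints (1,True)
```

Loading this file in GHCi succeeds; uncommenting `bad` produces a unification error between the literal `1` and `True`, exactly the failure the slide predicts.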


Page 407: Julian Bradfield [email protected] IF–4.07

Hindley–Milner type inference

First, we add let as a primitive:

I If x is a variable, and t and t ′ are terms, then let x = t in t ′ is a term.

I It evaluates just as λ:

let x = t in t ′ →β t ′[t/x ]

The difference comes in the typing rule:

Γ ` t : σ Γ, x : Γ(σ) ` t ′ : τ

Γ ` let x = t in t ′ : τ

What is Γ(σ)? It is the result of generalizing (binding with ∀) all free type variables in σ that do not occur free in Γ.

E.g.: if x : α ` t : α→ β, then (x : α)(α→ β) = ∀β.α→ β — only β is generalized, since α occurs free in the context.

97
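The generalization step Γ(σ) can be sketched as a function. All the names here (`Type`, `Scheme`, `ftv`, `generalize`, etc.) are illustrative choices of mine, not notation from the lecture; the point is just to make "quantify the free variables of σ not free in Γ" concrete:

```haskell
import Data.List (nub, (\\))

-- Hypothetical minimal syntax of monotypes and type schemes.
data Type   = TVar String | TNat | TBool | Arrow Type Type
  deriving (Eq, Show)
data Scheme = Forall [String] Type
  deriving (Eq, Show)

-- A context maps term variables to type schemes.
type Env = [(String, Scheme)]

-- Free type variables of a monotype, a scheme, and a context.
ftv :: Type -> [String]
ftv (TVar a)    = [a]
ftv (Arrow s t) = nub (ftv s ++ ftv t)
ftv _           = []

ftvScheme :: Scheme -> [String]
ftvScheme (Forall as t) = ftv t \\ as

ftvEnv :: Env -> [String]
ftvEnv = nub . concatMap (ftvScheme . snd)

-- Gamma(sigma): bind with Forall every free variable of sigma
-- that does not occur free in Gamma.
generalize :: Env -> Type -> Scheme
generalize env t = Forall (ftv t \\ ftvEnv env) t

main :: IO ()
main = print (generalize [("x", Forall [] (TVar "a"))]
                         (Arrow (TVar "a") (TVar "b")))
-- prints Forall ["b"] (Arrow (TVar "a") (TVar "b"))
```

This reproduces the slide's example: generalizing α→ β in the context x : α yields ∀β.α→ β, with α left free.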


Page 412: Julian Bradfield [email protected] IF–4.07

Generalization and Specialization

Generalization gave variables type schemes (or polytypes). For type inference to work, we have to un-generalize, or specialize, by modifying the rule for variables:

x : σ ∈ Γ    σ v τ
Γ ` x : τ

What does σ v τ mean? ‘τ is more special than σ’, i.e. it is got by substituting types (no quantifiers!) for the quantified variables of σ.

E.g. ∀β.α→ β v α→ γ and ∀β.α→ β v α→ (nat→ bool).

This lets every use of a polymorphic identifier get a new bunch of type variables to label it.

98
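Specialization, too, is just substitution into the quantified variables. Again the names (`Type`, `Scheme`, `specialize`) are hypothetical helpers of mine, not the course's notation:

```haskell
data Type   = TVar String | TNat | TBool | Arrow Type Type
  deriving (Eq, Show)
data Scheme = Forall [String] Type
  deriving Show

-- sigma v tau: instantiate only the quantified variables of sigma,
-- using the given (quantifier-free) types; free variables are untouched.
specialize :: Scheme -> [(String, Type)] -> Type
specialize (Forall as t) sub = go t
  where
    go (TVar a) | a `elem` as = maybe (TVar a) id (lookup a sub)
    go (Arrow s u)            = Arrow (go s) (go u)
    go u                      = u

main :: IO ()
main = print (specialize (Forall ["b"] (Arrow (TVar "a") (TVar "b")))
                         [("b", Arrow TNat TBool)])
-- prints Arrow (TVar "a") (Arrow TNat TBool)
```

This matches the slide's second example: ∀β.α→ β v α→ (nat→ bool), by substituting nat→ bool for the bound β while leaving the free α alone.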



Page 418: Julian Bradfield [email protected] IF–4.07

Polymorphic type inference example

Let’s look at our motivating example

let id = λx .x in 〈id(1), id(true)〉

The inference tree is (with some abbreviation):

Typing the definition:
x : β ∈ x : β
x : β ` x : β
` λx .x : β→ β

First use of id:
. . . ∈ . . .    . . . v ε→ ε   [ε = γ]
. . . ` id : ε→ γ    . . . ` 1 : nat   [γ = nat]
. . . ` id(1) : γ

Second use of id:
. . . ∈ . . .    . . . v ζ→ ζ   [ζ = δ]
. . . ` id : ζ→ δ    . . . ` true : bool   [δ = bool]
. . . ` id(true) : δ

Combining the two uses:
id : ∀β.β→ β ` 〈id(1), id(true)〉 : α   [α = γ × δ]

Conclusion:
` let id = λx .x in 〈id(1), id(true)〉 : α

99

Page 419: Julian Bradfield [email protected] IF–4.07

Remarks on Hindley–Milner

I The proof-tree plus unification description is convenient for proving properties of the system. Implementations do more or less the same (as the quite efficient version we’ve given).

I Type inference for let-polymorphism runs quickly (linearly) on every program a real programmer has ever written.

I But in fact it is ExpTime-complete!

I We can imagine generalizing to full polymorphism: allow all variables and terms to have actual types like ∀α.α→ α. Then (λf .〈f (1), f (true)〉)(λx .x) is fine.

I But type inference becomes undecidable!

I So we write the types explicitly, as in Java generics, and introduce type abstraction and type application:
if t : τ , then (Λα.t)〈〈σ〉〉 → [σ/α]t
E.g. (λf :∀α.α→ α.〈(f 〈〈bool〉〉)(true), (f 〈〈nat〉〉)(1)〉)(Λα.λx :α.x)

I System F (Girard), or polymorphic λ-calculus (Reynolds).

100
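The System F term above can in fact be written in GHC with the RankNTypes extension, which lets a λ-bound variable carry a ∀-type; GHC infers the type applications that System F writes as 〈〈·〉〉. The name `apply` is my own:

```haskell
{-# LANGUAGE RankNTypes #-}

-- f is lambda-bound yet polymorphic: its rank-2 annotation gives it
-- the type forall a. a -> a, so the two uses may instantiate a
-- differently. Without the annotation (plain Hindley-Milner), GHC
-- would reject the body.
apply :: (forall a. a -> a) -> (Int, Bool)
apply f = (f 1, f True)

main :: IO ()
main = print (apply (\x -> x))   -- prints (1,True)
```

Note that the annotation is essential: as the slides say, inference for full polymorphism is undecidable, so GHC only accepts higher-rank types when they are written explicitly.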
