An Introduction to the Lambda Calculus - Universidad de Chileabassi/Cursos/41a/lambdacalc.pdf · An...

An Introduction to the Lambda Calculus

Mayer Goldberg

February 20, 2000

1 Notation and Conventions

It is surprising that despite the simplicity of its syntax, the λ-calculus hosts a large body ofnotation, abbreviations, naming conventions, etc. Our aim, as far as the notation throughoutthis work is concerned, is to remain consistent, clear and unambiguous as much as possible. Byand large we adhere to the notation set down in Church’s Calculi of Lambda Conversion [4], andBarendregt’s The Lambda Calculus, Its Syntax and Semantics [1]. There are however instanceswhere our notation diverges. Most notably, we avoid abbreviating the names of combinators.Thus, for example, we use Succ

Churchto denote the λ-term computing the successor function on

Church numerals, instead of Church’s S+,1 etc.A variety of typefaces are used in denoting logical and mathematical entities. In general, we

strive for consistency, unless prevented from doing so by existing, well-established notation.The following table illustrates the various typefaces used throughout this work, and the context

in which they appear:

Typeface Usage within this DocumentMath Italics variables, λ-terms used locallyRoman Bold λ-terms used throughout this workSan Serif sets, relations, trees, syntactic objectsFraktur Gothic: Fraktur arithmetic functionsGreek: α, β, γ, . . . rules, well-known combinatorsBlackboard bold: N, Q, Z, . . . well-known sets of numbersCALLIGRAPHIC systems of equations

Our introduction to the λ-calculus consists of five sections, which cover the following topics, inorder: Syntax, reduction, λ-definability, fixed-points, and bases. By no stretch of the imaginationcan this selection of topics be considered a complete survey of the main areas of the λ-calculus.The most striking omissions include models and types. This introduction is, however, sufficient forits intended purposes, which are, as stated earlier, to make this work as self-contained as possible,and to establish a common language with the reader, in terms of which, the ideas contained inthis work can be conveyed.

The sections on syntax and reduction cover the λ-calculus as a formalism for describing com-putation. Section 4 covers λ-definability, which concerns itself with representing logical and math-ematical entities in the λ-calculus. Section 5 covers fixed points. The treatment is not complete,but is sufficient for our purposes. Section 6 concludes the introduction to the λ-calculus with atreatment of bases. Bases in the λ-calculus are analogous, in many ways, to bases in linear algebra.

1 The latter, S+, could easily be confused for the set of λ-terms generated by the set S, in accordance withDefinition 2.16, which follows Barendregt’s text [1, Definition 8.1.1, 65].

1

2 Syntax

2.1 Lambda Terms

The domain of discourse in the λ-calculus is a set Λ of terms, known as “λ-terms” or “λ-expressions.” The construction of the set Λ follows.2.1 Definition: The sets Symbols and Λ. Let Symbols be a countable set of symbols. Theset Λ of all λ-terms is defined inductively as follows:

i. Symbols ⊂ Λ

ii. For any M,N ∈ Λ, we have (M N) ∈ Λ. This syntactic form is called an application.

iii. For any M ∈ Λ, ν ∈ Symbols, we have (λν.M) ∈ Λ. This syntactic form is known as λ-abstraction, (or just plain “lambda”). We refer to M in the context of the λ-abstraction asthe body of the lambda.

The set Λ is also known by ΛK, distinguishing it from a slightly different set of λ-terms, theΛI. The set ΛI will be introduced in Definition 2.14 after we define the notions of free and boundvariables.2.2 Intuition: It is often useful to think of λ-abstraction and application as modellingfunction abstraction and the application of a function to an argument. In this work, however, thismodelling is not taken to be more than a useful intuition.2.3 Conventions:

i. We may remove any pair of parentheses or even the dot separator between the variable andthe body in a λ-abstraction, whenever this cannot result in ambiguity. For example: Theλ-term (λx.x) by itself could appear as λx.x, but in an application, we keep the parentheses:λy.((λx.x)y).

ii. For any M1,M2,M3 ∈ Λ, the λ-term ((M1 M2) M3) can be abbreviated as (M1 M2 M3).This convention is known as the left-associativity of application.

iii. For any ν1, ν2 ∈ Symbols,M ∈ Λ, the λ-term λν1.λν2.M can be abbreviated as λν1ν2.M .

2.4 Examples:

• The expression (λx.x) can in fact be abbreviated as λxx [4, Chapter I, Section 4, Page 7],but we avoid this abbreviation in the remainder of the work.

• The expression (λx.(λy.(λz.((x z) (y z))))) can be abbreviated as λxyz.(x z (y z)).2

2.5 Definition: Repeated Left- and Right-Associated Applications. (See [1, 5,Definition 2.1.9]) Let M,N ∈ Λ, and k ∈ N. Then

(Mk N) = (M · · · (M︸︷︷︸k

N) · · ·) (1)

(M N∼k) = (M N · · ·N︸︷︷︸k

) (2)

2.6 Definition: The length of a λ-term. The length of a λ-term M , denoted by ||M ||, isdefined inductively as follows:

||ν|| = 1 where ν ∈ Symbols||(M N)|| = 1 + ||M ||+ ||N || where M,N ∈ Λ||λν.M || = 1 + ||M || where ν ∈ Symbols, M ∈ Λ

2 Terms in the literature are usually abbreviated a step further, by removing the parentheses around the bodyof the λ-abstraction, and compressing all spaces. For example, the λ-term λxyz.(x z (y z)) can also be writtenas λxyz.xz(yz). This last abbreviation, however, can be quite confusing when symbol names are longer than onecharacter. We therefore avoid such abbreviations throughout this work, so as to be able to use long, expressivesymbol names.

2

2.7 Examples:

||λx.x|| = 2||λx.(x x)|| = 4

||λxyz.(x z (y z))|| = 10

2.8 Notation: Arb. A λ-term whose value is irrelevant is denoted by � (pronounced “arb”,short for “arbitrary”).

Arb is used in order to make arbitrary choices of λ-terms explicit.The length of an arbitrary λ-term, ||�||, is an arbitrary positive integer. ||�|| is useful when

reasoning about complexity.

2.2 Variables

2.9 Definition: Set of Variables. For any λ-expression M , the set Vars(M) of variables ofM is defined inductively as follows. Let ν ∈ Symbols and M,N ∈ Λ:

Vars(ν) = {ν}Vars((M N)) = Vars(M) ∪ Vars(N)Vars((λν.M)) = Vars(M)

2.10 Definition: Set of Free Variables. For any λ-expression M , the set FreeVars(M) offree variables of M is defined inductively as follows. Let ν ∈ Symbols and M,N ∈ Λ:

FreeVars(M) = FreeVars′(M, ∅)

FreeVars′(ν, S) ={{ν} if ν 6∈ S∅ if ν ∈ S

FreeVars′((M N), S) = FreeVars′(M,S) ∪ FreeVars′(N,S)FreeVars′(λν.M, S) = FreeVars′(M,S ∪ {ν})

2.11 Definition: Set of Bound Variables. For any λ-expression M , the set BoundVars(M)of bound variables of M is defined inductively as follows. Let ν ∈ Symbols and M,N ∈ Λ:

BoundVars(ν) = ∅BoundVars((M N)) = BoundVars(M) ∪ BoundVars(N)BoundVars(λν.M) = {ν} ∪ BoundVars(M)

2.12 Examples:

Vars(λxyz.(x z (y z))) = {x, y, z}Vars(λxy.x) = {x}

FreeVars(λx.(x y z)) = {y, z}BoundVars(λxw.(x y z)) = {x, w}

Note that

3

• As the last example shows, a variable can be contained in the set of bound variables of someexpression, without actually occuring in that expression.

• A variable can have both free and bound occurrences within the same expression. Forexample:

FreeVars((x (λx.x))) = BoundVars((x (λx.x)))= {x}

When it is important to consider the free variables in a λ-term, we will tag a term with asubscript consisting of the free variables in the term. For example: Mx,y = (x (x y)).2.13 Terminology: A Fresh Variable. Given a λ-term M ∈ Λ, a variable ν is said to be afresh variable [for, or with respect to, M ] if ν 6∈ Vars(M).2.14 Definition: The set ΛI. The set ΛI of all λI-terms, is defined inductively as follows:Let Symbols be a countable set of symbols.

i. Symbols ⊂ ΛI

ii. For any M,N ∈ Λ, we have (M N) ∈ ΛI.

iii. For any M ∈ ΛI, ν ∈ Symbols ∩ FreeVars(M), we have (λν.M) ∈ ΛI.

The set ΛK (or simply Λ) is the domain of discourse of the λK-calculus (or simply the λ-calculus). The set ΛI is the domain of discourse of the λI-calculus. These two calculi are definedin Definition 3.7.

2.3 Sequences

The following material is adapted from Barendregt’s text on the λ-calculus [1, Chapter 2,Item 2.1.3, Page 22]. These conventions help avoid much of the clutter that results from ex-cessive use of meta-syntactic ellipses within λ-terms:2.15 Abbreviation: Sequences of Symbols, Sequences of λ-Terms. When the size n of afinite sequence ν1, . . . , νn of symbols, or M1, . . . ,Mn of λ-terms is known, or can be inferred fromthe context, or is simply irrelevant, the sequence can be abbreviated as ~ν or ~M respectively.2.16 Definition: Set of λ-Terms Generated by a Given Set. [1, Section 8.1, Definition 8.1.1]Let S ⊆ Λ. The set of all λ-terms generated by S, is denoted by S+ and is the smallest set Wwhich satisfies

i. S ⊆ W

ii. For M,N ∈ W, (M N) ∈ W

Another way to view the set of λ-terms S+ generated by a set S is that S+ is the closure of Sunder application.

2.4 Combinators

2.17 Definition: Combinator. Set of All Combinators. A combinator is a λ-term M ∈ Λthat contains no free variables, i.e., for which

FreeVars(M) = ∅ (3)

The set of all combinators is denoted by Λ0.2.18 Definition: Proper Combinator. A proper combinator is an expression of the formλ~x.M , where M ∈ {~x}+. The arity of a proper combinator is |{~x}|. The λ-term M in the contextof the proper combinator, is referred to as the body of the proper combinator.2.19 Examples:

4

i. The following λ-terms are proper combinators: λx.x, λxy.x, and λxy.(x y y).

ii. The following λ-terms are not proper combinators: λx.(x (λx.x)), and λx.(x y).

iii. The arity of λxy.y is 2.

The combinators listed below are used throughout this work, and appear in much of theliterature on the λ-calculus:

ω = λx.(x x) (4)B = λxyz.(x (y z)) (5)C = λxyz.(x z y) (6)I = λx.x (7)J = λxyzt.(x y (x t z)) (8)K = λxy.x (9)S = λxyz.(x z (y z)) (10)

Unk = λx0 · · ·xn.xk (11)

W = λxy.(x y y) (12)Y

Curry= λf.((λx.(f (x x)))

(λx.(f (x x))))(13)

YTuring

= ((λfx.(f (x x f)))(λfx.(f (x x f))))

(14)

2.5 Substitution

The substitution in a λ-term M , of a free variable ν by a λ-term N , is denoted in the literatureby either M [N/ν] or M [ν := N ]. While it is important to recognise both notations, we use thelatter in this work, because it is used in Barendregt’s text on the λ-calculus [1, Item 2.1.15, Page27].2.20 Definition: Substitution of Free Variables. The substitution of all free occurrences ofa variable ν in a λ-term M by a λ-term N is denoted by M [ν := N ] and is defined inductively asfollows:

ν′[ν := N ] ={

N if ν′ = νν′ if ν′ 6= ν

(15)

(M ′ M ′′)[ν := N ] = (M ′[ν := N ] M ′′[ν := N ]) (16)

(λν.M ′)[ν := N ] =

λν′.M ′ where ν = ν′

λν′.M ′[ν := N ]where BoundVars(M ′) ∩ FreeVars(N) = ∅

(17)

The above definition does not specify the value of M [ν := N ], where N contains free variablesthat are bound in M . Removing the restriction in (17) would violate the common mathematicalintuition, according to which, an occurrence of a given variable is a place holder for whatever issubstituted for that variable: We expect the property of being a place holder for a given variable tobe invariant under substitution, for all variables other than the variable being substituted for. Forexample, suppose we relaxed the condition in (17), and allowed BoundVars(M ′)∩FreeVars(N) 6= ∅:Consider the expression

λx.(λy.λx.y)[y := x] (18)

The variable being substituted for is y, so the property of being a place holder for the variable inthe outer ‘λx’ should be invariant under substitution, and yet, the result of the substitution wouldbe λx.λx.x, and the occurrence of x is no longer a place holder for the variable in the outer ‘λx’,but rather for the inner one instead.

5

The problem of specifying (λν.M ′)[ν := N ], where BoundVars(M ′) ∩ FreeVars(N) 6= ∅, is dealtwith completely by the following convention: That an occurrence of a variable in a λ-term issimply a place holder for what is substituted for that variable, is a relationship between a variableoccurrence and the ‘λ’ that abstracts over this variable. Furthermore, this relationship is a propertyof the graph of the term rather than of the particular names we happen to choose in order todenote this relationship. Now that the names of variables are starting to get in the way of ourmathematical intuition, we define the problem away by partitioning the set of λ-terms modulothe renaming of bound variables, i.e., the substitution of bound variable names for fresh names(see Terminology 2.13). The result of this partitioning is that terms such as (λx.x) and (λy.y) areconsidered to be the same.

Getting back to the situation that prompted this discussion, concerning the substitution of(λν.M ′)[ν := N ], where BoundVars(M ′) ∩ FreeVars(N) 6= ∅, it should be clear why this situationno longer poses a problem: The term (λν.M ′) is equivalent, in the sense we just discussed, to a term(λν.M ′′) that differs from the previous term in that all the variables BoundVars(M ′)∩FreeVars(N)which are bound in M ′ have been renamed to fresh variables, which do not occur in M ′ orN . We now have BoundVars(M ′′) ∩ FreeVars(N) = ∅, and the substitution (λν.M ′′)[ν := N ]can be computed according to Definition 2.20. In light of the notion of equivalence modulorenaming of bound variables, the result of the substitution (λν.M ′′)[ν := N ] is taken as the resultof (λν.M ′)[ν := N ] as well.

This notion of equivalence is known, in various forms, as α-rules:2.21 Definition: α-Conversion, α-Equivalence.

• α-Conversion.

λx.M = λν.M [x := ν] (19)

where M ∈ Λ, and ν is fresh.

• α-Equivalence. The set of λ-terms is partitioned modulo α-conversion, and so λ-terms thatα-convert to each other are taken to represent the same entity.

In light of this discussion, α-conversion and α-equivalence will be implicit throughout theremainder of this work.

3 Reduction

3.1 Definition: R-Redex, −→R, −→→R, =R, R-normal form (R-nf), The Diamond Property,Church-Rosser (CR). [1, Definitions 3.1.3, 3.1.8, Page 51] Let R be a binary relation on Λ.

• An R-redex is a term M ∈ Λ, such that 〈M,N〉 ∈ R, for some term N ∈ Λ.

• −→R is the one-step reduction: For any M,N ∈ Λ, we have M −→R N if and only if〈M,N〉 ∈ R.

• −→→R is the reflexive and transitive closure of −→R.

• =R is the equivalence relation generated by −→→R.

• A λ-term M ∈ Λ is R-nf if it contains no R-redex as a sub-expression. A λ-term N is R-nfof a λ-term M , if M =R N , and N is R-nf.

• Let R be a binary relation on Λ. R satisfies the diamond property, if for all M,M1,M2 ∈ Λ,such that 〈M,M1〉, 〈M,M2〉 ∈ R, we have 〈M1,M3〉, 〈M2,M3〉 ∈ R, for some M3 ∈ Λ.

• Church-Rosser (CR). A relation R is CR, if −→→R satisfies the diamond property.

3.2 Definition: The Relations −→→+R , =+

R . Let R be a binary relation on Λ.

6

• −→→+R is the transitive closure of −→R.

• For any λ-term M,N ∈ Λ, if M =+R N then there exists a λ-term P ∈ Λ, such that either

M −→→+R P, and N −→→R P

or

M −→→R P, and N −→→+R P

3.3 Observation: For any M,N ∈ Λ, and a binary relation R, if M−→→RN then M =R N ,and if M−→→+

R N then M =+R N . The converse does not always hold. In this sense, −→→R and

−→→+R are stronger properties than =R and =+

R , respectively.3.4 Definition: The Binary Relations β, η, and βη. The binary relations β, β, and βηon Λ are defined as follows:

β = {〈((λx.M) N),M [x := N ]〉 :M,N ∈ Λ, x ∈ Symbols}

(20)

η = {〈(λx.(M x)),M〉 :M ∈ Λ, x ∈ Symbols− FreeVars(M)}

(21)

βη = β ∪ η (22)

3.5 Abbreviations: The Relations −→, −→→, =, −→→+, =+. Throughout the remainderof this work, the following abbreviations are used:

• −→ abbreviates −→βη.

• −→→ abbreviates −→→βη.

• = abbreviates =βη.

• −→→+ abbreviates −→→+βη

• =+ abbreviates =+βη

The notion of βη-equality has a finitary quality to it: If for some M,N ∈ Λ, we have M = N ,then there exists a term R ∈ Λ, such that M−→→R and N−→→R. But the −→→ relation has afinitary quality to it: If A−→→B, then there exists n ∈ Z

+, such that A(−→)nB, or put anotherway, A −→ B1 −→ · · · −→ Bn−1 −→ B, for some ~B ∈ Λ. Hence if M−→→R and N−→→R, thereexists m,n ∈ Z

+, such that M and N βη-reduce to R in m and n single-step βη-reductions,respectively.3.6 Abbreviation: fk. Through an abuse of notation we abbreviate λν.(fk ν) by fk. Thismay look like a one-step η-reduction, but it is not.3.7 Definition: The λK-calculus, the λI-calculus. The λK-calculus is a theory that studiesthe behaviour of ΛK under the rule βη (it is also known as the λKβη-calculus). The λI-calculus isa theory that studies the behaviour of ΛI under the rule βη (it is also known as the λIβη-calculus).3.8 Definition: The Axioms of the λ-Calculus (see [1, Definition 2.1.4, Chapter 2, Page 23]).The λ-calculus is axiomatised by the following axioms and rules:

i. β-conversion: ((λx.M) N) = M [x := N ]

ii. M = M

iii. If M = N , then N = M

iv. If M = N , and N = P , then M = P

v. If M = N , then (M P ) = (N P )

7

vi. If M = N , then (P M) = (P N)

vii. ξ-rule: If M = N , λx.M = λx.N

for all M,N,P ∈ Λ.Throughout this work, wherever we refer to the “λ-calculus,” we intend, by default, the λK-

calculus.Many variants of these calculi exist today. They differ in the addition or removal of reduction

rules, and the possible imposition of a system of types on the λ-terms.3.9 Definition: Head Normal Form. A λ-term M is a head normal form if M = λ~x.(y ~N)for ~x, y ∈ Symbols, and ~N ∈ Λ.3.10 Theorem: (Church-Rosser)

• β,βη are CR.

• For M,N ∈ Λ, if M =β N , then M−→→βP , and N−→→βP , for some P ∈ Λ.

• For M,N ∈ Λ, if M =βη N , then M−→→βηP , and N−→→βηP , for some P ∈ Λ.

Proof: See Theorem 3.2.8, and Theorem 3.3.9, in Barendregt’s text on the λ-calculus [1]. �3.11 Theorem: If M,N ∈ Λ have R-nf M ′, N ′, respectively, and if M =R N , thenM ′ =α N ′.

Proof: Since =R is an equivalence relation, it is reflexive, symmetric, and transitive. It followsthat N ′ =R N =R M =R M ′. We want to show that if N ′ =R M ′, for M ′, N ′ that are R-nf, thenN ′ =α M ′. Since M ′, N ′ are R-nf, they contain no R-redexes. It follows that they must be equalby reflexivity, which is taken modulo α. Our claim follows. �3.12 Definition: Solvability [1, Definition 2.2.10, Page 41]. A λ-term M is said to besolvable if there exist for some k ∈ N, k λ-terms ~N , such that

(M ~N) −→→R I (23)

If there exist such ~N ∈ ΛI, M is said to be λI-solvable. If there exist such ~N ∈ ΛK, M is said tobe λK-solvable.

The solvability properties of a term are important in understanding how this λ-term behaveswhen applied to some arguments, that is, how these arguments are used by the given λ-term.Terms that are not solvable are considered meaningless [1, Section 2§2, Page 41–42].

Heavy use is made of solvability in the λI-calculus, where the variable of an abstraction mustoccur free in the body of the abstraction, i.e., be used. Whereas in the λK-calculus, we can makeuse of the K combinator (and other expressions in ΛK − ΛI) in order to “abandon” unwanted λ-terms, in the λI-calculus, on the other hand, those terms must be “used up”. Solvability providesa way of “using up” a λ-term.3.13 Definition: Bohm Trees3 [1, Definition 10.1.3, Page 216]. A Bohm tree is a [possiblyinfinite] tree associated with a λ-term. The Bohm tree of M ∈ Λ is denoted by BT(M), and isdefined inductively as:

BT(M) = ⊥ (24)if M is unsolvable (i.e., a singlenode labelled ⊥.)

(25)

For any λ-term M , BT(M) is finite if and only if M has a normal form. In connection with this,Barendregt makes the beautiful observation that terms relate to their Bohm trees in the same waythat the real numbers relate to their continued fractions [1, Section 10§1 – 10§2, Page 215–245].Some relations between λ-terms and real numbers:

3 Bohm trees are named after the logician Corrado Bohm [1923 – ].

8

• Continued fractions are [possibly infinite] representations of real numbers, just as Bohm treesare [possible infinite] representations of λ-terms.

• Rational numbers and normalisable λ-terms are represented by finite continued fractionsand finite Bohm trees, respectively.

• Irrational numbers and non-normalisable λ-terms are represented by infinite continued frac-tions and infinite Bohm trees, respectively.

• Endowed with the appropriate topology, normalisable λ-terms form a dense subset of Λ, inpretty much the same way the set Q of rational numbers forms a dense subset of the set R

of real numbers.

4 Lamda Definability

4.1 Preliminaries

λ-Definability concerns itself with representing logical and mathematical entities in the λ-calculus:λ-terms are used to represent numbers, ordered n-tuples, boolean values, functions, and otherstructures.

We begin our excursion into λ-definability with the construction of a system of numerals,and the elementary number-theoretic functions defined over it. The numerals we consider here,and which are the default numerals used throughout this entire work, are known as the Churchnumerals, named after the logician Alonzo Church [1903 – 1995], who was the first to define andthen make use of them.

4.2 Church Numerals

Suppose we have some λ-expression representing the number 0, and a λ-expression s representingthe successor function. Then (s z) represents the number 1, (s (s z)) represents the number 2,etc. The n-th Church numeral is thus defined by abstracting s and z over the n-th application ofs to z:4.1 Definition: Church numerals. The n-th Church numeral, denoted by pnq is defined as

p0q = λsz.z (26)pn + 1q = λsz. (s · · · (s︸︷︷︸

n + 1

z) · · ·)

= λsz.(sn+1 z) (27)

Having defined Church numerals, let us define the elementary number-theoretic functions overthem, starting with the successor function.

We begin by noticing that since a Church numeral is an abstraction over the n-th applicationof some successor function to some zero numeral, it is a simple matter to convert between Churchnumerals and any other kind of numerals, given a representation z′ for zero in the alternate system,and an appropriate successor function s′:

(pnq s′ z′) −→→ (s′ · · · (s′︸︷︷︸n

z′) · · ·) (28)

Applying s′ to both sides of (28), we get:

(s′ (pnq s′ z′)) −→→ (s′ · · · (s′︸︷︷︸n + 1

z′) · · ·) (29)

9

We abstract the variables s and z for s′ and z′ over the left-hand side in (28), to get a numeral inChurch’s system:

pn + 1q = λsz.(s (pnq s z)) (30)

Since a λ-term which computes the successor function maps pnq to pn + 1q, we obtain such aλ-term by abstracting n for pnq over the right-hand side of (30):4.2 Proposition: The expression

SuccChurch

= λnsz.(s (n s z)) (31)

computes the successor function on Church numerals.Proof: Verify that (Succ

Churchpnq)−→→pn + 1q. �

We stated that a Church numeral is an abstraction of a successor and a zero over the n-thapplication of the given successor to the given zero. This observation served as a useful metaphor inderiving a successor function over the Church numerals. We now offer the more general observationthat Church numerals are abstractions over the n-th composition of some function: Given afunction f , the n-th composition of f is

fn = λx. (f · · · (f︸︷︷︸n

x) · · ·) (32)

The 0-th composition of f is defined to be the identity combinator I = λx.x. Abstracting f overfn is α-equivalent to the n-th Church numeral. We thus have a convenient way to compute thebounded composition of a λ-term f since

(pnq f) −→→ fn (33)

We will be using (33) implicitly, but extensively, throughout this work.In particular, we now make use of the above property in order to derive a λ-term which

computes the addition function on Church numerals: The λ-term (paq SuccChurch

) computes thefunction that adds paq to any Church numeral. Therefore

(paq SuccChurch

pbq) −→→ pa + bq (34)

A λ-term that computes the addition function is obtained by abstracting a and b for paq and pbq

respectively, over the left-hand side of (34):4.3 Proposition: The expression

AddChurch

= λab.(a SuccChurch

b) (35)

computes the addition function on Church numerals.Proof: Verify that

(AddChurch

paq pbq) −→→ pa + bq (36)

�Similarly, (Add

Churchpaq) computes the function that adds paq to a Church numeral, and

(pbq (AddChurch

paq)) computes the b-th composition of that function, which is the function thatadds pa× bq to a Church numeral. Therefore

(pbq (AddChurch

paq) p0q) −→→ pa× bq (37)

A λ-term that computes the multiplication function on Church numerals is obtained by abstractingthe variables a and b for paq and pbq respectively over the left-hand side of (37):4.4 Proposition: The expression

TimesChurch

= λab.(b (AddChurch

a) p0q) (38)

10

computes the multiplication function on Church numerals.Proof: Verify that (Times

Churchpaq pbq)−→→pa× bq. �

It should be clear by now how to construct the exponentiation function:4.5 Proposition: The expression

PowerChurch

= λab.(b (TimesChurch

a) p1q) (39)

computes the exponentiation function on Church numerals.Proof: Verify that (Power

Churchpaq pbq)−→→p

abq. �Concerning exponentiation, however, Church [4, Chapter II, Page 10] noticed that

(pbq paq) −→→ pabq

(40)

and so exponentiation can also be defined as λab.(b a). This fact, which is easy to verify, issuggested by the similarity between the algebraic laws of exponentiation and composition.

We can further iterate over the exponentiation function to obtain even faster growing functions,and arrive, in fact, at Ackermann’s function:4.6 Definition: Ackermann’s Function4. Ackermann’s function a is defined inductively asfollows:

a(0, q) = q + 1 (41)a(p + 1, 0) = a(p, 1) (42)

a(p + 1, q + 1) = a(p, a(p + 1, q)) (43)

The expansion of a(p+1, q), after q applications of (43), followed by a single application of (42),is

a(p + 1, q) = a(p, · · · a(p + 1︸︷︷︸q + 1

, 0) · · ·) (44)

= a(p, · · · a(p︸︷︷︸q + 1

, 1) · · ·) (45)

We curry over p, so ap(q) = a(p, q). We now further rewrite (45) as

ap+1(q) = ap(· · · ap(︸︷︷︸q + 1

1) · · ·) (46)

Let Ap be a λ-term computing ap on Church numerals. We compose Ap using Church numerals,exploiting the following property: For any λ-term f , and any n ∈ Z

+ we have

(pnq f) = fn (47)

Thus, equation (46) can be rewritten in the λ-calculus as

Ap+1 = Ap(· · ·Ap(︸︷︷︸q + 1

p1q) · · ·) (48)

= (pq + 1q App1q) (49)

= (SuccChurch

pqq App1q) (50)

The function f : Ap → Ap+1 is λ-definable by abstracting the variable q for pqq over Ap in (50):

f = λapq.(SuccChurch

q app1q) (51)

4 Wilhelm Ackermann, [1896 – 1962]

11

To compute a(p, q) = ap(q) we need to apply the p-th composition of f to A0 (which computes a0

on Church numerals). As can be seen from (41), A0 is just SuccChurch

. Abstracting over p and qyields:

AckChurch

= λpq.(p f A0 q) (52)=η λp.(p f A0) (53)= λp.(p (λaq.(Succ

Churchq a p1q))

SuccChurch

)(54)

4.7 Proposition: The expression

AckChurch

= λp.(p (λaq.(a (q a p1q))) SuccChurch

) (55)

computes Ackermann’s function on Church numerals.Proof: It is straightforward to verify that Ack

Churchsatisfies the following three equations:

(AckChurch

p0q pqq) = pq + 1q (56)

(AckChurch

pp + 1q p0q) = (AckChurch

ppq p1q) (57)

(AckChurch

pp + 1q pq + 1q) = (AckChurch

ppq

(AckChurch

pp + 1q pqq))(58)

which correspond to (41)–(43) respectively. �Our ability to construct Ackermann’s the way we did might come as a surprise: We used

Church numerals as a bounded iteration mechanism, and this was the only iteration mechanismwe needed. However, Ackermann’s function is not primitive recursive, and bounded iteration isthe only form of iteration used to generate the primitive recursive functions. Furthermore, if wewere to add unbounded iteration to the definition of the primitive recursive functions, we wouldhave general recursion, and could definitely compute Ackermann’s function. So it would seem asif bounded iteration is not powerful enough in order to define Ackermann’s function. In orderto understand how we could define Ackermann the way we did, we need to look at what we areapplying bounded iteration to. We use Church numerals as an iteration mechanism twice: First,to iterate over ap, and second, to iterate over f (see (51) and (55)). The function ap maps integersto integers, and the function f is a higher-order function, mapping functions from integers tointegers to functions from integers to integers. A careful examination of the primitive recursionschema reveals that bounded iteration is applied to first-order functions. So as we have seen, if werelax this restriction, we can use bounded iteration to generate functions that are not primitiverecursive.

In his book Proofs and Types [7, Section 7.3.2, Page 51], Girard mentions that Ackermann’sfunction is definable through finite iteration over some “reasonable” function of a complex type.Our construction provides such a reasonable function in the pure λ-calculus.

4.3 Boolean Values and Conditional Statements

We would like to have Boolean values, Boolean functions and predicates. The purpose for definingBoolean values is to have a selection mechanism of the form

if condition then do-if-true else do-if-false fi

Since the purpose of introducing Boolean values is to be able to use them to select betweentwo possibilities (i.e., expressions), we build the selection mechanism into the Boolean valuesthemselves, thus we define T, the Boolean value true, as taking two possibilities and selecting (i.e.,evaluating to) the first; similarly, we define F, the Boolean value false, as taking two possibilitiesand selecting the second.4.8 Definition: Boolean values.

T = λxy.x (59)F = λxy.y (60)

12

It is now a simple matter to define the standard Boolean operators: The combinator Notwhich takes an argument x and returns F if x is true, and T otherwise; The combinator Andtakes two arguments and returns T if both are true, F otherwise; And the Boolean combinatorOr takes two arguments and returns T if either is true, F otherwise. Notice that there is no needto define λ-expressions for if, then, else, or fi, since the Boolean value itself performs the selection.4.9 Proposition: The following expressions correspond respectively to the Boolean functions¬, ∧ and ∨:

Not = λx.(x F T) (61)And = λxy.(x (y T F) F) (62)

Or = λxy.(x T (y T F)) (63)

Note that it is straightforward to verify [by enumeration] that for any boolean value b, we have

(b T F) −→→ b (64)

and so the definitions of And (in (62)) and Or (in (63)), can be simplified. We refrain from thissimplification because the current definitions of And and Or can be read off directly from thetruth tables for the respective boolean functions, and in that sense they are more intuitive.

Proof: Verify that these Boolean functions behave as the corresponding truth tables suggest.�

Boolean values would be of little use to us without predicates. Since we have just defineda system of numerals, it would seem reasonable to define the Zero?

Churchpredicate. Church

numerals, which are the arguments to the Zero?Church

predicate take two arguments and applythe first to the second some number of times. Our approach to defining Zero?

Churchwould therefore

be to define two expressions A and B such that for all n ∈ N, we have

(p0q A B) −→→ T (65)(pn + 1q A B) −→→ F (66)

Clearly

(p0q A B) = ((λab.b) A B) (67)−→→ B (68)= T (69)

So B = T. Solving for A, we have pn + 1q = λab.(a Ma,b) where Ma,b is one of b, (a b), (a (a b)),etc., depending on n. So for n > 0 we apply b [at least once] while for n = 0 we completely ignoreb. Regardless of what b is applied to, we would like the application to evaluate to F. So we letA = λx.F. We now define Zero?

Churchto take a Church numeral n and apply it to the above

expressions A and B. Thus:4.10 Proposition: The expression

Zero?Church

= λn.(n (λx.F) T) (70)

computes the zero predicate on Church numerals.Proof: Verify that for any n ∈ N, we have

(Zero?Church

p0q) −→→ T (71)

(Zero?Church

pn + 1q) −→→ F (72)

�

13

4.4 Aggregate Data Structures

Numerals and Booleans are common data types in many programming languages, and definingthem in the λ-calculus gives λ-definability the expressibility of a programming language. Beforewe proceed any further, we would like to incorporate one additional concept from programminglanguages, that of aggregate data structures. The most convenient data structure for our purposesis the ordered n-tuple.

The encoding [x1, . . . , xn] of the ordered n-tuple 〈x1, . . . , xn〉 must allow us to project alongany dimension k ∈ {1, . . . , n}, and therefore [x1, . . . , xn] can be defined to take a selector of theform (λx1 · · ·xn.xk) and pass onto it x1, . . . , xn. Thus

[x1, . . . , xn] = λs.(s x1 · · ·xn) (73)

The k-th projection of an n-tuple can be defined to take an ordered n-tuple and pass onto it thek-th selector:

πnk = λt.(t (λx1 · · ·xn.xk)) (74)

Finally, the n + 1-tuple constructor can be defined by abstracting x1, . . . , xn+1 over [x1, . . . , xn+1]to give:4.11 Proposition: The following expressions are an n + 1-tuple constructor, and a k-thn-tuple projection function:

[, . . . ,︸︷︷︸n + 1

] = λx1 · · ·xn.[x1, . . . , xn+1] (75)

= λx1 · · ·xn+1s.(s x1 · · ·xn+1) (76)πn

k = λt.(t (λx1 · · ·xn.xk)) (77)

Proof: Verify that

(πn+1k ([, . . . ,︸︷︷︸

n + 1

] x1 · · ·xn+1)) −→→ (πn+1k [x1, . . . , xn+1]) (78)

−→→ xk (79)

�One might wonder why we chose not to define the ordered n-tuple (for n > 2) as nested ordered

pairs. The reason is that this definition would lead to anomalous solutions which do not exist giventhe representation in (73). Consider for example the following claim:4.12 Claim: Given the representation we chose for ordered n-tuples, there exists no expressionM such that [M,M ] = [M,M,M ].

Proof: Assume there exists such M . Then applying π33 to both sides of the equation we get:

(π33 [M,M,M ]) = (π3

3 [M,M ]). But:(π3

3 [M,M,M ]) −→→ M. (80)(π3

3 [M,M ]) −→→ ((λx1x2x3.x3) M M) (81)−→→ λx3.x3 ≡ I. (82)

so M = I. We now substitute I for M in the equation [M,M ] = [M,M,M ] to obtain

[I, I] = [I, I, I] (83)

which is a contradiction, by Theorem 3.11, because both the right and the left-hand side of (83)have different normal forms. It follows that the equation [M,M ] = [M,M,M ] has no solution.There is a solution however, given the representation of an ordered triple as a nested orderedpair: Let M = (Φ (λm.[m,m])) (where Φ is some fixed-point combinator). Clearly M = [M,M ],

14

and therefore [M,M ] = [M, [M,M ]]. Since we found an equation for which the existence of asolution depends on the choice between two representations for ordered n-tuples, it follows thatthese representations are not equivalent. �

Now that we have defined ordered n-tuples, we can return to λ-defining the elementary number-theoretic functions.4.13 Definition: The Monus Function. The monus function, denoted by −· (written as adot over a minus sign) is defined as follows:

a−· b ={

a− b if a ≥ b0 otherwise (84)

The attentive reader may have noticed that we have yet to define a predecessor function, andmay have wondered why we chose to postpone its construction. In fact, the predecessor functionPred

Churchover the Church numerals poses a special challenge: Church numerals, as stated before,

are abstractions over the n-th composition of some function, so

(PredChurch

pn + 1q) (85)= (Pred

Church(λfx. (f · · · (f︸︷︷︸

n + 1

x) · · ·)) (86)

−→→ λfx. (f · · · (f︸︷︷︸n

x) · · ·) (87)

It is hardly obvious how to “peal off” one application of f in pn + 1q to yield pnq. The followingidea is due to Kleene: Consider the function f mapping [a, b] to [(Succ

Churcha), a]. We can define

f as follows:

λp.([(SuccChurch

(π21 p)), (π2

1 p)]) (88)

Notice that for any x ∈ Λ

(pn + 1q f [p0q, x]) −→→ (fn+1 [p0q, x]) (89)−→→ [pn + 1q, pnq] (90)

Although the predecessor function is undefined for zero, it would be useful to have

(PredChurch

p0q) −→→ p0q (91)

For example, it would make the definition of the monus function much simpler. And so we choosex to be p0q. Hence

(pnq f [p0q, p0q]) −→→ [pnq, pn−· 1q] (92)

and so

(π22 (pnq f [p0q, p0q])) −→→ pn−· 1q (93)

The predecessor function is defined by abstracting the variable n for pnq over (π22 (pnq f [p0q, p0q])):

4.14 Proposition: The λ-expression PredChurch

given below computes the predecessor functionon Church numerals:

PredChurch

= λn(π22 (n (λp.[(Succ

Church(π2

1 p)),(π2

1 p)])[p0q, p0q]))

(94)

Proof: Verify that (PredChurch

pnq) −→→ pn−· 1q. �

15

The same technique carries over to other number-theoretic functions. For example, considerthe function f mapping [a, b] to [(Succ

Churcha), (Times

Churcha b)]. It is a simple matter to define

f :

λp.[(SuccChurch

(π21 p)),

(TimesChurch

(π21 p) (π2

2 p))](95)

Finally, observe that

(pnq f [p1q, p1q]) −→→ (fn [p2q, p1q]) (96)−→→ [pn + 1q, pn!q] (97)

So by taking the second tuple of the last expression, and abstracting the variable n for pnq overthe left-hand side of (96), we get a definition for the factorial function:

FactChurch

= λn.(π22 (n (λp.[(Succ

Church(π2

1 p)),(Times

Church(π2

1 p) (π22 p))])

[p1q, p1q]))

(98)

Given the definition of the predecessor function, we can now λ-define the monus function overChurch numerals (Definition 4.13 defined the monus function over Z

+). The reason for λ-definingthe monus function, rather than the minus function, has to do with the fact that the combinatorwe defined for the predecessor function satisfies (91): We need only apply the b-th composition ofPred

Churchto paq to get pa−· bq:

4.15 Proposition: The expression

MonusChurch

= λab.(b PredChurch

a) (99)

computes the monus function on Church numerals.Proof: Verify that (Monus

Churchpaq pbq)−→→pa−· bq. �

Equality of Church numerals is defined in terms of MonusChurch

:4.16 Proposition: The expression Equal?

Churchdefined below computes the equality predicate

on Church numerals.

Equal?Church

= λab.(And (Zero?Church

(MonusChurch

a b))(Zero?

Church(Monus

Churchb a)))

(100)

Proof: Verify that for all a, b ∈ N, a 6= b, we have

(Equal?Church

paq paq) −→→ T (101)

(Equal?Church

paq pbq) −→→ F (102)

�The next proposition is similar, and is given without proof.

4.17 Proposition: The expressions below compute the functions 6=,≤,≥, <, > on Churchnumerals respectively:

NotEqual?Church

= λab.(Not (Equal?Church

a b)) (103)LessThanOrEqual?

Church

= λab.(Zero?Church

(MonusChurch

a b)) (104)

16

GreaterThanOrEqual?Church

= λab.(Zero?Church

(MonusChurch

b a)) (105)LessThan?

Church

= λab.(Not (GreaterThanOrEqual?Church

a b)) (106)GreaterThan?

Church

= λab.(Not (LessThanOrEqual?Church

a b)) (107)

5 Fixed Points

5.1 Single Fixed Points

In mathematics, a fixed point of a given function f is a point x0 ∈ Domain(f) ∩ Range(f), thatsatisfies f(x0) = x0. If we consider the set of functions defined over the real line {f : R → R}, wecan find functions with finitely many, countably many, uncountably many, or no fixed points.

In the λ-calculus, the situation is somewhat similar:5.1 Definition: A Fixed Point. The fixed-point of a λ-term M is a λ-term x that satisfies

(M x) = x (108)

A difference worth noting between the notion of a fixed point in mathematics, and the correspond-ing notion in the λ-calculus is that in the λ-calculus both M and x are λ-terms. By Intuition 2.2,both M and x can be thought of as functions. But what does it mean for a fixed point to be afunction? We explore this question through the next example.5.2 Example: The Factorial Function as a Fixed Point. The partial function Fn maps pkq topk!q, for all k = 1, . . . , n, and is undefined for all k > n. The higher-order function g : Fk → Fk+1

for any k ∈ N is λ-definable as follows:

g = λfk.(Zero?Church

k p1q

(TimesChurch

n (f (PredChurch

n))))(109)

What are the fixed points of g ? It can be seen by induction, that any fixed-point of g will mappnq → pn!q for all n ∈ N, and so all fixed-points of g are extensions of the factorial function onChurch numerals. In this sense, the factorial function on Church numerals can be considered tobe the least fixed point of f .5

5.3 Definition: A Fixed-Point Combinator. A combinator Φ is a fixed-point combinator,if for any M ∈ Λ, we have

(Φ M) = (M (Φ M)) (110)

So (Φ M) is a fixed point of M .5.4 Definition: Curry’s Fixed-Point Combinator, Turing’s Fixed-Point Combinator.6 Thefollowing are two well-known fixed-point combinators. Curry’s fixed-point combinator, Y

Curry, is

given by

YCurry

= λf.((λx.(f (x x)))(λx.(f (x x))))

(111)

Turing’s fixed-point combinator, YTuring

is given by

YTuring

= ((λxf.(f (x x f)))(λxf.(f (x x f))))

(112)

5 For a more formal treatment of the ordering of functions, the reader is referred to any of the standard textson lattices and domains [8].

6 These terms are named after Haskell Brooks Curry [1900 – 1982], and Alan Mathison Turing [1912 – 1954].

17

Turing’s fixed-point combinator satisfies for all M ∈ Λ:

(YTuring

M) −→→ (M (YTuring

M)) (113)

This property is a stronger property than (110) (Recall Observation 3.3).Returning to Example 5.2, where the factorial function is defined as the fixed point of the

higher-order function g, we are now in the position to provide an alternate definition for FactChurch

,using Y

Curryand Y

Turing. We state the following claim without proof:

5.5 Claim: Let g be defined as in (109). The combinators Fact′Church

and Fact′′Church

, definedbelow, both compute the factorial function on Church numerals:

Fact′Church

= (YCurry

g) (114)

Fact′′Church

= (YTuring

g) (115)

Another approach to defining recursive functions is to use the minimalisation operator (aka:“µ operator”):5.6 Definition: The µ Operator. The µ operator is a partial map that takes a predicateP , defined over Z

+, and returns the smallest number that satisfies P (if such a number exists).5.7 Theorem: The µ Operator is λ-Definable.

Proof: Let Φ be any fixed-point combinator. We define

Rp = λrn.(p n n (r (SuccChurch

n))) (116)

µ = λp.(Φ Rpp0q) (117)

It is straightforward to verify that

(Φ Rppnq) =

{pnq if (p pnq)−→→T(Φ Rp

pn + 1q) otherwise

�

5.2 Multiple Fixed Points

Multiple fixed points and multiple fixed-point combinators extend the notions of fixed points andfixed-point combinators, respectively. Just as we can define recursive functions by applying a fixed-point combinator to a higher-order function (such as g in (109)), we can define mutually recursivefunctions by applying a multiple fixed-point combinator to several higher-order functions.5.8 Definition: Multiple Fixed Points. For all n ∈ N, the multiple fixed-points of n λ-termsA1, . . . , An, are n λ-terms x1, . . . , xn that satisfy:

(A1 x1 · · ·xn) = x1

· · ·(An x1 · · ·xn) = xn

It is a simple matter to verify that when n = 1, this definition coincides with Definition 5.1.5.9 Definition: Multiple Fixed-Point Combinators. The n λ-terms Φ1, . . . ,Φn are said tobe fixed-point combinators for a set of n λ-terms, if for any n λ-terms A1, . . . , An, the λ-terms

(Φ1 A1 · · ·An)· · ·

(Φn A1 · · ·An)

18

are multiple fixed-points of A1, . . . , An, respectively.The proof of the following theorem is a generalisation of a beautiful proof due to Smullyan7 [2,

Pages 334–335].5.10 Theorem: There exist multiple fixed-point combinators for a set of n λ-terms.

Proof: Pick n ∈ N. Using an ordinary fixed-point combinator Φ as in Definition 5.3, we candefine a combinator M that satisfies:

(M x0 x1 · · ·xn) = (x0 (M x1 x1 · · ·xn)· · ·(M xn x1 · · ·xn))

(118)

A definition for M could be

M = (Φ (λmx0x1 · · ·xn.(x0 (m x1 x1 · · ·xn)· · ·(m xn x1 · · ·xn))))

(119)

In terms of M we can define for k = 1, . . . , n, the k-th fixed-point combinator for n λ-terms:

Φk = λx1 · · ·xn.(M xk x1 · · ·xn) (120)

It is straightforward to verify that Φ1, . . . ,Φn satisfy Definition 5.9. �In the above derivation of multiple fixed-point combinators, we make use of an ordinary fixed-

point combinator (for example, YCurry

, or YTuring

). For an alternate derivation of multiple fixedpoints using a fixed-point combinator8, see Bekic’s Theorem in Winskel’s The Formal Semanticsof Programming Languages [9, Chapter 10, Section]. Bekic himself referred to the theorem as thebisection lemma [3, Page 39].5.11 Examples: Defining the Even?

Church, and Odd?

Churchcombinators as Multiple Fixed

Points. Let

E = λeon.(Zero?Church

n T (o (PredChurch

n))) (121)O = λeon.(Zero?

Churchn F (e (Pred

Churchn))) (122)

Let Φ1,Φ2 be multiple fixed-point combinators for two expressions. We can now define:

Even?Church

= (Φ1 E O) (123)Odd?

Church= (Φ2 E O) (124)

6 Bases

The notion of a basis in the λ-calculus is analogous to the corresponding notion in linear algebra:A basis B for a vector space W is a finite set of vectors, whose closure under the operations ofaddition and scalar multiplication is equal to W. The situation in the λ-calculus is similar: Abasis B for a set of terms W is a set of λ-terms (which need not be finite), whose closure, underthe operation of application, is equal to W.

An implication of this similarity in the two notions of bases, is that the questions we willconcern ourselves with, regarding bases in the λ-calculus, are similar to the questions that are of

7 Raymond Merrill Smullyan [1919 – ].8 In the actual derivation, a ‘µ’ operator is used to fix a variable in an expression, but expressions that contain

‘µ’ can be rewritten in terms of a fixed-point operator: µx.M is equivalent to (Φ (λx.M)), for any fixed-pointcombinator Φ.

19

interest in linear algebra: How do we represent an arbitrary λ-term as a combination of λ-termsin a given basis? and How small can a basis be if it is to span the set Λ0 of all combinators?6.1 Definition: A Basis. A set B is a basis for a set L if and only if for any M ∈ L thereexists N ∈ B+, such that M = N .6.2 Definition: Abstraction Algorithm. Let a set B be a basis for a set L. An algorithmthat associates with each M ∈ L a particular N ∈ B+ is known as an abstraction algorithm.6.3 Theorem: The set N = {I,K,B,C,S} forms a basis for Λ0.

Proof: Pick M ∈ Λ0. We want to show that M ∈ N+. We use [[ ]]N

to denote the abstractionalgorithm for N. The algorithm is defined inductively on the length of M . There are five cases toconsider, following the recursive structure of a λ-term of the form (λν.(M N).

There are five cases to consider:

• Base case: [[x]]N

= x, for x ∈ {I,K,B,C,S} ∪ Symbols.

• The K-Case: M = λν.M ′, where ν 6∈ FreeVars(M ′).

We have ||M ′|| < ||M ||, and

((λmν.m) M ′) = (K M ′) −→→ M

So we define

[[λν.M ′]]N

= (K [[M ′]]N) (125)

• The C-Case: M = λν.(M ′ν M ′′), where ν ∈ FreeVars(M ′

ν), ν 6∈ FreeVars(M ′′).

We have ||λν.M ′ν ||, ||M ′′|| < ||M ||, and

((λm′m′′ν.(m′ ν M ′′)) (λν.M ′ν) M ′′)

= (C (λν.M ′ν) M ′′)

−→→ M

So we define

[[λν.(M ′ν M ′′)]]

N= (C [[λν.M ′

ν ]]N

[[M ′′]]N) (126)

• The B-Case: M = λν.(M ′ M ′′ν ), where ν 6∈ FreeVars(M ′), ν ∈ FreeVars(M ′′

ν )

We have ||M ′||, ||λν.M ′′ν || < ||M ||, and

((λm′m′′ν.(m′ (m′′ ν))) M ′ (λν.M ′′ν ))

= (B M ′ (λν.M ′′ν ))

−→→ M

So we define

[[λν.(M ′ M ′′ν )]]

N= (B [[M ′]]

N[[λν.M ′′

ν ]]N) (127)

• The S-Case: M = λν.(Mν Nν), where ν ∈ FreeVars(Mν), ν ∈ FreeVars(Nν)

We have ||λν.M ′ν ||, ||λν.M ′′

ν || < ||M ||, and

((λm′m′′ν.(m′ ν (m′′ ν))) (λν.M ′ν) (λν.M ′′

ν ))= (S (λν.M ′

ν) (λν.M ′′ν ))

−→→ M

So we define

[[λν.(Mν Nν)]]N

= (S [[λν.M ′ν ]]

N[[λν.M ′′

ν ]]N) (128)

20

This completes the proof. �The C, and B-comb cases are, in fact, special instances of the S-case: The term λν.(Mν N) can

be rewritten as λν.(Mν (K N ν)), and the term λν.(M Nν) can be rewritten as λν.((K M ν) Nν).Furthermore, λν.ν can be rewritten as λν.(K ν (K ν)), so that the S-case will be applicable. Thisis the rationale behind Theorem 6.5.6.4 Examples: Here are some familiar combinators, and their equivalents in N+:

W = λxy.(x y y) = (C S I)Succ

Church= λxyz.(y (x y z)) = (S B)

p0q = λxy.y = (K I)

6.5 Theorem: The set {K,S} forms a basis for Λ0.Proof: Observe that:

I = (S K K) (129)B = (S (K S) K) (130)C = (S (S (K S) (S (K K) S)) (K K)) (131)

Now apply Theorem 6.3. �6.6 Definition: A One-Point Basis. A one-point basis is a basis consisting of a singleλ-term.6.7 Theorem: There exists a one-point basis for Λ0.

Proof: Let X be the ordered pair of S and K, i.e.:

X = [S,K] = λx.(x S K) (132)

Observe that

(X (X (X X))) = K (133)(X (X (X (X X)))) = S (134)

Now apply Theorem 6.5. �The λ-term X in (132), which forms the one-point basis used in the proof of Theorem 6.7, is

the smallest (in the sense of length of the λ-term, see Definition 2.6) λ-term that forms a one-point basis, the author is aware of. This author has not found this λ-term in any of the standardliterature on the λ-calculus, and so to the best of his knowledge, this λ-term is new.

7 Conclusion and Issues

In this work we provided a brief introduction to the λ-calculus. We covered the following topics:

• Syntax

• Reduction rules

• λ-Definability

• Fixed points

• Bases

This work provides a cursory overview of some of the main topics in the λ-calculus. For a morecomprehensive treatment of the λ-calculus, we refer the reader to Church’s Calculi of LambdaConversion [4], Curry’s Combinatory Logic I, II [5, 6], and Barendregt’s The Lambda Calculus:Its Syntax and Semantics [1].

21

References

[1] Hendrik P. Barendregt. The Lambda Calculus, Its Syntax and Semantics. North-Holland, 1984.

[2] Hendrik P. Barendregt. Functional programming and the λ-calculus. In Jan van Leeuwen,editor, Handbook of Theoretical Computer Science, chapter 7, pages 323 – 363. MIT Press,Cambridge, Massachusetts, 1990.

[3] Hans Bekic. Definable Operations in General Algebras, and the Theoryof Automata andFlowcharts, pages 30–55. Lecture Notes in Computer Science, 177. Springer-Verlag, 1984.

[4] Alonzo Church. The Calculi of Lambda-Conversion. Princeton University Press, 1941.

[5] Haskell B. Curry, Robert Feys, and William Craig. Combinatory Logic, volume I. North-Holland Publishing Company, 1958.

[6] Haskell B. Curry, J. Roger Hindley, and Jonathan P. Seldin. Combinatory Logic, volume II.North-Holland Publishing Company, 1972.

[7] Jean-Yves Girard, Yves Lafont, and Paul Taylor. Proofs and Types, volume 7 of CambridgeTracts in Theoretical Computer Science. Cambridge University Press, 1989.

[8] Joseph Stoy. Denotational Semantics: the Scott-Strachey approach to programming languagetheory. The MIT Press Series in Computer Science. MIT Press, 1977.

[9] Glynn Winskel. The Formal Semantics of Programming Languages: An Introduction. MITPress, Cambridge, Massachusetts, 1993.

22

An Introduction to the Lambda Calculus - Universidad de Chileabassi/Cursos/41a/lambdacalc.pdf · An...

Documents

Transcript of An Introduction to the Lambda Calculus - Universidad de Chileabassi/Cursos/41a/lambdacalc.pdf · An...