An Athena tutorial - People | MIT...

An Athena tutorial

Konstantine Arkoudas

March 29, 2005

Chapter 1

Preliminaries

1.1 What is Athena?

Athena is a new language that integrates computation and deduction. As a programming language,Athena is a higher-order dynamically typed functional programming language with side effects, similarto Scheme. As a logic tool, Athena is a proof checker for a novel formulation of Fitch-style naturaldeduction for first-order logic with equality, sorts, and polymorphism; and an interactive theoremproving system for discovering theorems. It incorporates facilities for high-performance automatedtheorem proving and for model generation. It has a rich and fluid language for writing tactics forproof search, and supports inductive reasoning over arbitrary first-order datatypes. This tutorial ispreliminary and far from exhaustive. It describes only those aspects of the language that a completenewcomer would need to know in order to start doing non-trivial proofs fairly quickly. By the end ofthe tutorial, you should be able to do some pretty interesting proofs in Athena, including inductiveproofs over structures such as numbers, lists, and trees. To report errors or to make comments, pleaseemail me at [email protected].

1.2 Interacting with Athena

Athena can be used either in batch mode or interactively. The interactive mode is used in most cases.It consists of a read-eval-print loop similar to that used by other functional and logic programminglanguages (Scheme, ML, Prolog, etc.). The user enters some input in response to the Athena prompt>, Athena evaluates that input, displays the result, and the process is repeated. The user can quit atany time by typing quit at the prompt.

Typing directly at the prompt all the time is tiresome. It is often more convenient to edit allthe desired Athena text (declarations, definitions, assertions, proofs, etc.) into a file, say foo.ath,and then have Athena read the contents of that file at once, processing the inputs sequentially—fromthe top of the file to bottom—as if they had been entered directly at the prompt. This can be donewith the load-file command. Typing (load-file "foo.ath") at the Athena prompt will processfoo.ath in that manner.

Note that load-file commands can themselves appear inside files. If the command

(load-file "groups.ath")

1

was encountered while loading foo.ath, Athena would proceed to load groups.ath and then return toloading foo.ath, resuming at the point immediately after the (load-file "groups.ath") command.

1.3 Domains, terms, and propositions

1.3.1 Domains

A universe of discourse is introduced with a domain declaration. The syntax is (domain I), for anyidentifier I.1 Here is an example:

>(domain Person)

New domain Person introduced.

Athena responds by confirming that a new domain named Person has been introduced. If the userattempts to reintroduce the same domain later on, Athena will issue a warning reminding the userthat the domain already exists.

Multiple domains can be introduced in a single line with the keyword domains. For instance, thefollowing introduces a domain for rational numbers, a domain Element of individuals, and a domainSet intended to represent sets of objects from Element.

>(domains Rational Element Set)

New domain Rational introduced.

New domain Element introduced.

New domain Set introduced.

There are some domains that are predefined in Athena. Boolean is the most important of them.2

1.3.2 Function symbols

Once a domain has been introduced we can go ahead and declare function symbols, for instance:

>(declare father (-> (Person) Person))

New symbol father declared.

This simply says that father denotes an operation (function) that takes a person and producesanother person. We refer to the expression (-> (Person) Person) as the signature of father.Nothing else is known about father at this point, only its signature.

Any identifier can be used as a function symbol, including non-alphanumeric strings, e.g.:1An identifier in Athena is any non-empty string of printable characters (i.e., with ASCII codes ranging from 33 to

126) that satisfies the following constraints: (a) The first character can be anything except these fourteen characters:

! " # $ ’ ( ) , : ; ? [ ] ‘

(b) It does not contain any of the following seven characters:

" # ) , : ; ]

2Boolean is actually a special type of domain: a datatype. Those are discussed in Section 4.1.

2

>(declare + (-> (Rational Rational) Rational))

New symbol + declared.

The general syntax form for a function symbol declaration is

(declare f (-> (D1 · · · Dn) D))

where f is an identifier and the Di and D are domains, n ≥ 0. We refer to D1 · · ·Dn as the inputdomains of f , and to D as the result domain, or as the range of f . The number of input domains,n, is the arity of f . Function symbols must be unique, so they cannot be redeclared. In particular,there is no overloading.

A function symbol of arity zero is called a constant symbol. A constant symbol c of domain Dcan be introduced simply by writing (declare c D), instead of (declare c (-> () D)). The twoforms are equivalent, though the first is more convenient. Example:

>(declare joe Person)

New symbol joe declared.

>(declare zero Rational)

New symbol zero declared.

>(declare null Set)

New symbol null declared.

There are several built-in constant symbols in Athena. Two of them are true and false, both ofwhich are elements of the built-in domain Boolean.

Relations are simply functions whose range is Boolean. For instance:

>(declare <= (-> (Rational Rational) Rational))

New symbol <= declared.

>(declare member (-> (Element Set) Boolean))

New symbol member declared.

Here <= stands for a function that maps any two Rational numbers to true or false, accordingto whether or not the first number is less than or equal to the second (assuming that we have this“standard interpretation” of <= in mind). Therefore, we can think of <= as a binary predicate onthe domain Rational—the usual less-than-or-equal-to relation. Likewise, member is a binary relationthat holds between a given Element and a given Set iff the first is a member of the second. We alsointroduce two relation symbols empty? and subset with the obvious intended interpretations:

>(declare empty? (-> (Set) Boolean))

New symbol empty? declared.

3

>(declare subset (-> (Set Set) Boolean))

New symbol subset declared.

For brevity, multiple function symbols that share the same signature can be declared in a singleline by listing them within parentheses:

(declare (f1 · · · fn) (-> (D1 · · · Dn) D))

For example:

>(declare (male female) (-> (Person) Boolean))

New symbol male declared.

New symbol female declared.

>(declare (union intersection) (-> (Set Set) Set))

New symbol union declared.

New symbol intersection declared.

Multiple constant symbols of the same domain can also be likewise introduced:

>(declare (ann peter sue) Person)

New symbol ann declared.

New symbol peter declared.

New symbol sue declared.

Some function symbols are built-in. One of them is the binary equality symbol =, which takes anytwo objects of the same domain D and returns a Boolean. D can be arbitrary.3

1.3.3 Terms

A term is a syntactic object that represents an element of some domain. The simplest type of termis a constant symbol. For instance, assuming the declarations of the previous section, if we type joeat the Athena prompt, Athena will recognize the input as a term:

>joe

Term: joe

3More precisely, = is a polymorphic function symbol. We discuss those in Section ??.

4

A variable is also a term—it denotes an indeterminate individual in some domain. Variables are ofthe form ?I, for any identifier I.4 The following are all legal variables: ?x, ?foo-bar, ?v_1, ?789,?@sd%&. There is no restriction on the length of a variable, and very few restrictions on the kind ofcharacters that can appear inside a variable.

Constant symbols and variables serve as the primitive building blocks of terms. More complexterms can be formed by applying a function symbol f to n given terms t1 · · · tn, where n is the arityof f . Such an application is written in prefix form as (f t1 · · · tn). For instance, the following are alllegal terms:

1. (father joe)2. (+ ?m ?n)3. (father (father joe))4. (= peter (father joe))5. (male (father joe))6. (<= zero (+ zero ?n))7. (= (member ?x null) (male ann))8. (subset ?s1 (union ?s1 ?s2))

Every term is of a certain sort. The sort of (father joe), for example, is Person; the sort of(+ zero ?n) is Rational; and so on. In general, the sort of a term (f t1 · · · tn) is D, where D is therange of f .

Given an interpretation of the various domains and function symbols, a term designates an elementof its sort. For example, given the expected interpretation of the domain Person, the function symbolfather, etc., the term (father joe) would designate the father of the person named joe. The term(male (father joe)) designates true, etc. If a term contains variables then it does not designateany specific individual, unless we explicitly assign denotations to all the variables. For instance, wedo not know what (+ zero ?n) is, since we do not know what ?n stands for. A term that does notcontain any variables is called ground. Ground terms always denote domain elements (again, withrespect to a given interpretation).

Some terms are ill-sorted, e.g., (father zero). This is a “type error” (only we speak of sortsinstead of types), because the function symbol father requires an argument from the Person domainand we are applying it to an Rational instead. Athena performs sort-checking automatically, andwill detect and report any ill-sorted terms as errors:

>(+ zero joe)

Error, top level, 1.1: Ill-sorted term: (+ zero joe).

Here 1.1 indicates line 1, column 1, the precise position where the offending term was located; and “toplevel” means that the input was entered directly into the Athena promt. If the term (+ zero joe)had been read in batch mode from a file foo.ath instead, and the term was found, say, on line 279,column 35, the message would have been:

Error, foo.ath, 279.35: Ill-sorted term: (+ zero joe).

The user does not need to declare the sorts of variables explicitly. Athena infers them automati-cally. For instance, it is not necessary to write something like (+ zero ?n:Rational); we just write(+ zero ?n). Athena will realize that the sort of ?n must be Rational for this term to be well-sorted.

4Actually, ?I is an abbreviation for the built-in syntax form (var I).

5

1.3.4 Propositions

Propositions are the bread and butter of Athena. Every successful proof derives a proposition. Propo-sitions make statements about relationships that may (or may not) hold amongst elements of varioussorts. There are three types of propositions:

1. Atomic propositions, or just “atoms.” Those are simply terms of sort Boolean. Examples are(male joe), (<= zero zero), (subset ?s1 (union ?s1 ?s2)), and (= peter (father ann)).

2. Sentential combinations, obtained from other propositions through one of the five sententialconnectives: not, and, or, if, and iff. Those have the expected interpretations: (not P)negates proposition P ; (and P Q) expresses the conjunction of P and Q; (or P Q) expressestheir alternation (disjunction); (if P Q) expresses the material conditional if P then Q; and(iff P Q) expresses the equivalence P if and only if Q.

3. Quantified propositions. There are two main quantifiers in Athena, forall and exists. Aquantified proposition is of the form (q x P) where q is a quantifier (either forall or exists),x is a variable, and P is a proposition—the body of the quantification. An example:

(forall ?n (= (+ ?n zero) ?n))

A proposition of the form (forall x P) says that P holds for every element x (of a certainsort), while (exists x P) says that there is some object x for which P holds.

The following shows what happens when the user types this proposition at the Athena prompt:

>(forall ?n (= (+ ?n zero) ?n))

Proposition: (forall ?n:Rational(= (+ ?n ?m)

?n))

Athena acknowledges that a proposition was entered, and displays that proposition in a neat formatusing standard s-expression indentation. Note that the user did not have to explicitly indicate thesort of the quantified variable. Athena inferred that ?n ranges over Rational, and annotated thequantified occurrence of ?n in the output with that sort.

As a shorthand for iterated occurrences of the same quantifier, the user may enter propositionsof the form (q x1 · · ·xn P) for a quantifier q and any number n > zero of variables x1 · · ·xn. Forinstance:

>(forall ?n ?m (= (+ ?n ?m) (+ ?m ?n)))

Proposition: (forall ?n:Rational(forall ?m:Rational(= (+ ?n ?m)

(+ ?m ?n))))

A similar shorthand exists for the sentential connectives and and or. Both of these can take anarbitrary number n > zero of arguments: (and P1 · · ·Pn) and (or P1 · · ·Pn). For example:

6

>(and (empty? null) (male (father joe)) (= (+ ?n zero) ?n))

Proposition: (and (empty? null)(male (father joe))(= (+ ?n zero)

?n))

Free and bound variable occurrences are defined as usual.5 Variables with free occurrences must haveconsistent sorts across a given proposition. For instance, (and (<= zero ?n) (= ?n (+ ?m1 ?m2)))is legal, but (and (male ?n) (= ?n (+ ?m1 ?m2))) is not, and would generate an error. However,bound variable occurrences can have different sorts in the same proposition:

>(and (forall ?x (male (father ?x)))(forall ?x (<= ?x (+ ?x ?x))))

Proposition: (and (forall ?x:Person(male (father ?x)))

(forall ?x:Rational(<= ?x

(+ ?x ?x))))

Intuitively, this is because the specific name of a bound variable is immaterial. For instance, there isno real difference between (forall ?x (male (father ?x))) and

(forall ?p (male (father ?p)))

Both propositions say the exact same thing, even though one of them uses ?x as a bound variableand the other uses ?p. We say that the two propositions are alphabetically equivalent. This meansthat each can be obtained from the other by consistently renaming bound variables. Alphabeticallyequivalent propositions are “essentially identical,” and indeed we will see later on that Athena treatsthem as such for deduction purposes.

1.4 Definitions

The ability to abbreviate a complex object by giving it a name and then subsequently referring to it bythat name is crucial for managing complexity. There are several naming mechanisms in Athena. Oneof the most useful is the top-level directive define. The general syntax form for it is (define I F)where I is any identifier (name) and F is a phrase denoting the object that we want to define. Oncea definition has been made, the defined object can be referred to by its name:

>(define P (forall ?s (subset ?s ?s)))

Proposition P defined.

>(define Q (forall ?x(exists ?y(and (<= ?x ?y)

5Refer, for instance, to “The Logic Book” [3].

7

(not (= ?x ?y))))))

Proposition Q defined.

>(and P Q)

Proposition: (and (forall ?s:Set(subset ?s ?s))

(forall ?x:Rational(exists ?y:Rational(and (<= ?x ?y)

(not (= ?x ?y))))))

Athena has lexical scoping. New definitions override older ones:

>(define t joe)

Term t defined.

>t

Term: joe

>(define t (+ zero zero))

Term t defined.

>t

Term: (+ zero zero)

We stress that define is a top-level directive, not a procedure. A define can only appear at the toplevel by itself, i.e., typed directly in response to Athena’s input prompt. It cannot be nested insideother code. So something like (let ((s null)) (define P (empty? s))) is a syntax error thatwould be rejected by the Athena parser.

1.5 Assumption bases

Athena maintains a global set of propositions called the assumption base. We can think of the elementsof the assumption base as our premises—propositions that we regard as true. Initially the systemstarts with the empty assumption base. Every time an axiom is postulated or a theorem is proved,the corresponding proposition is inserted into the assumption base.

The assumption base can be inspected with the built-in procedure6 show-assumption-base. Itdoes not take any arguments, so it is invoked as (show-assumption-base). When Athena is firststarted, the assumption base is empty:

>(show-assumption-base)

6Procedures are discussed in the second part of the tutorial.

8

The assumption base is currently empty.

Unit: ()

A proposition can be inserted in the assumption base with the top-level directive assert:

>(assert (= (father ann) joe))

The proposition(= (father ann)

joe)has been added to the assumption base.

Unit: ()

The above assertion postulated the proposition that the father of ann is joe. We can verify that theproposition was indeed added to the assumption base by invoking show-assumption-base again:


There is one proposition in the current assumption base:

(= (father ann)joe)

Unit: ()

Multiple propositions can be asserted at the same time; the general syntax form is (assert P1 · · ·Pn):

>(assert (forall ?x (male (father ?x)))(forall ?n (exists ?m (and (<= ?n ?m)

(not (= ?n ?m))))))

The proposition(forall ?x:Person(male (father ?x)))

has been added to the assumption base.

The proposition(forall ?n:Rational(exists ?m:Rational(and (<= ?n ?m)

(not (= ?n ?m)))))has been added to the assumption base.


There are 3 propositions in the current assumption base:

9

(= (father ann)joe)

(forall ?x:Person(male (father ?x)))

(forall ?n:Rational(exists ?m:Rational

(and (<= ?n ?m)(not (= ?n ?m)))))

Unit: ()

Lists of propositions can also appear as arguments to assert. We will have more to say about listslater on. For now it suffices to say that a list of n ≥ zero values V1 · · ·Vn can be formed simply byenclosing the values inside square brackets: [V1 · · ·Vn]. (For the time being a “value” will mean aterm or a proposition, or perhaps another list). For instance:

>[joe ann]

List: [joe ann]

>[]

List: []

>(define L [joe (father ann) (forall ?n (<= ?n (+ ?n zero))) []])

List L defined.

>L

List: [joe (father ann) (forall ?n (<= ?n (+ ?n zero))) []]

Note that Athena lists, like those of Scheme (but unlike those of ML or Haskell) can be heterogeneous:elements of different types can appear in the same list.

A list of axioms can be introduced, given a name, and then asserted by that name. For instance:

>(define father-facts [(= (father ann) joe)(= (father sue) peter)(= (father peter) joe)])

List father-facts defined.

>(assert father-facts)

The proposition(= (father ann)

joe)

10


The proposition(= (father sue)

peter)has been added to the assumption base.

The proposition(= (father peter)

joe)has been added to the assumption base.

There are also two mechanisms for removing propositions from the global assumption base:clear-assumption-base and retract. The former will delete every element of the assumption base,while the second will remove a single specified proposition from it:

>clear-assumption-base

Assumption base cleared, blank-slate state.



Unit: ()

>(assert (male joe))

The proposition(male joe)has been added to the assumption base.


There is one proposition in the current assumption base:

Proposition: (male joe)

Unit: ()

>(retract (male joe))

The proposition(male joe)has been removed from the assumption base.


11


Unit: ()

Like define, the forms assert, clear-assumption-base and retract are top-level directives; theycannot appear inside other code.

12

Chapter 2

Proofs

2.1 Primitive methods

The simplest type of proof is a single application of an inference rule. Inference rules in Athena arecalled methods. Athena comes with a small collection of predefined methods—the so-called primitivemethods. We will see later that users can define their own methods.

A method application is of the form (!E F1 · · ·Fn), where E is an expression denoting a methodand F1 · · ·Fn are the arguments supplied to the method. Perhaps the simplest primitive method isclaim, a unary “reiteration” method that takes an arbitrary proposition P as its sole argument. IfP is in the assumption base, then claim simply returns P (i.e., it reiterates P ); otherwise, if P is notin the current a.b., then claim reports an error. Suppose, for example, that the current assumptionbase contains the proposition (male joe) but does not contain the proposition (female ann). Thenhere is what will happen if we try to claim these propositions:

>(!claim (male joe))

Theorem: (male joe)

>(!claim (female ann))

Error, top level, 1.9: Failed application of claim---the proposition(female ann) is not in the assumption base.

The first claim succeeds, but the second fails. Note that Athena reports the precise position of thefailed claim (line 1, column 9 of the input stream). Also note that Athena pronounces the successfulclaim a theorem. In fact every time a proof is successfully evaluated at the top level, Athena refers tothe resulting proposition as a theorem. There is a reason for this: It can be proved formally that ifan Athena proof produces a proposition P in the context of an assumption base β, then P is a logicalconsequence of β. This is the main soundness guarantee provided by Athena, and it is the sense inwhich the result of a proof is understood to be a theorem.

One question about claim that comes up sometimes is this: If a proposition P is in the assumptionbase and we try to claim P ′, where P ′ is a variant of P that is logically equivalent to P but is notitself in the assumption base, will the claim succeed? The answer is no. An application (!claim P)succeeds if and only if P itself, as a syntactic object, is in the assumption base. For instance,

13

suppose that P is the proposition (male joe) and P ′ is (and true (male joe)), where P is in theassumption base but P ′ is not. Then (!claim P) will succeed, but (claim P ′) will fail, even thoughP ′ is logically equivalent to P . There is a reason for this too: The problem of deciding whether twopropositions are logically equivalent is very difficult. If fact it is undecidable, meaning that there isno algorithm that can solve every instance the problem.

However, claim, like all other primitive Athena methods, determines propositional identity up toalphabetic equivalence. E.g., if P is (forall ?x (male (father ?x))) and P ′ is

(forall ?y (male (father ?y)))

then (!claim P) will succeed if and only if (!claim P ′) suceeds:



>(assert (forall ?x (male (father ?x))))

The proposition(forall ?x:Person(male (father ?x)))


>(!claim (forall ?foo (male (father ?foo))))

Theorem: (forall ?foo:Person(male (father ?foo)))

All of the remaining primitive methods can be classified as either introduction or elimination rulesfor one of the sentential connectives or quantifiers. That is, an application of a primitive methodM either produces a proposition of the form (� · · · ) for � ∈ {not, and, or, if, iff, forall, exists},in which case we say that M is an introduction method for �; or else it takes such a propositionas an input premise and produces a part of it as its conclusion, in which case we say that M is anelimination method for �.

Consider, for instance, conjunctions of the form (and P1 P2). There are two elimination meth-ods for conjunctions, left-and and right-and. The former will produce P1, the left part of theconjunction, while the latter will produce P2, the right part. Specifically, left-and is a unary prim-itive method that takes a conjunction (and P1 P2) as its sole argument. If the conjunction is inthe current assumption base, then the application (!left-and (and P1 P2)) will produce P1 as theresult; otherwise it will fail. For example:



>(define P (and (male joe) (female ann)))

Proposition P defined.

14

>(assert P)

The proposition(and (male joe)

(female ann))has been added to the assumption base.

>(!left-and P)

Theorem: (male joe)

An application of left-and will fail if the given conjunction is not in the assumption base; or if thegiven argument is not a conjunction at all; or if the wrong number of arguments are given:

>(!left-and (and true true))

Error, top level, 1.12: Failed application of left-and---the proposition(and true true) is not in the assumption base.

>(!left-and (not false))

Error, top level, 1.12: Failed application of left-and---the given propositionmust be a conjunction, but here it was a negation: (not false).

>(!left-and true false true)

Error, top level, 1.1: Wrong number of arguments (3) givento left-and---exactly one argument is required.

The method right-and works exactly like left-and, except that it produces the right componentof the given conjunction.

It is instructive to compare Athena’s formulation of these inference rules with a more conventionalpresentation. For instance, the left conjunction-elimination rule is usually depicted graphically asfollows:

` P ∧Q

` P

This says that if the premise P ∧Q above the horizontal line has already shown to be a theorem(where theoremhood is signified by the turnstile ` ), then we may also derive P as a theorem. Or,more loosely, if we already know that P ∧Q holds, then we may conclude P .

In the case of Athena, “P ∧Q holds” boils down to no more and no less than it being in theassumption base. Therefore, the Athena semantics of left-and can be understood as saying that if(and P Q) is in the assumption base, then we may conclude P .

The semantics of a proof form D can often be expressed succinctly using judgments of the formβ `D ; P , which can be read as follows:

In the context of assumption base β, evaluating D produces the conclusion P .

For instance, the semantics of left-and can be captured by the following rule:

β ∪ {P ∧Q} ` (!left-and (and P Q)) ; P (2.1)

15

which says that applying left-and to a conjunction of the form (and P Q) in any assumption basethat contains this conjunction will result in the conclusion P . Rule (2.1) can be viewed as a patternthat gives rise to infinitely many rule instances depending on what particular values we choose tosubstitute for β, P , and Q. For instance, both of the following are instances of (2.1):

{(and true true)} ` (!left-and (and true true)) ; true

{false, (and (male joe) (female ann)))} `(!left-and (and (male joe) (female ann))) ; (male joe)

The first is obtained from (2.1) via the substitution

β 7→ ∅, P 7→ true, Q 7→ true

while the second is obtained through the substitution

β 7→ {false}, P 7→ (male joe), Q 7→ (female ann).

The rule does not tell us what happens when the assumption base does not contain the premise(and P Q); or what happens when the argument to left-and is not of the form (and P Q); orwhen left-and is applied to a different number of arguments. We will make the convention that inall such cases the result will be an error token, and we will follow that convention in the sequel whenwe come to state similar rules for other methods.

Below we list all the remaining primitive Athena methods and describe their semantics:

• both (conjunction introduction): This is a binary method which takes two propositions P andQ and returns the conjunction (and P Q), provided that both P and Q are in the assumptionbase:

β ∪ {P,Q} ` (!both P Q) ; (and P Q)

It is an error if either P or Q is not in the assumption base. Of course it is also an error if anyother type or number of arguments are given to both. (From now on we will not bother to makethis point explicitly.)

• mp (conditional elimimation): mp—an abbreviation for modus ponens—is a binary methodwhich takes two propositions of the form (if P Q) and P as arguments. If both of these arein the assumption base, the conclusion Q is returned:

β ∪ {(if P Q), Q} ` (!mp (if P Q) P) ; Q

• dn (negation elimination): This is a unary method that performs “double negation” elimination.It takes an argument of the form (not (not P)) and produces P , provided that the premise(not (not P)) is in the assumption base:

β ∪ {(not (not P))} ` (!dn (not (not P))) ; P

• either (disjunction introduction): This is a binary method which takes two propositions Pand Q and returns the disjunction (or P Q), provided that at least one of them is in theassumption base. Its semantics can be expressed by the following two rules:

β ∪ {P} ` (!either P Q) ; (or P Q)

andβ ∪ {Q} ` (!either P Q) ; (or P Q)

If neither P nor Q is in the assumption base, an error is generated.

16

• cd (disjunction elimination): This ternary method captures the inference rule traditionallyknown as “constructive dilemma”: If we know that either P1 or P2 holds and we also knowthat both P1 and P2 imply Q, then we may conclude Q. In terms of assumption bases: ifthe disjunction (or P1 P2) and the two conditionals (if P1 Q) and (if P2 Q) are in theassumption base, then the application (!cd (or P1 P2) (if P1 Q) (if P2 Q)) will producethe conclusion Q. More succinctly:

β∪{(or P1 P2), (if P1 Q), (if P2 Q)} ` (!cd (or P1 P2) (if P1 Q) (if P2 Q)) ; Q

• equiv (biconditional introduction): This is a binary method that takes two conditionals ofthe form (if P Q) and (if Q P) as arguments. Provided that both of these are in theassumption base, equiv produces the biconditional (iff P Q):

β ∪ {(if P Q), (if Q P)} ` (!equiv (if P Q) (if Q P)) ; (iff P Q)

• left-iff (biconditional elimination): This is a binary method that takes a premise of the form(iff P Q) and returns (if P Q), provided that the given biconditional is in the assumptionbase:

β ∪ {(iff P Q)} ` (!left-iff (iff P Q)) ; (if P Q)

• right-iff (biconditional elimination): Just like left-iff, but returns (if Q P) instead:

β ∪ {(iff P Q)} ` (!right-iff (iff P Q)) ; (if Q P)

• absurd (negation introduction): This is a binary method that takes two propositions of the formP and (not P) and returns false, provided that both P and (not P) are in the assumptionbase:

β ∪ {P , (not P)} ` (!absurd P (not P)) ; false

• true-intro (true introduction): This is a nullary method that introduces the atom true inevery assumption base: β ` (!true-intro) ; true.

• uspec (universal quantifier elimination): uspec—short for “universal specialization”—is a bi-nary method that takes a proposition of the form (forall x P) and a term t as arguments.If the proposition is in the assumption base, then the conclusion P [x 7→ t] is produced, whereP [x 7→ t] denotes the proposition obtained from P by replacing every free occurrence of x by t.It is an error if the resulting proposition is ill-sorted.

β ∪ {(forall x P)} ` (!uspec (forall x P) t) ; P [x 7→ t]

• egen (existential quantifier introduction): egen—short for “existential generalization”—takes aproposition of the form (exists x P) and a term t as arguments. If the proposition P [x 7→ t]is in the assumption base, then the conclusion (exists x P) is produced:

β ∪ {P [x 7→ t]} ` (!egen (exists x P) t) ; (exists x P)

Here t is a “witness” term: we know that P holds for t, therefore we may conclude that thereis something for which P holds. For instance, if (male joe) is in the assumption base, thenthe application (!egen (exists ?x (male ?x)) joe) will successfully produce the conclusion(exists ?x (male ?x)). We know that joe is male, hence it is sound to conclude that someindividual is male.

17

It should be noted that the introduction and elimination rules for conjunction and disjunctionwork properly even when given arbitrarily long arguments. Specifically, if P is a conjunction of theform (and P1 · · ·Pn) for n > 2, then (!left-and P) and (!right-and P) will produce P1 and(and P2 · · ·Pn), respectively, in any assumption base containing P . Likewise, (!either P1 · · ·Pn)will yield the disjunction (or P1 · · ·Pn) as long as at least one Pi is in the assumption base. Finally,(!cd (or P1 · · ·Pn) (if P1 Q) · · · (if Pn Q)) will derive the conclusion Q provided that thedisjunction (or P1 · · ·Pn) and the conditionals (if Pi Q) are in the assumption base, i = 1, . . . , n.

We have covered all of Athena’s primitive methods for dealing with the sentential connectives andquantifiers. Figure 2.1 presents a summary. There are also some primitive methods for dealing withequality (recall that the equality symbol = is built-in):

• reflex (Reflexivity): This is a unary method that takes an arbitrary term t and produces theproposition (= t t), in any assumption base:

β ` (!reflex t) ; (= t t)

• sym (Symmetry): This is a unary method that takes an equality (= s t) as an argument. Ifthe proposition (= s t) is in the assumption base, then the conclusion (= t s) is produced:

β ∪ {(= s t)} ` (!sym (= s t)) ; (= t s)

• tran (Transitivity): This is a binary method that takes two equalities of the form (= t1 t2) and(= t2 t3) as arguments. If both of these are in the assumption base, the conclusion (= t1 t3)is produced:

β ∪ {(= t1 t2), (= t2 t3)} ` (!tran (= t1 t2) (= t2 t3)) ; (= t1 t3)

• rcong (Relational congruence): This is a binary method that takes two atomic propositions ofthe form (R s1 · · · sn) and (R t1 · · · tn) as arguments, where R is a relation symbol of arity n.If the assumption base contains the first proposition, (R s1 · · · sn), along with the n equalities(= s1 t1), . . ., (= sn tn), then the second proposition (R t1 · · · tn) is produced:

β ∪ {(R s1 · · · sn), (= s1 t1),. . . , (= sn tn)} `(!rcong (R s1 · · · sn) (R t1 · · · tn)) ; (R t1 · · · tn)

• fcong (Functional congruence): This is a unary method that takes an equality of the form(= (f s1 · · · sn) (f t1 · · · tn)) as an argument. If the n equalities (= si ti), i = 1, . . . , n arein the assumption base, then the given proposition (= (f s1 · · · sn) (f t1 · · · tn)) is returnedas the conclusion:

β ∪ {(= s1 t1),. . . , (= sn tn)} ` (!fcong (= (f s1 · · · sn) (f t1 · · · tn))) ;

(= (f s1 · · · sn) (f t1 · · · tn))

These are the main primitive methods of Athena. There are three additional methods dealing withdatatypes; we will discuss those in Section ??.

In terms of the introduction and elimination of sentential connectives and quantifiers, the readermay have noticed that so far we have not introced any primitive methods for performing the following:

1. Conditional introduction

18

Name Arity Semanticsclaim 1 β ∪ {P} ` (!claim P) ; Pmp 2 β ∪ {(if P Q), P} ` (!mp (if P Q) P) ; Q

absurd 2 β ∪ {P, (not P)} ` (!absurd P (not P)) ; falsedn 1 β ∪ {(not (not P))} ` (!dn (not (not P))) ; P

both 2 β ∪ {P, Q} ` (!both P Q) ; (and P Q)left-and 1 β ∪ {(and P Q} ` (!left-and (and P Q)) ; Pright-and 1 β ∪ {(and P Q} ` (!right-and (and P Q)) ; Q

equiv 2 β ∪ {(if P Q), (if Q P)} ` (!equiv (if P Q) (if Q P)) ; (iff P Q)left-iff 1 β ∪ {(iff P Q)} ` (!left-iff (iff P Q)) ; (if P Q)right-iff 1 β ∪ {(iff P Q} ` (!right-iff (iff P Q)) ; (if Q P)either 2 β ` (!either P Q) ; (or P Q) when P ∈ β or Q ∈ β

cd 3 β ∪ {(or P P’), (if P Q), (if P’ Q)} ` (!cd (or P P’) (if P Q) (if P’ Q)) ; Qtrue-intro 0 β ` (!true-intro) ; true

uspec 2 β ∪ {(forall x P)} ` (!uspec (forall x P) t) ; P[t/x]egen 2 β ∪ {P[t/x]} ` (!egen (exists x P) t) ; (exists x P)

leibniz 4 β ∪ {(= s t)} ` (!leibniz s t x P) ; (iff P[s/x] P[t/x])eq-reflex 1 β ` (!eq-reflex t) ; (= t t)

Figure 2.1: Athena’s primitive methods.

2. Negation introduction

3. Universal quantifier introduction

4. Existential quantifier elimination

In Athena all four of these are performed by primitive (built-in) syntax forms, not by methods. Thecorresponding syntax forms are assume, suppose-absurd, pick-any, and pick-witness. We willdescribe these in Section 2.3, Section 2.4, Section 2.5, and Section ??, respectively. But first we needto talk about composite proofs.

2.2 Proof composition

How do we put together two or more simple proofs to form a single bigger proof? The simplest way tocombine inferences in Athena is the dseq (“deduction sequence”) mechanism. For any two proofs D1

and D2, the phrase (dseq D1 D2) is a new composite proof that performs D1 and D2 sequentially.First, D1 is evaluated in the current assumption base β, producing some conclusion P1. Then P1 isadded to the assumption base, and we continue with the second proof D2.1 Thus D2 is evaluated inβ ∪ {P1}. The result of D2 becomes the result of the entire dseq. This allows for lemma formation:the conclusion of D1 (P1) serves as a lemma inside D2. The following rule expresses the semantics ofdseq succinctly:

β `D1 ; P1 β ∪ {P1} `D2 ; P2

β ` (dseq D1 D2) ; P2

1The phrase “P1 is added to the assumption base” is somewhat misleading, as it implies that there is a singlepersistent assumption base that gets destructively modified (updated) like a store. In fact assumption bases in Athenaare functional, so it would be more appropriate to say that D2 is evaluated in a new assumption base β′ = β ∪ {P1}.However, for exposition purposes it is simpler to just say “we add P1 to the assumption base and continue with D2.”We will continue that practice, but the reader should keep in mind that under the hood assumption bases are functional.

19

More than two deductions can appear inside a dseq. For n > 2, the semantics of (dseq D1 · · ·Dn)are given by desugaring to the n− 1 case. For instance, (dseq D1 D2 D3) is defined as

(dseq D1 (dseq D2 D3)).

Likewise, (dseq D1 D2 D3 D4) is defined as (dseq D1 (dseq D2 D3 D4)); etc.For instance, suppose that A and B are Boolean constants and that the assumption base contains

exactly one proposition, the conjunction (and A B). Then the proof below will derive the “reverse”conjunction (and B A) (a pound sign # starts a comment that extends to the end of the line):

>(declare (A B) Boolean)

New symbol A declared.

New symbol B declared.



>(assert (and A B))

The proposition

(and A B)


>(dseq (!left-and (and A B)) # this derives A; the a.b. now is {(and A B), A}

(!right-and (and A B)) # this gets B; the a.b. now is {(and A B), A, B}

(!both B A)) # and finally this results in (and B A)

Theorem: (and B A)

The application of left-and above succeeds because the premise (and A B) is in the assumptionbase, as required by the semantics of left-and. Accordingly, the conclusion A is obtained, addedto the assumption base, and we continue with the subsequent elements of dseq, starting with theright-and application. At this point the assumption base contains two propositions: the originalpremise (and A B), and the intermediate conclusion A that was derived by left-and. The applicationof right-and succeeds, the conclusion B is added to the assumption base, and we proceed to the next—and last—step, the application of both. At this point the assumption base contains three elements:(and A B), A, and B. The application of both succeeds (since both of the required premises B and Aare in the current assumption base), and hence the conclusion (and B A) is returned, which becomesthe conclusion of the entire dseq.

At the end of the entire dseq proof, only the final conclusion (and B A) is retained and addedto the original assumption base. The intermediate conclusions generated during thhe evaluation ofthe dseq (namely, the propositions A and B generated by the left-and and right-and applications)are not retained. For instance, here is what we get if we try to view the a.b. immediately after thepreceding dialogue:


There are 2 propositions in the current assumption base:

20

(and A B)

(and B A)

Unit: ()

In general, whenever we evaluate a deduction D at the top level and obtain a result P :>D

Theorem: Ponly the conclusion P is added to the global assumption base. Any auxiliary conclusions derived inthe course of evaluating D are discarded.

2.3 Conditional proofs

In common mathematical reasoning, conditionals—propositions of the form (if P Q)—are usuallyderived by adding the antecedent P to our working assumptions and then inferring the consequentQ. That is, we supply a proof that derives Q from the hypothesis P along with whatever otherassumptions are currently in effect. If we succeed, we then discharge the hypothesis P and concludethe desired conditional (if P Q). This mode of reasoning is captured in Athena by proofs of theform

(assume P D) (2.2)

which are called hypothetical or conditional proofs. The proposition P is called the hypothesis of theproof, and D is its body. We also say that D represents the scope of the hypothesis P .

The operational semantics of hypothetical deductions are straightforward: To evaluate a proof ofthe form (2.2) in some assumption base β, we add the hypothesis P to β and proceed to evaluate thebody D in the extended assumption base β ∪ {P}. If and when that produces a conclusion Q, wereturn the conditional (if P Q) as the result of the entire assume. Symbolically:

β ∪ {P} `D ; Q

β ` (assume P D) ; (if P Q)

We illustrate with some examples. First, let us introduce A, B, and C as propositional atoms:

>(declare (A B C) Boolean)



New symbol C declared.

The following deduction will derive the conditional (if A A) no matter what is in the assumptionbase:

>(assume A

(!claim A))

Theorem: (if A A)

21

Here the hypothesis is A and the body is (!claim A). The conclusion (if A A) will be successfullyderived in any (every) assumption base β, because, no matter what β contains, the body (!claim A)will always be evaluated in an assumption base that contains A (namely, in β∪{A}), and will thereforesucceed, as prescribed by the semantics of claim. Here are three additional examples that respectivelyderive the following tautologies:

A ∧B ⇒B ∧A

(A⇒B)⇒ [(B ⇒C)⇒ (A⇒C)]

[A⇒ (B ⇒C)]⇒ [(A ∧B)⇒C]

>(assume (and A B)

(dseq (!left-and (and A B))

(!right-and (and A B))

(!both B A)))

Theorem: (if (and A B)

(and B A))

>(assume (if A B)

(assume (if B C)

(assume A

(dseq (!mp (if A B) A) # this gives B

(!mp (if B C) B))))) # and this gets C

Theorem: (if (if A B)

(if (if B C)

(if A C)))

>(assume (if A (if B C))

(assume (and A B)

(dseq (!left-and (and A B)) # first get A

(!mp (if A (if B C)) A) # then get (if B C)

(!right-and (and A B)) # then B

(!mp (if B C) B)))) # and finally C

Theorem: (if (if A

(if B C))

(if (and A B)

C))

Observe that it is not necessary to disharge assumptions explicitly. That is done automatically bythe semantics of assume. This is a better model of informal reasoning, where assumption dischargedoes not occur explicitly but is rather tacitly understood by the lexical subdivision of the proof textinto syntactic blocks such as paragraphs.

2.4 Proofs by contradiction

Oftentimes when we need to establish a proposition P it is convenient to take a somewhat indirectroute: we suppose that P does not hold, and show that this supposition is absurd, in that it leads to

22

a contradiction. From this we are entitled to conclude P . In Athena this type of reasoning is capturedby deductions of the form

(suppose-absurd P D) (2.3)

Deductions of this form are called “proofs by contradiction.” As with conditional proofs, we refer toP as the hypothesis and to D as the body of the proof and the scope of the hypothesis P .

To evaluate a deduction of the form (2.3) in an assumption base β, we add the hypothesis P toβ and proceed to evaluate the body D in the extended assumption base β ∪ {P}. If and when thatproduces the atom false, we return the negation (not P) as the result of the entire suppose-absurd.Symbolically:

β ∪ {P} `D ; false

β ` (suppose-absurd P D) ; (not P)

Below are a couple of sample proofs using suppose-absurd. The first establishes the conditionalA⇒¬¬A, while the second derives the tautology (¬A ∨ ¬B)⇒¬(A ∧B). Recall that absurd is abinary primitive method that takes two propositions of the form P and (not P) and derives false,provided that both P and (not P) are in the assumption base (an error occurs otherwise).

>(assume A

(suppose-absurd (not A)

(!absurd A (not A))))

Theorem: (if A

(not (not A)))

>(assume (or (not A) (not B)) # assume we either have (not A) or (not B)

(suppose-absurd (and A B) # suppose, by way of contradiction, (and A B)

(dseq

(!left-and (and A B)) # this gives A

(!right-and (and A B)) # this gives B

(assume (not A) # this show that (not A) implies false

(!absurd A (not A)))

(assume (not B) # and this show that (not B) implies false

(!absurd B (not B)))

(!cd (or (not A) (not B)) # but (not A) OR (not B) holds by assumption

(if (not A) false) # hence we get false by constructive dilemma

(if (not B) false)))))

Theorem: (if (or (not A)

(not B))

(not (and A B)))

Athena offers one more way to perform reasoning by contradiction, apart from suppose-absurd: aunary method by-contradiction, which takes a conditional of the form (if P false) and produces(not P), provided that the argument (if P false) is in the assumption base. In addition, if Pis of the form (not Q), then the proposition Q is directly returned (instead of (not (not Q))).This is convenient because many times when we wish to establish a proposition Q we end up showingthat the negation of Q engenders an absurdity, i.e., we show (if (not Q) false). This allows usto conclude (not (not Q)), but then we need to apply the double-negation method (dn) to thatin order to arrive at the desired Q. The by-contradiction method does this automatically. Thismethod is not built-in—it is implemented in an Athena library in terms of suppose-absurd.

23

2.5 Universal generalizations

Universal generalizations are propositions of the form ∀ x . P , asserting that every object x (of somesort) has a property P . Mathematicians often prove such statements by reasoning as follows:

Consider any x (of such-and-such sort). Then · · · D · · · (2.4)

where D is some deduction that derives the proposition P (x). The variable x in this context iscalled an eigenvariable. The idea is that x represents “an arbitrary object.” If we can show that anyobject—i.e., an arbitrary object—has P , then it follows that every object has P . To ensure that xis arbitrary, we must be careful to avoid making any special assumptions about it. In particular, Dmust not rely on any previous assumptions that might happen to contain free occurrences of x.

This is captured in Athena by deductions of the form

(pick-any I D) (2.5)

for any identifier I. To evaluate a deduction of this form in an assumption base β, we first generatea fresh variable x, say, ?v135, or ?foo1999. We then evaluate D′ in β, where D′ is the deductionobtained from D by replacing every free occurrence of I by the fresh variable x.2 If and when theevaluation of D′ produces a proposition P , we generalize and return the conclusion (forall x P)as the result of (2.5). Symbolically, if we write D[I 7→ x] to denote the deduction obtained from Dby replacing every free occurrence of I by x, we have:

β `D[I 7→ x] ; P

β ` (pick-any I D) ; (forall x P)where x is a fresh variable

The replacement of I by a fresh variable is what ensures that I refers to an “arbitrary” object insidethe body D.

Consider, for example, the deduction (pick-any x (!reflex x)). Recall that reflex is a unaryprimitive method that takes any term t and produces the equality (= t t). To evaluate this deductionin some assumption base β, Athena will first generate a fresh variable, i.e., a variable that is guaran-teed not to have appeared anywhere since the beginning of the current Athena session, say ?v18. Itwill then evaluate the body of the pick-any deduction, namely (!reflex x), in a context in which xrefers to ?v18. Thus Athena will essentially be evaluating the deduction (!reflex ?v18). Accordingto the semantics of reflex, that will produce the equality (= ?v18 ?v18). Finally, Athena will gener-alize and return the conclusion (forall ?v18 (= ?v18 ?v18)) as the result of the entire pick-any.Note that for readability purposes Athena might present the conclusion as (forall ?x (= ?x ?x))instead of (forall ?v18 (= ?v18 ?v18)). This is legitimate because the two propositions are alpha-betically equivalent, and as we mentioned earlier, Athena views alphabetically equivalent propositionsas identical.

As another example, here is a proof that the equality relation is symmetric. Recall that sym is aunary primitive method that takes an equality (= s t) and returns (= t s), provided that (= s t)is in the assumption base:

2We are being somewhat vague here, as we are relying on an intuitive understanding of what it means for an identifierI to have a free occurrence inside a deduction D. To be perfectly rigorous, we should say that Athena generates a freshvariable x and then evaluates the body D in a lexical environment in which I denotes x (a lexical environment is afinite function from identifiers to denotable values—terms, propositions, etc.). Athena evaluates a deduction not onlyin the context of an assumption base, but also in the context of a given environment and a given store. So the proofevaluation judgments of the language are really of the form ρ, σ, β `D ; P , where ρ is an environment, σ is a store,and β is an assumption base. A more rigorous presentation of Athena’s semantics is given elsewhere [2].

24

>(pick-any x

(pick-any y

(assume (= x y)

(!sym (= x y)))))

Theorem: (forall ?x:‘S

(forall ?y:‘S

(if (= ?x ?y)

(= ?y ?x))))

Observe the sort of the quantified variables ?x and ?y in the displayed conclusion: ‘S. Here ‘S is a sortvariable, representing a completely arbitrary sort. Athena marks sort variables with an apostrophe,so an annotation such as ?x:‘S means that the variable ?x can range over any domain whatsoever.The only sort constraint in the above proposition is that ?x and ?y range over the same sort. Thatsort could be anything, but it has to be the same for both variables. We will discuss polymorphismin detail in Section ??.

Nested pick-any deductions (pick-any I1 (pick-any I2 (pick-any I3 · · · )) can be abbrevi-ated as (pick-any I1 I2 I3 · · · ). For instance, the preceding proof could also be written as follows:

(pick-any x y

(assume (= x y)

(!sym (= x y)))))

As another example, let us prove the tautology [(∀ x . P (x)) ∧ (∀ x . Q(x))]⇒ (∀ x . P (x) ∧Q(x)),where P and Q are unary predicates on some domain. (E.g., if everything is green and everything islarge, then everything is both green and large.)

>(domain Object)

New domain Object introduced.

>(declare (P Q) (-> (Object) Boolean))

New symbol P declared.

New symbol Q declared.

>(define hypothesis (and (forall ?x (P ?x))

(forall ?x (Q ?x))))

Proposition hypothesis defined.

>(assume hypothesis

(pick-any z

(dseq (!left-and hypothesis) # this will give (forall ?x (P ?x))

(!uspec (forall ?x (P ?x)) z) # this will give (P z)

(!right-and hypothesis) # this will give (forall ?x (Q ?x))

(!uspec (forall ?x (Q ?x)) z) # this gives (Q z)

(!both (P z) (Q z))))) # finally we get (and (P z) (Q z))

Theorem: (if (and (forall ?x:Object

(P ?x))

25

(forall ?x:Object

(Q ?x)))

(forall ?z:Object

(and (P ?z)

(Q ?z))))

2.6 Existential instantiation

The last technique for quantifier reasoning that we will study is existential instantiation, i.e., existentialquantifier elimination. This is a frequently encountered idiom in mathematical discourse, typicallytaking the following form:

We know ∃ x . P , i.e., that P holds for some object.Let w be a name for that object (i.e., let w be a “witness” for ∃ x . P ),

so that P [x 7→ w] holds. Then · · ·D · · ·(2.6)

where D is a deduction that proceeds to derive some conclusion Q with the aid of the assumptionP [x 7→ w], along with whatever other assumptions are currently active. We refer to ∃ x . P as theexistential premise; the term w is called a witness; and the proposition P [x 7→ w] is called the witnesspremise. We call D the body of the existential instantiation. The conclusion Q derived by the bodyD becomes the result of the entire proof (2.6).

As a simple example, consider the proof that for all integers n, Even(n) implies Even(n + 2), i.e.,that if n is even then so is n + 2. The unary predicate Even is defined as follows:

Even(n)⇔∃ k . n = 2 · k (2.7)

The proof uses the lemma∀ n, m . n · (m + 1) = n ·m + n (2.8)

and proceeds as follows:

Pick any n and assume Even(n). Then, by (2.7), we infer ∃ k . n = 2 · k, i.e., thatthere is some number, which, when multiplied by 2, yields n. Let k0 be such anumber, so that n = 2 · k0. Then, by functional congruence, n + 2 = (2 · k0) + 2.But, by (2.8), (2 ·k0)+2 = 2 · (k0 +1), hence, by the transitivity of equality, n+2 =2 · (k0 + 1). Therefore, by existential generalization, we obtain ∃ m . n + 2 = 2 ·m,and so, from (2.7), we conclude Even(n + 2).

We have italicized the existential elimination argument in the above proof. The existential premisehere is ∃ k . n = 2 ·k; the witness is the variable k0; and the witness premise is the equality n = 2 ·k0.The conclusion of the existential instantiation is Even(n + 2).

Returning to the general form (2.6), we mention some caveats that are necessary to ensure sound-ness. The witness variable w must serve as a “dummy”—no special assumptions about it should bemade. In particular, the body D must not rely on any previous working assumptions that mighthappen to contain free occurrences of w. Also, to ensure that the witness is only used as a temporarydummy variable, the conclusion Q should not depend on w in any essential way. Specifically, Q shouldnot contain any free occurrences of w. In summary, we need to adhere to the following principle: thebody D should derive the same conclusion Q—up to alphabetic equivalence—even if we replace w byan arbitrary, freshly generated variable w′.

26

It is not difficult to see what can go astray otherwise. Suppose, for instance, that one of ourcurrent assumptions is the proposition Even(w), for some variable w, along with the existentialpremise ∃ n . Odd(n). Now if we unwisely choose w as a witness for this existential premise, we willobtain the witness premise Odd(w), and hence, in tandem with Even(w), we will be able to concludethat there is a number that is both even and odd.

So we must choose witnesses that do not occur in any current working assumptions. We will seeshortly that Athena ensures this automatically, relieving the user of that book-keeping burden.

To see that the witness must also not occur free in the final conclusion Q, consider the twoexistential premises ∃ n . Even(n) and ∃ n . Odd(n), and suppose that both of them are established(in our current assumption base). If we choose w as a witness for the first proposition, we willobtain the witness premise Even(w). Suppose we simply claim that premise, obtaining Even(w) asthe conclusion of the existential instantiation—a conclusion with a free occurrence of the witnessw. Then suppose we continue and do the same thing with the second existential premise, usingw as a witness again, and concluding the witness premise Odd(w). So at this point we have thetwo conclusions Even(w) and Odd(w) available, and with a simple existential generalization we canconclude ∃ n . Even(n) ∧Odd(n). The upshot is that the witness must not appear in the conclusionof an existential instantiation.

Existential instantiations in Athena are performed by deductions of the form

(pick-witness I (exists x P) D) (2.9)

where I is an identifier that will denote the witness variable, (exists x P) is the existential premise,and D is the body. To evaluate such a deduction in some assumption base β, we first make sure that theexistential premise (exists x P) is in β; if not, we report an error. Assuming that the existentialpremise is in β, we begin by generating a fresh variable z, which will serve as the actual witnessvariable. We then construct the witness premise, call it P ′, obtained from P by replacing every freeoccurrence of x by z. Finally, we construct a deduction D′, obtained from D by replacing every freeoccurrence of I by z, and proceed to evaluate D′ in β ∪ {P ′}, i.e., in the original assumption baseaugmented with the witness premise. If and when that evaluation produces a conclusion Q, we returnQ as the result of the entire proof (2.9), provided that Q does not contain any free occurrences of z(if it does, we report an error). Symbolically:

β ∪ {(exists x P), P [x 7→ z]} `D[I 7→ z] ; Q

β ∪ {(exists x P)} ` (pick-witness I (exists x P) D) ; Q

where z is a fresh variable that does not occur in Q

The fact that the witness variable z is freshly generated is what guarantees that the body D willnot be able to make any special assumptions about it. The freshness of z along with the explicitproviso that it must not occur in the conclusion Q ensures that the witness is used only as a dummytemporary name.

As an example, the following proof uses existential instantiation to derive the tautology

[∃ x . ¬P (x)]⇒ [¬∀ x . P (x)] :

>(assume (exists ?x (not (P ?x)))

(suppose-absurd (forall ?x (P ?x))

(pick-witness w (exists ?x (not (P ?x))) # this gives the witness premise (not (P w))

(dseq (!uspec (forall ?x (P ?x)) w) # this gives (P w)

(!absurd (P w) (not (P w))))))) # but we also have (not (P w))

27

# from the witness premise -- a contradiction

Theorem: (if (exists ?x:Object

(not (P ?x)))

(not (forall ?x:Object

(P ?x))))

The existential premise does not need to be explicitly written out in the form (exists x P).Any phrase that evaluates to (denotes) an existentially quantified proposition can be used, so thesyntax is really (pick-witness I F D) for an arbitrary phrase F . E.g., the following version of thepreceding deduction yields identical results:

>(define ex-premise (exists ?x (not (P ?x))))

Proposition ex-premise defined.

>(assume ex-premise

(suppose-absurd (forall ?x (P ?x))

(pick-witness w ex-premise

(dseq (!uspec (forall ?x (P ?x)) w)

(!absurd (P w) (not (P w)))))))

Theorem: (if (exists ?x:Object

(not (P ?x)))

(not (forall ?x:Object

(P ?x))))

This point will become clearer after we discuss Athena’s integration of computation and deduction.An an example of an incorrect use of pick-witness, the following illustrates the enforcement of

the caveat that the witness variable must not appear in the conclusion:

>(declare even (-> (Rational) Boolean))

New symbol even declared.

>(define even-numbers-exist (exists ?x (even ?x)))

Proposition even-numbers-exist defined.

>(assert even-numbers-exist)

The proposition

(exists ?x:Rational

(even ?x))


>(pick-witness n even-numbers-exist

(!claim (even n)))

Error, top level, 1.2: Failed existential instantiation---the witness variable

occurs free in the resulting proposition.

Another potential error is that the existential premise is not in the assumption base:

28

>(pick-witness n (exists ?x (not (= ?x ?x)))

(!claim true))

Error, top level, 1.17: Existential proposition to be instantiated is not in

the assumption base:

(exists ?x:‘S

(not (= ?x ?x))).

When the existential premise has multiple existentially quantified variables, we can abbreviatenested pick-witness deductions with the form

(pick-witnesses (I1 · · · In) (exists x1 · · ·xn P) D).

The semantics of pick-witnesses is given by desugaring to pick-witness. For instance,

(pick-witnesses (I1 I2) (exists x1 x2 P) D)

is an abbreviation for

(pick-witness I1 (exists x1 x2 P) (pick-witness I2 (exists x2 P [x1 7→ I1] D)))

Here is an example, showing that (exists ?x ?y (< ?x ?y)) implies (exists ?y ?x (< ?x ?y)):

>(define ex-premise (exists ?x ?y (< ?x ?y)))

Proposition ex-premise defined.

>(assume ex-premise

(pick-witnesses (w1 w2) ex-premise

(dseq (!egen (exists ?x (< ?x w2)) w1)

(!egen (exists ?y ?x (< ?x ?y)) w2))))

Theorem: (if (exists ?x:Rational

(exists ?y:Rational

(< ?x ?y)))

(exists ?y:Rational

(exists ?x:Rational

(< ?x ?y))))

Oftentimes it is convenient to give a name to the witness premise, and then be able to refer toit by that name from inside the body of the pick-witness. This can be done by inserting a name(an identifier) before the body D of the pick-witness. That identifier will then refer to the witnesspremise inside D. For example, the proof (pick-witness w (exists ?foo (= ?foo ?foo)) wp D)will give the name wp to the witness premise, i.e., every free occurrence of wp within D will refer tothe witness premise.

2.7 Additional methods of proof composition

2.7.1 dlet: naming and proof composition

So far the only mechanism we have presented for composing deductions is the syntax form

(dseq D1 · · ·Dn D) (2.10)

29

where the result Pi of each intermediate proof Di becomes available as a lemma in the subsequentproofs Di+1, . . . , Dn, D. Oftentimes is is useful to name these intermediate results so we can laterrefer to them by name instead of having to write them out explicitly. A more flexible variant of dseqthat allows for naming is dlet, which is typically of the form

(dlet ((I1 D1) · · · (In Dn)) D) (2.11)

A proof of the form (2.11) has the same assumption-base semantics as (2.10): first D1 is evaluated,yielding some conclusion P1; then P1 is added to the assumption base and we continue with D2; and soon. We continue evaluating intermediate deductions sequentially in this manner and adding their re-sults to the assumption base until we finally get to the last deduction, D. The conclusion of D becomesthe result of the entire proof. The difference with dseq is that as each intermediate conclusion Pj isobtained, we give it the name Ij before continuing with the next deduction. Any subsequent (free) oc-curences of Ij will refer to Pj . To illustrate, consider a proof that (and (male peter) (female ann))implies (and (female ann) (male peter)):

>(define hyp (and (male peter) (female ann)))

Proposition hyp defined.

>(assume hyp

(dlet ((left (!left-and hyp))

(right (!right-and hyp)))

(!both right left)))

Theorem: (if (and (male peter)

(female ann))

(and (female ann)

(male peter)))

The following rule gives a more succinct description of the semantics of dlet when there is onlyone intermediate deduction:

β `D1 ; P1 β ∪ {P1} `D[I1 7→ P1] ; P

β ` (dlet ((I1 D1)) D) ; P

where D[I1 7→ P1] denotes the deduction obtained from D by replacing every free occurrence of I1 byP1. Proofs with multiple intermediate deductions can be desugared to this case, e.g., a proof of theform (dlet ((I1 D1) (I2 D2)) D) can be understood as (dlet ((I1 D1)) (dlet ((I2 D2)) D)),and so on.

dlet bindings can be used to give names not just to results of deductions, but to any valuewhatsoever—terms, propositions, lists, etc. For instance:

>(dlet ((p (male peter))

(a (female ann))

(hyp (and p a)))

(assume hyp

(dlet ((left (!left-and hyp))

(right (!right-and hyp)))

(!both right left))))


30

(female ann))

(and (female ann)

(male peter)))

2.7.2 Nested method calls

The most fundamental way of composing proofs is by having one proof appear directly as an argumentto a method call. Suppose that a deduction D appears as an argument to an application of somemethod M : (!M · · ·D · · · ). To evaluate such a method call in an assumption base β, we first needto evaluate the arguments; in particular, we need to evaluate D in β. The conclusion of D in β, callit P , will be added to β when we come to apply M to the argument values. Hence, the result of Dwill be available as a lemma by the time we come to apply M .

To make matters concrete, suppose we have defined conj to be (and A (not (not B))), andconsider the deduction

(!dn (!right-and conj)) (2.12)

in an assumption base β0 that contains only the premise conj: β0 = {(and A (not (not B)))}. Herethe deduction (!right-and conj) appears directly as an argument to the double negation method dn.To evaluate the entire proof (2.12) in β0, we begin by evaluating the argument (!right-and conj) inβ0. That will successfully produce the conclusion (not (not B)), since the requisite premise is in β0.We are now ready to apply the outer method dn to the argument (not (not B)), but before we doso, we add the conclusion (not (not B)) to β0. Accordingly, the application of dn to (not (not B))will take place in the assumption base

β′0 = {(not (not B)), (and A (not (not B)))}

and will sucessfully produce the ultimate conclusion B:

>(define conj (and A (not (not B))))

Proposition conj defined.



>(assert conj)

The proposition

(and A

(not (not B)))


>(!dn (!right-and conj))

Theorem: B

Accordingly, (!dn (!right-and conj)) is completely equivalent to—and more concise than—thededuction

(dseq (!right-and conj)

(!dn (not (not B))))

31

In general, every time a deduction appears as an argument to a method call, the conclusion of thatdeduction will be incorporated in the assumption base once all the arguments have been evaluatedand we are ready to apply the method. This is depicted succinctly by the following rule:

β `D ; P β ∪ {P} ` (!M · · ·P · · · ) ; Q

β ` (!M · · ·D · · · ) ; Q

As another example, the following is a shorter version of our earlier proof of the conditional(if (and A B) (and B A)):

>(assume (and A B)

(!both (!right-and (and A B))

(!left-and (and A B))))


(and B A))

Because the applications of right-and and left-and appear as arguments to the both call, thelatter will take place in an assumption base that contains the conclusions of those two deductions,namely, B and A respectively. Thus the application of both will succeed, producing (and B A),with the entire assume finally returning (if (and A B) (and B A)). Again, this shorter proof isequivalent to the one we gave earlier using dseq:

(assume (and A B)



(!both B A)))

We mentioned earlier that nesting method calls is a more “fundamental” mechanism for proofcomposition than dseq and dlet. What we mean is that, in the presence of user-defined methods,both dseq and dlet can be desugared to nested method calls. Specifically, (dlet ((I1 D1)) D) isequivalent to

(!(method (I1) D) D1)

This will become clear when we discuss methods.

2.8 Conclusion annotation with BY

Sometimes a proof can be made more readable if we announce its intended conclusion before weactually present the proof. This is done frequently in practice. Authors often say “and now we deriveP as follows: · · · ” or “P follows by · · · .” In Athena such annotations can be made with the BYconstruct, whose syntax is (P BY D), where D is a deduction and P is the intended conclusion ofD.

To evaluate (P BY D) in an assumption base β, we evaluate D in β. If and when we obtaina conclusion Q, we check to make sure that Q is the same as P (up to alphabetic equivalence). Ifso, we simply return P as the final result. If not, we report an error to the effect that the obtainedconclusion was different from the advertised conclusion. Symbolically:

β `D ; P

β ` (P BY D) ; P

Here is a version of our recurring example that (and A B) implies (and B A) that uses the BY feature:

32

>(assume (and A B)

(dseq (A BY (!left-and (and A B)))

(B BY (!right-and (and A B)))

((and B A) BY (!both B A))))


(and B A))

The general form of this construct is (E BY D) for an arbitrary expression E that evaluates toa proposition. E can contain previouly defined names and can indeed spawn an arbitrary amountof computation, as long as it eventually produces a proposition P . We then proceed normally byevaluating D to get a conclusion Q and then comparing P and Q for equality.

2.9 Assume-let and suppose-absurd-let

When the hypothesis of a conditional deduction is a long proposition, it is convenient to be able togive it a name and then refer to it in the body of the deduction by that name. We can do this usinga dlet, writing hypothetical deductions in the following form:

(dlet ((I E)) (assume I D)) (2.13)

Assuming that the expression E yields a proposition, every free occurrence of I in the body D willrefer to that proposition. This is a useful enough idiom that Athena has a shortcut for it:

(assume-let (I E) D).

The semantics of this are identical to those of (2.13). It is just like (assume E D), except that everyfree occurrence of I in D will refer to the proposition produced by E. An example:

>(assume-let (hyp (and (male peter)

(female ann)))

(!both (!right-and hyp)

(!left-and hyp)))


(female ann))

(and (female ann)

(male peter)))

There is a similar form for proofs by contradiction:

(suppose-absurd-let (I E) D).

This is taken as syntax sugar for (dlet ((I E)) (suppose-absurd I D)).

33

Chapter 3

Automated theorem proving

3.1 The need for automation

At this point we have presented enough deductive machinery to enable the user to derive any propo-sition P from any set of propositions that entail P . Technically speaking, the primitive methods wehave described along with the built-in syntax forms (assume, pick-any, etc.) constitute a completedeductive system for multi-sorted first-order logic with equality. This means that if a proposition Pfollows logically from an assumption base β, we can rest assured that P is derivable from β, i.e., thatthere is some deduction D such that β `D ; P .

However, all the inference techniques presented up to this point are of very low granularity: they areall introduction or elimination techniques for the sentential connectives and quantifiers. Constructinga realistic proof exclusively in terms of such “intro/elim” rules can be very instructive—it is a greatway to learn logic—but also very tedious. In a sense, it is like programming in machine language, whereevery step needs to be spelled out in terms of low-level primitives such as bit operations. While that canalso be instructive and help to build one’s insights about computation, it is not a feasible methodologyfor building realistic programs; writing a word processor or a database application in machine languagewould be too difficult. To be productive as software engineers, we need automation: the ability totake multiple steps—potentially hundreds of thousands of them—with one single instruction.

The same holds for proofs. To be productive as proof engineers, we need automation. Athenaoffers two key automation mechanisms, both of which are widely used in practice:

1. User-defined methods: This is the equivalent of procedural abstraction for inference. Just asprogramming languages let us bundle up (“abstract”) a piece of code into a separate procedurewith its own name that can later be invoked as if it were a language primitive, Athena lets usabstract a given deduction into a proof method with its own name that can later be used as ifit were a primitive method.

2. Black-box facilities for automated theorem proving.

In this section we will focus on the second item. User-defined methods will be discussed elsewhere.Athena offers two primitive methods for black-box-style automated theorem proving: derive and

prove. The first, derive, is a binary method that takes a single proposition P (the goal) and a listof propositions [P1 · · ·Pn] (the premises), and attempts to derive P from P1, . . . , Pn, provided thatevery premise Pi is in the current assumption base. An error is generated if some Pi is not in theassumption base.

34

This method is used for skipping tedious inference steps. Many times we know that a certaindesired conclusion P follows from some other propositions P1, . . . , Pn that have already been estab-lished, usually because the entailment is obvious. Indeed, oftentimes we know how to derive P fromP1, . . . , Pn using Athena’s introduction and elimination constructs; if we were challenged, we couldwrite out such a proof in detail. But descending to such a low level of detail would be uninterestingand counterproductive; our effort would not shed any light on the important ideas of the overallargument. In such cases we may issue a method call of the form

(!derive P [P1 · · ·Pn])

essentially saying to Athena: “P follows from P1, . . . , Pn by standard logical manipulations: universalspecializations, modus ponens, etc. There is nothing interesting or deep here—you work out thedetails.” If we are wrong, either because P does not in fact follow from P1, . . . , Pn, or because it isa non-trivial consequence of them, the application will fail within a certain time limit (set by defaultto one minute). Otherwise, a proof of the implication (if (and P1 · · ·Pn) P) will almost certainly befound within a few seconds—often in a fraction of a second—and P will be successfully returned asa theorem.1

The user need not be concerned with how Athena goes about proving that P1, . . . , Pn imply P . Itcould launch a proof search on its own, consult outside experts, or resort to divine intervention—itdoes not matter. The user should view derive as a primitive oracle, a black box that is guaranteedto produce either an answer within a certain amount of time.

For instance, consider our running example, the implication (if (and A B) (and B A)). It couldbe argued that the body of our previous proof:

(assume (and A B)



(!both B A)))

is too low-level. A higher-level proof is this:

>(assume-let (hyp (and A B))

(!derive (and B A) [hyp]))


(and B A))

An even shorter proof is the following:

>(!derive (if (and A B) (and B A)) [])


(and B A))

This succeeds because the conditional (if (and A B) (and B A)), being a tautology, follows from theempty list of premises.

In general, assuming that the premises are in the assumption base, there are three possible out-comes for an application of derive:

1This is sound since we already know that the conjunction (and P1 · · ·Pn) holds, by virtue of the fact that everyPi is in the assumption base.

35

A positive answer is produced within the set time limit. Positive answers should be trusted.The user should take it for granted that if derive succeeds, it is extremely likely that the answeris correct.

A negative answer is produced within the set time limit. Such answers should also be trusted.If derive replies within a few seconds that it could not infer the goal from the premises, then itis highly likely that the goal does not in fact follow from the premises. In those cases the userwill likely be able to find a counter-model—a model that satisfies the premises but falsifies thegoal—using the model generation capabilities of Athena (discussed in Section ??).

A timeout occurs. These cases give us the least information. If derive takes a minute or more andreports that it was unable to infer the goal from the premises, then it could be that the goaldoes not follow from the premises; we might be able to find a counter-model for it. But it couldalso be that the goal follows from the premises and derive simply was not smart enough to finda proof. It may be that if we increase the time limit, derive will succeed. This might be worthtrying, but if it fails then we have to rethink the problem. Usually at that point we have tostart writing out the derivation of the goal ourselves.

The unary method prove takes a single proposition P as its goal and attempts to derive P fromthe entire assumption base. In fact prove is defined in terms of derive:

>(define (prove P) (!derive P (get-assumption-base)))

Method prove defined.

where get-assumption-base is a built-in Athena procedure of zero arguments that returns a list of alland only those propositions that are in the current assumption base. Here is an example of usingprove:

>(declare (a b c d e f g h) Rational)

Symbol a declared.

.

.

Symbol h declared.

>(assert (<= a zero))

The proposition

(<= a zero)


>(assert [(= a b) (= c b) (= c e) (= e d) (= h d) (= f a) (= g f)])

The proposition

(= a b)

has been added to the assumption

base.

.

.

The proposition

(= g f)

36


>(!prove (<= h zero))

Theorem: (<= h zero)

3.2 Automation and the issue of proof presentation

As another example, suppose we have introduces a binary relation younger on persons:

>(declare younger (-> (Person Person) Boolean))

New symbol younger declared.

and assumed that it is irreflexive and transivitive:

>(define younger-is-irreflexive (forall ?p (not (younger ?p ?p))))

Proposition younger-is-irreflexive defined.

>(define younger-is-transitive

(forall ?x ?y ?z

(if (and (younger ?x ?y)

(younger ?y ?z))

(younger ?x ?z))))

Proposition younger-is-transitive defined.

>(assert younger-is-irreflexive younger-is-transitive)

The proposition

(forall ?p:Person

(not (younger ?p ?p)))


The proposition

(forall ?x:Person

(forall ?y:Person

(forall ?z:Person


(younger ?y ?z))

(younger ?x ?z)))))


We now wish to conclude from these assumptions that younger is asymmetric, i.e., that the followingholds:

(forall ?x ?y (if (younger ?x ?y)(not (younger ?y ?x))))

(3.1)

An informal proof might proceed as follows, where we have numbered the various steps of the argument(1)—(6) for easy reference. Note that this is the standard proof that appears in many discrete math

37

textbooks showing that a binary relation which is irreflexive and transitive must also be asymmetric:

Having assumed that younger is irreflexive and transitive, we want to show(forall ?x ?y (if (younger ?x ?y) (not (younger ?y ?x)))). (1) Accordingly,pick any person x and any person y and (2) assume that (younger x y).(3) Now suppose, by way of contradiction, that we have (younger y x). (4)Then, by (younger x y), (younger y x), and the transitivity of younger, weobtain (younger x x). (5) But because younger is irreflexive, we must have(not (younger x x)). (6) Together, (younger x x) and (not (younger x x)) con-stitute a contradiction, hence we must have (not (younger x y)).

(3.2)

We can express this proof in Athena as follows, where we have numbered the steps 1—6 in accordancewith the steps in the informal proof above:

1. (pick-any x y

2. (assume (younger x y)

3. (suppose-absurd (younger y x)

4. (dseq ((younger x x) BY ... )

5. ((not (younger x x)) BY ... )

6. (!absurd (younger x x) (not (younger x x)))))))

Of course this is not a complete proof. We still need to fill the gaps in steps 4 and 5, the derivations of(younger x x) and (not (younger x x)), respectively. In terms of introduction and elimination rules,this can be done as follows:

>(pick-any x y

(assume (younger x y)

(suppose-absurd (younger y x)

(dseq ((younger x x) BY (!mp ((if (and (younger x y) (younger y x)) (younger x x))

BY (!uspec (!uspec (!uspec younger-is-transitive x) y) x))

(!both (younger x y) (younger y x))))

((not (younger x x)) BY (!uspec younger-is-irreflexive x))

(!absurd (younger x x) (not (younger x x)))))))

Theorem: (forall ?x:Person

(forall ?y:Person

(if (younger ?x ?y)

(not (younger ?y ?x)))))

Although correct, this proof has a drawback: steps 4 and 5 are specified in too much detail,and in that respect they are not faithful representations of the corresponding steps of the informalproof (3.2). For a human reader, the appropriate level of detail for, say, step 4, is the level of detailgiven in the informal proof: “we obtain (younger x x) from (younger x y), (younger y x), and thetransitivity of younger.” That is a perfectly rigorous and complete reasoning step for a human; it is notnecessary to descend to the level of universally instantiating younger-is-transitive three successivetimes, then using conjunction introduction on (younger x y) and (younger y x) and finally applyingmodus ponens on the results. Doing so is pedantic. It distracts from the main thread of the argumentwithout adding any significant new insights. Likewise for step 4: we should be able to do in our Athenaproof the same thing we did in the informal proof (3.2): simply say “we infer (not (younger x x))

because younger is irreflexive,” insted of having to pergform universal specialization.

38

Using derive, we can dispense with these details and lift our Athena proof to a higher level ofabstraction:

>(pick-any x y

(assume (younger x y)

(suppose-absurd (younger y x)

(dseq (!derive (younger x x) [(younger x y) (younger y x) younger-is-transitive])

(!derive (not (younger x x)) [younger-is-irreflexive])

(!absurd (younger x x) (not (younger x x)))))))


(forall ?y:Person

(if (younger ?x ?y)


Observe that this shorter proof is virtually isomorphic to the informal proof (3.2). This is one ofthe key advantages of Athena: It allows for well-structured proofs given at roughly the same level ofdetail as they would be in practice.

Could we have an even shorter proof—a completely automatic proof that younger is asymmetricfrom the assumptions of irreflexivity and transitivity? We could:

>(define younger-is-asymmetric

(forall ?x ?y (if (younger ?x ?y)


Proposition younger-is-asymmetric defined.

>(!derive younger-is-asymmetric [younger-is-irreflexive younger-is-transitive])


(forall ?y:Person

(if (younger ?x ?y)


Athena is seamlessly integrated with cutting-edge ATPs 2 (“automated theorem provers”) thatrun under the hood every time derive is invoked. These ATPs are quite powerful, and are able toprove the asymmetry of younger from its irreflexivity and transitivity in a fraction of a second.

However, one might argue that we have now overdone it; that the one-line deduction

(!derive younger-is-asymmetric [younger-is-irreflexive younger-is-transitive])

does not constitute a proof. It is not an argument, in that it offers no indication as to why younger

must be asymmetric if it is irreflexive and transitive.Is this a valid criticism? The answer depends on our purposes. Proofs are useful for two main

reasons: they provide justification for believing in a proposition, and rational explanation as to whythe proposition holds. These are two different things. If we somehow had access to an infallible oraclethat could always magically divine mathematical truth, we would be perfectly justified in believingit if it told us that some proposition P (e.g., Fermat’s conjecture) follows from some propositions

2Currently Vampire [7] and Spass [8] are the two most prominent ATP systems bundled with Athena. Vampire isusually considerably more effective than Spass, and is the default prover used by derive.

39

P1 · · ·Pn (e.g., the axioms of number theory). We would have justification, but we would still nothave any understanding. For that, we need a proof that patiently derives P from P1 · · ·Pn in a chainof inferences, with each step along the way being simple and easily understood.

So if all we want to do is demonstrate that a certain proposition holds, without caring muchabout the underlying reasons, then we may well resort to “push-button” reasoning and use a singleapplication of derive or prove. As long as we have good reason to believe that derive and prove aretrustworthy, a positive answer from them provides enough justification for accepting a proposition.But if we want to explain to our audience why a proposition holds, then we’ll probably have to writeout a proof. How short or long that proof will need to be depends on the intended audience. Ifwe are addressing a mathematically sophisticated audience then we should try to make our proofsshorter, skipping as many tedious steps as possible and highlighting only the main ideas of the overallargument. For professional mathematicians, the one-line proof

(!derive younger-is-asymmetric [younger-is-irreflexive younger-is-transitive])

may be adequate. For others, we may want to opt for proof (??), which depicts the core ideas of theargument without many details. For novices, we may even have to go lower, to the level of introducingand eliminating sentential connectives and quantifiers, and present the first Athena proof we gave inthis section.

We end with a few remarks on efficiency. We mentioned earlier that applications of derive andprove are delegated under the hood to ATPs such as Vampire and Spass. How much power doesderive get from these systems? Assuming that a goal P follows from a list of premises L, will acall (!derive P L) always succeed? The short answer is no; the outcome depends on several factors.First, if the implication is subtle, then derive will most likely fail. To take an extreme example, if thegoal is Fermat’s result and the premises are the axioms of number theory, then derive will certainlyfail. We cannot expect a machine—at least at this point in time—to find within a few seconds a proofof a result that could challenge a competent human for years. But even if the entailment is fairlyobvious (say, to a computer scientist with a background in logic), the call might still fail, dependingmainly on the number and relevance of the given premises, and the size of the propositions. Wediscuss each of these factors briefly:

1. Number and relevance of premises: The fewer the premises and the more relevant they are,the higher the chance of success. Ideally, we will have just a handful of premises and each oneof them will be necessary for the proof. Unneeded premises tend to lead ATPs down blindcorners. A few extra premises will not hurt, and cutting-edge ATPs do a decent job of detectingirrelevant premises and ignoring them. But if dozens—or hundreds or thousands—of premisesare supplied and only four or five of them are needed, the chance of failure increases greatly.This is why the prove method (which supplies the contents of the entire assumption base aspremises) should be used with care, mindful of the number and type of propositions that are inthe assumption base at the time of application.

2. Size and structure of premises: Even when there are a few premises and all of them are neededfor the result, derive might still get stuck if it is dealing with very large propositions (e.g.having more than a dozen quantified variables or a complex propositional structure with mul-tiply nested sentential connectives or quantifiers). The propositional structure of the premisesis also important. Conjunctions and conditionals are fairly innocuous, but disjunctions andbiconditionals complicate automated deduction.

40

Chapter 4

Datatypes and induction

4.1 Datatypes

A datatype is a special kind of domain. It is special in that it is inductively generated, meaning thatevery element of the domain can be built up in a finite number of steps by applying certain functionsknown as constructors. A datatype D is specified by giving its name, possibly followed by some sortparameters (if D is polymorphic; we discuss such datatypes in Chapter 6), and then a sequence ofconstructor profiles. A constructor profile is of the form (c S1 · · ·Sn), consisting of the name of theconstructor, c, along with a list of sorts S1...Sn, where Si is the sort of the ith argument of c. Therange of c is not explicitly mentioned—it is tacitly understood to be the datatype D. That is why,after all, c is called a “constructor” of D; because it builds or constructs members of the datatype.Thus every application of c to n arguments of the appropriate sorts produces an element of D. Anullary constructor (one that takes no arguments, so that n = 0) is called a constant constructor ofD, or simply a constant of D. Such a constructor is meant to denote an individual element. Twotypical datatype examples are the booleans and the natural numbers:

(datatype Boolean

true

false)

(datatype Nat

zero

(succ Nat))

The first example defines a datatype by the name of Boolean that has two constant constructors,true and false. In more conventional notation, the “signature” of true might be written as

true : → Boolean

or simply true:Boolean, and likewise for false.Thus this definition asserts that the datatype Boolean has two elements, true and false. This is

not all that the definition says, however. It also says that true and false are distinct elements; and,moreover, that true and false are the only two elements of Boolean. Suppose then that instead ofthis datatype definition we had opted for the following:

(domain Boolean)

41

(declare true Boolean)

(declare false Boolean)

What is the difference between this approach and the datatype definition? First, this approachdoes not rule out the possibility that true and false might in fact be identical (denote the sameobject). Secondly, it does not rule out the possibility that the domain Boolean might contain otherelements besides true and false, perhaps infinitely more elements. If, however, we postulated thesetwo conditions axiomatically, then the two approaches would indeed be equivalent. That is, if wewrote

(domain Boolean)

(declare (true false) Boolean)

(assert (not (= true false)))

(assert (forall ?b

(or (= ?b true)

(= ?b false))))

then the net effect would be the same as that of the datatype definition. So to a certain extent datatypedefinitions can be viewed as syntax sugar for domain declarations along with the explicit postulationof axioms such as the above.1 For readers who are familiar with universal algebra, these axioms—roughly, that different constructors stand for different objects and that every object is represented bysome constructor—ensure that the datatype is freely generated, i.e., that it is a free algebra. Given adatatype D, we will collectively refer to these axioms as the free generation axioms for D.

The second example above defined a datatype Nat that has one constant constructor zero andone unary constructor succ. The profile (succ Nat) tells us that succ may be written as

succ: Nat→ Nat

while that of zero may be expressed as

zero:→ Nat

or simply

zero: Nat

Because the argument of succ is of the same sort as its result, Nat, we say that succ is a reflexiveconstructor. Thus succ requires an element of Nat as input in order to construct another such elementas output. By contrast, zero, as well as true and false in the preceding example, are irreflexivecontructors—trivially, since they take no arguments.

The free-generation axioms for Nat are as follows:

(forall ?n

(not (= zero (succ ?n))))

(forall ?n ?m

1We say to a certain extent because an additional effect of a datatype definition is the introduction of an inductionprinciple for performing structural induction on the datatype elements (we discuss this in Section 4.2). That cannot bedesugared into other Athena primitives.

42

(if (= (succ ?n) (succ ?m))

(= ?n ?m)))

(forall ?n

(or (= ?n zero)

(exists ?m

(= ?n (succ ?m)))))

or in more traditional notation,

∀ n . 0 6= succ(n)∀ n, m . [succ(n) = succ(m)] ⇒ n = m

∀ n . n = 0 ∨ [∃ m . n = succ(m)]

which the reader might recognize if he has ever seen Peano’s axioms. The axioms assert that nonumber has zero as a successor, that succ is injective, and that the two constructors zero and succ“span” the domain Nat, i.e., every natural number is either zero or the successor of some naturalnumber. These axioms, along with the induction principle that we will describe shortly, ensure thatthe terms

zero, (succ zero), (succ (succ zero)), (succ (succ (succ zero))), . . .

are all distinct from one another; that each of them represents a natural number; and that everynatural number is represented by some such term. Accordingly, these terms may as well be identifiedwith the natural numbers.

The axioms of a datatype can also be obtained by the unary built-in procedure datatype-axioms,which takes the name of the datatype as an input string and returns a list of all and only the free-generation axioms for that datatype. For instance:

>(datatype-axioms "Nat")

List: [(forall ?y1 (not (= zero (succ ?y1))))

(forall ?x1 (forall ?y1 (if (= (succ ?x1) (succ ?y1)) (= ?x1 ?y1))))

(forall ?v (or (= ?v zero) (exists ?x1 (= ?v (succ ?x1)))))]

This often comes handy when we are using the theorem-proving primitive derive and want to passthe free generation axioms of a given datatype as part of the premise list argument to it, e.g.,

(!derive goal (join [hyp P1 P2] (datatype-axioms "Nat")))

(join is a built-in Athena procedure that concatenates any number of lists).Once a datatype D has been introduced, it can be used as a bona fide Athena sort, e.g., we

can declare functions that take elements of D as arguments or return elements of D as results. Forinstance, we can introduce binary addition on natural numbers as follows:

>(declare plus (-> (Nat Nat) Nat))

New symbol plus declared.

Mutually recursive datatypes are allowed, and can be defined with the keyword datatypes. Forexample:

43

>(datatypes

(Int-Tree

empty-tree

(node Integer Int-Tree-List))

(Int-Tree-List

empty-tree-list

(cons Int-Tree Int-Tree-List)))

New datatypes Int-Tree and Int-Tree-List defined.

The constructors of a datatype (or a set of mutually recursive datatypes) must have distinct namesfrom one another, as well as from the constructors of every other datatype and every other declaredfunction symbol.

4.2 Induction

Athena allows for inductive reasoning over datatypes with the built-in syntax form by-induction.Let us take the datatype Nat as an example. To prove that a proposition P holds for every elementof Nat, it suffices to prove that it holds for zero (the basis step); and that if it holds for an arbitrarynatural number n then it also holds for (succ n)—the inductive step. Such an argument can becarried out with a proof of the following form:

(by-induction G(zero D1)

((succ I) D2))

Here G is the goal—a proposition of the form (forall x:Nat P), stating that every natural numberhas some property P . D1 is the proof of the basis step: It must produce the conclusion P [x 7→ zero],i.e., the proposition obtained from P by replacing every free occurrence of x by zero. DeductionD2 performs the inductive step. It must yield the conclusion P [x 7→ (succ v)] on the assumptionP [x 7→ v], where v is a freshly generated variable that will become the value of the identifier I.

More precisely, to evaluate such a deduction in an assumption base β, we first evaluate D1 in β,checking to make sure that the obtained conclusion is P [x 7→ zero] (an error is reported otherwise). Ifwe get the expected conclusion, we continue with the inductive step. A fresh variable v is generated,and we evaluate D2[I 7→ v] in the assumption base β ∪ {P [x 7→ v]}. Note that the propositionP [x 7→ v] represents the inductive hypothesis. If and when we obtain the expected conclusion,P [x 7→ (succ v)], we return the universal generalization G as the result of the entire by-inductionproof. In symbols:

β `D1 ; P [x 7→ zero] β ∪ {P [x 7→ v]} `D2[I 7→ v] ; P [x 7→ (succ v)]β ` (by-induction G (zero D1) ((succ I) D2)) ; G

where v is a fresh variable

As an example, suppose we define plus (binary addition on the natural numbers) via the twousual primitive recursive equations, which we will assert as axioms:

>(define plus-axiom-1

(forall ?n

(= (plus zero ?n) ?n)))

Proposition plus-axiom-1 defined.

44

>(define plus-axiom-2

(forall ?n ?m

(= (plus (succ ?n) ?m)

(succ (plus ?n ?m)))))

Proposition plus-axiom-2 defined.

>(define plus-axioms [plus-axiom-1 plus-axiom-2])

List plus-axioms defined.

>(assert plus-axioms)

The proposition

(forall ?n:Nat

(= (plus zero ?n)

?n))


The proposition

(forall ?n:Nat

(forall ?m:Nat

(= (plus (succ ?n)

?m)

(succ (plus ?n ?m)))))


These two equational axioms completely specify the behavior of plus for any two given naturalnumbers. Suppose we now want to prove the following:

>(define goal

(forall ?n (= (plus ?n zero) ?n)))

Proposition goal defined.

Note that this says something different from plus-axiom-1. The latter says that zero is a left identityfor plus, whereas goal says that it is a right identity.

Perhaps surprisingly, it can be shown that goal does not follow from the free-generation axiomsof the natural numbers along with the two equational axioms definining plus. The only way to provegoal is to use induction, which is what we will now do. First we will give an informal proof:Proof: We use induction. For the basis step we need to prove (= (plus zero zero) zero), whichfollows directly from plus-axiom-1. For the inductive step, pick any n and assume the inductive hy-pothesis (= (plus n zero) n); we now have to show the equality (= (plus (succ n) zero) (succ n)).From plus-axiom-2 we have

(= (plus (succ n) zero) (succ (plus n zero))) (4.1)

But, from the inductive hypothesis, (= (plus n zero) n), and hence (4.1) yields the desired equality(= (plus (succ n) zero) (succ n)).

Here is this proof in Athena:

45

>(by-induction goal

(zero (!derive (= (plus zero zero) zero) [plus-axiom-1]))

((succ n) (dlet ((ind-hyp (= (plus n zero) n))

(P (!derive (= (plus (succ n) zero)

(succ (plus n zero))) [plus-axiom-2])))

(!derive (= (plus (succ n) zero) (succ n)) [P ind-hyp]))))

Theorem: (forall ?n:Nat

(= (plus ?n zero)

?n))

Note that the detail level and overall structure of this Athena proof is the same as that of theinformal one.

Using the procedural capabilities of Athena, we can shorten and clean up the above proof some-what. We have not covered procedures yet, but what we are about to say should make sufficientintuitive sense. The reader should return to this section after reading up on the computational as-pects of Athena (using Athena as a programming language).

Specifically, let us define a unary procedure property that takes any term t of sort Nat as inputand constructs the proposition (= (plus t zero) t):

>(define (property t)

(= (plus t zero) t))

Procedure property defined.

>(property ?foo)

Term: (= (plus ?foo zero)

?foo)

>(property (plus (succ zero) ?n))

Term: (= (plus (plus (succ zero)

?n)

zero)

(plus (succ zero)

?n))

We can now express the preceding inductive proof as follows:

>(by-induction (forall ?n (property ?n))

(zero (!derive (property zero) [plus-axiom-1]))

((succ n) (dlet ((ind-hyp (property n))

(P (!derive (= (plus (succ n) zero)


(!derive (property (succ n)) [P ind-hyp]))))


(= (plus ?n zero)

?n))

This is cleaner because our basis and inductive goals are now expressible simply as (property zero)and (propery (succ n)), respectively. The inductive hypothesis becomes (property n). This

46

means that we do not have to explicitly name the inductive hypothesis in the dlet; we can denoteit simply by writing (property n). Removing the explicit introduction of ind-hyp from the aboveproof leads to:



((succ n) (dlet ((P (!derive (= (plus (succ n) zero)


(!derive (property (succ n)) [P (property n)]))))


(= (plus ?n zero)

?n))

If we wish, we can make the proof shorter by skipping the formulation of the P lemma, passinginstead plus-axiom-2 as an explicit premise to derive:



((succ n) (!derive (property (succ n)) [plus-axiom-2 (property n)])))


(= (plus ?n zero)

?n))

Which version is more readable is debatable. Again, that depends mostly on the reader’s level ofsophistication.

Can we get more automation yet? We can. First, we can use prove instead of derive, so that wedon’t have to explicitly identify the necessary premises:


(zero (!prove (property zero)))

((succ n) (!prove (property (succ n)))))


(= (plus ?n zero)

?n))

This works because by the time the inductive subdeduction (!prove (property (succ n))) isevaluated, the inductive hypothesis (property n) has been added to the assumption base and willhence be supplied as a premise to prove.

We can go further by using Athena’s facilities for defining methods. We can abstract the precedingproof into a general induction method that takes an arbitrary natural-number “property” (expressedas a procedure that takes a natural number term and constructs an appropriate proposition), andperforms the corresponding induction:

>(define (nat-induction property)

(by-induction (forall ?n (property ?n))

(zero (!prove (property zero)))

((succ n) (!prove (property (succ n))))))

Method nat-induction defined.

We can now simply say:

47

>(!nat-induction property)


(= (plus ?n zero)

?n))

Because nat-induction uses prove, it is likely to fail if the assumption base is large. We can definea more fine-tuned version of nat-induction that takes a list of propositions as a second argument,representing the premises that are actually needed to carry out the proofs of the two steps (basis andinductive), and supplies those to derive:

>(define (nat-induction-1 property premises)

(by-induction (forall ?n (property ?n))

(zero (!derive (property zero) premises))

((succ n) (!derive (property (succ n)) (add (property n) premises)))))

Method nat-induction-1 defined.

Note that the inductive hypothesis (property n) is explicitly added to the given premises during theinductive step (add is a predefined Athena procedure that prepends an arbitrary element to the frontof a list). A sample use of nat-induction-1 would be:

>(!nat-induction-1 property plus-axioms)


(= (plus ?n zero)

?n))

4.3 More induction examples

In this section we give examples of induction on slightly more complicated data types: lists and trees.

4.3.1 Lists

We can define lists of natural numbers as the following datatype:

>(datatype NatList

nil

(cons Nat NatList))

New structure NatList defined.

Thus a NatList term is either the empty list nil or a list of the form (cons n l), obtained byprepending (“consing”) a natural number n to the front of some other list of natural numbers l.Therefore, the following are all valid terms of sort NatList:

nil (the empty list [])(cons zero nil) (the 1-element list [0])

(cons (succ (succ zero)) nil) (the 1-element list [2])(cons zero (cons (succ zero) (cons zero nil))) (the 3-element list [0, 1, 0])

Reasoning by induction on such lists can be performed with proofs of the form

48

(by-induction G(nil D1)

((cons I1 I2) D2))

where G is the goal to be established by induction, a proposition of the form (forall l:NatList P),asserting that every list of natural numbers has some property P . The basis step, performed by D1,consists in proving P [l 7→ nil], i.e., that the empty list nil has property P . The inductive step,performed by D2, consists in proving that an arbitrary list of the form (cons h t) has property Pon the assumption that the tail t has P ; that assumption represents the inductive hypothesis. Moreprecisely, D2 needs to show P [l 7→ (cons h t)] under the assumption P [l 7→ t], for freshly generatedvariables h and t. In symbols:

β `D1 ; P [l 7→ nil] β ∪ {P [l 7→ t]} `D2[I1 7→ h][I2 7→ t] ; P [l 7→ (cons h t)]β ` (by-induction G (nil D1) ((cons I1 I2 D2)) ; G

where h, t are fresh variables

As an example, let us define a concatenation operation on lists with the following equations:2

>(declare append (-> (NatList NatList) NatList))

New symbol append declared.

>(define append-axiom-1

(forall ?l

(= (append nil ?l) ?l)))

Proposition append-axiom-1 defined.

>(define append-axiom-2

(forall ?n ?rest ?l

(= (append (cons ?n ?rest) ?l)

(cons ?n (append ?rest ?l)))))

Proposition append-axiom-2 defined.

>(define append-axioms [append-axiom-1 append-axiom-2])

List append-axioms defined.

>(assert append-axioms)

... output of assert omitted for brevity...

2Note that it is not necessary to define append with two equations. We could just as well do with a single conjunctiveaxiom. However, it is usually cleaner to break up the specification of a function in a number of different equationscorresponding to the datatype patterns that some of the arguments may assume. This is similar to function definitionvia pattern matching in functional languages such as ML or Haskell. For instance, if we were to define NatList andappend in ML:

datatype NatList = nil | cons of Nat * NatList;

fun append(nil,l) = l

| append(cons(n,rest),l) = cons(n,append(rest,l))

the two clauses above would correspond precisely to our two equations append-axiom-1 and append-axiom-2.

49

Moreover, let us define the length of a list as follows:

>(declare length (-> (NatList) Nat))

New symbol length declared.

>(define length-axiom-1

(= (length nil) zero))

Proposition length-axiom-1 defined.

>(define length-axiom-2

(forall ?x ?rest

(= (length (cons ?x ?rest)) (succ (length ?rest)))))

Proposition length-axiom-2 defined.

>(define length-axioms [length-axiom-1 length-axiom-2])

List length-axioms defined.

>(assert length-axioms)


Now suppose we want to prove the following:

>(define length-of-append

(forall ?l1 ?l2

(= (length (append ?l1 ?l2))

(plus (length ?l1) (length ?l2)))))

Proposition length-of-append defined.

We can do this by induction on the first list. We will see later that this inductive proof can be donecompletely automatically in Athena (in a fraction of a second), but for now we will give the proofourselves. We present an informal proof first:Proof: We will use induction. For the basis step—covering the empty list case—we need to show

(forall ?l2

(= (length (append nil ?l2))

(plus (length nil) (length ?l2))))

Pick any list l2. We need to show (= L R), where the “left-hand side” L is the term

(length (append nil l2))

and the right-hand side R is (plus (length nil) (length l2)). From append-axiom-2, L is equalto (length l2); while R is equal to (length l2) by virtue of length-axiom-1 and plus-axiom-1.We thus have the desired identity (= L R).

Consider now the inductive step. Here we need to show that for any list of the form (cons h t)(i.e., for arbitrary h and t), we have

50

(forall ?l2

(= (length (append (cons h t) ?l2))

(plus (length (cons h t)) (length ?l2))))

assuming the inductive hypothesis

(forall ?l2

(= (length (append t ?l2))

(plus (length t) (length ?l2))))

Accordingly, pick any l2 and let L and R be the left- and right-hand sides of the desired equality,namely,

L = (length (append (cons h t) l2))

andR = (plus (length (cons h t)) (length l2))

We will derive (= L R) by reducing L and R to a common term. We have:

L = (length (cons h (append t l2))) (by append-axiom-2)= (succ (length (append t l2))) (by length-axiom-2)= (succ (plus (length t) (length l2))) (by the inductive hypothesis)= (plus (succ (length t)) (length l2)) (by plus-axiom-2)

whileR = (plus (succ (length t)) (length l2))

by length-axiom-2. Therefore, (= L R) and the proof is complete.

Sometimes it is convenient—and more modular—to separate the basis and inductive steps intotwo different pieces of reasoning, each with its own name, which can later be invoked independently,instead of “inlining” both steps within the inductive proof. Here is the Athena proof of the basis step:

(define basis-step

(pick-any l2

(dlet ((left (length (append nil l2)))

(right (plus (length nil)

(length l2)))

(goal (= left right))

(P1 (!derive (= left (length l2)) [append-axiom-1]))

(P2 (!derive (= right (length l2)) [length-axiom-1 plus-axiom-1])))

(!derive goal [P1 P2]))))

Observe again that the structure of this proof as well as the granularity of the reasoning steps aresimilar to those of the corresponding informal proof.

The proof of the inductive step illustrates a common and interesting pattern of equality reasoning.In informal proofs we often derive an equality s = t by displaying a list of n > 1 identities andjustifications in the following format:

s = t1 (from premises1)s = t2 (from premises2)s = t3 (from premises3)

...s = tn (from premisesn)

51

where tn = t and each premisesi is a list of propositions known to hold (“in the assumption base”)at the time of the proof. The idea here is that the first identity s = t1 follows from premises1, whilepremisesi+1 entail the equality ti = ti+1, for i = 1, . . . , n−1, so that s = ti+1 follows from the previousequation s = ti and transitity. For brevity, sometimes s is omitted from the left side of the equationsafter the first identity:

s = t1 (from premises1)= t2 (from premises2)= t3 (from premises3)...= tn (from premisesn)

We can capture this form of reasoning in Athena with a unary method id-chain (“identity chain”)which takes as input a non-empty list of 2n arguments

[(= s t1) premises1(= s t2) premises2

...(= s tn) premisesn]

where n > 0 and proceeds to perform the reasoning described above, resulting, if successful, in theequality (= s tn). Note that id-chain is not an Athena primitive. But it can be readily coded as auser-defined method (we present its definition at the end of this section). This should give the readera taste of the power and versatility of methods. Using id-chain, we can express the proof of theinductive step as a method that is parameterized over h and t (the arbitrary head and tail of the listbeing inducted on):

(define (inductive-step h t)

(pick-any l2

(dlet ((left (length (append (cons h t) l2)))

(right (plus (length (cons h t))

(length l2)))

(goal (= left right))

(ind-hyp (forall ?l2

(= (length (append t ?l2))

(plus (length t) (length ?l2)))))

(common-form (plus (succ (length t)) (length l2)))

(P ((= left common-form) BY

(!id-chain [(= left (length (cons h (append t l2)))) [append-axiom-2]

(= left (succ (length (append t l2)))) [length-axiom-2]

(= left (succ (plus (length t) (length l2)))) [ind-hyp]

(= left (plus (succ (length t)) (length l2))) [plus-axiom-2]])))

(Q (!derive (= right common-form) [length-axiom-2])))

(!derive goal [P Q]))))

We can now put together the entire inductive proof of length-of-append as follows:

>(by-induction length-of-append

(nil (!claim basis-step))

52

((cons h t) (!inductive-step h t)))

Theorem: (forall ?l1:NatList

(forall ?l2:NatList


(plus (length ?l1)

(length ?l2)))))

One might argue that even this proof is too low-level, and that given how “obvious” the result weare trying to prove is, more detail ought to be supressed. By increasing the level of automation, wecan skip more steps. First, as we did with the natural numbers example, let us define the propertywe are trying to prove in a more modular way, as a procedure that takes any term denoting a list ofnatural numbers and builds the appropriate proposition:

>(define (length-of-append-property l1)

(forall ?l2

(= (length (append l1 ?l2))

(plus (length l1) (length ?l2)))))

Procedure length-of-append-property defined.

We are able to establish both the basis and the inductive step automatically, expressing the entireinduction as follows:

>(by-induction (forall ?l (length-of-append-property ?l))

(nil (!prove (length-of-append-property nil)))

((cons x rest) (!prove (length-of-append-property (cons x rest)))))


(forall ?l2:NatList


(plus (length ?l1)

(length ?l2)))))

As before, we can abstract this proof into a generic method for performing induction on lists ofnatural numbers as follows:

>(define (nat-list-induction property)

(by-induction (forall ?l (property ?l))

(nil (!prove (property nil)))

((cons x rest) (!prove (property (cons x rest))))))

Method nat-list-induction defined.

We can now prove our original goal with a single line:

>(!nat-list-induction length-of-append-property)


(forall ?l2:NatList


(plus (length ?l1)

(length ?l2)))))

We will see later that an even greater degree of reusability can be achieved with polymorphism.

53

4.3.2 Trees

We can define binary branching structures with natural numbers at the leaves as follows:

>(datatype NatBTree

(leaf Nat)

(node NatBTree NatBTree))

New datatype NatBTree defined.

For instance, the tree r

JJ

JJr1

JJ

JJ2 0

is represented by the term

(node (node (leaf (succ (succ zero)))

(leaf zero))

(leaf (succ zero)))

As the reader might be able to guess at this point, reasoning by induction on such trees is donewith proofs of the form

(by-induction G((leaf I) D1)

((node I1 I2) D2))

Here G is the goal we seek to establish, i.e., a proposition of the form

(forall t:NatBTree P)

asserting that every binary tree of natural numbers has some property P . As before, the basis stepwill deal with the irreflexive constructor leaf, while the inductive step will deal with the reflexiveconstructor node. Specifically, the basis step, performed by D1, must establish the conclusion P [t 7→(leaf n)] for a fresh variable n; in other words, it must show that an arbitrary tree of the from(leaf n) has the desired property P .

The inductive step must establish the proposition P [t 7→ (node t1 t2)] on the assumptions P [t 7→t1] and P [t 7→ t2] (the inductive hypotheses), where t1 and t2 are fresh. In other words, it mustshow that an arbitrary tree of the form (node t1 t2) has the desired property P , assuming that thesubtrees t1 and t2 have it. Symbolically:

β `D1[I 7→ n] ; P [t 7→ (leaf n)] β ∪ {P [t 7→ t1], P [t 7→ t2]} `D2[Ij 7→ tj ] ; P [t 7→ (node t1 t2)]

β ` (by-induction G ((leaf I) D1) ((node I1 I2 D2)) ; G

where n, t1, t2 are fresh variables, j ∈ {1, 2}

As an example, suppose we define a function that “reflects” a binary tree:

54

>(declare reflect (-> (NatBTree) NatBTree))

New symbol reflect declared.

>(define reflect-axiom-1

(forall ?n

(= (reflect (leaf ?n))

(leaf ?n))))

Proposition reflect-axiom-1 defined.

>(define reflect-axiom-2

(forall ?t1 ?t2

(= (reflect (node ?t1 ?t2))

(node (reflect ?t2) (reflect ?t1)))))

Proposition reflect-axiom-2 defined.

>(assert reflect-axiom-1 reflect-axiom-2)


We can think of reflect as a generalized two-dimensional version of list reversal. For instance,the result of applying reflect to the tree depicted graphically earlier would be:r

JJ

JJr0 2

JJ

JJ

1

We will prove the following by induction:

(forall ?t (= (reflect (reflect ?t)) ?t))

We again begin with an informal proof.Proof: For the basis step we need to derive the identity

(= (reflect (reflect (leaf n)))

(leaf n))

for arbitrary n. This follows directly from (two applications of) reflect-axiom-1.For the inductive step we need to show

(= (reflect (reflect (node t1 t2)))

(node t1 t2))

for arbitrary t1 and t2, assuming the inductive hypotheses

(= (reflect (reflect t1)) t1)

and(= (reflect (reflect t2)) t2)

55

Let L and R denote the left and right sides of the desired equation:

L = (reflect (reflect (node t1 t2)))

R = (node t1 t2)

We show (= L R) by reducing both sides to a common term. We have:

L = (reflect (node (reflect t2) (reflect t1))) (by reflect-axiom-2)L = (node (reflect (reflect t1)) (reflect (reflect t2))) (by reflect-axiom-2)L = (node t1 t2) (by the inductive hypotheses)

This completes the induction.

We express the Athena proof as follows:

>(define (reflect-property t)

(= (reflect (reflect t)) t))

Procedure reflect defined.

>(by-induction (forall ?t (reflect-property ?t))

((leaf n) (!derive (reflect-property (leaf n)) [reflect-axiom-1]))

((node t1 t2) (dlet ((L (reflect (reflect (node t1 t2))))

(R (node t1 t2)))

((= L R) BY

(!id-chain [(= L (reflect (node (reflect t2) (reflect t1))))

[reflect-axiom-2]

(= L (node (reflect (reflect t1)) (reflect (reflect t2))))

[reflect-axiom-2]

(= L (node t1 t2))

[(reflect-property t1) (reflect-property t2)]])))))

Theorem: (forall ?t:NatBTree

(= (reflect (reflect ?t))

?t))

A more automatic proof is also possible:

>(by-induction (forall ?t (reflect-property ?t))

((leaf n) (!prove (reflect-property (leaf n))))

((node t1 t2) (!prove (reflect-property (node t1 t2)))))

Theorem: (forall ?t:NatBTree

(= (reflect (reflect ?t))

?t))

4.3.3 General case

In the general case, a by-induction proof is of the form

(by-induction (forall x P) (π1 D1) · · · (πn Dn))

56

where the variable x in P ranges over a datatype T , and each πi is a datatype pattern. Such a patternis either a constant constructor of T (such as zero or nil); or else it is of the form (c ρ1 · · · ρk) wherec is a constructor of T with arity k and each ρi is either an identifier I or another pattern.

To evaluate such a deduction in a given assumption base β, we go through the pairs (πi Di)sequentially, i = 1, . . . , n. Specifically, let Ii

1, . . . , Iimi

be the identifiers that occur in the patternπi. Fresh variables vi

1, . . . , vimi

are generated, and by consistently replacing every occurrence of Iij

inside πi by vij (for j = 1, . . . ,mi) we obtain a term ti of sort T . Appropriate inductive hypotheses

P [x 7→ vij ] are constructed for every fresh variable vi

j whose sort in ti is T , and then the proofDi[Ii

1 7→ vi1, . . . , I

imi

7→ vimi

] is evaluated in the assumption base β augmented with the inductivehypotheses. If Di produces the expected conclusion P [x 7→ ti] then we proceed with the next pattern,otherwise an appropriate error is generated. If all pattern-deduction pairs are successfully processedin this fashion, we ultimately produce the conclusion (forall x P).

Note that the given patterns must exhaust the datatype T ; that is, any given constructor groundterm3 of sort T must be an instance of—i.e., it must “match”—some pattern. Thus the naturalnumber patterns in the following inductive proofs are acceptable:

(by-induction goal

(zero ...)

((succ zero) ...)

((succ (succ n)) ...))

because every natural number matches one of the three patterns. But a proof of the form

(by-induction goal

(zero ...)

((succ zero) ...)

((succ (succ (succ n))) ...))

would be rejected. Athena would complain that the given patterns do not exhaust Nat, and wouldproduce (succ (succ zero)) as a counter-example.

3A “constructor ground term” is a ground term built exclusively from constructors.

57

Exercise 4.1 Give an Athena proof of the associativity of append:

(forall ?l1 ?l2 ?l3

(= (append ?l1 (append ?l2 ?l3))

(append (append ?l1 ?l2) ?l3)))

First see if this can be proven completely automatically. Then give an Athena natural deduction proofof it at the same level of abstraction as you would in an informal English proof.

Exercise 4.2 Declare and define a list-reversal function as follows:

(declare reverse (-> (NatList) NatList))

(define reverse-axiom-1

(= (reverse Nil) Nil))

(define reverse-axiom-2

(= (reverse (cons ?x ?rest))

(append (reverse ?rest) (cons ?x nil))))

Prove the following: (forall ?l (= (reverse (reverse ?l)) ?l)).

Exercise 4.3 The preceding reversal function is a naive way to reverse a list: it takes quadratic timein the length of the given list. Consider the following reversal algorithm:

(declare linear-rev (-> (NatList NatList) NatList))

(define linear-rev-axiom-1

(forall ?l2

(= (linear-rev nil ?l2) ?l2)))

(define linear-rev-axiom-2

(forall ?x ?rest ?l2

(= (linear-rev (cons ?x ?rest) ?l2)

(linear-rev ?rest (cons ?x ?l2)))))

(declare efficient-reverse (-> (NatList) NatList))

(define efficient-reverse-axiom

(forall ?l

(= (efficient-reverse ?l)

(linear-rev ?l nil))))

linear-rev is tail-recursive and takes linear time in the length of the first list argument. Accord-ingly, efficient-reverse is a linear-time algorithm. But first it would be nice to make sure thatefficient-reverse actually does what it is supposed to do—reverse a list. Prove the following:

(forall ?l

(= (efficient-reverse ?l)

(reverse ?l)))

58

Chapter 5

Defining new methods

5.1 Proof abstraction

Many Athena proofs can be naturally abstracted and turned into proof algorithms called methods,which can then be applied to arbitrary arguments as if they were Athena primitives. As a simpleexample, consider the proof of (if A A), where A is a Boolean constant:

>(declare (A B C) Boolean)



New symbol C declared.

>(assume A

(!claim A))

Theorem: (if A A)

Here A happens to be a Boolean constant, but that is immaterial; we could replace A by any propositionP and get a perfectly good deduction. Accordingly, we can abstract this proof into a generic proofmethod, let us call it M for simplicity, that will take an arbitrary proposition P and prove the conditional(if P P):

>(define M

(method (P)

(assume P

(!claim P))))

Method M defined.

We say that P is a parameter of M (in this case the only parameter, since M is unary), while thededuction (assume P (!claim P)) is the body of M. We can now use M as if it were a primitive method:

>(!M false)

Theorem: (if false false)

59

>(!M (and A A))

Theorem: (if (and A A)

(and A A))

Now let us write a binary method that takes any two propositions P and Q and derives the condi-tional (if (and P Q) (and Q P)). We call this method conj-comm (for “conjunction commutativity”):

>(define conj-comm

(method (P Q)

(assume (and P Q)

(dseq (P BY (!left-and (and P Q)))

(Q BY (!right-and (and P Q)))

((and Q P) BY (!both Q P))))))

Method conj-comm defined.

>(!conj-comm A B)


(and B A))

>(!conj-comm A A)

Theorem: (if (and A A)

(and A A))

>(!conj-comm A [])

Error, top level, 3.21: Wrong kind of value given as second argument

to propositional constructor and---a proposition was expected, but

the given argument was a list: [].

It is instructive to analyze this error message. When conj-comm is applied to two actual argumentsv1 and v2, Athena proceeds to evaluate the body of conj-comm with the actual arguments v1 and v2

substituted in place of the parameters P and Q, respectively.1 So when we apply conj-comm to A andthe empty list [], Athena attempts to evaluate the body of conj-comm with A and [] substituted inplace of P and Q, namely:

(assume (and A [])

(dseq (A BY (!left-and (and A [])))

([] BY (!right-and (and A [])))

((and [] A) BY (!both [] A))))

This deduction is ill-formed, e.g., the hypothesis (and A []) is nonsense (a type error), with a listappearing where a proposition should be. In a statically typed language such an error would bedetected during the definition of conj-comm, but Athena is dynamically typed, which means that theerror can only surface at runtime, during the application of conj-comm to actual arguments. Observealso that the offending argument [] in the expression (and A []) appeared in line 3, position 21 in

1This is similar to the process of “β-substitution” in the λ-calculus.

60

the input stream of the earlier definition of conj-comm, which is why the error message refers to thatparticular location.

5.2 Beyond parametric abstraction: pattern matching

Both of the methods in the previous section were obtained verbatim from concrete proofs simply bysubstituting parameters in place of concrete propositions. This process is called parametric absraction,and is the simplest way to define methods. In many cases, however, we need to define methods thatwork harder; most methods need to at least analyze the form of their inputs and then choose anappropriate course of action. This is accomplished using pattern matching.

Consider, for instance, the following specification for a method:

Write a unary method curry that takes a premise of the form (P ∧Q)⇒R and derivesthe conclusion P ⇒ (Q⇒R).

(Terminological note: we say that an input of a method M is a “premise” if that input is a propositionthat can be assumed to hold—i.e., to be in the assumption base—whenever M is called. So in thiscase the above specification means that every time curry is called with an actual argument, thatargument will be (a) a proposition of the form (P ∧Q)⇒R, and (b) this proposition will be in theassumption base.) In more traditional graphical notation, we want curry2 to perform the followingtype of inference:

` (P1 ∧ P2)⇒P3

` P1 ⇒ (P2 ⇒P3)

The following Athena method does that:

(define curry

(method (premise)

(dmatch premise

((if (and P Q) R) (assume P

(assume Q

(dseq ((and P Q) BY (!both P Q))

(R BY (!mp premise (and P Q))))))))))

Here curry uses the pattern-matching syntax form dmatch to analyze the form of the input premise,and to decompose it into its constituents. The general form of dmatch is

(dmatch V(π1 D1)

...(πn Dn))

where π1, . . . , πn are patterns and D1, . . . , Dn are deductions. The value V is the “discriminant”that we try to match against the given patterns. We will cover pattern matching in detail later on(Section 7.3), but some informal remarks will be useful at this point.

2Note: the name curry derives from identifying the connectives ∧ and ⇒ with the type constructors × and →,respectively. In that light, the currying rule is viewed as transforming a functional type of the form (T1 × T2)→ T3 tothe “curried” T1→ T2→ T3.

61

To evaluate a dmatch of the above form in an assumption base β and a lexical environment ρ,3

we start testing the discriminant V sequentially against the patterns πi, in the order i = 1, . . . , n, todetermine if it matches any of them. When we encounter the first pattern πj that successfully matchesthe discriminant under some substitution ρ′ = [I1 7→ V1, . . . , Ik 7→ Vk], we evaluate the correspondingdeduction Dj in the context of β and ρ augmented with ρ′; the result of that evaluation becomes theresult of the entire dmatch.

An attempt to match a value V against a pattern π will either result in failure, indicating that πdoes not correctly capture the structure of V ; or else it will produce a matching lexical environmentρ′ = [I1 7→ V1, . . . , Ik 7→ Vk], signifying that V successfully matches π under the bindings Ij 7→ Vj ,j = 1, . . . , k. In the latter case we say that V matches π under ρ′.

The underscore character _ is the “star operator” constant that matches any given value whatso-ever.

Suppose, for example, that the discriminant is the proposition (or true (not false)). This propo-sition:

• matches the pattern (or P1 P2) under [P1 7→ true, P2 7→ (not false)];

• matches P under [P 7→ (or true (not false))];

• matches (or true (not Q)) under [Q 7→ false]

• matches all three patterns _, (or _ _), and (or _ (not _)) under the empty environment [].

To make this concrete, suppose that the proposition (if (and A B) C) is in the assumption base,and we want to apply the method curry to it in order to derive (if A (if B C)). The application(!curry (if (and A B) C)) will spawn the evaluation of the body of curry in the context premise 7→(if (and A B) C). This leads us to the dmatch inside the body of curry, with (if (and A B) C) asthe discriminant. The first—and only—pattern (if (and P Q) R) successfully matches the discrim-inant under [P 7→ A, Q 7→ B, R 7→ C]; we therefore proceed to evaluate the corresponding hypotheticaldeduction in the properly augmented environment, which will ultimately yield the desired conclusion(if A (if B C)):

>(assert (if (and A B) C))

The proposition

(if (and A B)

C)


>(!curry (if (and A B) C))

Theorem: (if A

(if B C))

3A lexical environment is a mapping from identifiers to values. Athena maintains a global lexical environmentthat holds all the definitions and declarations made by the user. E.g., when a user enters a directive such as(define P (or true false)), the global lexical environment is extended with the binding P 7→ (or true false).

62

5.3 More on pattern matching

Propositions are not the only values on which we can perform pattern matching. We can also writepatterns for terms, for lists, and indeed for any combination thereof. Consider, for instance, thefollowing patterns:

1. t

2. (mother t)

3. (mother (father person))

4. (father _)

The term (mother (father ann))—denoting Ann’s paternal grandmother—matches the first patternunder [t 7→ (mother (father ann))]; it matches the second pattern under [t 7→ (father ann)]; itmatches the third pattern under [person 7→ ann]; and it does not match the fourth pattern.

Term variables (such as ?x) are themselves Athena values, and hence can become bound to patternvariables. For example, the term (union ?s1 ?s2) matches the pattern (union left right) under thebindings left 7→ ?s1, right 7→ ?s2. Any occurrence of a term variable inside a pattern acts as aconstant—it can only be matched by that particular variable. For example, the pattern (succ ?n)

will only match exactly one value: the term (succ ?n).Quantified propositions can also be straightforwardly decomposed. Consider, for instance, the

pattern (forall x P). The proposition (forall ?p (male (father ?p))) will match this pattern underthe bindings x 7→ ?p, P 7→ (male (father ?p)). The proposition

(forall ?x

(exists ?y (and (<= ?x ?y)

(not (= ?x ?y)))))

will also match, with x 7→ ?x, P 7→ (exists ?y (and (<= ?x ?y) (not (= ?x ?y)))).Any identifier inside a pattern that is not a function symbol or a propositional constructor (such as

if, forall, etc.) is interpreted as a pattern variable, and can become bound to a value during patternmatching. For instance, assuming that person has not been declared as a function symbol, then theterm (father ann) will match the pattern (father person) under [person 7→ ann]. But in the pattern(mother joe), the identifier joe, being a function symbol, is not a pattern variable and hence cannotbecome bound to any values. The only value that will match the pattern (mother joe) is the term(mother joe). So it is crucial to distinguish function symbols from regular identifiers inside patterns.

Athena supports non-linear patterns, where multiple occurrences of the same pattern variable areconstrained to refer to the same value. For instance, the pattern (or P P) matches the proposition(or true true) under [P 7→ true]; but it does not match the proposition (or true false). Likewise,the pattern (union s s) matches the terms (union null null) and

(union (intersection ?x ?y) (intersection ?x ?y))

but does not match (union ?foo null).Lists are usually taken apart with two main types of patterns: patterns of the form (list-of π1 π2),

and those of the form [π1 · · ·πn] (where the πi are arbitrary subpatterns). The first type of pat-tern matches any non-empty list [V1 · · ·Vk], k ≥ 1, such that V1 matches π1 and the tail [V2 · · ·Vk]matches π2. For example, the three-element list [zero ann (father peter)] matches the pattern(list-of head tail) under the bindings head 7→ zero, tail 7→ [ann (father peter)]. The second kind

63

of pattern, [π1 · · ·πn], matches those n-element lists [V1 · · ·Vn] such that Vi matches πi, i = 1, . . . , n.For instance, the list [ann peter] matches the pattern [s t] under s 7→ ann, t 7→ peter; but it doesnot match the pattern [s s]—a non-linear pattern that is only matched by two-element lists withidentical first and second elements.

We have only scratched the surface of Athena’s pattern-matching capabilities here. The subjectis covered in detail in Section 7.3.

5.4 Additional examples

Let us write a method to implement modus tollens. Modus tollens is an inference rule that allows usto go from two premises of the form P ⇒Q and ¬Q to the conclusion ¬P :

` P ⇒Q ` ¬Q

` ¬P

We can implement this in Athena as follows:

>(define modus-tollens

(method (premise1 premise2)

(dmatch [premise1 premise2]

([(if P Q) (not Q)] ((not P) BY

(suppose-absurd P

(dseq (Q BY (!mp premise1 P))

(false BY (!absurd Q (not Q))))))))))

Method modus-tollens defined.

>(assert (if A B) (not B))

The proposition

(if A B)


The proposition

(not B)


>(!modus-tollens (if A B) (not B))

Theorem: (not A)

Note that the evaluation of a dmatch will result in an error if the discriminant does not match any ofthe given patterns. E.g., in the case of modus-tollens:

>(!modus-tollens (or A B) B)

Error, top level, 3.6: dmatch failed---the list [(or A B) B]

did not match any of the given patterns.

When defining a method, it is not necessary to explicitly use the keyword method. An alternativeway to define modus-tollens, for instance, is this:

64

>(define (modus-tollens premise1 premise2)


([(if P Q) (not Q)] ((not P) BY

(suppose-absurd P

(dseq (Q BY (!mp premise1 P))

(false BY (!absurd Q (not Q))))))))))

Method modus-tollens defined.

In general, any definition of the form (define M (method (I1 · · · In) D)) can also be expressed as(define (M I1 · · · In) D).

A more succinct definition of modus-tollens can be given as follows:

(define (modus-tollens premise1 premise2)


([(if P Q) (not Q)] (suppose-absurd P

(!absurd (!mp premise1 P) premise2)))))

(To understand why this works, recall the assumption-base semantics of nested method calls fromSection 2.7.2.) There is a tradeoff between conciseness and readability. Which of the two versionsof modus-tollens is more appropriate depends on the context. As a general guideline, “throw-away”methods of little importance can be written in any style, but readability and clarity should be aprimary concern for all other methods (or proofs for that matter).

Next, we define a method uncurry that proceeds in the reverse direction from curry:

` P ⇒ (Q⇒R)` (P ∧Q)⇒R

>(define (uncurry premise)

(dmatch premise

((if P (if Q R)) (assume (and P Q)

(R BY (dseq (P BY (!left-and (and P Q)))

((if Q R) BY (!mp premise P))

(Q BY (!right-and (and P Q)))

(R BY (!mp (if Q R) Q))))))))

Method uncurry defined.

>(assert (if A (if B C)))

The proposition

(if A

(if B C))


>(!uncurry (if A (if B C)))


C)

And an alternative, more succinct definition:

65

>(define (uncurry premise)

(dmatch premise

((if P (if Q R)) (assume (and P Q)

(!mp (!mp premise (!left-and (and P Q)))

(!right-and (and P Q)))))))

Method uncurry defined.

We now give a proof of the conditional

(if (not (exists ?s (empty ?s)))

(forall ?s (not (empty ?s))))

In words, if there are no empty sets then every set is non-empty. This is a form of inference known asquantifier negation, whereby a negation sign preceding a quantifier is moved inward after “flipping”the quantifier (in this case, from existential to universal).

(assume-let (hypothesis (not (exists ?s (empty ?s))))

(pick-any set

(suppose-absurd (empty set)

(dlet ((conclusion (!egen (exists ?s (empty ?s)) set)))

(!absurd conclusion hypothesis)))))

It should be obvious that there is nothing special here about the empty predicate or about sets;the same reasoning could be performed with any unary predicate P over an arbitrary universe ofdiscourse. So we are naturally led to abstract this proof into a generic method that takes a unaryrelation symbol P as its argument and performs the above deduction (note that there are variousother ways of abstracting this proof into a method; we will present some alternatives later on):

>(define (qn-1 P)

(assume-let (hypothesis (not (exists ?x (P ?x))))

(pick-any object

(suppose-absurd (P object)

(dlet ((conclusion (!egen (exists ?x (P ?x)) object)))

(!absurd conclusion hypothesis))))))

Method qn-1 defined.

We can now apply this method to different symbols to obtain the corresponding theorems:

>(!qn1 empty)

Theorem: (if (not (exists ?x:Set

(empty ?x)))

(forall ?object:Set

(not (empty ?object))))

>(!qn-1 female)

Theorem: (if (not (exists ?x:Person

(female ?x)))

(forall ?object:Person

(not (female ?object))))

66

As another example, consider again the reasoning we used to infer that the relation younger isasymmetric, given that it is irreflexive and transitive (Section 3.2). It is again evident that there isnothing special about the fact that we are dealing with persons and their ages. The same reasoningcould be applied to any given binary relation on an arbitrary domain: as long as the relation happensto be irreflexive and transitive, we can prove that it is asymmetric using the exact same reasoningwe used for younger. We abstract this reasoning into a generic method that takes a binary relationsymbol R as input:

>(define (prove-asymmetric R)

(pick-any x y

(assume (R x y)

(suppose-absurd (R y x)

(dseq (!derive (R x x) [(R x y) (R y x) (transitive R)])

(!derive (not (R x x)) [(irreflexive R)])

(!absurd (R x x) (not (R x x))))))))

Method prove-asymmetric defined.

This method uses a unary procedure irreflexive that takes an arbitrary binary relation symbol R

and builds a proposition stating that R is irreflexive. We can define irreflexive as follows:

>(define (irreflexive R)

(forall ?x

(not (R ?x ?x))))

Procedure irreflexive defined.

We stress that irreflexive is a procedure, not a method. That is, it does not prove anything; itsimply constructs a proposition. That proposition may or may not be a logical consequence of thecurrent assumption base. For instance:

>(irreflexive <=)

Proposition: (forall ?x:Rational

(not (<= ?x ?x)))

>(irreflexive younger)

Proposition: (forall ?x:Person

(not (younger ?x ?x)))

We will have much more to say about the distinction between procedures and methods in the sequel.The prove-asymmetric method also uses a procedure transitive that takes a binary relation symbolR and builds a proposition stating that R is transitive. We can define transitive as follows:

>(define (transitive R)

(forall ?x ?y ?z

(if (and (R ?x ?y)

(R ?y ?z))

(R ?x ?z))))

Procedure transitive defined.

This completes the definition of prove-asymmetric. We can now apply this method to a particularsymbol to get the appropriate result:

67

>(assert (irreflexive younger) (transitive younger))

The proposition

(forall ?x:Person

(not (younger ?x ?x)))


The proposition

(forall ?x:Person

(forall ?y:Person

(forall ?z:Person


(younger ?y ?z))

(younger ?x ?z)))))


>(!prove-asymmetric younger)


(forall ?y:Person

(if (younger ?x ?y)


5.5 Recursive methods

Methods can perform arbitrary computation at runtime: they can do pattern matching and condi-tional branching, they can call other methods and procedures, and they can iterate for as long asnecessary—they can even go on indefinitely, without ever terminating. Recursive methods that iter-ate by repeatedly calling themselves are very useful and common in Athena. We will present a fewexamples in this section; much more involved examples will appear later.

Consider a unary method dn* that performs iterative double negation, taking as input a premiseof the form ¬ · · · ¬P with an arbitrary number n ≥ 0 of negation signs at the front, and shavesoff as many pairs of negation signs as possible by repeatedly applying the primitive method dn (seeSection 2.1 for a review of dn’s semantics). We can implement dn* as follows:

>(define (dn* premise)

(dmatch premise

((not (not _)) (!dn* (!dn premise)))

(_ (!claim premise))))

Method dn* defined.

The input proposition to dn* must be a premise, i.e., it must be in the assumption base at the timeof invocation. If the input premise is prefixed by at least two negation signs, i.e., if it is of the form(not (not Q)), then dn* applies dn to it and passes the result (namely, Q) to a recursive application ofdn*. Observe that because of the semantics of nested method calls, the recursive application of dn* willtake place in an assumption base that contains Q in addition to the original premise (not (not Q));this is crucial in order to maintain the aforementioned invariant that the input to dn* must always bein the assumption base. By contrast, if the input premise is not of the form (not (not Q)), then dn*

simply claims it, which will presumably succeed owing to the foregoing invariant. Some examples:

68

Method call Corresponding assumption base

(!dn* (not (not (not A)))))) {(not (not (not A)))}↓

(!dn (not (not (not A)))))) {(not (not (not A)))}↓

(!dn* (not A)))))) {(not A), (not (not (not A)))}↓

(!claim (not A)) {(not A), (not (not (not A)))}↓

Result: (not A)

Figure 5.1: Method call chain and corresponding assumption bases spawned by a call to dn*.



>(define P1 (not (not A)))

Proposition P1 defined.

>(define P2 (not (not (not P1))))


>(define P3 (or true false))


>(assert P1 P2 P3)

... output of assert ommitted ...

>(!dn* P1)

Theorem: A

>(!dn* P2)

Theorem: (not A)

>(!dn* P3)

Theorem: (or true false)

Figure 5.5 depicts, from top to bottom, the successive method calls spawned by the application

69

(!dn* (not (not (not A)))), as well as the respective assumption bases in which the calls occur.We assume that the initial method call takes place in the assumption base that contains only theproposition (not (not (not A))).

As another example, recall that Athena uses the primitive method uspec to perform universalspecialization: If P is a premise of the form (forall x Q), then (!uspec P t) derives the body Q withevery free occurrence of x safely replaced by t. If we are dealing with a premise that has multipleuniversal quantifiers at the front, we may need to issue several nested calls to uspec, which can beinconvenient. For instance, if P is the proposition

(forall ?x ?y ?z

(if (and (<= ?x ?y)

(<= ?y ?z))

(<= ?x ?z)))

then in order to specialize this with some terms t1, t2, and t3 we need to write

(!uspec (!uspec (!uspec P t1) t2) t3) (5.1)

We can alleviate this by writing a generalized version of uspec that takes a premise of the form∀ x1 · · · ∀ xn . P , n ≥ 0, and a list of n terms [t1 · · · tn], and derives the body P with every freeoccurrence of xi replaced by ti:

>(define (uspec* premise terms)

(dmatch terms

([] (!claim premise))

((list-of first rest) (!uspec* (!uspec premise first) rest))))

Method uspec* defined.

Using uspec*, we can perform the specialization (5.1) simply by writing (!uspec* P [t1 t2 t3]).

70

Chapter 6

Polymorphism

6.1 More on domains and function symbols

A domain can be polymorphic. As an example, consider sets over an arbitrary universe, call it T:

(domain (Set-Of T)

The syntax for introducing polymorphic domains is the same as in the monomorphic case, exceptthat now the domain name is flanked by an opening parenthesis to its left and a list of identifiersto its right, followed by a closing parenthesis; e.g., (Set-Of T), or in general, (dom-name I1...In).1

The identifiers I1...In serve as sort parameters, or sort variables, indicating that dom-name is a sortconstructor that takes n sorts S1, ..., Sn as arguments and produces a new sort as a result:

(dom-name S1 · · ·Sn)

For instance, Set-Of is a unary sort constructor that can be applied to an arbitrary sort, say thedomain Real, to produce the sort (Set-Of Nat) (intuitively, the domain of all sets of real numbers).

For the sake of uniformity we will regard monomorphic domains and datatypes (such as Ide andNat) as nullary sort constructors.

We are now in a position to define Athena sorts precisely. Suppose we are given a collection ofsort constructors SC, with each S ∈ SC having a unique arity n ≥ 0, and a collection of objects SV,which are to serve as sort variables. Assuming that SC and SV are disjoint, we define a sort over SCand SV as follows:

• Every sort variable in SV is a sort over SC and SV.

• Every nullary sort constructor S ∈ SC is a sort over SC and SV.

• If S1, . . . , Sn are sorts over SC and SV, n > 0, and S ∈ SC is a sort constructor of arity n, then(S S1 · · ·Sn) is a sort over SC and SV.

• Nothing else is a sort over SC and SV.

For example, suppose that1The identifiers I1, ..., I−n are not required to be distinct. This is in contrast to the sort parameters of polymorphic

datatype definitions and function symbol declarations; see footnotes 8 and 10.

71

SC = {Ide, Boolean, Set-Of}and SV = {S, T}, where Ide and Boolean are nullary sort constructors, while Set-Of is unary. Thenthe following are all sorts over SC and SV:

1. Boolean

2. Ide

3. (Set-Of Boolean)

4. S

5. (Set-Of T)

6. (Set-Of (Set-Of Ide))

A ground sort is one that contains no sort variables. All of the above examples are ground except4 and 5.

We note that, as we have defined them here, sorts are Herbrand terms, and hence all of the usualconcepts related to such terms are directly applicable to sorts. Considering sorts over SC={Ide,Boolean, Set-Of} and SV={S, T, W}, for example, we might say that (Set-Of Ide) matches (or isan instance of) (Set-Of T), under the substitution {T 7→ Ide}. Furthermore, for our purposes, twoterms that are literal variants of each other will be considered identical. For example,

(Pair-Of S (Pair-Of T (Set-Of S)))

and

(Pair-Of W (Pair-Of S (Set-Of W)))

should be understood as one and the same. (The basic theory of Herbrand terms, including matching,renamings, etc., is covered in Appendix ??.)

Thus we see that a polymorphic domain is not really a domain at all, but rather a domain template.Substituting ground sorts for the sort parameters gives rise to a specific domain, which is an instanceof the said template. Accordingly, we may view a polymorphic domain as the collection of all itsground instances.

We can now say that the universes of discourse of a given Athena session are the ground sortsthat can be built from the various sort constructors introduced in that session. For example, if weintroduced the polymorphic domain Set-Of, then given that Ide is a built-in domain and Boolean isa built-in datatype, our universes would include: y

Ide

Boolean

(Set-Of Ide)

(Set-Of Boolean)

(Set-Of (Set-Of Ide))

(Set-Of (Set-Of Boolean))

(Set-Of (Set-Of (Set-Of Ide)))

and so on. Of course the presence of polymorphic sort constructors such as Set-Of will usually giverise to infinitely many ground sorts. Most times only a very small subset of these will be of interest.

72

6.2 Polymorphic function symbols

Polymorphic domains pave the way for polymorphic function symbols. The syntax for declaring apolymorphic function symbol is the same as in the monomorphic case, except that the necessary sortvariables must appear listed within parentheses before the reserved symbol ->. Thus the general formis

(declare fun-symbol ((I1 · · · In) -> (S1 · · ·Sn) S)) (6.1)

where I1, . . . , In are the identifiers that will serve as sort variables,2 and, as before, Si is the sort ofthe ith argument and S is the sort of the result. Presumably, some of these sorts will involve the sortvariables. Examples:

(declare member ((T) -> (T (Set-Of T)) Boolean))

(declare id ((T) -> (T) T))

(declare insert ((T) -> (T (Set-Of T)) (Set-Of T)))

(declare = ((T) -> (T T) Boolean))

The first declaration introduces a polymorphic membership predicate that takes an element ofarbitrary domain T along with a set over T and “returns” either true or false. The second andfourth declarations introduce symbols for a polymorphic identity function and a polymorphic equalitypredicate, respectively. The latter is predeclared in Athena.

Polymorphic constants may be declared as well, e.g.,

(declare empty-set ((T) (Set-Of T)))

A function symbol declaration of the form (6.1) is admissible iff fun-symbol is distinct from everyfunction symbol that is visible at the point of the declaration; and if every Si, as well as S, is a sortover SC and {I1, . . . , In}, where SC is the set of sort constructors that are visible at the point ofdeclaration.3

Intuitively, a polymorphic function symbol f should be thought of as a collection of monomorphicfunction symbols, each of which may be seen as “an instance” of f . The declaration of each of thoseinstances is obtainable from the declaration of f by replacing the sort variables by ground sorts. Forexample, the declaration of the symbol id above (polymorphic identity) might be seen as syntax sugarfor infinitely many monomorphic function symbol declarations:

(declare idIde (-> (Ide) Ide))

(declare idBoolean (-> (Boolean) Boolean))

(declare id(Set−Of−Ide) (-> ((Set-Of-Ide)) (Set-Of Ide)))

2{I1, . . . , In} are required to be distinct.3Notice that SC and {I1, . . . , In} might not be disjoint. Consider, for example,

(declare insert ((Nat) -> (Nat (Set-Of Nat)) (Set-Of Nat)))

In such a case the name Nat will be treated as a sort parameter, overriding (or “masking”) its role as a sort constant.This is, of course, similar to the lexical scoping of λ-bound parameters in programming languages.

73

...

and so on.The advantages of polymorphic function symbols are harnessed by Athena’s polymorphic proposi-

tions. For instance, given the preceding declarations, an Athena user could formulate a polymorphicproposition such as

(forall ?x (= (id ?x) ?x))

This polymorphic proposition can be seen as a collection of infinitely many monomorphic propositions:it makes an assertion about idBoolean (namely, that the value of idBoolean for any given Booleanvalue is that same value), and an assertion about id(Set-Of Boolean), and so on ad infinitum. Thisexpressive economy is the power of polymorphism: with a single statement we manage to makeinfinitely many assertions.

6.3 Polymorphic datatypes

Since datatypes are just special kinds of domains, they too can be polymorphic. Polymorphicdatatypes resemble polymorphic algebraic types found in languages such as Haskell, Miranda, SML,etc. Here are two examples for polymorphic lists and ordered pairs:

(structure (List-Of T)nil(cons T (List-Of T)))

(datatype (Pair-Of S T)(pair S T))

Note that the syntax is the same as for monomorphic datatypes, with two exceptions:

1. The datatype name is now flanked by an opening parenthesis to its left and a list of identifiers toits right, followed by a closing parenthesis; e.g., (List-Of T) or (Pair-Of S T), or, in general,(struc-name I1 · · · In). Just as for polymorphic domains, the identifiers I1, . . . , In

4 serve as sortparameters, indicating that struc-name is a sort constructor that takes n sorts S1, . . . , Sn asarguments and produces a new sort as a result, (struc-name S1 · · ·Sn). For instance, List-Ofis a unary sort constructor that can be applied to an arbitrary sort, say the domain Ide, toproduce the sort (List-Of Ide); while Pair-Of is a binary type constructor and can thus beapplied to any two sorts to produce a new one, e.g.,

(Pair-Of Boolean (List-Of Ide))

2. The profile of a constructor can be more complex: it can be an arbitrary sort over all the sortconstructors that are visible at that point and the sort parameters I1, . . . , In. For instance, thesort of the second argument of cons is declared to be (List-Of T), a sort obtained by applyingthe sort constructor List-Of to the sort variable T.

4The identifiers I1, . . . , In are required to be distinct.

74

In full generality, the form of a datatype definition is given by the following grammar:

datatype-def ::= (datatype datatype-clause)

datatype-clause ::= datatype-profile con-profile+

datatype-profile ::= datatype-name | (datatype-name I+)

con-profile ::= con-name | (con-name S+)

where con-name and datatype-name are plain identifiers, as is I; N+ (for any non-terminal N) standsfor a non-empty sequence of occurrences of N ; and S designates an arbitrary sort.5

The notion of a reflexive constructor remains unchanged from the monomorphic case: if an ar-gument of a constructor is of a sort that involves the name of the datatype, then the constructor isreflexive; otherwise it is irreflexive. Moreover, we say that a datatype is inductive iff it has at leastone irreflexive constructor.

Finally we can state with precision exactly when a datatype definition is acceptable. We say thata definition of a datatype D is admissible if

1. the name D is distinct from every domain and datatype name visible at the point where D isdefined;

2. the constructors of D are distinct from one another, as well as from every function symbolintroduced prior to the definition of D;

3. every argument sort of every constructor of D is a sort over SC ∪ {D} and {I1, . . . , In}, whereSC is the set of sort constructors that are visible at the point of D’s definition and I1, . . . , In

are the sort parameters of D (if any); and finally,

4. D is inductive.

Just as monomorphic datatypes can be viewed as syntax sugar for monomorphic domain andfunction symbol declarations along with the postulation of the appropriate (monomorphic) free gen-eration axioms and induction principle, polymorphic datatypes can be thought of as syntax sugar forpolymorphic domain and function symbol declarations along with the postulation of the appropriate(polymorphic) free generation axioms. For instance, the definition of List-Of above can be seen assyntax sugar for

(domain (List-Of T))

(declare nil ((T) (List-Of T)))

(declare cons ((T) -> (T (List-Of T)) (List-Of T)))

(assert (forall ?head ?tail

(not (= nil (cons ?head ?tail)))))

(assert (forall ?head1 ?head2 ?tail1 ?tail2

(if (= (cons ?head1 ?tail1)

(cons ?head2 ?tail2))

(and (= ?head1 ?head2)

(= ?tail1 ?tail2)))))

5Thus S is generated by the grammar I | (I S+)

75

(assert (forall ?list

(or (= ?list nil)

(exists ?head ?tail

(= ?list (cons ?head ?tail))))))

The first axiom ensures that nil is different from all the values of cons; the second that cons isinjective; and the third that every list if either empty or the result of prepending an element tothe front of some other list. Since polymorphic propositions are essentially infinite collections ofmonomorphic propositions, these axioms make sure that each of the ground instances of List-Of isfreely generated in the regular, monomorphic sense discussed in Chapter 4.

When D is a polymorphic datatype of arity n and S1, . . . , Sn are ground sorts, then the sort(D S1 · · ·Sn) is inductively generated, and should be regarded as a monomorphic datatype. Thatdatatype’s definition can be retrieved from the definition of D by replacing the sort parametersI1, . . . , In by the sorts S1, . . . , Sn, respectively. E.g., (Pair-Of Ide Boolean) might be seen as amonomorphic datatype PairIde×Boolean with one binary constructor

pairIde×Boolean

whose first argument is Ide and whose second argument is Boolean, namely, as the datatype

(structure PairIde×Boolean

(pairIde×Boolean Ide Boolean)

Likewise, (List-Of Ide) should be understood as a monomorphic datatype ListIde definable as

(structure ListIdenilIde(consIde Ide ListIde)

Accordingly, just as with other polymorphic domains, we may view a polymorphic datatype asthe collection of all its ground instances. As the foregoing examples suggest, with each differentinstantiation of the “template,” the polymorphic constructors of the datatype have to be appropriatelyindexed as well, so that, for example, (Pair-Of Ide Boolean) and (Pair-Of Nat Nat), being twodistinct datatypes, have two distinct constructors, namely

pairIde×Boolean

and

pairNat×Nat

6.4 Predefined sorts

Certain sorts are predefined in Athena. These are:

1. the datatype Boolean, discussed earlier; and

2. the domain Ide of meta-identifiers.

76

There are infinitely many constants pre-declared for Ide. Those are all of the form ’I where Iis a regular Athena identifier, e.g., ’foo, ’x, ’233, ’*, ’sd8838jd)@!, etc. These are called meta-identifiers. They are used to represent identifiers when one models the abstract syntax of some objectlanguage in Athena. As a brief example, consider the untyped λ-calculus, where an expression iseither a variable (identifier), or an abstraction, or an application. This grammar can be formalizedas an Athena datatype:

(structure Exp(var Ide)(lambda Ide Exp)(app Exp Exp))

Then the term (lambda ’x (var ’x)) would represent the identity function. Having abstract syntaxidentifiers as a built-in domain with which one can perform both computation and deduction (e.g.we can quantify over elements of Ide) allows Athena to offer support for some of the issues that arisewhen one studies languages with binding operators, such as alphabetic conversion and equivalence.

77

Chapter 7

Athena as a programming language

In this chapter we will discuss the computational aspects of Athena, i.e., the use of Athena forgeneral-purpose programming. Much of what we will say here also pertains to the deductive part ofthe language.

As in most functional languages, computation in Athena may be depicted as follows. On one sidewe have a class of expressions, such as (+ 2 foo), which are the syntactically well-formed strings ofthe language; and on the other side we have a class of abstract values, such as the integers or finitesequences thereof. Computation amounts to mapping expressions to values. This process is calledevaluation, since it produces “the value” of a given expression.1 We speak of the value of an expressionbecause, barring non-determinism,the evaluation process can assign at most one value to any givenexpression. And we say “at most” because evaluation might not always succeed in yielding a value: itmight fail to terminate (by getting into an infinite loop), or it might generate an error(such as divisionby zero). We will write E and V for the sets of expressions and values, respectively; and we will usethe letters E and V as typical elements of these two sets.

The evaluation of an expression always takes place in the context of a lexical environment, whichmaps identifiers to denotable values. Athena supports the imperative paradigm, and so its denotablevalues include updateable memory locations, or “cells.” For semantic purposes we assume that thereis an infinite sequence of such locations, each of which is either empty or contains some value V (thecontents of the cell). In such a language, evaluation needs a second parameter in addition to a lexicalenvironment: it needs a store, which is something that captures “the state of the world” at a givenpoint in time, the world here being the computer memory. There are many ways to formalize stores;the key requirement is that they must be able to tell us the contents of any given memory location.A simple model is one that represents stores as functions that map any location to its contents (or to“empty”, signifying an unassigned cell). That is how we will understand stores from here on.

Thus in such languages one evaluates an expression relative to a given environment ρ and a givenstore σ. (In Athena there is a third semantic abstraction that is necessary for evaluation, namely,an assumption base β. But for our present purposes we can safely ignore that dimension.) Becauseexpressions might have side effects, the evaluation process does not only result in a value V but alsoin a store σ′, which might or might not be the same as the original σ, depending on what “destructiveassignments,” if any, were performed in the course of the evaluation. Therefore, mathematically,we might think of evaluation as a partially computable function that takes an expression E, anenvironment ρ, and a store σ; and either produces a value V and a store σ′, or else it results in error

1Or, more accurately, some finite, printable description of that value.

78

E ::= I Identifiers| C Characters| S Strings| () Unit| (var I) Term variables, abbreviated ?I

| (meta-id I) Meta-identifiers, abbreviated ’I

| (check (E1 E′1) · · · (En E′

n)) Conditional expressions| (lambda (I∗) E) Lambda expressions| (E E∗) Applications| [E∗] Lists| (method (I∗) D) Methods| (let ((I1 E1) · · · (In En)) E) Let expressions| (letrec ((I1 E1) · · · (In En)) E) Recursive let expressions| (match E (π1 E1) · · · (πn En)) Match expressions| (try E+) Try expressions| (cell E) Cells| (set!E1 E2) Assignments| (ref E) References| (seq E+) Sequences| (while E1 E2) Loops| (& E+) Short-circuit boolean “and”| (|| E+) Short-circuit boolean “or”

Figure 7.1: Syntax of Athena expressions

or non-termination. That is the view we will take in this chapter; later we will amplify it to accountfor assumption bases. A final point to keep in mind is that the evaluation of every expression alwaystakes place in the context of a set of declarations (of domains, datatypes, and function symbols) thatis lexically apparent (i.e., determined by the textual point where the expression appears). We will seethat this is important for pattern matching; it affects how a pattern is interpreted.

7.1 Expressions and values of Athena

Figure 7 displays the expressions of Athena, and may be taken as a definition of E . The variable π inthe match clause ranges over patterns. The definition of patterns is given in Figure 7.2. The variableD in the method clause ranges over Athena deductions, which will be covered in a later chapter. Thereader can disregard methods for the time being.

Figure 7.1 displays the kinds of values that make up the computational universe of Athena, andmay be taken as a description of V. Some brief remarks on each type of value follow.

1. The unit value is a single special object, distinct from all other values. It is used primarily asthe result of imperative expressions.

79

The unit value Characters Function valuesFunction symbols Substitutions MethodsTerms Propositional constructors Lists of valuesPropositions Quantifiers Cells

Figure 7.2: The values in the computational universe of Athena.

2. Function symbols are what we discussed in Section 1.3.2 and in Chapter 4: they are datatype(or structure) constructors, or symbols introduced via declare. This means that symbols suchas succ and false are actual values, possible results of computations.

3. Terms and propositions are as explained in the last chapter, though we will have more to sayabout them here.

4. Characters are individual ASCII symbols. The syntax of characters is covered in Section ??.

5. Substitutions are functions from term variables to terms that are the identity almost everywhere;they are discussed in a general setting in Appendix ??.

6. Propositional constructors are the five connectives that are used to build compound propositions:not, and, or, if, and iff. Since these are values, they too can be the results of computations.

7. There are also three quantifiers: forall, exists, and exists-unique. These too can be freelypassed around in computations.

8. By a “function value” we will mean a binary function that takes a sequence of values V1, . . . , Vn, n ≥0 and a store σ as arguments and returns a value and a store as a result. Function values arethus higher-order: some of the Vi might themselves be function values.

9. Methods were informally discussed in Chapter 5, and will be revisited later. The fact that theyare values means that they may be passed as arguments to functions or returned as results offunction calls.

10. Lists of values are just that: heterogeneous lists that may contain values of mixed types. Atypical list value V will be written as V = [V1 · · ·Vn], where the Vi are its member values(which might themselves be lists). Lists are the main data structures in Athena, besides terms.They are very useful because they can package up arbitrary amounts of data–even potentiallyinfinite amounts–and can be conveniently taken apart through pattern matching, as describedin Section 7.3.4.

11. Finally we have cells, which act as storage containers that can hold arbitrary values (possiblyother cells). For our purposes, a cell will be associated with a unique natural number l (thinkingof the computer memory as an infinite sequence: cell 0, cell 1, cell 2, cell 3, etc.).

These families of values are not quite pairwise disjoint; there is some overlap between (a) termsand function symbols, and (b) terms and propositions. Regarding the first, a constant symbol suchas zero or peter counts both as a symbol and as a term. And regarding (b), a term of Booleansort, such as false or (female ann), counts both as a term and as a proposition. We will see later

80

that these two overlaps not only present no problems in practice, but are in fact quite convenient.For a rigorous specification of the semantics we ought to give a precise description of what value Vand what store σ′, if any, will be produced when we evaluate an arbitrary expression E in a givenenvironment ρ and store σ. This is done in Sec. 1.5. But prior to that we will give a much moreinformal account in order to convey a sense of the big picture. To that end, our discussion in the nextfew sections will mostly leave the store out of the picture, since by and large Athena programmingdoes not make use of side effects. This means that we will be speaking as if identifiers denoted valuesdirectly, rather than memory locations. As we have already explained, this is not strictly correct; butit is unlikely to lead to any serious confusion and will make for a simpler exposition. The reader whodoes not feel a need for an informal explanation may proceed directly to the rigorous semantics ofSection 7.8.

7.2 Informal semantics

We consider each kind of expression that appears in Figure 7 with the exception of methods.

• Identifiers: The value of an identifier I is always the value associated with it in the currentlexical environment, provided that I is bound in that environment. It is an error if I is unbound.

• Unit: The value of the expression () is always the unit value, regardless of the given environmentand store.

• Term variables: The value of ?I is the term ?I.

• Meta-identifiers: The value of ’I is the (constant) term ’I, of the predefined sort Ide.

• Check expressions: To evaluate an expression of the form

(check (E1 E′1) · · · (En E′

n))

we evaluate each Ei in turn, looking for the value true (one of the two constructors of theBoolean datatype). When we find an Ei that returns true, we proceed to evaluate E′

i and theresult of E′

i becomes the result of the entire expression. It is an error if no Ei produces true,or—what amounts to the same thing—if there are no cases at all, i.e., if n = 0. It is also anerror if some Ei returns a value other than true or false. The keyword else may appear asEn, i.e., as the condition of the last clause. In that event, if no preceding Ei has yielded a truevalue, the expression E′

n is evaluated and its value is returned as the final result.

• Procedures: The value of an expression of the form

(lambda (I1 · · · In) E)

is a closure, namely, a record consisting of the “text” of the function (its parameters I1, . . . , In

and its body E) along with whatever environment ρ is in effect at the point where the lambdaexpression is defined. This closure unambiguously defines a procedure: later, when it is invokedwith actual values V1, . . . , Vn as arguments, the body E is evaluated relative to the saved envi-ronment ρ augmented with the bindings I1 7→ V1, . . . , In 7→ Vn. This mechanism gives us lexicalscoping: any free identifiers within the function body E take their meanings from the environ-ment in which the function was defined, not from the environment in which it is called. Inaddition, patterns inside the function body are interpreted relative to the declarations that were

81

visible at the point at which the function was defined. The evaluation of lambda expressionsalways terminates, since all we need to do is package up the required information (the function’sdefinition along with the current environment) into a closure. The parameters I1, . . . , In mustbe distinct, but there may be none of them, in which case we have a procedure of no arguments(sometimes called a thunk).

The underscore character _ may appear in the place of a parameter if we do not need to referto that parameter inside the body of the lambda. For instance, (lambda (_) true) denotes aprocedure that will always return true no matter what argument it is given.

• Applications: For convenience, Athena allows the use of function-application notation

(E E1 · · ·En)

for a number of purposes. Primarily, of course, it is used when E denotes a procedure, inwhich case the whole expression signifies the application of that procedure to the values of thearguments E1, ..., En. But the notation is also used, for example, for substitution application(recall that substitutions and procedures are two distinct types of values). E.g., if theta denotesa substitution and t is a term then (theta t) denotes the term obtained by applying thesubstitution theta to t. For instance, if theta is the empty (identity) substitution, and t isthe term (succ ?n), then (theta t) will result in the term (succ ?n). It is an error if theapplication of a substitution results in an ill-typed term.

So, unlike most functional languages, it is not the case that in every expression that looks likean application the leftmost component must necessarily denote a procedure. As we just saw, itmay be a substitution, which is an altogether different kind of value from a procedure. It canalso be a function symbol, or a propositional constructor, or a quantifier. We discuss each ofthese cases in turn:

– The value of E is a symbol f of arity n ≥ 0. In that case, if the value of each Ei is a termti, then the value of (E E1 · · ·En) is the term (f t1 · · · tn), provided that the said termis well-sorted. It is an error if the number n does not match the arity of the symbol f ;or if the value of some Ei is not a term; or if the term (f t1 · · · tn), where ti is the valueof Ei, is not well-sorted. Thus, for instance, the expression (succ ?n) will successfullyproduce the term (succ ?n), but (succ ?n ?m), (succ succ),and (succ peter) will allfail, the first because the wrong number of arguments is given to the symbol succ, thesecond because a non-term value is given as an argument to a function symbol, and thethird because of a sort error.

– The value of E is a propositional constructor ◦ (namely, one of not, and, or, if, or iff).In that case, if the number of the arguments n matches the arity of ◦ and the value ofeach Ei is a proposition Pi, then the value of (E E1 · · ·En) is the proposition obtained byjoining the various Pi with the connective ◦, provided that the result is well-sorted. Thevalue of an Ei might also be a term of sort Boolean, in which case it will be automati-cally coerced into a proposition. Accordingly, to evaluate (not true), we first evaluatetrue, which directly yields the term true. Because this is a term of sort Boolean, it iscoerced into a proposition, as required by not; and the resulting proposition (not true)is obtained. Likewise, evaluating (and (not true) (<= zero ?x)) results in the cor-responding proposition, since (not true), as we just saw, produces a proposition, and(<= zero ?x) produces a term of sort Boolean, which is hence coercible to a proposition,as required by and, which demands two propositions as arguments.

82

– The value of E is a quantifier q (namely, either forall, or exists, or exists-unique).In that case we must have n >= 2, i.e., there must be at least two argument expressions,E1 and E2. Further, the values of all the arguments except the last one (En) must bevariables ?I1 · · · ?In−1 and the value of En must be a proposition P (or a term of sortBoolean, which can be coerced into a proposition). If that is the case then the resultingvalue is the quantified proposition

(q ?I1 · · · ?In−1 P).

For instance, the expression (forall ?foo1 ?foo2 (<= ?foo1 ?foo2)) produces the propo-sition (forall ?foo1 ?foo2 (<= ?foo1 ?foo2)). It is an error if n < 2; or if the valueof one of E1, . . . , En−1 is not a variable; or if the value of En is not a proposition.

The final use of the notation (E E1 · · ·En), of course, is for procedure applications. For in-stance, the expression ((lambda (a) a) ?x) will apply the closure obtained by evaluating(function (a) a), as discussed earlier, to the argument ?x. In this case the result will be theterm ?x. We might have n = 0, in which case the value of E must be a function of no arguments.

• Let expressions: To obtain the value of an expression of the form

(let ((I1 E1) · · · (In En)) E)

we evaluate each Ej in turn, binding Ij to the result and continuing with Ej+i. Once we havedone this for j = 1, ..., n, we finally evaluate the body E in an environment in which every Ij

is bound to the value of Ej . (Note that this is similar to Scheme’s let*.) There might be nobindings at all, i.e., we might have n = 0, and in that case the value of the let is the value ofthe body E.

• Letrec expressions: Expressions of the form

(letrec ((I1 E1) · · · (In En)) E)

allow for mutually recursive bindings: an identifier Ij might occur free in any Ei, even for i ≤ j.The precise meaning of these expressions is spelled out in Section 7.8 via a desugaring in termsof let and set! expressions. For now an intuitive understanding will suffice. In most cases thevarious Ei are lambda expressions, but that does not have to be the case in general. Some typicaluses of letrec will appear in Section ??. As is the case with let expressions, it is possible tohave no bindings at all (n = 0), and in that event the value of the letrec is the value of thebody E.

• Try expressions: The try construct is a backtracking mechanism: To evaluate (try E1 E2),we first try to evaluate E1. If that successfully produces a value, then we return that valueas the result of the entire try expression. But if an error occurs during the evaluation of E1,we then proceed to evaluate E2. If E2 successfully yields a value, we return that value as theresult of the try; otherwise we raise an error to the effect that neither alternative succeeded. Atry with three alternatives is desugared into the two-element case, with (try E1 E2 E3) asan abbreviation for (try E1 (try E2 E3)); a try with four alternatives is desugared into atry with three alternatives, and so on. A try with one alternative, (try E), is legal, and isequivalent to E.

83

• Cells: To evaluate an expression of the form (cell E), we first evaluate E to obtain somevalue V , and then we create and return a new cell (store location) with V as its contents. qqq

• References: To evaluate an expression of the form (ref E), we first evaluate E. It is an errorif the value of E is not a cell. If it is a cell c, we return the value stored inside c as the result.

• Assignments: To evaluate an expression of the form (set! E1 E2) we first evaluate E1. It isan error if the value of E1 is not a cell. If it is a cell c, we proceed to obtain the value of E2, callit V2, and then we destructively change the contents of c to be V2, overwriting whatever mighthad been the previous contents of c. The set! expression itself returns the unit value.

• Sequences: To evaluate an expression of the form (seq E1 · · ·En), we evaluate each Ei inturn, i = 1, ..., n. The final result is the value of the last expression, En. The unit value isreturned if the expression sequence is empty (i.e., if n = 0).

• While loops: To evaluate a loop (while E1 E2), we first evaluate E1. If the result is true,we go on to evaluate E2. When that is finished, we re-evaluate E1. If it is still true, we evaluateE2 again. Afterwards we go back to evaluate E1, and so forth: we keep evaluating E2 as longas E1 returns true. If and when E1 returns something other than true, we discontinue theevaluation of E2. The unit value is then returned as the result of the entire while expression.It is an error if E1 returns something other than true or false.

• Match expressions: In an expression of the form

(match E (π1 E1) · · · (πn En))

the expression E is called the discriminant, and each pair (πi Ei) is a match clause. The firstthing we do is evaluate the discriminant E to obtain some value V . We then start matchingthe value V against each pattern πi. The patterns are tried in the order in which they appear,i = 1, ..., n. A successful match will result in a finite set of bindings between certain identifers(the pattern variables) and values, which constitutes a mini lexical environment. If no successfulmatch is found, an error occurs. Otherwise, let πj be the first pattern that successfully matchesthe value V , and let ρj be the resulting set of bindings. Then the expression Ej is evaluated inthe environment of the match expression augmented with ρj , and its value becomes the resultof the entire match expression. The precise definition of pattern matching (i.e., exactly when agiven value matches a given pattern, and what bindings are produced as a result) can be foundin Section 7.3.

• Short-circuit boolean operations: & and || perform the operations of “and” and “or” on thetwo-element set {true, false}. They are given as special forms rather than simple top-levelfunctions in order to allow for short-circuit evaluation.2 In particular, to evaluate (& E1 · · ·En),we first evaluate E1, to get a value V1. V1 must be a Boolean term—otherwise an error occurs.If it is false, then the result is false; if it is true, then we proceed to evaluate E2, to get avalue V2. Again, it is an error if V2 is not a Boolean term. Assuming that V2 is a Boolean term,we return false if V2 is false; otherwise, if V2 is true, we proceed with E3, and so on. If everyEi is true then we finally return true as the result. We evaluate (|| E1 · · ·E2) similarly, butwith the roles of true and false reversed.

2Because Athena is a strict (call-by-value) language, if, say & was just an ordinary binary function, then both of itsarguments would have to be evaluated before the operation could be carried out; likewise for ||.

84

When defining procedures at the top level, we can write

(define (f I1 · · · In) E)

as an abbreviation for (define f (letrec ((f (lambda (I1 · · · In) E))) f)). For instance, thetwo following definitions of the map list functional are equivalent:

(define map

(letrec ((map (lambda (f L)

(match L

([] [])

((list-of x rest) (add (f x) (map f rest)))))))

map))

(define (map f L)

(match

([] [])

((list-of x rest) (add (f x) (map f rest)))))

A number of mutually recursive procedures f1, . . . , fn, each with parameters I∗j and body Ej , canalso be defined using the following syntax sugar:

(define (f1 I∗1) E1 · · · (fn I∗n) En)

Finally, an additional piece of terminology: the eight identifiers not, and, or, if, iff, forall, exists,and exists-unique will be called logical tokens. We next discuss pattern matching in detail.

7.3 Pattern matching

7.3.1 General discussion

Athena has an expressive pattern language, shown in Figure 7.2, that makes it easy to take apartquite complicated structures. Disregarding parentheses and square brackets, a pattern is made up ofidentifiers that fall into three classes:

1. Keywords, such as list, some-quant, etc.

2. Function symbols (such as succ, true, etc.), or logical tokens (e.g., if, forall, etc.) These willbe called pattern constants.

3. Pattern variables.

An identifier counts as a pattern variable iff it is neither a keyword, nor a function symbol, nor alogical token. Pattern variables act as place holders: during the matching process they can becomebound to arbitrary values. By contrast, a pattern constant I—as the name implies—can only besuccessfully matched against the unique semantic object (symbol or logical token) that answers to thename I. Hence, for example, assuming all and only the declarations of Chapter 1 as given, the pattern(succ n) contains only one pattern variable: n. The other identifier, succ, is a function symbol, andhence a pattern constant. Of course had n been declared as a function symbol, then it would alsobe a pattern constant instead of a variable. We thus stress that whether or not an identifier inside apattern is interpreted as a pattern variable depends on what function symbols have been declared upto that point.

85

π ::= I

| x

| m

| C

| S

| ()

|| (val-of I)

| (bind I π)

| (list-of π1 π2)

| [π1 · · ·πn]

| (π1 · · ·πn)

| (some-var I)

| (some-prop-con I)

| (some-quant I)

| (some-term I)

| (some-atom I)

| (some-prop I)

| (some-list I)

| (some-cell I)

| (some-proc I)

| (some-method I)

| (some-symbol I)

| (some-sub I)

where C and S range over character and string constants,respectively, while x is a term variable and m is a meta-identifier:

x ::= (var I) | ?I

x ::= (meta-id I) | ’I

Figure 7.3: Syntax of Athena patterns

86

Pattern matching works as follows: a value V is matched against a pattern π, in the context of someenvironment ρ, and results either in failure (in which case we say that V did not “match” the pattern);or in a finite mapping B that assigns a value to each pattern variable in π in such a way that if weregard the pattern variables of π as having the values that B dictates then we might think of π and Vas one and the same thing. Suppose, for example, that V is the term (succ (succ zero)), and π isthe pattern (succ n). Here we have a successful match, resulting in the binding n 7→ (succ zero).As another example, let V be the two-element list whose first member is the empty substitution andwhose second element is the constant term zero; i.e., V = [ zero]; and let π be the pattern [x zero].Since zero is a symbol, this pattern has only one pattern variable, the identifier x. Matching V issuccessful and results in the binding x 7→ {}.

Multiple occurrences of the same pattern variable express an identity constraint: any matchingvalue must have identical parts in the corresponding positions.3 For instance, the pattern [x x]does not match just any two-element list, but only those two-element lists whose two members areidentical. This is a very expressive mechanism, but it can lead the matching process astray if twovalues which do not admit equality testing are matched up against the same pattern variable. Forexample, if we tried to match the value of the expression

[(lambda (y) y) (lambda (z) z)]

against the pattern [x x] we would get an error, the reason being that in order to decide whetherthere is a match here we need to know whether (lambda (y) y) and (lambda (z) z) have identicalvalues, a problem that is undecidable in general.

7.3.2 The wildcard pattern

The pattern _ matches any value in any environment. It is used when we do not care to decomposeor name a given value (or a part thereof).

7.3.3 Type distinguishers

A pattern of the form (some-T I) matches any value V of type T , binding I to V . It does not matchany other values. For instance:

>(define method-recognizer

(lambda (v)

(match v

((some-method M) M)

(_ (print "\nInput is not a method.\n")))))

Procedure method-recognizer defined.

>(method-recognizer left-and)

Method: left-and

>(method-recognizer [])

Input is not a method.

3A pattern in which some pattern variable occurs more than once is called non-linear.

87

Unit: ()

If we do not wist to name the value, we can use the underscore character instead of an identifier:(some-T _). For instance:

>(define method-recognizer

(lambda (v)

(match v

((some-method _) true)

(_ false))))

Procedure method-recognizer defined.

>(method-recognizer left-and)

Term: true

>(method-recognizer ())

Term: false

7.3.4 Matching compound values

Pattern matching is used mainly for decomposing compound values. There are three kinds of complexAthena values that can be decomposed with patterns: lists, terms, and propositions. We discuss eachin turn.

Matching lists

Lists of values can be decomposed with list patterns. A list pattern has one of the following forms:

1. (some-list I); or

2. (list-of π1 π2);

3. [π1 · · ·πn]; or

4. (split π1 π2); or

5. (bind I π), where π is a list pattern.

Each of these patterns captures certain forms (“shapes”) that a list of values might have. The mostinclusive of them is the first: any list of values V = [V1 · · ·Vn] will successfully match a pattern ofthe form (some-list I), resulting in the binding I 7→ V . Patterns of the second kind will matchany non-empty list whose first element (the head) matches the pattern π1 and whose tail matchesthe pattern π2. For instance, the list [true () ’foo] matches the pattern (list-of first rest),yielding the bindings first 7→ true and rest 7→ [() ’foo].

Next, a pattern of the form [π1 · · ·πn] is successfully matched only by a list value of the form[V1 · · ·Vn] such that Vi matches πi (respecting any possible constraints introduced by multiple occur-rences of the same pattern variable).4

4To be fully correct, we should say that the values V1, . . . , Vn sequentially match the patterns π1, . . . , πn. This isdefined precisely at the end of Section ??.

88

A list L = [V1 · · ·Vn] matches (split π1 π2) if there is some way to split L into two partsL1 and L2 such that the left part L1 matches π1 and the right part L2 matches π2. To find sucha decomposition, we start scanning L from left to right. That is, we first try L1 = [] and L2 =[V1 · · ·Vn] = L. If this decomposition works, meaning that L1 = [] matches π1 and L2 = L matchesπ2

5, we then return the corresponding set of bindings. If not, we try L1 = [V1] and L2 = [V2 · · ·Vn].If we get a match, we return the appropriate set of bindings, otherwise we try L1 = [V1 V2] andL2 = [V3 · · ·Vn]; and so on. If we reach the end of L (meaning L1 = L and L2 = []) without finding amatch, we fail.

Finally, a list of values [V1 · · ·Vn] matches a pattern of the form (bind I π) iff it matches π. Thedifference is that the identifier I becomes bound to V in any subsequent matching. This is a generalcharacteristic of bind patterns; see Section ??.

Matching terms

Patterns of the form (π π1 · · ·πn), n > 0, are called compound. Compound patterns are used formatching terms and propositions. Roughly, we look at terms and propositions as trees, and we matchthe root of the tree against the pattern π and the ith subtree against the pattern πi. For example, theterm (union null ?foo) matches the compound pattern (union s1 s2), resulting in the bindingss1 7→ null and s2 7→ ?foo. Assuming that f, s and t are pattern variables (i.e., that they are notfunction symbols), the pattern (f s t) is matched by any term with a binary function symbol at theroot. For instance, the term

(union null (intersection ?s1 ?s2))

matches (f s t), giving the bindings

f 7→ union, s 7→ null, t 7→ (intersection ?s1 ?s2)

Accordingly, the following procedure takes any term of the form (f t1 t2) as input and returns thelist [t2 t1 f ], consisting of the two subterms t1 and t2 in reverse order and the function symbol f :

>(define break-term

(lambda (t)

(match t

((f t1 t2) [t2 t1 f]))))

Procedure break-term defined.

>(break-term (plus ?n (succ zero)))

List: [(succ zero) ?n plus]

>(break-term (succ ?n))

Error, top level, 3.7: match failed---the term (succ ?n)


Note that the procedure break-term above works fine as long as we assume that the identifiersf, t1 and t2 have not been previously declared as function symbols. Otherwise, if, say, f had been

5More precisely, we should say that the values [], L sequentially match the patterns π1, π2.

89

introduced as a binary function symbol at some point prior to the definition of break-term, thenbreak-term will only match terms with f at the root—it will not match an arbitrary term with abinary function symbol at the root:

>(declare f (-> (Person Person) Person))

Function symbol f declared.

>(define break-term

(lambda (t)

(match t

((f t1 t2) [t2 t1 f]))))


>(break-term (f ann peter))

List: [peter ann f]


Error, top level, 3.7: match failed---the term (plus ?n (succ zero))


If a lot of function symbols have been declared, it becomes easy when writing patterns to forgetthat a certain identifier I was previously introduced as such a symbol. We then proceed to write thepattern under the mistaken understanding that I will serve as a pattern variable, when in fact it isa pattern constant. To avoid this and ensure that an identifier is used as a pattern variable, we canwrap it around the appropriate type distinguisher. For instance, the pattern

((some-symbol f) (some-term t1) (some-term t2))

will always match any given term with a binary function symbol at the top, even if f (or t1 or t2 forthat matter) was previously introduced as a function symbol:

>(declare f (-> (Person Person) Person))

Function symbol f declared.

>(define break-term

(lambda (t)

(match t

(((some-symbol f) (some-term t1) (some-term t2)) [t2 t1 f]))))


>(break-term (f ann peter))

List: [peter ann f]


List: [(succ zero) ?n plus]

90

The pattern ((some-symbol f) (some-term t) t) matches any term with a binary functionsymbol at the top and two identical subterms. For example:

>(match (union null null)

(((some-symbol f) (some-term t) t) ’ok))

Term: ’ok

>(match (plus (succ zero) (succ zero))


Term: ’ok

>(match (union ?s null)


Error, top level, 1.2: match failed---the term (union ?s null)


Sometimes we wish to match terms whose top function symbols have arbitrary arities whichcannot be known in advance. In such cases it is useful to be able to deconstruct a term into twoparts, the symbol at the root and a list of its children in left-to-right order. Such destructuring can beachieved with compound patterns of the form (π1 π2), where π2 is a list pattern. A term of the form(f t1 · · · tn) is matched against such a pattern by matching the top function symbol f against thepattern π1 and the list [t1 · · · tn] against the pattern π2. Thus, for example, the term (union ?s null)matches the pattern (union (some-list args)) under the binding args 7→ [?s null]. It alsomatches the pattern

((some-symbol f) (list-of (some-term t) rest))

under the bindings f 7→ union, t 7→ ?s, and rest 7→ [null].

Matching propositions

Propositions are matched against compound patterns in a similar tree-like manner. Atomic propo-sitions are matched as if they were terms. Propositions of the form (◦ P1 · · ·Pn), for some proposi-tional constructor ◦, are matched against a compound pattern (π π1 · · ·πn) by sequentially matchingthe values ◦, P1, . . . , Pn against the patterns π, π1, . . . , πn. So, for example, to match the propo-sition (or (not false) (not false)) against the pattern (or P P), we first successfully matchthe propositional constructor or against the pattern constant or, then the proposition (not false)against the pattern variable P, successfully obtaining the binding P 7→ (not false), and then again(not false) against P, which is consistent with the binding obtained at the previous step.

An arbitrary proposition of the form (◦ P1 · · ·Pn) with ◦ ∈ {not, and, or, if, iff} can also bematched by a pattern of the form ((some-prop-con I) π) for some list pattern π. Here I will bebound to ◦ and π will be matched against the list [P1 · · ·Pn]. E.g.:

>(define (break-prop P)

(match P

(((some-prop-con pc) (some-list props)) [pc props])))

Procedure break-prop defined.

91

>(break-prop (not true))

List: [not [true]]

>(break-prop (and true false true))

List: [and [true false true]]

>(break-prop (or false))

List: [or [false]]

>(break-prop (if false true))

List: [if [false true]]

>(break-prop (iff true true))

List: [iff [true true]]

>(break-prop true)

Error, top level, 2.5: match failed---the term true


>(break-prop (forall ?x (= ?x ?x)))

Error, top level, 2.5: match failed---the proposition (forall ?x (= ?x ?x))


The same ideas apply to quantified propositions. A proposition of the form (q x P) is matchedagainst a compound pattern (π π1 π2) by sequentially matching the values q, x, P against the pat-terns π, π1, π2. Consider, for instance, the pattern

(forall v (and P1 P2))

This pattern can be matched by any universally quantified proposition whose body is a conjunction.The pattern

((some-quant q) v1 (q v2 B))

is matched by any proposition with two occurrences of the same quantifier at the front. For example,

>(define (foo P)

(match P

(((some-quant q) v1 (q v2 B)) [q v1 v2 B])))

Procedure foo defined.

>(foo (exists ?x (exists ?y (= ?x ?y))))

List: [exists ?x ?y (= ?x ?y)]

92

>(foo (forall ?x (exists ?y (= ?x ?y))))

Error, top level, 2.4: match failed---the proposition (forall ?x (exists ?y (= ?x ?y)))


It is also possible to match propositions with arbitrarily long blocks of quantifiers at the front, orat any other point inside the proposition. For instance, a proposition of the form

∀ x1 · · ·xn . P

can be captured by the pattern

(forall (some-list vars) body)

resulting in the bindings vars 7→ [x1 · · ·xn] and body 7→ P . The number n could be zero, in whichcase vars would be bound to the empty list. Likewise, the pattern

(forall [v1 v2 v3] (exists (list-of v more) B))

would match any proposition of the form

∀ x1 x2 x3 . ∃ y1 · · · yk . P

for arbitrary k > 0, with the bindings

v1 7→ x1

v2 7→ x2

v3 7→ x3

v 7→ y1

more 7→ [y2 · · · yk]B 7→ P

The slightly different pattern

(forall [vl v2 v3] (exists (list v2 more) B))

additionally constrains the first existentially quantified variable to be identical to the second univer-sally quantified variable.

Even greater generality is afforded by a pattern such as

((some-quant q) (some-list vars) body)

which is matched by any proposition with any non-empty block of quantifiers at the front, i.e., anyproposition of the form

q x1 · · ·xn . B

for n > 0.In general, patterns of the form (π1 π2 π3), where

93

1. π1 is a quantifier pattern, namely, either one of the three pattern constants forall, exists,and exists-unique, or a pattern of the form (some-quant I); and

2. π2 is a list pattern

are used to capture propositions with blocks of quantifiers at the front. A proposition P matches sucha pattern iff it can be written in the form

q x1 · · ·xn . B

with n ≥ 0 and B not of the form q y1 · · · yk . B′ for any k > 0 (observe that every proposition canbe written in this form),where

• the quantifier q matches the pattern π1;

• some maximal prefix [x1, . . . , xi] of [x1, . . . , xn], i ≤ n, matches the pattern π2; and

• the proposition q xi+1 · · ·xn . B matches the pattern π3.

These three matchings take place sequentially, and if they are successful they might result in bindingsthat could affect subsequent matchings.

In the second of the above clauses, the phrase “maximal prefix” means that there must not exist alarger prefix [x1, . . . , xj ], j > i, j ≤ n such that [x1, . . . , xj ] matches π2. This qualification is necessaryfor disambiguating certain cases in an intuitively appealing way. For instance, consider the proposition(forall ?x ?y (= ?x ?y)) and the pattern

(forall (some-list vars) B)

Here three alternative pairs of bindings suggest themselves:

1. vars 7→ [] B 7→ (forall ?x ?y (= ?x ?y))

2. vars 7→ [?x] B 7→ (forall ?y (= ?x ?y))

3. vars 7→ [?x ?y] B 7→ (= ?x ?y)

The last of these seems to be the most intuitive, and is the one that will be chosen by virtue of themaximal-prefix restriction. If we explicitly wanted one of the other sets of bindings, say the secondalternative, then we could have used the pattern (forall [v] B) or

(forall (list-of v more) B)

which would have given us, respectively, the bindings

v 7→ ?x B 7→ (forall ?y (= ?x ?y))

v 7→ ?x more 7→ [?y] B 7→ (= ?x ?y)

Other patterns

A pattern of the form ?I—or equivalently, (var I)—matches that exact same variable term, ?I, andnothing else. E.g., the only value that matches the pattern ?foo is the term ?foo. Likewise, a patternof the form ’I—or equivalently, (meta-id I)—matches only that exact same meta-identifier, andnothing else. E.g., the only value that matches the pattern ’foo is the constant term ’foo.

The pattern () is matched only by the unit value.The pattern _ matches any value whatsoever, in any environment and at any position in any

pattern.A pattern of the form (bind I π) matches a value V iff V matches π; the only additional effect

is that I gets bound to V . For instance,

94

>(define (f x)

(match x

((list-of _ (bind rest (list-of _ _))) rest)))

Procedure f defined.

>(f [’a ’b])

List [’b]

A pattern of the form (val-of I) matches a value V iff, in the lexical environment in which thematching is taking place, the value of I is V . For example, the function foo below takes an argumentx and a list l and checks whether the first element of l is x:

>(define (foo x l)

(match l

((list-of (val-of x) _) true)

(_ false)))


>(foo ’a [’a ()])

Term: true

>(foo ’a [()])

Term: false

In combination with split patterns, val-of patterns can be quite powerful. For instance, considerthe list-membership problem: given a value v and a list L, return true or false according to whetheror not v is an element of L, respectively. Normally this would be encoded using a recursive searchprocedure. But by using slit and val-of patterns, it can be done non-recursively:

>(define (member? v L)

(match L

((split _ (list-of (val-of v) _)) true)

(_ false)))

Procedure member? defined.

>(member? () [’a () ’b])

Term: true

>(member? () [’a ’b ’c])

Term: false

Here we have relegated the necessary computation (search) for scanning the input list to the semanticsof split and val-of patterns.

A pattern of the form (some-var I) matches any variable term x, such as ?n, ?a23, etc., andresults in the binding I 7→ x; it does not match anything else. For instance,

95

>(define (f t)

(match t

((some-var v) [v])

(_ ())))

Procedure f defined.

>(f ?xyz)

List: [?xyz]

>(f ’foo)

Unit: ()

A pattern of the form (some-atom I) matches any atomic proposition (term of sort Boolean),and binds I to it. It does not match anything else. The remaining patterns of the form (some-T I)with T ranging over prop-con, quant, term, etc., are self-explanatory. For example, (some-sub s)matches any substitution value, and binds s to it. It does not match anything except substitutions.Likewise for the other patterns.

7.3.5 The pattern matching algorithm

In this section we give a precise algorithmic answer to the following question: given a value V , apattern π, and an environment ρ, does V match π in ρ? If so, what is the resulting set of bindings?The algorithm we give below answers these questions. The algorithm always terminates, either infailure, signifying that V does not match π, or with a finite mapping that assigns values to thepattern variables of π.6 The algorithm also takes such a mapping Bin as input. Initially, the top-levelcall will invoke the algorithm with the empty mapping Bin = ∅.

The algorithm proceed by a case analysis of the structure of π:

• Case 1: π is the wildcard pattern _. In that case return Bin unchanged.

• Case 2: π is a term variable (such as ?foo), or a meta-identifier (such as ’foo), or a character(such as ‘A), or a string (such as "Hello world."), or the unit value (). Then if V is thecorresponding value (?foo, ’foo, etc.), return Bin unchanged; otherwise fail.

• Case 3: π is an identifier I. Then if I is a pattern constant (i.e., a function symbol, propositionalconstructor, or quantifier), then the match succeeds iff V is the corresponding value, in whichcase we simply return Bin unchanged; else we fail.7 Otherwise I is a pattern variable, and couldmatch V provided that it has not already been committed. So we check to see whether I isbound in Bin. If it is not, then we simply return Bin[I 7→ V ]. Otherwise, we must ensure thatBin(I) and V are identical. If they are not, or if they do not admit equality testing, we fail; andif they are identical, we simply pass on Bin unchanged.

6We view such a mapping B as a finite function from identifiers to values. We thus write B(I) for the value that Bassigns to I (assuming that I is in the domain of B); we write B[I 7→ V ] to denote the extension of B with the mappingI 7→ V , i.e., the mapping that assigns V to I and agrees with B everywhere else; and we write ∅ for the empty mappingthat is undefined on all identifiers.

7Strictly speaking, we assume that in addition to the lexical environment, we also have a way of telling whether ornot a given identifier is a function symbol.

96

• Case 4: π is a list pattern of the form [π1 · · ·πn]. In that case, if V is a list of values [V1 · · ·Vn],we try to sequentially match V1, . . . , Vn against the patterns π1, . . . , πn (see the paragraph atthe end of this section about sequentially matching a number of values V1, . . . , Vn against npatterns π1, . . . , πn). Otherwise we fail.

• Case 5: π is a list pattern of the form (list-of π1 π2). In that case, if V is a non-empty listof values [V1 · · ·Vn], n > 0, we try to sequentially match the values V1, [V2, . . . , Vn] against thelist of patterns π1, π2. Otherwise we fail.

• Case 6: π is a list pattern of the form (split π1 π2). Then if V is not a list of values, we fail.Otherwise, we consider whether V can be expressed as the concatenation of two lists L1 and L2

such that (i) L1 is the least prefix of V that matches π1 under ρ and Bin, producing a set ofbindings B as output, and (ii) L2 matches π2 under ρ and B, producing a set of bindings B′ asoutput. If so, we return B′; if no such decomposition of V exists, we fail.

• Case 7: π is of the form (val-of I). We then check to see if I is bound in the environment ρ.If it is not, we fail. If it is bound to some value V ′, we check whether the two values V and V ′

are identical; if the values do not admit equality testing, or if they do but are not identical, wefail, else we return B.

• Case 8: π is of the form (π1 π2 · · ·πn+1), for n > 0. As we explained earlier, patterns of thisform are used for matching non-variable terms and propositions. Accordingly, we distinguishthe following cases:

– (i) V is a term t of the form (f t1 · · · tm), m ≥ 0 (we regard constant terms as being ofthis form, with m = 0). We then distinguish two subcases:

1. n = 1 and π2 is a list pattern. In that case we try to sequentially match the valuesf, [t1 · · · tm] against the patterns π1, π2.

2. Otherwise, if either n > 1 or π2 is not a list pattern, we try to sequentially match thevalues f, t1, . . . , tm against the patterns π1, π2, . . . , πn+1.

– (ii) V is a proposition P . We then distinguish two subcases, on the basis of the pattern:

1. If n = 2, π1 is a quantifier pattern,8 and π2 is a list pattern, we first express P in theform

q x1 · · ·xk . P ′ (7.1)

for some quantifier q, k ≥ 0, and P ′ not of the form q y1 · · · ym . P ′′, m > 0. Note thatthe identity of q might not be determined if k = 0, i.e., if P is not a quantified propo-sition. This might affect the first of the steps below. Essentially, if q is undeterminedthen it will match π1 if the latter is one of the three quantifier pattern constants, butnot if π1 is of the form (some-quant I), since that would necessitate a binding I 7→ q,which clearly cannot be carried out if q is undetermined. Specifically, after we expressP in the form (7.1), we do the following, in sequential order:∗ if k > 0 then match q against the pattern π1 in ρ and Bin, and let B be the output

of that matching. Otherwise, if π1 is of the form (some-quant I), fail; else letB = Bin and continue with the next step;

8I.e., one of the three pattern constants forall, exists, exists-unique; or a pattern of the form (some-quant I);or else a pattern of the form (bind I π), where π is a quantifier pattern.

97

∗ match the largest possible prefix [x1, . . . , xi], i ≤ k, against π2 in ρ and B, and letB′ be the resulting bindings;

∗ match the proposition q xi+1 · · ·xk . P ′ against π3 in ρ and B′.2. Otherwise:

∗ If P is an atomic proposition t, we match t as a term against π.∗ If P is a compound proposition of the form (◦ P1 · · ·Pn), for some propositional

constructor ◦ and n > 0, then: If n = 1 and π2 is a list pattern then we sequentiallymatch ◦, [P1 · · ·Pn] against the patterns π1, π2 in ρ and Bin; else we sequentiallymatch ◦, P1, . . . , Pn against the patterns π1, π2, . . . , πn+1 in ρ and Bin.

∗ Finally, if P is a quantified proposition of the form q x . P ′, we sequentially matchthe values q, x, P ′ against the patterns π1, . . . , πn+1.

– (iii) If V is neither a term nor a preposition, we fail.

• Case 9: π is of the form (bind I π′). In that case we match V against π′ in ρ and B, producinga mapping B′, and then return B′[I 7→ V ].

• Case 10: π is of the form (some-var I). In that case we check to see whether V is a variableterm (say, ?foo, or ?x395, etc.). If it is, we return Bin[I 7→ V ]; otherwise we fail. When π isof the form (some-atom I), we check to see whether V is an atomic proposition (i.e., a term ofsort Boolean), and if so, we return Bin[I 7→ V ], otherwise we fail. Likewise for the remainingtype distinguishers. For instance, for (prop-con I) we check whether V is one of the fivepropositional constructors (not, and, or, if, iff) and if so, we return Bin[I 7→ V ], else we fail.

Finally, to sequentially match a number of values V1, . . . , Vn against a number of patterns π1, . . . , πm

in a given environment ρ and mapping B, we do the following: First, if n 6= m, we fail. Otherwise,if n = 0, we return B. Else, if n = m and n > 0, we match V1 against π1 in ρ and B, resulting in amapping B′, and then proceed to sequentially match V2, . . . , Vn against π2, . . . , πn in ρ and B′.

7.4 Numbers in Athena

Athena does not have numbers (either integers or reals) built-in as primitive syntactic categories.However, both integer and real numerals can be introduced as constant terms, and then all the usualthings that can be done with numbers in a typical programming language can also be done in Athena.In fact, treating numerals as constant terms has some added advantages: they can be directly matchedor unified, they can appear inside propositions, they can be quantified over, etc.

Integers

Integer numerals are typically introduced as follows:

>(domain Number)

New domain Number introduced.

>(use-numerals (0 ...) Number)

Numerals in the range (0 ...) can now be used as terms of Number.

98

The name Number is of course arbitrary; any other name would do, provided it is not already used todenote another domain. The use-numerals directive is essentially equivalent to the following infinitesequence of symbol declarations:

>(declare 0 Number)

New symbol 0 declared.

>(declare 1 Number)


>(declare 2 Number)


.

.

.

It instructs Athena to treat every numeral in that range as a constant term of sort Number. Accordingly,if we tell Athena to print out the sort of, say 75, we get the following:

>(print (join "\n" (sort-of 75) "\n"))

Number

Unit: ()

There is no limit to the size of numerals one might enter:

>(exists ?n (= ?n 12198439883246723726329234823672390237237892223940404))

Proposition: (exists ?n:Number

(= ?n 12198439883246723726329234823672390237237892223940404))

The usual numeric operations of addition, multiplication, etc., can be introduced with the followingdirective:

>(define-numeric-operations)

The following binary procedures have been introduced:

plus, minus, times, div, mod, num-equal?, less?, and greater?.

These procedures can be applied to any numerals in the (0 ...) range.

This directive augments the top-level Athena environment with the listed procedures, which performnumeric operations on numerals (terms of sort Number) in the expected manner. For instance:

>(plus 10 47)

Term: 57

>(define (cube n) (times n (times n n)))

Function cube defined.

>(cube 10)

Term: 1000

>(cube "foo")

99

Error, top level, 1.34: Wrong kind of value given as first argument

to times---a term was expected, but the given argument was a list: [‘f,‘o,‘o].

>(less? 23 754)

Term: true

If one prefers different names for these operations, they can be renamed, since Athena is higher-order:

>(define + plus)

Function + defined.

>(+ 20 30)

Term: 50

These operations do not have infinite precision. Overflow or underflow will occur if the relevant sizebounds (which are left unspecified in the language definition) are exceeded:

>(define (fact n)

(check ((num-equal? n 0) 1)

(else (times n (fact (minus n 1))))))

Function fact defined.

>(fact 10)

Term: 3628800

>(fact 20)

Error, top level, 3.17: Call to times resulted in overflow.

As introduced by the foregoing directives, the subtraction procedure minus will be partial. Thatis, if its first argument is less than the second (in the customary ordering), then the numeral 0 willbe returned by default.

>(minus 3 1)

Term: 2

>(minus 3 200)

Term: 0

However, if we have already introduced a unary function symbol which we intend to denote the inverseoperation on the numeric domain (viewed as a group, with 0 acting as the identity element), then wecan instruct Athena to take this into account as illustrated below. First, suppose that we intend tomodel the inverse function on Number (i.e., unary minus) by the symbol ~:

>(domain Number)

New domain Number introduced.

>(declare ~ (-> (Number) Number))

New symbol ~ declared.

>(use-numerals (0 ...) Number)

100

Numerals in the range (0 ...) can now be used as terms of Number.

>(define-numeric-operations ~)



These procedures can be applied to any numeric constants, where a "numeric constant" is

either a numeral in the (0 ...) range, or a term of the form (~ t),

where t is a numeric constant.

By passing the symbol ~ as an “argument” to define-numeric-operations, we instruct Athena to use~ as an inverse operation, with 0 as the identity. We now have:

>(minus 2 5)

Term: (~ 3)

>(plus (~ 3) (~ 5))

Term: (~ 8)

>(times (~ 3)

(~ (~ (~ 4))))

Term: 12

>(plus 7 (~ 7))

Term: 0

Of course any other function symbol could be used in place of ~ so long as it was of the signature(-> (Number) Number). Athena will check to make sure that this is the case; it is important for sortchecking, which can affect logical soundness.

Note that if we were only interested in the positive integers, we could have issued the followingdeclarations:

>(domain PosNum)

New domain PosNum introduced.

>(use-numerals (1 ...) PosNum)

Numerals in the range (1 ...) can now be used as terms of PosNum.

We then have:

>23

Term: 23

>(print (join "\n" (sort-of 23) "\n"))

PosNum

Unit: ()

>0

Error, top level, 1.1: Unbound identifier: 0.

101

In this case, 0 is not recognized. The previous numeric operations can be introduced here as well witha directive of the form (define-numeric-operations). Here minus will return 1 as the least element bydefault. Since the positive integers do not have an identity element in the usual interpretation, if wetry to pass a unary function symbol of signature (-> (PosNum) PosNum) to define-numeric-operations,the latter will ignore it and will issue a relevant warning.

Reals

Real numerals can also be used in a similar manner:

>(domain Real)

New domain Real introduced.

>(use-numerals (0.0 ...) Real)

Numerals in the range (0.0 ...) can now be used as terms of Real.

>3.14278

Term: 3.14278

>.9548458495839434647483173947655648309302387352754445763252373293229857645210

Term: .9548458495839434647483173947655648309302387352754445763252373293229857645210

>23.

Term: 23.

>55

Term: 55

>(define-numeric-operations)


plus, minus, times, div, num-equal?, less?, and greater?.

These procedures can be applied to any numerals in the (0.0 ...) range.

>(plus 1.3 .7)

Term: 2.0

>(div 9.0 3)

Term: 3.0

Inverses and subtraction issues are handled as in the case of integers:

>(domain Real)

New domain Real introduced.

>(declare ~ (-> (Real) Real))


>(use-numerals (0.0 ...) Real)

102

Numerals in the range (0.0 ...) can now be used as terms of Real.



plus, minus, times, div, num-equal?, less?, and greater?.

These procedures can be applied to any numeric constants, where a "numeric constant" is

either a numeral in the (0.0 ...) range, or a term of the form (~ t),

where t is a numeric constant.

>(minus 2.0 5.0)

Term: (~ 3.0)

>(times (div 3.0 1) (~ 3.0))

Term: (~ 9.0)

We could also restrict attention to the positive reals with a directive of the form

(use-numerals (0.0 ...) Real).

This will affect inverse and subtraction issues just as it does in the case of integers.

7.5 Characters and strings

ASCII characters can be directly input into Athena in accordance with the following grammar (whereC ranges over characters):

C ::= ‘Printable (E.g., ‘A, ‘f, ‘@, etc.)| ‘\Code (E.g., ‘\3, ‘\112, etc.)| ‘\^Control-Character (E.g., ‘\^C)| ‘\Escape-Character (E.g., ‘\n, ‘\\, etc.)

Escape-Sequence ::= " | \ | a | b | n | r | f | t | vControl-Character ::= A | · · · | Z | @ | [ | ] | / | ^ | _ | ‘\blank

where:

1. Printable is any printable ASCII character other than the backslash \ (to wit, any characterwith ASCII code from 32 through 91 or 93 through 127, inclusive); and

2. Code is any sequence of one, two, or three digits, e.g., 9, 03, 124, etc..

Since any ASCII character with code c can be expressed as ‘\c, the other three ways of representingcharacters are strictly speaking redundant, but useful nevertheless. Escape characters have standardmeanings, e.g., \n is the newline character, \t is the tab, etc..

An Athena string is simply a list of characters. String constants can be directly input betweenquotes. A control character, i.e., one with ASCII code c ≤ 32, can be directly embedded inside astring constant by writing it as \c. Escape sequences can also be embedded inside a string constant.E.g.:

103

>(define s "\nHello world!\n")

List s defined.

>s

List: [‘\n ‘H ‘e ‘l ‘l ‘o ‘\blank ‘w ‘o ‘r ‘l ‘d ‘! ‘\n]

>(print s)

Hello world!

Unit: ()

Any procedure that operates on lists will automatically work on strings, e.g.:

>(define s (join "Green ideas"

" sleep "

"furiously.")

String s defined.

>(print s)

Green ideas sleep furiously.

Unit: ()

>(rev "abc")

List: [‘c ‘b ‘a]

We can also use list pattern matching to do string matching. For instance, the following procedurereturns true or false according to whether or not the input string contains the substring "begin":

>(define (foo s)

(match s

((split _ (split "begin" _)) true)

(_ false)))


>(foo "if x < 0 begin y := 2; z := a; end")

Term: true

7.6 Built-in procedures

7.6.1 Input and output facilities

The following top-level procedures perform rudimentary input/output:

104

• read is a nullary procedure that reads a single character from standard input; the character isreturned as the result of the procedure.

• write-val is a unary procedure that prints an arbitrary value to standard output. It returnsthe unit value.

• write is like write-val, but it also prints out the type of the value and a blank line aboveand below the value printout. Another difference is in the way strings are handled. Whereaswrite-val treats a string literally as a list of characters, write will print a string in conventionalformat, and will properly interpret any control characters embedded inside the string (tabs, newline characters, etc.). Note that when printing propositions, both write and write-val willprint out the sorts of quantified variables.

• print is a unary procedure that takes a string and writes it to standard output. Controlcharacters embedded inside the string are properly interpreted. The unit value is returned asthe result of the call.

• val->string is a unary procedure that takes an arbitrary value V and returns the printedrepresentation of V in the form of a string. Therefore, (print (val->string v)) has the sameeffect as write-val.9

• read-file is a unary procedure that takes the name of a file (represented as a string) andreturns a list of all the characters in the file, from first to last.

• write-file is a binary procedure that takes two strings s1 and s2. If a file by the name of s1

exists, then the string s2 is written into it (appended at the end, not overwriting any informationthat might already exist in s1). If not, a file by the name of s1 is created anew and s2 is writteninto it. The string s1 can be either an absolute path name (e.g., c:\\tmp\\foo.txt), or arelative name such as foo.txt.

• var->string is a unary procedure that converts a term variable to a string. E.g.,

(var->string ?x)

produces the string "x".

• symbol->string is a unary procedure that converts a function symbol to a string.

• id->string is a unary procedure that converts a meta-identifier to a string.

• string->var is a unary procedure that converts a string to a term variable. E.g.,

(string->var "foo")

produces ?foo.

• string->id is a unary procedure that converts a string to a meta-identifier.9With one small exception: As we already pointed out, calling write-val on a proposition will print out the sorts

of quantified variables; but val->string will not.

105

7.6.2 Terms

The following top-level procedures deal with terms:

• make-term is a binary procedure that takes a function symbol f and a list of terms [t1 · · · tn]and returns the term (f t1 · · · tn), provided that the said term is well-sorted; an error is reportedotherwise.

7.7 Examples

In this section we will present a number of examples illustrating the most important of the pointswe have made so far. A few of the examples use some of the top-level bindings shown in Figure ??.Those are explained in detail in Section ??. The reader may look them up there as the need arises.We begin by writing a negation function that maps true to false and vice versa:

>(define (negate b)

(match b

(false true)

(_ false)))

Procedure negate defined.

Note that this function will produce a result even for inputs outside the two-element set {true, false}.This is common in weakly typed programming languages. In our examples we will take care to specifyprecisely what values each procedure is supposed to recognize as meaningful inputs, and exactly whatresults should be produced for them. This will serve as the specification contract for the procedure.An implementation will then be judged correct iff it meets that specification, i.e., if given legitimateinput values it produces the appropriate output (or has the appropriate side effects), where “appropri-ate” means as advertised in the specification. What happens for input values outside of the specifiedinput range is irrelevant, since it was left unspecified. Accordingly, if we specify that a negationprodecure must simply map the term true to the term false and vice versa, then the proceduregiven above is a correct implementation. An alternative implementation is:

(define (negate b)

(match b

(true false)

(false true)

(_ (error "Incorrect input value given to negate."))))

This is equally correct with respect to the specification. Of course if we had explicitly specified that anerror must occur if the input value is neither true nor false, then the first implementation would beincorrect. (Note: The predefined procedure error takes a string, prints it out, and halts evaluation.)

Next we illustrate Athena’s built-in syntax forms for short-circuit boolean evaluation (note thatnull? returns true or false depending on whether the given list of values is empty):

>(& true (null? []) (negate false))

Term: true

>(& true true true false)

Term: false

106

>(define (loop) (loop))

Procedure loop defined.

>(& false (loop))

Term: false

>(|| false true (loop))

Term: true

Next we demonstrate some simple numeric procedures:

>(domain Integer)

New domain Integer introduced.

>(use-numerals (0 ...) Integer)

Numerals in the range (0 ...) can now be used as terms of Integer.

>(declare ~ (-> (Integer) Integer))





These procedures can be applied to any numeric constants, where

a "numeric constant" is either a numeral in the (0 ...) range,

or a term of the form (~ t), where t is a numeric constant.

We can now write a few simple procedures on integers:

(define (pred n)

(minus n 1))

(define (less-than-or-equal x y)

(|| (less? x y)

(equal? x y)))

(define (positive? x)

(greater? x 0))

(define (abs n)

(check ((less? n 0) (times n (~ 1)))

(else n)))

(define

107

(even? n)

(check ((equal? n 0) true)

(else (odd? (pred (abs n)))))

(odd? n)

(check ((equal? n 0) false)

(else (even? (pred (abs n))))))

And here we have two versions of factorial, a functional and an imperative one:

(define (fact n)

(check ((less? n 1) 1)

(else (times n (fact (pred n))))))

(define (fact-imper n)

(let ((result (cell 1))

(counter (cell n))

(_ (while (positive? (ref counter))

(seq (set! result (times (ref counter) (ref result)))

(set! counter (pred (ref counter)))))))

(ref result)))

Let us now write some standard procedures on lists, beginning by implementing head and tail.(The predefined binary procedure add “conses” an element at the front of a list; while join takes anarbitrary number of lists as arguments and concatenates them.)

(define (head L)

(match L

((list-of x _) x)))

(define (tail L)

(match L

([] [])

((list-of _ rest) rest)))

(define (map f L)

(match L

([] [])

((list-of x rest) (add (f x) (map f rest)))))

(define (filter pred L)

(match L

([] [])

((list-of x rest) (check ((pred x) (add x (filter pred rest)))

(else (filter pred rest))))))

(define (fold f L id)

(match lst

([] [])

([x] (f x id))

((list-of x (list-of y more))

(fold f (add (f x y) more) id))))

Here is a linear-time list reversal procedure:

108

(define (reverse L)

(letrec ((rev (lambda left right)

(match left

([] right)

((list-of x rest) (rev rest (add x right))))))

(rev L [])))

The following procedure flattens a list of lists:

(define (flatten L)

(match L

([] [])

((list-of L1 rest) (join L1 (flatten rest)))))

Note that procedures can be anonymous:

>((lambda (x) x) [])

List: []

And they are higher-order:

>(define (compose f g)

(lambda (x)

(f (g x))))

Procedure compose defined.

>(define first head)

Procedure first defined.

>(define second (compose head tail))

Procedure second defined.

>(second [’a ’b ’c])

Term: ’b

Next we write some procedures dealing with terms. The first one returns a list of the children of agiven term, viewing the term as a tree (so that variables and constant terms have no children, whilethe children of a term of the form (f t1 · · · tn) are t1 · · · tn):

>(define (children t)

(match t

((some-var _) [])

(((some-symbol _) (some-list args)) args)))

>(children ?x)

List: []

>(children true)

109

List: []

>(children (subset null (union ?s ?t)))

List: [null (union ?s ?t)]

The next two procedures determine whether a given term is a variable or a constant, respectively:

(define (var? t)

(match t

((some-var _) true)

(_ false)))

(define (constant? t)

(& (null? (children t))

(negate (var? t))))

The following procedure returns a list of all the variables that occur in a given term:

(define (vars t)

(match t

((some-var x) [x])

(((some-symbol f) (some-list args)) (fold join (map vars args) []))))

As written, the list returned by get-vars might contain repetitions. We can weed out duplicates withthe following procedure, which takes an arbitrary list of values and removes all duplicate occurrencesof the same value:

(define (rd L)

(match L

((split L1 (list-of x (split L2 (bind L3 (list-of x _))))) (rd (join L1 L2 L3)))

(_ L)))

And here is a procedure for converts a term into a string:

>(define

(term->string t)

(check ((var? t) (var->string t))

((constant? t) (symbol->string t))

(else (join (symbol->string (root t)) "(" (terms->string (children t)) ")")))

(terms->string terms)

(match terms

((list-of t more) (check ((null? more) (term->string t))

(else (join (term->string t) "," (terms->string more)))))))

We finally write a few procedures that manipulate propositions. We begin with a procedure thatdetermines literal propositional equality (as opposed to alphabetic equivalence):

>(define

(lit-eq P Q)

(match [P Q]

([(some-atom A) A] true)

([((some-prop-con pc) (some-list args)) (pc args’)] (lit-eq* args args’))

([((some-quant q) x (some-prop B)) (q x B’)] (lit-eq B B’))

110

(_ false))

(lit-eq* props1 props2)

(match [props1 props1]

([[] []] true)

([(list-of P rest1) (list-of P rest2)] (lit-eq* rest1 rest2))

(_ false)))

Procedure lit-eq defined.

Procedure lit-eq* defined.

>(define P (forall ?x (exists ?y (= ?x ?y))))

Proposition P defined

>(define Q (forall ?x (exists ?foo (= ?x ?foo))))

Proposition Q defined.

>(equal? P Q)

Term: true

>(lit-eq P Q)

Term: false

7.8 Rigorous semantics

In this section we specify in full precision the result of evaluating any expression E in any environmentρ and store σ. By an environment we will mean a computable function that maps any given identifiereither to a value or to a special “unbound” token. And by a store we will mean a computable functionthat maps any natural number (representing a memory location) to a value (the location’s contents)or to a special “unassigned” token; we stipulate that infinitely many numbers must be bound to“unassigned.” For any function f : A 7→ B and a ∈ A, b ∈ B, we write f [a 7→ b] for the function fromA to B that maps a to b and every other x ∈ A to f(x).

The result of evaluating an expression in a given ρ and σ is one of three things: either a pairconsisting of a value V and a store σ′, signifying successful completion of the computation; or apair consisting of an error token and a store σ′, indicating the occurrence of an error during thecomputation (a store σ′ is still necessary to correctly handle backtracking); or non-termination. Thespecification is given in English, but without any ambiguity. Since the evaluation of most expressionsleaves the state unaffected, if we do not explicitly specify what the new state is then it should beassumed that it is the same as the state in which the expression was evaluated; i.e., that σ′ = σ.Also, unless we explicitly say otherwise we will assume that the evaluation of an expression E isimmediately halted if the evaluation of a subexpression of E produces an error and a store σ′; in suchcases the result for E becomes that same error token and σ′.

• Identifiers: If an identifier I is bound in the given environment ρ to some value V , then itsvalue is V ; it is an error if I is unbound.

111

• Unit: The value of the expression () is always the unit value, regardless of ρ and σ.

• Term variables, meta-identifiers, and characters are always self-evaluating, regardless ofρ and σ.

• Check expressions: To evaluate an expression of the form

(check (E1 E′1) · · · (En E′

n))

in ρ and σ, we first evaluate E1 in ρ and σ, possibly resulting in a value V1 and store σ1.10

If V1 is the constant term true, then the result of the entire check expression is the resultobtained by evaluating E′

1 in ρ and σ1. Otherwise, if V1 is false, the process continues with thesecond alternative, taking into account any state changes that the evaluation of E1 might haveengendered: the expression E2 is evaluated in ρ and σ1, possibly resulting in a value V2 and astore σ2. If V2 is true, then the final result is that of evaluating E′

2 in ρ and σ2. Otherwise,if V2 is false, E3 is evaluated in ρ and σ2, its value compared to true, and so forth. Clearly,evaluation might diverge if some Ei or E′

i diverges. It is an error if there are no alternatives (i.e.,if n = 0), in which case we return an error token and the store σ as the result of the evaluation;or if no alternative succeeds (i.e., no Ei ever produces true), in which case we return an errortoken and the store σn; or if some Ei results in a value other than true or false, in which casewe return an error token and the store σi. If En is the identifier else and no Ej produced true,j < n, then the final result is that of evaluating E′

n in ρ and σn−1.11

• Procedures: The value of (lambda (I1 · · · In) E) in ρ and σ is a function value that takesa sequence of n values V1, . . . , Vn and a store σ′ as arguments, and evaluates E in ρ[I1 7→V1, . . . , In 7→ Vn] and σ′. Note that the lexical environment is statically determined whereas thestore is dynamic. The evaluation of procedures always terminates successfully.

• Applications: The value of an expression of the form

(E E1 · · ·En)

in ρ and σ is obtained as follows. First, E is evaluated in ρ and σ, possibly resulting in a valueV and store σ′. Then we proceed with a case analysis of V :

1. If V is a function value φ, we evaluate E1, . . . , En, sequentially, starting with E1 in ρ andσ′, to obtain values V1, . . . , Vn and a store σn. (This sequential evaluation threads the storefrom the evaluation Ei to that of Ei+1. Specifically, first we evaluate E1 in ρ and σ′, toobtain a value V1 and a store σ1; then we evaluate E2 in ρ and σ1 to obtain a value V2

and store σ2; we continue in this fashion until we evaluate En in ρ and σn−1, resulting ina value Vn and store σn.) The final result is that of applying φ to V1, . . . , Vn and σn.

2. If V is a function symbol f of arity n, we sequentially evaluate E1, . . . , En, beginning withE1 in ρ and σ′, to obtain values V1, . . . , Vn and a store σn. If each Vi is a term ti, thenthe resulting value is the term (f t1 · · · tn), provided that the said term is well-sorted. Ifit is not, we return an error token and the store σn as the result. Also, if some Vi is not aterm, we return an error token and the store σi. Finally, if V is a function symbol of arityother than n, we return an error token and σ′.

10In accordance with the convention we made in the beginning of this section, if the evaluation of E1 in ρ and σgenerates an error and some store σ1, we simply return that as the result of the entire check expression and halt.

11The keyword else may only appear in the position of En; it is a syntax error if else occurs anywhere else.

112

3. If V is a propositional constructor ◦, we evaluate E1, . . . , En sequentially to obtain valuesV1, . . . , Vn and a store σn. If each Vi is a proposition Pi, then the resulting value is theproposition (◦ P1 · · ·Pn), provided that the latter is well-sorted, while the resulting storeis σn. If the said proposition is not well-sorted, we return an error and the store σn; andif some Vi is not a proposition, we return an error and the store σi.

4. If V is a quantifier q, we check to make sure that n > 1; if not, we return an error tolkenand the store σ′. We then evaluate E1, . . . , En sequentially to obtain values V1, . . . , Vn anda store σn. The first n − 1 values must all be term variables x1, . . . , xn−1, while the lastvalue Vn must be a proposition P . If not, we halt with an error and the correspondingstore, otherwise the resulting value is the proposition

(q x1 (q x2 · · · (q xn P) · · · ))

provided that it is well-sorted, while the resulting store is σn. If the foregoing propositionis not well-sorted, we return an error token and the store σn.

5. If V is a substitution θ, check whether n = 1. If not, we produce an error and the storeσ′ and halt. Otherwise we evaluate E1 in ρ and σ′, to obtain a value V1 and a store σ1.Then:

– If V1 is a term t1, the resulting value is the term θ̂(t1), provided that the latter iswell-sorted; if it is not, we return an error token and σ1.12

– If V1 is a list of terms [t1 · · · tn], then the resulting value is the list of terms [θ̂(t1) · · · θ̂(tn)]provided that each θ̂(ti) is well-sorted. If one is not, we return an error token and thestore σ1. (See the Appendix for a rigorous definition of when a term is well-sorted.)

– If V1 is neither a term not a list of terms, we halt in error with the store σ1.

In either of the first two cases the resulting store is σ1

6. If V is neither a function value, nor a function symbol, propositional constructor, quantifier,or substitution, then we halt with an error token and σ′.

• Let expressions: The evaluation of

(let ((I1 E1) · · · (In En)) E)

in ρ and σ proceeds by induction on n:

1. If n = 0 then the result is that obtained by evaluating E in ρ and σ.

2. If n > 0 then the result is that obtained by evaluating

(let ((I2 E2) · · · (In En)) E)

in ρ[I1 7→ V1] and σ1, where V1 and σ1 are the value and store resulting from the evaluationof E1 in ρ and σ.

• Letrec expressions: The result of evaluating

(letrec ((I1 E1) · · · (In En)) E)

in ρ and σ is that of evaluating the following expression in ρ and σ:

12We write θ̂(t) for the term obtained by applying the substitution θ to the term t. See Appendix ?? for details.

113

(let ((I1 (cell ())) ... (In (cell ())))

(seq

(set! I1 E′1)

...(set! In E′

n)

E′))

where each E′j is obtained from Ej by replacing every free occurrence of Ij by (ref Ij), j =

1, . . . , n; E′ is likewise obtained from E. This desugaring is the classic way to “tie the knot,”and is often used to implement recursion when the implementation language supports state.The various Ei will usually be procedure or method expressions, but this need not be the case.

• Try expressions: The try construct is a backtracking mechanism: To evaluate (try E1 E2)in ρ and σ, we first evaluate E1 in ρ and σ. If this results in a value V1 and store σ1, we returnV1 and σ1 as the results of the entire try; but if it results in an error token and a store σ1, wereturn the results of evaluating E2 in ρ and σ1.

• Cells: To evaluate an expression of the form (cell E) in ρ and σ, we first evaluate E toobtain some value V and store σ′. We then let l be the smallest natural number such that σ′(l)is unassigned, and we return the cell represented by the location l and the store σ′[l 7→ V ].

• References: To evaluate an expression of the form (ref E) in ρ and σ, we evaluate E in ρand σ, possibly obtaining a value V and store σ′. If the value V is not a memory location wereturn an error token and σ′. If it is a memory location l, we return the value σ′(l) (it is anerror if cell l is unassigned, although this could never happen in our semantics) and the storeσ′.

• Assignments: To evaluate an expression of the form (set! E1 E2) in ρ and σ, we first evaluateE1 in ρ and σ, possibly obtaining a value V1 and store σ1. If V1 is not a memory location (cell),we halt with an error and σ1 as the result. If it is a cell l, we proceed to evaluate E2 in ρ andσ1, possibly obtaining a value V2 and store σ2. We then return the unit value and the storeσ2[l 7→ V2].

• Sequences: To evaluate an expression of the form (seq E1 · · ·En) in ρ and σ, we proceed byinduction on n. If n = 1 we simply return the results of evaluating E1 in ρ and σ. If n > 1,we evaluate E1 in ρ and σ, possibly obtaining a value V1 and store σ1, and we then return theresult of evaluating (seq E2 · · ·En) in ρ and σ1 (note that the value V1 is thus thrown away).

• While loops: To evaluate an expression of the form (while E1 E2) in ρ and σ:

1. We evaluate E1 in ρ and σ, possibly obtaining a value V1 and store σ1.

2. If V1 is the constant term false, we return the unit value and store σ1. Otherwise, if V1 istrue, we evaluate E2 in ρ and σ1 to obtain some value V2 and store σ2. We then continuewith the first step, only now E1 is evaluated in ρ and σ2 (rather than ρ and σ). If V1 isneither true nor false, we return an error token and σ1.

• Match expressions: To evaluate an expression of the form

(match E (π1 E1) · · · (πn En))

114

in ρ and σ, we first evaluate the discriminant E in ρ and σ, possibly obtaining a value V and astore σ′. We then go through the patterns π1, . . . , πn sequentially, trying to find a pattern that ismatched by V in ρ (using the algorithm of Section 7.3). If no such pattern is found, we halt withan error and σ′. Otherwise let πj be the first matching pattern and let {I1 7→ V1, . . . , Ik 7→ Vk}be the resulting finite set of bindings. The final value and store are those produced by evaluatingEj in ρ[I1 7→ V1, . . . , Ik 7→ Vk] and σ′.

• Short-circuit boolean operations: Expressions of the form (& E1 · · ·En) and (|| E1 · · ·En)are evaluated just as explained in Section 7.2, making sure to thread the store in the obviousmanner (the store resulting from the evaluation of E1 in ρ and σ should be the store in whichthe evaluation of E2 should take place, if E2 is to be evaluated at all; and so on).

7.9 Some programming examples

7.9.1 Sorting algorithms

In this section we will implement some standard sorting algorithms: bubble-sort, quick-sort, andmerge-sort. All of our sorting procedures will be polymorphic, taking a binary comparison procedureless? as input.

Bubble-sort

The key operation here is bubble, which “bubbles” the smallest element of the list to the last position.More precisely, a permutation L′ of the input list L is returned such that the smallest element of Lis the last element of L′:

(define (bubble L less?)

(match L

((list-of x (bind tail (list-of y rest)))

(check ((less? x y) (add y (bubble (add x rest) less?)))

(else (add x (bubble tail less?)))))

(_ L)))

7.9.2 Mutable data structures

Can we implement mutable data structures such as the pointer-based linked lists, graphs, etc., ofimperative programming languages? That is an important question because mutable data struc-tures can sometimes improve the asymptotic complexity of certain useful algorithms. The answer isaffirmative—we can readily build such structures by using mutable cells. The first and most funda-mental mutable structure we will implement is the mutable pair, supporting the following operations:

• (make-mutable-pair x y): Return a mutable pair whose first and second components are xand y, respectively.

• (mutable-pair? p): Return true is p is a mutable pair, and false otherwise.

• (get-first p): If p is a mutable pair, return its first component.

• (get-second p): If p is a mutable pair, return its second component.

115

• (set-first p x): If p is a mutable pair, set its first component to be x (destroying whateverthe first component of p might have been previously).

• (set-second p x): Like set-first, but affecting the second component of the given pair.

We will implement mutable pairs simply as pairs of cells:

(define (make-mutable-pair x y)

[(cell x) (cell y)])

(define (mutable-pair? p)

(match p

([(some-cell _) (some-cell _)] true)

(_ false)))

(define (get-first p)

(ref (first p)))

(define (get-second p)

(ref (second p)))

(define (set-first p x)

(set! (first p) x))

(define (set-second p x)

(set! (second p) x))

Here is an example of mutable pairs in action:

>(define p (make-mutable-pair 1 2))

List p defined.

>(get-first p)

Term: 1

>(get-second p)

Term: 2

>(set-first p 9)

Unit: ()

>(get-first p)

Term: 9

>(get-second p)

Term: 2

116

Another very useful mutable structure is that of a node, which can be viewed as a mutable paircontaining two fields: a piece of data x of arbitrary type, and a link next that (possibly) points toanother node. Linked lists, trees, etc., are built by constructing nodes and linking them appropriately.An implementation of nodes must support the following operations:

• null: A constant pointer value. Any pointer having null as its value is considered uninitialized(or rather, pointing nowhere in particular).

• (make-node x): Create and return a new node with x as its data field and null as its nextfield.

• (data n): if n is a node, return its data field. In particular, we should have:

(data (make-node x)) = x

• (next n): if n is a node, return its next field (this will be either null or else another node). Inparticular, (next (make-node x)) = null.

• (set-data n x): if n is a node, set its data field to x (destroying whatever value the data fieldmight have had previously).

• (set-next n m): if n and m are nodes, set its next field to m.

We adopt throughout the convention that null will be represented by the unit value (). With thisconvention, nodes can be readily implemented as mutable pairs:

(define null ())

(define (make-node x)

(make-mutable-pair x null))

(define data get-first)

(define next get-second)

(define set-data set-first)

(define set-next set-second)

We also define a useful auxiliary procedure terminal-node? that returns true if the next field of agiven node is null and false otherwise:

(define (terminal-node? n)

(equal? (next n) null))

Circular data structures can now be easily constructed, e.g., by making the next field of a node npoint to n:

>(define n (make-node 9))

List n defined.

>(data n)

117

Term: 9

>(terminal-node? n)

Term: true

>(set-next n n)

Unit: ()

>(data (next n))

Term: 9

>(data (next (next n)))

Term: 9

>(terminal-node? n)

Term: false

>(equal? n (next n))

Term: true

The node constructed above can be pictured as follows:

9n : s?

7.9.3 In-trees

In-trees are like regular trees, except that they have “up-links” instead of the usual “down-links.”That is, a node points upwards to its (unique) parent, instead of pointing downward to its children.The root has a null pointer. An example is shown in Figure 7.4. In-trees are a useful data structure.We will use them to implement dynamic equivalence relations efficiently (Section 7.9.4).

An implementation of in-trees must support the following operations:

1. (make-in-tree-node x): Create a single-node in-tree, and store the data item x in the newnode.

2. (root? n): Return true or false according to whether or not the tree node n is a root node.In particular, we must have (root? (make-in-tree-node x)) = true.

3. (data n): Return the data item stored in the tree node n. In particular,

(data (make-in-tree-node x)) = x

118

ma6

mb��3

mcmdQ

QQQk

Figure 7.4: A sample in-tree.

4. (parent n): Return the node that is the parent of node n. We require that n must not be aroot node.

5. (set-data n x): Set the data field of node n to be x.

6. (set-parent n m): Set the parent of node n to be node m.

We can easily implement in-tree nodes in terms of the generic nodes we introduced earlier. The nextpointers will simply be thought of as pointing upwards to parents:

(define make-in-tree-node make-node)

(define root? terminal-node?)

(define parent next)

(define set-parent set-next)

Note that data and set-data are automatically defined on in-tree nodes by virtue of the latter beingregular generic nodes.

7.9.4 Implementing dynamic equivalence relations

In this section we will implement an efficient union-find data structure, which is a data structure formanipulating dynamic equivalence relations. A dynamic equivalence relation13 is simply one whoseextension changes over the course of a computation. For purposes of exposition, we will be concernedwith equivalence relations on a finite set of elements A = {a1, . . . , an}, for some fixed n > 0. We willuse the symbol ≡ to denote an equivalence relation on A.

Initially, each element ai is only equivalent to itself. That is, the initial equivalence relation isthe identity, and we thus have n equivalence classes: {a1}, . . . , {an}. Subsequently, we are to processequivalence instructions of two kinds:

1. An IS query, of the form IS ai ≡ aj? This query asks whether ai and aj are currentlyequivalent.

13Recall that an equivalence relation is one that is reflexive, symmetric, and transitive.

119

2. A MAKE command, of the form MAKE ai ≡ aj . The effect of this command is to updatethe relation ≡ so as to make ai and aj equivalent.

Note that the answer to a query of the form IS ai ≡ aj is “yes” iff either (i) a command MAKE ai ≡aj was previously executed; or else (ii) ai ≡ aj follows from previous MAKE commands in tandemwith the reflexive, symmetric, and transitive laws that characterize equivalence relations. By anequivalence program we will mean a sequence of m > 0 equivalence instructions.14

Suppose, for example, that the underlying domain A consists of five elements a1, . . . , a5, so thatwe start with the five equivalence classes {a1}, {a2}, {a3}, {a4}, {a5}. The left column of the tablebelow shows several numbered instructions that are to be processed sequentially; the right columnshows the appropriate response. Note that the response to a MAKE instruction is to modify thecurrent equivalence classes. Also note that each MAKE instruction decreases the total number ofequivalence classes by one.

Initial equivalence classes: {a1}, {a2}, {a3}, {a4}, {a5}

Instruction Response1. IS a1 ≡ a3? No2. MAKE a2 ≡ a4 {a1}, {a2, a4}, {a3}, {a5}3. IS a1 ≡ a2? No4. MAKE a1 ≡ a4 {a1, a2, a4}, {a3}, {a5}5. IS a2 ≡ a1? Yes6. IS a1 ≡ a5? No7. MAKE a5 ≡ a3 {a1, a2, a4}, {a3, a5}8. IS a3 ≡ a5? Yes9. IS a3 ≡ a2? No

10. IS a4 ≡ a4? Yes

The above equivalence program consisted of 10 instructions. In general, we want to implementdynamic equivalence relations so that we can “execute” an equivalence program of arbitrary lengthm > 0 as efficiently as possible. Efficiency is to be analyzed as a function of m and n (recall that nis the size of the underlying domain A).

There are several ways to implement dynamic equivalence relations. One of the most efficientis to represent the equivalence classes by dynamic sets implemented as in-trees. For instance, theequivalence class {a1, a2, a4} from the above example might be represented by the in-tree:

ma1

��

��3

ma4

ma2

QQ

QQk

We will consider three basic operations on such trees:14The term “equivalence program” derives from FORTRAN, which allowed the programmer to make equivalence

declarations specifying that certain variables and/or array locations should be aliased. The compiler problem of correctlyassigning storage locations (in a way that respects the given equivalence declarations) was the initial impetus that ledto efficient algorithms for manipulating dynamic equivalence relations.

120

1. CREATE(ai) creates a new single-node in-tree with ai as its data field and null as its parentfield, representing the singleton equivalence class {ai}. Since each element can only be in oneequivalence class, we require that CREATE(ai) is only executed once, initially. Specifically,we assume that the n operations CREATE(a1), . . . ,CREATE(an) will be carried out first,before any other operations take place.

2. UNION(t1, t2) “joins” two the disjoint sets represented by the in-trees rooted at t1 and t2.Concretely, the operation makes the parent field of t1 point to t2. E.g.:

t1 -

ma1

��

��3

ma4

ma2

QQ

QQkt2 - ma5

After UNION(t1, t2):

ma1

��

��3

ma4

ma2

QQ

QQk

��

��3ma5

We stress that (a) t1 and t2 are the root nodes of the corresponding in-trees, and (b) t1 6= t2,i.e., the trees represent disjoint sets.

3. FIND(ai) returns the root node of the unique in-tree (equivalence class) containing ai.

Using these operations, we can implement the equivalence instructions IS and MAKE with thefollowing equivalent? and unite procedures:

(define (equivalent? x y) ## Answers IS queries

(equal? (find x) (find y)))

(define (unite x y) ## Implements MAKE instructions

(union (find x) (find y)))

We use a vector V of length n to hold (pointers to) the various in-trees, with V[i] storing theequivalence class of ai. Initially, after the n operations CREATE(a1), . . . ,CREATE(an) have takenplace, the vector will look as follows:

s6ma1

1 2 3 4 5

s6ma2

s6ma3

s6ma4

s6ma5

The three foregoing operations can now be implemented straightforwardly:

(define create make-in-tree-node)

(define union set-parent)

121

The only slightly more complicated case is that of FIND, since we may need to climb up the tree toget to the root:

(define (chase-up t)

(check ((root? t) t)

(else (chase-up (parent t)))))

(define (find i)

(chase-up (vector-sub V i)))

where V is the vector pointing to the trees.To make things concrete, suppose that n = 5, and that ai = i, so that an element can serve as its

own subscript in V. We can then define:

(define V (make-vector 5 ()))

(vector-set V 1 (create-set 1))





This completes the initialization phase.The equivalence program shown earlier can now be executed by feeding the following sequence of

expressions to the Athena interpreter:

(equivalent? 2 4)

(unite 2 4)

(equivalent? 2 1)

(unite 1 4)

(equivalent? 2 1)

(equivalent? 1 5)

(unite 5 3)

(equivalent? 3 5)

(equivalent? 3 2)

(equivalent? 4 4)

And here is the output we obtain:

Term: false

Unit: ()

Term: false

Unit: ()

Term: true

Term: false

Unit ()

Term: true

Term: false

Term: true

We emphasize an important precondition of unite: the elements i and j to be united must not alreadybe in the same equivalence class; they must be in disjoint sets. Otherwise we will end up creating

122

a cycle, by making the root of (find i)15 point back to itself. From that point on, any attempt tofind an element in that equivalence class will lead to an infinite loop.

An application is connected components, Kruskal’s algorithm, and fast congruence closure on termgraphs.

7.9.5 Weighted union and path compression

What is the asymptotic complexity of our implementation? More precisely, given an equivalenceprogram P of m instructions, and with the underlying domain A having n elements, how quickly canour implementation “execute” P?

It is not difficult to see that after a sequence of n− 1 MAKE operations of the form

MAKE(a1, a2),MAKE(a2, a3), . . . ,MAKE(an−1, an)

the final equivalence class created by our implementation (namely, A itself) will be represented by anin-tree that is essentially a linked list:

n

s6man

s6ma1 -

1 2 3

s6ma2 -

s6ma3 - -

· · ·

· · ·

Subsequently, a list of n invocations of FIND(a1) (e.g., as would be required to answer n successivecopies of the query IS a1 ≡ a2) will result in n traversals of the entire list, where each traversal ofcourse requires O(n) steps. Accordingly, the final cost of executing the program is at least O(n2) (orequivalently, O(m2), since in this case m is taken to be twice as large as n). This is in fact a tightbound, meaning that the worst-case complexity is also bounded above by m2; it is easy to show thatthat no equivalence program can possibly require more than m2 link operations for its execution.

A significant improvement in asymptotic complexity can be achieved with two simple heuristics:weighted union and path compression. We describe each in turn.

The reason why we end up with a linked list in the preceding scenario is because we keep turninga large in-tree into a subtree of a single-node in-tree. This has the effect of appending a single-nodelist at the end of a large linked list. We can rectify this simply by ensuring that when we join twoin-trees, we always make the smaller one a subtree of the larger one.16 We change our implementationso that an in-tree node n maintains an integer denoting the size of the tree rooted at n; we make thisfield mutable, since it needs to be properly updated after a union. We do not change the definitionof union, which still makes the first given tree a subtree of the second. But we change the definitionof unite so as to take account of sizes when linking two trees:

(define (create-set i)

(make-in-tree-node [i (cell 1)]))

(define (get-size n)

(ref (second (data n))))

15Which will be the same as the root of (find j) if i and j are already equivalent!16In the case of a tie we can just make an arbitrary decision.

123

(define (set-size n new-size)

(set! (second (data n)) new-size))

(define (link root-1 root-2 new-size)

(seq (union root-1 root-2)

(set-size root-1 new-size)

(set-size root-2 new-size)))

(define (unite i j)

(let ((i-root (find i))

(j-root (find j))

(i-size (get-size i-root))

(j-size (get-size j-root))

(new-size (plus i-size j-size)))

(check ((less? i-size j-size)

(link i-root j-root new-size))

(else (link j-root i-root new-size)))))

The path compression heuristic is applied during chase-up: When searching for the root of theequivalence class of some i, we make each node along the way point directly to the root (once thelatter is found), potentially breaking up a long chain of links. E.g.:

ma1

6

ma2

6

ma3

6

ma4

After path compression: ma1�

��3

ma4

ma2

6 ma3Q

QQ

Qk

This can speed up subsequent find operations significantly. The only modification we need to makeis in chase-up:

(define (chase-up t)

(check ((root? t) t)

(else (seq (set-parent t (chase-up (parent t)))

(parent t)))))

Note that chase-up is no longer tail-recursive, but it is still linear in the size of the tree branch.Weighted union by itself improves the worst-case complexity of our implementation to O(m log n).

Combining weighted union with path compression further improves the complexity to almost O(m),17

although the required analysis for showing this rigorously is rather subtle.17More precisely, O(m A−1(m, n)), where A−1 is the inverse of Ackermann’s function, which has an extremely slow

rate of growth.

124

7.9.6 A Prolog interpreter

In this section we will implement a Prolog engine. The exercise will pull together many of the topicswe have discussed so far: higher-order and anonymous procedures, pattern matching, backtracking,lexical scoping, and Athena’s primitives for dealing with substitutions, unification, etc. By leveragingthese mechanisms, we are able implement the entire Prolog system in less than one page of Athenacode.

By a Horn clause we will mean an Athena proposition of the following kinds:

1. A fact of the form∀ x1 · · ·xn . P (x1, . . . , xn)

n ≥ 0, where P (x1, . . . , xn) is an atomic proposition whose all and only free variables arex1, . . . , xn; or

2. a rule, i.e., a conditional of the form

∀ x1 · · ·xn . [P1 ∧ · · · ∧ Pk]⇒Pk+1

k > 0, where x1, . . . , xn are all and only the free variables that occur in the various Pi. We referto Pk+1 as the conclusion of the rule and to P1, . . . , Pk as the antecedents.

By a query we will mean an arbitrary atomic proposition (that may contain free variables). Inwhat follows we introduce a domain and function symbols to talk about ancestry, and define a fewgenealogical facts and rules:

(domain Person)

(declare (father mother parent grandparent ancestor) (-> (Person Person) Boolean))

(declare (adam eve seth peter mary paul joe) Person)

(define fact-1 (father adam seth))

(define fact-2 (mother eve seth))

(define fact-2 (father seth peter))

(define fact-4 (father peter paul))

(define fact-5 (mother mary paul))

(define fact-6 (father paul joe))

(define rule-1

(forall ?x ?y

(if (father ?x ?y)

(parent ?x ?y))))

(define rule-2

(forall ?x ?y

(if (mother ?x ?y)

(parent ?x ?y))))

(define rule-3

(forall ?x ?y ?z

(if (and (parent ?x ?y)

(parent ?y ?z))

125

(grandparent ?x ?z))))

(define rule-4

(forall ?x ?y

(if (parent ?x ?y)

(ancestor ?x ?y))))

(define rule-5

(forall ?x ?y ?z

(if (and (ancestor ?x ?y)

(parent ?y ?z))

(ancestor ?x ?z))))

(define database [fact-1 fact-2 fact-3 fact-4 fact-5 fact-6

rule-1 rule-2 rule-3 rule-4 rule-5])

(define query-1 (mother mary paul))

(define query-2 (grandparent ?x joe))

(define query-3 (ancestor adam ?x))

(define queries [(grandparent ?x joe) (ancestor ?FOO ?x)])

The main procedure, solve, will take a list of Horn clauses (facts and/or rules) and a list of queries[Q1 · · ·Qm] and will search the SLD-tree of the query list in the standard depth-first Prolog fashion,resulting either in a substitution θ such that the application of θ to each Qi is a logical consequence ofthe Horn clauses; or in failure, indicating that no such substitution could be found; or in divergence(an infinite loop). For rigorous definition of SLD-trees and the Prolog’s execution model, see eitherthe Athena paper “A trusted implementation of SLD-resolution” [?] or an introductory book in LogicProgramming or Prolog [4, 5, 6, 1].

The implementation is shown in its entirety in Figure 7.5. The main procedure is solve, as speci-fied above, aided by the six auxiliary procedures get-conclusion, match-conclusion-of, get-matches,try-matches, and get-subgoals. We discuss each in turn.

The procedure get-conclusion simply obtains the conclusion of a Horn clause. The first case ofthe match handles rules and the second handles facts. The procedure match-conclusion-of tries tomatch a query goal with the conclusion of some Horn clause P. First a freshly renamed version Q of Pis obtained, and then the conclusion of Q is unified with goal. If the unification successfully producessome substitution theta, then the pair [Q theta] is returned. We will refer to the pair [Q theta] asa match. If the unification fails, then match-conclusion-of returns false.

The procedure get-matches returns a list of all the matches between a query and a list of Hornclauses, as determined by match-conclusion-of. This list could be empty if the query does not matchthe conclusion of any of the clauses, which would cause our interpreter to fail. If the clause listconstitutes a logic program, then the left-to-right processing ensures that we examine the rules in thegiven order.

The higher-order procedure try-matches takes a list of matches [[P1 θ1] · · · [Pn θn]] and a bi-nary procedure process-match, which we will refer to as a match handler, and sequentially appliesprocess-match to Pi and θi, for i = 1, . . . , n, until such an application succeeds or until there are nomore matches, in which case a failure occurs.

126

(define (get-conclusion P)(match P

((forall (some-list _) (if _ concl)) concl)((forall (some-list _) concl) concl)))

(define (match-conclusion-of P goal)(let ((Q (rename P)))

(match (unify goal (get-conclusion Q))((some-sub sub) [Q sub])(_ false))))

(define (get-matches goal clauses)(filter-out (map (lambda (c) (match-conclusion-of c goal)) clauses)

(lambda (x) (equal? x false))))

(define (try-matches matches process-match)(match matches

((list-of [P matching-sub] rest)(try (process-match P matching-sub)

(try-matches rest process-match)))))

(define (get-subgoals clause)(match clause((forall (some-list _) (if (and (some-list L)) _)) L)((forall (some-list _) (if (some-atom A) _)) [A])(_ [])))

(define (solve queries clauses)(letrec ((search (lambda (goals theta)

(match goals([] theta)((list-of goal more-goals)

(try-matches (get-matches goal clauses)(lambda (clause theta’)

(let ((new-goals (theta’ (join (get-subgoals clause) more-goals))))(search new-goals (compose-subs theta’ theta))))))))))

(search queries {})))

Figure 7.5: An Athena implementation of a Prolog interpreter.

The procedure get-subgoals produces the conjuncts of the antecedent of a given rule. If theantecedent is an atom A, then A is the only conjunct; otherwise the antecedent must be a conjunctionof several propositions, which are returned in a list.

The main procedure solve is implemented in terms of the internal auxiliary procedure search,which takes a list of goals (queries) [G1 · · ·Gn] and the current substitution theta (which will initiallybe the empty substitution). If there are no goals (i.e., if n = 0), search simply returns theta.Otherwise we select the first goal from the list, G1 (in accordance with Prolog’s computation rule)and obtain all the matches [P1 θ1], . . . , [Pk θk] of G1 with the various clauses (if there are no suchmatches, we fail). For each such match [Pi θi], we get the subgoals A1, . . . , Am of Pi and then werecursively call search on θ([A1 · · ·Am G2 · · ·Gn]) and θ′ ◦ θ. This continues until we either succeedor else run out of matches, at which point we will fail.

Note that this implementation, while very short and simple, is also quite inefficient. For instance,

127

Bibliography

[1] K. R. Apt. Logic Programming. Handbook of Theoretical Computer Science. North-Holland,1990.

[2] T. Arvizo. A virtual machine for a type-ω denotational proof language. Masters thesis, MIT,June 2002.

[3] M. Bergmann, J. Moor, and J. Nelson. The Logic Book. Random House, New York, 1980.

[4] W. F. Clocksin and C. S. Mellish. Programming in Prolog. Springer Verlag, 4th edition, 1994.

[5] U. Nillson and J. Maluszynski. Logic, Programming and Prolog. John Wiley and Sons, 2nd edition,1995.

[6] L. Sterling and E. Shapiro. The Art of Prolog. MIT Press, Cambridge, MA, 2nd edition, 1994.

[7] A. Voronkov. The anatomy of Vampire: implementing bottom-up procedures with code trees.Journal of Automated Reasoning, 15(2), 1995.

[8] C. Weidenbach. Combining superposition, sorts, and splitting. In A. Robinson and A. Voronkov,editors, Handbook of Automated Reasoning, volume 2. North-Holland, 2001.

128

An Athena tutorial - People | MIT...

Documents

Transcript of An Athena tutorial - People | MIT...