
A general definition of dependent type theories

Andrej Bauer    Philipp G. Haselwarter    Peter LeFanu Lumsdaine

September 11, 2020

Abstract

We define a general class of dependent type theories, encompassing Martin-Löf’s intuitionistic type theories and variants and extensions. The primary aim is pragmatic: to unify and organise their study, allowing results and constructions to be given in reasonable generality, rather than just for specific theories. Compared to other approaches, our definition stays closer to the direct or naïve reading of syntax, yielding the traditional presentations of specific theories as closely as possible.

Specifically, we give three main definitions: raw type theories, a minimal setup for discussing dependently typed derivability; acceptable type theories, including extra conditions ensuring well-behavedness; and well-presented type theories, generalising how in traditional presentations, the well-behavedness of a type theory is established step by step as the type theory is built up. Following these, we show that various fundamental fitness-for-purpose metatheorems hold in this generality.

Much of the present work has been formalised in the proof assistant Coq.

Contents

1 Introduction
2 Preliminaries
3 Raw syntax
4 Raw type theories
5 Well-behavedness properties
6 Well-founded presentations
7 Discussion and related work
8 References
A Formalisation in Coq

The authors benefited greatly from visits funded by the COST Action EUTypes CA15123. This material is based upon work supported by the U.S. Air Force Office of Scientific Research under award number FA9550-17-1-0326, grant number 12595060.


1 Introduction

1.1 Overview

We give a general definition of dependent type theories, encompassing for example Martin-Löf’s intuitionistic type theories and many variants and extensions.

The primary aim is to give a setting for formal general versions of various constructions and results, which in the literature have been given for specific theories but are heuristically understood to hold for a wide class of theories: for instance, the conservativity theorem of (Hofmann, 1997), or the coherence theorems of (Hofmann, 1995; Lumsdaine and Warren, 2015).

This has been a sorely felt gap in the literature until quite recently; the present work is one of several recent approaches to filling it (Isaev, 2017; Uemura, 2019; Brunerie, 2020b).

A secondary aim is to stick very closely to an elementary understanding of syntax. Established general approaches — for instance, logical frameworks and categorical semantics — give, in examples, not the original syntax of the example theories, but an embedded or abstracted version, typically then connected to the original syntax by adequacy or initiality theorems. Our approach directly recovers quite conventional presentations of the example theories themselves.

As a corollary of this goal, we must confront the bureaucratic design decisions of syntax: the selection of structural rules, and so on. These are often swept under the rug in specific type theories as “routine”; to initiates they are indeed standard, but newcomers to the field often report finding this lack of detail difficult. We therefore elide nothing, and set out a precise choice of all such decisions, carefully chosen and proven to work well in reasonable generality, which we hope will be of value to readers.

In pursuit of the above aims, we offer not one main definition of type theory, but three, at increasing levels of refinement.

Firstly, we define raw type theories, as a conceptually minimal description of traditional presentations of type theories by symbols and rules, sufficient to define the derivability relation, but not yet incorporating any well-formedness constraints on the rules.

Secondly, we give sufficient conditions on a raw type theory to imply that derivability over it is well-behaved in various standard ways. Specifically, we isolate simple syntactic checks that suffice to imply core fitness-for-purpose properties, and package these into the notion of an acceptable type theory.

Thirdly, we analyse the well-founded nature of traditional presentations, involved in more elaborate constructions such as the categorical semantics, as well as (arguably) the intuitive assignment of meaning to a theory. This leads us to the notion of well-presented type theories, which we hope can serve as a full-fledged proposal fulfilling our primary aim.

1.2 Specifics

We aim, as far as possible, not to argue for any novel approach to setting up type theories, but simply to give a careful analysis of how type theories are traditionally presented, in order to lay out a generality in which such presentations can be situated. As such, the first few components of our definition are the expected ones.

We begin with an appropriate notion of signature, for untyped syntax with variable-binding, and develop the standard notions of “raw” syntactic expressions over such signatures, including substitution, translation along signature morphisms, and so on.

With the syntax of types and terms properly set up, a type theory is traditionally presented by giving a collection of rules. Type theorists are very accustomed to reading these — but as anyone who has tried to explain type theory to a non-initiate knows, there is a lot to unpack here. The core of our definition is a detailed study of the situation: what is really going on when we write and read inference rules, check that they are meaningful, and interpret them as a presentation of type theory?

Take the formation rule for Π-types:

Γ ⊢ A type        Γ, x:A ⊢ B type
---------------------------------
Γ ⊢ Π(x:A) . B type

When pressed to explain this, most type theorists will say that the rule represents inductive clauses for constructing derivations, or closure conditions for the derivability predicate: given derivations of the judgements above the line, a derivation of the judgement below the line is constructed. In particular, if Γ, A and B are syntactically valid representations of a context and types, and the judgements

Γ ⊢ A type, and Γ, x:A ⊢ B type

are both derivable, then so is the judgement

Γ ⊢ Π(x:A) . B type.

This understanding of rules is sufficient for explaining the definition of a specific type theory, and defining derivability of judgements.

However, it is in general too permissive: to be well-behaved, type theories should not be given by arbitrary closure conditions, but only by those that can be specified syntactically by rules looking something like the traditional ones. In other words, we want to make explicit the idea of a rule as a syntactic entity in its own right, accompanied with a mathematically precise explanation of what makes it type-theoretically acceptable, and how it gives rise to a quantified family of closure conditions.

So to a first approximation, we say a rule consists of a collection of judgements — its premises — and another judgement, its conclusion. However, a subtlety lurks: what are Γ, A, and B in the above Π-formation rule?

The symbol Γ is easy: we can dispense with it entirely. We prescribe (as type theorists often do, heuristically) that all rules should be valid over arbitrary contexts, and so since the arbitrary context is always present in the interpretation of a rule as a family of closure conditions, it never needs to be included in the syntactic specification of the rule. This precisely justifies a common “abuse of notation” in presenting type theories: the context Γ is omitted when writing down the rules, and one mentions apologetically somewhere that all rules should be understood as over an arbitrary ambient context.

Explaining A and B is more interesting. They are generally called “metavariables”, and in the family of closure conditions they are indeed that — quantified variables of the meta-theory, ranging over syntactic entities. However, if the rule is to be considered as a formal syntactic entity, A and B must themselves be part of that syntax. We therefore take the premises and conclusion of rules as formed over the ambient signature of the type theory extended with extra symbols A and B to represent the metavariables of the rule.

These considerations result in the notion of a raw rule, whose premises and conclusion are raw judgements over a signature extended with metavariables. A raw type theory is then just a family of raw rules. It holds enough information to be used, but still permits arbitrariness that must be dispensed with. At the very least, the type and term expressions appearing in the rule ought to represent derivable types and terms, respectively. Thus in the next stage of our definition we ask that every rule be accompanied with derivations showing that the presuppositions hold, namely, that its type expressions are derivable types and that its terms have derivable types. Another condition that we impose on rules is tightness, which roughly requires that the metavariable symbols be properly typed by the premises, and for rules that build term or type judgements, that they do so in the most general form.

Even though every rule of a raw type theory may be presuppositive and tight, the theory as a whole may be deficient, for example, if one of the symbols has no corresponding formation rule, or several of them. Overall we call a type theory tight when there is a bijective correspondence between its symbols and formation rules, which are tight themselves. And to make sure equality is well behaved, we also require that for each symbol there is a suitable congruence rule ensuring that the symbol commutes with equality. When a raw type theory has all these features, we call it an acceptable type theory.

Because the derivations of presuppositions appeal to the very rules they certify, an unsettling possibility of circular reasoning arises. We resolve the matter in two ways. First, we add one last stage to the definition of type theories and ask that all the rules, as well as the premises within each rule, be ordered in a well-founded manner. Second, we show that for acceptable type theories whose contexts and premises are well-founded as finite sequences, circularities can always be avoided by passing to the well-founded replacement of the theory (Section 6.5). Apart from expelling the daemons of circularity, the well-founded order supplies a useful induction principle.

In the end, the definition of a general type theory has roughly five stages:

1. the signature (Definition 3.8) describes the arities of primitive type and term symbols that form the raw syntax (Definition 3.11),

2. raw rules (Definition 4.18) constitute a raw type theory (Definition 4.36),

3. the raw rules are verified to be tight and presuppositive (Definitions 5.1 and 5.6), and therefore acceptable (Definition 5.7),

4. the raw type theory is verified to consist of acceptable symbol rules (Definition 5.11) and equations, that it is tight and congruous, and therefore acceptable (Definition 5.12).

5. finally, an acceptable type theory may be well-presented (Definition 6.20) and hence well-founded (Definition 6.19), or we may pass to its well-founded replacement (Theorem 6.32).

We readily acknowledge that there are many alternative ways of setting up type theories, each serving a useful purpose. It is simply our desire to actually give one mathematically complete description of what type theories are in general.

Once the definition is complete, we should provide evidence of its scope and utility. We do so in Section 5 by proving fundamental meta-theorems, among which are:

1. Derivability of presuppositions, Theorem 5.15, stating that the presuppositions of a derivable judgement are themselves derivable.

2. Elimination of substitution, Theorem 5.22, stating that anything that can be derived using the substitution rules can also be derived without them.


3. Uniqueness of typing, Theorem 5.23, stating that a term has at most one type, up to judgmental equality.

4. An inversion principle, Theorem 5.27, that reconstructs the proof-relevant part of the derivation of a derivable judgement from the information given in the judgement.

Our definitions are set up to support a meta-theoretic analysis of type theories, but deviate from how type theories are presented in practice. First, one almost always encounters only finitary syntax in which contexts and premises are presented as finite sequences – we call these sequential contexts and premises and treat them in Sections 6.1 and 6.2. Second, theories are not constructed in five stages, but presented through rules that are manifestly acceptable and free of circularities, while symbols are introduced simultaneously with their formation rules. In Sections 6.3 and 6.4 we make these notions precise by defining well-presented rules and theories, and their realisations as raw type theories.

Having heard tales about minor but insidious mistakes in the literature on the meta-theory of syntax, we decided to protect ourselves from them by formalising parts of our paper in the Coq proof assistant (Coq development team, 2020). An overview of the formalisation is given in Appendix A, including comments about the meta-mathematical foundations sufficient for carrying out our work. The formalisation allows us to claim a high level of confidence and omission of routine syntactic arguments. Nevertheless, we still strove to make the paper self-contained by following the established standards of informal rigour.

1.3 Disclaimers

Having said what this paper is about, it is worth saying a little about what it is not.

It is most certainly not intended as a prescriptive definition of what all dependent type theories should be. Many important type theories in the literature are not covered by our definition, and we do not mean to reject them. The aim of this work is simply pragmatic: to encompass some large class of theories of interest, in order to better organise and unify their study. We very much hope our approach may be extended to wider generalities.

We do not claim or aim to supersede other general approaches to studying type theories, such as those based on logical frameworks. Such approaches are well-developed, powerful for many applications, and sidestep some complications of the present approach. However, all such approaches (that we are aware of) work by using a somewhat modified syntax (e.g. embedded in a larger system) — the syntax they yield is not obviously the same as the syntax given by a “direct” or “naïve” reading of presentations of theories. They are typically accompanied by adequacy theorems, or similar, showing equivalence between the modified syntax and the naïve, for the specific type theory under consideration.

By contrast, we aim to directly study and generalise the naïve approach itself, which (to our knowledge) has not been done previously in such generality. Our motivations are therefore largely complementary to such approaches. A more detailed comparison is given in Section 7.


2 Preliminaries

We begin by setting up definitions and terminology of a general mathematical nature that we will use throughout.

2.1 Families

For several reasons, we work with families in places where classical treatments would use either subsets of, or lists from, a given set. While the term is standard (e.g. “the product of a family of rings”), we make rather more central use of it than is usual, so we establish some notations and terminology.

Definition 2.1. Given a set X, a family K of elements of X (or briefly, a family on X) consists of an index set ind K and a map ev_K : ind K → X. We let Fam X denote the collection of all families on X, and use the family comprehension notation 〈e_i ∈ X | i ∈ I〉 for the family indexed by I that maps i to e_i. A family may be explicitly described by displaying the association of indices to values. For example, we may write 〈0: e_0, 1: e_1, 2: e_2〉 for the family 〈e_i | i ∈ {0, 1, 2}〉.

Example 2.2. Any subset A ⊆ X can be viewed as a family 〈i ∈ X | i ∈ A〉, with ind A as A itself and ev_A the inclusion A ↪→ X. Motivated by this, we will often speak of a family K as if it were a subset, writing x ∈ K rather than x ∈ ind K, and treating such x itself as an element of X rather than explicitly writing ev_K(x).

Example 2.3. Any list ℓ = [x_0, . . . , x_n] of elements of X can be viewed as a family, with ind ℓ = {0, . . . , n} and ev_ℓ(i) = x_i, or equivalently ℓ = 〈0: x_0, . . . , n: x_n〉. We will often use list notation to present concrete examples of families.

Working constructively, it is quite important to keep the distinction between families and subsets where classical treatments would confound them. For instance, a propositional theory is usually classically defined as a set of propositions; we would instead use a family of propositions. In a derivation over the theory, uses of axioms therefore end up “tagged” with elements of the index set of the theory, typically explaining how a certain proposition arises as an axiom (since the same proposition might occur as an instance of axiom schemes in multiple ways). These record constructive content which may be needed for, say, interpreting axioms according to a proof by cases over the axiom schemes of the theory.

Our use of families where most traditional treatments use lists — e.g. for specifying the argument types of a constructor — is less mathematically significant. It is partly to avoid baking in assumptions of finiteness or ordering where they are not required; but it is mostly motivated just by the formalisation, where families provide a more appropriate abstraction.

Definition 2.4. A map of families f : K → L between families K and L on X is a map f : ind K → ind L such that ev_L ◦ f = ev_K.

We shall notate such a map as 〈f(x)〉_{x ∈ ind K}. Indeed, the notation 〈f(x)〉_{x ∈ A} works for any map f : A → B, as it is just an alternative way of writing λ-abstractions.

Families and their maps form a category Fam X, which is precisely the slice category Set/X. A map r : X → Y yields a functorial action r∗ : Fam X → Fam Y which takes K ∈ Fam X to the family r∗K with ind(r∗K) = ind K and ev_{r∗K} = r ◦ ev_K. It is perhaps clearer to write down the action in terms of family comprehension: r∗〈e_i | i ∈ I〉 = 〈r(e_i) | i ∈ I〉.
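To fix intuitions, the following Haskell sketch renders families and their functorial action; it is purely illustrative (it is not the paper’s Coq formalisation), and it assumes finite, listable index sets, which suffices for the examples above.

-- A family of elements of x (Definition 2.1): an index set together with an
-- evaluation map, here approximated by a finite list of indices.
data Family i x = Family { indices :: [i], ev :: i -> x }

-- Example 2.3: a list viewed as a family indexed by its positions 0..n-1.
fromList :: [x] -> Family Int x
fromList xs = Family [0 .. length xs - 1] (xs !!)

-- The functorial action r_* : Fam X -> Fam Y of a map r : X -> Y,
-- keeping the index set and post-composing the evaluation map.
mapFamily :: (x -> y) -> Family i x -> Family i y
mapFamily r (Family is e) = Family is (r . e)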


Definition 2.5. Given a function r : X → Y and families K, L on X, Y respectively, a map f : K → L over r is a map f : r∗K → L; equivalently, a map f : ind K → ind L forming a commutative square over r.

2.2 Closure systems

The general machinery of derivations as closure systems occurs throughout logic, and is independent of the specific syntax or judgements of the logical systems involved.

Definition 2.6. A closure rule (P, c) on a set X consists of a family P of elements in X, its premises, and a conclusion c ∈ X. A closure system S on a set X is a family of closure rules on X, where we respectively write prems R and concl R for the premises and the conclusion corresponding to a rule R ∈ S. We write Rule X and Clos X for the collections of closure rules and closure systems on X, respectively.

As is tradition, we display a closure rule with premises [p_1, . . . , p_n] and conclusion c as

p_1   · · ·   p_n
-----------------
c

The constructions of closure rules and closure systems are evidently functorial in the ambient set. A map f : X → Y sends a rule R ∈ Rule X to the rule f∗R ∈ Rule Y with prems(f∗R) := 〈f(p) | p ∈ prems R〉 and concl(f∗R) := f(concl R). Similarly, a closure system S on X is taken to the closure system f∗S on Y, defined by f∗S := 〈f∗r | r ∈ S〉.

Definition 2.7. A simple map S → T between closure systems S and T on X is just a map between them as families. More generally, a simple map f : S → T over f : X → Y from S ∈ Clos X to T ∈ Clos Y is just a simple map f : f∗S → T, or equivalently a family map f over f∗ : Rule X → Rule Y.

A closure system yields a notion of derivation:

Definition 2.8. Given a closure system S on X, a family H of elements in X, and an element c ∈ X, the derivations Der_S(H, c) of c from hypotheses H are inductively generated by:

1. for every h ∈ H, there is a corresponding derivation hyp_h ∈ Der_S(H, h),

2. for every rule R ∈ S and a map D ∈ ∏_{p ∈ prems R} Der_S(H, p), there is a derivation der(R, D) ∈ Der_S(H, concl R).

In the second clause above D is a dependent map, i.e., for each p ∈ prems R we have D_p ∈ Der_S(H, p). We do not shy away from using products of families and dependent maps when the situation demands them.

The elements of Der_S(H, c) may be seen as well-founded trees with edges and nodes suitably labelled from X, S, and H. We take such inductively generated families of sets as primitive; their existence may be secured one way or another, depending on the ambient mathematical foundations. The essential feature of derivations, which we rely on, is the structural induction principle they provide.
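For concreteness, here is a minimal Haskell sketch of closure rules and derivations (illustrative only, with the premise family of a rule given as a list and hypotheses identified with their underlying elements).

-- A closure rule on a set x (Definition 2.6): a list of premises and a conclusion.
data ClosureRule x = ClosureRule { prems :: [x], concl :: x }

-- A derivation of c from hypotheses H (Definition 2.8): either an appeal to a
-- hypothesis, or an application of a closure rule together with one
-- sub-derivation per premise of that rule.
data Derivation x
  = Hyp x                               -- hyp_h, for a hypothesis h
  | Der (ClosureRule x) [Derivation x]  -- der(R, D)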

It is easy to check that derivations are functorial in simple maps of closure systems, in a suitable sense:


Proposition 2.9. A simple map f : S_X → S_Y of closure systems over f : X → Y acts on derivations as f∗ : Der_{S_X}(H, c) → Der_{S_Y}(f∗H, f(c)) for each H and c. The action is moreover functorial, in that id∗ = id and (f ◦ g)∗ = f∗ ◦ g∗.

Often, one wants a more general notion of map, sending each rule of the source system not necessarily to a single rule of the target system, but instead to a derived rule:

Definition 2.10. A derivation of a rule R over a closure system S is a derivation of concl R from prems R over S. Given such a derivation, we call R a derived rule of S, or say R is derivable over S. A map of closure systems f : S → T over f : X → Y is a function giving, for each rule R of S, a derivation of f∗R in T.

To show that maps of closure systems preserve derivability, we need a grafting operation on derivations.

Lemma 2.11. Given an ambient closure system S, suppose D is a derivation of c from hypotheses H over S, and for each h ∈ H, D_h is a derivation of h from H′. Then there is a derivation of c from H′ over S.

Proof. The derivation of c from H′ is constructed inductively from a derivation of c from H:

1. if c is derived as one of the hypotheses h ∈ H, then D_h derives c from H′,

2. if der(R, D′) derives c from H, then for each p ∈ prems R we inductively obtain a derivation D′′_p of p from H′ from the corresponding derivation D′_p of p from H, and assemble these into the derivation der(R, D′′) of c from H′.
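Continuing the Haskell sketch above (and reusing its Derivation and ClosureRule types), grafting is the evident structural recursion that replaces each hypothesis leaf by a supplied derivation of that hypothesis:

-- Grafting (Lemma 2.11): given, for every hypothesis h, a derivation of h from
-- H', turn a derivation over H into a derivation over H'.
graft :: (x -> Derivation x) -> Derivation x -> Derivation x
graft byHyp (Hyp h)    = byHyp h
graft byHyp (Der r ds) = Der r (map (graft byHyp) ds)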

Definition 2.12. A closure system map f : S → T over f : X → Y acts on derivations: if D is a derivation of c from H over S, there is a derivation f∗D of f(c) from f∗H over T.

Proof. f∗D is defined by recursion on D. Wherever D uses a rule R of S, with derivations D_h of the premises, f∗D uses the given derivation f(R) of f∗R, with the derivations f∗D_h grafted in at the hypotheses.

Categorically, grafting can be recognised as the multiplication operation of a monad structure on derivations, and our maps of closure systems can be seen as Kleisli maps for this monad (relative to simple maps). One can thus show that they form a category, that the action on derivations is functorial, and so on. We do not make this precise here, as it is not required for the present paper.

2.3 Well-founded orders

There will be several occasions when we shall have to prevent dependency cycles (between premises of a rule, or between rules of a type theory). For this purpose we review a notion of well-foundedness which accomplishes the task.

Definition 2.13. A strict partial order on a set A is an irreflexive and transitive relation < on A. A subset S ⊆ A is <-progressive when, for all x ∈ A,

(∀y ∈ A . y < x ⇒ y ∈ S) ⇒ x ∈ S.

A well-founded order is a strict partial order < in which a subset is the entire set as soon as it is <-progressive. For each x ∈ A, the initial segment ↓x := {y ∈ A | y < x} is the set of elements preceding x with respect to the order.


In terms of an induction principle, a strict partial order is well-founded when, for every predicate ϕ on A,

( ∀x ∈ A . (∀y ∈ A . y < x ⇒ ϕ(y)) ⇒ ϕ(x) ) ⇒ ∀x ∈ A . ϕ(x).

Classically there are many equivalent definitions of well-founded orders. Constructively, the situation is more complicated, cf. (Taylor, 1999, §2.5); this definition is one of the most standard, and the most suited to our purpose.
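As an illustration, the induction principle can be rendered as a recursion scheme in Haskell; this is a sketch only, and it simply assumes that the supplied order is well-founded (nothing enforces termination).

-- Well-founded induction for a decidable strict order lt on a: to compute
-- phi at x, the step function may invoke the induction hypothesis only at
-- elements strictly below x.
wfInduction :: (a -> a -> Bool)          -- the strict order (<)
            -> ((a -> phi) -> a -> phi)  -- progressiveness: IH below x gives phi at x
            -> a -> phi
wfInduction lt step = go
  where
    go x = step (\y -> if y `lt` x
                         then go y
                         else error "induction hypothesis used at an element not below x") x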

3 Raw syntax

In this section, we set out our treatment of raw syntax with binding, sometimes called “pre-syntax” to indicate that no typing information is present at this stage. There is nothing essentially novel — briefly, we use a standard modern treatment, closely inspired by that of (Fiore et al., 1999), but focus on concrete constructions rather than categorical characterisations. So we take raw expressions as inductively generated trees, and scope systems, developed below, for keeping track of variable scopes and binding. We spell out the details in order to have a self-contained presentation tailored to our requirements, and to set up terminology and notation we will use later.

3.1 Scope systems

We first address the question of how to treat variables and binding. Should we use terms with named variables up to α-equivalence, or de Bruijn indices, or reuse the binding structure of a framework language? The last option is appealing, as it dispenses with many cumbersome details, but we shall avoid it precisely because we want to confront the cumbersome details of type theory.

Rather than choosing a particular answer, we formulate and use a general structure for binding, abstracting away the implementation-specific details of several approaches, but retaining the common structure required for defining syntax.

Definition 3.1. A scope system consists of:

1. a collection of scopes S;

2. for each scope γ, a set of positions |γ|;

3. an empty scope 0 with no positions, |0| = ∅;

4. a singleton scope 1 with a unique position, |1| = 1;

5. operations giving for all scopes γ and δ a sum scope γ ⊕ δ, and functions

inl : |γ| → |γ ⊕ δ| ← |δ| : inr

exhibiting |γ ⊕ δ| as a coproduct of |γ| and |δ|.

A scope may be seen as “a context, without the type expressions”: in raw syntax, one cares about what variables are in scope, without yet caring about their types.

The singleton scope 1 is not needed for most of the development of general type theories — in the present paper, it is used only for sequential contexts (Section 6.1) and notions building on these. However, it is present in all examples of interest, so we include it in the general definition.


We will also use scopes to describe binders: if some primitive symbol S binds γ variables in its i-th argument, then for an instance of S in scope δ, the i-th argument will be an expression in scope δ ⊕ γ. Most traditional constructors bind finitely many variables; to facilitate this, we let [n] denote the sum of n ∈ N copies of 1, which also provides alternative notations [0] and [1] for the empty and singleton scopes, respectively.

Example 3.2. De Bruijn indices and de Bruijn levels can be seen as scope systems, with N as the set of scopes, |n| := {0, 1, . . . , n − 1}, 0 := 0, and m ⊕ n := m + n.

The difference lies just in the choice of coproduct inclusions inl : |m| → |m + n| ← |n| : inr. Setting inl(i) := i + n, inr(j) := j gives de Bruijn indices, as going under a binder increments the variables outside it. Setting inl(i) := i, inr(j) := j + m gives de Bruijn levels, as higher positions go to the innermost-bound variables. Over these scope systems, our syntax precisely recovers standard de Bruijn-style syntax, as in (de Bruijn, 1972) and subsequent work.
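For reference, the de Bruijn-index scope system fits in a few lines of Haskell (a sketch under the identifications of Example 3.2; swapping the two shifts below gives de Bruijn levels).

-- De Bruijn scopes: a scope is a natural number n, its positions are 0..n-1,
-- the empty scope is 0, and the sum of scopes is addition.
type Scope    = Int
type Position = Int

positions :: Scope -> [Position]
positions n = [0 .. n - 1]

-- Coproduct inclusions |m| -> |m + n| <- |n| for de Bruijn indices: bound
-- (right-hand) positions keep the low indices, outer (left-hand) positions
-- are shifted past them.
inl :: Scope -> Scope -> Position -> Position
inl _m n i = i + n

inr :: Scope -> Scope -> Position -> Position
inr _m _n j = j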

Both the de Bruijn scope systems are strict in the sense that we have equalities γ ⊕ 0 = γ = 0 ⊕ γ and (γ ⊕ δ) ⊕ η = γ ⊕ (δ ⊕ η), whereas general scope systems provide only canonical isomorphisms. The equalities help reduce bureaucracy in several proofs, so we shall occasionally indulge in assuming that we work with a strict scope system. The doubtful reader may consult the formalisation, which makes no such assumptions.

Example 3.3. The finite sets system takes scopes to be finite sets, along with any choice of coproducts. It may be prudent to restrict to a small collection, say the hereditarily finite sets. Syntax over these scope systems gives a concrete implementation of categorical approaches such as (Fiore et al., 1999).

Example 3.4. Scope systems are not intrinsically linked to dependent type theories, but provide a useful discipline for syntax of other systems. For instance, in geometric logic, the infinitary disjunction is usually given with a side condition that the free variables of the disjunction must remain finite (Johnstone, 2002, D1.1.3(xi)). By using finite scopes, we can make the finiteness condition explicit from the start, and dispense with the side condition. This is similar in spirit to (Fiore et al., 1999) and more closely mirrors the categorical semantics. By contrast, the classical Hilbert-type logics L_{α,β} of (Karp, 1964) allow genuinely infinite contexts and binders, which can be obtained by taking scopes to be ordinals δ ∈ α, with |δ| := δ.

Example 3.5. The traditional syntax using named variables, for both free and bound variables, is not an example of a scope system. In that approach, a fresh variable is not introduced by summing with a scope, but rather by a multivalued map allowing extension by any unused name.

Some implementations of syntax treat free and bound variables separately; for instance, locally nameless syntax (McKinna and Pollack, 1993) uses concrete names for free variables but de Bruijn indices for bound variables. Scope systems as defined here do not subsume such syntax, but could be generalised to do so.

Categorically, a scope system can be viewed precisely as a category Scope with initial and terminal objects, binary coproducts, and a full and faithful functor into (Set, 0, +, 1) preserving this structure. The categorical structure is induced from Set: morphisms γ → δ are taken as functions |γ| → |δ| — we call these renamings. Two of these, r : γ → δ and r′ : γ′ → δ′, give a sum map r ⊕ r′ : γ ⊕ γ′ → δ ⊕ δ′ arising from the universal property of coproducts.


For the remainder of the paper, we fix a scope system. To make concrete examples readable, we shall write them using the de Bruijn scopes from Example 3.2. They can be easily adapted to any other scope system.

3.2 Arities and signatures

While the arity of a simple algebraic operation is just the number of its arguments, the situation is complicated in type theory by the presence of binders. Each argument of a type-theoretic constructor may be a term or a type, and some of its variables may be bound by the constructor. We thus need a suitable notion of arity.

Definition 3.6. By syntactic classes, we mean the two formal symbols ty and tm, representing types and terms respectively. An arity α is a family of pairs (c, γ) where c is a syntactic class and γ is a scope. We call the indices of α arguments and write arg α for the index set of α. Thus, each argument i ∈ arg α has an associated syntactic class cl_α i and a scope bind_α i, which we call the binder associated with the argument i, and we can write α as α = 〈(cl_α i, bind_α i) | i ∈ arg α〉.

Example 3.7. In Martin-Löf type theory with the de Bruijn scope system, the constructor Π has arity [(ty, 0), (ty, 1)]. That is, the arity has two type arguments, and binds one variable in the second argument. A simpler example is the successor symbol in arithmetic, whose arity is [(tm, 0)], because it takes one term argument and binds nothing. Still simpler, the arity of a constant symbol is the empty family.

Note that arities express only the basic syntactic information and do not specify the types of term arguments and bound variables, which will be encoded later by typing rules.

Definition 3.8. A signature Σ is a family of pairs of a syntactic class and an arity. We call the elements of its index set symbols. Thus each symbol S ∈ Σ has an associated syntactic class c_S and an arity α_S. A type symbol is one whose associated syntactic class is ty, and a term symbol is one whose associated syntactic class is tm. The arguments arg S of S are the arguments of its arity α_S. Each argument i ∈ arg S has an associated syntactic class cl_S i and binder bind_S i, as prescribed by the arity α_S.

Example 3.9. The following signature describes the usual constructors for dependent products:

Π ↦ (ty, [(ty, 0), (ty, 1)]),
λ ↦ (tm, [(ty, 0), (ty, 1), (tm, 1)]),
app ↦ (tm, [(ty, 0), (ty, 1), (tm, 0), (tm, 0)]).

Let us spell out the last line. The symbol app builds a term, because its syntactic class is tm, from four arguments. The first and the second arguments are types, with one variable bound in the second argument, while the third and the fourth arguments are terms. We thus expect an application term to be written as app(A, B, s, t), with one variable getting bound in B.
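The signature of Example 3.9 can be transcribed directly. The following Haskell sketch does so, with binder scopes given as de Bruijn scopes (numbers of bound variables) and with hypothetical constructor names PiS, LamS, AppS.

-- Syntactic classes, arities and signatures (Definitions 3.6 and 3.8).
data SynClass = Ty | Tm deriving (Eq, Show)

type Arity         = [(SynClass, Int)]           -- one (class, binder) pair per argument
type Signature sym = [(sym, (SynClass, Arity))]  -- class and arity of each symbol

-- Example 3.9: the constructors for dependent products.
data PiSym = PiS | LamS | AppS deriving (Eq, Show)

piSignature :: Signature PiSym
piSignature =
  [ (PiS,  (Ty, [(Ty, 0), (Ty, 1)]))
  , (LamS, (Tm, [(Ty, 0), (Ty, 1), (Tm, 1)]))
  , (AppS, (Tm, [(Ty, 0), (Ty, 1), (Tm, 0), (Tm, 0)])) ]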

Definition 3.10. A signature map F : Σ → Σ′ is a map of families between them, that is, a function from symbols of Σ to symbols of Σ′, preserving the arities and syntactic classes. Signatures and maps between them form a category Sig.


3.3 Raw syntax

Once a signature Σ is given, we know how to build type and term expressions over it. We call this part of the setup “raw” syntax to emphasise its purely syntactic nature.

Definition 3.11. The raw syntax over Σ consists of the collections of raw type expressions Expr^ty_Σ(γ) and raw term expressions Expr^tm_Σ(γ), which are generated inductively for any scope γ as follows:

1. for every position i ∈ |γ|, there is a variable expression var_i ∈ Expr^tm_Σ(γ);

2. for every symbol S ∈ Σ of syntactic class c_S, and a map

e ∈ ∏_{i ∈ arg S} Expr^{cl_S i}_Σ(γ ⊕ bind_S i),

there is an expression S(e) ∈ Expr^{c_S}_Σ(γ), the application of S to arguments e.

Let us walk through the definition. The first clause states that the positions of γ play the role of available variable names, still without any typing information. The second clause explains how to build an expression with a symbol S: for each argument i ∈ arg S, an expression e_i of a suitable syntactic class must be provided, where e_i may refer to variable names given by γ as well as the variables that are bound by S in the i-th argument. The expressions e_i are conveniently collected into a function e. When writing down concrete examples we write the arguments as tuples.

Example 3.12. The symbol Π has arity [(ty, 0), (ty, 1)]. So if A is a type expression with free variables amongst γ ⊕ 0 (which is isomorphic to γ), and B is a type expression with free variables in γ ⊕ 1 (which has an extra free variable available), then Π(A, B) is a type expression with free variables in γ.

Definition 3.13. The action of a renaming r : γ → δ on expressions is the map r∗ : Expr^c_Σ(γ) → Expr^c_Σ(δ), defined by structural recursion:

r∗(var_i) := var_{r(i)},
r∗(S(e)) := S(〈(r ⊕ id_{bind_S i})∗(e_i)〉_{i ∈ arg S}).

Note how the definition uses the functorial action of ⊕ to extend the renaming r when it descends under the binders of a symbol.
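Continuing the Haskell sketch (reusing SynClass and Arity from above), raw expressions over de Bruijn scopes and the renaming action of Definition 3.13 come out as follows; well-formedness of an argument list against the symbol’s arity is assumed rather than enforced.

-- Raw expressions over a signature (Definition 3.11), with de Bruijn scopes:
-- a variable is a position of the ambient scope, and a symbol application
-- carries one argument expression per argument of the symbol's arity.
data Expr sym
  = Var Int
  | App sym [Expr sym]
  deriving (Eq, Show)

-- Renaming (Definition 3.13): apply r to the free variables, extending r by
-- the identity on the k variables bound in each argument, via the de
-- Bruijn-index inclusions of Example 3.2.
rename :: (sym -> Arity) -> (Int -> Int) -> Expr sym -> Expr sym
rename _       r (Var i)    = Var (r i)
rename arityOf r (App s es) =
  App s [ rename arityOf (extend k r) e | ((_, k), e) <- zip (arityOf s) es ]
  where
    extend k r' j
      | j < k     = j               -- bound variable: left unchanged
      | otherwise = r' (j - k) + k  -- outer variable: renamed, then shifted past the binder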

Definition 3.14. The action of a signature map F : Σ → Σ′ on expressions is the map F∗ : Expr^c_Σ(γ) → Expr^c_{Σ′}(γ), defined by structural recursion:

F∗(var_i) := var_i,
F∗(S(e)) := F(S)(F∗ ◦ e).

Proposition 3.15. The actions by renamings and signature maps commute with each other, and respect identities and composition. That is, they make expressions into a functor Expr : Sig × Scope → Set^{ty,tm}.

3.4 Substitution

We next spell out substitution as an operation on raw expressions, and note its basic properties.


Definition 3.16. A raw substitution f : γ → δ over a signature Σ is a map f : |δ| → Expr^tm_Σ(γ). The extension f ⊕ η : γ ⊕ η → δ ⊕ η by a scope η is the substitution

(f ⊕ η)(inl(i)) := inl∗(f(i))   if i ∈ |δ|,
(f ⊕ η)(inr(j)) := var_{inr(j)}   if j ∈ |η|.

The (contravariant) action of f on an expression e ∈ Expr^c_Σ(δ) gives the expression f∗e ∈ Expr^c_Σ(γ), as follows:

f∗(var_i) := f(i),
f∗(S(e)) := S(〈(f ⊕ bind_S i)∗ e_i〉_{i ∈ arg S}).

Example 3.17. The above definition, specialised to the de Bruijn scope systems of Example 3.2, precisely recovers the usual definition of substitution with de Bruijn indices or levels. Their explicit shift operators are abstracted away, in our setup, as renaming under coproduct inclusions.
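To make the de Bruijn specialisation concrete, here is a Haskell sketch of raw substitution, reusing Expr, Arity and rename from the sketches above (again illustrative and unverified, not the paper’s formalisation). For instance, the B[t/x] of later examples corresponds to subst applied with the substitution sending position 0 to t and position j + 1 to Var j.

-- A raw substitution f : gamma -> delta (Definition 3.16) assigns to every
-- position of delta a term expression in scope gamma.
type Subst sym = Int -> Expr sym

-- Extension f ⊕ eta by eta extra variables: the eta bound positions are sent
-- to themselves, the remaining positions to f, weakened into the larger
-- scope along inl (a shift by eta).
extendSubst :: (sym -> Arity) -> Int -> Subst sym -> Subst sym
extendSubst arityOf eta f j
  | j < eta   = Var j
  | otherwise = rename arityOf (+ eta) (f (j - eta))

-- The contravariant action f* of a substitution on expressions.
subst :: (sym -> Arity) -> Subst sym -> Expr sym -> Expr sym
subst _       f (Var i)    = f i
subst arityOf f (App s es) =
  App s [ subst arityOf (extendSubst arityOf k f) e
        | ((_, k), e) <- zip (arityOf s) es ]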

Definition 3.18. Any renaming r : γ → δ induces a substitution r̄ : δ → γ, with r̄(i) := var_{r(i)}. In particular, each scope δ has an identity substitution id_δ : i ↦ var_i. Substitutions f : γ → δ and g : δ → θ may be composed to give a substitution g ◦ f : γ → θ, defined by (g ◦ f)(k) := f∗(g(k)).

We often write r instead of r̄, a slight notational abuse grounded in the next proposition.

Proposition 3.19. For all suitable renamings r, substitutions f, g, and expressions e:

1. Substitution generalises renaming: r̄∗e = r∗e.

2. Identity substitution is trivial: (id_δ)∗e = e.

3. Substitution commutes with renaming:

r∗(f∗e) = (i ↦ r∗f(i))∗e   and   f∗(r∗e) = (i ↦ f(r(i)))∗e.

4. Substitution respects composition: f∗(g∗e) = (g ◦ f)∗e.

5. Composition of substitutions is unital and associative:

f = id ◦ f = f ◦ id   and   f ◦ (g ◦ h) = (f ◦ g) ◦ h.

Proof. All direct by structural induction on expressions, as in the standard proofs for de Bruijn syntax.

The interaction of substitutions with signature maps is similarly straightforward: signature maps act functorially on raw substitutions, and the constructions of this section are natural with respect to the action. More precisely:

Proposition 3.20. Given a signature map F : Σ → Σ′, and a raw substitution f : δ → γ over Σ, there is a raw substitution F∗f : δ → γ over Σ′ given by (F∗f)(i) = F∗(f(i)), and the action respects composition and identities in F. Moreover, given such F, for all suitable f and e, we have F∗(f∗e) = (F∗f)∗(F∗e); similarly, for all suitable f, g, we have F∗(f ◦ g) = F∗f ◦ F∗g.


3.5 Metavariable extensions and instantiations

As mentioned in the introduction, when we write down the rules of type theories, we will need to extend the ambient signature Σ by new symbols to represent the metavariables of the rule.

For instance, consider the constructor Π, with arity [(ty, 0), (ty, 1)]. In the rule for Π formation (Example 3.12), we shall use two new symbols A, B, corresponding to the arguments of Π in the premises of the rule. The classes and arities of these new symbols are given by their classes and binders as arguments of Π: they are both type symbols; the first one takes no arguments, and the second one takes one term argument.

Definition 3.21. The simple arity simp γ of a scope γ is the arity indexed by the positions |γ|, and whose arguments all have syntactic class tm with no binding, i.e., simp γ := 〈(tm, 0) | i ∈ |γ|〉.

Definition 3.22. The metavariable extension Σ + α of Σ by arity α is the signature indexed by ind Σ + arg α, defined by

(Σ + α)_{ι0(S)} := Σ_S   and   (Σ + α)_{ι1(i)} := (cl_α i, simp(bind_α i)).

We usually treat the injection of symbols Σ → Σ + α as an inclusion, writing S instead of ι0(S); and for each i ∈ arg α, we write meta_i for the corresponding new symbol ι1(i) of Σ + α, the metavariable symbol for i.

Example 3.23. The symbol λ has arity [(ty, 0), (ty, 1), (tm, 1)]. It thus gives rise to the metavariable extension Σ + [(ty, 0), (ty, 1), (tm, 1)], which adjoins to Σ three metavariable symbols meta_0, meta_1, meta_2, which for readability we may rename to A, B, t, with classes and arities (ty, []), (ty, [(tm, 0)]), and (tm, [(tm, 0)]) respectively. That is, A and B are type symbols and t a term symbol, with the latter two each taking a term argument. These will then appear in the rule for λ, to denote the arguments of a generic instance of λ. We will often use more readable names for metavariable symbols, as here, without further comment.

Let γ be a scope and meta_i(e) a raw expression over signature Σ + α and γ. Then e is a map which assigns to each j ∈ |bind_α i| an expression in Expr^tm_{Σ+α}(γ), because γ ⊕ 0 = γ. Thus we may construe e as a raw substitution e : bind_α i → γ. With this in mind, the following definition explains how metavariables are replaced with expressions.

Definition 3.24. Given a signature Σ, an arity α, and a scope γ, an instantiation I ∈ Inst_{Σ,γ}(α) of α in scope γ is a family of expressions I_i ∈ Expr^{cl_α i}_Σ(γ ⊕ bind_α i), for each i ∈ arg α. Such an instantiation acts on expressions e ∈ Expr^c_{Σ+α}(δ) to give expressions I∗e ∈ Expr^c_Σ(γ ⊕ δ), by replacing each occurrence of meta_i in e with a copy of I_i, with the arguments of meta_i recursively substituted for the corresponding variables in I_i:

I∗(meta_i(e)) := (γ ⊕ e)∗ I_i,
I∗(var_j) := var_{inr(j)},
I∗(S(e)) := S(〈I∗ e_j〉_{j ∈ arg S}).

We call I∗e the instantiation of e with I.


Example 3.25. Anticipating Example 4.19, the rule for function application will be written as follows, where for readability x stands for var_0:

⊢ A type    x:A ⊢ B(x) type    ⊢ s : Π(A, B(x))    ⊢ t : A
----------------------------------------------------------
⊢ app(A, B(x), s, t) : B(t)

All expressions are in the metavariable extension Σ + α_app, where Σ is some ambient signature including Π and app, and α_app = [(ty, 0), (ty, 1), (tm, 0), (tm, 0)] as in Example 3.9. The symbols A, B, s, t are the metavariable symbols of this extension. An instantiation of α_app in scope γ consists of expressions A ∈ Expr^ty_Σ(γ), B ∈ Expr^ty_Σ(γ ⊕ 1), and s, t ∈ Expr^tm_Σ(γ). Instantiating the conclusion with these expressions, over some context Γ, gives the judgement Γ ⊢ app(A, B, s, t) : B[t/x].

Building on the above, instantiations also act on other objects built out of expressions, including substitutions and other instantiations; at the same time, being themselves syntactic objects, instantiations are acted upon by substitutions and signature maps; and all of these are suitably natural and functorial, as follows.

Definition 3.26. Given an instantiation I ∈ Inst_{Σ,γ}(α):

1. The instantiation I acts on a substitution f : δ′ → δ over Σ + α to give a substitution I∗f : γ ⊕ δ′ → γ ⊕ δ, defined by

(I∗f)(inl i) := inl i   (i ∈ γ),
(I∗f)(inr j) := I∗(f_j)   (j ∈ δ).

2. The instantiation I acts on an instantiation J ∈ Inst_{Σ+α,δ}(β) to give an instantiation I∗J ∈ Inst_{Σ,γ⊕δ}(β), defined by (I∗J)_j := I∗(J_j).

3. A substitution f : δ → γ over Σ acts on the instantiation I to give an instantiation f∗I ∈ Inst_{Σ,δ}(α), defined by (f∗I)_i := (f ⊕ bind_α i)∗ I_i.

4. The instantiation I is translated along a signature map F : Σ → Σ′ to give an instantiation F∗I ∈ Inst_{Σ′,γ}(α), defined by (F∗I)_i := F∗(I_i).

Proposition 3.27. For all suitable instantiations I, J, signature maps F, G, substitutions f, g, arities α, expressions e, and scopes γ, δ, θ:

1. Translation along signature maps is functorial:

(G ◦ F)∗I = G∗(F∗I)   and   (id_Σ)∗I = I.

2. The actions of substitutions and instantiations on expressions and on each other are natural with respect to translation along signature maps:

F∗(I∗e) = (F∗I)∗((F + α)∗e),
F∗(I∗f) = (F∗I)∗((F + α)∗f),
F∗(I∗J) = (F∗I)∗((F + α)∗J),
F∗(f∗I) = (F∗f)∗(F∗I).

3. The action of substitutions on instantiations is functorial:

(f ◦ g)∗I = g∗(f∗I)   and   (id_γ)∗I = I.


4. The action of instantiations is natural with respect to substitutions:

(f∗I)∗e = (f ⊕ δ)∗(I∗e),
I∗(g∗e) = (I∗g)∗(I∗e).

5. The action of instantiations is “associative” in the sense that

(I∗J)∗e = I∗(J∗((ι0 + β)∗e))

holds modulo the canonical associativity isomorphism between the scopes (γ ⊕ δ) ⊕ θ and γ ⊕ (δ ⊕ θ) of the left- and right-hand sides.

The above properties, while routine to prove, are a lot of boilerplate. They can be incorporated, if desired, into the statement that syntax forms a scope-graded monad on signatures in the sense of (Orchard et al., 2020), and that instantiations are certain Kleisli maps for this monad. As ever, however, we emphasise in this paper the elementary viewpoint, rather than categorical abstraction.

4 Raw type theories

Having described raw syntax, we proceed with the formulation of raw type theories. These hold enough information to prevent syntactic irregularities, and can be used to specify derivations and derivability, but are qualified as “raw” because they allow arbitrariness and abnormalities that are generally considered undesirable.

4.1 Raw contexts

Definition 4.1. A raw context Γ is a scope γ together with a family on Expr^ty_Σ(γ) indexed by |γ|, i.e., a map |γ| → Expr^ty_Σ(γ). The positions of γ are also called the variables of Γ. We often write just Γ for γ, e.g., i ∈ |Γ| instead of i ∈ |γ|, and Expr^c_Σ(Γ) instead of Expr^c_Σ(γ). We use a subscript for the application of a context Γ to a variable i, such that Γ_i is the type expression at index i.

This definition is somewhat non-traditional for dependent type theories, in a couple of ways. Contexts are more usually defined as lists, so their variables are ordered, and the type of each variable is assumed to depend only on earlier variables, i.e. A_i ∈ Expr^ty_Σ(i). In our definition, the variables form an arbitrary scope, with no order assumed; and each type may a priori depend on any variables of the context.

One may describe the two approaches as sequential and flat contexts, respectively. The flat notion contains all the information needed when contexts are used in derivations; we view the sequentiality as extra information that may be provided later by a derivation of well-formedness of a context, cf. Section 6.1, but that is not needed at the raw level.

Example 4.2. With the de Bruijn index scope system, a raw context Γ of scope n may be written as 〈A_i ∈ Expr^ty_Σ(n) | i ∈ |n|〉. A more familiar way to display Γ is as the list [(n−1) : A_{n−1}, . . . , 0 : A_0], where each A_i ∈ Expr^ty_Σ(n), but as raw contexts follow the flat approach, we should not think of this as imposing an order on the variables, and hence the list [0 : A_0, . . . , (n−1) : A_{n−1}] denotes the same context Γ. Note, by the way, that at this stage contexts on de Bruijn indices are indistinguishable from contexts on de Bruijn levels. The difference becomes apparent once we consider context extension, and the scope coproduct inclusions come into play.


Definition 4.3. Let Γ be a raw context, δ be a scope, and ∆ : |δ| → Expr^ty_Σ(|Γ| ⊕ δ) a family of expressions indexed by |δ|. The context extension Γ.∆ is the raw context of scope |Γ| ⊕ δ, defined as

(Γ.∆)_{inl j} := (inl∗ ◦ Γ)_j   and   (Γ.∆)_{inr k} := ∆_k.

In other words, the extended raw context Γ.∆ is the map [inl∗ ◦ Γ, ∆] induced by the universal property of the coproduct ||Γ| ⊕ δ|.

Example 4.4. To continue Example 4.2, we can consider how context extension works for de Bruijn indices. Let Γ = [2: A_2, 1: A_1, 0: A_0], where each A_i ∈ Expr^ty_Σ(3), and ∆ = [1: B_1, 0: B_0] with B_j ∈ Expr^ty_Σ(3 + 2). The coproduct inclusions are inl i = i + 2 and inr j = j. The context Γ.∆ has scope 3 + 2 = 5, and is given by [(i ↦ i + 2)∗ ◦ Γ, ∆], which computes to [4: inl∗A_2, 3: inl∗A_1, 2: inl∗A_0, 1: B_1, 0: B_0]. The effect is that the variables from Γ are renamed according to the scope of ∆, and the renaming acts on the associated type expressions accordingly, i.e., by shifting all variables by 2.
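In the same running Haskell sketch (reusing Expr, Arity and rename), flat raw contexts over de Bruijn indices and the context extension of Definition 4.3 can be written as follows; the computation agrees with Example 4.4.

-- A raw context of scope n: one type expression in scope n per position,
-- stored with position i at list index i.
type RawContext sym = [Expr sym]

-- Context extension Γ.Δ (Definition 4.3): the new types already live in the
-- extended scope; the old types are weakened along inl, i.e. their variables
-- are shifted up by the number of new positions.
extendContext :: (sym -> Arity) -> RawContext sym -> [Expr sym] -> RawContext sym
extendContext arityOf gamma delta =
  delta ++ map (rename arityOf (+ length delta)) gamma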

Note that with raw contexts, we weaken types when extending a context. In approaches using sequential contexts with scoped syntax, weakening is instead performed when types are taken out of a context: that is, the variable rule (precisely stated) concludes Γ ⊢ var_i : ι∗Γ_i, where ι : i → n is the inclusion of an initial segment into the full context.

In concrete examples, we will write contexts in a more traditional style, as lists of variable names with their associated types:

x_1 : A_1, . . . , x_n : A_n.

Like other syntactic objects, raw contexts are acted upon by signature maps and instantiations. The functoriality and commutation properties for these actions follow directly from the corresponding properties for expressions.

Definition 4.5. Given a signature map F : Σ → Σ′ and a raw context Γ over Σ, the translation of Γ by F is the raw context F∗Γ over Σ′, with |F∗Γ| := |Γ| and (F∗Γ)_i := F∗Γ_i.

Proposition 4.6. The action of signature maps on raw contexts is functorial:

F∗(G∗Γ) = (F ◦ G)∗Γ   and   (id_Σ)∗Γ = Γ,

for all suitable F, G, Γ.

The action of instantiations is a little more subtle. Acting pointwise on the expressions of the context is not enough, since the instantiated expressions lie in a larger scope. We need to supply extra types for the extra scope, i.e., the scope of the instantiation must itself underlie a raw context.

Definition 4.7. An instantiation I in context Γ over Σ for arity α is an instantiation I ∈ Inst_{Σ,|Γ|}(α). Then for any raw context ∆ over Σ + α, the context instantiation (I, Γ)∗∆ is the context over Σ with scope |Γ| ⊕ |∆| and with type expressions

((I, Γ)∗∆)_{inl(i)} := inl∗ Γ_i   (i ∈ γ),
((I, Γ)∗∆)_{inr(j)} := I∗∆_j   (j ∈ δ).

Briefly, (I, Γ)∗∆ is the context extension of Γ by the instantiations of the types of ∆.


Proposition 4.8. Given instantiations

I ∈ Inst_{Σ,|Γ|}(α) in context Γ,
K ∈ Inst_{Σ+α,|∆|}(β) in context ∆, and
Θ ∈ Inst_{(Σ+α)+β,|Θ|}(γ) in context Θ,

the equation

(I, Γ)∗((K, ∆)∗Θ) = (I∗K, (I, Γ)∗∆)∗Θ

holds modulo the canonical isomorphism between the scopes |Γ| ⊕ (|∆| ⊕ |Θ|) and (|Γ| ⊕ |∆|) ⊕ |Θ| of their right- and left-hand sides.

4.2 Judgements

Our type theories have four primitive judgement forms, following Martin-Löf (Martin-Löf, 1984):

A type        “A is a type”
t : A         “term t has type A”
A ≡ B         “A and B are equal as types”
s ≡ t : A     “s and t are equal as terms of type A”

These are represented with symbols ty, tm, tyeq, and tmeq, respectively. For our purposes we need to describe the judgement forms quite precisely. In fact, the following elaboration may seem a bit too precise, but we found it quite useful in the formalisation to make explicit all the concepts involved and distinctions between them.

Each judgement form has a family of boundary slots and possibly a head slot, where each slot has an associated syntactic class, as follows:

Form     Boundary        Head
ty       []              ty
tm       [ty]            tm
tyeq     [ty, ty]
tmeq     [tm, tm, ty]

The table encodes the familiar constituent parts of the judgement forms:

1. “A type” has no boundary slots; the head slot, indicated by A, is a type.

2. “t : A” has one boundary type slot indicated by A, called the underlying type; the head, indicated by t, is a term.

3. “A ≡ B” has two type slots indicated by A and B, called the left-hand side and right-hand side; there is no head.

4. “s ≡ t : A” has two term slots indicated by s and t, and a type slot indicated by A, called the left-hand side, the right-hand side and the underlying type, respectively; there is no head.

The slots of a judgement form are the slots of its boundary, and the head, if present.

Definition 4.9. Given a raw context Γ and a judgement form φ, a hypothetical judgement of that form over Γ is a map J taking each slot of φ of syntactic class c to an element of Expr^c_Σ(Γ). We write Judg Σ for the set of all hypothetical judgements over Σ. The types of Γ are the hypotheses and J is the thesis of the judgement. When there is no ambiguity, we will (following traditional usage) speak of a judgement to mean either a whole hypothetical judgement Γ ⊢ J, or just a thesis J.

A hypothetical judgement is an object judgement if it is a term or a type judgement, and an equality judgement if it is a type or a term equality judgement.

Example 4.10. The boundary and the head slots of the judgement form tm are [ty] and tm, respectively. Thus a hypothetical judgement of this form over a raw context Γ is a map taking the slot in the boundary to a type expression A ∈ Expr^ty_Σ(Γ) and the head slot to a term expression t ∈ Expr^tm_Σ(Γ). This corresponds precisely to the information conveyed by a traditional hypothetical term judgement “Γ ⊢ t : A”.
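The four judgement forms and their slots can be transcribed into the running Haskell sketch (reusing Expr and RawContext); as before this records the data only, with no well-formedness enforced.

-- Theses of the four judgement forms (Section 4.2), with boundary slots first
-- and, for the two object forms, the head slot last.
data Thesis sym
  = IsType  (Expr sym)                        -- head A:             "A type"
  | HasType (Expr sym) (Expr sym)             -- boundary A, head t: "t : A"
  | TypeEq  (Expr sym) (Expr sym)             -- boundary A, B:      "A ≡ B"
  | TermEq  (Expr sym) (Expr sym) (Expr sym)  -- boundary s, t, A:   "s ≡ t : A"

-- A hypothetical judgement Γ ⊢ J (Definition 4.9): a raw context, whose types
-- are the hypotheses, together with a thesis over the scope of that context.
data Judgement sym =
  Judgement { hypotheses :: RawContext sym, thesis :: Thesis sym }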

In view of Example 4.10, we shall write a hypothetical judgement over Γ given by a map J in the traditional type-theoretic way

Γ ⊢ J

where the elements of J are displayed in the corresponding slots.

Just like raw contexts, judgements are acted on by signature maps and instantiations.

Definition 4.11. Given a signature map F : Σ → Σ′ and a judgement Γ ⊢ J over Σ, the translation F∗(Γ ⊢ J) is the judgement F∗Γ ⊢ F∗J over Σ′ of the same form, where the thesis F∗J is J with F∗ applied pointwise to each expression.

Proposition 4.12. This action is functorial: F∗(G∗(Γ ⊢ J)) = (F ◦ G)∗(Γ ⊢ J), and (id_Σ)∗(Γ ⊢ J) = (Γ ⊢ J), for all suitable F, G, Γ, J.

Definition 4.13. Given a signature Σ, a hypothetical judgement ∆ ⊢ J over a metavariable extension Σ + α, a raw context Γ over Σ, and an instantiation I of α in context Γ, the judgement instantiation (I, Γ)∗(∆ ⊢ J) is the judgement Γ . I∗∆ ⊢ I∗J over Σ, where the thesis I∗J is just J with I∗ applied pointwise.

Proposition 4.14. Given instantiations

I ∈ Inst_{Σ,|Γ|}(α) in context Γ and
K ∈ Inst_{Σ+α,|∆|}(β) in context ∆,

and a judgement Θ ⊢ J over (Σ + α) + β, the equation

(I, Γ)∗((K, ∆)∗(Θ ⊢ J)) = (I∗K, (I, Γ)∗∆)∗(Θ ⊢ J)

holds modulo the canonical associativity renaming between their scopes.

4.3 Boundaries

In many places, one wants to consider data amounting to a hypothetical judgement without a head expression (if it is of object form, and so should have a head). For instance, a goal or obligation in a proof assistant is specified by such data; or when adjoining a new well-formed rule to a type theory, before picking a fresh symbol for it (if it is an object rule), the conclusion is specified by such data.

These entities crop up frequently, and seem almost as fundamental as judgements, so deserve a name.


Definition 4.15. Given a raw context Γ and a judgement form φ, a hypothetical boundary of form φ over Γ is a map B taking each boundary slot of φ of syntactic class c to an element of Expr^c_Σ(Γ).

We display boundaries as judgements with a hole □ where the head should stand, or with ≡? in place of ≡:

□ type    □ : A    A ≡? B    s ≡? t : A

Since equality judgements have no heads, there is no difference in data between an equality judgement and an equality boundary, but there is one of sense: the judgement A ≡ B asserts an equality holds, whereas the boundary A ≡? B is a goal to be established or postulated. Analogously, □ type and □ : A can be read as goals, the former that a type be constructed, and the latter that A be inhabited.

The terminology about judgements, as well as many constructions, carries over to boundaries. In particular, the action of signature maps and instantiations on boundaries is defined just as in Definitions 4.11 and 4.13, and enjoys properties analogous to Propositions 4.12 and 4.14.

Finally, and crucially, boundaries can be completed to judgements. The data required depends on the form: completing an object boundary requires a head expression; completing an equality boundary, just a change of view.

Definition 4.16. Let B be a boundary in scope γ over Σ.

1. If B is of object form, then given an expression e of the class of B in scope γ, write B[e] for the completion of B with head e, a judgement over γ.

2. If B is of equality form, then the completion of B is just B itself, viewed as a judgement.

Proposition 4.17. Completion of boundaries is natural with respect to signature maps: F∗(B[e]) = (F∗B)[F∗e].
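As an illustration of Definitions 4.15 and 4.16, the following self-contained OCaml sketch represents boundaries as judgements with the head removed, and implements completion. It is indicative only, using the same simplified expr representation as before; none of the names come from the Coq development.

  (* Illustrative sketch: boundaries and their completion to judgements. *)
  type expr = Var of int | Sym of string * expr list

  type context = expr list

  type judgement =
    | IsType of context * expr
    | IsTerm of context * expr * expr          (* head, then its type *)
    | EqType of context * expr * expr
    | EqTerm of context * expr * expr * expr   (* lhs, rhs, type *)

  type boundary =
    | BType of context                         (* □ type      *)
    | BTerm of context * expr                  (* □ : A       *)
    | BEqType of context * expr * expr         (* A ≡? B      *)
    | BEqTerm of context * expr * expr * expr  (* s ≡? t : A  *)

  (* An object boundary is completed by a head; an equality boundary is
     already a judgement, merely viewed differently. *)
  let complete (b : boundary) (head : expr option) : judgement option =
    match b, head with
    | BType g, Some a             -> Some (IsType (g, a))
    | BTerm (g, a), Some t        -> Some (IsTerm (g, t, a))
    | BEqType (g, a, b'), None    -> Some (EqType (g, a, b'))
    | BEqTerm (g, s, t, a), None  -> Some (EqTerm (g, s, t, a))
    | _                           -> None   (* head missing or superfluous *)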

4.4 Raw rules

We now come to raw rules, syntactic entities that capture the notion of “templates” that are traditionally used to display the inference rules of a type theory. The raw rules include all the information needed in order to be used for defining derivations and derivability of judgements — but they do not yet include the extra properties we typically check when considering rules, and which guarantee good properties of the resulting derivability predicates. We return to these later, in Section 5.1.

Definition 4.18. A raw rule R over a signature Σ consists of an arity αR, together with a family of judgements over the extended signature Σ + αR, the premises of R, and one more judgement over Σ + αR, the conclusion of R. An object rule is one whose conclusion is an object judgement; otherwise it is an equality rule.

Example 4.19. Following on from Example 3.9, the raw rule for function application has arity

αapp = [(ty, 0), (ty, 1), (tm, 0), (tm, 0)].

Writing A, B, s, t for the metavariable symbols of the extended signature Σ + αapp, the premises of the rule are the four-element family:

[ ` A type, [0 : A] ` B(var0) type, ` s : Π(A,B(var0)), ` t : A ]


and its conclusion is the hypothetical judgement

` app(A,B(var0), s, t) : B(t).

Of course, the traditional type-theoretic way of displaying such a rule is

` A type x:A ` B(x) type ` s : Π(A,B(x)) ` t : A

` app(A,B(x), s, t) : B(t)

It may seem surprising that we have B(x) and B(t) rather than, say, B and B[t/x], since this style is usually apologised for as an abuse of notation. Here, it is precise and formal; B is a metavariable symbol in Σ + αapp, so it is applied to arguments when used in the syntax. Also note that the occurrences of x in the third premise and the conclusion are implicitly bound by Π and app, as can be discerned from their arities. When we instantiate the rule below with actual expressions A, B, s and t, then B(x) and B(t) will be translated into B and B[t/x] respectively.
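As a data-level illustration of Definition 4.18 and Example 4.19, the application rule can be written down in OCaml as follows. This is a sketch only, not the authors' formalisation: binding information is elided, metavariables are plain strings, and the constructor names are ad hoc.

  (* Illustrative sketch: the raw rule for application as data. *)
  type syn_class = Ty | Tm

  type expr =
    | Var of int                   (* context / bound variables       *)
    | Sym of string * expr list    (* symbols of Σ, e.g. "Pi", "app"  *)
    | Meta of string * expr list   (* metavariable symbols of αapp    *)

  type judgement =
    | IsType of expr list * expr
    | IsTerm of expr list * expr * expr   (* context, head, type *)

  type raw_rule = {
    arity : (syn_class * int) list;       (* syntactic class and binder count *)
    premises : judgement list;
    conclusion : judgement;
  }

  let app_rule : raw_rule = {
    arity = [ (Ty, 0); (Ty, 1); (Tm, 0); (Tm, 0) ];
    premises = [
      IsType ([], Meta ("A", []));                                     (* ⊢ A type         *)
      IsType ([ Meta ("A", []) ], Meta ("B", [ Var 0 ]));              (* x:A ⊢ B(x) type  *)
      IsTerm ([], Meta ("s", []),
              Sym ("Pi", [ Meta ("A", []); Meta ("B", [ Var 0 ]) ]));  (* ⊢ s : Π(A,B(x))  *)
      IsTerm ([], Meta ("t", []), Meta ("A", []));                     (* ⊢ t : A          *)
    ];
    conclusion =
      IsTerm ([],
              Sym ("app", [ Meta ("A", []); Meta ("B", [ Var 0 ]);
                            Meta ("s", []); Meta ("t", []) ]),
              Meta ("B", [ Meta ("t", []) ]));                         (* ⊢ app(A,B(x),s,t) : B(t) *)
  }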

Definition 4.20. The functorial action of a signature map F : Σ → Σ′ is the map F∗ which takes a rule R over Σ to the rule F∗R over Σ′ whose arity is the arity αR of R, and whose premises and conclusion are those of R, all translated along the action of the induced map F + αR : Σ + αR → Σ′ + αR.

A raw rule should not itself be thought of as a closure rule (though formally it is one), but rather as a template specifying a whole family of closure rules.

Definition 4.21. Given a rule R, a raw context Γ, and an instantiation I of its arity αR over Γ, all over a signature Σ, the instantiation of R under I, Γ is the closure rule (I,Γ)∗R on Judg Σ whose premises and conclusion are the instantiations of the corresponding judgements of R under I, Γ. The closure system clR associated to R is the family 〈(I,Γ)∗R | Γ ∈ Cxt Σ, I ∈ InstΣ,Γ(αR)〉 of all such instantiations.

Example 4.22. Continuing Example 4.19, the raw rule Rapp for application gives the closure system cl(Rapp), containing for each raw context Γ (with scope γ) and expressions A,B ∈ ExprtyΣ(γ), s, t ∈ ExprtmΣ (γ) a closure rule

Γ ` A type Γ, x:A ` B type Γ ` s : Π(A,B) Γ ` t : A

Γ ` app(A,B, s, t) : B[t/x].

This is visually similar to Rapp itself, but not to be confused with it. The instantiation is a single closure rule, written over the ambient signature Σ; the original raw rule, written over Σ + αapp, is a template specifying the whole family of such closure rules. In the raw rule, A, B, s, t are metavariable symbols from the extended signature Σ + αapp; in the instantiation, the symbols A, B, s, t (note the difference in fonts) are the actual syntactic expressions the raw rule was instantiated with.

This construction of cl formalises the usual informal explanation that a single written rule is a shorthand for a scheme of closure conditions, with the quantification of the scheme inferred from the written rule.

Proposition 4.23. The construction cl is laxly natural in signature maps, in that given F : Σ → Σ′ and a rule R over Σ, there is an induced simple map of closure systems clR → clF∗R, over F∗ : Judg Σ → Judg Σ′.


Proof. For each instantiation I ∈ InstΣ,|Γ|(αR), we have F∗I ∈ InstΣ′,|Γ|(αR) and (F∗I, F∗Γ)∗R = F∗((I,Γ)∗R).

This is lax in the sense that the resulting map clR → clF∗R will not usually be an isomorphism: in general, not every instantiation of αR over Σ′ is of the form F∗I. This illustrates the need for considering raw rules formally, rather than just viewing a type theory as a collection of closure rules: when translating a type theory between signatures, we want not just the translations of the original closure rules, but all instantiations of the translated raw rules.

One might hope for cl to be similarly laxly natural under instantiations. However, this is not so straightforwardly true; we will return to this in Proposition 4.31, once the structural rules are introduced, and show a weaker form of naturality.

4.5 Structural rules

The rules used in derivations over a type theory will fall into two groups:

1. the structural rules, governing generalities common to all type theories;

2. the specific rules of the particular type theory.

The structural rules over a signature Σ are a family of closure rules on Judg Σ, which we now lay out. They are divided into four families:

• the variable rules,

• rules stating that equality is an equivalence relation,

• rules for conversion of terms and term equations between equal types, and

• rules for substitutions.

We have chosen the rules so that the development of the general setup requires no hard meta-theorems, as far as possible. In particular, we include the substitution rules in the formalism so that we can postpone proving elimination of substitution until Section 5.4. You might have expected to see congruence rules among the structural rules, but those we take care of separately in Section 4.6 because they depend on the specific rules.

The first three families of structural rules are straightforward.

Definition 4.24. For each raw context Γ over a signature Σ, and for each i ∈ |Γ|, the corresponding variable rule is the closure rule

Γ ` Γi type

Γ ` vari : Γi

Taken together, the variable rules form a family indexed by such pairs (Γ, i).

While this had to be given directly as a family of closure rules, the next two groups of structural rules can be expressed as raw rules.


Definition 4.25. The raw equivalence relation rules are the following raw rules:

` A type

` A ≡ A

` A type ` B type ` A ≡ B

` B ≡ A

` A type ` B type ` C type ` A ≡ B ` B ≡ C

` A ≡ C

` A type ` s : A

` s ≡ s : A

` A type ` s : A ` t : A ` s ≡ t : A

` t ≡ s : A

` A type ` s : A ` t : A ` u : A ` s ≡ t : A ` t ≡ u : A

` s ≡ u : A

The equivalence relation rules over Σ are the sum of the closure systems associated to the above equivalence relation rules, over a given signature Σ.

We trust the reader to be able to read off the arities of the metavariable symbols appearing in raw rules. For instance, from the use of A and s in the above term reflexivity rule we can tell that the rule has arity [(ty, 0), (tm, 0)].

The conversion rules are written as raw rules, as well.

Definition 4.26. The raw conversion rules are the following raw rules:

` A type ` B type ` s : A ` A ≡ B

` s : B

` A type ` B type ` s : A ` t : A ` s ≡ t : A ` A ≡ B

` s ≡ t : B

Again, the conversion rules over Σ are the sum of the closure systems associated to the above conversion rules, over a signature Σ.

The remaining groups are the substitution and equality-substitution rules.

The substitution rule should formalise the notion that "well-typed" substitutions preserve derivability of judgements. Treatments taking simultaneous substitution as primitive usually say something like: a raw substitution f : ∆ → Γ is well-typed if ∆ ` f(i) : f∗Γi for each i ∈ |Γ|. However, taking all these judgements as premises in the substitution rule is rather profligate: most substitutions in practice act non-trivially only on a small part of the context. For instance, a single-variable substitution may be represented as a raw substitution Γ → Γ . A acting trivially on Γ, so no checking should be required there. Indeed, treatments taking single-variable substitution as primitive only require checking of the substituted expression. To abstract this situation, we define the substitution rule as follows. Recall that a subset X ⊆ Y is complemented when X ∪ (Y \X) = Y , a condition that is vacuously true in classical logic.

Definition 4.27. A raw substitution f : ∆ → Γ acts trivially at i ∈ |Γ| when there is some (necessarily unique) j ∈ |∆| such that f(i) = varj and ∆j = f∗Γi. Given a complemented subset K ⊆ |Γ| on which f acts trivially, the corresponding substitution rule is the closure rule

Γ ` J    for each i ∈ |Γ| \K: ∆ ` f(i) : f∗Γi

∆ ` f∗J    (4.1)


The substitution rules form a family of closure rules, indexed by f : ∆ → Γ, K, and Γ ` J .

In the above definition, K is thought of as a set of positions at which f is guaranteed to act trivially, but it may also do so outside K, as there is no harm in checking positions at which f acts trivially.
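The side condition "f acts trivially at i" is easy to check mechanically. The following OCaml sketch does so in a simplified, binder-free setting, so that substitution is naive; it is illustrative only, with contexts given positionwise as functions, and all names are ad hoc.

  (* Illustrative sketch: checking that a raw substitution acts trivially
     at a position, as in Definition 4.27. *)
  type expr = Var of int | Sym of string * expr list

  (* Naive substitution: replace each variable i by f i (no binders here). *)
  let rec subst (f : int -> expr) (e : expr) : expr =
    match e with
    | Var i -> f i
    | Sym (s, args) -> Sym (s, List.map (subst f) args)

  let acts_trivially_at
      ~(gamma : int -> expr)     (* types Γ_i of the target context Γ *)
      ~(delta : int -> expr)     (* types ∆_j of the source context ∆ *)
      ~(f : int -> expr)         (* raw substitution f : ∆ → Γ        *)
      (i : int) : bool =
    match f i with
    | Var j -> delta j = subst f (gamma i)   (* f(i) = var_j and ∆_j = f_* Γ_i *)
    | _ -> false

  (* The substitution rule (4.1) then only demands ∆ ⊢ f(i) : f_*Γ_i at the
     positions i outside a chosen complemented set K of trivial positions. *)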

The substitution rules are formulated carefully for another, more technical reason. In inductions over derivations (e.g. for Lemma 5.20), when a substitution descends under a binder, it gets extended to act trivially on the variables introduced by the binder. Within the inductive cases, we may not yet have enough information to conclude that the types of the bound variables are well-formed, but we can rely on the trivial action of the substitution. Keeping substitution rules flexible and economical in this way therefore keeps these inductive proofs much cleaner.

Along similar lines, we have rules stating that substitution of equal terms gives equal results.

Definition 4.28. Raw substitutions f, g : ∆ → Γ act jointly trivially at i ∈ |Γ| when there is some (necessarily unique) j ∈ |∆| such that f(i) = g(i) = varj and ∆j = f∗Γi = g∗Γi. Given a complemented subset K ⊆ |Γ| on which f and g act jointly trivially, the corresponding equality-substitution rules are the closure rules

Γ ` A type    for each i ∈ |Γ| \K:

∆ ` f(i) : f∗Γi ∆ ` g(i) : g∗Γi ∆ ` f(i) ≡ g(i) : f∗Γi

∆ ` f∗A ≡ g∗A

Γ ` t : A    for each i ∈ |Γ| \K:

∆ ` f(i) : f∗Γi ∆ ` g(i) : g∗Γi ∆ ` f(i) ≡ g(i) : f∗Γi

∆ ` f∗t ≡ g∗t : f∗A

The equality-substitution rules form a family of closure rules, indexed by f : ∆ → Γ, K, and either Γ ` A type or Γ ` t : A.

Definition 4.29. The structural rules over Σ, denoted Struct Σ, are the sum of the families of closure rules set out above: the variable, equivalence relation, conversion, substitution, and equality-substitution rules.

Proposition 4.30. Given a signature map F : Σ → Σ′, there is a simple map of closure systems Struct Σ → Struct Σ′ over F∗ : Judg Σ → Judg Σ′.

Proof. This is straightforward, amounting to checking that for each instance of a structural rule over Σ, F acts on the data to give an instance of the same structural rule over Σ′, and the resulting closure condition is the translation along F∗ of the original closure condition over Σ.

Before giving a similar statement about instantiations of structural rules, we must first tie up the loose end from above about instantiation of closure systems of raw rules.

Proposition 4.31. Let R be a raw rule over Σ, R + β its translation to an extension Σ + β, and I an instantiation of β in some context Γ. Then there is a closure system map cl(R + β) → clR + Struct Σ, over (I,Γ)∗ : Judg (Σ + β) → Judg Σ.


Proof. We need to show that for each instantiation K ∈ InstΣ+β,|∆|(αR) in some context ∆, the closure condition I∗((K,∆)∗R) is derivable from clR + Struct Σ.

Given such K and ∆, we can instantiate both under I to get an instantiation of R over Σ. We might hope that (I∗K, (I,Γ)∗∆)∗R = I∗((K,∆)∗(R + β)); by Proposition 4.14, we see that this does not hold strictly, but only up to an associativity renaming in each judgement.

The substitution structural rule comes to our rescue here. For each judgement J involved in R, with context Θ, the associativity renamings give substitutions between I∗(K∗Θ) and (I∗K)∗Θ acting trivially at every position, so the substitution rule lets us derive I∗(K∗J) from (I∗K)∗J (with no further premises), and vice versa.

The desired derivation of I∗((K,∆)∗(R + β)) from clR + Struct Σ therefore consists of (I∗K, (I,Γ)∗∆)∗R, together with an instance of the substitution rule after the conclusion and before each premise, implementing the associativity renamings.

Proposition 4.32. Let I ∈ InstΣ,|Γ|(α) be an instantiation in context Γ. Then there is a closure system map Struct(Σ + α) → Struct Σ, over (I,Γ)∗ : Judg (Σ + α) → Judg Σ.

Proof. For the structural rules given as raw rules, the required derivations are given by Proposition 4.31.

For the other structural rules, we start as in Proposition 4.30. Given an instance of a structural rule over Σ + α, we instantiate its data under I to get an instance of the same structural rule over Σ. Wrapping this instance in associativity renamings, derived by the substitution rule as in Proposition 4.31, gives the required derivation of the instantiation of the original instance.

4.6 Congruence rules

Congruence rules, which state that judgemental equality commutes with type and term symbols, are peculiar enough to demand special attention.

They are present in almost all type theories, but rarely explicitly written out, and are often classified as structural rules. We reserve that term for the rules of the preceding section, which are independent of the specific theory under consideration. Congruence rules, by contrast, depend on the specific rules of a theory; for instance, the congruence rule for Π is determined by the formation rule for Π.

In this section we define how any object rule determines an associated congruence rule. We first set up an auxiliary definition, associating equality judgements to object judgements.

Definition 4.33. For signature maps `, r : Σ → Σ′ and an object judgement J over Σ, we define the equality judgement (`, r)∗J over Σ′ by

(`, r)∗(Γ ` A type) := (`∗Γ ` `∗A ≡ r∗A),

(`, r)∗(Γ ` t : A) := (`∗Γ ` `∗t ≡ r∗t : `∗A).

Definition 4.34. Suppose R is a raw object rule over a signature Σ, with premises 〈Pi | i ∈ I〉 and conclusion C. Let φi be the judgement form of Pi, and take Iob := {i ∈ I | φi ∈ {ty, tm}}, the set of object premises of R. The associated congruence rule R≡ is a raw rule with arity αR≡ := αR + αR, defined as follows, where `, r :


Σ + αR → Σ + αR≡ are signature maps

`(ι0(S)) := ι0(S), r(ι0(S)) := ι0(S),

`(ι1(M)) := ι1(ι0(M)), r(ι1(M)) := ι1(ι1(M)) :

1. The premises of R≡ are indexed by the set I + I + Iob, and are given by:

(a) the ι0(i)-th premise is `∗Pi,

(b) the ι1(j)-th premise is r∗Pj ,

(c) the ι2(k)-th premise is the equality (`, r)∗Pk, cf. Definition 4.33.

2. The conclusion of R≡ is (`, r)∗C.

Example 4.35. Definition 4.34 works as expected. For example, the congruence rule associated with the usual product formation rule

` A type x:A ` B(x) type

` Π(A,B(x)) type

comes out to be

` A′ type    x:A′ ` B′(x) type
` A′′ type    x:A′′ ` B′′(x) type
` A′ ≡ A′′    x:A′ ` B′(x) ≡ B′′(x)

` Π(A′,B′(x)) ≡ Π(A′′,B′′(x))
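The construction of Definition 4.34 is entirely mechanical, as the following OCaml sketch indicates. It is only illustrative: signatures, arities and binding are elided, and the two copies of the metavariables are produced by priming names rather than by the coproduct αR + αR; all type and function names are ad hoc.

  (* Illustrative sketch: forming the congruence rule of an object rule. *)
  type expr = Var of int | Sym of string * expr list | Meta of string * expr list

  type judgement =
    | IsType of expr list * expr
    | IsTerm of expr list * expr * expr          (* context, head, type *)
    | EqType of expr list * expr * expr
    | EqTerm of expr list * expr * expr * expr   (* context, lhs, rhs, type *)

  type rule = { premises : judgement list; conclusion : judgement }

  (* Rename every metavariable, e.g. A ↦ A' (left copy) or A ↦ A'' (right copy). *)
  let rec rename (suffix : string) (e : expr) : expr =
    match e with
    | Var i -> Var i
    | Sym (s, args) -> Sym (s, List.map (rename suffix) args)
    | Meta (m, args) -> Meta (m ^ suffix, List.map (rename suffix) args)

  let rename_judgement (suffix : string) (j : judgement) : judgement =
    let r = rename suffix in
    let rc = List.map r in
    match j with
    | IsType (c, a) -> IsType (rc c, r a)
    | IsTerm (c, t, a) -> IsTerm (rc c, r t, r a)
    | EqType (c, a, b) -> EqType (rc c, r a, r b)
    | EqTerm (c, s, t, a) -> EqTerm (rc c, r s, r t, r a)

  (* The associated equality judgement (ℓ, r)_* J of an object judgement J:
     context and boundary from the left copy, heads from both copies. *)
  let eq_of_object (j : judgement) : judgement option =
    match j with
    | IsType (c, a) ->
        Some (EqType (List.map (rename "'") c, rename "'" a, rename "''" a))
    | IsTerm (c, t, a) ->
        Some (EqTerm (List.map (rename "'") c, rename "'" t, rename "''" t, rename "'" a))
    | _ -> None

  (* Premises: left copies, right copies, and equations for the object
     premises; conclusion: the associated equality of the conclusion. *)
  let congruence_rule (r : rule) : rule option =
    match eq_of_object r.conclusion with
    | None -> None                               (* only object rules qualify *)
    | Some conclusion ->
        let left = List.map (rename_judgement "'") r.premises in
        let right = List.map (rename_judgement "''") r.premises in
        let eqs = List.filter_map eq_of_object r.premises in
        Some { premises = left @ right @ eqs; conclusion }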

4.7 Raw type theories

After a considerable amount of preparation, we are finally in a position to formulate what a rudimentary general type theory is.

Definition 4.36. A raw type theory T over a signature Σ is a family of raw rules over Σ.

Definition 4.37. The associated closure system of a raw type theory T over Σ is the closure system clT := Struct Σ + ∐R∈T clR on Judg Σ; that is, it consists of the structural rules for Σ, and the closure rules generated by the instantiations of the rules of T . A derivation in T is a derivation over the closure system clT , in the sense of Definition 2.8.

Note that we have not included the congruence rules in the closure system associated with a raw type theory. Instead, the presence of congruence rules will be required separately as a well-behavedness condition in Section 5.2. Derivability and admissibility of rules may now be defined as follows.

Definition 4.38. Let T be a raw type theory over Σ, and R a raw rule over Σ.

1. R is derivable from T if its conclusion is derivable from its premises, over T +αR.

2. R is admissible for T if for every instance (I,Γ)∗R, its conclusion is derivable if its premises are derivable, all over T .

We record the basic category-theoretic structure of raw type theories.


Definition 4.39. Given a signature map F : Σ → Σ′, and raw type theories T , T ′ over Σ and Σ′ respectively, a simple map F : T → T ′ over F is a family map T → T ′ over F∗ : RawRule Σ → RawRule Σ′. Such F is thus a map giving for each rule R of T a rule F (R) of T ′, whose premises and conclusion are the translations along F of those of R. There are evident identity simple maps, and composites over composites of signature maps, forming a category over the category of signatures.

Furthermore, a signature map F : Σ → Σ′ acts on a raw type theory T over Σ, to give a raw type theory F∗(T ) over Σ′, which consists of the translations F∗R of the rules R of T . As with family maps, maps T → T ′ over F correspond precisely to maps F∗T → T ′ over idΣ′ . In the case of the inclusion to a metavariable extension ι0 : Σ → Σ + α, we write T + α for the translation ι0∗T of T to Σ + α.

Proposition 4.40. The construction cl is functorial in simple maps: a simple map of raw type theories F : T → T ′ over F : Σ → Σ′ induces a map F∗ : clT → clT ′ over F∗ : Judg Σ → Judg Σ′, and hence provides a translation of any derivation D ∈ DerT (H, (Γ ` J)) to a derivation F∗D ∈ DerT ′(F∗H, (F∗Γ ` F∗J)).

Proof. Direct from the functoriality and naturality properties of structural rules (Proposition 4.30) and of the closure systems associated to raw rules (Proposition 4.23).

Corollary 4.41. A signature map F : Σ → Σ′ acts on D ∈ DerT (H, (Γ ` J)) to give a derivation F∗D ∈ DerF∗T (F∗H,F∗(Γ ` J)), functorially so.

Proof. By Proposition 4.40, using the canonical simple map T → F∗T over F .

We use the previous corollary quite frequently to translate a derivation over a raw type theory to its extension. We mostly leave such applications implicit, as they are easily detected.

Instantiations also preserve derivability, but this is a significantly more involved construction — more so than one might expect — bringing together many earlier constructions and lemmas, and relying in particular on almost all the properties of Proposition 3.27.

Proposition 4.42. Given a raw type theory T over Σ, an instantiation I ∈ InstΣ,Γ(α) induces a closure system map (I,Γ)∗ : cl(T + α) → clT over (I,Γ)∗ : Judg (Σ + α) → Judg Σ, where T + α is the translation of T by the inclusion Σ → Σ + α.

Proof. Again, direct from similar properties of structural rules (Proposition 4.32) and closure systems of raw rules (Proposition 4.31).

Corollary 4.43. Let T be a raw type theory over Σ. An instantiation I ∈ InstΣ,Γ(α) acts on a derivation D ∈ DerT+α(H, (∆ ` J)) to give the instantiation (I,Γ)∗D ∈ DerT ((I,Γ)∗H, I∗(∆ ` J)).

Note that the hypotheses H and the judgement ∆ ` J in the statement reside in the translation T + α of T along the inclusion Σ → Σ + α.

4.8 Summary

Raw type theories give a conceptually minimal way to make precise what is meant by traditional specifications of type theories, and a similarly minimal amount of data from which to define derivability on judgements.


As the name suggests, raw type theories are not a finished product. Type theories in nature almost always satisfy further well-formedness properties, and are rejected by audiences if they do not. In the next two sections, we will discuss these well-formedness properties.

In some ways, raw type theories may therefore be viewed as an unnatural or undesirable notion. However, most of the well-behavedness properties — or rather, the conditions on rules implying well-behavedness — themselves involve checking derivability of certain judgements. So raw type theories, as the minimal data for defining derivability, give a natural intermediate stage on the way to our main definition of "reasonable" type theories.

5 Well-behavedness properties

In this section we identify easily-checked syntactic properties of the rules specifying a type theory, and prove basic fitness-for-purpose meta-theorems, which together articulate the rules-of-thumb that researchers habitually use to verify that some collection of inference rules defines a "reasonable" type theory.

5.1 Acceptable rules

Not all raw rules are deemed reasonable from a type-theoretic point of view. But what standard of "reasonable" are we aiming to delineate? Essentially, the same as for the axioms of a theory in first-order logic: the axioms must be well-formed enough to be given some meaning, although that meaning may be "false", "wrong", or otherwise unexpected.

Consider for instance the following possible modifications of the rule for app, all written as raw rules:

` A type x:A ` B(x) type ` f : Π(A,B(x)) ` a : A

` app(A,B(x), f, a) : B(a)    (5.1)

` A type x:A ` B(x) type ` f : A ` a : A

` app(A,B(x), f, a) : B(a)    (5.2)

` A type x:A ` B(x) type ` f : Π(A,B(x)) ` a : Π(A,B(x))

` app(A,B(x), f, a) : B(a)    (5.3)

The first is the usual rule for app, and should certainly be considered acceptable.

The second asks the argument f to be of type A. This is "wrong" under the usual reading of Π and app, but not entirely meaningless: one can introduce app with this typing rule, and obtain a well-behaved (if bizarre) type theory. So this should be accepted as a type-theoretic rule.

The third asks the argument a to be of type Π(A,B(x)). This is "not even wrong": the conclusion purports to introduce a term of type B(a), but that is not a well-formed type, since B expects an argument of type A, so a is not suitable (at least in the absence of other rules implying that ` A ≡ Π(A,B(x))). This will therefore not be an acceptable rule.


Another unacceptable rule would be:

` A type    x:A ` B(x) type ` f : Π(A,B(x)) ` a : A ` a : Π(A,B(x))

` app(A,B(x), f, a) : B(a)    (5.4)

This is again clearly nonsense: it introduces a twice, with two different types.

There are rules which are not uncommon in practice, but which we will not accept directly, such as:

` f : Π(A,B(x)) ` a : A

` app(A,B(x), f, a) : B(a)    (5.5)

While the rule is completely reasonable, making sense of it is rather subtle: checking, for instance, that B(a) in the conclusion is well-formed requires applying some kind of inversion principle to the type Π(A,B(x)) from the premises. Whether such an inversion principle is available depends on the particularities of the type theory under consideration. In general, we want acceptable rules to be more straightforwardly and robustly well-behaved, so we expect that every metavariable used by the rule is explicitly introduced by some (unique) premise.

Finally, some rules have variant forms given by moving simple premises into the context of the conclusion. For example, the rule for application is sometimes given as

` A type x:A ` B(x) type

x:A, y :Π(A,B(x)) ` app(A,B(x), y, x) : B(x)    (5.6)

This variant has been called the hypothetical form, in contrast to the universal form (5.1). With substitution included as a structural rule, the two forms are equivalent: each is derivable from the other. In the absence of a substitution rule, they are not equivalent; the hypothetical form is too weak. We have also heard it argued that the universal form should be seen as conceptually prior. So both forms are arguably reasonable; but the universal form (5.1), with empty conclusion context, has the clearer claim, and no generality is lost by restricting to such forms.

Summarising the above discussion, there are several simple syntactic criteria commonly used as rules-of-thumb to determine "reasonability" of rules. We now formally define these criteria, and collect them into a definition of acceptability of rules.

Definition 5.1. Suppose R is a raw rule with arity αR over a signature Σ. We say that R is tight when there exists a bijection β between the arguments of αR and the object premises of R, such that for each argument i of αR,

1. the context of the premise β(i) has the scope bindαR i;

2. the judgement form of the premise β(i) is clαR i;

3. the head expression of the premise β(i) is metai(〈varj〉j∈bindαRi).

Note that the bijection β is unique, if it exists. The definition of tightness is admittedly a bit technical, but it captures a well-formedness condition of rules which is familiar but infrequently discussed explicitly. Namely, a rule is tight if its object premises provide the "typing" of its metavariable symbols.
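Tightness is decidable by a direct syntactic check. The following OCaml sketch performs it in the simplified representation used in the earlier sketches (binding elided, the i-th metavariable written Meta i, and only object premises represented); it is indicative only, not the Coq development.

  (* Illustrative sketch: checking tightness (Definition 5.1). *)
  type syn_class = Ty | Tm

  type expr = Var of int | Sym of string * expr list | Meta of int * expr list

  type judgement =
    | IsType of expr list * expr
    | IsTerm of expr list * expr * expr      (* context, head, type *)

  type raw_rule = {
    arity : (syn_class * int) list;
    premises : judgement list;
    conclusion : judgement;
  }

  (* The premise introducing argument i must have the right class, a context
     of the right length, and head  meta_i(var_0, ..., var_{n-1}). *)
  let introduces (i : int) ((cls, binders) : syn_class * int) (p : judgement) : bool =
    let generic = Meta (i, List.init binders (fun j -> Var j)) in
    match cls, p with
    | Ty, IsType (ctx, head) -> List.length ctx = binders && head = generic
    | Tm, IsTerm (ctx, head, _) -> List.length ctx = binders && head = generic
    | _ -> false

  let tight (r : raw_rule) : bool =
    (* counts agree, and each argument is introduced by exactly one premise;
       since a premise can introduce at most one argument, this gives a bijection *)
    List.length r.premises = List.length r.arity
    && List.for_all
         (fun (i, arg) ->
            List.length (List.filter (introduces i arg) r.premises) = 1)
         (List.mapi (fun i a -> (i, a)) r.arity)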

Tightness alone does not suffice to make a rule reasonable; for example, the rule (5.3) is tight but still broken, because the type expression B(a) is senseless. We need another condition which ensures that the type and term expressions appearing in the rule make sense.


Definition 5.2. To each judgement Γ ` J , we associate the family of presuppositions Presup (Γ ` J), defined as the judgements formed over Γ by placing the boundary slots of J in the head position as follows:

Presup (Γ ` A type) := [ ],

Presup (Γ ` s : A) := [Γ ` A type],

Presup (Γ ` A ≡ B) := [Γ ` A type,Γ ` B type],

Presup (Γ ` s ≡ t : A) := [Γ ` A type,Γ ` s : A,Γ ` t : A].
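Definition 5.2 translates directly into code. The following OCaml sketch (illustrative only, reusing the simplified judgement representation of the earlier sketches) computes the presuppositions of a judgement.

  (* Illustrative sketch: presuppositions, per Definition 5.2. *)
  type expr = Var of int | Sym of string * expr list

  type judgement =
    | IsType of expr list * expr
    | IsTerm of expr list * expr * expr          (* context, head, type *)
    | EqType of expr list * expr * expr
    | EqTerm of expr list * expr * expr * expr   (* context, lhs, rhs, type *)

  let presuppositions (j : judgement) : judgement list =
    match j with
    | IsType _ -> []
    | IsTerm (c, _, a) -> [ IsType (c, a) ]
    | EqType (c, a, b) -> [ IsType (c, a); IsType (c, b) ]
    | EqTerm (c, s, t, a) -> [ IsType (c, a); IsTerm (c, s, a); IsTerm (c, t, a) ]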

We shall need to know later on that presuppositions are natural with respect to the action of signature maps, instantiations, and raw substitutions.

Proposition 5.3. Let Γ ` J be a judgement over Σ, and F : Σ → Σ′ a signature map. Then Presup (F∗Γ ` F∗J) = F∗(Presup (Γ ` J)).

Proof. This is clear; for instance, the presupposition of F∗Γ ` F∗s : F∗A is F∗Γ ` F∗A type, which is precisely what we get when F acts on Γ ` A type, the presupposition of Γ ` s : A.

The reasoning that establishes the analogous statements about the actions of instantiations and raw substitutions is similarly easy.

Proposition 5.4. Let Γ ` J be a judgement over a metavariable extension Σ + α and I ∈ InstΣ,γ(α) an instantiation. Then Presup (I∗Γ ` I∗J) = I∗(Presup (Γ ` J)).

Proposition 5.5. Let Γ ` J be a judgement and f : ∆ → Γ a raw substitution. Every presupposition of ∆ ` f∗J has the form ∆ ` f∗J ′, where Γ ` J ′ is a presupposition of Γ ` J .

There is a weaker and a stronger condition that we can impose on a rule with regard to the presuppositions of its conclusion.

Definition 5.6. Let T be a raw type theory over a signature Σ and R a raw rule over Σ:

1. a raw rule R is weakly presuppositive over T when every presupposition of the conclusion of R is derivable in T (translated from Σ to Σ + αR) from the premises of R and the presuppositions of the premises of R;

2. a raw rule R is presuppositive over T when all presuppositions of the conclusion and of the premises of R are derivable in T (translated from Σ to Σ + αR) from the premises of R.

As far as derivability is concerned, weakly presuppositive rules are good enough, for a rule cannot be applied unless its premises have already been derived, in which case their presuppositions will be derivable as well — which is the gist of the proof of Theorem 5.15. However, if we were to give a meaning to a raw rule on its own, we would be hard-pressed to explain what the premises are about unless their presuppositions were derivable as well; hence we take the stronger variant as the standard one.

Definition 5.7. A raw rule R is acceptable for a raw type theory T if it is tight, presuppositive over T , and has empty conclusion context.


Example 5.8.

1. The above rules (5.1), (5.2), (5.3), and (5.6) are tight.

2. The above rules (5.1), (5.2), (5.4), and (5.6) are presuppositive.

3. The rule

` A type

which allows us to infer that every type expression is a type, is not tight.

4. If S is a type symbol with the empty arity, the rule

` S() type

is presuppositive and tight.

5. Symmetry of type equality comes in two versions:

` A ≡ B          ` A type    ` B type    ` A ≡ B

` B ≡ A          ` B ≡ A

The left-hand one is not tight and is presuppositive, and the right-hand one is tight and presuppositive.

Proposition 5.9. The congruence rule associated to an acceptable object rule is acceptable.

Proof. Let R be a tight and presuppositive raw object rule over a raw type theory T with premises 〈Γi ` Ji | i ∈ I〉. There exists a bijection βR between the object premises of R and the arguments of αR.

Definition 4.34 lays out the associated congruence rule R≡. Its arity is αR≡ = αR + αR and its premises are indexed by I + I + Iob, where Iob is the set of object premises of R. The bijection βR≡ witnessing tightness of R≡ is given by

βR≡(ι0(i)) := ι0(βR(i)) and βR≡(ι1(j)) := ι1(βR(j)).

Let us verify that the properties for tightness of R≡ required in Definition 5.1 follow directly from the tightness of R. For any ι0(i) ∈ argαR≡ :

(1) The context of the premise βR≡(ι0(i)) = ι0(βR(i)) is `∗Γi. The signature map ` : Σ + αR → Σ + αR≡ does not change the underlying scope of Γi, and thus `∗Γi has the same underlying scope as Γi, which equals bindαR i because R is tight. Furthermore, bindαR i = bindαR≡(ι0(i)), as required.

(2) The premise βR≡(ι0(i)) has judgement form clαR≡ ι0(i) by the analogous reasoning.

(3) The head of the premise βR≡(ι0(i)) is `∗e where e = metai(〈varj〉j∈bindαRi) by tightness of R. We need to show that `∗e = metaι0(i)(〈varj〉j∈bindαR≡ι0(i)), but this equation holds by the definitions of ` and of αR≡ .

The case of ι1(j) ∈ argαR≡ is symmetric.

We also need to show that all presuppositions of the premises and the conclusion of R≡ are derivable in TR≡ from the premises of R≡, where TR≡ = ι0∗ ◦ T is the translation of T along ι0 : Σ → Σ + αR≡ .


Consider the premise Γι0(i) ` Jι0(i) at index ι0(i) for some i ∈ I . By Proposition 5.3, a presupposition of this premise is a presupposition P = (Γi ` J ′) of the corresponding premise in R, translated along the signature map `. By presuppositivity of R, the judgement P is derivable from TR = ι0∗ ◦ T , the translation of T along ι0 : Σ → Σ + αR. By Corollary 4.41, we can translate such a derivation of P along `∗, yielding a derivation in T ′ = `∗ ◦ TR, where T ′ is TR translated along `. But T ′ = `∗ ◦ TR = `∗ ◦ ι0∗ ◦ T = ι0∗T = TR≡ , so we obtain a derivation in the correct theory.

The case of a premise indexed by ι1(j) with j ∈ I is similar, but the last step requires translation along the signature map r = idΣ+αR + ι1 instead, mapping the metavariable symbols of αR to the right-hand side metavariables of R≡.

A premise P indexed by ι2(k) is an equality associated to the k-th object premise of R. The presuppositions of P are derived by the corresponding object premises ι0(k) and ι1(k), and in the case of a term equation, the presupposition of the left-hand side.

A presupposition of the conclusion is derivable by appeal to the rule R itself for the left- and right-hand sides of the equation. In case R≡ is a term equation, the type judgement arising as presupposition of the conclusion of R is derivable in TR by presuppositivity of R, and can be translated along `∗ in the same way that we treated the left-hand copies of the premises.

Proposition 5.10. The raw structural rules, i.e. the equivalence relation rules and the conversion rules, are acceptable for any type theory.

Proof. Tightness is obvious. Presuppositivity is obvious for all but the conclusion of the equality conversion rule ` s ≡ t : B, which immediately follows from the ordinary conversion rule for term judgements.

5.2 Acceptable type theories

It may happen that a raw type theory is flawed, even though each of its rules is acceptable. For instance, we might simply forget to state a rule governing one of the symbols, or provide two contradicting rules for the same symbol. Thus we also need a notion of acceptability of a raw type theory.

Definition 5.11. Suppose Σ is a signature and S ∈ Σ has arity αS . The generic application of S is the expression

S := S(〈metai(〈varj〉j∈bindαS i)〉i∈arg αS ).

We say that an inference rule R is a symbol rule for S when its arity is αS , the judgement form of the conclusion is the syntactic class of S, and its head is S.
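The generic application can likewise be built mechanically; here is an OCaml sketch, illustrative only, with arities simplified to a list of syntactic classes and binder counts, and metavariables named meta0, meta1, ... rather than drawn from an abstract index set.

  (* Illustrative sketch: the generic application of a symbol. *)
  type syn_class = Ty | Tm

  type expr = Var of int | Sym of string * expr list | Meta of string * expr list

  let generic_application (s : string) (arity : (syn_class * int) list) : expr =
    let argument i (_cls, binders) =
      (* each metavariable is applied to the variables it may use *)
      Meta ("meta" ^ string_of_int i, List.init binders (fun j -> Var j))
    in
    Sym (s, List.mapi argument arity)

  (* For app with arity [(Ty,0); (Ty,1); (Tm,0); (Tm,0)] this yields
     app(meta0, meta1(var0), meta2, meta3), as in Definition 5.11. *)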

Definition 5.12. A raw type theory T over Σ is:

1. tight if its rules are tight and there is a bijection β from the index set of Σ to the object rules of T such that, for every symbol S of Σ, β(S) is a symbol rule for S;

2. presuppositive if all of its rules are presuppositive over T ;

3. substitutive if all its rules have empty conclusion context; and

4. congruous if for every object rule of T the associated congruence rule (cf. Definition 4.34) is a rule of T .


A raw type theory is acceptable if it enjoys all of these properties.

The definition omits a common criterion for being "reasonable", namely there being a well-founded order that prevents cyclic references between parts of the theory. We address well-foundedness separately in Section 6, and provide a couple of examples showing how cyclic references may appear in an acceptable type theory.

Example 5.13. Let Q be a quantifier-like type symbol which takes a type and a term, and binds one variable in the term, with the raw rule

` A type var0 :Q(A, t(var0)) ` t(var0) : A

` Q(A, t(var0)) type

The context in the second premise is not cyclic because Q binds var0, but the premise itself is cyclic because the term metavariable t is introduced in a context that mentions it, and the rule is only presuppositive thanks to itself. Even so, the rule can still be used to derive judgements. For example, for any ` t : A we can form the type Q(A, t). It is not clear what one would do with such rules, but we have no reason to banish them outright.

Example 5.14. The second example of cyclic references is a Tarski-style universe that contains itself, formulated as follows. Let u be a term constant and El a type symbol taking one term argument, with the raw rules

` u : El(u)

` a : El(u)

` El(a) type

Think of u as the code of the universe El(u) that contains itself, and El as the constructor taking codes to types. The rules themselves are not cyclic, and the type theory comprising them and the associated congruence rules is acceptable. However, in order to derive ` El(u) type, which is a presupposition for both rules, we need both rules. In this case the cycles can be broken easily enough: introduce a type constant ` U type and the equation ` U ≡ El(u), then use U in place of El(u) in the above rules. In Section 6.5 we shall provide a general method for removing cyclic dependencies between rules by introduction of new symbols.

5.3 Derivability of presuppositions

Our first meta-theorem is a fairly easy one, giving a property that is always desired but not often explicitly discussed.

Theorem 5.15 (Presuppositions theorem). Let T be a raw type theory with all rules weakly presuppositive. If a judgement is derivable over T , then so are all its presuppositions.

Proof. We proceed by induction on derivations D over T .

If D ends with a variable rule (Definition 4.24), then the only presupposition appears directly as the premise of the rule, so we may re-use its subderivation.

If D ends with a substitution rule (Definition 4.27), then its conclusion must be ∆ ` f∗J for some substitution f : ∆ → Γ and judgement Γ ` J . By Proposition 5.5, each presupposition of the conclusion is ∆ ` f∗J ′ for some presupposition Γ ` J ′ of Γ ` J . But Γ ` J is a premise of the last rule of D, so by induction we have a


derivation D′ of Γ ` J ′. So we can apply the substitution rule with Γ ` J ′ (derived by D′) and the same substitution f (with its premises derived as in D) to get the desired derivation of ∆ ` f∗J ′.

Similarly, if D ends with an equality substitution rule (Definition 4.28), substituting a pair f, g : ∆ → Γ into a judgement Γ ` J , each presupposition of the conclusion can be derived by either a substitution (along f or g individually) or an equality substitution (along f, g) into some presupposition of Γ ` J .

The equivalence and conversion rules (Definitions 4.25 and 4.26) are presuppositive by Proposition 5.10, so we treat them together with the specific raw rules of T .

If D ends with an instance (I,Γ)∗R of a raw rule R (either specific or structural), then its conclusion is of the form (I,Γ)∗∆ ` I∗J , where ∆ ` J is the conclusion of R. Now Proposition 5.4 tells us that each presupposition of the conclusion is an instantiation (I,Γ)∗∆ ` I∗J ′ of some presupposition ∆ ` J ′ of ∆ ` J . Since R is weakly presuppositive, ∆ ` J ′ is derivable from the premises of R plus their presuppositions. So by Corollary 4.43, (I,Γ)∗∆ ` I∗J ′ is derivable from the premises of (I,Γ)∗R plus their presuppositions, which in turn are derivable by induction.

5.4 Elimination of substitution

In this section we show that over an acceptable type theory, the substitution rules (Definitions 4.27 and 4.28) can be eliminated: anything derivable with them is derivable without. At least, this will hold over a strict scope system; for a general scope system, substitution can be almost, but not quite entirely, eliminated.

Definition 5.16. An instance of the substitution rule (Definition 4.27) is a trivial renaming, or just trivial, if its substitution f : ∆ → Γ corresponds to a renaming on underlying scopes of the form inl−1 : |Γ| = |∆| ⊕ 0 → |∆|, acting trivially at all positions.

Typically these arise with Γ = (I,∆)∗[ ], an instantiation of the empty context; a trivial renaming is therefore of the form

(I,∆)∗[ ] ` J

∆ ` (inl−1)∗J

In a strict scope system, trivial renamings are identities and hence redundant.

To avoid ambiguity with variance, we will in this section distinguish more carefully than usual between a renaming function r : |Γ| → |∆| and its associated substitution r : ∆ → Γ.

Definition 5.17. Call a derivation over a raw type theory T substitution-free if it uses only trivial instances of the substitution rule, and does not use the equality substitution rule. Equivalently, it uses just the variable rule, equality rules, conversion rules, trivial renamings, and the specific rules of T .

The core of this section, Lemma 5.20, will be that substitution is admissible for substitution-free derivations; this can be seen as defining an action of substitution on such derivations. We first need an analogous action of renaming, paralleling how substitution on expressions needed renaming to be defined first.

Lemma 5.18 (Admissibility of renaming). Let T be a substitutive type theory, with signature Σ. Let Γ and Γ′ be contexts over Σ, and r : |Γ| → |Γ′| a renaming acting


trivially at all positions in the sense of Definition 4.27, i.e. such that Γ′r(i) = r∗Γi for all i ∈ |Γ|. Then given a substitution-free derivation D of Γ ` J in T , there is a substitution-free derivation r∗D of Γ′ ` r∗J .

Proof. For this proof, we say a renaming respects types when it acts trivially at all positions; and say a derivation D with conclusion Γ ` J is renameable if we have an operation giving, for every Γ′ and renaming r : |Γ| → |Γ′| respecting types, a derivation r∗D of Γ′ ` r∗J .

We show by induction that every derivation is renameable. Call the derivation under consideration D, and suppose given in each case suitable Γ′, r.

If D concludes with a variable rule, giving Γ ` vari : Γi, then by induction, we can rename the derivation of the premise Γ ` Γi type to a derivation of Γ′ ` Γ′r(i) type, and then apply the variable rule to derive Γ′ ` varr(i) : Γ′r(i), which is the desired judgement since r respects types.

If D concludes with a trivial renaming s : Γ → ∆, then by induction, the derivation of the premise is renameable; so renaming it along r ◦ s, we are done.

Otherwise, D concludes with an instantiation (I,Θ)∗R, where I ∈ InstΣ,Θ(αR) is an instantiation, and R is either an equality rule, a conversion rule, or a specific rule of T . In each case the conclusion of R is of the form ` J ′, with empty context; so Γ is exactly (I,Θ)∗[ ], and J has the form I∗J ′. So now to derive Γ′ ` r∗(I∗J ′), we will apply the same raw rule R with the instantiation I ′ := (r ◦ inl)∗I . Computing with renamings and instantiations according to Proposition 3.27 shows that the conclusion of (I ′,Γ′)∗R is not quite Γ′ ` r∗J , but is the same modulo a trivial renaming, according to the following commutative square:

[commutative square omitted: its corners are Γ′, Θ, (I ′,Γ′)∗[ ], and Γ = (I,Θ)∗[ ], related by the maps inl ◦ r and r and the two trivial renamings inl]

We can therefore conclude the derivation of r∗D by the rule (I ′,Γ′)∗R followed by a trivial renaming.

It remains to derive the premises of (I ′,Γ′)∗R. Each such premise is linked to a corresponding premise of (I,Θ)∗R by a renaming respecting types — specifically, a context extension of r ◦ inl. But by induction, we have renameable derivations of the premises of (I,Θ)∗R; so we are done.

It is worthwhile to record a special case of admissibility of renaming.

Corollary 5.19 (Admissibility of weakening). If a substitutive raw type theory derives Γ ` J substitution-free, then it also derives ∆ ` w∗J substitution-free for any weakening w : Γ → ∆, i.e., an injective variable renaming such that ∆w(i) = w∗Γi for all i ∈ |Γ|.

We can now give the action of substitution on derivations.

Lemma 5.20 (Admissibility of substitution). Let T be a substitutive raw type theory over signature Σ. Let f : ∆ → Γ be a raw substitution over Σ, and K ⊆ |Γ| a complemented subset such that:

1. f acts trivially at each i ∈ K in the sense of Definition 4.27, i.e., for some j ∈ |∆|, f(i) = varj and ∆j = f∗Γi;


2. for each i ∈ |Γ| \K, T derives ∆ ` f(i) : f∗Γi without substitutions.

Given a substitution-free derivation D of Γ ` J over T , there is a substitution-free derivation f∗D of ∆ ` f∗J .

Before proceeding with the proof, we take a moment to comment on the condition the lemma assumes on f . It matches the condition used in the premises of the substitution rule, skipping type-checking on a set of indices K on which f acts trivially, and requiring derivability of ∆ ` f(i) : f∗Γi only for i ∈ |Γ| \K. An alternative, maybe more conventional condition would be to require ∆ ` f(i) : f∗Γi for all i ∈ |Γ|.

There are a couple of reasons to weaken the condition as we do, thus strengthening the statement of the lemma. The superficial one is its applicability to raw substitutions that potentially contain ill-formed type expressions. The more essential one is that the strengthened formulation is needed to keep the proof structurally inductive, allowing us to descend under a premise with a non-empty context, without needing to check that in the process the domain of f∗ is extended with well-formed types. Even if they are in fact well formed, we cannot show this by appealing to an induction hypothesis, because the derivations involved are not structural subderivations of the one we are recursing over. What happens instead is that verification of well-formedness of types in contexts is deferred until their variables are accessed, at which point the variable rule provides the desired structural subderivations. This phenomenon seems to be a genuine consequence of spelling out the proof for a general class of type theories. For any specific type theory, only certain concrete type-schemes will occur in contexts of premises of rules; and these specific type-schemes are always designed by their authors in such a way that they can be shown well-formed individually, so that the inductive arguments do not break.

Proof of Lemma 5.20. Within this proof, all derivations are assumed substitution-free, and a (substitution-free) derivation D of a judgement Γ ` J is called substitutable when, for all ∆, f and K satisfying the condition of the lemma, we have a (substitution-free) derivation of ∆ ` f∗J . We prove by induction that every derivation D is substitutable. Much of the proof parallels that of Lemma 5.18.

Suppose D concludes with a variable rule showing Γ ` vari : Γi, and f : ∆ → Γ is a suitable substitution, acting trivially on K ⊆ |Γ|. When i ∈ K, we work just as in Lemma 5.18: given a suitable substitution into the conclusion, we inductively substitute the premise derivation along the same substitution, and then conclude with the variable rule. Otherwise, for i ∈ |Γ| \K, we use the derivation of ∆ ` f(i) : f∗Γi given by assumption.

Next, if D concludes with a trivial renaming inl−1 : Γ → Γ′, to conclude Γ ` J , then suppose f : ∆ → Γ is a substitution acting trivially on K and with derivations of ∆ ` f(i) : f∗Γi for i ∈ |Γ| \ K. Then the substitution inl−1 ◦ f : ∆ → Γ′ acts trivially on inl(K) ⊆ |Γ′|, and the same derivations witness that ∆ ` f(inl−1 i) : f∗Γ′i for i ∈ |Γ′| \ inl(K). So by induction, we can substitute the derivation of the premise along inl−1 ◦ f to derive ∆ ` f∗J as required.

Otherwise, D must conclude with an instantiation (I,Θ)∗R, for some instantiation I ∈ InstΣ,Θ(αR) and R a raw rule (structural or specific) with empty-context conclusion ` J . So, suppose given a suitable substitution f : ∆ → (I,Θ)∗[ ], acting trivially on K ⊆ |(I,Θ)∗[ ]| = |Θ| ⊕ 0; we need to derive ∆ ` f∗I∗J .

Just as in Lemma 5.18, we substitute I along inl ◦ f : ∆ → Θ to get another instantiation I ′ of αR over ∆, such that the conclusion of (I ′,∆)∗R is just a trivial renaming away from ∆ ` f∗I∗J . So it remains just to derive the premises of (I ′,∆)∗R.


Again as in Lemma 5.18, by induction we have substitutable derivations of all premises of (I,Θ)∗R. So it suffices to give, for each premise (I ′,∆)∗Ψ ` I ′∗J ′ of (I ′,∆)∗R, a substitution g : (I ′,∆)∗Ψ → (I,Θ)∗Ψ to the corresponding premise of (I,Θ)∗R, with g∗I∗J ′ = I ′∗J ′, and with g satisfying the conditions of the lemma. (Recall that (I,Θ)∗Ψ is the context extension of Θ by the instantiations of types from Ψ, with positions |Θ| ⊕ |Ψ|, and (I ′,∆)∗Ψ similarly.) We define:

g := (inl ◦ f) ⊕ |Ψ| : (I ′,∆)∗Ψ → (I,Θ)∗Ψ.

Now g∗I∗J ′ = I ′∗J ′ follows directly from Proposition 3.27 (which we will continue to use without further comment), the definitions of I ′ and g, and the following commuting diagram.

[commutative diagram omitted: g : (I ′,∆)∗Ψ → (I,Θ)∗Ψ lies over inl ◦ f : ∆ → Θ and over the bottom map (I ′,∆)∗[ ] → (I,Θ)∗[ ], via the inclusions inl and the substitution f ]

Next, g clearly acts trivially on inr∗ |Ψ| ⊆ |(I,Θ)∗Ψ|. It also acts trivially on the subset of inl∗ |Θ| corresponding to the given K ⊆ |Θ| ⊕ 0 on which f acts trivially.

Taking the union of these as the trivial set for g, it remains to show that for i in the subset of inl∗ |Θ| corresponding to |Θ| ⊕ 0 \ K, we have (I ′,∆)∗Ψ ` g(i) : g∗((I,Θ)∗Ψ)i. But this judgement is just the renaming of ∆ ` f(j) : f∗((I,Θ)∗[ ])i along the evident map |∆| ⊕ 0 → |∆| ⊕ |Ψ|. So using Lemma 5.18 to rename the derivation of ∆ ` f(j) : f∗((I,Θ)∗[ ])i supplied with f , we are done.

Next, we show that substitution respects judgemental equality of raw substitutions. For this, we introduce a handy notation: for raw substitutions f, g : Γ′ → Γ and an object judgement J , with head expression e and boundary B, we write (f ≡ g)∗J for the equality judgement asserting that f∗e and g∗e are equal over the boundary f∗B. Thus Γ′ ` (f ≡ g)∗(A type) stands for Γ′ ` f∗A ≡ g∗A and Γ′ ` (f ≡ g)∗(e : A) stands for Γ′ ` f∗e ≡ g∗e : f∗A.

Lemma 5.21 (Admissibility of equality substitution). Let T be a substitutive and congruous raw type theory over Σ. Let f, g : Γ′ → Γ be raw substitutions over Σ, and K ⊆ |Γ| a complemented subset such that:

1. for each i ∈ K there exists some (necessarily unique) j ∈ |Γ′| such that f(i) = g(i) = varj , and Γ′j = f∗Γi or Γ′j = g∗Γi;

2. for each i ∈ |Γ| \ K, T derives Γ′ ` f(i) : f∗Γi and Γ′ ` g(i) : g∗Γi and Γ′ ` f(i) ≡ g(i) : f∗Γi without substitutions.

If T derives Γ ` J without substitutions, then T derives Γ′ ` f∗J , Γ′ ` g∗J , and (if J is an object judgement) Γ′ ` (f ≡ g)∗J , still without substitutions.

The assumption on f and g is perhaps a little surprising, especially the last "or" in case (1). Another peculiarity is the fact that we include Γ′ ` f∗J and Γ′ ` g∗J in the conclusion rather than obtaining them by elimination of substitution. The point is that the induction arguments need to work when we pass into extended contexts of


premises, whereby types of the form f∗A or g∗A are introduced; so we cannot assume that either f or g individually satisfies the conditions of Theorem 5.22, but need to give a condition on them together that is preserved. And since this condition is too weak for applying elimination of substitution to f or g, we carry the conclusion of that along as well.

Proof of Lemma 5.21. The proof proceeds by induction on the derivation D of Γ ` J . The details are closely analogous to elimination of substitution, so we spell out fewer of them.

Consider the case when D ends with a variable rule

Γ ` Γi type

Γ ` vari : Γi

If case (2) applies for i, we are done immediately. If case (1) applies, we obtain derivations of Γ′ ` f∗Γi type, Γ′ ` g∗Γi type, and Γ′ ` f∗Γi ≡ g∗Γi by induction hypothesis. If Γ′j = f∗Γi then Γ′ ` varj : f∗Γi follows by the variable rule and Γ′ ` varj : g∗Γi from it by conversion. And of course, Γ′ ` varj ≡ varj : Γ′j is derivable by reflexivity. If Γ′j = g∗Γi, the situation is symmetric.

Otherwise, D concludes with an instantiation I∗R where I ∈ InstΣ,Γ(αR) and R is either an equality rule, a conversion rule, or a specific rule of T . The conclusion of I∗R has the form Γ ` I∗J ′. We need to derive Γ′ ` f∗(I∗J ′), Γ′ ` g∗(I∗J ′), and if R is an object rule then also Γ′ ` (f ≡ g)∗(I∗J ′).

Let us first verify that the raw substitutions f ′ := f ⊕ |∆| : Γ′ . (f∗I)∗∆ → Γ . I∗∆ and g′ := g ⊕ |∆| : Γ′ . (f∗I)∗∆ → Γ . I∗∆ satisfy the conditions (1) and (2) of the lemma for the complemented subset K ′ ⊆ |Γ . I∗∆| = |Γ| + |∆|, given by K ′ = {ι0(i) | i ∈ K} ∪ {ι1(k) | k ∈ |∆|}:

1. For ι0(i) ∈ K ′, there is j ∈ |Γ′| such that f(i) = g(i) = varj and either Γ′j = f∗Γi or Γ′j = g∗Γi. Now f ′ and g′ satisfy condition (1) because f ′(ι0(i)) = varι0(j) = g′(ι0(i)) and (Γ′ . (f∗I)∗∆)ι0(j) = ι0∗Γ′j , which is equal either to ι0∗(f∗Γi) = f ′∗(Γ . I∗∆)ι0(i) or to ι0∗(g∗Γi) = g′∗(Γ . I∗∆)ι0(i), as the case may be.

2. For ι1(k) ∈ K ′, condition (1) is satisfied by f ′ and g′: it is clear that f ′(ι1(k)) = varι1(k) = g′(ι1(k)), while f ′∗(Γ . I∗∆)ι1(k) = f ′∗(I∗∆k) = (f∗I)∗∆k = (Γ′ . (f∗I)∗∆)ι1(k) holds, where we used Proposition 3.27 in the second step.

Γ′ . (f∗I)∗∆ ` f ′(ι0(j)) : f ′∗(Γ . I∗∆)ι0(j)

Γ′ . (f∗I)∗∆ ` g′(ι0(j)) : g′∗(Γ . I∗∆)ι0(j)

Γ′ . (f∗I)∗∆ ` f ′(ι0(j)) ≡ g′(ι0(j)) : f ′∗(Γ . I∗∆)ι0(j)

are respectively equal to

Γ′ . (f∗I)∗∆ ` ι0∗(f(j)) : ι0∗(f∗Γj)

Γ′ . (f∗I)∗∆ ` ι0∗(g(j)) : ι0∗(g∗Γj)

Γ′ . (f∗I)∗∆ ` ι0∗(f(j)) ≡ ι0∗(g(j)) : ι0∗(f∗Γj)

and these are derivable by Lemma 5.18 applied to the renaming ι0 and the assumptions (2) for f and g.


We similarly check that the raw substitutions f ′′ := f ⊕ |∆| : Γ′ . (g∗I)∗∆ → Γ . I∗∆ and g′′ := g ⊕ |∆| : Γ′ . (g∗I)∗∆ → Γ . I∗∆ satisfy the conditions of the lemma with the same set K ′, too. The verification is similar to the case of f ′ and g′ above, and at this point the disjunction in (1) lets us exchange the roles of f and g.

We may now derive Γ′ ` f∗(I∗J ′) by the closure rule (f∗I)∗R, as it has the correct conclusion by Proposition 3.27. To see that its premises are derivable, we verify that for any premise ∆ ` J ′′ of R, the instantiation by f∗I , namely

Γ′ . (f∗I)∗∆ ` (f∗I)∗J′′ (5.7)

is derivable. The corresponding premise of I∗R, which is

Γ . I∗∆ ` I∗J ′′, (5.8)

is derivable by assumption, and so we obtain (5.7) by the induction hypothesis for (5.8) applied to f ′ and g′.

By a similar argument, Γ′ ` g∗(I∗J ′) is derivable by the closure rule (g∗I)∗R; we only need to use f ′′ and g′′ instead of f ′ and g′ to derive the premise

Γ′ . (g∗I)∗∆ ` (g∗I)∗J′′. (5.9)

It remains to be checked that Γ′ ` (f ≡ g)∗(I∗J ′) is derivable when R is an object rule. Because T is congruous, the congruence rule C associated with R is a specific rule of T . Let `, r : Σ + αR → Σ + αC be the signature maps from Definition 4.34 and I ′ := f∗I + g∗I ∈ InstΣ,Γ′(αR + αR). Note that I ′∗ ◦ `∗ = f∗I and I ′∗ ◦ r∗ = g∗I . The instantiation I ′∗C is a closure rule whose conclusion is precisely Γ′ ` (f ≡ g)∗(I∗J ′), so we only have to establish that its premises are derivable, of which there are three kinds:

1. For each premise ∆ ` J ′′ of R, there is a corresponding premise I ′∗(`∗(∆ ` J ′′)), which is equal to (5.7). We have already seen that it is derivable.

2. For each premise ∆ ` J ′′ of R, there is a corresponding premise I ′∗(r∗(∆ ` J ′′)), which is equal to (5.9). Its derivability has been established, too.

3. For each object premise ∆ ` J ′′ of R, there is a corresponding premise, namely the associated equality judgement (Definition 4.33) instantiated by I ′. A short calculation relying on Proposition 3.27 shows that the judgement is

Γ′ . (f∗I)∗∆ ` (f ′ ≡ g′)∗J ′′,

which is one of the consequences of the induction hypothesis for (5.8) applied to f ′ and g′.

We can now put these together into the main theorem of this section.

Theorem 5.22 (Elimination of substitution). Let T be a substitutive and congruous raw type theory; then every derivable judgement over T has a substitution-free derivation.

Proof. Work by induction over the original derivation. At substitution rules, apply Lemma 5.20; and at equality substitution rules, Lemma 5.21.


5.5 Uniqueness of typing

Whether it is desirable for a term to have many types depends on one's motivations, but certainly in our setting, where the terms record detailed information about premises, we should expect a term to have at most one type, which we prove here.

Theorem 5.23. If a tight, substitutive raw type theory T derives Γ ` A type, Γ ` B type, Γ ` t : A and Γ ` t : B then it also derives Γ ` A ≡ B.

Proof. By Theorem 5.22 it suffices to prove the claim for substitution-free derivations. Suppose we have derivations DA, DB , D1 and D2:

DA            DB            D1           D2
Γ ` A type    Γ ` B type    Γ ` t : A    Γ ` t : B

The proof proceeds by a double induction on the derivations D1 and D2.

Consider the case where D1 ends with a conversion:

D1,A′          D1,A          D1,t          D1,eq
Γ ` A′ type    Γ ` A type    Γ ` t : A′    Γ ` A′ ≡ A

Γ ` t : A

We apply the induction hypothesis to D1,t and D2 to derive Γ ` A′ ≡ B. The desired Γ ` A ≡ B now follows from D1,eq by symmetry and transitivity of equality. The case where D2 ends with a conversion is symmetric, except that it does not require the use of symmetry.

Consider the case where D1 ends with a variable rule:

D′1
Γ ` Γj type

Γ ` varj : Γj

Because T is tight, D2 must end with a variable or a conversion rule. We have already dealt with the latter one. If D2 ends with a variable rule, then A = Γj = B, and we may conclude Γ ` A ≡ B by reflexivity.

In the remaining case D1 and D2 both end with instantiations of specific rules of T . Let β be the map which takes each symbol S ∈ Σ to the corresponding symbol rule in T . There is a unique symbol S ∈ Σ such that D1 and D2 both end with instantiations of β(S):

D1

Γ ` I∗S(〈metai〉i∈arg S) : I∗C

D2

Γ ` J∗S(〈metaj〉j∈arg S) : J∗C

Of course, I∗C is just A and J∗C is B, and both heads are equal to t, from which it follows that

S(〈I∗metai〉i∈arg S) = S(〈J∗metai〉i∈arg S),

and so I and J are equal because they match on every i ∈ argS. Thus A = I∗C = J∗C = B, and we may derive Γ ` A ≡ B by reflexivity.

We record a more economical version of uniqueness of typing, which one can afford in reasonable situations.

40

Page 41: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

Corollary 5.24. If an acceptable type theory derives Γ ` e : A and Γ ` e : B then italso derives Γ ` A ≡ B.

Proof. Apply Theorem 5.15 to Γ ` e : A and Γ ` e : B to obtain Γ ` A type andΓ ` B type, and conclude by Theorem 5.23.

Acceptability also easily gives us uniqueness of typing for term equalities.

Corollary 5.25. If an acceptable type theory derives Γ ` s ≡ t : A and Γ ` s ≡ t : Bthen it also derives Γ ` A ≡ B.

Proof. Again, apply Theorem 5.15 to get Γ ` A type and Γ ` B type, and concludeby Theorem 5.23.

5.6 An inversion principleGiven the fact that a judgement Γ ` J is derivable, to what extent can a derivation ofit be constructed just from the information given in the judgement? We show in thissection that, for sufficiently well-behaved type theories, one can read off the proof-relevant part of a derivation from the head of J . The proof-irrelevant parts are theapplications of conversion rules, and subderivations of equalities. The former may bearranged to always appear just once after variable and symbol rules, while the lattermust be dealt with on a case-by-case basis, as a particular type theory may or may notpossess an algorithm that checks derivability of equalities.

When we attempt to reconstruct a derivation from a judgement, the first obstaclewe face is what types should be given to the subterms appearing in a judgement. Forthe variables the answer is clear, while for symbol expressions it is natural to use thetypes dictated by the corresponding rules, as follows.

Definition 5.26. Let T be a tight raw type theory over Σ and β the assignment ofrules to the symbols of Σ. Thus for each term symbol S ∈ Σ, the conclusion of β(S)

takes the form ` S : AS for some AS ∈ ExprtyΣ+αR(0). Given a term expression

t ∈ ExprtmΣ (Γ), its natural type τΓ(t) ∈ ExprtyΣ(Γ) is defined by

τΓ(vari) := Γi and τΓ(S(e)) := e∗AS ,

where we used e ∈∏i∈arg S ExprclS i

Σ (γ ⊕ bindS i) as an instantiation, so that e∗ASis the expression in which each metai(e

′) is replaced by (e′)∗ei.

To put it more simply, the natural type of S(e) is the type one obtains by applyingthe symbol rule for S to the premises determined by e.

Theorem 5.27 (Inversion principle). Let T be an acceptable type theory over Σ.

1. If T derives Γ ` vari : A then it does so by an application of a variable rule,followed by a conversion:

D′

Γ ` vari : Γi

D′′

Γ ` Γi ≡ AΓ ` vari : A

41

Page 42: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

2. If T derives Γ ` S(e) : A then it does so by an application of the symbol rulefor S, followed by a conversion:

D′

Γ ` S(e) : τΓ(S(e))

D′′

Γ ` τΓ(S(e)) ≡ AΓ ` S(e) : A

3. If T derives Γ ` S(e) type then it does so by an application of the symbol rulefor S.

Proof. Let T be an acceptable type theory over Σ, and β the assignment of symbolrules to the symbols of Σ. To establish the first two claims, we proceed by inductionon a substitution-free derivation D, which exists by Theorem 5.22.

If D ends with a variable rule,

D

Γ ` Γi type

Γ ` vari : Γi

then we obtain the desired derivation by attaching a dummy conversion rule:

D

Γ ` Γi type

Γ ` vari : Γi

D

Γ ` Γi type

Γ ` Γi ≡ Γi

Γ ` vari : Γi

Otherwise, D ends with an application of the conversion rule

D′

Γ ` vari : B

D′′

Γ ` B ≡ AΓ ` vari : A

We apply the induction hypothesis to D′ to obtain a derivation of the form

D∗

Γ ` vari : Γi

D∗∗

Γ ` Γi ≡ BΓ ` vari : B

Using the transitivity rule, we combine D∗∗ and D′′ into a derivation of Γ ` Γi ≡ A,which can then be used together with D∗ to get the desired form of derivation.

If D ends with an application of the symbol rule β(S),

D′

Γ ` S(e) : τΓ(S(e)),

then by Theorem 5.15 there is a derivationD′′ of the presupposition Γ ` τΓ(S(e)) type.We apply reflexivity toD′′ to obtain Γ ` τΓ(S(e)) ≡ τΓ(S(e)), and then conversion toget the desired derivation. Otherwise, D ends with an application of a conversion rule,in which case we proceed as in the variable case.

The third claim is trivial, because β(S) is the only rule which can be instantiatedto have the conclusion Γ ` S(e) type, apart from substitution rules, which we havedispensed with.

42

Page 43: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

The above theorem may be applied repeatedly to obtain a canonical form of theproof-relevant part of a derivation. The missing subderivations of equalities must beprovided by other means. Also notice that it is easy enough to avoid insertion of un-necessary appeals to conversion rules along reflexivity.

A useful consequence of Theorem 5.27 is the fact that a type of a term may becalculated directly from the term (and the symbol rules), as long as it has one.

Corollary 5.28. In an acceptable type theory, a typeable term has its natural type.

Proof. Whenever Γ ` e : A is derivable, then so is Γ ` e : τΓ(e) because its derivationappears as a subderivation in the statement of Theorem 5.27.

6 Well-founded presentationsSo far, our type theories have omitted one typical characteristic occurring in practice:the ordering of the presentation of the theory. This ordering appears, implicitly orexplicitly, at three levels:

1. The positions of a context usually form a finite sequence, and each type dependsonly on the preceding part of the context.

2. The premises of each rule typically follow some well-founded order, usuallysimply a finite sequence, and the boundary of each premise depends only on theearlier ones.

3. The rules of the theory are themselves well-founded, and each rule depends onlyon the earlier rules. This order is quite often infinite, in for instance theories withhierarchies of universes, and need not be total, as seen in the example below.

At each of these three levels, the “depends only on” holds in two senses:

1. Raw expressions: each type expression of a context uses only the preceding vari-ables; in a rule, the expressions of each premise boundary only use previously-introduced metavariables; and in a theory, the raw premises and boundary of arule only use previously-introduced symbols of the theory.

2. Derivations for presuppositivity: each type expression in a context is a derivabletype over just the preceding part of the context; each premise of a rule can bechecked well-formed using just the preceding premises; and so can each ruleusing just the earlier rules.

Example 6.1. The Π-formation rule uses no symbols of the signature in its premisesor boundary, and relies only on structural rules for its well-formedness. The rules forλ-abstraction and function application both use Π in their raw expressions, and dependon the Π-formation rule for their reasonability, but not on any other earlier symbols orrules. And the β-reduction rule in turn depends on all three of these.

Similarly, there is a natural order within the rules for Σ-types. On the other hand,neither of the Σ- or Π-type groups naturally precedes the other, and it would be unnat-ural to force them into a total order.

Traditionally, well-foundedness is treated in two different ways, depending on thelevels. Since the class of all contexts of a theory is formally defined — contexts are“user-definable” — their well-foundedness must be explicitly mandated somehow; andso it is, usually by the context judgement ` Γ cxt.

43

Page 44: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

Rules and theories by contrast are not “user-definable”: each development usuallypresents a single theory, or a few, with a specific collection of rules. The ordering onthese can therefore be left entirely unstated, but it is almost always clearly present. Thewriter ensures it when setting up the theory; the reader follows it when understandingthe theory, and convincing themself of its reasonableness; and it is respected in laterproofs and constructions.

It would be jarring, for instance, and often logically impossible, to give the se-mantics of β-reduction before that of λ-abstraction. On the other hand, it would beunsurprising if a writer introduced the rules for Π-types before those for Σ-types, butthen gave semantics with Σ-types first. Overall, the implicit partial order on rules isalways respected, but no particular total extension of it is.

Defining what it means for a presentation to be ordered is a little subtler than onemight expect; we work up to it gradually, considering contexts first, then rules, andfinally theories.

6.1 Sequential contextsIn our setting, with scoped syntax, and with contexts as maps from positions to types(we henceforth refer to these as flat contexts), traditional sequential contexts may berecovered in various ways. They are all straightforwardly equivalent — indeed, a suffi-ciently informal statement of the traditional definition could be read as any of them —but explicitly comparing them provides a useful warmup for the less straightforwardcases with rules and type theories later.

Recall that [n] denotes the sum of n ∈ N copies of [1]. When working with sequen-tial contexts, we will identify the positions of [n] with {0, . . . , n − 1}, and denote theevident “subscope inclusion” maps by wi,j : [i]→ [j].

Definition 6.2 (Sequential context I). A raw sequential context over a signature Σ isa list Γ = [Γ0, . . . ,Γn−1], where Γi ∈ ExprtyΣ([i]), for each i ∈ [n]. We write Γ<ifor the initial segment [Γ0, . . . ,Γi−1]. The flattening of a raw sequential context is theraw flat context of scope [n] whose i-th type is the weakening wi,n∗(Γi). We typicallyleave flattening implicit, writing Γ both for a sequential context and its flattening.

Given a signature Σ and a raw type theory T over it, a raw sequential context Γ overΣ is well-formed over T if for each i ∈ Γ, the judgement Γ<i ` Γi type is derivable.

Alternately, we can define sequentiality as a property of flat contexts:

Definition 6.3 (Sequential context II). A raw flat context Γ of scope [n] is sequential iffor each i ∈ [n], all variables varj occurring in Γi have j < i. Thus each Γi is uniquelyof the form wi,n∗(Γi), from which we define the initial segments Γ<i as sequential rawcontexts of scope [i].

A sequential context Γ over Σ is well-formed over T if for each i ∈ Γ, the judge-ment Γ<i ` Γi type is derivable.

Finally, we can define well-formed flat contexts via the traditional derivation rules,without reference to raw sequential contexts.

Definition 6.4 (Sequential context III). The property ` Γ cxt, read as “Γ is sequen-tially well-formed” over a given theory, is the inductive predicate on flat contexts de-

44

Page 45: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

fined by the following closure conditions, the latter for all suitable Γ, A:

` [ ] cxt` Γ cxt Γ ` A type

` Γ . A cxt

Each of the above definitions moreover has two possible readings: proof-relevant,where by derivability of a judgement Γ ` A type we mean that a specific derivationis given, and proof-irrelevant, where we merely mean that some derivation exists. Wetake the proof-relevant reading in all cases.

Proposition 6.5. Definitions 6.2, 6.3 and 6.4 are all equivalent, as predicates on flatcontexts, in both their proof-relevant and -irrelevant forms.

Proof. Essentially straightforward, given the fact, already mentioned in Definition 6.3,that the variable-occurrence constraint there precisely characterises the images of theweakenings wi,n∗ : ExprtyΣ([i]) ↪→ ExprtyΣ([n]).

Of these definitions, Definition 6.3 is the simplest to state, especially if one sweepsunder the rug the inverse-weakening required for defining initial segments. However,when spelling out details carefully, this inverse-weakening is tedious to keep track of.When we bump these definitions up to sequential rules or type theories, therefore, wewill focus on approaches based on Definitions 6.2 and 6.4.

6.2 Sequential rulesNext, we wish to define sequential rules, in which premises form a finite sequence, andeach refers only to the previous ones. Analogously to Definition 6.3, the easiest versionto state is to start from an ordinary raw rule, and add desirable properties, together withrestrictions on how earlier parts can be used in later parts:

Definition 6.6 (Sequential rule, provisional). Let R be a tight raw rule over a signa-ture Σ, with premises indexed by [n], and β the bijection witnessing its tightness. Wesay that R is sequential if for all i ∈ [n] and j ∈ αR, if metaj appears in the i-thpremise, then β(j) < i.

Moreover, say that R is (sequentially) well-formed over a raw type theory T , alsoover Σ, when for all i ∈ [n], the presuppositions of the i-th premise ofR can be derivedfrom the premises indexed by [i].

This definition is adequate, but is in several regards somewhat unsatisfying:

1. We have said here, as in Definition 6.3, that the premises are formed over theextension by all the metavariables, and their presuppositions derived from all thepremises, but use only the preceding ones. When applying this condition, onetypically wants to consider them as formed, or derived, over the extension byjust the preceding initial segment. So rather than restricting them back there, andhaving to keep track of such restriction, it is simpler to say from the start, as inDefinition 6.2, that they are formed, or derived, over those initial segments.

2. “Tightness” gives a redundancy of data in two ways. Firstly, the heads of allobject-judgement premises are redundant: each head must be the correspondingmetavariable, applied to all variables of its scope. Besides this, the arity itself isdetermined by the indexing family of the rule together with the scopes and formsof the premises.

45

Page 46: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

These issues can be remedied by defining sequential presentations inductively, anal-ogously to Definition 6.4, and adding each premise not as a full judgement but just itsboundary, whose head, if any, will be filled in automatically to ensure tightness byconstruction.

Definition 6.7. Given a signature Σ and a raw type theory T over it, we define in-ductively the sequential premise-families P with arities αP , and simultaneously theirflattenings as families of judgements over Σ + αP , as follows.

1. The empty sequential premise-family 〈 〉 has empty arity α〈 〉, and its flatteningis the empty family.

2. Let P be a sequential premise-family, with arity αP and flattening F . Let ∆ ` Bbe a boundary over Σ+αP , such that all presuppositions of ∆ ` B are derivableover (T + F ) + αP (i.e. the translation of T + F from Σ to Σ + αP ), and ∆ isa well-formed sequential context over the same theory.

Then there is an extension sequential premise-family P ; (∆ ` B).

If ∆ ` B is an object boundary of class c, then the associated arity αP ;(∆`B) isαP + 〈(c, |∆|)〉; or if ∆ ` B is an equality boundary, αP ;(∆`B) is just αP .

The flattening of P ; (∆ ` B) is 〈ι∗(Γi ` Ji) | i ∈ I〉+ 〈(ι∗∆) ` J〉, where ι isthe inclusion αP → αP ;(∆`B), and J is ι∗B with the head, if any, filled by theexpression meta?(〈vari〉i∈δ), where ? is the new argument adjoined to the arity.

Notationally, we will not distinguish the flattening from the sequential premise-familyitself.

Definition 6.8. A sequential rule P =⇒ J over Σ and T is a sequential premise-family P , together with a judgement Γ ` J over Σ + αP , the conclusion, whosepresuppositions are derivable over T +P (with T translated to the metavariable exten-sion). A sequential rule has an evident flattening as a raw rule.

Again, we do not notate the flattening explicitly. Note that as in the definition of rawrules the conclusion J has an empty context.

The addition of a boundary in the extension step of Definition 6.7 is precisely analo-gous to the traditional context extension rule, as in Definition 6.4. There, the extensionis specified just by a type A, but its effect is to add a term-of-type judgement vari :A,where the term is automatically determined to be a (fresh) variable, rather than speci-fied as input to the extension.

Reading Definition 6.7 with an eye towards computer-formalisation, one may noteit can be formalized in several ways: as an inductive-recursive definition of a set withfunctions to arities and families of judgements; as an inductive family of sets indexedover pairs of an arity and a family of judgements; or an N-indexed sequence of setstogether with functions to arities and families, by induction on n ∈ N, the length of thefamily. These are all equivalent, by standard generalities about inductive definitions.

Proposition 6.9. The flattening of a sequential rule with empty conclusion context isacceptable.

Proof. Tightness is immediate, by construction of the arity of the premise-family andthe heads of its object-judgement premises. Presuppositivity is similarly by construc-tion, from the well-formedness conditions in the definitions of sequential rules andpremise-families, with the latter inductively translated along metavariable extensionsas the premise-family is built up.

46

Page 47: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

As we defined premise-families using just boundaries rather than complete judge-ments, similarly when we define well-founded type theories we will specify them usingsequential rules whose conclusions have no heads. We will also (for substitutivity) re-strict attention to empty conclusion contexts.

Definition 6.10. A sequential rule-boundary P =⇒ B over Σ and T is a sequentialpremise-family P , together with a boundary Γ ` B over Σ + αP with empty context,whose presuppositions are derivable over T + P .

Rule-boundaries can of course be completed to rules, by filling in a head if required.

Definition 6.11. The realisation of a sequential rule-boundary R = (P =⇒ B) as asequential rule (or, via flattening, a raw rule) is defined according to the form of B:

1. If B is an object boundary of class c, then given a symbol S ∈ Σ with arity αPand class c, the realisation R[S] of R with S is the sequential rule

P =⇒ B[S]

given by completing B with the generic application of S.

2. If B is an equality, no further input is required: the realisation of R is justP =⇒ B with B viewed as a judgement.

This gives, by construction:

Proposition 6.12. The realisation of an object rule-boundary for S yields a symbolrule for S.

Example 6.13. The sequential rule-boundary

(` A type); (x:A ` B(x) type) =⇒ (` � type)

realised with the symbol Π gives the sequential rule

(` A type); (x:A ` B(x) type) =⇒ (` Π(A,B(x)) type)

whose flattening is the usual formation rule for dependent products, as in Example 4.35.With Σ instead of Π, it gives the formation rule for dependent sums.

6.3 Well-presented rulesSequential rules and rule-boundaries give a satisfactory treatment covering most ex-ample theories, and sufficing for many purposes, including implementation in proofassistants. For instance, the Andromeda proof assistant (Andromedans, 2020; Baueret al., 2019) implements a variant of sequential rules and rule boundaries in the trustednucleus.

Here we consider the generalisation from finite sequences to arbitrary well-foundedorders, partly to encompass infinitary rules, but mainly as a warm-up for well-foundedtheories.

Definitions 6.14, 6.15 and 6.18 given below are rather long and pedantic, so wegive first a guiding overview. We follow the pattern first seen in Definition 6.2, wherethe components of the definitions must be stratified into several stages, with each stagemaking use of functions defined on earlier stages.

47

Page 48: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

1. At the first stage, we can specify just the shape of the family of premises. Thisconsists of a well-ordered set (P,<), to index the premises, along with for eachi ∈ P , the judgement form ϕi and scope γi for the i-th premise.

Form this data we can compute the arity of the rule, αP , and more generally thearity αP<i , specifying what metavariables may occur in the i-th premise.

2. At the second stage, with the arities αP<i available, we can specify the rawsyntax of the premises. The i-th premise Pi is given by a boundary Γi ` Bi ofform ϕi, scope γi, and written over Σ + αP<i .

From these, filling in heads of object premises as required for tightness, we canconstruct the flattening of P as family of judgements over Σ + αP , and moregenerally the flattening of P<i over Σ + αP<i .

3. At the third stage, with the flattenings available, we can now specify the well-formedness conditions. Derivations of presuppositions of Pi should be givenover the ambient theory T , translated up to Σ + αP<i , and with (the flatteningsof) preceding premises P<i available as hypotheses.

4. We are now done with the hard part. Having specified the premises, the conclu-sion is given as in the sequential case, as a well-formed boundaryB over Σ+αP ,whose head (if B is of object form) will later be filled in to yield a symbol rule.

While the above explanation sounds plausible, it sweeps several technical subtletiesunder the rug. Most importantly, since each premise is specified over its own signatureΣ + αP<i , we need to handle the translations between these extensions, and up tothe overall extension Σ + αP . Spelling out all details in full, we have the followingdefinitions.

Definition 6.14. A well-founded premises-shape (I, S) is given by a well-founded set(I,<), and a family S = 〈(ϕi, γi)〉i∈I , where ϕi is a judgement form and γi a scope.Given these, we define:

1. The arity αS of S is the subfamily 〈(ϕi, γi)〉{i∈I|ϕi∈{ty,tm}} of the object formsof S.

2. For each i ∈ I , the initial segment S<i := 〈(ϕj , γj)〉j<i is itself a well-foundedpremises-shape indexed by the initial segment ↓i ⊆ I , and hence it also has anassociated arity αS<i .

3. For each i < j ∈ I , there are evident family maps αS<i → αS<j and henceS<i → S<j , satisfying evident composition conditions with each other and withthe subfamily inclusions S<j → S.

Definition 6.15. Given a signature Σ and a well-founded premises-shape (I, S) as inDefinition 6.14, a well-founded premise-family P is given by a family B = 〈Bi〉i∈Iwhere Bi is a boundary of form ϕi in scope γi, and over Σ + αS<i . Given these, wedefine:

1. The flattening P [ is the family of judgements 〈Pi〉i∈I over Σ + αS , where Pi isthe boundary Bi translated along the inclusion Σ + αS<i → Σ + αS , and whenϕi ∈ {ty, tm} completed with the head expression metai(〈varj〉j∈γi).

2. For each i ∈ I , the initial segment B<i yields a well-founded premise familyP<i with respect to the well-founded premises-shape S<i indexed by the initialsegment ↓i. Thus it has its own flattening P [<i.

48

Page 49: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

3. For each j < i, the judgement Pj as a member of the flattening P [<i translatedalong the signature inclusion Σ +αS<i → Σ +αS yields the same flattening Pj ,but as a member of P [.

This exhibits the translation of P [<i along the inclusion Σ + αS<i → Σ + αS asa subfamily of P [.

4. Similarly, for all k < j < i, the judgement Pk as a member of the flattening P [<jtranslated along the signature inclusion Σ + αS<j → Σ + αS<i yields the sameflattening Pk, but as a member of P [<i.

This exhibits the translation of P [<j along the inclusion Σ + αS<j → Σ + αS<ias a subfamily of P [<i.

Definition 6.16. A well-founded premise-family P as in Definition 6.15 is well-formedover T if for each i ∈ I , there are derivations of all presuppositions of Bi from hy-potheses P [<i, in the translation of T to Σ + αS<i .

A well-presented premise-family is a well-formed well-founded premise-family P .Its arity αP is the associated arity αS of the underlying premises-shape S.

When no confusion can occur, we will write the flattening of a well-founded premise-family P just as P , rather than P [.

Definition 6.17. A well-presented rule-boundary P =⇒ B over Σ, T consists of awell-presented premise-family P together with a boundary with empty context ` Bover Σ + αP , the conclusion boundary, such that all presuppositions of B derivablefrom P in the translation of T to Σ + αP . The arity of such a rule-boundary is thearity αP of its premise-family.

Definition 6.18. The realisation of a well-presented rule-boundary R = (P =⇒ B)as a raw rule is defined according to the form of B.

1. If B is an object boundary of class c, then given a symbol S ∈ Σ of arity αP andclass c, the realisation R[S] of R with S has premises the flattening of P , andconclusion B[S].

2. If B is an equality boundary, no extra input is required: the realisation of Rhas premises the flattening of P , and conclusion just B viewed as an equalityjudgement.

6.4 Well-presented type theoriesFinally, we reach well-foundedness for type theories. Once again, a by now familiarpattern emerges. It is fairly straightforward to define well-foundedness as an after-market property of acceptable type theories, but a better definition is obtained byputting in a little more work.

We start with the simpler version.

Definition 6.19. Let T = 〈Ri〉i∈I be an acceptable type theory over a signature Σ,and let β : |Σ| → I the bijection from symbols to their rules. Then T is well-foundedwhen all its rules are well-founded, and the index set I has a well-founded order <,such that:

1. If S ∈ Σ appears in Ri then β(S) < i.

49

Page 50: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

2. Each Rj has derivations of presuppositions that only refer to symbols S withβ(S) < j and rules Ri with i < j.

For the more refined version, we follow a similar pattern to what we saw for well-presented rules in Section 6.3, with the definition stratified into three stages:

1. First, the shape: a well-founded order (to index the rules), and the premises-shapes and judgement forms of all rules. This suffices to compute the signatureof the theory, and of its initial segments.

2. Next, the raw part: for each rule of the theory, a well-founded premise-family,and (raw) conclusion boundaries, of the shapes and forms specified in the firststage, and over the signature of the appropriate initial segment. These sufficeto compute the flattening of the theory as a raw type theory, and of its initialsegments.

3. Finally, the derivations showing well-formedness of each rule over the precedinginitial segment.

Having previously given Definitions 6.14, 6.15 and 6.16 in rather excruciating de-tail, we proceed here slightly more concisely, trusting the reader to be able to fill in theelided details along the lines spelled out in those definitions.

Definition 6.20.

1. A well-founded type theory shape T consists of a well-founded order (I,<),together with for each i ∈ I , a well-founded premises-shape Si and judgementform ϕi (seen as the premises shape and conclusion form of the ith rule).

From these, we can define the total signature ΣT of T : its symbols are just{Si ∈ I | ϕi ∈ {ty, tm}}, with Si having arity αSi and class ϕi. Similarly, weget signatures for initial segments ΣT<i , and signature maps between these, andfrom these to ΣT .

2. A well-founded raw type theory T consists of a well-founded type theory shapeas above (which we also call T ), together with for each i ∈ I , a well-foundedpremise-family Pi of shape Si and a boundary Bi of form ϕi over the signa-ture ΣT<i .

From these, we can define the flattening T [ of T as a raw type theory over ΣT .Its rules consist of the realisations of all rule-boundaries Pi =⇒ Bi, using (wheni is of object form) the symbol Si, together with the associated congruence rulesof the object rules thus added. Similarly, we obtain the flattening T [<i of eachinitial segments of T , as a raw type theory over ΣT<i .

3. A well-presented type theory T consists of a well-founded raw type theory Tas above that is additionally well-formed, in that it is equipped with, for each i,derivations exhibiting Pi =⇒ Bi as a well-formed rule-boundary over T [<i.

As with well-presented rules, we will not notate flattening, when there is no ambiguity.

Proposition 6.21. The flattening of a well-presented type theory T is acceptable andwell-founded.

Proof. Well-foundedness, tightness, substitutivity, and congruousness are immediateby construction. Presuppositivity is almost as direct, requiring just translation of well-formedness of rules from the signatures ΣT<i and theories T<i up to the full signatureΣ and theory T .

50

Page 51: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

6.5 The well-founded replacementDefinition 5.12 of acceptable type theories allows cyclic references of three kinds: be-tween types of a context, premises of a rule, or rules of a type theory. We shall notconcern ourselves with the former two, since type theories occurring in practice allavoid them by using sequential contexts and sequential rules from Sections 6.1 and 6.2.We address the latter one, to vindicate our design choices from earlier sections, todemonstrate that our setup supports non-trivial meta-theoretic methods, and to give aninteresting new construction that likely has further applications.

For the remainder of this section, all contexts, raw rules, and rule-boundaries arepresumed to be sequential. Also, it will be convenient to speak of a raw theory Twithout explicitly displaying its underlying signature. When we need to refer to thesignature, we do so by writing ΣT .

In Example 5.14 an acceptable type theory was rectified to a well-founded one bythe introduction of a new symbol and an equation. When one looks at other specific ex-amples the same strategy works, possibly with the introduction of several symbols andequations. In order to present a general method we first lay some category-theoreticgroundwork. We save the adventure of spiralling into the depths of category theory foranother day, and instead establish just enough structure to keep the syntactic construc-tions organized.

Definition 6.22. A raw syntax map f : Σ → Σ′ is given by a family of expressionsfS ∈ ExprcS

Σ′+arg S(0), one for each S ∈ Σ. Such a map acts on e ∈ ExprcΣ(γ) to givef∗e ∈ ExprcΣ′(γ) by

f∗(vari) := vari and f∗(S(e)) := (f∗ ◦ e)∗fS . (6.1)

The metavariable extension f +α : Σ +α→ Σ′+α by arity α is the raw syntax mapdefined by

(f + α)S := fS and (f + α)metai := metai(〈varj〉j∈bindα i)

In words, a raw syntax map Σ→ Σ′ interprets each symbol in Σ as a suitable com-pound expression over Σ′, the interpretation extends compositionally to all expressionsover Σ, and metavariable extensions act on such a map by extending it trivially.

Let us unravel the second clause in (6.1), as it is a bit terse. Given a symbol S ∈ Σand its arguments e ∈

∏i∈arg S ExprclS i

Σ (γ ⊕ bindS i), the composition f∗ ◦ e takeseach i ∈ argS to f∗ei ∈ ExprclS i

Σ′ (γ ⊕ bindS i) – it is an instantiation of arity argS,which thus acts on fS to yield an expression (f∗ ◦ e)∗fS ∈ ExprcS

Σ′ (γ), where we tookinto account that γ ⊕ 0 = γ.

The action of a raw syntax map evidently extends from expressions to context,judgements, and boundaries, and thanks to the metavariable extensions also to rawrules and rule-boundaries.

Proposition 6.23. Signatures and raw syntax maps form a category:

• The identity morphism idΣ : Σ→ Σ takes S ∈ Σ to the generic application S.

• The composition of f : Σ → Σ′ and g : Σ′ → Σ′′ is the map g ◦ f : Σ → Σ′′

that interprets each S ∈ Σ as (g ◦ f)S := g∗fS .

Raw syntax map actions are functorial.

Proof. A straightforward application of the basic properties of instantiations.

51

Page 52: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

Given raw theories T and T ′, a raw syntax map f : ΣT → Σ′T between the under-lying signatures may be entirely unrelated to T and T ′. Requiring it to map derivablejudgements in T to derivable judgements in T ′ helps, but ignores the fact that rawtheories are families of raw rules, not derivable judgements. Here is a better definition.

Definition 6.24. A raw theory map f : T → T ′ is a raw syntax map f : ΣT → Σ′T onthe underlying signatures which maps each specific rule R of T to a derivation of f∗Rin T ′.

Proposition 6.25. Raw theories and raw theory maps form a category RawTh. A rawtheory map f : T → T ′ acts functorially on a derivation D of Γ ` J in T to give aderivation f∗D of f∗Γ ` f∗J in T ′.

Proof. Let us first describe the action of f on a derivation D of Γ ` J . We proceed byrecursion on the structure of D.

Structural rules are mapped to the corresponding structural rules, e.g., if D con-cludes with a variable rule Γ ` vari : Γi then f∗D concludes with the variable rulef∗Γ ` vari : f∗Γi, and similarly for other structural rules.

Consider the case whereD ends with an instantiation I∗R of a specific ruleR of T ,whose premises are 〈Γk ` Jk〉k∈K . Suppose f takes R to the derivation DR of f∗Rin T ′. First we recursively map each derivation Dk of the k-th premise I∗Γk ` I∗Jkto a derivation f∗Dk of f∗(I∗Γk ` I∗Jk) in T ′, which is the same as (f∗I)∗Γk `(f∗I)∗Jk, because acting by I and then by f is the same as acting by the instantiationf∗I . We then take f∗D to be the derivation (f∗I)∗RD with the derivations f∗Dk of itshypotheses grafted onto it, as in Definition 2.10.

It is straightforward to verify that the action on derivations so constructed satisfiesfunctoriality, (g ◦ f)∗D = g∗(f∗D).

The categorical structure of RawTh is inherited from the structure of raw syntaxmaps. Additionally, given composable raw theory maps f and g, we let their composi-tion g ◦ f take a specific rule R to the derivation g∗DR, where DR is the derivation off∗R provided by f .

It may happen that a raw theory map f : T → T ′ maps an uninhabited type to aninhabited one, or an underivable equality to a derivable one. Let us make precise thesense in which such a map fails to be conservative, and at the same time generaliseinhabitation of types to general completion of boundaries in the presence of premises.

Definition 6.26. Given a raw theory T , and an object rule-boundary P =⇒ B over ΣTwhose syntactic class is cB , say that e ∈ ExprcBΣT+αP

(0) realises the rule-boundarywhen P =⇒ B[e] is derivable in T .

Example 6.27. Inhabitation of a closed type A corresponds to realisation of the rule-boundary 〈 〉 =⇒ � : A. We can also express more general inhabitation tasks, forinstance (` A type) =⇒ � type asks for a construction of a type from a type parame-ter A, and (` A type); ([var0 :A] ` B(var0) type) =⇒ � type for a Π-like higher typeconstructor.

Definition 6.28. A raw theory map f : T → T ′ is conservative when:

1. f reflects equations: if T ′ derives the equational rule f∗R then T derives theequational rule R, and

2. f reflects realisers: there is a map r such that if e realises the object rule-boundary f∗(P =⇒ B) in T ′ then r(R, e) realises P =⇒ B in T .

52

Page 53: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

In the definition, we asked for a map r to witness reflection of realisers in order toavoid spurious applications of the axiom of choice.

Lemma 6.29. Every raw theory map f : T → U factors through a conservative mapf† : T † → U

Tf //

U

T †f†

>>

in a weakly universal fashion.

In the lemma, by weak universality we mean that whenever f factors through aconservative map g : V → U , there is a map h : T † → V , not necessarily unique, suchthat the following diagram commutes:

Tf //

j

''

U

T †f†

>>

h

��V

g

RR (6.2)

Proof of Lemma 6.29. We shall construct T † by adjoining new symbols to T , so thatwhenever e realises f∗R inU , the corresponding new symbol realisesR in T . However,the new symbols generate new rule-boundaries, so the process needs to be iterated, andwe have to adjoin equations as well. The construction of f† : T † → U thus proceedsin an inductive fashion, as follows.

Initially the signature ΣT † is just ΣT , the specific rules of T † are those of T , and f†

acts like f . We inductively extend ΣT † with new symbols, T † with new specific rules,and f† with new values, as follows:

1. If P =⇒ B is an object rule-boundary in T † of arity α and e realises f†∗P =⇒f†∗B in U , then we extend ΣT † with a new symbol c(P=⇒B,e) of arity α, and T †

with the associated symbol rule P =⇒ B [c(P=⇒B,e)]. We extend f† by lettingit map c(P=⇒B,e) to e.

2. If R is an equational rule in T † such that f†∗R is derivable in U , we extend T †

with R as a specific rule.

Because we assumed all contexts and premise-families to be sequential, the inductivedefinition is complete after countably many repetitions of the above process. Alterna-tively, T † could be constructed as a suitable colimit.

The map f obviously factors as f = f† ◦ i where i : T → T † is induced by theinclusion ΣT → ΣT † .

The map f† is conservative by construction. It obviously reflects equations, whilean object rule-boundary R in T †, such that e realises f†∗R in U , is realised by c(R,e).

It remains to be shown that the factorization is weakly universal. Consider a fac-torization f = g ◦ j through a conservative map g : V → U , as in (6.2). There existsa map r, not necessarily unique, that witnesses conservativity of g. The desired factorh : T † → V is defined inductively: it acts like j on symbols of ΣT , and takes c(R,e) tor(h∗R, e).

53

Page 54: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

Corollary 6.30. A raw theory T has a weakly universal conservative map t : Twf → Twhere Twf is well-founded.

Proof. We take Twf = 〈 〉† and t = o†, as in Lemma 6.29, where o : 〈 〉 → T isthe unique map from the empty theory 〈 〉. Weak universality is immediate, and well-foundedness of 〈 〉† is witnessed by the inductive nature of its construction.

Definition 6.31. The map t : Twf → T from Corollary 6.30 is called the well-foundedreplacement of T .

Here, finally is the main theorem of this section.

Theorem 6.32. If T is acceptable, the map t : Twf → T has a section.

The theorem establishes a form of equivalence of Twf and T , because any con-servative map r : U → V with a section s : V → U is an isomorphism up tojudgmental equality. Indeed, r ◦ s = idV because s is a section of r, while s ◦ ris identity up to judgemental equality: given any derivable rule P =⇒ A type in U ,the rule r∗P =⇒ r∗(s∗(r∗A)) ≡ r∗A is derivable in V by reflexivity, and henceP =⇒ s∗(r∗A) ≡ A in U by conservativity of r. An analogous argument works forterm judgements.

Proof of Theorem 6.32. The theory Twf has symbols of the form c(P=⇒B,e), where Bis a closed boundary, but in the proof we will have to deal with boundaries that refer tovariables.

For this purpose, define the (variable-to-metavariable) promotion of an expressione ∈ ExprcΣ(γ) to be the expression e ∈ ExprcΣ+simp γ(0), cf. Definition 3.21, whichis e with the variables replaced by metavariables,

vari := metai, and S(e) := S(〈ei〉i∈arg S).

The associated demotion is the instantiation Dγ ∈ InstΣ,γ(simp γ) which takes themetavariables back to variables, Dγ(metai) = vari. Thus we have e = Dγ∗e.

One level up, given a premise-family P and a context Γ over Σ + αP , the pro-motion of Γ is the extension P ; Γ of P in which the variables of Γ are promoted tometavariables of suitable types,

(P ; [ ]) := P and (P ; Γ . A) := (P ; Γ); (` A type).

We begin the construction of a section of t by defining a map d which maps se-quential rules and rule-boundaries from T to Twf by replacing compound expressionse with suitable symbols c(R,e) from Twf . When acting on contexts and judgements, dtakes a sequential rule-family P from T as an additional parameter, in which case wewrite dP .

The map d recurses over a premise-family in T to give a premise-family in Twf :

d(〈 〉) := 〈 〉 and d(P ; (Γ ` J)) := (d(P ); dP (Γ ` J)).

Similarly, it takes a sequential context Γ over T + P to one over Twf + d(P ):

dP ([ ]) := [ ],

dP (Γ . A) := dP (Γ) . (D|Γ|∗c((d(P );dP (Γ)=⇒� type),A)).

54

Page 55: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

In the second clause, dP recurses into Γ and extends it with the rather intimidating

D|Γ|∗c((d(P );dP (Γ)=⇒� type),A),

which is just the generic application of the c-symbol for the promoted A, demoted backto Γ. Note that (d(P ); dP (Γ)) = d(P ; Γ) and hence t∗(dP (Γ . A)) = Γ . A.

It remains to explain how dP maps a judgement Γ ` J over T + P to one overTwf + d(P ). Here too we use the same method of demoting a generic application of asymbol for a promoted expression:

dP (Γ ` A type) := (Γ′ ` A′ type),dP (Γ ` t : A) := (Γ′ ` t′ : A′),

dP (Γ ` A ≡ B) := (Γ′ ` A′ ≡ B′),dP (Γ ` s ≡ t : A) := (Γ′ ` s′ ≡ t′ : A′),

whereΓ′ := dP (Γ),

A′ := D|Γ|∗c((d(P );Γ′=⇒� type),A),

B′ := D|Γ|∗c((d(P );Γ′=⇒� type),B),

t′ := D|Γ|∗c((d(P );Γ′=⇒�:A′),t),

s′ := D|Γ|∗c((d(P );Γ′=⇒�:A′),s).

By having d map � to � the above clauses also provide the action of d on boundaries.Finally, let d map a sequential rule P =⇒ J in T to the sequential rule d(P ) =⇒ J ′

where dP (` J) = (` J ′), and similarly for rule-boundaries.We have arranged d in such a way that t∗(d(R)) = R for any sequential rule R

over T . Moreover, if R is derivable in T , then d(R) is derivable in Twf , by an appealto suitable symbol rules in Twf . For instance, d maps the rule P =⇒ A type to therule d(P ) =⇒ c((d(P )=⇒� type),A) type. If the former is derivable in T then the latteris a symbol rule of Twf .

At last, let us define the section s of t. Consider first a type symbol S ∈ ΣT .Because T is acceptable, it has a unique symbol rule P =⇒ S type. When we map itwith d we get

d(P ) =⇒ c((d(P )=⇒� type),S) type,

which is a symbol rule in Twf . We may therefore take sS := c((d(P )=⇒� type),S).A term symbol S ∈ ΣT is dealt with analogously. Its symbol rule takes the form

P =⇒ S : A, which is mapped by d to

d(P ) =⇒ c((d(P )=⇒�:A′),S) : A′,

where A′ = c((d(P )=⇒� type),A), Again, this is a symbol rule in Twf , so we may definesS := c((d(P )=⇒�:A′),S).

Example 6.33. We revisit Example 5.14, the type theory expressing type-in-type ina cyclic fashion as ` u : El(u). Earlier we pointed out that the theory can be madewell-founded by using the defined type constant ` U ≡ El(u). The well-foundedreplacement works much the same way, except that it is replete with many more definedsymbols. The analogue of U appears already at the first stage of the construction.

55

Page 56: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

Indeed, the rule-boundary 〈 〉 =⇒ � type is realised by El(u), hence the well-foundedreplacement contains the type constant U = c(〈 〉=⇒� type,El(u)). We also have the newsymbols

El = c((`a:U)=⇒� type,El) and u = c(=⇒�:U,u),

and these suffice to express the type equation ` U ≡ El(u), which is a specific rule ofthe well-founded replacement because it is mapped to a valid equation in the originaltype theory.

7 Discussion and related workWe set out to give a detailed and general mathematical definition of dependent typetheories, accomplished by an analysis of their traditional accounts. Having completedthe task, let us take stock of what has been accomplished.

We calibrated abstraction at the level that keeps a connection with concrete syn-tax, but also clearly identifies the category-theoretic structure underlying the abstractsyntax. As such, our work may serve as a theoretical grounding and a guideline forpractical implementations of type theories on one hand, and on the other as a ladderto be climbed and discarded by those who wish to ascend to higher, more abstractviewpoints of dependent type theories.

There is not much to remark on our treatment of raw syntax, as the topic has beenstudied before and is well understood, except to remark that scope systems have servedus well as a general approach to scoping and binding of variables.

More interesting are the subsequent stages of our definition. Giving the definitionin generality forces us to isolate and precisely define various notions that are tradition-ally treated only informally, and often only passed on in folklore rather than in writing.For instance, we articulate precisely the distinction between the syntactic specificationof an inference rule, and the scheme of closure conditions that it begets — a distinctionwhich, once seen, was implicitly present all along, but which is not generally appreci-ated or consciously articulated.

We initially considered raw theories as just a stepping stone towards the defini-tion of well-presented type theories, but have come to feel that they are of significantintrinsic interest. Their simplicity makes them easy to work with, and even thoughthey permit deviations from the orthodox teachings, they boast with a surprisingly richcollection of meta-theorems.

The well-behavedness properties of raw rules and raw type theories, such as presup-positivity, tightness, and congruity, took some effort to define and explain, but quicklypaid off. On a technical level, they allowed us to fine-tune the requirements that enablethe various meta-theorems. More importantly, once we incorporated them into ourtype-theoretic vocabulary they streamlined communication and invigorated the mindwhere there used to be just an uneasy adherence to heuristic techniques.

We hope that our selection of meta-theorems is illustrative enough to inspire furthergenerally applicable meta-theorems, and comprehensive enough to relieve future de-signers of type theories from having to redo the work. We have intentionally restrainedany category-theoretic analysis of the landscape we explore, but it will be visible in thebackground to many readers, demanding future exploration of the categorical structureof the syntactic notions, both for its own sake and to connect with a general categoricalsemantics. This, of course, we hope to return to in future work.

56

Page 57: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

How widely does our definition of type theories cast its net? It takes little ef-fort to enumerate many examples of interest, such as variants of Martin-Löf type the-ory (Martin-Löf, 1972), in both its intensional and extensional incarnations, homo-topy type theory (Univalent Foundations Program, 2013), simple type theory (Church,1932), some presentations of the higher-order logic of toposes (Lambek and Scott,1986), etc. But it is equally easy to list counter-examples: System F (Girard, 1972;Reynolds, 1974) and related type systems that directly quantify over all types, puretype systems (Barendregt, 1992), cubical type theories (Cohen et al., 2015), cohesivetype theories (Schreiber and Shulman, 2014), etc. Can such a diverse collection offormalisms be unified under the umbrella of an even more general definition of typetheories? Doubtlessly, our work can be pushed and stretched in various directions, andwe hope it will. But we also state again that we do not intend our definitions to bedefinitive or prescriptive, nor consider our methods to be superior to others. After all,type theory is an open-ended idea.

Several years ago Vladimir Voevodsky’s relentless inquiries into the precise math-ematical underpinnings of type theories motivated us to undertake the study of generaltype theories. We were hardly alone to do so. Voevodsky himself initially workedto develop the framework of B-systems and C-systems (Voevodsky, 2014, 2016), butthese will remain tragically unfinished. There is by now a spectrum of various ap-proaches to the meta-theory of type theories, which cannot be justly reviewed in theremaining space. We only mention a selection of contemporaneous developments andtheir relation to our work.

Logical framework approaches When discussing and presenting this work, we haveoften been asked why we bothered, when logical frameworks (LF) (Harper et al., 1993;Pfenning, 2001) already give a satisfactory definition of type theories.

The main answer is that most work with logical frameworks does not give a generaldefinition in the same sense that we are looking for. It sets up a framework within whichmany type theories may be defined, but that is not quite the same thing. Indeed, at thepoint when we were first embarking on the present project, no definition in a generalityclose to our aims had been proposed in the LF literature (to our knowledge), nor couldwe see how to do so using LF methods. Since then, Uemura has succeeded in giving avery clean general definition using the LF; we discuss that work in detail below.

A significant secondary motivation for the present approach, though, was to di-rectly recover the standard “naïve” presentation of particular type theories, in specificexamples. This problem is, by design, something that LF-based approaches bypassentirely. One may argue — as some have — that this desire is misguided: that sinceLF-based approaches are so much cleaner, naïve syntax should be discarded as obso-lete, and LF-embedded presentations of theories preferred as primitive. We howeverfind that view somewhat unsatisfactory, for several reasons.

Firstly, even if we should be always reading type theories as LF presentations,we have not been. At least within the literature on constructive type theory in thetradition of Martin-Löf, most work still uses the naïve reading, including the workof Martin Hofmann (Hofmann, 1997) and others (Martin-Löf, 1972; Streicher, 1991;Univalent Foundations Program, 2013). Or rather, most of the literature stays silentabout the issue, but where the intended reading is made clear, it tends to be the naïveone. Secondly, most work using LF approaches explicitly comments on the setup, andoften gives or cites adequacy theorems. This seems to suggest that writers agree thatthe correctness of LF presentations of type theories rests, in part, on their connection to

57

Page 58: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

the naïve presentations. Finally, it seems very difficult to adopt a position of completelydiscarding the naïve readings, and reading all presentations of type theories always asshorthands for their LF embedding. This is because the framework is itself a typetheory, whose presentation must sooner or later be given the naïve reading, rather thanas embedded in a further framework — it cannot be “turtles all the way”.

We therefore believe it is important to have both the naïve and LF-based definitionsof syntax defined and developed in as wide a generality as possible (as well as otherapproaches, such as those of categorical logic). The naïve approach is most natural andconceptually basic. The LF approach is cleaner and simpler to set up, and easier toanalyse and apply for some purposes. Both should be available, and connected by anadequacy theorem, not just in special cases but in generality.

Uemura’s general type theories Another general definition of type theories has re-cently been given by Taichi Uemura. Indeed, (Uemura, 2019) provides two definitions,one semantic and one syntactic, and shows their correspondence with a general initial-ity theorem.

In terms of generality, Uemura’s definition essentially subsumes ours and indeedgeneralises much further, encompassing type theories with different judgement forms,while still retaining enough structure to allow proofs of important type-theoretic meta-theorems. It therefore quite satisfactorily solves one of the major goals we set out tosolve with the present project.

On inspection, however, Uemura’s approach is sufficiently different that it is com-plementary with our approach, rather than subsuming it. His syntactic definition isgiven via a particularly ingenious use of a logical framework; as with other LF-basedapproaches, this keeps the setup very clean, but means that in specific examples, it doesnot so closely recover the standard naïve reading of the theory in question.

It does not, therefore, address our secondary goal of taking seriously the naïvereading of syntax, and directly recovering it in examples. We therefore hope that itshould be possible in the future to connect our syntactic definition with Uemura’s bymeans of a generalised adequacy theorem, and show that for theories with Martin-Löf’soriginal four judgment forms, the two approaches are equivalent.

Other general definitions of dependent type theories Independently of our work,Guillaume Brunerie has proposed a general syntactic definition of dependent type the-ories (Brunerie, 2020a), and is formalising it in Agda (Brunerie, 2020b). His approachis very similar to ours, which we see as a welcome convergence of ideas.

Valery Isaev has also proposed a definition of dependent type theories, in (Isaev,2017). His approach is semantic, avoiding syntax with binding, and defining dependenttype theories as certain essentially algebraic theories extending the theory of categorieswith families. The generality of his definition seems to be roughly similar to ours,but a precise comparison seems slightly subtle to state, and is beyond the scope of thepresent paper.

Meta-theory of type theory in type theory When one takes type theory seriouslyas a foundation of mathematics, it is natural and even imperative, to develop the meta-theory of type theories in type theory itself. Whereas in the LF approach the expressiv-ity of the ambient formalism is curbed to ensure an adequacy theorem, here we gladlytrade adequacy for working in a full-fledged dependent type theory.

58

Page 59: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

Such a project has been undertaken by Thorsten Altenkirch and Ambrus Kaposi (Al-tenkirch and Kaposi, 2016). Broadly speaking, a specific object type theory is con-structed in one fell swoop as a quotient-inductive-inductive type (QIIT) that incorpo-rates all judgement forms, the structural and the specific rules. The inductive characterof the definition automatically provides the correct notion of derivation, while the am-bient type theory guarantees that only derivable judgements can be constructed — the“raw” stage is completely side-stepped. The quotienting capabilities favourably relatethe ambient propositional equality with the object-level judgmental equality. From a se-mantic point of view, the construction is the type-theoretic analogue of an initial-modelconstruction. The ingenuity of the definition allows one to prove many meta-theoremsquite effortlessly, especially with the aid of a proof assistant.

Our bottom-up approach can add little to the setup in terms of abstraction, but canpossibly provide useful clues on how to pass from the case-by-case presentations ofobject type theories to a single type whose inhabitants are (presentations of) generaltype theories. For instance, the type of well-presented type theories would have toimprove on our staged definitions by joining them into a single mutually recursiveinductive definition that would incorporate the above QIIT construction of a singleobject-level theory, suitably adapted, as the realisation of a well-presented theory.

8 ReferencesThorsten Altenkirch and Ambrus Kaposi. Type theory in type theory using quotient

inductive types. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Sym-posium on Principles of Programming Languages (POPL), pages 18–29, 2016.

Andromedans. Andromeda proof assistant. http://www.andromeda-prover.org, 2020.

Henk Barendregt. Lambda Calculi with Types. In Samson Abramsky, Dov Gabbay,and Thomas Maibaum, editors, Handbook of Logic in Computer Science, volume 2,pages 117–309. Clarendon Press, 1992.

Andrej Bauer, Jason Gross, Peter LeFanu Lumsdaine, Michael Shulman, MatthieuSozeau, and Bas Spitters. The HoTT library: a formalization of homotopy typetheory in Coq. In CPP 2017: Proceedings of the 6th ACM SIGPLAN Conference onCertified Programs and Proofs, pages 164–172, January 2017.

Andrej Bauer, Philipp G. Haselwarter, and Anja Petkovic. A generic proof assistant. InFoundations and Applications of Univalent Mathematics – Herrsching (Germany),December 2019.

Guillaume Brunerie. A general class of dependent type theories. Seminar for founda-tions of mathematics and theoretical computer science, Faculty of mathematics andphysics, University of Ljubljana, March 2020a.

Guillaume Brunerie. A formalization of general type theories in Agda. https://github.com/guillaumebrunerie/general-type-theories, 2020b.

Alonzo Church. A set of postulates for the foundation of logic. The Annals of Mathe-matics, 33(2):346–366, 1932.

Cyril Cohen, Thierry Coquand, Simon Huber, and Anders Mörtberg. Cubical TypeTheory: a constructive interpretation of the univalence axiom. In 21st InternationalConference on Types for Proofs and Programs, Tallinn, Estonia, May 2015.

59

Page 60: Peter LeFanu Lumsdaine September 11, 2020 arXiv:2009 ...

Coq development team. Coq. http://coq.inria.fr/, 2020.

N. G. de Bruijn. Lambda calculus notation with nameless dummies, a tool for auto-matic formula manipulation, with application to the Church-Rosser theorem. Inda-gationes Mathematicae (Proceedings), 75(5):381–392, 1972.

Marcelo Fiore, Gordon Plotkin, and Daniele Turi. Abstract syntax and variable binding(extended abstract). In 14th Symposium on Logic in Computer Science, pages 193–202. IEEE Computer Society, 1999.

Jean-Yves Girard. Interprétation Fonctionelle et Élimination des Coupures de l'arithmétique d'ordre Supérieur. PhD thesis, Université Paris VII, France, 1972.

Robert Harper, Furio Honsell, and Gordon Plotkin. A framework for defining logics. Journal of the ACM, 40(1):143–184, January 1993.

Martin Hofmann. On the interpretation of type theory in locally Cartesian closed categories. In Computer science logic (Kazimierz, 1994), volume 933 of Lecture Notes in Comput. Sci., pages 427–441. Springer, Berlin, 1995. doi: 10.1007/BFb0022273.

Martin Hofmann. Syntax and Semantics of Dependent Types. In Andrew M. Pitts and Peter Dybjer, editors, Semantics and Logics of Computation, pages 79–130. Cambridge University Press, 1997.

Valery Isaev. Algebraic presentations of dependent type theories, March 2017. arXiv:1602.08504.

Peter T. Johnstone. Sketches of an elephant: a topos theory compendium. Vol. 2, volume 44 of Oxford Logic Guides. Oxford University Press, 2002.

Carol R. Karp. Languages with expressions of infinite length. North–Holland Publishing Co., Amsterdam, 1964.

Joachim Lambek and Philip J. Scott. Introduction to higher order categorical logic, volume 7 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1986.

Peter LeFanu Lumsdaine and Michael A. Warren. The local universes model: an overlooked coherence construction for dependent type theories. ACM Trans. Comput. Log., 16(3):Art. 23, 31, 2015. ISSN 1529-3785. doi: 10.1145/2754931. arXiv:1411.1736.

Peter LeFanu Lumsdaine, Andrej Bauer, and Philipp G. Haselwarter. A formalisation of general type theories in Coq. https://github.com/peterlefanulumsdaine/general-type-theories/tree/arXiv, 2020.

Per Martin-Löf. An intuitionistic theory of types. Technical Report, University of Stockholm, 1972.

Per Martin-Löf. Intuitionistic type theory, volume 1 of Studies in Proof Theory. Lecture Notes. Bibliopolis, Naples, 1984.

James McKinna and Robert Pollack. Pure type systems formalized. In Typed Lambda Calculi and Applications, volume 664 of Lecture Notes in Computer Science, 1993.


Dominic Orchard, Philip Wadler, and Harley Eades. Unifying graded and parameterised monads. Electronic Proceedings in Theoretical Computer Science, 317, May 2020.

Frank Pfenning. Logical frameworks. In John Alan Robinson and Andrei Voronkov, editors, Handbook of Automated Reasoning (in 2 volumes), pages 1063–1147. Elsevier and MIT Press, 2001.

John C. Reynolds. Towards a Theory of Type Structure. In Colloque Sur La Programmation, Paris, France, volume 19 of Lecture Notes in Computer Science, pages 408–425. Springer Verlag, 1974.

Urs Schreiber and Michael Shulman. Quantum Gauge Field Theory in Cohesive Homotopy Type Theory. Electronic Proceedings in Theoretical Computer Science, 158:109–126, July 2014.

Thomas Streicher. Semantics of type theory. Progress in Theoretical Computer Science. Birkhäuser Boston Inc., 1991.

Paul Taylor. Practical foundations of mathematics, volume 59 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, 1999.

Taichi Uemura. A general framework for the semantics of type theory, November 2019. arXiv:1904.04097.

The Univalent Foundations Program. Homotopy Type Theory: Univalent Foundations of Mathematics. https://homotopytypetheory.org/book, Institute for Advanced Study, 2013.

Vladimir Voevodsky. B-systems, October 2014. arXiv:1410.5389.

Vladimir Voevodsky. Subsystems and regular quotients of C-systems, March 2016. arXiv:1406.7413.

A Formalisation in Coq

We have partially formalised our work in the Coq proof assistant (Coq development team, 2020) on top of the HoTT library (Bauer et al., 2017). The formalisation is publicly available at (Lumsdaine et al., 2020), wherein further instructions are given on how to compile and use the formalisation. The formalisation will continue to evolve in future; the description here refers to the version tagged as arXiv.

The formalisation broadly follows the structure of the paper. Table 1 lists selected major definitions and theorems from the paper, along with the names of the corresponding items in the formalisation, if any. Almost all material of Sections 2, 3 and 4 has been formalised, as has some but not all of Section 5. Versions of the main definitions of Section 6 are also formalised, but at time of writing, their treatment in the formalisation has non-trivial differences from the definitions here; such items are marked with an asterisk.

Throughout the paper we worked rigorously but informally, and without discussing which mathematical foundation might suffice to carry out the constructions and proofs. For guidance on this question, one may consult the formalisation.


Paper                                                 Formalisation

Family (Definition 2.1)                               Auxiliary.Family.family
Closure rule (Definition 2.6)                         Auxiliary.Closure.rule
Closure system (Definition 2.6)                       Auxiliary.Closure.system
Derivation (Definition 2.8)                           Auxiliary.Closure.derivation
Scope system (Definition 3.1)                         Syntax.ScopeSystem.system
De Bruijn scope system (Example 3.2)                  Examples.ScopeSystemExamples.DeBruijn
Syntactic class (Definition 3.6)                      Syntax.SyntacticClass.class
Arity (Definition 3.6)                                Syntax.Arity.arity
Signature (Definition 3.8)                            Syntax.Signature.signature
Signature map (Definition 3.10)                       Syntax.Signature.map
Raw expressions (Definition 3.11)                     Syntax.Expression.expression
Raw substitution (Definition 3.16)                    Syntax.Substitution.raw_substitution
Metavariable extension (Definition 3.22)              Syntax.Metavariable.extend
Instantiation of syntax (Definition 3.24)             Syntax.Metavariable.instantiate_expression
Raw context (Definition 4.1)                          Typing.Context.raw_context
Raw rule (Definition 4.18)                            Typing.RawRule.raw_rule
Instantiation of derivations (Corollary 4.43)         Typing.RawTypeTheory.instantiate_derivation
Associated closure system (Definition 4.21)           Typing.RawRule.closure_system
Structural rules (Definition 4.29)                    Typing.StructuralRule.structural_rule
Congruence rule (Definition 4.34)                     Typing.RawRule.raw_congruence_rule
Raw type theory (Definition 4.36)                     Typing.RawTypeTheory.raw_type_theory
Acceptable rule (Definition 5.7)                      (not formalised)
Acceptable type theory (Definition 5.12)              Metatheorem.Acceptability.acceptable
Presuppositions theorem (Theorem 5.15)                Metatheorem.Presuppositions.presupposition
Admissibility of renaming (Lemma 5.18)                Metatheorem.Elimination.rename_derivation
Admissibility of substitution (Lemma 5.20)            Metatheorem.Elimination.substitute_derivation
Admissibility of equality substitution (Lemma 5.21)   Metatheorem.Elimination.substitute_equal_derivation
Elimination of substitution (Theorem 5.22)            Metatheorem.Elimination.elimination
Uniqueness of typing (Theorem 5.23)                   (not formalised)
Inversion principle (Theorem 5.27)                    (not formalised)
Sequential context (Definition 6.4)                   ContextVariants.wf_context_derivation (∗)
Sequential rule (Definition 6.8)                      (not formalised)
Well-presented rule (Definition 6.18)                 (not formalised)
Well-presented type theory (Definition 6.20)          Presented.TypeTheory.type_theory (∗)
Well-founded replacement (Theorem 6.32)               (not formalised)

Table 1: The correspondence between the paper and the formalisation (Lumsdaine et al., 2020). Items marked with (∗) differ non-trivially from their counterparts in the paper.


Our formalisation is built on top of a homotopy type theory library with an eye towards future formalisation of the categorical semantics of type theories, but is so far agnostic with respect to commitments such as the Univalence axiom or the Uniqueness of identity proofs. The only axiom that we use is function extensionality. In other words, the code can be read in plain Coq.
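For illustration, function extensionality can be stated in plain Coq roughly as follows; this is only a schematic rendering under our own naming, not the formulation used in the development, which obtains the principle from the HoTT library.

    (* Function extensionality, the single axiom the development relies
       on, stated here in plain Coq purely for illustration. *)
    Axiom funext :
      forall (A : Type) (B : A -> Type) (f g : forall x : A, B x),
        (forall x : A, f x = g x) -> f = g.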

The formalisation confirms that our development is constructive: there are no uses of excluded middle or the axiom of choice.

It is a bit harder to tell how many universes we have used, because Coq relieves the user of explicit handling of universes. Two seem to be enough: one to serve as a base, and another to work with families over the base. The base universe can be very small, say consisting of the decidable finite types, if we limit attention to finitary syntax only.
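To illustrate the two levels, a family over a base can be captured by a record along the following lines; this is only a sketch in the spirit of Definition 2.1 and Auxiliary.Family.family, with record and field names of our own choosing, where the index type inhabits one universe while the base type may inhabit another.

    (* A sketch of families over a type X: an index type together with
       a map from indices to elements of X.  Names are ours and may
       differ from those used in the formalisation. *)
    Record family (X : Type) := {
      family_index : Type ;
      family_element : family_index -> X
    }.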

We rely in many places on the ability to perform inductive constructions and carry out proofs by induction, and so we require some meta-theoretic support for these. Of course, there is no shortage of induction in Coq, and even a fairly weak set theory will have the capability to construct the necessary inductive structures, whereas the higher-order logic of toposes would have to be extended with W-types. Alternatively, we could restrict to finitary syntax, contexts and rules throughout, to allow Gödelization of syntax and reliance on induction supplied by arithmetic.
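For instance, the generic W-type, which the internal logic of a topos would have to postulate as an extra assumption, is directly definable as an ordinary inductive type in Coq:

    (* Well-founded trees with node shapes A and branching described by
       the family B; an ordinary inductive definition in Coq, but an
       additional assumption in the higher-order logic of toposes. *)
    Inductive W (A : Type) (B : A -> Type) : Type :=
      | sup : forall a : A, (B a -> W A B) -> W A B.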
