ARITHMETICAL KNOWLEDGE AND ARITHMETICAL DEFINABILITY:

FOUR STUDIES

A Dissertation

Submitted to the Graduate School

of the University of Notre Dame

in Partial Fulfillment of the Requirements

for the Degree of

Doctor of Philosophy

by

Sean Walsh,

Michael Detlefsen, Co-Director

Peter Cholak, Co-Director

Graduate Programs in Philosophy and Mathematics

Notre Dame, Indiana

December 2010

© Copyright by

Sean Walsh

2010

All Rights Reserved

ARITHMETICAL KNOWLEDGE AND ARITHMETICAL DEFINABILITY:

FOUR STUDIES

Abstract

by

Sean Walsh

The subject of this dissertation is arithmetical knowledge and arithmetical

definability. The first two chapters contain respectively a critique of a logicist

account of a preferred means by which we may legitimately infer to arithmetical

truths and a tentative defense of an empiricist account. According to the logicist

account, one may infer from quasi-logical truths to patently arithmetical truths

because the arithmetical truths are representable in the logical truths. It is argued

in the first chapter that this account is subject to various problems: for instance,

the most straightforward versions seem vulnerable to various counterexamples.

The basic idea of the alternative empiricist account considered in chapter two is

that complicated arithmetical truths like mathematical induction may be inferred

by way of confirmation from less complicated quantifier-free arithmetical truths.

The notion of confirmation here is understood probabilistically, and responses

are given in this chapter to several seeming problems with this importation of

probability into arithmetic.

The final two chapters are concerned with arithmetical definability in two

different settings. In the third chapter, the interpretability strength of the arith-

metical and hyperarithmetical subsystems of second-order Peano arithmetic is

compared to the interpretability strength of analogous systems centered around

two principles called Hume’s Principle and Basic Law V, which respectively axiom-

atize a standard notion of cardinality and an alternative conception of set. One of

the major results of this chapter is that the hyperarithmetic subsystem of Hume’s

Principle does not interpret the hyperarithmetic subsystem of second-order Peano

arithmetic. The fourth chapter is concerned with arithmetical definability in the

setting of descriptive set theory, where the relevant benchmark is between notions

which may be defined without quantification over elements of certain topological

spaces (Borel notions) and notions whose definitions do require such quantifica-

tion (analytic, coanalytic, projective notions). In this fourth chapter the Denjoy

integral is studied from the vantage point of descriptive set theory, and it is shown

that the graph of the indefinite integral is not Borel but rather is properly coanalytic.

This contrasts with the Lebesgue integral, which is Borel under this measure

of complexity.

To my father, for always encouraging me to see where the circles cross.

CONTENTS

FIGURES

ACKNOWLEDGMENTS

PREFACE

CHAPTER 1: LOGICISM, INTERPRETABILITY, AND KNOWLEDGE OF ARITHMETIC
  1.1 Introduction: The Logicist Template
  1.2 Background: The Interpretability of Theories and Structures
  1.3 Theory-Based Versions: the Plethora and Consistency Problems
  1.4 Structure-Based Version: the Isomorphism and Signature Problems
  1.5 Conclusions and Directions for Further Research
  1.6 Notes

CHAPTER 2: EMPIRICISM, PROBABILITY, AND KNOWLEDGE OF ARITHMETIC
  2.1 Introduction: Inceptive and Amplificatory Empiricism
  2.2 Challenges to Access to Probability Assignments
    2.2.1 Countable Additivity: Aligning the True and Probable
    2.2.2 The Non-Computability of Probability Assignments
  2.3 Challenges to Arithmetical Instance Confirmation
    2.3.1 Baker and the Exigencies of Arithmetical Sampling
    2.3.2 Stable and Unstable Reasoning in Geometry and Arithmetic
  2.4 Challenges from Alternative Inferences
  2.5 Conclusions and Directions for Future Research
  2.6 Notes

CHAPTER 3: COMPARING PEANO ARITHMETIC, BASIC LAW V, AND HUME’S PRINCIPLE
  3.1 Introduction, Definitions, and Overview of Main Results
    3.1.1 Introduction
    3.1.2 Definition of Signatures and Theories of PA², BL² and HP²
    3.1.3 Definition of Subsystems of PA², BL² and HP²
    3.1.4 Summary of Results about the Provability Relation
    3.1.5 Summary of Results about the Interpretability Relation
  3.2 Standard Models of HP² and Associated Results
    3.2.1 Models of HP² from Infinite Cardinals
    3.2.2 The Mutual Interpretability of PA² and HP²
  3.3 Standard Models of Subsystems of BL² and Associated Results
    3.3.1 Generalities on Models of Subsystems of BL²
    3.3.2 Hyperarithmetic Theory and Related Results
    3.3.3 Standard Models of the Hyperarithmetic Subsystems of BL²
  3.4 Barwise-Schlipf Models of Subsystems of BL² and HP²
    3.4.1 Generalized Barwise-Schlipf/Ferreira-Wehmeier Theorem
    3.4.2 Application to Algebraically Closed Fields
    3.4.3 Application to O-Minimal Expansions of Real-Closed Fields
    3.4.4 Application to Separably Closed Fields
  3.5 Further Questions

CHAPTER 4: DENJOY INTEGRATION: DESCRIPTIVE SET THEORY AND MODEL THEORY
  4.1 Introduction
  4.2 Background
    4.2.1 Absolutely Continuous Functions and Generalizations
    4.2.2 Basic Properties of the Denjoy Integral
    4.2.3 Lebesgue’s Lemma and the Subspaces
  4.3 Descriptive Set Theory
    4.3.1 Three Derivatives and Functions of Arbitrarily High Rank
    4.3.2 Totalization: Calibrating Rank and Entry into Subspaces
    4.3.3 Definability: The Derivatives are Borel
  4.4 Model Theory
    4.4.1 Indexes of Subgroups and Non-Definability of the Integral
    4.4.2 Elementary Equivalence and Decidability
  4.5 Further Questions

BIBLIOGRAPHY

FIGURES

1.1 Summary of Problems for Versions of the Logicist Template

2.1 Alternative Confirming Inferences: Two Pairs of Contrasting Inferences

3.1 Provability Relation in Subsystems of BL², PA², and HP²

3.2 Interpretability Relation in Subsystems of BL², PA², and HP²

4.1 Containment Diagram for Subsets of M[a, b] and C[a, b]

ACKNOWLEDGMENTS

I would first and foremost like to thank my wife Kari for her love and patience–

I wouldn’t have been able to do any of this without her. I would also like to

thank my parents, Kevin and Linda, and my in-laws, Ron and Annette, for their

persistent support and encouragement.

My advisors Michael Detlefsen and Peter Cholak have obviously shaped this

dissertation and my overall thinking in ways that I cannot begin to describe,

and I would like to thank them for their patience and help on my dissertation,

as well as for their continued support for my doing interdisciplinary work. I

would also like to thank my teachers at Notre Dame, and in particular Timothy

Bays, Patricia Blanchette, Curtis Franks, Julia Knight, and Sergei Starchenko,

for introducing me to the philosophy of mathematics and mathematical logic.

Likewise, my education at Notre Dame has been enriched by the many visitors

to, as well as former members of, the logic group at Notre Dame, including Pe-

ter Gerdes, Karen Lange, David Lippel, Colin McLarty, Serge Randriambololona,

Reed Solomon, Vítězslav Švejdar, and William Tait. Finally, I would like to thank

two of my teachers from Gonzaga University, Wayne Pomerleau and John Burke,

who first introduced me to so much of what I have come to love about philosophy

and mathematics.

During graduate school I have been the beneficiary of generous financial sup-

port from many institutions and groups, including the Philosophy Department at

Notre Dame, the Mathematics Department at Notre Dame, the Ahtna Heritage

Foundation, the Deutscher Akademischer Austauschdienst, the Georg-August-Universität

Göttingen, the National Science Foundation (under NSF Grants 02-45167,

EMSW21-RTG-03-53748, EMSW21-RTG-0739007, and DMS-0800198), the

Alexander von Humboldt Stiftung TransCoop Program, and the Ideals of Proof

Project, which in turn was funded and supported by Agence Nationale de la

Recherche, Université Paris Diderot – Paris 7, Université Nancy 2, Collège de

France, and Notre Dame. There are of course many people behind these various

institutions and groups, and I would like to especially thank Karine Chemla,

Brice Halimi, Gerhard Heinzmann, Felix Mühlhölzer, Marco Panza, David Rabouin,

Ivahn Smadja, Jean-Jacques Szczeciniarz, and Christian Tapp.

I have been working on the material collected here for several years, and have

benefited both from many opportunities to present components of this material at

various conferences and seminars and from opportunities to discuss this material

with my teachers and friends, and I would like to record some of these debts here.

A very early version of some of the underlying thoughts from Chapter 1 was

presented at the Sixth Annual Midwest Philosophy of Mathematics Workshop

held at Notre Dame on October 8, 2005 under the title “Justifications of Hume’s

Principle and Mathematical Induction.” In that talk, I was focused on under-

standing the history of various attempts to justify Hume’s Principle and math-

ematical induction, but in giving the talk I was forced to try to articulate the

epistemic relationship between these two principles, and it was that first attempt

that prompted the reflections currently found in Chapter 1. A very succinct ver-

sion of the material from Chapter 1 was presented in the first half of my talk

at Dr. Detlefsen’s Ideals of Proof Fellows’ Workshop held at the École normale

supérieure on September 8, 2009 under the title “The Role of Interpretability Results

in the Justification of Axioms,” and a more definitive version of this material

was presented at FregeFest 2010 held at the Department of Logic and Philosophy

of Science at the University of California, Irvine on February 26, 2010 under the

title “Logicism, Interpretability, and Knowledge of Arithmetic.” Needless to say,

the material in Chapter 1 has been bettered by my having the opportunity to

present and discuss this material at these meetings, and in particular I would like

to thank Roy Cook, William Demopoulos, and Kai Wehmeier for a very helpful

discussion of this material subsequent to my talk in Irvine. I am also indebted

to several of my friends who have provided generous comments on this chapter,

including Andrew Arana, Sharon Berry, Sébastien Gandon, Christopher Porter,

and Iulian Toader. Finally, my understanding of Frege, logicism, and the partic-

ular topics which I treat in this chapter has been sharpened and deepened over

the years by my attending seminars by and discussing these matters with Patri-

cia Blanchette, whom I would particularly like to thank in this regard.

The Sixth Annual Midwest Philosophy of Mathematics Workshop was some-

what of a watershed event for me, both for the reasons mentioned above, and

because I heard a talk by Neil Tennant entitled “Natural Logicism,” which fo-

cused on the manner in which addition and multiplication were recoverable from

Hume’s Principle. It was in listening to Tennant’s talk that it first dawned on

me that there was something akin to a reverse mathematics project in the setting

of Hume’s Principle and Basic Law V. So subsequent to this talk, in the winter

months of 2005-2006, I started mapping out the provability and interpretability

relations among the predicative systems of Hume’s Principle and Basic Law V,

and here I would like to extend a special thanks to Christopher Porter, with whom

I initially read and discussed some of the papers and books on this topic. Finally,

the material in this chapter has been improved by my having the opportunity

to present it to several gracious audiences. In particular, I would like to thank

Logan Axon, Joshua Cole, Stephen Flood, and Christopher Porter for listening

to me speak on this material in Dr. Cholak’s seminar, and I would like to thank

Antonio Montalbán and the other organizers and participants in the University of

Chicago Logic Seminar, where I presented much of this material on March 2, 2009

under the title “Comparing Peano Arithmetic, Hume’s Principle, and Basic Law

V.” I would also like to thank Øystein Linnebo, Richard Pettigrew, and Albert

Visser, with whom I had some very helpful discussions of this material subsequent

to my arrival in Paris in summer 2009.

In regard to Chapter 4, I would like to thank Sławomir Solecki for several helpful

discussions of this material. It was in conversations with him that it became

clear that the most natural way to treat the Denjoy integral was not in terms of

the measurable functions which are Denjoy integrable or the continuous functions

which are their indefinite integrals, but rather to focus on both notions simulta-

neously, as this was what was most naturally approximated by Borel notions. I

would also like to thank Dr. Solecki, and the other participants in the Urbana

Logic Seminar, for listening to me speak on this material on December 9, 2008 un-

der the title “Henstock-Kurzweil Integration: Descriptive Set Theory and Model

Theory.” Similarly, I would like to thank Steffen Lempp and the other partici-

pants in the Southern Wisconsin Logic Seminar, where I spoke on this material

on February 24, 2009, and likewise I would like to thank Alain Louveau and the

other participants in the Descriptive Set Theory working group at the Institut de

Mathématiques de Jussieu, where I spoke on this material on December 8, 2009.

The material in Chapter 2 grew out of my attempt to try to understand what

was thought about the epistemology of the Peano axioms prior to Dedekind and

Frege’s seminal work published in the 1880s. Given some prior familiarity with

Kant, I was confident that Kant had never discussed mathematical induction, but

it was initially a great mystery for me to try to understand what exactly tran-

spired in the philosophy of arithmetic in the intervening century. I would thus

like to thank Paul Franks, Karl Ameriks, and Anja Jauernig, who first introduced

me to the philosophy of the post-Kantian period. It was when I was following up

the footnotes to Beiser’s The Fate of Reason that I first stumbled upon Kästner,

through whom I later encountered Fries (cf. Chapter 2, endnote 51). I would like to

thank David Rabouin and Sébastien Maronne for allowing me to present some of

this material on Kästner, Fries, and related figures at their Séminaire de travail

“Mathématiques à l’âge classique” on December 2, 2009. Even though Kästner

and Fries don’t figure prominently in this final version of the dissertation, it was

trying to understand their idea that mathematical induction was epistemically

akin to enumerative induction which prompted me to write Chapter 2. An earlier

attempt at articulating the connection between mathematical induction and enu-

merative induction was presented at the First Paris-Nancy PhilMath Workshop

on October 21, 2009 under the title “The Justification of Mathematical Induc-

tion: The View from the 18th Century,” and I would like to thank Walter Dean,

Michael Potter, and Stewart Shapiro for a very helpful discussion subsequent to

that talk. Finally, I would like to thank my friends Andrew Arana and Iulian

Toader for several helpful discussions and comments on the material in this chap-

ter.

PREFACE

This dissertation is a study of arithmetical knowledge, arithmetical definabil-

ity, and the connections between them. Chapter 1 critically examines several

versions of a logicist account of arithmetical knowledge. The basic idea which

unifies these different accounts is that one may legitimately infer to knowledge

of arithmetical truths, like the Peano axioms, from knowledge of quasi-logical

principles like Hume’s Principle and the knowledge that the arithmetical truths

are representable in the logical truths. The focus of this first chapter is thus on

identifying some version of representation which would sustain this inference, and

my conclusion is that extant proposals are not successful in this regard. While

the notion of representation coming from the mathematical notion of interpreta-

tion seems like an important technical notion (and one which is further studied

in Chapter 3), this and related technical notions seem inadequate to the task of

providing a means by which to pass from knowledge of quasi-logical truths like

Hume’s Principle to knowledge of arithmetical truths like the Peano axioms.

In Chapter 2, an alternative thesis is examined according to which knowledge

of arithmetical truths like the Peano axioms may be legitimately inferred from very

primitive arithmetical knowledge like 7+5=12 and the knowledge that these prim-

itive truths confirm these axioms. Here the notion of confirmation is a patently

probabilistic one, and thus the bulk of this chapter is focused on various challenges

which emerge when one tries to apply ordinary probabilistic notions to the setting

of arithmetic. For instance, a probabilistic rule which initially seems quite natural

in the setting of arithmetic, namely a probabilistic version of the ω-rule, has the

consequence that arithmetical truth aligns with high probability. Likewise, unless

one assigns probability zero to some very basic arithmetical truths, all probability

assignments will be highly non-computable when represented in various standard

ways. Problems such as these obviously cast some initial doubt on the claim that

arithmetical knowledge can be based on knowledge of probabilities associated to

primitive arithmetical truths. The goal of this chapter is to offer responses to these

and other problems, and thus to at least secure the tenability of this empiricist

picture of arithmetical knowledge.

In Chapter 3, the mathematical portion of the dissertation begins, wherein

the overriding theme is arithmetical definability, where this roughly means defin-

ability without recourse to quantification over higher-order objects. In particular,

Chapter 3 is concerned with arithmetical definability in the setting of second-order

Peano arithmetic, Basic Law V and Hume’s Principle. Here the primary focus is

on the interpretability strength of these systems when conjoined with Δ¹₁-comprehension, which is

standardly regarded as being on the outskirts of arithmetical definability. One of

the main results here is that there is a consistent extension of Basic Law V plus

Δ¹₁-comprehension which interprets Δ¹₁-CA₀ (cf. Corollary 61). In contrast to

other known methods of constructing models of Basic Law V, this was done by

showing that the minimal ω-model of Δ¹₁-CA₀ is mutually interpretable with a

model of Basic Law V plus Δ¹₁-comprehension (cf. Theorem 60). Likewise, in this

chapter it is shown that Hume’s Principle plus Δ¹₁-comprehension is interpretable

in ACA₀ but does not interpret ACA₀ (cf. Corollary 99). This was done by building

a model of Hume’s Principle on a certain real closed field and by noting that the

proof could be formalized in ACA₀. Similar methods allow one to answer a question

of Linnebo (cf. Proposition 83 and Remark 81).

In Chapter 4, the focus of the dissertation shifts to arithmetical definability

in the setting of descriptive set theory, where the relevant demarcation line is

between Borel notions and analytic/co-analytic notions, the latter of which require

quantification over Polish spaces. Here the main question is whether a certain

integral which extends the Lebesgue integral, called the Denjoy integral, is Borel.

In this chapter it is shown that the relation “f is Denjoy integrable and F is

equal to its indefinite integral” is a co-analytic but not Borel relation on the

product space M[a, b] × C[a, b], where M[a, b] is the Polish space of real-valued

measurable functions on [a, b] and where C[a, b] is the Polish space of real-valued

continuous functions on [a, b] (cf. Corollary 195 and Figure 4.1). Using the same

methods, it is also shown that the class of indefinite Denjoy integrals is a co-analytic

but not Borel subset of the space C[a, b], thus answering a question posed by

Dougherty and Kechris (cf. Corollary 197). In this chapter, some basic model

theory of the associated spaces of integrable functions is also studied. Here the main

result is that, when viewed as an R[X]-module with the indeterminate X being

interpreted as the indefinite integral, the space of continuous functions on the

interval [a, b] is elementarily equivalent to the Lebesgue-integrable and Denjoy-

integrable functions on this interval.
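
In symbols, the two descriptive set-theoretic results described in this paragraph can be summarized roughly as follows. This is only a sketch: the names 𝒟 and ℐ for the two sets are introduced here purely for illustration, boldface Π¹₁ denotes the co-analytic sets, and boldface Δ¹₁ denotes the Borel sets of the relevant Polish space.

```latex
% Requires amsmath for \boldsymbol and \text.
% D (assumed name): pairs (f, F) with f Denjoy integrable and F its indefinite integral.
\[
  \mathcal{D} = \bigl\{ (f, F) \in M[a,b] \times C[a,b] :
    f \text{ is Denjoy integrable and } F \text{ is its indefinite integral} \bigr\}
\]
% I (assumed name): the continuous functions that arise as indefinite Denjoy integrals.
\[
  \mathcal{I} = \bigl\{ F \in C[a,b] : \exists f \in M[a,b]\; (f, F) \in \mathcal{D} \bigr\}
\]
% Both sets are co-analytic but not Borel (cf. Corollaries 195 and 197).
\[
  \mathcal{D} \in \boldsymbol{\Pi}^1_1 \setminus \boldsymbol{\Delta}^1_1,
  \qquad
  \mathcal{I} \in \boldsymbol{\Pi}^1_1 \setminus \boldsymbol{\Delta}^1_1
\]
```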

Outside of the obvious topical similarities between Chapters 1-2 and Chap-

ters 3-4, I want to mention one final connection between the chapters of this

dissertation, a connection that I hope to explore further in subsequent work. Part

of the idea of the type of logicism examined in Chapter 1 is that knowledge of the

Peano axioms is based in part on a knowledge that these axioms are representable

in terms of quasi-logical axioms like Hume’s Principle. If the relevant notion of

representation is taken to imply the technical notion of interpretation, then it is

natural to ask whether there are any interpretability results of this sort in settings

with limited amounts of comprehension. As mentioned above, in Corollary 99 of

Chapter 3, it is shown that the Δ¹₁-comprehension version of Hume’s Principle

does not interpret the Δ¹₁-comprehension version of the Peano axioms. Thus, this

suggests that the type of logicism considered in Chapter 1 has to additionally

defend the epistemic status of impredicative comprehension, i.e., comprehension

in which one quantifies over a higher-order object. In future work I hope to ex-

plore in more detail the philosophical implications of the technical results from

Chapter 3 for the tenability of the types of logicism which I consider in Chapter 1.

CHAPTER 1

LOGICISM, INTERPRETABILITY, AND KNOWLEDGE OF ARITHMETIC

1.1 Introduction: The Logicist Template

My topic in this chapter is the contention made by contemporary logicists

that knowledge of arithmetical principles may be based on knowledge of logical

principles. Here the notion of a “logical” principle is a rather loose one, and is

merely intended to convey the idea that the principle in question is epistemically

akin to modus ponens: it is apriori, it is analytic, etc. The primary example of a

principle which has been claimed to be logical in this sense is Hume’s Principle,

which roughly states that two properties have the same cardinality if and only if

they can be one-one correlated with each other, as the forks and knives on a

well-set dining room table can be correlated one-one with each other.¹ Indeed, much of

the recent discussion of logicism has centered around Crispin Wright’s arguments

that Hume’s Principle is a logical principle in this sense.² However, Wright and

other logicists are ultimately interested in Hume’s Principle because they think

that knowledge of it can account for our knowledge of arithmetic. Indeed, Wright

even says that “[. . . ] nothing can be essentially involved in the epistemology of

number theory that is not involved in an understanding, and knowledge of the

truth of Hume’s Principle” ([159] p. 366, [60] p. 255). My concern in this chapter

is with the question, which has been relatively neglected in the contemporary

literature, of how the logicist is able to contend that knowledge of principles like

Hume’s Principle can rationally sustain knowledge of arithmetical principles.
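
As a rough formal gloss on the informal statement given above, Hume’s Principle is often written along the following lines. This is only a sketch: the symbols # (“the number of”) and ≈ (“can be one-one correlated with”) are notation assumed here for illustration and are not fixed by the text above.

```latex
% Hume's Principle (sketch). Assumed notation: \#F = "the number of Fs",
% F \approx G = "the Fs and the Gs can be one-one correlated".
\[
  \forall F\, \forall G\; \bigl( \#F = \#G \;\leftrightarrow\; F \approx G \bigr)
\]
% One-one correlation spelled out with a second-order quantifier over relations R:
\[
  F \approx G \;:\Longleftrightarrow\;
  \exists R\, \Bigl[ \forall x\, \bigl( Fx \to \exists!\, y\, ( Gy \wedge Rxy ) \bigr)
  \;\wedge\; \forall y\, \bigl( Gy \to \exists!\, x\, ( Fx \wedge Rxy ) \bigr) \Bigr]
\]
```

In the dining-table example, F would be the property of being a fork on the table, G the property of being a knife on the table, and R might pair each fork with the knife at the same place setting.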

There are two respects in which this topic should be of interest to those with-

out prior interests in logicism. First, claims about the apriority or analyticity of

Hume’s Principle are independent of claims that knowledge of arithmetical prin-

ciples can be based on knowledge of Hume’s Principle. It is in these latter claims

that a template for how to acquire arithmetical knowledge is found, and it is

this template, which I call the Logicist Template, which shall be the focus of this

chapter. Indeed, since one can rationally endorse the Logicist Template without

endorsing the apriority or analyticity of Hume’s Principle, this template is poten-

tially of interest to those who are skeptical of or who even deny such claims of

apriority or analyticity. For instance, the Logicist Template would be of interest

to someone who thought that Hume’s Principle was aposteriori or synthetic, since

it would likewise show them how to proceed rationally from such knowledge to

knowledge of arithmetic.³

The second reason that this topic should be of interest to those without prior

interests in logicism is that the arithmetical principles in question, namely the

Peano axioms,⁴ are essential to contemporary mathematics. However, despite this,

contemporary philosophers of mathematics have had relatively little to say about

the epistemic status of the Peano axioms. For instance, among the Peano axioms

is the Mathematical Induction Principle, which says that if zero has a property

and if n + 1 has this property whenever n does, then all natural numbers have

this property. In his recent book Charles Parsons says of this principle: “Writers

on the foundations of arithmetic have found it difficult to state in a convincing

way why the principle of mathematical induction is evident” ([120] p. 264). Part

of what is important about the Logicist Template is thus that it is one of the few

contemporary accounts which explicitly addresses the question of the evidence for

mathematical induction.⁵
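
For concreteness, the Mathematical Induction Principle described in this paragraph can be sketched in second-order notation as follows. Reading “property” as a second-order variable P is an assumption made for this illustration; a first-order axiomatization would instead use a schema with one instance per formula.

```latex
% Mathematical Induction Principle (second-order sketch).
% P ranges over properties of natural numbers; n ranges over natural numbers.
\[
  \forall P\, \Bigl[ \bigl( P(0) \;\wedge\; \forall n\, ( P(n) \to P(n+1) ) \bigr)
  \;\to\; \forall n\, P(n) \Bigr]
\]
```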

So in what follows, by the Logicist Template I shall mean the following schematic

claim: knowledge of arithmetical principles may be based on knowledge of logical

principles and the knowledge that these arithmetical principles can be represented

within the logical principles. This claim is schematic in two different respects.

First, it presupposes some specification of the arithmetical and logical principles

in question. Out of deference to the contemporary literature on logicism, in what

follows I shall assume that the arithmetical principles in question are the Peano

axioms and the logical principle in question is Hume’s Principle. However, none of

the points that I shall make in this chapter depend crucially on this specification.

The second sense in which the Logicist Template is schematic is that it pre-

supposes some antecedently specified notion of what it is for one set of principles

to be represented within another set of principles. In contemporary mathematical

logic, there are a number of notions of representation, which differ from one an-

other both in terms of what and how they represent. Some of these notions are

theory-based, wherein the key idea is that one theory is representable within an-

other if provability within the represented theory is matched by provability within

the representing theory. Others of these notions are structure-based, where the

key idea is that the represented structure be isomorphic to a structure definable

in the representing structure. In § 1.2, I review in more detail these differences

between the theory-based and structure-based notions of representation which are

found in contemporary mathematical logic.
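
The contrast between the two families of notions can be put schematically as follows. This is a simplified sketch rather than the official definitions reviewed in § 1.2, and the letters T, S, I, M, N, and D are placeholders introduced only for this illustration.

```latex
% Requires amsmath for \text.
% Theory-based (sketch): theory T is represented in theory S via a translation I
% of formulas when provability in T is matched by provability in S.
\[
  T \vdash \varphi \;\Longrightarrow\; S \vdash \varphi^{I}
  \qquad \text{for every sentence } \varphi \text{ in the language of } T
\]
% Structure-based (sketch): structure M is represented in structure N when
% M is isomorphic to some structure D that is definable in N.
\[
  \exists D\; \bigl( D \text{ is definable in } N \;\wedge\; M \cong D \bigr)
\]
```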

While the versions of the Logicist Template which I consider in §§ 1.3-1.4 are

centered around these notions from contemporary mathematical logic, it is impor-

tant to emphasize that there is an obvious sense in which adopting such a perspec-

tive is both partially ahistorical and potentially limiting. For instance, while the

rudiments of the theory-based notion of representation seem to be present in tra-

ditional logicists such as Frege and Russell, it is not obvious that the same can be

said of the structure-based notions, simply due to the relatively recent provenance

of the model-theoretic ideas in terms of which the structure-based notions are

defined. Further, it is obvious but bears explicit mentioning that there is no rea-

son to think that the theory-based and structure-based versions of representation

considered here exhaust everything which might legitimately claim right to the

admittedly loose title of “a notion of representation.” However, in order to evalu-

ate the Logicist Template, some precise notion of representation must be offered,

and in this chapter I focus on evaluating versions centered around the theory-based

and structure-based notions of representation from contemporary mathematical

logic.

In particular, in § 1.3, I consider theory-based versions of the Logicist Tem-

plate, in which it is contended that knowledge of arithmetical theory may be based

on knowledge of logical theory because the arithmetical theory may be represented

within the logical theory (in the manner of representation germane to theories).

My thesis here is that the theory-based versions of the Logicist Template cannot

exert an appropriate amount of control over the variety and scope of the proposi-

tions which are represented. In particular, two problems, which I call the plethora

problem and the consistency problem, show respectively that too much would be

counted as knowledge by this view or that inconsistent propositions would each

be counted as knowledge by this view.

In § 1.4, I turn to a version of the Logicist Template centered around a

structure-based notion of representation. In particular, I articulate a specific structure-based

version of the Logicist Template and argue that it has the resources to overcome

the plethora problem and the consistency problem which beset the theory-based

versions. However, I argue that it does so at a certain cost, and in

particular that it faces two problems pertaining to our knowledge of structures,

which I call the isomorphism problem and the signature problem. The isomorphism

problem is that this structure-based version requires knowledge of properties of

structures which are not invariant under isomorphism, a requirement which is

contrary to one way of making precise the thought that structures can only be

specified up to isomorphism. The signature problem is that knowledge of the sig-

nature of the natural numbers requires knowledge of the Peano axioms, so that

it seems that the structure-based version of the Logicist Template requires the

very knowledge which it seeks to deliver. While I will present reasons for thinking

that the isomorphism problem can be overcome, it is my view that the signature

problem poses a deep and presently unanswered challenge to the structure-based

version of the Logicist Template.

Hence, my overall conclusion in this chapter is that both the theory-based and

structure-based versions of the Logicist Template face deep problems, and hence

that hitherto no satisfactory version of the Logicist Template has been presented

which can secure the inference from knowledge of logical principles such as Hume’s

Principle to knowledge of arithmetical principles such as the Peano axioms. This,

of course, is not to say that this inference cannot be secured, but merely to point

out particular challenges to the extant proposals. Put positively, these challenges

give us a better picture of what a notion of representation must look like if it is


to sustain a viable version of the Logicist Template.

Before turning to an overview of the theory-based and structure-based versions

of representation which are found in contemporary mathematical logic, it is worth

underscoring the admittedly limited scope of the types of logicism that I consider

in this chapter. For, there are many important epistemic projects associated with

traditional logicists such as Frege and Russell– for instance, there is Frege’s idea

that arithmetical knowledge is more widely applicable than other types of math-

ematical knowledge.6 However, for the sake of being able to say something both

specific and brief, in this chapter I limit myself to an evaluation of the epistemic

strand of logicism which takes up Frege’s idea that mathematical induction is

“based on general logical laws” and Crispin Wright’s idea that Hume’s Principle

gives us a way to “apprehend the truth” of the Peano axioms.7

1.2 Background: The Interpretability of Theories and Structures

The goal of this section is to provide background on some of the notions of

representation which are to be found in contemporary mathematical logic. In the

tradition of mathematical logic, such representations are called “interpretations,”

and the reader who is already familiar with the notion of interpretability may

wish to proceed directly to § 1.3 and refer back to this section as needed. By way

of orientation, it is important to recall at the outset that part of the power of

mathematical logic resides in the fact that it moves back and forth between two

perspectives, one that is concerned with theories and proofs, and another that

is concerned with structures and definability. Hence, there are notions of inter-

pretability for theories and notions of interpretability for structures, and whereas

the former are centered around proof, the latter are centered around definability.


Structures and theories are both relative to formal languages or signatures, and

these are simply specifications of a class of constant symbols, relation symbols,

and function symbols. Given a signature, a structure then is simply a set along

with distinguished constants, relations, and functions on this set corresponding to

the symbols from the signature. Likewise, given a signature, a theory is simply

a collection of sentences in this signature. Natural examples of structures in this

sense are the real and complex fields, which are given in a signature containing

function symbols for addition and multiplication. Examples of theories in this

sense are the complete theories of the real and complex fields, i.e. the set of

all sentences in this signature which are true on these structures. The Zermelo-Fraenkel axioms for set theory are another natural example of a theory, and in

this case the signature in question simply consists of the single binary relation

symbol corresponding to the membership relation.8 So, in this section, my goal is

to say what it means for one structure to be interpretable in another and what it

means for one theory to be interpretable in another.

The motivating idea behind the definition of the interpretability of one struc-

ture within another is that it is designed to generalize several classical construc-

tions from 19th Century mathematics. For instance, the field of complex numbers

is interpretable in the field of real numbers since the complex numbers can be

taken to be pairs of real numbers. Likewise, the real projective plane is inter-

pretable in the field of real numbers since the points of the real projective plane

can be taken to be equivalence classes (a,b,c)/E of non-zero triples (a, b, c) of real

numbers under the equivalence relation E of “being on the same line through the

origin”:

(a, b, c) E (x, y, z) ⇐⇒ ∃ λ ≠ 0 [a = λx & b = λy & c = λz] (1.1)


The notion of the interpretability of one structure in another generalizes these two

examples. In particular, if M is a structure, then a set X ⊆ M^n is definable in

M if there is a first-order formula ϕ(x), perhaps containing parameters from M ,

such that:

x ∈ X ⇐⇒ M ⊨ ϕ(x) (1.2)

Building on this, one says that a structure M is interpretable in a structure M∗

if it is isomorphic to a structure whose domain, constants, relations, and func-

tions are definable in M∗, perhaps using an equivalence relation for equality.

Here two structures in the same signature are said to be isomorphic if there is

a structure-preserving one-one map from the one onto the other.9 This is exactly

what happens with both the complex numbers and the real projective plane: the

complex numbers can be taken to be pairs of real numbers and the points of the

real projective plane can be taken to be equivalence classes of non-zero triples of

real numbers.
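
By way of illustration only, the two constructions just mentioned can be sketched in Python. This is a minimal sketch under stated assumptions: the names `c_add`, `c_mul`, and `same_projective_point` are hypothetical and do not come from the text, integers stand in for real numbers, and the test for the relation E uses the vanishing of the cross product, which for non-zero triples is equivalent to the existence of a non-zero λ as in (1.1).

```python
# Sketch: two structures "built inside" the real field.
# All function names here are illustrative, not taken from the text.

def c_add(p, q):
    """Complex addition on pairs (a, b) ~ a + bi, using only real arithmetic."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def c_mul(p, q):
    """Complex multiplication on pairs, using only real arithmetic."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

def same_projective_point(u, v):
    """The relation E of (1.1) on non-zero triples.

    Assumption: u and v are non-zero triples. They are then proportional
    by some non-zero scalar exactly when their cross product vanishes.
    """
    (a, b, c), (x, y, z) = u, v
    return (b * z - c * y, c * x - a * z, a * y - b * x) == (0, 0, 0)

i = (0, 1)  # the pair playing the role of the imaginary unit
```

For instance, `c_mul(i, i)` returns the pair `(-1, 0)`, which plays the role of −1, and `same_projective_point((1, 2, 3), (2, 4, 6))` holds because the second triple is twice the first. Both operations are given by formulas over the reals, which is what definability in the sense of (1.2) requires.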

To get a better sense for the types of distinctions that interpretability does

and does not recognize, it is helpful to define the notion of mutual interpretability.

Two structures are said to be mutually interpretable if each interprets the other.

One might initially think that mutually interpretable structures would have to

be very similar to each other, since each in some sense “contains” the other.

However, there are algebraic structures which are mutually interpretable with

geometric structures. For instance, the real numbers are mutually interpretable

with the Euclidean plane. One direction of this result is easy to see: the Euclidean

plane is interpretable in the real numbers since one can take points to be given

by their x- and y-coordinates and since one can take lines to be given by their

slope and y-intercept. The other direction is non-trivial and is called


the “introduction of coordinates”: this result says that if one starts with the

points and lines of the Euclidean plane, then one can define notions of addition

and multiplication and thereby recover the real numbers.10 Hence, very natural

and traditional distinctions like the distinction between algebraic and geometric

structures cannot be recognized from the perspective of the interpretability of

structures.
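
The easy direction described above can be sketched as follows. This is only an illustration under assumptions of my own: rational numbers (Python's `Fraction`) stand in for the reals, the names `on_line` and `line_through` are hypothetical, and vertical lines, which have no slope, are simply left out of this simplified coding.

```python
from fractions import Fraction

# Sketch: points are pairs (x, y); non-vertical lines are pairs (m, c),
# read as slope and y-intercept. Names are illustrative only.

def on_line(point, line):
    """Incidence of a point and a line, given by the equation y = m*x + c."""
    x, y = point
    m, c = line
    return y == m * x + c

def line_through(p, q):
    """The (slope, intercept) pair of the line through two points.

    Assumption: p and q have distinct x-coordinates, so the line is
    not vertical and the slope is well defined.
    """
    (x1, y1), (x2, y2) = p, q
    m = Fraction(y2 - y1, x2 - x1)
    return (m, y1 - m * x1)
```

The point of the sketch is that the geometric relation of incidence is expressed by an equation in addition and multiplication, so that it is definable in the field structure.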

Whereas the key role in the interpretability of structures is played by the

notion of definability, the key role in the interpretability of theories is played by

the notion of provability. In particular, one says that a theory T is interpretable

in a theory T ∗ if the primitives of the interpreted theory T can be translated into

formulas of the interpreting theory T ∗ so that the translation ϕ∗ of every theorem

ϕ of T is a theorem of T ∗. That is, the key idea is that the translations of theorems

are theorems:

T ⊢ ϕ =⇒ T ∗ ⊢ ϕ∗ (1.3)

For instance, the Zermelo-Fraenkel axioms for set theory interpret the Peano ax-

ioms for arithmetic because one can associate the arithmetical primitive “being

a natural number” with the set-theoretic formula “being a finite ordinal,” and

likewise one can associate “x < y” with “x ∈ y,” and “x = 0” with “x = ∅.” One

can then verify that the translations of arithmetical theorems are set-theoretic

theorems. For instance, it is a theorem of Peano arithmetic that no natural num-

ber is less than zero, and it is likewise a theorem of Zermelo-Fraenkel set theory

that no finite ordinal is contained in the empty set.11
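
The translation just described can be checked on small cases with a short Python sketch. It assumes the usual von Neumann coding of finite ordinals, on which each ordinal is the set of its predecessors. The function name `ordinal` is hypothetical, and `frozenset` is used only because Python sets must have hashable members.

```python
# Sketch: finite von Neumann ordinals as hereditarily finite sets.
# The name `ordinal` is illustrative, not from the text.

def ordinal(n):
    """Return the finite ordinal n = {0, 1, ..., n-1} as a frozenset."""
    s = frozenset()              # 0 is the empty set
    for _ in range(n):
        s = s | frozenset([s])   # successor of s is s ∪ {s}
    return s

# "x < y" is translated as "x ∈ y", and "x = 0" as "x = ∅".
def less_than(x, y):
    return x in y
```

On this coding, `less_than(ordinal(2), ordinal(5))` holds while `less_than(ordinal(3), ordinal(0))` fails, matching the arithmetical theorem that no natural number is less than zero.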

It is important to note that this notion of a “translation” is not necessarily

imbued with any sense of “preservation of meaning,” but rather merely denotes a

mechanical method of transforming theorems of the interpreted theory into theo-


rems of the interpreting theory. One example which illustrates this phenomenon

nicely is when the interpreted and interpreting theories are the same. For instance,

take both the interpreted and interpreting theories to be the theory of a dense

linear order without endpoints. For the sake of concreteness, one can think of

these theories as the complete theory of the rational numbers (Q, <) as a linear

order.12 Then, if one translates “less than” by “greater than,” it is easy to see

that the translation of every theorem of the interpreted theory is a theorem of the

interpreting theory. For instance, it is a theorem of the interpreted theory that

the ordering is dense, i.e., between any two rational numbers is another rational

number:

∀ x, y [x < y → (∃ z x < z < y)] (1.4)

When translated, this theorem becomes the following theorem of the interpreting

theory:

∀ x, y [x > y → (∃ z x > z > y)] (1.5)

Hence, while translating “less than” by “greater than” preserves theoremhood and

hence provides us with an interpretation, it is clear that “less than” means some-

thing substantially different from “greater than.” This example illustrates that

translations used in interpretations are quite different in character from transla-

tions between natural languages.
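
The purely mechanical character of such a translation can be brought out with a small Python sketch. The encoding of formulas as nested tuples and the name `translate` are assumptions of mine, introduced only for illustration. Since the signature contains only "<", the sketch renders "s > t" as the atomic formula "t < s".

```python
# Sketch: a syntactic translation that reads "less than" as "greater than".
# Formulas are nested tuples; this encoding is illustrative only.

def translate(formula):
    """Translate a formula by replacing each atom 's < t' with 't < s'."""
    op = formula[0]
    if op == "<":
        _, s, t = formula
        return ("<", t, s)            # 's > t', written in the signature {<}
    if op in ("forall", "exists"):
        _, var, body = formula
        return (op, var, translate(body))
    # Connectives ("not", "and", "or", "implies") are left unchanged.
    return (op,) + tuple(translate(sub) for sub in formula[1:])

# The density axiom (1.4): for all x, y, if x < y then some z has x < z < y.
density = ("forall", "x", ("forall", "y",
           ("implies", ("<", "x", "y"),
            ("exists", "z", ("and", ("<", "x", "z"), ("<", "z", "y"))))))
```

Applying `translate` to `density` yields the tuple encoding of (1.5), and applying it twice returns the original formula. Nothing in the procedure consults what the symbols mean.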

It is instructive to contrast the interpretability of theories to the faithful inter-

pretability of theories. A theory T is said to be faithfully interpretable in a theory

T ∗ if T is interpretable in T ∗ so that translations of theorems are theorems and

so that translations of non-theorems are non-theorems:

T ⊢ ϕ ⇐⇒ T ∗ ⊢ ϕ∗ (1.6)


It turns out that there are many examples of interpretations which are not faith-

ful interpretations. That is, while interpretability means that “provability facts”

about the interpreted theory are represented in the interpreting theory, it doesn’t

mean that the “non-provability facts” are so represented. For instance, the inter-

pretation of Peano arithmetic in Zermelo-Fraenkel set theory given above is not

a faithful interpretation because Peano arithmetic doesn’t prove its own consis-

tency, whereas Zermelo-Fraenkel set theory does prove the consistency of Peano

arithmetic. Such an example illustrates the fact that an interpreting theory may

be able to deduce more about the interpreted theory than the interpreted theory

itself can.13

A similar point about the potential incongruity of the interpreted and inter-

preting theories can be seen by switching to the perspective of structures. By the

completeness theorems, it is not difficult to see that one theory is interpretable

in another theory if and only if any structure that models the interpreting theory

uniformly interprets a structure that models the interpreted theory, where the

sense of “uniform” is that the same formulas are used each time to interpret the

models of the interpreted theory. Hence, to say that one theory is interpretable

in another is to say something about what happens in all the instantiations of

the interpreting theory. For instance, to say that axioms for arithmetic are inter-

pretable in axioms for set theory is to say something about what happens in every

model of set theory, and is not to say anything about what happens in all models

of arithmetic. So an interpreting theory may not be able to see all the models of

the theory which it interprets.

Finally, just as we defined the mutual interpretability of structures, so we can

define the mutual interpretability of theories and the mutual faithful interpretabil-


ity of theories. In particular, two theories are said to be mutually interpretable if

each interprets the other, and two theories are said to be mutually faithfully in-

terpretable if each faithfully interprets the other. Just as faithful interpretability

implies interpretability, so we have that mutual faithful interpretability implies

mutual interpretability. Hence, mutual faithful interpretability is the most re-

strictive notion of the interpretability of theories that we are considering here:

it implies but is not implied by the notions of mutual interpretability, faithful

interpretability, and interpretability.14

So in this section I have exposited two important families of notions of rep-

resentation. On the one hand, there are the notions of the interpretability and

mutual interpretability of structures. On the other hand, there are the notions of

the interpretability, faithful interpretability, mutual interpretability, and mutual

faithful interpretability of theories. From a certain perspective, this plurality of

notions of interpretability is of course exactly what one would expect. Hodges

makes this point with characteristic eloquence when he says: “Interpretations are

about different ways of looking at one and the same thing. So it should cause no

surprise that there are several different ways of looking at interpretations” ([70]

p. 201).

1.3 Theory-Based Versions: the Plethora and Consistency Problems

In the previous section, I distinguished between a family of theory-based no-

tions of representation and a family of structure-based notions of representation.

My concern here is to evaluate theory-based versions of the Logicist Template,

i.e., claims to the effect that knowledge of arithmetical theory may be based on

knowledge of logical theory and the knowledge that this arithmetical theory is


representable qua theory in the logical theory. In the next section, a structure-

based version of the Logicist Template will be considered. My thesis in this section

about the theory-based versions of the Logicist Template concerns the inability

of these versions to control the kinds of propositions which are representable qua

theories. In particular, each theory-based version of the Logicist Template seems

vulnerable to one of two problems, which I call the plethora problem and the

consistency problem. These problems are respectively that too much would get

counted as knowledge by these versions or that these versions would count both

a proposition and its negation as knowledge.

It is important to be clear that in this section I am considering a family of

theory-based versions of the Logicist Template, corresponding to the family of

theory-based notions of representation discussed in the previous section. In par-

ticular, there I distinguished between four different theory-based notions: the in-

terpretability, faithful interpretability, mutual interpretability, and mutual faithful

interpretability of theories. With respect to each such notion of interpretation X,

I want to here examine the following version of the Logicist Template:

Theory-Based Version of the Logicist Template (Relative to X-Interpretability): knowledge of arithmetical theory may be based on knowledge of logical theory and the knowledge that this arithmetical theory is X-interpretable within the logical theory.

My thesis about the failure of control applies across the board to all four theory-

based versions of the Logicist Template: in each case, either the plethora problem

or the consistency problem allows us to point to specific examples of propositions

which would on this account be counted as knowledge but which are neither

obviously nor necessarily known.

Before turning to these examples, let me briefly note one place where a promi-

nent logicist seems to endorse something quite similar to a theory-based version of


the Logicist Template. The following is a passage which Wright repeats verbatim

in two different essays:

The neo-Fregean thesis about arithmetic is that a knowledge of its fundamental laws (essentially, the Dedekind-Peano axioms)– and hence of the existence of a range of objects which satisfy them– may be based on Hume’s Principle as an explanation of the concept of cardinal number in general, and finite cardinal number in particular. More specifically, the thesis involves four ingredient claims: [¶] (i) that the vocabulary of higher-order logic plus the cardinality operator, octothorpe [#] or ‘Nx: . . . x. . . ’, provides a sufficient definitional basis for a statement of the basic laws of arithmetic; [¶] (ii) that when they are so stated, Hume’s Principle provides for a derivation of those laws within higher-order logic [. . . ] ([160] p. 389, [161] p. 17, [60] pp. 256, 321).

It seems to me that the key idea expressed in these two roman numerals is that

(i) there is a way of translating arithmetical primitives into formulas about car-

dinalities, and that (ii) all the axioms of Peano arithmetic become theorems of

Hume’s Principle when so translated. This, of course, implies that the translations of theorems of Peano arithmetic are theorems of Hume’s Principle, which by

definition is what it means for the Peano axioms to be interpretable in Hume’s

Principle. Hence, it seems that what Wright is here suggesting is that knowledge

of the Peano axioms may be based on knowledge of Hume’s Principle because

Hume’s Principle interprets the Peano axioms.

However, there are some non-trivial difficulties involved in explicating the key

notion of a “sufficient definitional basis” which figures in component (i) of Wright’s

neo-Fregean thesis. I have understood it to mean simply a method of associating

arithmetical primitives to formulas about cardinalities. Hence, I have understood

components (i) and (ii) of Wright’s neo-Fregean thesis to mean that the inter-

pretability of the Peano axioms within Hume’s Principle is sufficient for knowl-

edge of the Peano axioms to be based on knowledge of Hume’s Principle. My


primary textual evidence for this can be found in Wright’s remarks in his essay

“Is Hume’s Principle Analytic?” ([161], [60] p. 307 ff). For instance, consider the

parenthetical remark which Wright makes at the opening of his essay:

The interest– if indeed any– of the question whether the principle [Hume’s Principle] is analytic is wholly consequential on what has come to be known as Frege’s Theorem: the proof [. . . ] that second-order logic plus Hume’s Principle as sole additional axiom suffices for a derivation of second-order arithmetic– or, more cautiously, for the derivation of a theory which allows of interpretation as second-order arithmetic. (Actually I think the caution is unnecessary– more of that later) ([161] p. 6, [60] p. 307).

I take it that, in this parenthetical remark, Wright is indicating that the inter-

pretability of the Peano axioms within Hume’s Principle is sufficient for his philo-

sophical purposes, which as he later states involves an account of our knowledge

of the Peano axioms, or as he calls them, “the fundamental laws of arithmetic.”

This seems to be confirmed when Wright later explicitly elaborates on his understanding of this key notion of a “sufficient definitional basis” which figures in

component (i) of his neo-Fregean thesis. If I understand this key passage correctly,

Wright contends in the last sentence of the below quotation that interpretability

of the Peano axioms in Hume’s Principle suffices for our knowledge of “pure arith-

metic,” such as we find in number theory:

No question of course but that Frege shows how to define expressions which comport themselves like those for successor, zero, and the predicate ‘natural number,’ thus enabling the formulation of a theory which allows of interpretation as Peano arithmetic. But– as we remarked right at the start– it is one thing to define expressions which, at least in pure arithmetical contexts, behave as though they express those various notions, another to define those notions themselves. [. . . ] How is the stronger point to be made good? [¶] Well, I imagine it will be granted that to define the distinctively arithmetical concepts is so to define a range of expressions that the use thereby laid down for those expressions is indistinguishable from that of expressions which do indeed express those concepts. The interpretability of Peano arithmetic within Fregean arithmetic ensures that has already been accomplished as far as all pure arithmetical uses are concerned ([161] pp. 17-18, [60] p. 322).


In the text immediately following this quotation, Wright goes on to discuss reasons

why Hume’s Principle can account for our knowledge of “applied arithmetic,” such

as e.g. my knowledge that I can infer from “there are exactly two F ’s” to “there

are distinct x, y which are F and everything that is an F is x or y.” But, in

any case, it is this last key sentence of the above quotation, namely that “the

interpretability of Peano arithmetic within Fregean arithmetic ensures that has

already been accomplished as far as all pure arithmetical uses are concerned,”

which is my primary evidence for understanding Wright as a proponent of a theory-

based version of the Logicist Template.

This of course is not to say that there are no passages in Wright which indicate

a sympathy for other versions of the Logicist Template. For instance, a few pages

earlier in this same essay, Wright says: “To be sure, it is a necessary condition of

the success of the neo-Fregean project that the relevant principle does more than

generate a theory within which arithmetic can be interpreted– there has to be

a tighter conceptual relationship than that” ([161] p. 15, [60] p. 317). However,

Wright does not say here what more is required or why more is required, and a

few pages later he goes on to make his remark that interpretability suffices “as far

as all pure arithmetical uses are concerned.” That is, if I understand correctly,

Wright ultimately endorses the contention that knowledge of “pure” statements of

arithmetic, such as the Mathematical Induction Principle, may be based entirely

on knowledge of Hume’s Principle and the knowledge that the Peano axioms are

interpretable in Hume’s Principle.

I want now to turn towards the evaluation of the claim that knowledge of

Hume’s Principle can be extended to knowledge of the Peano axioms by virtue of

the interpretability of the latter within the former. Of course, it is demonstrable


that Hume’s Principle interprets the Peano axioms. This result is sometimes

called Frege’s Theorem. The elements of the proof of this theorem can be found

in Frege’s writings, and the rediscovery of this theorem by Wright constitutes an

important contribution to our understanding of both traditional and contemporary

logicism.15 So my concern here is only with how to understand the philosophical

consequences of Frege’s Theorem. In particular, I want now to consider the viability

of the following theory-based version of the Logicist Template: knowledge of an

arithmetical theory such as the Peano axioms may be based on knowledge of a

logical theory such as Hume’s Principle and the knowledge that this arithmetical

theory is interpretable in this logical theory.

One problem with this version of the Logicist Template, which I will dub

the plethora problem, has been voiced in different ways by Richard Heck and

Thomas Hofweber, and even earlier by Walter Hoering, although Hoering and

Hofweber are concerned with intertheoretic reduction and not with logicism in

particular.16 The plethora problem stems from the fact that many theories are

interpretable in the Peano axioms. For instance, it is well-known from the work

of Tarski that the complete first-order theories of the real and complex numbers

are interpretable in the Peano axioms.17 Since the interpretability of theories is

a transitive relation, Frege’s Theorem implies that the complete theories of the

real and complex numbers are interpretable in Hume’s Principle. However, it

would seem strange to suggest that these theories can come to be known by way

of an interpretability result. For instance, one of the axioms of the complex

numbers is the Fundamental Theorem of Algebra, which asserts that every non-constant polynomial in one variable has a root. The proofs of this theorem which

mathematicians accept and teach to their students are all non-trivial, and typically


require appeal to limits or to topological notions, each of which must be studied in

its own right before one can begin to understand these proofs of the Fundamental

Theorem of Algebra.18 It would seem counterintuitive to suggest that all of this

could be circumvented by appeal to a comparatively elementary interpretability

result. Hence, the plethora problem is that too much knowledge is generated by

the claim that knowledge of one theory can be based on knowledge of a theory

which interprets it.

One response to the plethora problem is simply to accept it and to strengthen

the notion of interpretation so as to avoid these sorts of counterexamples.19 In par-

ticular, thus far I have been considering a version of the Logicist Template which

claims that knowledge of arithmetical principles may be based on knowledge of

principles which interpret these arithmetical principles. Let us now consider a

more circumspect version of the Logicist Template which claims that knowledge

of arithmetical principles may be based on knowledge of principles which faithfully

interpret these arithmetical principles. Recall from the previous section that faith-

ful interpretability not only requires that translations of theorems are theorems,

but also that translations of non-theorems are non-theorems. One might initially

suspect that the plethora problem could be avoided by requiring that the inter-

preting theory know only as much about the interpreted theory as the interpreted

theory does itself.

However, it turns out that this is not the case. In particular, Tarski’s result was

that the complete theories of the real and complex numbers are interpretable in

the Peano axioms, and hence in Hume’s Principle. The theorems of the complete

theory of the complex numbers are by definition precisely the true statements

about the complex numbers, and the non-theorems are precisely the false state-


ments about the complex numbers. Hence, since the negations of false statements

are true statements, it follows that the negations of non-theorems are theorems in

this setting. Since translations preserve negations, and since the interpreting theory is consistent, it follows that

the translations of non-theorems are non-theorems. Hence, the complete theories

of the real and complex numbers are faithfully interpretable in the Peano axioms,

and hence in Hume’s Principle. So, the plethora problem applies with equal force

to faithful interpretability as to interpretability itself.

This raises the question of whether a theory-based version of the Logicist Tem-

plate centered around mutual interpretability fares any better with respect to the

plethora problem than do versions based on interpretability and faithful inter-

pretability. Recall from the previous section that two theories are mutually inter-

pretable if each can interpret the other. Hence, I want to now consider a version

of the Logicist Template which says that knowledge of arithmetical principles may

be based on knowledge of principles which are mutually interpretable with these

arithmetical principles. It turns out that one can in fact avoid the plethora prob-

lem in this way. For instance, it is known from other parts of Tarski’s work on the

decidability of theories that the complete theories of the real and complex num-

bers are not mutually interpretable with the Peano axioms.20 Moreover, not only

can the plethora problem be avoided in this setting, but the analogous version of

Frege’s Theorem holds: in particular, it follows from work of Boolos that Hume’s

Principle and the Peano axioms are mutually interpretable.21,22

However, a different problem, which I call the consistency problem, besets

theory-based versions of the Logicist Template centered around mutual inter-

pretability. Such versions claim that knowledge of arithmetical principles may

be based on knowledge of principles which are mutually interpretable with these


arithmetical principles. The consistency problem is that the same sort of evidence

can provide us with “knowledge” of the negation of some of these arithmetical

principles. In particular, let us call the anti-Peano axioms the Peano axioms but

with the Mathematical Induction Principle replaced by its negation. Then, it

is essentially implicit in the work of Dedekind that the anti-Peano axioms are

mutually interpretable with the Peano axioms.23 Since mutual interpretability

is an equivalence relation, it follows from Frege’s Theorem that the anti-Peano

axioms are mutually interpretable with Hume’s Principle. Hence, if knowledge of

the Peano axioms may be based on Hume’s Principle because Hume’s Principle is

mutually interpretable with the Peano axioms, then presumably it is likewise true

that “knowledge” of the anti-Peano axioms may be based on Hume’s Principle

because Hume’s Principle is mutually interpretable with the anti-Peano axioms.

However, presumably it is absurd to suggest that both the Mathematical Induction

Principle and its negation can be known.

It turns out that the consistency problem is just as much a problem for mutual

faithful interpretability as it is for mutual interpretability. For instance, take the

theory T which is the theory of a dense linear order plus the axiom that one and

only one of the following possibilities occurs: either there is a least element and

no greatest element, or there is a greatest element and no least element. Then

by interpreting “greater than” by “less than,” one can easily see that T + ϕ and

T + ¬ϕ are mutually faithfully interpretable, where ϕ is the sentence saying that

there is a greatest element. Hence, it seems that it is just wrong to claim that

knowledge of one theory may be based on knowledge of a theory which is mutually

interpretable with that theory or which is mutually faithfully interpretable with

that theory. For, while this claim may support an inference from knowledge of


Hume’s Principle to knowledge of the Peano axioms, it does so at the cost of

failing to respect the dictum that one cannot know both a proposition and its

negation.24

It is helpful to contrast the consistency problem to the plethora problem. The

basic idea behind the plethora problem was that representations were too easy to

come by and hence that too much gets counted as knowledge if one claims that

knowledge of one theory may be based on knowledge of a theory which represents

that theory. By contrast, the consistency problem notes that such a claim leads

us to violate the basic principle that one cannot know both a proposition and

its negation. However, despite their differences, each of these two problems is

illustrative of a general problem of control: if one wants to claim that knowledge

can be passed along the kinds of theory-based representations considered here,

then one is faced with the problem that one cannot properly control the scope

and variety of the claims that get passed along. The plethora problem was that

too many things get counted as knowledge. The consistency problem was that

both a proposition and its negation would get counted as knowledge.

While it has been known for a long time that there are theories T and sentences

ϕ such that T + ϕ and T + ¬ϕ are mutually interpretable (or indeed mutually

faithfully interpretable), the relevance of this for logicism does not seem to have

been previously noted. However, its relevance for other programs in the philosophy

of mathematics has been previously discussed. For instance, Edward Nelson had

the idea of characterizing a very constructive theory of arithmetic as the collection

of all those sentences of arithmetic which were mutually interpretable with a very

weak set of arithmetical principles. But Nelson noted and posed the problem of

determining whether the conjunction of two sentences has this property whenever

the two sentences themselves individually have this property. Later, it was found

that both a sentence and its negation could have this property, thus undercutting

the idea that the collection of all sentences with this property could have constituted

a theory of arithmetic in the first place.25

Another place in the philosophy of mathematics where the consistency prob-

lem has been noted is in discussions of set theory. In particular, some set theorists

have appealed to a theorem of Guaspari and Lindström, which says that finite extensions

of the Zermelo-Fraenkel axioms for set theory are mutually interpretable if

and only if they prove exactly the same Π⁰₁-sentences.26 Here a Π⁰₁-sentence is simply

a sentence which begins with a universal quantifier over natural numbers and

all of whose other quantifiers are bounded to natural numbers mentioned earlier

in the sentence. So, for example, Goldbach’s conjecture and the consistency statements

from Gödel’s second incompleteness theorem are examples of such sentences.

There is a long tradition, stemming from Hilbert’s Program,27 of privileging such

sentences, and recently Peter Koellner has suggested a new variation on this idea.

Koellner’s idea is that the Π⁰₁-sentences are exactly the observational sentences,

so that the Guaspari-Lindström theorem implies that while two mutually interpretable

set theories may disagree vastly about the nature of sets, they must of

necessity have the same observational consequences ([93] p. 98). However, there

does not seem to be any analogue of Koellner’s idea that is available to the logi-

cist. For, the logicist is interested in interpretations between theories in different

signatures, whereas the Guaspari-Lindström theorem only applies to extensions of

a fixed theory in a fixed signature. Further, since most of the sentences the logicist

is interested in, such as the claim that every natural number has a successor, are

not Π⁰₁-sentences, they would not be covered by the Guaspari-Lindström theorem

in the first place.

So in this section I have described how the problem of control emerges for

versions of the Logicist Template centered around four distinct notions of the

interpretability of theories: interpretability, faithful interpretability, mutual inter-

pretability, and mutual faithful interpretability. It is tempting to infer from this

that there is a problem of control for all versions of the Logicist Template cen-

tered around theory-based notions of representation. However, given the inherent

open-endedness of this notion of “theory-based,” it seems hard to substantiate

this claim. At best it seems that one can say something about what a notion of

representation must look like if it is going to sustain the Logicist Template. In

particular, the plethora problem shows us that such a notion of representation

can’t be such that many different theories are interpretable in any one given the-

ory. Likewise, the consistency problem shows that one can’t have cases where the

notion of representation in question fails to distinguish between a proposition and

its negation. To the best of my knowledge, all the known theory-based notions

of representation fail to meet at least one of these two conditions. However, this is no

reason to suspect that everything which has a right to the title of “a theory-based

version of the Logicist Template” will likewise fail.

In particular, one way in which the advocate of the Logicist Template might

respond to the criticism which I have been offering in this section is to present a

new description of the relationship between Hume’s Principle and the Peano ax-

ioms. Frege’s Theorem tells us that the Peano axioms are interpretable in Hume’s

Principle, but this does not preclude these two theories from being linked by some

stronger notion of interpretability, a notion which perhaps avoids the plethora

problem and the consistency problem. To the extent that this could be accomplished,

it would be possible for the advocate of the Logicist Template to be in

complete accord with everything which I have said in this section. For, the most

general moral of the plethora problem and the consistency problem is that the

inference from knowledge of logical principles to knowledge of arithmetical prin-

ciples has to be based on something more than a knowledge of interpretability

(or mutual interpretability, or faithful interpretability, or mutual faithful interpretability).

For all that has been said in this section, it may be the case that this

something more is simply knowledge of a stronger theory-based interpretability

relation that links logical theory to arithmetical theory.

1.4 Structure-Based Version: the Isomorphism and Signature Problems

The goal of this section is to articulate and examine a single structure-based

version of the Logicist Template. This structure-based version is centered around

a notion of dual interpretability, which incorporates both theories and structures.

Subsequent to defining this notion, I describe how incorporating structures allows

this version to avoid the plethora and consistency problems. However, precisely

because it incorporates structures, the structure-based version of the Logicist Tem-

plate must now also account for our knowledge of structures, and this comes with

its own problems. In particular, I describe the isomorphism problem, which notes

that while the structure-based version requires knowledge of the properties of

structures which are not invariant under isomorphism, one might have the intu-

ition that all our knowledge of structures is so invariant. Likewise, I describe the

signature problem, which suggests that our knowledge of the signature of the nat-

ural numbers requires prior knowledge of arithmetical truths, and, in particular,

knowledge of arithmetical truths that the structure-based version was originally

designed to deliver. While I present reasons in this section for thinking that the

isomorphism problem can be overcome, I will argue that the signature

problem poses a deep challenge to the structure-based version of the Logicist

Template.

The motivating idea behind dual interpretability is not to relate individual

theories to individual theories or individual structures to individual structures,

but rather to relate one pairing of a theory with a structure to another pairing

of a theory with a structure. Such a pairing of a theory with a structure shall

be indicated by saying that the theory is about the structure. This locution is

introduced purely for the purpose of avoiding the cumbersomeness of speaking

of ordered pairs of theories and structures. Further, when I speak in this way

of a theory being about a structure, all that shall be assumed is that the theory

and the structure both have the same signature, and it shall not for instance be

assumed that the theory is true of the structure.

Having this bit of notation in place, the notion of dual interpretability can

now be defined. In particular, let us say that theory T about structure M is dual

interpretable in theory T∗ about structure M∗ if (i) theory T is interpretable in

theory T∗, (ii) structure M is interpretable in structure M∗, and (iii) the

definitions used in both interpretations are the same. That is, the way in which

models of T∗ uniformly interpret models of T is exactly the same way in which M∗

interprets M. In effect, the notion of dual interpretability is a kind of pre-established

harmony of the interpretability of theories and structures. For, it

consists simply in the interpretability of theories on the one side and the inter-

pretability of structures on the other side, together with the added stipulation

that these two interpretations match up with one another.
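
In symbols, the definition can be sketched as follows. The notation is assumed here for illustration and does not appear in the source: τ stands for a single fixed choice of defining formulas, χ^τ for the translation of a sentence χ under τ, and (M∗)^τ for the structure that τ defines inside M∗. The fragment assumes the amsmath and amssymb packages.

```latex
% Sketch only: \tau, \chi^{\tau} and (M^{*})^{\tau} are notation assumed here
% for illustration.
% (T, M) is dual interpretable in (T^{*}, M^{*}) when one and the same \tau
% witnesses both the theory-level and the structure-level interpretation.
\[
(T, M) \text{ is dual interpretable in } (T^{*}, M^{*})
\;:\Longleftrightarrow\;
\exists \tau \,
\Bigl[\,
  \underbrace{\forall \chi \in T \;\; T^{*} \vdash \chi^{\tau}}_{\text{(i) theories}}
  \;\wedge\;
  \underbrace{M \cong (M^{*})^{\tau}}_{\text{(ii) structures, (iii) same definitions}}
\,\Bigr].
\]
```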

This notion of dual interpretability allows for the definition of the following

structure-based version of the Logicist Template: knowledge that the arithmetical

principles are true of the natural numbers may be based on knowledge that the

logical principles are true of their subject-matter and the knowledge that these

arithmetical principles about the natural numbers are dual interpretable in the

logical principles about their subject-matter. Again, deferring to the contempo-

rary literature, we can take the arithmetical principles to be the Peano axioms and

the logical principles to be Hume’s Principle. Since Hume’s Principle says that

two properties are assigned the same cardinality if and only if they can be one-one

correlated with each other, I shall use the phrase “the cardinalities” to designate

the structure associated to Hume’s Principle. The structure-based version of the

Logicist Template then reads as follows:

Structure-Based Version of the Logicist Template: knowledge that the Peano axioms are true of the natural numbers may be based on knowledge that Hume’s Principle is true of cardinalities and the knowledge that the Peano axioms about the natural numbers are dual interpretable in Hume’s Principle about cardinalities.

It is this version of the Logicist Template which shall occupy us throughout the

remainder of this chapter. But, as in the case of theory-based versions of the

Logicist Template, I do not mean to suggest that focus should be put on this

version because it is the only thing that might reasonably lay claim to the title of

a “structure-based version of the Logicist Template.” On my view, this particular

structure-based version merits our attention because it is able to avoid the prob-

lems which beleaguered the theory-based versions, namely the plethora problem

and the consistency problem.

This structure-based version is able to overcome the plethora problem because

it requires that there be an interpretation on the level of structures in addition to

an interpretation on the level of theories. The plethora problem was that an appeal

to interpretability would entitle us to more knowledge than we have legitimately

earned, and this structure-based version directly blocks this problem by crediting

us with knowledge only when there is knowledge of interpretability both at the

level of theories and the level of structures. To return to an example from the

previous section, it seems that knowledge of the Fundamental Theorem of Algebra

can be legitimately won by means of knowledge of a dual interpretability result,

viz. knowledge that the theory of the complex numbers about the structure of

the complex numbers is dual interpretable in the Peano axioms about the natural

numbers. For, this knowledge additionally involves being able to identify complex

numbers with certain sets of natural numbers. This, of course, is exactly the

route that mathematicians do take to establishing this theorem: they identify real

numbers with certain classes of natural numbers (Cauchy sequences, Dedekind

cuts), and then they identify complex numbers with pairs of real numbers, and

then they proceed to the Fundamental Theorem of Algebra by way of a study of

the analytic properties of the real and complex numbers which are expressible by

means of this identification. That is, if one attends to the actual proofs endorsed

by mathematicians, one sees quite immediately that the Fundamental Theorem of

Algebra is not a theorem of algebra but a theorem of analysis, since the theorem

is proved by recourse to analytic notions, which are made available by means of

interpretations on the level of structures. So, it is because this structure-based

version of the Logicist Template insists upon a knowledge of the interpretability of

both theories and structures that it can avoid the counterexamples which originally

attuned us to the plethora problem.28

The consistency problem revolved around examples of theories T and sentences

ϕ such that T + ϕ and T + ¬ϕ were mutually interpretable with one

another, so that the theory-based version of the Logicist Template was committed

to endorsing both a sentence and its negation. However, there is good reason to

think that this problem does not reemerge in the structure-based setting. In order

to see this, it is first necessary to state the following elementary result about dual

interpretability:

Elementary Result about Dual Interpretability: if theory T∗ is true of structure M∗ and if theory T about structure M is dual interpretable in theory T∗ about structure M∗, then theory T is true of structure M.

That is, in the setting of dual interpretability, one can show that if the interpreting

theory is true of the interpreting structure, then the interpreted theory is true of

the interpreted structure. Hence, in the sense of dual interpretability, one can

actually demonstrate that truth is preserved downward under interpretability.29
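
A proof sketch of this result can be given under assumed notation: τ is the shared choice of defining formulas, χ^τ the translation of a sentence χ, and (M∗)^τ the structure defined by τ in M∗. The sketch relies on the standard fact that a translated sentence holds in the interpreting structure exactly when the original sentence holds in the defined structure. These symbols are illustrative and are not taken from the source; the fragment assumes amsmath and amssymb.

```latex
% Sketch only; notation assumed for illustration.
% Hypotheses: M^{*} \models T^{*}, T^{*} \vdash \chi^{\tau} for every \chi \in T,
% and M \cong (M^{*})^{\tau}.
\begin{align*}
M^{*} \models T^{*}
  &\;\Longrightarrow\; M^{*} \models \chi^{\tau}
  && \text{since } T^{*} \vdash \chi^{\tau} \text{ (soundness)} \\
  &\;\Longrightarrow\; (M^{*})^{\tau} \models \chi
  && \text{since } M^{*} \models \chi^{\tau} \iff (M^{*})^{\tau} \models \chi \\
  &\;\Longrightarrow\; M \models \chi
  && \text{since } M \cong (M^{*})^{\tau} \text{ and isomorphism preserves truth.}
\end{align*}
% As \chi ranges over the axioms of T, this gives M \models T.
```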

It is easy to see how this elementary result allows us to overcome the consis-

tency problem. This problem was that there were natural examples of theories T

and sentences ϕ such that the two theories T + ϕ and T + ¬ϕ were mutually interpretable

with one another and such that these two theories intuitively concerned

the same subject-matter, e.g. they were rival claims about natural numbers. However,

the elementary result from the previous paragraph shows that these sorts of

examples cannot occur in this setting. For, suppose that T + ϕ and T + ¬ϕ are

both about the same structure M. Then it cannot be the case that (i) T is true

of M and that (ii) T + ¬ϕ about M is dual interpretable in T + ϕ about M, and

that (iii) T + ϕ about M is dual interpretable in T + ¬ϕ about M. For, by (i), either

T + ϕ is true of M or T + ¬ϕ is true of M. But, by (ii), if T + ϕ was true of M,

then the elementary result from the previous paragraph tells us that T + ¬ϕ would

be true of M, which is a contradiction. Likewise, by (iii), if T + ¬ϕ was true of M,

then this elementary result tells us that T + ϕ would be true of M, which is a contradiction.

Hence, what the elementary result from the previous paragraph tells

us is that by pinning everything down to a specific structure, the structure-based

version of the Logicist Template can avoid the consistency problem.30

It is also important to mention that this same elementary result provides us

with an explanation of why dual interpretability can be a source of knowledge.

For, it tells us that once one has a dual interpretability result, one may proceed in

a straightforwardly deductive manner from knowledge that the interpreting prin-

ciples are true of the interpreting structure to the knowledge that the interpreted

principles are true of the interpreted structure. In the previous section, the focus

was on counterexamples to theory-based versions of the Logicist Template and

the question was not even broached of what positive reasons one could adduce for

thinking that a theory-based interpretability result could be viewed as a source

of knowledge. Here, with respect to the structure-based version of the Logicist

Template, one can straightforwardly say why there is not anything mysterious

about how interpretability can be a source of knowledge: the mechanism that lets

us pass from knowledge of the interpreting theory to knowledge of the interpreted

theory in this setting is simply deduction.

I want now to discuss two problems with the structure-based version of the

Logicist Template, which I call the isomorphism problem and the signature prob-

lem. Both of these problems revolve around our knowledge of structures, and in

essence both problems arise when one starts to ask about the extent to which

knowledge of structures differs from knowledge of theories. It seems relatively

straightforward to speak of knowledge of theories, or knowledge that a theory is

true of a structure. For instance, it seems non-problematic to say that the recent

literature on the epistemic status of Hume’s Principle has focused on providing an

account of our knowledge of this theory, or perhaps an account of our knowledge

that this theory is true of the structure of cardinalities. However, when one begins

to speak of knowledge of structures that goes above and beyond knowledge of the

truths which they model, it seems that the picture becomes more opaque. The

idea behind both the isomorphism problem and the signature problem is that they

expose tensions between the transparency of our knowledge of structures and the

structure-based version of the Logicist Template. In the case of the isomorphism problem,

I will indicate my reasons for thinking that this tension can ultimately be

tempered. However, I can presently see no way in which to alleviate this tension

in the case of the signature problem, and so on my view this problem poses a deep

and hitherto unanswered challenge to this structure-based version of the Logicist

Template.

The isomorphism problem is that the structure-based version of the Logicist

Template is inconsistent with a certain way of setting out an intuition about the

isomorphism of structures. Recall from § 1.2 that two structures are said to be

isomorphic if there is a structure-preserving one-one map from the one onto the

other (cf. endnote 9). One intuition which one might have about structures and

isomorphisms is that structures can only be specified up to isomorphism. The

content of this intuition might be explicated in terms of what I call the following

invariance thesis: any property that we know a structure to have is also had

by all isomorphic copies of this structure, regardless of whether or not we know

that this property is had by all these isomorphic copies.31 For instance, one

kind of knowledge of structures which we ostensibly have is that various first-

order sentences are true of structures, and since isomorphism preserves first-order

truth, this knowledge extends to isomorphic copies in the manner mandated by

the thesis. In essence, the invariance thesis predicts that our knowledge that

sentences are true of structures is paradigmatic of our knowledge of structures in

general.

To see the way in which this invariance thesis conflicts with the structure-based

version of the Logicist Template, it suffices to recall what is involved in the claim

that one structure is interpretable in another. In § 1.2, I presented the standard

definition of this notion, viz. that one structure is interpretable in another if it is

isomorphic to a structure whose domain, constants, relations, and functions are

definable in the second. For the sake of brevity, let us say that one structure is

definable in a second structure if its domain, constants, relations, and functions are

definable in the second. Hence, in this terminology, the standard definition reads

as follows: one structure is interpretable in another if it is isomorphic to a structure

definable in the second. It is easy to see that the interpretability relation is itself

invariant under isomorphism, in that if one structure is interpretable in a second,

then any isomorphic copy of the first is interpretable in any isomorphic copy of the

second. Hence, knowledge of the interpretability relation per se does not conflict

with the invariance thesis described above, since if I know that a structure is

interpretable in another, then anything isomorphic to the first is interpretable in

anything isomorphic to the second.

However, knowledge that one structure is interpretable in another involves

knowledge that the first structure is isomorphic to a structure which is definable

in the second, and definability is demonstrably not invariant under isomorphism.32

In particular, if one structure is definable in another, then it is simply false that

any isomorphic copy of the first is definable in any isomorphic copy of the second.33

There are many ways to see this, but perhaps the most perspicuous way is

to adopt a set-theoretic perspective, and to note that while definability requires

that the defined structure be enumerated into the cumulative hierarchy immedi-

ately after the defining structure, it is nonetheless easy to construct isomorphic

copies that get enumerated at arbitrarily high stages. Hence, since definability is

not invariant under isomorphism, it follows that knowledge of definability is incon-

sistent with the invariance thesis. For, suppose that one structure is definable in

the second, and consider the property of “being definable in the second structure.”

If one knows that the first structure has this property, then the invariance thesis

requires that all isomorphic copies of the first structure have this property, which

is simply false. Likewise, if one knows that the second structure has the property

of “defining the first structure,” then the invariance thesis requires that all iso-

morphic copies of the second structure have this property, which is likewise false.

So this is why the invariance thesis is inconsistent with knowledge of definability

and hence with the structure-based version of the Logicist Template.
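
A small example may make the failure of invariance concrete. It is an illustration assumed here rather than one drawn from the source: (ℕ, +) plays the role of the second structure, E is the structure of even numbers with the inherited addition, and E′ is a hypothetical copy of E built from new objects. Definability is taken in the sense fixed above, so that the domain of a definable structure consists of elements (or tuples of elements) of the second structure. The fragment assumes amsmath and amssymb.

```latex
% Illustration only; E, E', +' and f are names assumed for this example.
\begin{align*}
E  &= (\{\, n \in \mathbb{N} : \exists m \; n = m + m \,\},\; +)
   && \text{definable in } (\mathbb{N}, +), \\
E' &= (\{\, (n, \mathbb{N}) : n \in E \,\},\; +')
   && \text{where } (n, \mathbb{N}) +' (k, \mathbb{N}) := (n + k, \mathbb{N}) .
\end{align*}
% The map f(n) = (n, \mathbb{N}) is an isomorphism from E onto E'.
% But no element of E' is a natural number or a tuple of natural numbers,
% so the domain of E' is not a definable subset of any power of \mathbb{N}:
\[
E \cong E', \qquad E \text{ is definable in } (\mathbb{N}, +), \qquad
E' \text{ is not definable in } (\mathbb{N}, +).
\]
```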

From a certain historical perspective, this is perhaps what one would expect.

For, thinking of Dedekind’s letter to Weber and Benacerraf’s and Cassirer’s criti-

cisms of Frege, it is not hard to convince oneself that structuralism was historically

born of a rejection of logicism.34 Hence, to the extent that the invariance thesis

could be counted as a structuralist thesis, one might have expected it to be

inconsistent with the structure-based version of the Logicist Template. However,

despite the fact that the invariance thesis is one way of rendering precise the in-

tuition that structures can only be specified up to isomorphism, it seems clear

that not all contemporary authors who identify themselves as structuralists are

ultimately committed to this thesis. For instance, while Resnik seems to endorse a

highly qualified version of the invariance thesis, other contemporary structuralists

such as Shapiro and Parsons do not seem committed to any form of this thesis.35

Indeed, if there is a thesis about structures that unites contemporary structuralists

such as Resnik, Shapiro, and Parsons, it is not any precisification of the intuition

that one can only specify structures up to isomorphism, but rather the thesis that

judgments about the identity and non-identity of mathematical objects are only

legitimate relative to some antecedently specified background structure.36 Unlike

the invariance thesis, this thesis about relativity is, as far as I can see, entirely

consistent with the structure-based version of the Logicist Template.

However, a robust defense of the structure-based version of the Logicist Tem-

plate requires that some positive reason be given for rejecting the invariance thesis,

and I think that such a reason is in fact available. In particular, while the picture

of our knowledge of structures which the invariance thesis recommends may not

be far from the truth with respect to intrinsic properties of structures, this pic-

ture is entirely inaccurate when it comes to the relational properties of structures.

While I do not claim to be able to provide an analysis of the notions of intrinsic

and relational, I can point to several examples which collectively cover most of

the properties of structures which have hitherto been studied. The paradigmatic

examples of intrinsic non-relational properties of structures include (i) whether a

given sentence is true or false of the structure, (ii) whether the structure has a

non-trivial automorphism, (iii) whether the structure has a non-trivial substruc-

ture. Paradigmatic examples of relational non-intrinsic properties of structures

include the following: (iv) one structure whose domain is a subset of natural

numbers being Turing computable in another structure whose domain is a subset

of natural numbers, (v) one structure being contained in the set-theoretic constructible

universe relative to another structure, (vi) one structure being a Borel

subset of another structure equipped with an antecedently specified topology.37

While these three relational properties of structures constitute components of

the subject-matter of various sub-disciplines of mathematical logic (namely, computable

model theory, inner model theory, and descriptive set theory), they are

nonetheless not invariant under isomorphism.38 For instance, isomorphic copies of

computable structures need not be computable, and likewise for constructible and

Borel structures.39 Hence, if one thinks that these sub-disciplines of mathematical

logic provide us with knowledge of relational properties of structures, then one has

to reject the invariance thesis, since its mandate for invariance under isomorphism

is not satisfied by the types of knowledge generated by these disciplines.

I want now to turn to the signature problem. Like the isomorphism problem,

this problem exposes a tension between the structure-based version of the Logicist

Template and the perspicuity of the concept of the knowledge of structure. Unlike

the isomorphism problem, I presently see no way in which to dispel this tension.

The signature problem begins with the mundane observation that knowledge of

the dual interpretability result mentioned in the structure-based version of the

Logicist Template requires knowledge that the natural numbers are a structure

in the signature of the Peano axioms. This is simply due to the fact that for a

theory to be about a structure, it is necessary that the theory and the structure

share the same signature. Indeed, it was this shared signature which permitted

us to defuse the consistency problem by recourse to the Elementary Result about

Dual Interpretability: for, unless the theory and the structure share a common

signature, it does not make sense to say that the theory is true of the structure.

So, by tracing out the definition of dual interpretability, one sees that knowledge

of the dual interpretability result from the structure-based version of the Logicist

Template requires knowledge that the natural numbers are a structure in the

signature of the Peano axioms.

However, it seems that knowledge that the natural numbers are a structure in

this signature as opposed to another signature requires knowledge of the Peano

axioms. For instance, the signature of the Peano axioms is traditionally taken

to contain function symbols for addition and multiplication, and is to be distin-

guished from the signature of first-order Presburger arithmetic, whose only symbol

is the addition symbol.40 It seems indisputable to me that we possess knowledge

that the natural numbers are a structure in the signature of the Peano axioms

and not a structure in the signature of first-order Presburger arithmetic. For, one

can point here to the fact that while the Presburger signature has resources to

express the primality of individual natural numbers, such as five and seven,41 it

does not have the resources to express the concept of primality in general. In

particular, any infinite set of natural numbers definable in the Presburger sig-

nature contains non-prime numbers,42 and since I know that there are infinitely

many prime numbers, I know that the concept of primality is not definable in the

Presburger signature. However, and this is the key point, it seems that when I

examine my reasons for thinking that the signature of the natural numbers is not

the Presburger signature, I advert to my knowledge of number-theoretic truths

such as the infinitude of primes, which are traditionally proven by recourse to the

Peano axioms.43 Hence, this is my reason for thinking that knowledge that the

natural numbers are a structure in this signature as opposed to another signature

requires knowledge of the Peano axioms.44
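
The claim about infinite definable sets can be unpacked with a short derivation. It leans on a standard fact that is assumed here rather than proved: every set of natural numbers definable in the Presburger signature is eventually periodic. The symbols X, N, p and a are names introduced for this sketch; the fragment assumes amsmath and amssymb.

```latex
% Sketch only; X, N, p, a are names assumed for illustration.
% Assumed fact: a Presburger-definable X \subseteq \mathbb{N} is eventually
% periodic, i.e. there are N and p \geq 1 with
\[
\forall n \geq N \; \bigl( n \in X \leftrightarrow n + p \in X \bigr).
\]
% If X is infinite, pick a \in X with a \geq \max(N, 2). Periodicity gives
\[
a + kp \in X \quad \text{for every } k \geq 0 .
\]
% Taking k = a yields a composite element of X:
\[
a + ap \;=\; a\,(1 + p), \qquad a \geq 2, \quad 1 + p \geq 2 .
\]
% So no infinite Presburger-definable set consists of primes only.
```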

So I have argued that the knowledge that the natural numbers are a structure

in the signature of the Peano axioms as opposed to another signature requires

knowledge of the Peano axioms. However, it seems to me that it is also not un-

reasonable to suppose that if one has knowledge that the natural numbers are a

structure in the Peano signature, then one also has knowledge that the natural

numbers are not a structure in various other signatures, such as the Presburger

signature. Indeed, a capacity to rule out various relevant alternatives seems to

be a hallmark of both the knowledge evinced in mathematical practice and the

knowledge of foundational mathematical principles to which we aspire. For in-

stance, it is common in mathematical practice to regard an argument for the

claim that “All A’s are B’s or C’s” as deficient unless one can rule out alternative

claims like “All A’s are B’s” and “All A’s are C’s.” Hence, what this example

suggests is that mathematical knowledge requires knowledge that various relevant

alternatives do not obtain, and it is for this reason that I think it reasonable to

suppose that if one has knowledge that the natural numbers are a structure in the

Peano signature, then one also has knowledge that the natural numbers are not a

structure in various other signatures, such as the Presburger signature.

With all this in place, I am now in a position to state the thesis around which

the signature problem is centered. This thesis is called the priority thesis,

and it says the following: knowledge of the dual interpretability result from the

structure-based version of the Logicist Template requires knowledge of the Peano

axioms. It is not difficult to see that the priority thesis follows directly from

three claims for which I have presently been arguing. For, I first noted that

knowledge of the dual interpretability result plainly requires knowledge that the

natural numbers are a structure in the signature of the Peano axioms. Then I

argued that knowing that they are a structure in this signature requires knowing

that they are a structure in this signature as opposed to some other signature,

such as the Presburger signature. Finally, I argued that knowing that the natural

numbers are a structure in the Peano signature, as opposed to other signatures

such as the Presburger signature, requires knowledge of the Peano axioms. From

these three claims, one can straightforwardly deduce the priority thesis via two

applications of modus ponens.
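
The shape of this deduction can be displayed schematically. The letters below are abbreviations assumed here for readability and are not the source's notation: D for knowledge of the dual interpretability result, S for knowledge that the natural numbers are a structure in the Peano signature, S′ for knowledge that they are a structure in this signature as opposed to others, and P for knowledge of the Peano axioms. The fragment assumes amsmath.

```latex
% Schematic only; D, S, S', P are abbreviations assumed for illustration.
\begin{align*}
&\text{(1)}\quad D \rightarrow S   && \text{first claim} \\
&\text{(2)}\quad S \rightarrow S'  && \text{second claim} \\
&\text{(3)}\quad S' \rightarrow P  && \text{third claim} \\
&\text{(4)}\quad D \rightarrow P   && \text{priority thesis, chaining (1), (2), (3)}
\end{align*}
```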

Having argued for the priority thesis, I can now state the signature problem.

The signature problem is that the priority thesis is inconsistent with one natu-

ral conception of the epistemic role of the structure-based version of the Logicist

Template. On this conception, the structure-based version of the Logicist Tem-

plate is supposed to provide a sufficient condition for knowledge of arithmetical

principles such that this sufficient condition could obtain without prior knowledge

of the arithmetical principles. This, I take it, is part of what is traditionally taken

to be exciting and important about logicism: logicism claims to isolate a type of

logical knowledge which could in principle be used to first arrive at knowledge of

arithmetical principles. However, it is easy to see that the priority thesis implies

that the structure-based version of the Logicist Template cannot fulfill this role.

For, this structure-based version says that knowledge of Hume’s Principle together

with knowledge of a dual interpretability result is such a sufficient condition for

knowledge of the Peano axioms. But the priority thesis plainly says that this sufficient

condition cannot obtain without prior knowledge of the arithmetical principles in

question, namely, the Peano axioms.

One might object that this conception of the epistemic role of the structure-

based version of the Logicist Template asks too much. In particular, one might

suggest that all that should be required is that the structure-based version of

the Logicist Template provide a sufficient condition on knowledge of arithmetical

principles, regardless of whether these sufficient conditions could obtain without

prior knowledge of the arithmetical principles. However, I take it that some added

condition such as this is necessary if one wants to separate the kind of sufficient

condition provided by logicism from various other sufficient conditions which are

not at all of interest. For instance, one sufficient condition on knowledge of arith-

metical principles is knowledge of arithmetical and geometrical principles. I take

it that it is obvious that the sufficient condition for knowledge of arithmetical

principles which logicism takes itself to provide is patently different in kind from

this sort of sufficient condition, and it seems that the natural way to distinguish

these two kinds of sufficient conditions is in terms of a requirement that the suf-

ficient conditions on knowledge of arithmetical principles be such that they could

obtain without prior knowledge of the arithmetical principles.

Hence, the signature problem is simply that the structure-based version of the

Logicist Template cannot provide such a sufficient condition, due to the priority

thesis. Further, several of the most straightforward ways in which to

respond to the signature problem do not seem like practicable options. First, one

might suggest that the epistemic role which logicism ought to play is different

in kind from providing sufficient conditions such as these. Second, one might

suggest that knowledge of the signature of the natural numbers does not require

knowing that the signature is not that of various other rival signatures. Third,

one could argue that this latter knowledge does not in turn require knowledge of

the very arithmetical principles which logicism sought to deliver in the first place,

namely the Peano axioms. I do not view any of these straightforward responses as

viable options, and hence I regard the signature problem as a deep and hitherto

unanswered challenge to the structure-based version of the Logicist Template.

Prior to closing, it is helpful to explicitly point out why the signature problem

is not a problem for the theory-based versions of the Logicist Template. With

respect to each version, we can ask after that on which this version claims to

base our knowledge of arithmetical principles. Part of what is striking and in-

triguing about the theory-based versions is that these versions purport to base

our knowledge of arithmetical principles on knowledge that is not at all ostensibly

arithmetical in character, namely Hume’s Principle and Frege’s Theorem, which

are, respectively, a principle about the equality of cardinalities and a technical

theorem about the interpretability of two theories. However, the structure-based

version does base our knowledge of arithmetical principles on something whose

arithmetical character is readily apparent, namely, the knowledge that the natu-

ral numbers are a structure in the signature of the Peano axioms. The signature

problem then arises when we inquire as to what we base this latter knowledge

on. I have suggested that we base our knowledge that the natural numbers are a

structure in the signature of the Peano axioms on knowledge of the Peano axioms

themselves.

Finally, it is worth stating for the record one natural response to the signature

problem which should be of no consolation to the structure-based version of the

Logicist Template. One intuition which might emerge in the course of reflecting on

the signature problem is that it is simply misleading to speak about the signature

of the natural numbers, and that whatever the natural numbers are, they aren’t

something that comes equipped with a signature. Since structures by definition

come equipped with signatures, it is obvious that this thought cannot help the

structure-based version of the Logicist Template. However, it is by no means

obvious that there is not a hitherto unarticulated version of the Logicist Template

which could somehow provide a signature-free account of our knowledge of the

natural numbers.45

1.5 Conclusions and Directions for Further Research

I want to close by contrasting the nature of the challenges which I have pre-

sented for the theory-based and structure-based versions of the Logicist Template.

While taking recourse to different notions of representation, both these versions

suggest that knowledge of arithmetical principles may be based on knowledge

of logical principles and the knowledge that the arithmetical principles are repre-

sented within the logical principles. The most general version of the Logicist Tem-

plate can then be presented in terms of the following valid argument:

Base Premise: The logical principles are known.

Representability Premise: It is known that the arithmetical principles are representable in the logical principles.

Preservation Premise: For all principles P and P∗, if principles P∗ are known, and it is known that P is representable in P∗, then principles P are known.

Conclusion: The arithmetical principles are known.
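
The validity of this argument can be checked with a short derivation. The sketch below assumes a knowledge operator K, a relation Rep for representability, and the labels L and A for the logical and arithmetical principles; all four symbols are supplied here for illustration.

```latex
% Requires amsmath. K, Rep, L, A are illustrative symbols.
% K(X)       : the principles X are known
% Rep(X, Y)  : X is representable in Y
% L, A       : the logical principles, the arithmetical principles
\begin{align*}
  &(1)\quad K(L) && \text{Base Premise}\\
  &(2)\quad K(\mathrm{Rep}(A, L)) && \text{Representability Premise}\\
  &(3)\quad \forall P\,\forall P^{*}\,\bigl[(K(P^{*}) \wedge K(\mathrm{Rep}(P, P^{*})))
        \rightarrow K(P)\bigr] && \text{Preservation Premise}\\
  &(4)\quad (K(L) \wedge K(\mathrm{Rep}(A, L))) \rightarrow K(A)
        && \text{from (3), taking } P := A,\ P^{*} := L\\
  &(5)\quad K(A) && \text{from (1), (2), (4) by modus ponens}
\end{align*}
```

Since the derivation uses only instantiation and modus ponens, any challenge to the conclusion has to target one of the three premises.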

Expressed in these terms, it seems fair to say that most of the recent literature

on logicism has focused on the Base Premise, i.e., the epistemic status of Hume’s

Principle. It should be evident, but nonetheless bears emphasizing, that nothing

which I have said in this chapter touches on these recent arguments for and against

the logicality (or apriority, or analyticity) of Hume’s Principle.

Rather, expressed in terms of this premise-conclusion argument, my focus in

this chapter has been on the Representability Premise and the Preservation Premise.

The challenges to the theory-based version of the Logicist Template discussed in

§ 1.3 were challenges to the Preservation Premise, since the plethora problem

and the consistency problem suggested that in general it is not true that knowl-

edge can be passed along theory-based interpretations in this way. However, the

Representability Premise is patently true on the theory-based version, since as

mentioned in § 1.3, Frege’s Theorem simply says that the Peano axioms are inter-

pretable in Hume’s Principle. But, by the same token, since this is all that Frege’s

Theorem tells us, it is by no means obvious that the Representability Premise is

true in the structure-based case. In particular, I discussed two challenges to this

in § 1.4, namely, the isomorphism problem and the signature problem. While I

described a way in which to dissolve the isomorphism problem, I have suggested

that the signature problem admits of no such dissolution, and that it plainly

shows that to acquire the knowledge required by the Representability Premise, we

must advert to knowledge of the Peano axioms, which was the very knowledge which

the above argument sought to secure. However, it is worth noting that while the

structure-based Representability Premise faces this deep problem, the structure-

based Preservation Premise seems demonstrably true. Indeed, as mentioned in

§ 1.4, truth is demonstrably preserved downward under the structure-based no-

tion of representation. These conclusions are summarized in Figure 1.1, where I

indicate that a given problem is a problem for such-and-such a premise of such-and-

such a version of the Logicist Template by writing that problem in the appropriate

entry of the table.

Hence, expressed in terms of the above argument, the primary conclusion of

this chapter is that each version of the Logicist Template considered here contains

at least one problematic premise. For one who accepts this conclusion but who is

still sympathetic to this brand of logicism, I think that the moral of this chapter is

that more work needs to be done on possible notions of representation which could

support a viable version of the Logicist Template. In particular, it is not clear

that there isn’t some middle ground to be found between the theory-based and

structure-based versions of the Logicist Template. So the challenge here would

be to find some notion of representation which went far enough beyond theories

to overcome the plethora and consistency problems, but which did not do so by

pinning everything down to a particular structure in a particular signature. For

all that I have said in this chapter, it is not at all clear to me that there is not

some notion of representation like this out there capable of supporting a viable

version of the Logicist Template.

                            Theory-Based Version     Structure-Based Version

Representability Premise    (none)                   Isomorphism Problem
                                                     Signature Problem

Preservation Premise        Plethora Problem         (none)
                            Consistency Problem

Figure 1.1. Summary of Problems for Versions of the Logicist Template

1.6 Notes

1 The example of the forks and knives is of course Russell’s perennially apt illustration of Hume’s Principle. Formally, Hume’s Principle is a sentence in an expansion of second-order logic by a function symbol # from properties to objects (cf. Burgess [15] Chapter 1). That is, the idea is that if F is a property, then #F is an object. The notion of a “one-one correspondence” can be formally captured with the idea of a bijection. A map f : F → G is a bijection if it is injective and surjective. The map f : F → G is injective if f(x) = f(x′) implies x = x′, while the map f : F → G is surjective if for every y ∈ G there is x ∈ F such that f(x) = y. Hence, formally, Hume’s Principle is the following sentence:

∀ F, G [#F = #G ↔ ∃ bijection f : F → G] (1.7)

So, as the right-hand side of Hume’s Principle makes clear, the ambient logic of Hume’s Principle is second-order logic. There are of course several alternative semantics for second-order logic, as are described in Shapiro [135] Chapter 4 or Enderton [32] Chapter 4. In particular, these semantics differ from one another in terms of whether the second-order quantifiers range over the entire powerset of the domain. In this chapter, nothing which I shall say will hinge on these differences. For the purposes of the technical results mentioned in this chapter, the only essential feature of second-order logic which I shall use is the full comprehension schema, which says that to each formula there corresponds a property such that the property is predicated of all and only those objects of which the formula holds. The full comprehension schema is needed here because one of the theorems which I discuss here, namely Frege’s Theorem, does not hold if a more restricted version of the comprehension schema is used (cf. Theorem 99 of Chapter 3). Hence, when discussing Hume’s Principle in this chapter, it shall be assumed that the full comprehension schema is included along with it. This of course is not to say that there are no deep and important philosophical issues surrounding the semantics for second-order logic and the status of the full comprehension schema. For the former, see Shapiro [135] Chapter 5 and for the latter, see Dummett [29] Chapter 18 or Feferman [37] pp. 254-258, 289-291. The point of being neutral about the semantics of second-order logic is simply that the philosophical theses which I discuss in this chapter do not seem sensitive to the differences between these alternative semantics. The point of assuming the full comprehension schema here is that I am interested in understanding the philosophical consequences of results like Frege’s Theorem, and there are simply fewer of these if the comprehension schema is restricted.
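
For readers who prefer symbols, the two principles described in this endnote can be typeset roughly as follows. This is a sketch in one common notation: the formula variable φ and the proviso about free variables are standard, but they are supplied here rather than taken from the text.

```latex
% Requires amsmath and amssymb.
% Hume's Principle, as in (1.7):
\[
  \forall F\,\forall G\,\bigl[\#F = \#G \leftrightarrow
    \exists f\,(f\colon F \to G \text{ is a bijection})\bigr]
\]
% Full comprehension schema: one axiom for each formula \varphi(x)
% (possibly with parameters) in which the variable F does not occur free.
\[
  \exists F\,\forall x\,\bigl[F(x) \leftrightarrow \varphi(x)\bigr]
\]
```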

2 For a summary of this discussion, see, for example, § 2 of Wright’s essay “Is Hume’s Principle Analytic?” ([161] pp. 7-15, [60] pp. 308-320) or § 8 of MacBride’s survey ([102] pp. 142-150).

3 Of course, one might consider a qualified version of the Logicist Template which only specified how to proceed from knowledge of Hume’s Principle to knowledge of arithmetic in the case where the knowledge of Hume’s Principle was appropriately apriori or otherwise logical in character. It will be clear upon reading, but is worth mentioning, that the objections which I suggest to versions of the Logicist Template in §§ 1.3-1.4 apply a fortiori to versions which are qualified in this manner.

4 For our purposes, we can take the Peano axioms to be given by the following axioms, called the axioms of Robinson’s Q:

(Q1) s(x) ≠ 0

(Q2) s(x) = s(y) → x = y

(Q3) x ≠ 0 → ∃ w x = s(w)

(Q4) x + 0 = x

(Q5) x + s(y) = s(x + y)

(Q6) x · 0 = 0

(Q7) x · s(y) = x · y + x

(Q8) x ≤ y ↔ ∃ z x + z = y,

and by the Mathematical Induction Principle:

∀ F ([F(0) & ∀ n (F(n) → F(s(n)))] → ∀ n F(n)) (1.8)

The Mathematical Induction Principle is obviously a second-order principle, since it begins with a universal quantifier over properties. As discussed in endnote 1, I will be assuming the full comprehension schema and I will not be making any assumptions about the semantics for second-order logic. Hence, what I am describing in this paper as “the Peano axioms” is second-order Peano arithmetic, as described and studied in e.g. Simpson [138]. This is to be distinguished from first-order Peano arithmetic, in which the Mathematical Induction Principle is replaced by an infinite schema of formulas, and which is studied in e.g. Hajek and Pudlak [59]. The focus on second-order Peano arithmetic as opposed to first-order Peano arithmetic here is purely for the sake of simplicity: all of the points which I shall be making here about second-order Peano arithmetic could also be made with respect to first-order Peano arithmetic. This however is not to say that there are not important philosophical considerations surrounding the choice between first- and second-order Peano arithmetic. For a discussion of some of these issues, see Shapiro [135] § 5.3.1.
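
The contrast between the second-order principle (1.8) and the first-order schema can be sketched as follows. The notation φ(n, ȳ) for a first-order formula with parameters ȳ is an assumption of this sketch, not notation used in the text.

```latex
% Requires amsmath and amssymb.
% Second-order induction: a single sentence quantifying over properties F.
\[
  \forall F\,\Bigl[\bigl(F(0) \wedge \forall n\,(F(n) \rightarrow F(s(n)))\bigr)
    \rightarrow \forall n\,F(n)\Bigr]
\]
% First-order induction: one axiom for each first-order formula
% \varphi(n, \bar{y}) in the signature of arithmetic.
\[
  \forall \bar{y}\,\Bigl[\bigl(\varphi(0,\bar{y}) \wedge
    \forall n\,(\varphi(n,\bar{y}) \rightarrow \varphi(s(n),\bar{y}))\bigr)
    \rightarrow \forall n\,\varphi(n,\bar{y})\Bigr]
\]
```

The first display is one sentence; the second stands for infinitely many sentences, one per formula φ.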

5 There are a few other contemporary accounts of our knowledge of mathematical induction (and the other Peano axioms), although much less has been written about these views than has been written on logicism. For a brief description of these views, see the second paragraph of Chapter 2, and especially endnotes 48-49. Of course, the entirety of Chapter 2 is devoted to an examination and defense of an older pre-Fregean view that our knowledge of mathematical induction is akin to our knowledge of enumerative induction (cf. especially endnote 51).

6 For instance, Demopoulos and Clark stress this aspect, saying:

Frege’s earliest contribution to the articulation of logicism consisted in showing that the validity of reasoning by induction can be accounted for on the basis of our general knowledge of principles of reasoning discoverable in every domain of inquiry. This directly engages the Kantian tradition in the philosophy of the exact sciences, according to which principles of general reasoning peculiar to our understanding must be supplemented by a faculty of intuition if we are to account for arithmetical knowledge. We are inclined today to view the answer to Kant as requiring the demonstration that mathematical reasoning – in this case, reasoning about the natural numbers – is recoverable as part of logical reasoning ([25] p. 138).

7 For instance, in the introduction to the Grundlagen, Frege says: “One will be able to see from this essay that even inferences which are apparently particular to mathematics, like the inference from n to n+1, are based on general logical laws, so that they do not require particular laws of aggregative thought” ([44] p. iv). There are many other statements to this effect in Frege, both in other places in the Grundlagen ([44] § 80 p. 93, § 108 p. 118) and in the essay “Formal Theories of Arithmetic” ([45] p. 104). In his 1983 book, Wright says:

Anyone who accepts the Peano axioms as truths ‘not of our making’ must recognise the question of what account should be given of our ability to apprehend their truth. If Frege’s attempt to ground that apprehension in pure logic were to succeed, we should have an answer – or at least a reduction of the problem to the more general one of the accounting for our apprehension of logical truths ([158] p. 131, cf. p. xiv).

8 For more precise definitions of structure and theory, see for instance Marker [107] Chapter 1 or Enderton [32] Chapters 1-2. It should also be noted that there is much which is peculiar to the examples given here. For instance, the Zermelo-Fraenkel axioms form a recursive set of axioms, whereas in general a theory need not be recursive. Likewise, both the real and complex numbers are typically taken as structures in the field signature, that is, as structures with distinguished addition and multiplication functions. However, structures can in general have any signature whatsoever, and for instance need not contain any function symbols.

9 More formally, suppose that M and N are two structures in the same signature. Then M and N are said to be isomorphic if there is a one-one map f from M onto N such that M models ϕ(a_1, . . . , a_n) if and only if N models ϕ(f(a_1), . . . , f(a_n)) for every formula ϕ(x_1, . . . , x_n) in their shared signature and every tuple of elements a_1, . . . , a_n from M, i.e.:

M ⊨ ϕ(a_1, . . . , a_n) ⟺ N ⊨ ϕ(f(a_1), . . . , f(a_n)) (1.9)

10 For a proof of the introduction of coordinates in a very general setting, see Artin [3] Chapter 2, and for a more restrictive setting, see Hartshorne [62] Theorem 21.1 p. 137. It should be emphasized that these authors do not state these results in terms of interpretability. In particular, they do not explicitly state that their way of “recovering” the field in the geometry or vice-versa is definable. However, reading through the proofs, one can easily check that everything that they are doing is definable. This is a fact which is sometimes cited in discussions about interpretability in mathematical logic texts. See, for example, Hodges [70] pp. 222-223 Example 1.

11 Here, as with the interpretability of structures, in order to make this definition precise, we would need to present in more detail our notion of proof and translation. This definition is lengthy, and typically takes up one to two full pages in a mathematical logic textbook. See, for example, the presentations in Lindstrom [99] pp. 96-97 and Hajek and Pudlak [59] pp. 148-149. Hence, for the sake of brevity, we have chosen to exposit this notion here with the example of the Peano axioms and the Zermelo-Fraenkel axioms.

12 It is a classical result of model theory that the theory of a dense linear order without endpoints is complete, so that if we identify theories which are provably equivalent, then the theory of a dense linear order without endpoints is the same as the complete theory of the rational numbers as a linear order. See, for example, Marker [107] Theorem 2.4.1 p. 48.

13 For more examples of faithful interpretability, see Lindstrom [99] Chapter 6, § 2, pp. 106 ff.

14 For stronger notions of interpretability, see Visser [148] § 3.3 and Hodges [70] § 5.3.(c) pp. 222-225.

15 For a proof of Frege’s Theorem, see Wright [158] pp. 154-169 or Boolos [12], or Theorem 27 of Chapter 3. For a discussion of where Frege’s Theorem appears in the texts of Frege, see Boolos and Heck [14].

16 Heck articulates this objection in a few paragraphs of papers which have other goals, for instance, a study of the epistemology of counting and an exposition of the elements of Frege’s Theorem ([67], [66]). On Heck’s version of the objection, the concern is with using a theory-based interpretability result (such as Frege’s Theorem) to secure an inference from the analyticity of Hume’s Principle to the analyticity of the Peano axioms. For, as Heck notes, axiomatizations of geometry are likewise interpretable in Hume’s Principle, and thus one can ask the following rhetorical questions about the analyticity of geometry:

Does it then follow that, on Frege’s view, Euclidean geometry must be analytic? That would be unfortunate, for Frege explicitly agreed with Kant that the laws of Euclidean geometry are synthetic apriori ([66] p. 59).

Frege held that analysis, as well as arithmetic, was analytic. He did not, however, regard all of mathematics as analytic, since he agreed with Kant that Euclidean geometry is synthetic a priori. But the truths of Euclidean geometry can be proven in analysis (given suitable definitions). Were Frege’s views inconsistent then? ([67] p. 188).

One can also find versions of this objection in Hoering and Hofweber, each of whom is concerned to emphasize the distance between intertheoretic reduction and theory-based interpretability. For instance, Hoering notes that on theory-based notions of reduction, arithmetic would “reduce all recursively enumerable theories” ([71] p. 33, cf. p. 36). Likewise, Hofweber argues:

[. . . ] to see that relative interpretation alone does not provide a theory reduction (in the present sense of the word), consider the following: take a first order formulation of some physical theory. By the result mentioned above it is relatively interpretable in some arithmetic theory. But, of course, the former theory can not be reduced, in the intuitive sense of the word, to the latter. The former is about physical objects, the latter about numbers. The former is not just a special case of the latter. So, reduction requires more than just an association of the relevant formal languages in the right way ([72] p. 132).

17 The easiest way to see this is to note three things. First, by the work of Tarski, the complete theories of the real and complex numbers are recursively axiomatizable (cf. Marker [107] Corollary 3.2.3 p. 85 and Corollary 3.3.16 p. 97). Second, by formalizing Henkin’s proof of the completeness theorem, one can show that if the Peano axioms prove the consistency of a recursively axiomatizable theory, then they interpret that theory (cf. Lindstrom [99] Theorem 4 p. 99 and Hajek and Pudlak [59] Theorem 2.39 p. 169). Third, it is easy to show by a model construction within Peano arithmetic that the Peano axioms prove the consistency of the recursively axiomatizable fragments of the complete theories of the real and complex numbers (cf. Simpson [138] Theorem II.9.4 p. 97 and Theorem II.9.7 p. 98).

18 For the standard proof using the intermediate value theorem, see Lang [95] pp. 272-273. For a complex-analytic proof, see Greene and Krantz [58] Theorem 3.4.5 p. 87. For the proof using algebraic topology, see Hatcher [63] Theorem 1.8 p. 31.

19 This is a particular instance of a more general response to the types of counterexamples to the theory-based versions of the Logicist Template which I shall consider in this section. All of these counterexamples point out that in general, it is false that knowledge of the interpreting theory and knowledge of the interpretation yield knowledge of the interpreted theory. Thus, it is always open to the advocate of the theory-based version of the Logicist Template to respond to such a counterexample by saying that they are not appealing to such a general claim, but rather to some more circumspect claim centered around a more nuanced notion of interpretation germane to theories. Of course, it is ultimately incumbent on such an advocate to also show that the Peano axioms (or some other arithmetical theory) are interpretable in Hume’s Principle (or some other logical theory) in this more nuanced sense.

20 This follows from the fact that no complete consistent extension of the Peano axioms is computable (cf. Tarski et al. [145] Theorem 9 p. 60). For, if the Peano axioms were interpretable in the complete theory of the complex numbers, then a model of the Peano axioms would be interpretable in the complex numbers. But the complete theory of any model in a finite signature which is interpretable in the complex numbers is computable since the complete theory of the complex numbers is computable, and likewise for the real numbers (cf. Marker [107] Corollary 3.2.3 p. 85 and Corollary 3.3.16 p. 97).

21 Boolos proves the consistency of Hume’s Principle relative to the Peano axioms by showing that Hume’s Principle is interpretable in the Peano axioms ([10], [13] p. 190).

22 However, it should be emphasized that the plethora problem will still be a problem if one allows examples of more contrived theories. For instance, given any initial theory, one can always form a composite theory by gluing the initial theory to the Peano axioms. This composite theory would say: there are only two types of things, numbers and non-numbers, and while the numbers obey the Peano axioms, the non-numbers obey the axioms of the initial theory. Moreover, if the initial theory is interpretable in the Peano axioms, then the composite theory will be mutually interpretable with the Peano axioms and hence with Hume’s Principle (by Frege’s Theorem and the aforementioned result of Boolos). However, it seems to me that it is uncharitable to use such contrived examples against the logicist. For, presumably the idea behind the Logicist Template is that the theories or principles in question are intended to be theories or principles which we have some stake in and whose truth we are interested in assaying, and it does not seem that these composite theories are theories in this sense.

23 For, Dedekind showed that by beginning with a model which satisfies all the Peano axioms besides the Mathematical Induction Principle, one can obtain a model of all the Peano axioms by restricting the domain to all those numbers which satisfy the Mathematical Induction Principle. In the terminology of Was sind und was sollen die Zahlen?, Dedekind begins with an infinite system and obtains a simply infinite system by the judicious use of chains („72. Satz. In jedem unendlichen System S ist ein einfach unendliches System N als Teil enthalten“, that is, “Theorem 72. Every infinite system S contains a simply infinite system N as a part” ([23] vol. 3 pp. 359-360)). For the other direction of the result, a model of the anti-Peano axioms needs to be uniformly defined within each model of the Peano axioms. This can be done as follows: working within the Peano axioms, build a model whose domain is an isomorphic copy of the natural numbers plus some other element, call it an infinite number. Then define zero and successor on the natural numbers as normal and define the successor of the infinite element to be itself (and adjust the definition of addition and multiplication by this infinite element accordingly). Then this model satisfies the axioms of Robinson’s Q (as described in endnote 4). Further, since one can easily prove by the Mathematical Induction Principle that no number is its own successor, this model fails to model the Mathematical Induction Principle.
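
One way to fill in the parenthetical remark about adjusting addition and multiplication is sketched below. The symbol ∞ for the extra element and the particular clauses for + and · are one choice among several, supplied here for illustration only.

```latex
% Requires amsmath and amssymb.
% Domain: \mathbb{N} \cup \{\infty\}; on \mathbb{N} all operations are standard.
\begin{align*}
  s(\infty) &= \infty \\
  x + \infty &= \infty + x = \infty && \text{for all } x \\
  \infty \cdot 0 &= 0 \\
  \infty \cdot y &= \infty && \text{for all } y \neq 0 \\
  x \cdot \infty &= \infty && \text{for all } x
\end{align*}
% Sample checks against the axioms of endnote 4:
%  (Q2) s(\infty) = \infty is not s(n) for any standard n, so s stays injective.
%  (Q3) \infty \neq 0 and \infty = s(\infty), so \infty has a predecessor.
%  (Q5) x + s(\infty) = x + \infty = \infty = s(\infty) = s(x + \infty).
%  (Q7) x \cdot s(\infty) = x \cdot \infty = \infty = \infty + x = x \cdot \infty + x.
% Induction fails for F(n) := (s(n) \neq n): F(0) holds, and
% F(n) \rightarrow F(s(n)) holds for every n (vacuously at \infty),
% yet F(\infty) is false.
```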

24 It is important to note that the case which I have made against the theory-based version of the Logicist Template centered around mutual interpretability is stronger than the case which I have made against the version of the Logicist Template centered around mutual faithful interpretability. Even though both arguments are focused around the consistency problem, the counterexample in the latter case (i.e. the example of the dense linear order with and without endpoints) is more contrived than the counterexample in the former case (i.e. the example of the Peano axioms and the anti-Peano axioms). It seems to me that the case against the version centered around mutual faithful interpretability would be made stronger to the extent that the examples illustrative of the consistency problem were less contrived and closer to the examples which the logicist might actually be concerned with, such as the Peano axioms and Hume’s Principle. This, of course, is related to the point about contrived theories made in endnote 22. However, the problem here is that very few non-trivial natural examples of mutually faithfully interpretable theories are presently known. For instance, it seems to be presently unknown whether the Peano axioms and the anti-Peano axioms are mutually faithfully interpretable, even though one can say something about the unfaithfulness of particular interpretations. Likewise, it seems to be presently unknown whether Hume’s Principle is mutually faithfully interpretable with the Peano axioms (although again one can say something about the unfaithfulness of particular interpretations). I suspect that Hume’s Principle is not mutually faithfully interpretable with the Peano axioms. For, it is relatively easy to come up with sentences which are independent of Hume’s Principle (e.g. the sentence that every object is the cardinality of some property). However, presently there are very few known examples of sentences which are independent of the Peano axioms, and the faithful interpretability of Hume’s Principle in the Peano axioms would allow one to automatically transfer any independence result about Hume’s Principle into an independence result about the Peano axioms. For, faithful interpretability would require that non-provability facts about Hume’s Principle be mirrored by non-provability facts about the Peano axioms.

25 The particular weak set of arithmetical principles with which Nelson was concerned consisted of the axioms of Robinson’s Q (cf. endnote 4). Nelson’s “compatibility problem” was then the problem of determining whether Q + ϕ + ψ is mutually interpretable with Q whenever Q + ϕ and Q + ψ are mutually interpretable with Q ([117] p. 63). Supposing that Robinson’s Q is consistent, it cannot be mutually interpretable with an inconsistent theory like Q + ϕ + ¬ϕ. Hence, under this supposition, a positive resolution to the compatibility problem would have had the implication that at least one of Q + ϕ and Q + ¬ϕ is not mutually interpretable with Q. However, Kalsbeek later found examples of sentences ϕ such that Q + ϕ and Q + ¬ϕ are mutually interpretable with Q (cf. Iwan [78] p. 151, Buss [17] p. 194).

26 For a proof of the Guaspari-Lindstrom theorem and bibliographic references, see Lindstrom’s book [99] Theorem 6 pp. 103, 115. For an example of this theorem being cited by a set-theorist, see Steel [38] p. 427.

27 For instance, consider the opening paragraph of Tait’s famous paper on finitism:

The crux to understanding Hilbert’s conception of finitist mathematics is this question: In what sense can we prove general propositions, such as ∀ x, y x + y = y + x about the natural numbers, without assuming the infinitude of numbers or some other infinite totality? For, if there is to be nontrivial finitist mathematics, one must be able to prove such propositions. Indeed, Hilbert was concerned with consistency proofs for formal systems, which are proofs of just this sort ([143] p. 524).

It is of course not lost on set-theorists such as Steel that there is a connection between Π^0_1-sentences and Hilbert’s program. Indeed, where Steel invokes the Guaspari-Lindstrom theorem, he describes it as part of an “instrumentalist dodge” ([38] p. 423).

28 So this response to the plethora problem has focused on the example which I discussed in regard to the plethora problem in § 1.3, namely, the example of the Fundamental Theorem of Algebra. Hence, outside of disputing my discussion of how the structure-based version handles this example, one way in which to insist that the plethora problem is nonetheless a problem for the structure-based version would be to suggest that the structure-based version flounders in its treatment of other examples. I cannot presently see any relevant difference between the example of the Fundamental Theorem of Algebra and other examples, such as the examples of Euclidean geometry and codifications of physical theory discussed by Heck, Hofweber, and Hoering and mentioned in endnote 16. However, I would be remiss to suggest that I have done anything here besides discuss how the structure-based version can respond to a few examples which seem to me to be representative of the general problem. In particular, I do not purport to have given a proof that the plethora problem cannot reemerge for the structure-based version.

29 Here is the proof of the result that if theory T∗ is true of structure M∗ and if theory T about structure M is dually interpretable in theory T∗ about structure M∗, then theory T is true of structure M. By definition, such dual interpretability simply means that T is interpretable in theory T∗ and structure M is interpretable in structure M∗, and the definitions used in both interpretations are the same. Since T is interpretable in T∗, every model of T∗ uniformly interprets a model of T, where “uniformly” means that the same definitions are used each time. Hence, the model M∗ of T∗ uniformly interprets a model N of T. But by the hypothesis of dual interpretability, the way that M∗ defines N is exactly the same way that M∗ defines an isomorphic copy of M. Hence, N is identical to an isomorphic copy of M, and since N is a model of T, we have that M is a model of T.

30 However, it should be clear that if we do not insist on pinning everything down to a particular structure, then the consistency problem will reemerge. For instance, using the material from § 1.3, it is easy to find examples where theory T + ϕ about structure M is dually interpretable in theory T + ¬ϕ about structure M∗, and vice-versa. The reason that these examples are not troublesome in this structure-based setting is that here our knowledge of ϕ is elliptical for our knowledge that ϕ is true on some fixed structure, so that there is no contradiction between my knowing that ϕ is true on M and false on M∗.

31 For an example of an author who seems to endorse a qualified version of this thesis, see endnote 35.

32 Here I am making what I regard to be a natural assumption about our knowledge of existential statements. In particular, I am assuming that in this setting our knowledge that ∃ x Fx is induced by there being some a such that it is known that Fa. In particular, the knowledge in question here is knowledge that “M is interpretable in M∗.” By definition, this is knowledge that “there is a structure N such that M is isomorphic to N and N is definable in M∗.” I am assuming that in this setting we would acquire this knowledge by there being some N such that it is known that “M is isomorphic to N and that N is definable in M∗.”

33 To be sure, there is a weaker type of invariance which is present in the case of definability, in that it is demonstrably true that if one structure is definable in another, then any isomorphic copy of the first is definable in some isomorphic copy of the second (and likewise any isomorphic copy of the second defines some isomorphic copy of the first).

34 This is Dedekind’s letter to Weber from January 24, 1888, where Dedekind expresses some doubt about defining natural numbers in the manner suggested by Frege at the end of § 68 of the Grundlagen, saying:

Something similar holds of the definition of cardinal number as a class; much will be said of this class (for instance, that it is a system of infinitely many elements, namely a system of all those systems with which it is bijective (ähnlich)) which one would most certainly assert most reluctantly of the number itself; indeed, does anyone actually consider, or would it not be better forgotten, that the number four is a system of infinitely many elements? ([23] vol. 3 p. 490).

In Chapter 2 of his 1910 On the Concept of Substance and the Concept of Function, Cassirer presents a Dedekind-inspired critique of Frege, saying, for instance: “[...] for what is here logically deduced does not coincide with the actual sense which we attach to judgements of number in everyday cognition” ([18] p. 62). Cassirer’s critique of Frege later indirectly influenced Benacerraf’s dissertation on logicism (cf. [7] p. 162 fn), and one can see some of this influence in Benacerraf’s essay on Frege, where for instance Benacerraf tells us that what the logicist needs is “[...] an argument that the sentences of arithmetic, in their preanalytic senses, mean the same (or approximately the same) as their homonyms in the logicist system” ([8], [24] p. 46).

35 For instance, Resnik says that “no mathematical theory can do more than determine its objects up to isomorphism” ([128] p. 529). Resnik elaborates on this thought more extensively in his book, putting it in terms of our capacity to describe structures:

But no mathematical description of a pattern – not even one by means of a categorical set of axioms – will differentiate its occurrences within other patterns from each other or from its occurrences in isolation; unless the description also states that the pattern occurs within a certain containing pattern. For mathematics only describes structures up to isomorphism, except when it describes them as embedded in other structures ([129] p. 220).

Even though Resnik does not cast this claim about our capacity to describe structures in epistemic terms, it seems obvious that he is talking about our capacity to accurately describe structures, and so it seems not unreasonable to interpret the first part of this quotation as an endorsement of the invariance thesis. It is, however, important to note the last qualification in the quotation from Resnik. For, in this last qualification it seems that Resnik permits one and only one exception to the invariance thesis, namely, the case of the definability of one structure within another, which is precisely the case at issue in the isomorphism problem. However, Resnik does not go on to explain why the case of definability should be an exception to what I have called the invariance thesis. The explanation which I will go on to offer is that while the invariance thesis may be true with regard to intrinsic properties of structures, it is not true with respect to relational properties of structures.

In contrast to Resnik, it does not seem that structuralists such as Shapiro and Parsons are committed to anything like the invariance thesis. It is of course difficult to provide succinct textual evidence for a negative existential, but something in Shapiro’s writings which is suggestive of this is his remark that the identity relation between structures is a matter of convenience: “Like the identification of places from different structures [...], the identity relation [between structures] we need is more a matter of decision or invention, based on convenience, than a matter of discovery” ([136] p. 92). In the case of Parsons, while there are some places where he indicates a desire to focus on properties which are invariant under isomorphism ([120] p. 75), there is by no means any sort of endorsement of the invariance thesis. Something indicative of this is the different reading which Parsons gives to the admittedly ambiguous phrase of “specify up to isomorphism.” In the invariance thesis, and in the above quotation from Resnik, the idea is that our knowledge of structures is only of those properties which are invariant under isomorphism. However, Parsons instead focuses attention on our capacity to describe a class of structures such that any two structures in this class are isomorphic: “[...] the natural numbers are at least determinate up to isomorphism: If two structures answer equally well to our conception of the sequence of natural numbers, they are isomorphic. I will call this latter thesis the Uniqueness Thesis” ([120] p. 272).

It seems that what Parsons calls the uniqueness thesis is independent of what I have called the invariance thesis. For instance, to anticipate some of what I shall say about the distinction between intrinsic and relational properties of structures, it seems that the uniqueness thesis could succeed and the invariance thesis could fail if one had a theory such that (i) one knew that all of its models were isomorphic and such that (ii) one knew that only some of its models were computable (or constructible, or Borel). Likewise, with regard to some particular theory (say the Peano axioms or Zermelo-Fraenkel axioms for set theory), one might be convinced of the invariance thesis and yet be equally convinced by the Löwenheim-Skolem theorems that not all of the models of this theory are isomorphic to one another. Or, to take a more pedestrian example, one might simply note that there are both finite and infinite models of the theory of groups, and yet still think that any property of a particular group of which one knows conforms to the invariance thesis.

36 For instance, Parsons says: “By this [the structuralist view] I mean the view that reference to mathematical objects is always in the context of some background structure, and that the objects involved have no more by way of a ‘nature’ than is given by the basic relations of the structure” ([120] p. 40). Shapiro puts this point in terms of a dependence thesis: “Each mathematical object is a place in a particular structure. There is thus a certain priority in the status of mathematical objects. The structure is prior to the mathematical objects it contains, just as any organization is prior to the offices that constitute it” ([136] p. 78). Shapiro thinks that this dependence thesis implies that judgments of identity between mathematical objects in different structures are illegitimate, saying: “But it makes no sense to pursue the identity between a place in the natural-number structure and some other object, expecting there to be a fact of the matter” ([136] p. 79). In his original article on structuralism, Resnik concedes the following: “I have discussed several equivalence relations between patterns – congruence, mutual occurrence, equivalence – but have failed to propose any identity conditions for patterns. I will not; and that brings me to what I find the most difficult point of my theory – the restriction of identity to within patterns” ([128] p. 536). In his later book, Resnik is adamant on this point, saying: “[...] restricting identity to positions in the same pattern goes hand in hand with their failure to have any identifying features independently of a pattern” ([129] p. 211).

37 To put it very roughly, one set of natural numbers X is Turing computable from another set of natural numbers Y if there is a fixed program which, given any input n and allowed access to arbitrarily large initial segments of Y, can determine if n is in X. (For more details, see the definition of X ≤_T Y in Soare [139] § III.1). Likewise, to give a very rough sketch, one set X is constructible relative to another set Y if X is in Y or X is a definable subset of Y, or X is a definable subset of a definable subset of Y etc., where this process is iterated along all the ordinals. (For more details, see the definition of L(Y) in Jech [80] p. 193). Finally, a subset X of a topological space Y is said to be Borel if X is one of the open subsets of Y, or X is the complement of one of these subsets, or X is a countable union of one of these two kinds of sets, etc., where this process is iterated along all the countable ordinals. Of course, while one can define Borel-ness relative to any topological space Y, most of the theorems of descriptive set theory hold only if Y is a Polish space, i.e., it is separable in that it has a countable dense set, like the rationals in the reals, and it is completely metrizable in that there is a metric giving the topology such that every Cauchy sequence converges, as in the real numbers. (For more details, see Kechris [84] §§ I.3, II.11).
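The rough description of relative Turing computability above can be illustrated with a small sketch. Everything named here is an assumption introduced for illustration only: the oracle is modelled as a Python function answering membership questions about Y, the set Y is taken to be the multiples of three, and the reduction (n is in X just in case 2n is in Y and 2n + 1 is not) is a made-up example rather than anything from the text. The point is only that one fixed program decides X while consulting finitely many facts about Y on each input.

```python
from typing import Callable, List

# An oracle answers the question "is m in Y?" for a natural number m.
Oracle = Callable[[int], bool]


def in_X(n: int, oracle: Oracle) -> bool:
    """One fixed program deciding X relative to Y.

    Hypothetical reduction: n is in X iff 2n is in Y and 2n + 1 is not.
    Only finitely many oracle questions are asked for each input n.
    """
    return oracle(2 * n) and not oracle(2 * n + 1)


def multiples_of_three(m: int) -> bool:
    """A sample choice of Y (an assumption made for this illustration)."""
    return m % 3 == 0


def logging_oracle(oracle: Oracle, log: List[int]) -> Oracle:
    """Wrap an oracle so that every question put to it is recorded."""
    def wrapped(m: int) -> bool:
        log.append(m)
        return oracle(m)
    return wrapped


if __name__ == "__main__":
    print([n for n in range(10) if in_X(n, multiples_of_three)])
    queries: List[int] = []
    in_X(3, logging_oracle(multiples_of_three, queries))
    print(queries)  # the finitely many questions asked on input 3
```

The same function in_X works unchanged for any other oracle, which is the sense in which the program is fixed while Y varies.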

38 For an introduction to computable model theory, inner model theory, and those parts of descriptive set theory concerned with Polish and Borel structures, see for example Harizanov ([61]), Mitchell ([111]), Gao ([51] Chapter 2), and Montalbán-Nies ([112]).

39 Of course, in some particular cases, it may happen that isomorphisms preserve these properties. For instance, in the case of constructibility, a simple case of Gödel’s Condensation Lemma says that if X is transitive and (X,∈) is isomorphic to (L_α,∈) for a limit ordinal α, then X = L_α, and so X too is constructible (cf. [26] Theorem 5.2 p. 80).

40 The signature of first-order Presburger arithmetic is the signature of the structure (ω,+). It is clear that the ordering is definable in this structure, since x ≤ z if and only if ∃ y x + y = z. Likewise, the constants zero and one are definable in this structure since zero is less than all the other natural numbers, and one is the least non-zero natural number. This signature is named after Presburger, who in 1930 gave a complete axiomatization of this structure. This structure, and Presburger’s axiomatization, admit quantifier elimination if one adds both unary predicate symbols P_n(x) for divisibility by n and symbols for ≤ and 0 and 1. That is, in this enriched signature, every definable set is definable by a quantifier-free formula, and this fact is registered in Presburger’s axiomatization. For a proof of this, along with the completeness of Presburger’s axiomatization, see Marker [107] pp. 81 ff.
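The definitions of the ordering, zero, and one from addition alone can be checked mechanically on an initial segment of the natural numbers. The following is a minimal sketch under stated assumptions: the bound N, the function names, and the restriction of the quantifiers to a finite range are all introduced here for illustration, whereas in the structure (ω,+) the quantifiers range over all natural numbers.

```python
N = 20  # hypothetical bound; in (ω,+) the quantifiers are unbounded


def leq(x: int, z: int) -> bool:
    """x ≤ z iff there is a y with x + y = z (a witness y ≤ z suffices)."""
    return any(x + y == z for y in range(z + 1))


def is_zero(x: int) -> bool:
    """Zero is the element that is ≤ every element (checked below N)."""
    return all(leq(x, z) for z in range(N))


def is_one(x: int) -> bool:
    """One is the least non-zero element (checked below N)."""
    return (not is_zero(x)) and all(
        leq(x, z) for z in range(N) if not is_zero(z)
    )


if __name__ == "__main__":
    print([x for x in range(N) if is_zero(x)])
    print([x for x in range(N) if is_one(x)])
```

Note that leq, is_zero, and is_one use nothing but addition, equality, and quantification, which is what definability in the signature of (ω,+) requires.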

41 Let me explain in more detail what I mean when I say that the Presburger signature has the resources to express the primality of individual natural numbers. For each natural number n, consider the formula mentioned in the above endnote for divisibility by n, namely

\[ P_n(x) \equiv \exists y \; \underbrace{y + \cdots + y}_{n \text{ times}} = x \tag{1.10} \]

Using these formulas, for each natural number n, we can find sentences ϕ_n in the Presburger signature such that (ω,+) |= ϕ_n if and only if n is prime. In particular, we can choose the sentence

\[ \varphi_n \equiv \bigwedge_{1 < m < n} \neg P_m(\underbrace{1 + \cdots + 1}_{n \text{ times}}) \tag{1.11} \]

(Here, of course, we use the fact, mentioned in the previous endnote, that the number one is definable in the Presburger signature). It seems to me quite intuitive to say that ϕ_n expresses the primality of n: for, quite plainly, its syntax says that for any natural number m with 1 < m < n, it is not the case that m divides n.

But, it is important to note that for every subset X of natural numbers, there is a sequence of sentences ψ_n such that (ω,+) |= ψ_n if and only if n is in X. For instance, just let ψ_n say that 0 = 0 if n is in X and let ψ_n say that 0 = 1 if n is not in X. However, this does not guarantee that the Presburger signature has the resources to express the X-ness of individual natural numbers. For, there is a syntactic uniformity in the case of the ϕ_n’s which is lacking in the case of the ψ_n’s.
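The syntactic uniformity of the sentences ϕ_n can be brought out by evaluating them with a single procedure that uses only addition. The sketch below rests on several assumptions introduced for illustration: the function names are hypothetical, ϕ_n is read as the conjunction over all m with 1 < m < n of the claim that m does not divide n, the input is restricted to n ≥ 2, and the existential quantifier in P_m is bounded by x (which loses nothing, since any witness y satisfies y ≤ x when m ≥ 1).

```python
def P(m: int, x: int) -> bool:
    """P_m(x): there is a y such that y + ... + y (m times) equals x."""
    return any(sum(y for _ in range(m)) == x for y in range(x + 1))


def numeral(n: int) -> int:
    """The term 1 + ... + 1 (n times), built by repeated addition."""
    return sum(1 for _ in range(n))


def phi(n: int) -> bool:
    """phi_n: for every m with 1 < m < n, not P_m(1 + ... + 1)."""
    return all(not P(m, numeral(n)) for m in range(2, n))


if __name__ == "__main__":
    print([n for n in range(2, 30) if phi(n)])
```

One and the same definition of phi serves for every n, whereas the sentences ψ_n above would require a separate table recording, for each n, whether n is in X.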

42 Let me briefly explain why this is. Suppose that X is an infinite subset of natural numbers which is definable in the structure (ω,+). By the quantifier-elimination result mentioned two endnotes ago, this set is defined by a quantifier-free formula in the structure (ω, 0, 1,+,≤, P_n). Then the result follows by an enumeration of cases. For instance, if n > 1 then the formula P_n(x) holds of non-primes, since 2n is non-prime, and likewise the formula ¬P_n(x) holds of many non-primes, since it holds of p² for any prime p > n.

43 It is not obvious that one can make any stronger claim here. For instance, it is by no means obvious what it even means to say that such-and-such a result must be proven by means of the Peano axioms. Hence, even though sometimes I shall put the point by saying that knowledge of the Peano axioms is required for knowledge of arithmetical signature, this should be understood as elliptical for the more circumspect claim that our knowledge of arithmetical signature is as a matter of fact based on our knowledge of the Peano axioms. So one way in which to disagree with this claim would be to illustrate some alternative manner in which we could secure knowledge of the arithmetical principles which ground our knowledge of signature. Further, it should be mentioned that one does not use all of the Peano axioms to prove the arithmetical results in question here. This, of course, is obvious, simply because both first- and second-order Peano arithmetic (cf. endnote 3) are not finitely axiomatizable (cf. Hájek-Pudlák [59] Corollary III.2.24 p. 164, Simpson [138] Corollary VII.7.8 p. 306), and hence one does not use all of the Peano axioms to prove any particular result or any finite set of particular results. However, in the text I will pass over this, as I don’t think that this affects the overall point which I am making about the signature problem. If one so desires, wherever in the text I say things like “knowledge of arithmetical signature requires knowledge of the Peano axioms,” one can replace this with “knowledge of arithmetical signature requires knowledge of some non-trivial finite segment of the Peano axioms.”


44 I have focused the discussion around the Presburger signature, but there are other signatures which one could have likewise used. For instance, it is likewise easy to find many natural arithmetical concepts which are not expressible in the signature which just contains successor s(x) = x + 1, since any definable subset of (ω, s) is finite or its complement is finite (cf. [107] Exercise 3.4.3 p. 104). Similarly, one might consider the second-order structure (ω, P(ω), s) where one is only allowed to quantify over subsets of natural numbers but not over subsets of pairs of natural numbers or subsets of triples of natural numbers etc. Büchi showed that the subsets of natural numbers which are definable in this structure correspond to subsets which are recognizable by certain finite automata (cf. [90] Theorems 3.10.3-3.10.4 pp. 201-202).

45 There are various extant projects in the philosophy of mathematics which seek to render theories and structures free from particular signatures, and I want briefly to indicate some of my reservations about the extent to which these projects could be successfully imported into the setting of the Logicist Template. In developing his brand of structuralism, Resnik expressed some reservations about having to describe structures as structures within a particular signature, saying:

Most mathematicians and logicians would regard number theory developed in a language in which the successor symbol is primitive as essentially the same as a development taking the less than symbol as primitive. Since I am viewing number theory as the science of a certain pattern or patterns, this would suggest that (N, S) and (N, <) [the natural numbers with successor, and the natural numbers with less than] should count as the same or essentially the same pattern. [...] Moreover, they are not isolated examples of non-isomorphic structures which mathematicians view as essentially the same: we have Boolean Algebras in the form of rings, but also in the form of lattices, alternative definitions of groups and topologies, and so on ([128] p. 535, cf. [129] pp. 207-208).

Resnik later suggests a way to handle this problem by defining a certain equivalence relation on structures. In particular, Resnik says that two structures are equivalent if they have the same domain and the constants, relations, and functions of the one are definable in the other, and vice-versa ([128] p. 536, [129] pp. 208-209). (Note that the notion of pattern occurrence which Resnik uses is explicitly cast in terms of definability ([128] p. 533, [129] p. 205)). One concern which I have is that this is too fine-grained an equivalence relation for the purposes of the structure-based version of the Logicist Template. For instance, no two of the following three structures are equivalent: (ω, s), (ω,+), and (ω,+,×). For, the even numbers are definable in (ω,+), while any definable subset of (ω, s) is finite or cofinite (cf. endnote 44). Likewise, multiplication is not definable in (ω,+) since if we could define multiplication, then we could define the set of primes (cf. endnote 42). Hence, if one modified the structure-based version of the Logicist Template so that it was phrased in terms of equivalence classes of structures, then one could still ask why the natural numbers are a structure in the equivalence class of (ω,+,×) and not in the equivalence class of (ω,+).

Another extant project which seems relevant to the signature problem is Feferman and Lavine’s notion of a schematic theory ([36] § 1.4 pp. 6 ff, [96] § 5.7 pp. 117 ff). The idea here is that one begins with an initial theory T which contains schemata, in the way that first-order Peano arithmetic contains the mathematical induction schema (cf. endnote 3). The schematic theory T∗ associated to T is then a mapping from L-signatures extending the signature of T to an L-theory extending T which contains all the new instances of T’s original schemata. For instance, in the case of first-order Peano arithmetic, the passage from T to T∗ would reflect a disposition to accept an instance of the mathematical induction schema regardless of this instance’s signature. Feferman and Lavine’s method is admittedly an elegant way to avoid particular signatures in the case of theories. However, it is not clear what the possible analogue of this in the case of structures would be. Of course the general idea would be to begin with a particular structure M and then define the schematic structure M∗ to be a mapping from L-signatures extending the signature of M to an L-structure which expands M. In the case of Feferman and Lavine’s notion of a schematic theory, it is obvious how to define the extension of T to the new signature: one just adopts all the new instances of T’s original schemata. However, it does not seem like there is an analogue of this move in the case of structures. For instance, when I add a new unary relation symbol to M’s signature, I have to choose a particular subset of M to answer to this new symbol, and it is not obvious that anything about M is going to be able to guide me in this choice in the way that T guides the expansion to the new signature.


CHAPTER 2

EMPIRICISM, PROBABILITY, AND KNOWLEDGE OF ARITHMETIC:

A PRELIMINARY DISCUSSION

2.1 Introduction: Inceptive and Amplificatory Empiricism

The topic of this chapter is the tenability of a certain type of empiricism about our knowledge of the Peano axioms. The Peano axioms constitute the standard contemporary axiomatization of arithmetic, and they consist of two parts, a set of eight axioms called Robinson’s Q, which ensure the correctness of the addition and multiplication tables, and the principle of mathematical induction, which says that if zero has a given property and n + 1 has it whenever n has it, then all natural numbers have this property.⁴⁶ The type of empiricism about the Peano axioms which I want to consider here does not claim that perception can provide us with knowledge of these axioms in the same way that perception can provide us with knowledge of the properties of middle-sized objects. Rather, the idea behind the type of empiricism which I want to consider is that arithmetical knowledge is akin to the knowledge by which we infer from the past to the future, or from the observed to the unobserved. It is not uncommon today to hold that such inductive inferences can be rationally sustained by appeal to informed judgments of probability. The goal of this chapter is to articulate and evaluate an empiricism which contends that the Peano axioms can be fully justified by recourse to judgements of probability.


This empiricism merits our attention primarily because there are few contemporary accounts of our knowledge of the Peano axioms, and those accounts which we do have seem to face deep problems. Logicism, for example, suggests that knowledge of the Peano axioms may be based on knowledge of ostensibly logical principles – such as Hume’s Principle – and the knowledge that the Peano axioms are representable within these logical principles. The success of logicism thus hinges upon identifying a concept of representation which can sustain this inference, and as I have argued elsewhere, it seems that we presently possess no such concept.⁴⁷ Alternatively, some structuralists have suggested that knowledge of the Peano axioms may be based on our knowledge of the class of finite structures. However, this account then owes us an explanation of why the analogues of the Peano axioms hold on the class of finite structures: why, for example, there is no finite structure which is larger than all the other finite structures.⁴⁸ Finally, it has been recently suggested that the natural number structure is itself perceivable in a way which would justify the Peano axioms, or would at least justify the satisfiability of these axioms.⁴⁹ However, it seems difficult to see how such an infinite structure is perceivable in anything like the same sense in which middle-sized objects are perceivable, and thus this account owes us some explanation of what the perception-like relation is which we bear to the natural number structure, and why this perception-like relation should be a source of justification, despite the manifest differences between it and our ordinary modes of perception.⁵⁰

The second reason that a probability-based empiricism about the Peano axioms merits our attention is that it has been suggested in different ways by both historical and contemporary sources. For instance, prior to Frege, a not uncommon view seems to have been that mathematical induction was an empirical truth akin to enumerative induction. This is why Kästner thought that mathematical induction was not fit to be an axiom,⁵¹ and this is part of the background to Reid’s begrudging concession that “necessary truths may sometimes have probable evidence.”⁵² However, some contemporary authors writing on the epistemology of arithmetic and arithmetical cognition have also suggested views related to this. For instance, Rips and Asmuth – two cognitive scientists who work on mathematical cognition – have recently considered the suggestion that “the theoretical distinction between math[ematical] induction and empirical induction” is not as clear as has been claimed, and that “even if the theoretical difference were secure, it wouldn’t follow that the psychological counterparts of these operations are distinct.”⁵³ Finally, in the course of their work on the epistemic propriety of randomized algorithms, Gaifman and Easwaran have both suggested the possibility of extending the notion of probability that they employ to broader issues in the epistemology of arithmetic.⁵⁴

So, as with Gaifman and Easwaran, the empiricism about arithmetical knowledge that I want to consider is centered around the notion of a probability assignment and the associated confirmation relation. A probability assignment is a mapping P from sentences in a fixed formal language to real numbers which satisfies the following three axioms (cf. [74] pp. 20 ff, [30] pp. 35 ff):

(P1) P(ϕ) ≥ 0

(P2) P(ϕ) = 1 if |= ϕ

(P3) P(ϕ ∨ ψ) = P(ϕ) + P(ψ) if |= ¬(ϕ & ψ)

In what follows, all the probability assignments under consideration shall be assumed to have a domain which includes all the sentences in the language of the Peano axioms. Further, it shall be assumed that the consequence relation |= in axioms P2-P3 is the logical consequence relation from first-order logic, so that |= ϕ holds if and only if ϕ is true on all models. The notion of confirmation is then defined as an increase of the probability of a hypothesis conditional on evidence relative to the background knowledge. That is, hypothesis h is said to be confirmed by evidence e relative to background knowledge k if P(h|e & k) > P(h|k), assuming that the conditional probabilities P(h|e & k), P(h|k) are defined, where these conditional probabilities are given by the equation P(h′|e′) = P(h′ & e′) / P(e′). Further, the hypothesis h is said to be confirmed tout court by evidence e if P(h|e) > P(h), where again it is assumed that this conditional probability is defined. Hence it is easy to see by standard manipulations of P1-P3 that to establish that a hypothesis is confirmed by evidence relative to background knowledge, it suffices to show that (i) the hypothesis and the background knowledge jointly logically imply the evidence and that (ii) the conjunction of the evidence and the background knowledge is assigned non-zero probability which is strictly less than the probability assigned to the background knowledge. Likewise, to show that a hypothesis is confirmed tout court by evidence, it suffices to show that the hypothesis logically implies the evidence and that the evidence is assigned a non-zero probability strictly less than one.⁵⁵
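The sufficient conditions for confirmation just stated can be checked on a toy example. The sketch below is only an illustration under assumptions that are not in the text: sentences are modelled as sets of six equally weighted "worlds", logical implication is modelled as set inclusion, the particular sets h, e, and k are invented, and the hypothesis together with the background knowledge is given non-zero probability (without which the strict inequality could not hold).

```python
from fractions import Fraction

# Toy model (an assumption for illustration): six equally likely worlds.
# A sentence is represented by the set of worlds in which it is true.
WORLDS = frozenset(range(6))
WEIGHT = {w: Fraction(1, 6) for w in WORLDS}


def P(sentence: frozenset) -> Fraction:
    """Probability of a sentence: the total weight of its worlds."""
    return sum((WEIGHT[w] for w in sentence), Fraction(0))


def P_given(a: frozenset, b: frozenset) -> Fraction:
    """Conditional probability P(a | b) = P(a & b) / P(b)."""
    if P(b) == 0:
        raise ValueError("conditional probability is undefined")
    return P(a & b) / P(b)


def confirms(h: frozenset, e: frozenset, k: frozenset = WORLDS) -> bool:
    """h is confirmed by e relative to k iff P(h | e & k) > P(h | k)."""
    return P_given(h, e & k) > P_given(h, k)


# Hypothetical hypothesis, evidence, and background knowledge.
h = frozenset({0})
e = frozenset({0, 1, 2})
k = frozenset({0, 1, 2, 3})

if __name__ == "__main__":
    print((h & k) <= e)             # (i) h and k jointly imply e
    print(0 < P(e & k) < P(k))      # (ii) 0 < P(e & k) < P(k)
    print(P_given(h, e & k), P_given(h, k), confirms(h, e, k))
```

Taking k to be the set of all worlds gives confirmation tout court, since conditioning on a sentence true in every world changes nothing.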

Since there are two parts to the Peano axioms – namely Robinson’s Q and mathematical induction – there are two complementary forms of empiricism which I want to consider here, which I call inceptive empiricism and amplificatory empiricism. Amplificatory empiricism contends that one is justified in inferring from the antecedent of an instance of mathematical induction to its consequent, relative to the background knowledge consisting of the conjunction of the eight axioms of Robinson’s Q, because the consequent is confirmed by the antecedent relative to this background knowledge. Since in conjunction with the eight axioms of Robinson’s Q, the consequent of such an instance (the claim that all numbers have a given property) logically implies its antecedent (the claim that zero has this property and that n + 1 does whenever n does), it then follows that the consequent is confirmed by the antecedent relative to the background knowledge consisting of the conjunction of the eight axioms of Robinson’s Q if the conjunction of these eight axioms and the antecedent is assigned a non-zero probability strictly less than the probability assigned to the conjunction of these eight axioms. Hence, were one to accept amplificatory empiricism, then there would be a straightforward connection between justification and probability, according to which one would be justified in inferring from the antecedent of an instance of mathematical induction to its consequent, against the background knowledge of the eight axioms of Robinson’s Q, because of the probabilities assigned to these sentences.⁵⁶, ⁵⁷

Whereas amplificatory empiricism is a claim about how one may rationally proceed from Robinson’s Q to mathematical induction, inceptive empiricism is a claim about how one may rationally arrive at Robinson’s Q in the first place. In particular, inceptive empiricism is the contention that one is justified in inferring from several instances of the axioms of Robinson’s Q to these axioms themselves because the axioms are confirmed tout court by the conjunction of these several instances. For instance, Robinson’s Q includes the axiom ∀ x, y [x(y + 1) = xy + x], and inceptive empiricism claims that confirmation justifies one in inferring to this axiom from several of its instances, such as 6(7 + 1) = 6 · 7 + 6. Let us call this type of confirmation, wherein a universal claim is confirmed by several of its instances, instance confirmation. Further, in the case where the claims in question are arithmetical in character (resp. physical in character) let us call this type of confirmation arithmetical instance confirmation (resp. physical instance confirmation). So inceptive empiricism contends that the axioms of Robinson’s Q can be justified by means of arithmetical instance confirmation.⁵⁸

It is important to emphasize that inceptive empiricism and amplificatory empiricism are independent of one another.⁵⁹ For instance, inceptive empiricism relies on arithmetical instance confirmation in a way in which amplificatory empiricism does not, and hence were instance confirmation to be found to be somehow inimical to justification, this would tell only against inceptive empiricism. Likewise, it does not seem irrational to endorse inceptive empiricism in addition to some logicist account of the justification of mathematical induction,⁶⁰ and hence commitment to inceptive empiricism does not seem to demand commitment to amplificatory empiricism. However, despite this independence, these two forms of empiricism are complementary, in that they combine to give us a probabilistic account of the justification of the Peano axioms. In particular, inceptive empiricism gives us a probabilistic route by which to proceed from a warrant for individual quantifier-free truths about the natural numbers to a warrant for the axioms of Robinson’s Q, and likewise amplificatory empiricism gives us a probabilistic route by which to proceed from a warrant for the axioms of Robinson’s Q to a warrant for instances of mathematical induction.

The goal of this chapter is to defend these two forms of empiricism against three

types of challenges, and in doing so to defend the tenability of the probabilistic

account of the justification of the Peano axioms which is jointly provided by these

two forms of empiricism. The first type of challenge is common to both forms

of empiricism, and stems from the fact that both of these forms of empiricism

presuppose that confirmation is a source of justification. The problem with this is

that there are reasons peculiar to the setting of arithmetic which suggest that we

do not have access to probability assignments and their associated confirmation

relations in this setting. One such reason has to do with different versions of

countable additivity, each of which provides a rule for calculating the probability

of non-propositional logical connectives, and another such reason has to do with

the non-computability of the probability assignments themselves. In the case of

countable additivity, my response in § 2.2.1 is to note that the particular version

of countable additivity which gives rise to this challenge is not a consequence of

the conception of rationality commonly associated with Dutch Book Arguments.

In the case of computability, I respond in § 2.2.2 by arguing that the tension

between the non-computability of probability assignments and the tractability of

rational belief dissipates if one views rational beliefs as being reflected by a family

of probability assignments, as opposed to a single probability assignment.

A second series of challenges is specific to the arithmetical instance confirmation

upon which inceptive empiricism relies. In § 2.3.1, I discuss the first of

these challenges, which is due to Baker ([5]), who suggests that arithmetical in-

stance confirmation is alternatively unreliable or insufficiently diverse because it

relies upon small samplings. On the score of unreliability, my response is that

physical instance confirmation displays a similar reliance and yet fails to display

a similar unreliability, while on the score of insufficient diversity, my response is

simply that on three extant analyses of evidential diversity, arithmetical instance

confirmation is not insufficiently diverse. In § 2.3.2, I discuss a second challenge to

arithmetical instance confirmation, which suggests that it is objectionable because

it is unstable, where evidence for a universal hypothesis about a domain of objects

is said to be unstable if there are particular objects from the domain such that this

evidence can be bettered by the additional evidence that these objects satisfy the

hypothesis. My response to this challenge is to suggest that while stability may

be a virtue with regard to geometric reasoning, it is not a virtue in arithmetical

reasoning.

A final type of challenge is specific to amplificatory empiricism, and consists in

the challenge of explaining why the inference from the antecedent of an instance

of mathematical induction to its consequent is better than certain alternative

inferences. Recall that the consequent of an instance of mathematical induction

says that all numbers have a given property, and that the antecedent says that

zero has a property and that n + 1 has this property whenever n does. For

the sake of disambiguation, let us call this antecedent and this consequent the

genuine antecedent and the genuine consequent. Just as the genuine consequent

may be confirmed by the genuine antecedent, so it may be confirmed by the

following claim, which I call the pseudo-antecedent: zero has the property and 2(n+1)

has the property whenever 2n does. Further, just as the pseudo-antecedent may

confirm the genuine consequent, so it may confirm the following claim, which I

call the pseudo-consequent: all even natural numbers have the property. In § 2.4, I

employ the degree of confirmation to explain why the genuine antecedent provides

better evidence for the genuine consequent than does the pseudo-antecedent, and

likewise, to explain why the pseudo-antecedent provides better evidence for the

pseudo-consequent than for the genuine consequent. In this section I also make

similar suggestions regarding alternative inferences centered around non-standard

integers.

Hence, this chapter constitutes a prolegomenon to a thoroughgoing empiricism

about the epistemology of arithmetic, in that it articulates and responds to what I

regard as the most pressing objections to inceptive and amplificatory empiricism,

objections which, if unanswerable, would render such empiricism unworthy of fur-

ther investigation. In § 2.5, I discuss the two primary tasks to which future work

on such empiricism must attend, namely, an identification of sources of arithmeti-

cal probability and a delineation of the type of arithmetical reasoning figuring in

the Peano axioms from the type of arithmetical reasoning figuring in the addition

and subtraction of probabilities. But the task of this present chapter is simply

to secure the tenability of this alternative conception of arithmetical knowledge.

For, despite the historical provenance of this probabilistic perspective, it seems

safe to say that it is entirely alien to the predominant ideas in the epistemology

of arithmetic, such as the logicist idea that knowledge of the Peano axioms is

epistemically akin to modus ponens, or the idea, mentioned above, that one has

a type of perceptual access to the natural number structure itself. The guiding

idea of this alternative probabilistic perspective is that mathematical induction

and the other Peano axioms are epistemically akin to enumerative induction, and

the aim of the present chapter is thus merely to give voice and answer to some of

the more pressing objections to the verisimilitude of this probabilistic conception

of the nature of arithmetical knowledge.

2.2 Challenges to Access to Probability Assignments

Both inceptive and amplificatory empiricism presuppose that confirmation is a

source of justification, and the challenges to be considered in this section suggest

in different ways that we do not have access to this source of justification, due to

the fact that probability in the setting of arithmetic is quite different in character

from probability in the setting of the natural sciences. In particular, both of the

challenges considered here adduce reasons for thinking that grasping probability

assignments in the setting of arithmetic is no less difficult than grasping arith-

metical truth itself. The first of these challenges arises from countable additivity,

each version of which provides a way to calculate the probabilities associated to

certain non-propositional logical connectives. My response to this challenge is to

argue that those versions of countable additivity which generate this challenge

do not follow from the conception of probability evinced in Dutch Book Arguments

(§ 2.2.1). The second of these challenges arises from the fact that probability

assignments are in general non-computable. My response here is to suggest that

non-computability is an issue only if we take the type of probability to which we

have access to be reflected by a single probability assignment, as opposed to a

class of probability assignments (§ 2.2.2).

2.2.1 Countable Additivity: Aligning the True and Probable

There are several different versions of countable additivity, but their common

impetus lies in the thought that the probability axioms P1-P3 only articulate rules

of probability for the propositional connectives. For instance, it is straightforward

to derive from P1-P3 the following rules which relate probabilities to disjunctions,

conjunctions, and negations:

(P4) P (ϕ ∨ ψ) + P (ϕ & ψ) = P (ϕ) + P (ψ)

(P5) P (¬ϕ) = 1− P (ϕ)
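
As a sanity check on (P4) and (P5), the following minimal sketch verifies both rules on a toy example. It assumes a hypothetical two-atom propositional language in which a probability assignment is induced by weights on truth assignments; the atoms, the weights, and the helper `prob` are illustrative choices and are not drawn from the text.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy setup: two atoms p, q; a "world" is a truth assignment.
ATOMS = ("p", "q")
WORLDS = [dict(zip(ATOMS, vals)) for vals in product([True, False], repeat=len(ATOMS))]

# Illustrative weights on the four worlds (non-negative, summing to one).
WEIGHTS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

def prob(sentence):
    """Probability of a sentence: total weight of the worlds in which it holds."""
    return sum((wt for world, wt in zip(WORLDS, WEIGHTS) if sentence(world)), Fraction(0))

phi = lambda w: w["p"]
psi = lambda w: w["q"]

# (P4): P(phi or psi) + P(phi and psi) = P(phi) + P(psi)
p4_lhs = prob(lambda w: phi(w) or psi(w)) + prob(lambda w: phi(w) and psi(w))
p4_rhs = prob(phi) + prob(psi)

# (P5): P(not phi) = 1 - P(phi)
p5_lhs = prob(lambda w: not phi(w))
p5_rhs = 1 - prob(phi)
```

Exact rational arithmetic is used so that the two sides of each rule can be compared for equality without rounding concerns.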

The basic motivation behind countable additivity is to exhibit analogous rules for

non-propositional connectives. In particular, suppose that the formal language or

signature under consideration is the signature L of the Peano axioms, so that it

contains a function symbol s for successor and a constant symbol 0 for zero. It

follows from this that the signature L contains terms $s^n(0)$ corresponding to the n-th

successor of zero, e.g. $s^2(0)$ is s(s(0)), the second successor of zero, and $s^3(0)$

is s(s(s(0))), the third successor of zero. One can then articulate the following version

of countable additivity, which for the sake of disambiguation can be referred

to as ω-additivity, where ϕ(x) is an L-formula with free variable x:

(Pω) $P(\forall x\, \varphi(x)) = \lim_{N} P(\bigwedge_{n=1}^{N} \varphi(s^n(0)))$

Hence, the idea of ω-additivity is that the probability of a universal arithmetical

hypothesis may be approximated arbitrarily closely by the probabilities assigned

to the sentences expressing that further and further arithmetical terms satisfy this

hypothesis.
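
To see why ω-additivity is a substantive constraint rather than an automatic one, the following sketch may be helpful. It assumes only the usual monotonicity consequence of P1-P3, namely that P(ϕ) ≤ P(ψ) whenever ϕ implies ψ; since P1-P3 are not restated in this section, this should be read as an assumption about their content.

```latex
\[
P\bigl(\forall x\, \varphi(x)\bigr)
\;\le\;
P\Bigl(\bigwedge_{n=1}^{N+1} \varphi(s^{n}(0))\Bigr)
\;\le\;
P\Bigl(\bigwedge_{n=1}^{N} \varphi(s^{n}(0))\Bigr)
\;\le\; 1
\qquad \text{for every } N \ge 1 .
\]
```

Under this assumption the sequence of probabilities of the finite conjunctions is non-increasing and bounded below, so its limit exists and is at least the probability of the universal hypothesis. What ω-additivity adds is the demand that this inequality be an equality.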

To obtain a different version of countable additivity, one can consider an ex-

tension to a setting where one can form new sentences by taking conjunctions and

disjunctions over countable sets of sentences. These operations of conjunction

and disjunction are respectively written as∧n ϕn and

∨n ϕn, and the resulting

class of sentences are called Lω1ω-sentences. Relative to a natural semantics and

deductive system for these sentences, there is a completeness theorem for Lω1ω-

sentences,61 and hence the notion of a probability assignment on these sentences

can be defined. In particular, an Lω1ω-probability assignment is an assignment of

real numbers to Lω1ω-sentences which satisfies P1-P3 (relative to the consequence

relation on Lω1ω-sentences for which the completeness theorem holds). One can

then consider the following version of countable additivity, which for the sake of

disambiguation can be referred to as ω1-additivity, where ϕ1, ϕ2, . . . is a countable

sequence of Lω1ω-sentences:

(Pω1) $P(\bigwedge_{n} \varphi_n) = \lim_{N} P(\bigwedge_{n=1}^{N} \varphi_n)$

Outside of the difference between the universal quantifier and the infinite conjunc-

tion, the primary difference between ω-additivity and ω1-additivity is an analogue

of the use-mention distinction: in ω-additivity, the natural number n is employed

to make a statement about the n-th successor of zero, whereas in the case of ω1-

additivity, it is only employed as an index for the sentence ϕn, which may or may

not be a statement about numbers at all.

While this difference between ω-additivity and ω1-additivity may seem innocu-

ous, it is not difficult to see that ω-additivity and only ω-additivity requires that

“having a high probability” align with arithmetical truth. For, suppose that the

conjunction of the eight axioms of Robinson’s Q is assigned a high probability,

say greater than 1 − ε, where ε is some small non-zero error threshold. Under

these circumstances, it follows from the fact that Robinson’s Q proves the cor-

rectness of the addition and multiplication tables that if a probability assignment

satisfies ω-additivity, then an arithmetical sentence is true of the standard model

if and only if it is assigned probability greater than 1 − ε. Here the standard

model is the structure (ω, 0, s,+,×), where ω = {0, 1, 2, . . .} is the set of natural

numbers. Hence, if a probability assignment satisfies ω-additivity, then registering

a high probability by reference to this assignment is coextensive with truth for

arithmetical sentences.62 However, the same is not the case with respect to ω1-

additivity. In particular, it is not difficult to see that for any sentence of first-order

predicate logic which is not a consequence of the axioms of Robinson’s Q, there

is an ω1-additive probability assignment which assigns this sentence probability

zero and which still gives the conjunction of the axioms of Robinson’s Q a high

probability.63 This simple fact shows that unlike ω-additivity, it is not the case

that the satisfaction of ω1-additivity forces the alignment of the arithmetically

true and the arithmetically probable.

Since the idea common to inceptive and amplificatory empiricism is that arith-

metical claims can be justified by recourse to judgements about confirmation and

probability, it is important for the tenability of these forms of empiricism that

they not be committed to ω-additivity. For, by the result mentioned in the pre-

vious paragraph, such commitment would force the arithmetically true to align

with the arithmetically probable, and such alignment would cast doubt on our ac-

cess to judgements about confirmation and probability in the case of arithmetic.

To see this, consider an analogous scenario centered not around probability but

around perception. Should someone posit perception as a source of justification

about arithmetic, but then inform us that this sort of perception happened to

be infallible, it seems that the proper response would be to question whether this

type of perception is something which we actually possess, given that it is so man-

ifestly different from our normal modes of perception. Likewise, it seems that the

alignment of the true and the probable in the case of arithmetic should lead us

to question whether we have access to arithmetical probability. Since such access

is vital to the ultimate tenability of inceptive empiricism and amplificatory em-

piricism, it is necessary to say why these forms of empiricism are not committed

to ω-additivity.

My response to this challenge is to argue that the reasons which commit in-

ceptive empiricism and amplificatory empiricism to the probability axioms P1-P3

do not extend to ω-additivity, even though they do extend to ω1-additivity. For,

it is common today to justify commitment to P1-P3 by taking recourse to Dutch

Book arguments, and just as it is demonstrable that ω1-additivity is justifiable

by recourse to such arguments, so it is likewise demonstrable that ω-additivity is

not so justifiable. Let me first describe the relevant theorems and non-theorems

before turning to the relation of the theorems to the justification of probability

axioms. The theorems and non-theorems in question here concern complete con-

sistent theories T in the signature L of the Peano axioms, and in what follows it

will be convenient to regard such complete extensions as zero-one valued functions

on the set of L-sentences, so that T (ϕ) = 1 if T |= ϕ and T (ϕ) = 0 otherwise.

Having this convention in place, the standard version of the Dutch Book Theorem

reads as follows:

Dutch Book Theorem, Standard Version: Suppose that P is a function from L-sentences to real numbers. Then P is a probability assignment if for every finite sequence of real numbers $s_1, \ldots, s_N$ and every finite sequence of L-sentences $\varphi_1, \ldots, \varphi_N$, there is a complete consistent L-theory T such that

$\sum_{n=1}^{N} s_n (T(\varphi_n) - P(\varphi_n)) \geq 0$.

The situation described in the antecedent of the theorem may be vivified as follows.

Suppose that a bookie offers a stake of sn units of currency on sentence ϕn and

that a bettor provides the bookie with snP (ϕn) units. Suppose further that there

is an agreement in place that if ϕn turns out false, then the bettor wins nothing

(for a net total of −snP (ϕn) units), and that if ϕn turns out true, then the bettor

wins sn (for a net total of sn − snP (ϕn) units). Finally, say that the bettor is

invulnerable to a Dutch book if for any finite sequence of bets there is always some

situation, representable in terms of a complete, consistent theory, in which the

net total due to the bettor across all bets is not strictly negative. Hence, cast in

these terms, the Dutch Book theorem says that invulnerability to a Dutch book

is a sufficient condition for an assignment to be a probability assignment.64
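
The betting scenario can be illustrated with a small sketch. It assumes a hypothetical two-atom propositional language in which truth assignments stand in for complete consistent theories; the names `net_total` and `survives`, and the particular prices, are illustrative and not taken from the text. Note that the theorem quantifies over all finite sequences of stakes, whereas the sketch only checks one given sequence of bets.

```python
from fractions import Fraction
from itertools import product

# Truth assignments play the role of complete consistent theories T.
ATOMS = ("p", "q")
WORLDS = [dict(zip(ATOMS, vals)) for vals in product([True, False], repeat=len(ATOMS))]

def net_total(world, bets):
    """Bettor's net total in one world: sum of s_n * (T(phi_n) - P(phi_n))."""
    return sum(s * ((1 if phi(world) else 0) - price) for s, phi, price in bets)

def survives(bets):
    """True if in some world the bettor's net total is not strictly negative."""
    return any(net_total(world, bets) >= 0 for world in WORLDS)

p = lambda w: w["p"]
not_p = lambda w: not w["p"]

# Each bet is (stake s_n, sentence phi_n, price P(phi_n)).
# Pricing both p and not-p at 3/5 violates the rule P(not phi) = 1 - P(phi).
incoherent = [(1, p, Fraction(3, 5)), (1, not_p, Fraction(3, 5))]
coherent = [(1, p, Fraction(3, 5)), (1, not_p, Fraction(2, 5))]
```

With the incoherent prices the bettor's net total is −1/5 in every world, so this pair of bets constitutes a Dutch book; with the coherent prices the net total is zero in every world.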

The technical point that I view as relevant here is that while the analogous

theorem holds for ω1-additivity, it does not hold for ω-additivity. In particular, it

is well-known that by appropriately augmenting the proof of the standard version

of the Dutch Book Theorem,65 one can establish the following:

Dutch Book Theorem, ω1-additive Version: Suppose that P is a function from Lω1ω-sentences to real numbers. Then P is an ω1-additive Lω1ω-probability assignment if for every infinite sequence of real numbers $s_n$ and every infinite sequence of Lω1ω-sentences $\varphi_n$ such that the sequence $s_n P(\varphi_n)$ is absolutely convergent, there is a complete consistent Lω1ω-theory T such that

$\sum_{n} s_n (T(\varphi_n) - P(\varphi_n)) \geq 0$.

In developing the analogous version for ω1-additivity, it turns out that it is im-

portant to include the stipulation about absolute convergence.66 Here, absolute

convergence means that $\sum_{n=1}^{\infty} |s_n P(\varphi_n)| < \infty$, i.e. that the sequence of partial

sums $\sum_{n=1}^{N} |s_n P(\varphi_n)|$ approaches a finite limit in the real numbers. In terms of

the betting scenario described above, this corresponds to the requirement that

the units of currency potentially exchanged between the bookie and the bettor be

finite.

However, when we turn from ω1-additivity to ω-additivity, what we find is that

the analogous version of the Dutch Book Theorem is false:

Counterexample to Dutch Book Theorem, ω-additive Version: There is a function P from L-sentences to real numbers such that (i) P is not an ω-additive probability assignment, and such that (ii) P has the following property: for every infinite sequence of real numbers $s_n$ and every infinite sequence of L-sentences $\varphi_n$ such that the sequence $s_n P(\varphi_n)$ is absolutely convergent, there is a complete consistent L-theory T such that

$\sum_{n} s_n (T(\varphi_n) - P(\varphi_n)) \geq 0$.

It is quite easy to produce such a counterexample. In particular, choose a com-

plete consistent L-theory T0 such that T0 implies Robinson’s Q and such that T0

proves ¬ψ, where ψ is true on the standard model and where ψ ≡ ∀ x ψ0(x)

begins with a universal quantifier followed by a quantifier-free formula ψ0(x) or

by a formula ψ0(x) whose quantifiers are bound to variables appearing earlier in

the sentence. For instance, the claim that x is always strictly less than 2x for

non-zero values of x can be expressed in this way, as well as the consistency state-

ment for the Peano axioms. Given this sentence ψ and this theory T0, then define

a function P from L-sentences to real numbers by setting P (ϕ) = T0(ϕ). Since

Robinson’s Q ensures the correctness of the addition and multiplication tables,67

it follows that

$P(\forall x\, \psi_0(x)) = T_0(\psi) = 0 \neq 1 = \lim_{N} T_0(\bigwedge_{n=1}^{N} \psi_0(s^n(0))) = \lim_{N} P(\bigwedge_{n=1}^{N} \psi_0(s^n(0)))$

Hence, this is how one obtains the failure of ω-additivity, which corresponds to

the roman numeral (i) in the counterexample. It is much easier to see how one

obtains roman numeral (ii) in the counterexample, since one can always choose the

complete consistent theory T to be identical to the complete consistent theory T0,

which will ensure that the sum in question is equal to zero. So this is one way to

see that there is no ω-additive version of the Dutch Book Theorem.

The philosophical significance of the Dutch Book Theorem resides in the fact that

invulnerability to a Dutch book is indicative of a certain type of rationality when

the assignment in question is reflective of degrees of belief, so that the theorems

show that conformity to the probability axioms P1-P3 is a necessary condition of

a certain type of rationality. The type of rationality implicated here is of course

minimally thought to require a disposition to arrange degrees of belief in such a

way that were one to bet units of currency on these degrees of belief, then there

would be at least one situation in which a loss would not be suffered. There are

thus at least two presuppositions to the contention that this type of rationality

constitutes an epistemic virtue. The first presupposition is that some virtues are

revealed purely in terms of counterfactual behavior, since it is obviously not en-

visioned here that one actually engages in such betting scenarios. But while such

“dormant virtues” may be a rarity in the practical sphere, they are commonplace

in the theoretical sphere. For instance, there is a virtue related to consistency

which consists in a disposition to retract previously endorsed axioms were they

to exhibit a demonstrable inconsistency, and it seems reasonable to say that this

virtue is present in our reasoning even if it turns out that the axioms in question

(say the set-theoretic axioms) are in fact consistent. The second presupposition

is that there is a suitable abundance of potential situations across which gains or

losses may be incurred, since were the number or variety of these situations to be

highly curtailed, then the demands of invulnerability would become quite severe.

However, since we are identifying potential situations with complete consistent

theories in a given formal language or signature, the fact that there are contin-

uum many of these would seem sufficient to allay concerns about the severity of

the demands of invulnerability.

Hence, my response to the challenge of ω-additivity is to suggest that in-

ceptive and amplificatory empiricism be conceived as justifying their appeal to

confirmation and probability by means of Dutch Book Theorems, so that the fact

that there is no ω-additive Dutch Book Theorem may be taken as evidence that

these forms of empiricism are not committed to ω-additivity. While this response

clearly meets the challenge of ω-additivity, it has at least two drawbacks. The

first is that if these forms of empiricism are tied to the philosophical interpreta-

tion of the Dutch Book Theorems described above, then all the concerns voiced in

the literature about this interpretation automatically become concerns for these

forms of empiricism.68 The second drawback is that if inceptive and amplificatory

empiricism are going to operate only with those rules of probability which are

licensed by Dutch Book Theorems, then these forms of empiricism cannot justify

the contention that various kinds of confirmation actually occur by recourse to

probabilistic rules. For instance, inceptive empiricism turns on the supposition

that several instances of a universal arithmetical hypothesis are assigned a non-

zero probability strictly less than one, and this supposition by no means follows

from the probability axioms P1-P3 alone. Hence, if these forms of empiricism

are only allowed to operate with these probabilistic rules, then for their ultimate

success they must provide other reasons for giving such assignments. This is one

of the further challenges to these forms of empiricism which I discuss in § 2.5.

Before turning to the challenge from computability, it is helpful to briefly compare

this response to ω-additivity to Isaacson’s well-known response to the ω-rule

([77] § III). The ω-rule is a proof-theoretic rule which licenses the inference to the

claim that ∀ x ϕ(x) from the totality of all claims of the form $\varphi(s^n(0))$, where n

ranges over natural numbers. For the very same reasons that the arithmetically

true and the highly probable become aligned under ω-additive probability assign-

ments which assign high probability to Robinson’s Q, so the arithmetically true

is aligned with what is derivable from Robinson’s Q in deductive systems that

are augmented by the ω-rule. Isaacson was concerned with this because he had

previously argued that the Peano axioms in conjunction with the standard rules

of inference were effectively complete and completely determined our concept of

number ([76]). Thus Isaacson was concerned to show that the ω-rule was not part

of our concept of number, since otherwise the collapse of truth and proof engen-

dered by the ω-rule would make this concept vastly outstrip the concept given to

us by the Peano axioms and standard rules of inference.

One of Isaacson’s basic strategies is to point out that standard defenses of

the ω-rule appeal to truth about the natural numbers, and such an appeal to

truth about the natural numbers is not part of our concept of number, but goes

above and beyond this concept, and is essentially a second-order or higher-order

concept ([77] p. 108). There are obviously many differences between Isaacson’s

strategy for handling the ω-rule and my strategy for handling ω-additivity, but

the one difference which bears especial mentioning is that my discussion of ω-

additivity did not at any place appeal to points specific to our concept of number.

Rather, my discussion focused entirely on what did and did not follow from a

standard justification of the probability axioms. The analogue of my strategy in

Isaacson’s setting would be to argue that the ω-rule did not follow from some standard

justification of the other accepted rules of inference, such as modus ponens.

2.2.2 The Non-Computability of Probability Assignments

I want now to turn to a challenge to access from considerations of computabil-

ity. The basic idea with this version of the challenge is that probability assignments

are in general non-computable, and that such non-computability should render

suspect the presupposition that these probabilities are something which we can

readily discern. Prior to setting out this version of the challenge more carefully,

something must first be said about what it means to say that a probability as-

signment is computable or non-computable. For, the predicates of computable

and non-computable apply only to subsets of natural numbers, and by proxy, to

countable objects which are represented as subsets of natural numbers.69 How-

ever, the real numbers, with which probability assignments are concerned, are

uncountable. Despite this, there are at least two natural ways of representing

probability assignments as countable objects. First, one can restrict one’s atten-

tion to those assignments which map sentences into a countable subfield of the

real numbers, such as the real algebraic numbers, the smallest subfield of the

real numbers which is elementarily equivalent to the real numbers.70 Second, one

can restrict one’s attention to those assignments which map sentences not into

real numbers per se, but rather into certain representations of these numbers as

quickly-converging Cauchy sequences of rationals. In this way, these assignments

can be represented as countable sequences of such Cauchy sequences, so that the

predicates of computable and non-computable are readily applicable.71 Each of

these means of representation is admittedly not without its disadvantages: the

first means of representation excludes many real numbers (such as e and π), while

the second means prescribes what might be regarded as an overly uniform manner

of representation.
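
The second mode of representation can be illustrated with a small sketch, which represents the square root of two by a sequence of rationals whose n-th term lies within 2^-n of the limit. The function name `sqrt2_approx` and the choice of 2^-n as the rate of convergence are assumptions made for illustration; the precise notion of quickly-converging sequence intended in the text may differ.

```python
from fractions import Fraction
from math import isqrt

def sqrt2_approx(n):
    """n-th term q_n of a Cauchy sequence for sqrt(2), with 0 <= sqrt(2) - q_n < 2**-n."""
    # isqrt(2 * 4**n) is the floor of sqrt(2) * 2**n, so dividing by 2**n gives
    # the largest multiple of 2**-n that does not exceed sqrt(2).
    return Fraction(isqrt(2 * 4**n), 2**n)

# The first few terms of the sequence, as exact rationals.
terms = [sqrt2_approx(n) for n in range(6)]
```

Since each term is computed by integer arithmetic alone, the whole sequence is a countable object to which the predicate of computable straightforwardly applies, even though its limit is irrational.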

But under either of these two modes of representation, one can show that if

a probability assignment assigns non-zero probability to the conjunction of the

eight axioms of Robinson’s Q, then the probability assignment is not computable.

This argument proceeds by showing that such an assignment computes a complete

consistent extension of Robinson’s Q, which is known to be non-computable by

work of Tarski (cf. Tarski et al. [145] Theorem 9 p. 60). In particular, suppose

that the sentences in the formal language or signature L of Robinson’s Q are

enumerated as ϕ1, . . . , ϕn, . . . in such a way that ϕ1 is the conjunction of the

eight axioms of Robinson’s Q. Supposing that P is a probability assignment

such that P (ϕ1) > 0, it must be shown that P computes a complete consistent

extension TP of Robinson’s Q. Let TP (ϕ1) = 1, and suppose that for all i < n it

has already been decided whether to set TP (ϕi) = 0 or TP (ϕi) = 1 in such a way

that

$0 < P\bigl(\bigwedge_{T_P(\varphi_i)=1} \varphi_i \;\&\; \bigwedge_{T_P(\varphi_i)=0} \neg\varphi_i\bigr)$ (2.1)

Then it follows from P1-P3 that

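$0 < P\bigl([\bigwedge_{T_P(\varphi_i)=1} \varphi_i] \;\&\; \varphi_n \;\&\; [\bigwedge_{T_P(\varphi_i)=0} \neg\varphi_i]\bigr) + P\bigl([\bigwedge_{T_P(\varphi_i)=1} \varphi_i] \;\&\; [\bigwedge_{T_P(\varphi_i)=0} \neg\varphi_i] \;\&\; \neg\varphi_n\bigr)$ (2.2)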

so that at least one of the two quantities featured in this sum is strictly pos-

itive. If one computes from P that the first quantity is strictly positive, then

set TP (ϕn) = 1 and TP (¬ϕn) = 0, and if one computes from P that the second

quantity is strictly positive, then do the converse.72 This construction results in a

complete theory TP which extends Robinson’s Q and which is computable from

the probability assignment P . Further, this theory is consistent, since if not, then

there is some finite fragment of the theory which proves a contradiction. Since the

axioms P1-P3 imply that contradictions are assigned probability zero, and since

they likewise imply that equivalent sentences are assigned the same probability,

and since anything which proves a contradiction is equivalent to a contradiction,

it follows from P1-P3 that the conjunction of some finite fragment of TP would be

assigned probability zero. This contradicts our construction, in which all the finite

fragments of TP were assigned non-zero probability (cf. equation (2.1)). Hence,

this is one way to see that TP is a complete consistent extension of Robinson’s Q

which is computable from P . From this it follows by the theorem of Tarski that

the probability assignment P is non-computable, at least assuming it assigns the

conjunction of the eight axioms of Robinson’s Q a non-zero probability.
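
The stage-by-stage construction of TP can be sketched in a finite toy setting. The sketch assumes a hypothetical three-atom propositional language, a probability assignment induced by weights on truth assignments, and a short list of sentences in place of the enumeration ϕ1, ϕ2, . . .; the names `prob` and `build_theory` are illustrative. In this toy setting one can test directly whether a quantity is strictly positive, which glosses over how that fact is to be computed from P in the arithmetical case.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("p", "q", "r")
WORLDS = [dict(zip(ATOMS, vals)) for vals in product([True, False], repeat=len(ATOMS))]
# Illustrative weights: worlds in which p is false receive weight zero.
WEIGHTS = [Fraction(1, 4) if w["p"] else Fraction(0) for w in WORLDS]

def prob(sentence):
    """Probability of a sentence: total weight of the worlds in which it holds."""
    return sum((wt for w, wt in zip(WORLDS, WEIGHTS) if sentence(w)), Fraction(0))

def build_theory(sentences):
    """Return verdicts, where verdicts[i] is 1 if sentence i is accepted and 0 otherwise."""
    verdicts = []

    def decided_so_far(world):
        # Conjunction of the accepted sentences and the negations of the rejected
        # ones; zip stops at the sentences that already have a verdict.
        return all(bool(s(world)) == bool(v) for s, v in zip(sentences, verdicts))

    for phi in sentences:
        # Accept phi if adding it keeps the probability strictly positive, as in (2.2);
        # otherwise the conjunction with its negation must be strictly positive.
        accept = prob(lambda w, phi=phi: decided_so_far(w) and phi(w)) > 0
        verdicts.append(1 if accept else 0)
    return verdicts

SENTENCES = [
    lambda w: w["p"],                   # stand-in for phi_1, which has P > 0
    lambda w: w["q"],
    lambda w: not w["p"],
    lambda w: w["q"] and not w["r"],
    lambda w: w["r"],
]
theory = build_theory(SENTENCES)
```

At every stage the conjunction of the verdicts reached so far retains strictly positive probability, which is the invariant recorded in equation (2.1) and the reason the resulting theory is consistent.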

This is important because both inceptive and amplificatory empiricism re-

quire that the conjunction of the eight axioms of Robinson’s Q be assigned a

non-zero probability. For instance, inceptive empiricism claims that the hypoth-

esis h consisting of the conjunction of the eight axioms of Robinson’s Q may be

confirmed by evidence e saying that various instances of these universal axioms

hold. However, if this hypothesis h is assigned probability zero, then one has

that P (h|e) − P (h) = 0 − 0 = 0, so that no confirmation occurs. Likewise, am-

plificatory empiricism claims that the hypothesis h consisting of a consequent of

an instance of mathematical induction may be confirmed, relative to background

knowledge k consisting of Robinson’s Q, by evidence e consisting of the antecedent.

However, if this background knowledge is assigned probability zero, then the quan-

tity P (h|e & k)−P (h|k) is not defined, so that confirmation cannot occur. Thus,

the probability assignments to which inceptive and amplificatory empiricism make

avail will inevitably assign a non-zero probability to the conjunction of the eight

axioms of Robinson’s Q.
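
The two cases just described can be made concrete with a short sketch. It assumes the usual ratio definition of conditional probability, P(a|b) = P(a & b)/P(b), which is left undefined when P(b) = 0; the function names and the numerical values are illustrative only.

```python
from fractions import Fraction

def conditional(p_joint, p_given):
    """P(a | b) = P(a & b) / P(b); returns None when P(b) = 0 (undefined)."""
    if p_given == 0:
        return None
    return p_joint / p_given

def confirmation(p_h_and_e, p_e, p_h):
    """Difference P(h | e) - P(h), or None when P(h | e) is undefined."""
    p_h_given_e = conditional(p_h_and_e, p_e)
    return None if p_h_given_e is None else p_h_given_e - p_h

# Hypothesis with probability zero: P(h & e) is also zero, so no confirmation occurs.
zero_prior = confirmation(Fraction(0), Fraction(1, 2), Fraction(0))

# Hypothesis with non-zero probability: evidence can raise it.
positive_prior = confirmation(Fraction(1, 4), Fraction(1, 2), Fraction(1, 4))

# Background knowledge with probability zero: P(h | k) is not defined.
undefined_background = conditional(Fraction(0), Fraction(0))
```

Returning `None` for the undefined case keeps the sketch from silently treating an undefined conditional probability as zero.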

Thus, these probability assignments will be non-computable, and this raises

the concern that the sources of justification to which these forms of empiricism

take recourse are simply not available to us. One might object to this concern

by suggesting that such an insistence upon computability is tantamount to an

appeal to ignorance, since all known natural examples of sets of natural numbers

are either computable or are non-computable because they compute the halting

set, i.e. they contain information about the behavior of all partial computable

functions.73 Likewise, one might question the role which computability could play

in undergirding these forms of empiricism in the first place, since it seems difficult

to see how one could recognize a computable function as such without some prior

grasp of the arithmetical axioms which these forms of empiricism seek to justify.74

In light of these concerns, it seems that the proper way to defend this insistence

on computability would be to suggest that the computability of those sources

of justification to which the appellations of computability and non-computability

apply is at best a prima facie requirement, a requirement whose satisfaction the

agent or subject in question need not necessarily be in a position to verify.

My suggestion is not to dispute this requirement of computability, but to

question whether the conformity of inceptive and amplificatory empiricism to this

requirement actually entails the computability of specific probability assignments.

For, inceptive and amplificatory empiricism appeal to probability assignments

and their associated confirmation relations as a source of justification, but it is

not clear that such an appeal needs to be conceived of as an appeal to a specific

probability assignment. To reiterate a suggestion which has been proffered in

other contexts, the relevant concepts of probability and confirmation might be

better represented by a large and perhaps even uncountable class of probability

assignments, as opposed to a single probability assignment.75 If this were the

case, then since the predicates of computable and non-computable do not apply

to such classes, the non-computability of the individual members of this class

would not violate the aforementioned requirement of computability, which only

insists on computability of those sources of justification to which the predicates

of computable and non-computable apply.

Before turning to a concern which one might have with this response to the

challenge of the non-computability of probability assignments, it is helpful to draw

an analogy between the situation of probability assignments and the situation of

complete consistent extensions of theories. The result of Tarski mentioned above

says that there is no complete computable consistent extension of Robinson’s Q.

However, it seems that there is no record of anyone ever suggesting that Tarski’s

result represents an epistemic barrier to Robinson’s Q and the type of knowledge

about the natural numbers that this axiomatization provides. Presumably, this

is so because no one ever thought that appeal to these axioms involved appeal to

a specific complete consistent extension. Rather, presumably the idea is that one

is appealing to these axioms themselves, along with whatever can be legitimately

deduced from them by means of standard rules of inference. Likewise, the idea

behind my response to the challenge of non-computability is to suggest that when

inceptive and amplificatory empiricism appeal to probability assignments, they

appeal not to particular probability assignments, but rather they appeal to a

potentially large class of such assignments whose members satisfy the axioms P1-

P3, as well as to whatever else can be legitimately inferred from these and other

characteristic properties of the class by means of accepted rules of deductive and

inductive inference.

One might object to this response by suggesting that while the predicates of

computable and non-computable do not apply to uncountable classes of proba-

bility assignments, there is nevertheless a related measure of complexity which

applies to such classes, so that while uncountable classes might be excepted from the

aforementioned requirement of computability, they need not be excepted from an

analogous requirement on the non-complexity of classes. In mathematical logic,

one standard measure of the complexity of classes of subsets of natural numbers is

given by the so-called arithmetical hierarchy, which measures how many alternations

of quantifiers over natural numbers are required to define the class.76 For instance,

the class of complete consistent extensions of Robinson’s Q can be defined

with only one universal quantifier over natural numbers, since consistency merely

requires that every proof not be a proof of contradiction, and since complete-

ness merely requires that every sentence in the language be included or excluded

from the theory. Hence, in analogue to the aforementioned requirement of com-

putability, one might require that those sources of justification which are naturally

identified with such classes have minimal complexity under this quantifier-based

measure of complexity.
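
One way to make the one-universal-quantifier claim concrete is sketched below. It is an illustration under stated assumptions rather than the definition used in the text: a Gödel numbering of sentences and of finite deductions is assumed to be fixed, and the names Sent, Ax_Q, Ded, neg, prem and concl are hypothetical labels for computable predicates and functions on codes. The class is here written as \mathcal{C}, and T ranges over sets of natural numbers.

```latex
% Hypothetical notation (not from the text):
%   Sent(n)  : n codes a sentence of the language of arithmetic
%   Ax_Q(n)  : n codes one of the axioms of Robinson's Q
%   Ded(n)   : n codes a finite deduction by the standard rules of inference
%   prem(n)  : the finite set of codes of premises of the deduction n
%   concl(n) : the code of the conclusion of the deduction n
%   neg(n)   : the code of the negation of the sentence coded by n
\[
T \in \mathcal{C} \iff \forall n \,
\left[
\begin{aligned}
 & (n \in T \to \mathrm{Sent}(n)) \;\wedge\; (\mathrm{Ax}_Q(n) \to n \in T) \\
 & \wedge\; \bigl(\mathrm{Sent}(n) \to (n \in T \leftrightarrow \mathrm{neg}(n) \notin T)\bigr) \\
 & \wedge\; \bigl((\mathrm{Ded}(n) \wedge \mathrm{prem}(n) \subseteq T) \to \mathrm{concl}(n) \in T\bigr)
\end{aligned}
\right]
\]
```

For each n, the bracketed condition asks only finitely many membership questions of T, so the whole definition uses a single universal quantifier over natural numbers. The third conjunct delivers completeness directly, and together with closure under deduction it also delivers consistency, since an inconsistent deductively closed T would contain both a sentence and its negation.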

This requirement of non-complexity of classes is of course only precise to the

extent that some minimum complexity level is antecedently specified, but it seems

that under any reasonable specification of such a minimum, the class of proba-

bility assignments will surely satisfy this requirement. For, under the second

representation of probability assignments as countable objects described at the

outset of this subsection, the class of representations of probability assignments

will be definable by one universal quantifier over natural numbers, just like the

class of complete consistent extensions of Robinson’s Q. Further, under the first

representation of probability assignments as countable objects described above,

the class of such representations will be definable by a universal quantifier fol-

lowed by an existential quantifier. Hence, it seems that the class of probability

assignments will surely satisfy any reasonable demands of non-complexity when

such complexity is measured in terms of the number of alternations of quantifiers

required to define the class.

There is a certain analogy between this reply to the concern about class com-

plexity and a point that Kevin Kelly has made in a number of places ([86–89]).

Part of the background to this point is that Kelly is in general critical of the

idea that scientific inference can be captured via probability assignments (cf. [88]

p. 96). His alternative suggestion is that one should view scientific hypotheses

as a conjunction of (i) a description of a class of possible observation sequences

and (ii) a predication that the actual sequence of observations will be among this

class. One of the virtues of this approach is that it allows for a means by which

to characterize the simplicity of hypotheses (and other virtues of hypotheses, such

as verifiability and refutability), namely, in terms of the quantifier-based measure

described above (cf. [89] § 3, [87] § 3). Likewise, another of the virtues of this

approach is that it accounts for how scientific hypotheses can be simple under this

measure and yet how the actual sequence of observations may be very complex

under the analogous measure of computability and non-computability (cf. [89]

§ 8, [87] § 4). While the entire idea of inceptive and amplificatory empiricism is

based on a conception of probability assignments whose value Kelly would dispute

in the arena of scientific inference, the point made in the last few paragraphs can

be viewed as a kind of meta-level analogue of Kelly’s idea. For, the idea has been

that these forms of empiricism involve a class of assignments which is simple under

the measure of complexity of classes, but whose individual members are highly

complex under the measure of computability and non-computability.

2.3 Challenges to Arithmetical Instance Confirmation

In the previous section, challenges common to both inceptive and amplificatory

empiricism were considered. These challenges suggested that for reasons related

to countable additivity and computability, the probability assignments and asso-

ciated confirmation relations to which these forms of empiricism make avail are

simply not available to us. In this section, challenges specific to arithmetical

instance confirmation are considered. Recall that arithmetical instance confirma-

tion is that type of confirmation in which a universal arithmetical hypothesis is

confirmed by evidence to the effect that several numbers satisfy the property in

question, and further recall that physical instance confirmation is defined analo-

gously in terms of physical hypotheses and physical objects. Of the two forms of

empiricism considered here, it is only inceptive empiricism that relies on arith-

metical instance confirmation, and so the challenges to be presently considered

are peculiar to inceptive empiricism.

The first of these challenges is due to Baker ([5]), who suggests that arith-

metical instance confirmation displays either unacceptable levels of unreliability

or insufficient levels of diversity of evidence. I argue in § 2.3.1 that arithmetical

instance confirmation is no more unreliable than physical instance confirmation,

and that arithmetical instance confirmation is not insufficiently diverse on sev-

eral extant analyses of evidential diversity. The second of these challenges is that

arithmetical instance confirmation is objectionable because it is unstable. As a

first approximation, an inference from evidence to a universal hypothesis is unsta-

ble if it may be materially bettered by the further evidence that various particular

objects satisfy this universal hypothesis. In response to the challenge from stabil-

ity, I argue in § 2.3.2 that while stability may be a virtue with regard to certain

forms of mathematical reasoning, such as geometrical reasoning, it is not a virtue

of arithmetical reasoning.

2.3.1 Baker and the Exigencies of Arithmetical Sampling

Baker’s thesis is that arithmetical instance confirmation is biased in a way in

which physical instance confirmation is not, and that this is due to the samplings

in arithmetical instance confirmation being small.77 However, there are at least

two natural senses in which such samplings may be said to be small, which I

call setwise-small and pointwise-small, and there are two relevant dimensions of

bias to be found in Baker’s essay, namely unreliability and insufficient diversity.78

Hence, one can distinguish between several different versions of Baker’s thesis,

depending on whether the relevant notion of smallness is setwise- or pointwise-

smallness, and depending on whether the relevant notion of bias is unreliability or

insufficient diversity. Hence, subsequent to defining these two notions of smallness,

I turn to versions of Baker’s thesis centered around setwise-smallness, and then

to versions of Baker’s thesis centered around pointwise-smallness.

These two notions of setwise-smallness and pointwise-smallness apply to finite

sets of natural numbers, and they both are defined in terms of a third notion of

smallness which applies to individual natural numbers. For instance, 100 is a small

natural number, but 100^{1000} is not a small natural number, and if one natural

number is small and another number is less than it, then that second natural

number is small as well.79 A finite set of natural numbers can then be said to be

pointwise-small if each of its elements is a small natural number, while a finite

set of natural numbers can be said to be setwise-small if its cardinality is a small

natural number. Hence, with the exception of the set of all small natural numbers,

any pointwise-small set is itself setwise-small.80 However, the converse is not in

general true. For instance, given the 3-element sets X = {200^{1000}, 300^{1000}, 400^{1000}}

and Y = {2, 3, 4}, it follows that Y is both setwise-small and pointwise-small,

whereas X is setwise-small but not pointwise-small.
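
The two definitions, and the single exception noted above, can be written out compactly. In the sketch below the symbol S, for the set of small natural numbers, and the symbol m, for its largest element, are introduced only for illustration and do not appear in the text; the sketch assumes that S is non-empty, downward closed, and bounded (it contains 100 but not 100^{1000}), so that S is an initial segment of the natural numbers.

```latex
% Assumption: S \subseteq \mathbb{N} is the (hypothetical) set of small
% natural numbers. Being non-empty, downward closed and bounded, it has
% the form S = \{0, 1, \dots, m\} for some largest small number m.
\begin{align*}
X \text{ is pointwise-small} &\iff X \subseteq S, \\
X \text{ is setwise-small}   &\iff |X| \in S. \\[4pt]
% A proper subset of S has at most m elements, and m is small:
X \subsetneq S &\;\Longrightarrow\; |X| \le m \;\Longrightarrow\; |X| \in S, \\
% S itself has m + 1 elements, and m + 1 is not small:
X = S          &\;\Longrightarrow\; |X| = m + 1 \notin S.
\end{align*}
```

On these assumptions every pointwise-small set other than S itself is setwise-small, while a set such as X = {200^{1000}, 300^{1000}, 400^{1000}} has |X| = 3 ∈ S without being a subset of S.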

One version of Baker’s thesis would thus contend that arithmetical instance

confirmation is biased in a way in which physical instance confirmation is not,

and that this is due to the samplings in arithmetical instance confirmation being

setwise-small. It is presumably indisputable that arithmetical instance confirma-

tion is in fact based on setwise-small samplings of natural numbers. For, even

with the aid of computers, one can only look at so many natural numbers, and in

comparison with the set of all natural numbers, the cardinality of such samplings

will inevitably appear diminutive. However, presumably physical instance con-

firmation relies on setwise-small samplings in exactly the same manner: indeed,

the same sorts of constraints that prevent us from doing innumerable calcula-

tions also prevent us from taking innumerable measurements. Hence, regardless

of whether bias is understood in terms of unreliability or insufficient diversity,

a difference between the levels of bias in arithmetical instance confirmation and

physical instance confirmation cannot be attributable to a difference in the man-

ner in which they rely upon setwise-small samplings, simply because they so rely

on setwise-small samplings in exactly the same way. Hence, versions of Baker’s

thesis centered around setwise-smallness seem plainly untenable.

Thus, it seems that Baker’s thesis might be more profitably understood in

terms of pointwise-smallness. Here it is helpful to explicitly note by way of score-

keeping that versions of Baker’s thesis centered around setwise-smallness have im-

plications for versions of Baker’s thesis centered around pointwise-smallness, for

the simple reason that Baker’s thesis is a thesis about the implications of small

samplings, and as noted above, virtually all pointwise-small samples are setwise-

small samples.81 However, the converse does not hold (i.e. most setwise-small

samples are not pointwise-small samples), and hence versions of Baker’s thesis

centered around pointwise-smallness do not automatically have implications for

versions of Baker’s thesis centered around setwise-smallness. Hence, one cannot

infer directly from the untenability of the latter to the untenability of the for-

mer, and so it is thus necessary to consider separately versions of Baker’s thesis

centered around pointwise-smallness.

So let us first consider the version of Baker’s thesis centered around pointwise-

smallness and unreliability. This version of the thesis holds that arithmetical

instance confirmation is unreliable in a way that physical instance confirmation is

not, and that this is due to the samplings in arithmetical instance confirmation be-

ing pointwise-small. Here, the unreliability of instance confirmation is understood

in a standard manner, so that a relative increase in unreliability is concomitant

with a relative increase in the number of false universal hypotheses which are con-

firmed by several true instances. Now, it seems hard to dispute that samplings

in arithmetical instance confirmation are drawn exclusively from pointwise-small

samples: for, given constraints of time and space, even the best computers can

only calculate with numbers of so large a size, and mathematicians likewise face

similar sampling constraints and limitations.

Hence, it seems that the only contentious point in this version of Baker’s thesis

is the claim that such a reliance upon pointwise-small samples renders arithmetical

instance confirmation unreliable in a way in which physical instance confirmation

is not. However, there is an obvious analogue of this reliance upon pointwise-small

sampling in the case of physical instance confirmation, an analogue suggested by

Baker himself.82 In particular, say that a sampling of physical data is timewise-

small if each data point in the sampling was measured (or otherwise observed)

at a point in time that is relatively close to the present. Just as it seems in-

disputable that samplings of natural numbers are pointwise-small, so it seems

indisputable that samplings of physical data are timewise-small. However, it is

generally conceded that physical instance confirmation is sufficiently reliable.83

Hence, if the inference from the dependence of physical instance confirmation on

timewise-small samplings to the unreliability of physical instance confirmation is

rejected, but at the same time the inference from the dependence of arithmetical

instance confirmation on pointwise-small samplings to the unreliability of arith-

metical instance confirmation is accepted, then one should be able to point out

some relevant difference between the two cases.

Baker suggests that the relevant difference between the two consists in the fact

that “there are no [. . . ] systematic differences between the past and the future

[. . . ].”84 It may indeed be the case that many of the properties that interest

scientists are in fact temporally invariant in this sense, so that what is true of

timewise-small samplings will likewise be true in general. However, it is also the

case that many of the properties that interest mathematicians are such that what

is true of pointwise-small samplings is likewise true in general. For instance, this is

the case with respect to the properties which feature in the instance confirmation

of the axioms of Robinson’s Q. So if there is to be a disanalogy between the

setting of arithmetic and the setting of the physical sciences here, it has to be

with regard to something deeper than the fact that many of the properties which

interest scientists (resp. mathematicians) are projectable from timewise-small

samplings (resp. pointwise-small samplings).

Hence, one might try to suggest that the relevant difference between arith-

metical and physical instance confirmation consists in the fact that all physical

properties are projectable from timewise-small samplings, whereas not all arith-

metical properties are projectable from pointwise-small samplings. But if the key

term of physical properties is understood as a naturalistic term that simply picks

out spatio-temporal properties describing portions of the external world, then it is

simply false that all physical properties are so projectable. Indeed, were this the

case, then knowledge of the future would be much easier to come by than it actu-

ally is. Likewise, if the key term of physical properties is understood historically,

as picking out those properties that have interested certain intellectual commu-

nities, then again it is false that all these properties are temporally-projectable:

for, were this the case, then science would be endowed with a kind of infallibility

which is definitively vitiated by the historical record.

It might then be suggested that the important difference between arithmetical

instance confirmation and physical instance confirmation is the success of the

extant practice: most of the physical properties picked out by the community of

scientists have in fact turned out to be temporally projectable, whereas there is no

similar track record of success of mathematicians projecting from the pointwise-

small. So part of the idea here would be that natural scientists somehow learned

to discern the properties of physical objects that do not depend on their temporal

location, whereas mathematicians have yet to learn to discern the properties of

numbers that do not depend on their location in the ordering of greater-than and

less-than on the natural numbers. I am willing to grant all this for the sake of

argument: however, what I want to emphasize is that this does not establish an

entailment from the reliance upon pointwise-small samplings to the unreliability

of arithmetical instance confirmation, any more than pointing to the failures of

pre-scientific communities to project from the temporally-small would establish

that there is an entailment from the reliance upon temporally-small samplings to

the unreliability of physical instance confirmation.

Hence, it seems that no relevant difference has been identified which would justify

us in simultaneously accepting the inference from the reliance upon pointwise-

small samplings to the unreliability of arithmetical instance confirmation while

rejecting the inference from the reliance upon temporally-small samplings to the

unreliability of physical instance confirmation. However, it is important to em-

phasize that one could legitimately reject the need to identify such a relevant

difference. For instance, one might legitimately reject such a need by producing a

valid premise-conclusion argument (with plausible premises) for the inference from

the reliance upon pointwise-small samplings to the unreliability of arithmetical in-

stance confirmation. However, neither Baker nor anyone else that I know of has

produced such an argument. Absent such an argument for this inference, it does

not seem to be unreasonable philosophical methodology to withhold assent from

this inference until some relevant difference between it and a clearly dubious albeit

similar inference has been identified.

However, perhaps the situation is different with regard to the version of Baker’s

thesis centered around pointwise-small samplings and insufficient diversity. This

version claims that arithmetical instance confirmation is insufficiently diverse in

a way in which physical instance confirmation is not, and that this is due to the

samplings in arithmetical instance confirmation being pointwise-small. In what

follows, I will grant for the sake of argument that, all other things being equal,

sufficiently diverse evidence better confirms a hypothesis than insufficiently diverse

evidence, and I will focus instead on ascertaining whether, according to various

analyses of diversity, it is in fact the case that arithmetical instance confirmation

is insufficiently diverse due to its reliance upon pointwise-small samplings. My

conclusion is that there is no extant analysis of evidential diversity on which

arithmetical instance confirmation is insufficiently diverse due to its reliance upon

pointwise-small samplings.

On one analysis, evidential diversity is tied to probabilistic independence, in

that evidence e_1 and e_2 for hypothesis h is said to be diverse to the extent that

the two quantities P(e_1 & e_2) and P(e_1) · P(e_2) are close to one another (either in

terms of their quotient being close to one, or their difference being close to zero).85

Hence, on this analysis, the version of Baker’s thesis presently under considera-

tion would predict that the reliance of arithmetical instance confirmation upon

pointwise-small samplings would result in evidence that displayed an insufficient

level of probabilistic independence. However, there are some examples that do not

accord with this prediction. For instance, consider a case which Baker himself ex-

amines, namely the Goldbach conjecture. This is a universal statement, and so can

be written as h ≡ ∀ x H(x). Hence, in arithmetical instance confirmation which

relies upon pointwise-small samples, the evidence for this hypothesis h would be

of the form e_1, ..., e_N, where N is a small natural number, where e_n ≡ H(s^n(0)),

and where e.g. s^2(0) = s(s(0)) denotes the successor of the successor of zero.

Further, suppose (as is natural in this setting) that the conjunction of the eight

axioms of Robinson’s Q is assigned a high probability, and suppose further that

(as expected) the Goldbach conjecture is true. Then since Robinson’s Q proves

the correctness of the addition and multiplication tables,86 and since e_n can be

written with only bounded quantifiers, it follows that the e_n as well as any finite

conjunction of the e_n will likewise be assigned a high probability. But then the

quantities P(e_n & e_m) and P(e_n) · P(e_m) will be close to one (and hence their

quotient will be close to one and their difference will be close to zero), so that

on this analysis the evidence will be diverse. Hence, if evidential diversity is ana-

lyzed in terms of an approximation to probabilistic independence, it is simply false

that arithmetical instance confirmation is insufficiently diverse due to its reliance

upon pointwise-small samplings: for, this example is an example where the sam-

plings are pointwise-small and yet the evidence is close to being probabilistically

independent.87
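
The step from "each instance has high probability" to "the evidence is nearly independent" can be made quantitative. The following sketch assumes, purely for illustration, a tolerance ε with 0 ≤ ε < 1/2 such that each of P(e_n) and P(e_m) is at least 1 − ε; the symbol ε is not in the text.

```latex
% Assumption: P(e_n) \ge 1 - \varepsilon and P(e_m) \ge 1 - \varepsilon,
% with 0 \le \varepsilon < 1/2 (a hypothetical tolerance).
\begin{align*}
P(e_n \mathbin{\&} e_m) &\ge 1 - P(\neg e_n) - P(\neg e_m) \ge 1 - 2\varepsilon, \\
P(e_n) \cdot P(e_m) &\ge (1 - \varepsilon)^2 = 1 - 2\varepsilon + \varepsilon^2 \ge 1 - 2\varepsilon, \\
% Both quantities therefore lie in the interval [1 - 2\varepsilon, 1]:
\bigl| P(e_n \mathbin{\&} e_m) - P(e_n) \cdot P(e_m) \bigr| &\le 2\varepsilon, \\
1 - 2\varepsilon \;\le\; \frac{P(e_n \mathbin{\&} e_m)}{P(e_n) \cdot P(e_m)} &\le \frac{1}{(1 - \varepsilon)^2}.
\end{align*}
```

As ε tends to zero, the difference tends to zero and the quotient tends to one, which is the sense of closeness invoked by this analysis of diversity.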

Another analysis connects the diversity of evidence to low likelihood with re-

spect to a pool of competing plausible hypotheses, in the following sense. Suppose

that there is evidence which confirms each of a number of competing, plausible

hypotheses. Further suppose that the hypotheses are competitors in that they

are mutually exclusive (no two can both be true) and jointly exhaustive (one

must be true), and suppose that they are plausible in that they all have some

antecedently-specified non-trivial level of prior probability. Then on this analysis,

the evidence e is said to be diverse to the extent that its likelihood P(e|h) is low

with respect to a large number of these hypotheses h.88 The guiding intuition here

seems to be that diverse evidence is evidence whose complexity shields it from be-

ing rendered likely by the majority of these competing plausible hypotheses: it

is diverse not because of some feature intrinsic to the evidence itself, but rather

because it is unexpected from the vantage point of most of these hypotheses.89

However, on this analysis, it does not seem that reliance upon pointwise-small

samplings in arithmetical instance confirmation entails insufficient levels of diver-

sity. For, in arithmetical instance confirmation, the pool of plausible competing

hypotheses would presumably be restricted to a universal arithmetical hypothe-

sis h ≡ ∀ x H(x) and its negation ¬h, and the evidence in this case would be of

the form e_X = ⋀_{n∈X} H(s^n(0)), where X ranges over setwise-small sets of natural

numbers. Further, on this analysis, evidence e_X will be diverse to the extent that

the likelihood P(e_X | ¬h) is low (since the likelihood P(e_X | h) is always equal to

one). Hence, to contend that reliance upon pointwise-small samplings results in

insufficient diversity would be to contend here that P(e_X | ¬h) is not sufficiently

low when X is pointwise-small.
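
The role played by the likelihood under ¬h can be seen from a short calculation. The sketch assumes, beyond what the passage states, that 0 < P(h) < 1, that conditional probability is given by the ratio definition, and that the probability assignment respects the fact that h logically entails each of its instances, so that P(e_X & h) = P(h).

```latex
% Assumptions: 0 < P(h) < 1, ratio definition of conditional probability,
% and P(e_X \mathbin{\&} h) = P(h) because h entails e_X.
\begin{align*}
P(e_X \mid h) &= \frac{P(e_X \mathbin{\&} h)}{P(h)} = \frac{P(h)}{P(h)} = 1, \\[4pt]
% Bayes' theorem with the two-element pool \{h, \neg h\}:
P(h \mid e_X) &= \frac{P(h)}{P(h) + P(e_X \mid \neg h)\, P(\neg h)}.
\end{align*}
```

With the prior held fixed, the posterior P(h | e_X) increases as P(e_X | ¬h) decreases, so on this analysis the whole question of diversity turns on how low the likelihood of the evidence is under the negation of the hypothesis.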

However, it seems difficult to see how the pointwise-smallness of X is supposed

to impact the quantity P(e_X | ¬h) one way or another. For instance, suppose that

one evaluates the quantity P(e′|h′) in terms of the level of assent which one would

give to e′ were one to assent entirely to h′,90 and suppose that the sets under consideration

are three-element sets of natural numbers, and consider the pointwise-small

set Y = {2, 3, 4} and the non-pointwise-small set X = {200^{1000}, 300^{1000}, 400^{1000}}.

Were I to assent to ¬h without having an explicit counterexample in mind, it does

not seem obvious that I would be more or less reticent to assent to e_X than e_Y.91

This then suggests that the two quantities P(e_X | ¬h) and P(e_Y | ¬h) need not be far

apart from one another. Hence, it does not seem evident that reliance upon pointwise-

small samplings entails insufficient levels of diversity of evidence, at least on this

analysis of diversity of evidence.

A third and final analysis of the diversity of evidence suggests the diversity of

evidence requires that “as many hypotheses as possible are tested [by the evidence]

in as many different ways as possible.”92 The background to this analysis is the

idea that what is confirmed are not individual hypotheses taken one by one, but

rather a theory consisting of several auxiliary and several primary hypotheses,

which differ from one another in that the auxiliary hypotheses are themselves

deployed in the confirmation of the primary hypotheses. Given this dependence,

the thought is that it is prudent to develop theories where any given primary

hypothesis is confirmed by several different auxiliary hypotheses. Such theories

will thus only be confirmed by evidence that tests all of these different auxiliary

hypotheses, and that hence tests the primary hypotheses in multiple different

ways.

This analysis of the diversity of evidence was thus obviously designed with

physical theories in mind, but there is nothing in principle that precludes its

application to the setting of arithmetic. For instance, one might view Robinson’s Q

and the Goldbach conjecture as the primary hypotheses and various instances of

mathematical induction as the auxiliary hypotheses. However, on this analysis

of the diversity of evidence, it seems that the diversity of a body of evidence for

this arithmetical theory is ultimately orthogonal to whether or not the evidence

is pointwise-small. For, it does not seem that there is anything connecting the

pointwise-smallness of the instances which are evaluated and whether or not these

instances collectively confirm as many hypotheses in the theory in as many ways

as possible. To be sure, the setwise-smallness of a sampling might be a roadblock

to diversity in this sense, since setwise-small samplings simply have fewer items of

evidence to work with, but there is no obvious roadblock to diversity in the case

of pointwise-small samples. Hence, on this analysis too, it seems that there is no

obvious entailment between pointwise-smallness of the sampling in arithmetical

instance confirmation and insufficient levels of the diversity of evidence.

2.3.2 Stable and Unstable Reasoning in Geometry and Arithmetic

Whereas Baker’s concern was that arithmetical instance confirmation lacked

the virtues of reliability or diversity of evidence, the concern to be considered

now is that arithmetical instance confirmation lacks the virtue of stability. Since

the concept of stability is less conventional than that of reliability, it is necessary

to first set out some preliminaries. Stability pertains to mathematical reasoning

about domains of mathematical objects, the canonical examples being arithmeti-

cal reasoning about the natural numbers and geometrical reasoning about the

Euclidean plane. However, stability does not presuppose that we are in posses-

sion of a means by which to uniquely describe this domain: stable reasoning is

about a domain in the sense that it is defined relative to a domain, and not in

the sense that it characterizes it. In this, it is similar to the verisimilitude of per-

ception: this is a relation between perception and the extra-perceptual, but this

relation can obtain without the extra-perceptual being perceptually specifiable.

The notion of stability applies to reasoning about these domains, and such rea-

soning may be represented in terms of inferences, which are triples of (i) premises

from which one infers, (ii) a conclusion to which one infers, and (iii) a specifica-

tion of the mode in which the premises are said to bear on the conclusion, e.g.

whether they confirm the conclusion, whether they constitute a proof of the con-

clusion, etc. In the literature on probability whose vocabulary has been employed

in this chapter, the premises are called the evidence and the conclusion is called the

hypothesis, and in what follows, I will continue to use this terminology. Another

feature of the terminology which I employ that merits explicit mention is that

one and the same pair (e, h) of evidence and hypothesis might be associated with

two different inferences. For, the inferences may differ in that one asserts that

the evidence confirms the hypothesis, while the other asserts that the evidence

constitutes a proof of the hypothesis. Hence, in this terminology, an inference is

completely specified only when one specifies its evidence, its hypothesis, and its

mode (e.g. confirmation, proof).

Finally, the concept of stability presupposes that with respect to each mode

of inference (confirmation, proof) and each antecedently fixed hypothesis, there is

an associated strict partial order on inferences from evidence to this hypothesis,

namely, the strict partial order of inference from evidence e_1 to hypothesis h being

of strictly superior quality to inference from evidence e_2 to hypothesis h. If the

quality of evidence e relative to hypothesis h is denoted by q_h(e), then this strict

partial order can be written as q_h(e_1) >_h q_h(e_2), and it will be assumed that it

satisfies the following laws:

Transitivity: q_h(e_1) >_h q_h(e_2) and q_h(e_2) >_h q_h(e_3) implies q_h(e_1) >_h q_h(e_3).

Non-reflexivity: q_h(e_1) ≯_h q_h(e_1).

For instance, with respect to confirmation, this strict partial order might be that

of degree of confirmation, so that one would have q_h(e_1) >_h q_h(e_2) if and only

if P(h|e_1 & k) − P(h|k) > P(h|e_2 & k) − P(h|k), where k is the background

knowledge. That is, this order reflects the magnitude to which the probability of

the hypothesis given the evidence and background knowledge exceeds the prior

probability of the hypothesis given the background knowledge.93 Likewise, with

respect to proof, this strict partial order of quality might reflect the degree to

which the proof (i) appeals only to generally accepted axioms, (ii) employs only

standard rules of inference, (iii) and does not appeal to the hypothesis itself.

Hence, if one has doubts about the extent to which each of these modes of inference

(e.g. confirmation, proof) comes equipped with a strict partial order of quality

which facilitates comparisons of evidence for a given hypothesis, then one will

have legitimate concerns about the cogency of the notion of stability.

These preliminaries in place, it is now possible to define stability. Suppose

that D is a mathematical domain (e.g. the natural numbers, the Euclidean plane).

Then relative to this domain, the instability of an inference can be defined as

follows:


The inference from evidence e to universal hypothesis h ≡ ∀ x (Dx → Hx) is unstable if there are a finite number of objects d1, . . . , dn in the domain D such that the conjunctive evidence $e \mathbin{\&} \bigwedge_{i=1}^{n} H(d_i)$ for hypothesis h is strictly superior to the evidence e for hypothesis h.

In this definition, it is presupposed that the mode of the inference from the

conjunctive evidence $e \mathbin{\&} \bigwedge_{i=1}^{n} H(d_i)$ to the hypothesis h is the same as the mode of

the inference from the prior evidence e to the hypothesis h, e.g. if the latter asserts

a relation of confirmation (resp. proof) then the former also asserts a relation of

confirmation (resp. proof). Finally, an inference from evidence to a universal

hypothesis is said to be stable if it is not unstable. A body of mathematical rea-

soning about an antecedently fixed domain is said to be unstable if some inference

from evidence to universal hypotheses present in this reasoning is unstable, and

it is said to be stable otherwise. Hence, in stable reasoning, an inference from

evidence to a universal hypothesis cannot be strictly bettered by the inclusion

of additional evidence to the effect that some finite number of particular objects

from the domain satisfy this hypothesis.
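The definition just given can be restated compactly in symbols. The following is only a restatement in the notation qh and >h introduced above; the predicate names Unstable and Stable are labels chosen here for illustration, and the mode of inference is assumed to be held fixed throughout.

```latex
\[
\mathrm{Unstable}(e,h) \iff
\exists n \geq 1\;\, \exists d_1,\ldots,d_n \in D:\;
q_h\Bigl(e \mathbin{\&} \bigwedge_{i=1}^{n} H(d_i)\Bigr) >_h q_h(e),
\qquad\text{where } h \equiv \forall x\,(Dx \rightarrow Hx),
\]
\[
\mathrm{Stable}(e,h) \iff \neg\,\mathrm{Unstable}(e,h).
\]
```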

Some canonical forms of reasoning in mathematics are stable. In particular,

certain canonical forms of geometrical reasoning are stable, and this is due to the

fact that certain canonical geometrical structures display a high level of

indiscernibility. Suppose that M is a structure and that D ⊆ M^n is a set (not necessarily

definable) and A ⊆ M is a set of parameters (not necessarily definable). Then D

is said to be A-indiscernible if any two elements a, b of D are such that they satisfy

the same first-order A-definable properties.94 It is not difficult to see that stability

is implied by indiscernibility, or more specifically, by known indiscernibility. For,

suppose that D is A-indiscernible, and suppose that this is part of the background

knowledge. Suppose further that evidence e is taken to constitute a proof of the

universal hypothesis h ≡ ∀ x (Dx → Hx), where H is an A-definable property.


Then the inference from e to h is stable. For, suppose that d1, . . . , dn are tuples

from D. It must then be asked whether conjunctive evidence $e \mathbin{\&} \bigwedge_{i=1}^{n} H(d_i)$ for

hypothesis h is strictly superior to the prior evidence e for hypothesis h. However,

there is a clear sense in which this conjunctive evidence is not strictly superior to

the prior evidence. For, what the conjunctive evidence adds to the prior evidence

is additional evidence of the form H(di), and it is known from indiscernibility that

these are each equivalent to the hypothesis h ≡ ∀ x (Dx→ Hx). Since adding ev-

idence which is obviously equivalent to the hypothesis does not improve or better

the inference from the evidence to the hypothesis, this inference is stable. This is

why stability is implied by known indiscernibility.

One example of a canonical geometric structure which displays high levels

of indiscernibility is the standard presentation of the Euclidean plane as a

two-sorted structure P ⊔ L with a set of points P and a set of lines L, where the

only non-logical relation is the incidence relation of a point lying on a line. This

is a presentation of the Euclidean plane to which many of Euclid’s axioms are

readily applicable. It turns out that in this structure, the set of points P is

an ∅-indiscernible set: that is, any two points satisfy the same parameter-free

first-order formulas. This is due to the fact that the only non-logical relation in

the structure is the incidence relation P ∈ ℓ of a point P lying on a line ℓ. This

relation is preserved under any permutation π : P → P , in that P ∈ ℓ if and only

if π(P ) ∈ π(ℓ) ≡ {π(Q) : Q ∈ ℓ}. Here a permutation π : P → P is simply a map

such that no two distinct points are sent to the same point under this map and

such that every point is a point to which some point is sent. Further, such

a permutation π : P → P is said to be line-preserving if π(ℓ) ≡ {π(Q) : Q ∈ ℓ}

is itself a line for any line ℓ. The reason that P is ∅-indiscernible is that for


any two points P,Q there is a line-preserving permutation π : P → P such

that π(P ) = Q. For, since such a permutation preserves the incidence relation

and sends lines to lines, it induces an automorphism of the structure P ⊔ L, and

any two automorphic elements of any structure are ∅-indiscernible. Hence, the

relevant feature of geometrical reasoning which is generative of indiscernibility in

this presentation of the Euclidean plane is the fact that its primitive non-logical

relation (e.g. incidence) is invariant under permutations.95, 96
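The homogeneity argument above can be illustrated on a small finite stand-in. The Euclidean plane itself is infinite and cannot be enumerated, so the sketch below assumes, purely for illustration, the affine plane over the integers modulo 3 (nine points, twelve lines). It checks that for any two points there is a translation sending one to the other which permutes the points and sends lines to lines. All names (P_MOD, line_through, translation, homogeneous) are hypothetical and chosen here, not taken from the text.

```python
from itertools import product

# Assumption: a toy finite analogue of the plane, with coordinates mod 3.
P_MOD = 3

# The point sort: all pairs (x, y) with coordinates in {0, 1, 2}.
points = list(product(range(P_MOD), repeat=2))


def line_through(a, b):
    """Return the line through distinct points a and b as a set of points."""
    dx, dy = (b[0] - a[0]) % P_MOD, (b[1] - a[1]) % P_MOD
    return frozenset(
        ((a[0] + t * dx) % P_MOD, (a[1] + t * dy) % P_MOD)
        for t in range(P_MOD)
    )


# The line sort: every line determined by a pair of distinct points.
lines = {line_through(a, b) for a in points for b in points if a != b}


def translation(src, dst):
    """Return the translation of the plane that sends src to dst."""
    vx, vy = (dst[0] - src[0]) % P_MOD, (dst[1] - src[1]) % P_MOD
    return lambda p: ((p[0] + vx) % P_MOD, (p[1] + vy) % P_MOD)


def is_line_preserving_permutation(pi):
    """Check that pi permutes the points and maps each line onto a line."""
    bijective = {pi(p) for p in points} == set(points)
    lines_to_lines = all(
        frozenset(pi(q) for q in line) in lines for line in lines
    )
    return bijective and lines_to_lines


def homogeneous():
    """Check that any point can be moved to any other by such a map."""
    return all(
        translation(a, b)(a) == b
        and is_line_preserving_permutation(translation(a, b))
        for a in points
        for b in points
    )
```

Calling homogeneous() on this toy plane checks the finite counterpart of the claim that any two points are related by a line-preserving permutation. Nothing here bears on the infinite Euclidean case beyond illustrating the shape of the argument.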

However, while this certain canonical type of geometrical reasoning is sta-

ble, it is easy to see that arithmetical instance confirmation is not stable. For

the sake of simplicity, let us suppose that the arithmetical instance confirma-

tion in question is such that the universal hypothesis h ≡ ∀ x (Dx → Hx)

is confirmed by evidence of the form $e_n \equiv \bigwedge_{i=1}^{n} H(s^i(0))$ for some particular

value of n, say m, against the background knowledge k. If there is a natural

number M > m such that 0 < P (eM & k) < P (em & k) < 1, then the evi-

dence em for h is unstable. For, under these circumstances, it follows from P1-

P3 that P (h|eM & k) − P (h|k) > P (h|em & k) − P (h|k), that is, the degree

to which eM confirms the universal hypothesis h is greater than the degree to

which em confirms this universal hypothesis.97 Further, it seems that there are

several plausible scenarios in which there would be an M > m such that the

probability of eM is less than the prior probability of em. For instance, this would

happen if probabilities are associated with degrees of confidence, and this would

likewise happen if some of the evidence displayed independence. Hence, to the

extent that these scenarios are plausible, it follows that arithmetical instance con-

firmation need not be stable.
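The appeal to P1-P3 in this paragraph can be unpacked as follows. This is a sketch of one way the manipulation might go, not a reproduction of the accompanying footnote. It makes explicit two assumptions that the passage leaves tacit: that h together with k logically implies each e_n (so that h & e_n & k is equivalent to h & k), and that P(h & k) > 0.

```latex
\begin{align*}
P(h \mid e_n \,\&\, k)
  &= \frac{P(h \,\&\, e_n \,\&\, k)}{P(e_n \,\&\, k)}
   = \frac{P(h \,\&\, k)}{P(e_n \,\&\, k)}
  && \text{(since } h \,\&\, k \models e_n\text{)} \\[4pt]
0 < P(e_M \,\&\, k) < P(e_m \,\&\, k)
  &\;\Longrightarrow\;
  \frac{P(h \,\&\, k)}{P(e_M \,\&\, k)} > \frac{P(h \,\&\, k)}{P(e_m \,\&\, k)}
  && \text{(assuming } P(h \,\&\, k) > 0\text{)} \\[4pt]
  &\;\Longrightarrow\;
  P(h \mid e_M \,\&\, k) - P(h \mid k) \;>\; P(h \mid e_m \,\&\, k) - P(h \mid k).
\end{align*}
```

If P(h & k) = 0, both sides of the final inequality are equal, so the strictness of the comparison depends on the hypothesis having non-zero prior probability.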

The instability of arithmetical instance confirmation is a problem because,


other things being equal, it seems that stable mathematical reasoning has certain

advantages over unstable mathematical reasoning. For, if one is operating in a

context where all the reasoning in which one engages is stable, then one is given

a kind of license to prescind from the examination of particulars in establish-

ing universal hypotheses. For, stability guarantees that evidence for a universal

hypothesis cannot be bettered by examination of particular cases. Since the ex-

amination of particular cases often requires a non-trivial expenditure of resources,

foreknowledge of stability frees one from such expenditures, and allows one to

focus resources elsewhere. By contrast, consider what happens in the case of un-

stable reasoning about an infinite domain. On the one hand, instability says that

there is a finite set of objects such that the evidence of their satisfying a universal

hypothesis materially improves the prior evidence for the universal hypothesis. On

the other hand, the infinitude of the domain does not ensure that one will succeed

in finding such a finite set (or ensure that one will succeed in finding such a set

and recognizing it as such). In stable reasoning about infinite domains, such a

dilemma simply cannot occur, and freedom from such dilemmas is further indica-

tion of the chief good delivered by stable reasoning, namely, the freedom from the

examination of particular cases in establishing universal hypotheses. Hence, be-

cause it aids in delivering this chief good, stable reasoning enjoys a ceteris paribus

advantage over unstable reasoning.

However, there are at least three potential objections to this argument that

stable reasoning is ceteris paribus better than unstable reasoning. The first ob-

jection is that this argument only establishes that stable reasoning has certain

pragmatic non-epistemic advantages, since considerations of the expenditure of

resources are at best pragmatic considerations, not ultimately bearing upon any


genuine epistemic notion, like that of justification. I am willing to grant for the

sake of argument that considerations of the expenditure of resources are ulti-

mately pragmatic in nature.98 However, I would suggest that such pragmatic

considerations are ultimately constitutive of mathematical reasoning as such. In

mathematics one reasons about infinite domains with the aid of theories whose de-

ductive relations are often non-computable, and for one to meet with any success

in establishing universal hypotheses about these domains, one must ultimately de-

velop strategies for limiting expenditures of resources. The argument given in the

above paragraph shows that stable reasoning can avoid one particularly resource-

consuming activity, namely, the consideration of particular cases in establishing

universal hypotheses.

The second objection arises from the observation that while stable reasoning

may avoid the consideration of particular cases in establishing universal hypothe-

ses, obviously neither it nor any other extant available form of reasoning can avoid

the consideration of particular cases in establishing existential hypotheses. Thus

the objection would be that the aforementioned suggested advantage of stable over

unstable reasoning is illusory since one cannot know ahead of time whether to go

about seeking to establish a universal hypothesis or its negation. It is of course

true that one cannot know ahead of time whether to go about seeking to establish

a universal hypothesis or its negation. However, in the normal course of events,

one will alternate between (i) expending resources in attempting to establish a

universal hypothesis and (ii) expending resources in attempting to establish its

negation. My suggestion is merely that stable reasoning is ceteris paribus prefer-

able to unstable reasoning simply because it enjoys an advantage with respect

to (i), albeit not with respect to (ii). That is, it is preferable because it exacts a


savings in the expenditure of resources in regard to one important component of

mathematical activity.

Finally, one might voice not an objection per se but rather a lingering concern

that the argument for the ceteris paribus preferability of stable reasoning is ill-

motivated, in that it does not have an obvious historical precedent. However, it

seems that there are extant arguments in the philosophy of mathematics for con-

clusions to the effect that unstable reasoning is objectionable. For instance, in his

commentary on Euclid, Proclus tells us that “[. . . ] a universal premise is better for

demonstration than a particular [. . . ]” and “[. . . ] demonstrations from universals

are more truly demonstrative [. . . ]” ([125] p. 14, italics added). Likewise, in his

book on mathematical knowledge, Kitcher asks: “How, for example, do I have the

right to conclude, on inspecting a scalene triangle, that the sum of the lengths of

two sides of a triangle is greater than the length of the third side but not that

all triangles are scalene?” ([91] p. 51, italics added). These concerns of Proclus

and Kitcher seem to articulate what I regard as a not uncommon view in the phi-

losophy of mathematics, namely that evidence of several particulars satisfying a

universal mathematical hypothesis cannot materially better our evidence for that

universal hypothesis itself. By definition, such improvements cannot take place

in stable reasoning, and hence the argument given above for the ceteris paribus

preferability of stable reasoning in mathematics is simply an attempt to articulate

an admittedly new argument for this not uncommon view.

Hence, thus far three things have been argued for in this section: (i) that one

important type of mathematical reasoning– namely, a canonical sort of geometrical

reasoning– is stable, (ii) that arithmetical instance confirmation is not

necessarily stable, and (iii) that stable mathematical reasoning is ceteris paribus


preferable to unstable mathematical reasoning. Hence, this suggests the following

challenge: arithmetical instance confirmation is objectionable not because it fails

to provide a measure of justification, but rather because such reasoning fails to

display an important virtue of mathematical reasoning, namely, stability.

My response to this challenge is to suggest that while stability may be a

virtue of geometric reasoning, it is not a virtue of arithmetical reasoning. For,

the chief good that stability imparts on our reasoning is that it frees us from the

burden of having to examine particular cases in establishing a universal hypothesis.

However, if all extant reasoning about a given domain of objects was such that it

involved the examination of particular cases in establishing a universal hypothesis,

then this would suggest that stability is not of value in regard to our reasoning

about this domain. For, if the value X is an instrumental value which derives

its worth from the extent to which it promotes value Y , and if value Y is simply

known not to be attainable in certain contexts, then in these contexts X likewise

loses its value. In these terms, my response to the challenge from stability is

to contend that in arithmetical contexts, freedom from the burden of examining

particular cases in establishing universal hypotheses is simply not extant, and

since this is the chief end to which stability is directed as a means, the stability

of reasoning is simply not of value in arithmetical contexts.

Hence, this response requires an argument to the effect that extant arithmeti-

cal reasoning always involves the examination of particular cases in establishing a

universal hypothesis. The warrant for this is that the primary extant manner in

which mathematical claims are established in arithmetical contexts is by mathe-

matical induction: that is, one argues that zero has the property, and that n+ 1

has this property whenever n does, and from this one concludes that all natural


numbers have this property. To the best of my knowledge, outside of the ax-

ioms of Robinson’s Q with which inceptive empiricism is concerned, all universal

arithmetical claims are established by mathematical induction. Indeed, this is

the reason why amplificatory empiricism is important, in that it purports to justify

this mode of inference. However, the point being made now is independent of

both amplificatory and inceptive empiricism: this is the point that in establishing

universal arithmetical claims, such as feature in the consequent of an instance of

mathematical induction, there is always as a matter of fact an examination of par-

ticular cases, such as in the antecedent of an instance of mathematical induction.

It is helpful here to explicitly contrast the geometrical and arithmetical setting

with regard to the necessity of examining particular cases. To establish a universal

hypothesis in the arithmetical setting by mathematical induction, it is necessary

to first garner evidence that zero (or some other “base case”) satisfies the universal

hypothesis. Hence, in the arithmetical setting, examining a particular case is a

necessary precondition to implementing this canonical means of establishing a

universal hypothesis. However, suppose that one is in a geometrical setting with

a high level of recognized indiscernibility, and suppose that one is attempting

to establish a universal hypothesis to which indiscernibility applies, in that if

one object in the domain has the property, then all do. In developing proofs of

this hypothesis, one can always eliminate claims to the effect that a particular

object satisfies the universal hypothesis, since such a claim contributes nothing to

the proof, since it is equivalent to the universal hypothesis itself. Hence, in this

canonical type of geometric setting, the examination of particular cases is always

in principle eliminable.

Thus, my response to the challenge from stability is simply to concede that sta-


bility is a virtue in some types of geometrical reasoning, but to deny that the same

is true in arithmetical reasoning. For, it seems that the good that stability secures,

namely the legitimate disengagement from the consideration of particular cases, is

simply not to be found in any known means of establishing universal arithmetical

hypotheses. That said, one way in which to disagree with this response would be

to adduce some manner of securing universal arithmetical hypotheses which did

not rely on the examination of particular cases. It is by no means obvious that

we presently possess a complete enumeration of types of argumentation germane

to arithmetical universal hypotheses. But, absent such a countervailing example

of arithmetical argumentation, it seems that arithmetical instance confirmation is

of a piece with the other known methods of establishing universal

arithmetical hypotheses. For, unlike in the geometric setting, in the

arithmetical setting all of these methods involve the examination of particular

cases.

2.4 Challenges from Alternative Inferences

In the previous section, challenges centered around arithmetical instance con-

firmation were considered, and these challenges were thus specific to inceptive em-

piricism, since of the two types of empiricism considered here– namely incep-

tive and amplificatory empiricism– it is only inceptive empiricism which relies

upon arithmetical instance confirmation. By contrast, this section is devoted to

a difficulty which threatens the sustainability of amplificatory empiricism. Recall

that amplificatory empiricism contends that one is justified in inferring from the

antecedent of an instance of mathematical induction to its consequent, relative

to the background knowledge consisting of the conjunction of the eight axioms


of Robinson’s Q, because the consequent is confirmed by the antecedent relative

to this background knowledge. The challenge considered in this section contends

that confirmation is not a good guide to justification in this arithmetical setting

since there is a series of inferences which are relatively similar to the inference

from the antecedent of an instance of mathematical induction to its consequent,

but in which the evidence in question intuitively constitutes poor evidence for the

hypothesis in question. The response which I suggest in this section is that con-

firmation can be made to accord with these intuitive judgments about the quality

of evidence if one takes into account not only confirmation per se but also the

degree of confirmation.

To describe the relevantly similar inferences that I have in mind, it is help-

ful to first recall some terminology introduced in § 2.1 and to take note of some

elementary logical implications. The consequent of an instance of mathematical

induction simply says that all numbers have a given fixed property, and

the antecedent simply says that zero has the property and that n + 1 has this

property whenever n does. Let us respectively call this antecedent and this conse-

quent the genuine antecedent and the genuine consequent. Let us then define the

pseudo-antecedent to be the following claim: zero has the property and 2(n + 1)

has the property whenever 2n does. Likewise, in parallel with this, let us fi-

nally define the pseudo-consequent to be the following claim: all even natural

numbers have the property, where an even number is simply a number which

is equal to 2n for some natural number n. Further, let us slightly expand the

background knowledge from Robinson’s Q to a slightly larger finite theory– which

we can term supplemented Robinson’s Q– which consists of Robinson’s Q plus

the axiom ∀ n 2(n + 1) = 2n + 2 = ((2n) + 1) + 1. Then against the background


knowledge of supplemented Robinson’s Q, one has the following elementary logical

implications: (i) the genuine consequent logically implies the genuine antecedent,

the pseudo-antecedent, and the pseudo-consequent, (ii) the pseudo-consequent

logically implies the pseudo-antecedent, and (iii) the genuine antecedent logically

implies the pseudo-antecedent.99
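To see where the supplementary axiom does its work, implication (iii) might be reasoned out as follows. This is an informal sketch rather than a derivation in any particular proof system; H is a placeholder symbol, introduced here, for the fixed property in question.

```latex
\begin{align*}
&\text{Assume the genuine antecedent:}\quad
   H(0) \;\wedge\; \forall n\,\bigl(H(n) \rightarrow H(n+1)\bigr). \\
&\text{Fix } n \text{ and suppose } H(2n). \\
&\text{Applying the step twice gives } H\bigl((2n)+1\bigr)
   \text{ and then } H\bigl(((2n)+1)+1\bigr). \\
&\text{By the supplementary axiom, } 2(n+1) = ((2n)+1)+1,
   \text{ so } H\bigl(2(n+1)\bigr). \\
&\text{Hence the pseudo-antecedent:}\quad
   H(0) \;\wedge\; \forall n\,\bigl(H(2n) \rightarrow H(2(n+1))\bigr).
\end{align*}
```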

I want now to consider two pairs of inferences, and to contrast what amplifica-

tory empiricism says about these cases with what our intuitive judgments about

the quality of evidence say about these cases. First, consider the contrast between

the following two inferences:

(a) the inference from the genuine antecedent to the genuine consequent,

(b) the inference from the pseudo-antecedent to the genuine consequent.

From an intuitive perspective, this latter inference (b) is inferior to the former in-

ference (a). For, intuitively, the evidence which features in the pseudo-antecedent

only concerns half of the natural numbers, namely, the even natural numbers, and

in general what is true of one infinite coinfinite subset of the natural numbers need

not be true of all the natural numbers. For instance, only one of the even numbers

is prime, whereas there are infinitely many prime numbers. However, it is clear

that the very same considerations which show that the genuine antecedent con-

firms the genuine consequent will also show that the pseudo-antecedent confirms

the genuine consequent: namely, against the background knowledge of supplemented

Robinson’s Q, it follows that the consequent logically implies the antecedent

in both (a) and (b), as we noted in the previous paragraph (at roman numeral (i)).

Hence, the very same considerations which underlie amplificatory empiricism

would commit us to saying that one is justified in inferring from the

pseudo-antecedent to the genuine consequent. This, of course, is intuitively prob-

lematic, since this would justify us in concluding the obviously false statement that


all natural numbers are even numbers on the basis of the obviously true statement

that zero is an even number and that if 2n is an even number then 2(n+ 1) is an

even number.100

Likewise, consider the contrast between the following two inferences, the first

of which has already been encountered:

(b) the inference from the pseudo-antecedent to the genuine consequent,

(c) the inference from the pseudo-antecedent to the pseudo-consequent.

The contrast between this pair of inferences is relevant because, supposing that one

had acquired the pseudo-antecedent as evidence, there arises the natural question

as to whether one is better justified in inferring towards the genuine consequent

or towards the pseudo-consequent. Or, supposing that one person infers from the

pseudo-antecedent to the genuine consequent, while another person infers from the

pseudo-antecedent to the pseudo-consequent, there arises the natural question of

which of these two people has the better evidence for their conclusion. Intuitively,

it seems that the inference towards the pseudo-consequent is better justified than

the inference towards the genuine consequent, and largely for the same reasons as

described in the previous paragraph. For, if the evidence at hand only concerns

even numbers, then it seems that one should refrain from endorsing a universal

hypothesis about all natural numbers and rather endorse a more circumspect

universal hypothesis about all even numbers. However, as noted above in roman

numerals (i)-(ii), in both the inference (b) and the inference (c), one has that

the consequent logically implies the antecedent, and hence it seems that the same

considerations which support amplificatory empiricism would lend support to the

contention that one is justified in making both of these inferences, despite the

fact that inference (c) seems superior to inference (b). This, of course, seems

intuitively quite problematic: for, given the evidence that zero is an even number


and that 2(n + 1) is an even number whenever 2n is an even number, it seems

far more reasonable to conclude the obvious truth that all even numbers are even

numbers than it seems to conclude the obvious falsehood that all natural numbers

are even numbers.

In order to bring the considerations of confirmation which underlie amplifica-

tory empiricism in accord with these intuitive judgements of the superiority and

inferiority of evidence, my suggestion is to advert to the notion of the degree of

confirmation. The degree of confirmation can be taken to be given by the quan-

tity P (h|e & k) − P (h|k), which measures the extent to which the probability

of the hypothesis h conditional on the evidence e and background knowledge k

exceeds the probability of the hypothesis h conditional merely on the background

knowledge k. In our examples, the background knowledge k is the conjunction

of the nine axioms of supplemented Robinson’s Q, the hypothesis in question is

either the genuine consequent hg or the pseudo-consequent hp, and evidence in

question is either the genuine antecedent eg or the pseudo-antecedent ep. Fur-

ther, in the antepenultimate paragraph, it was noted that against background

knowledge k, one has that (i) hg logically implies eg, ep and hp, (ii) hp logically

implies ep, and (iii) eg logically implies ep. Finally, in this terminology, the first

contrast considered was between the inference (a) from eg to hg and the infer-

ence (b) from ep to hg, while the second contrast considered was between the

inference (b) from ep to hg and the inference (c) from ep to hp. This information is

summarized in Figure 2.1, where the logical implications are written with arrows

labeled by the logical consequence relation |= and the inferences undertaken by

agents are written with arrows labeled by (a)-(c). Note that the directions of the

arrows here are exactly as one would expect, e.g. while hg logically implies eg, we

are considering the suggestion that one is justified in inferring from eg to hg.

[Figure 2.1. Alternative Confirming Inferences: Two Pairs of Contrasting Inferences. Arrows labeled |= run from hg to eg, from hg to ep, from hg to hp, from hp to ep, and from eg to ep; arrow (a) runs from eg to hg, arrow (b) from ep to hg, and arrow (c) from ep to hp.]

It is helpful to present these examples in such an abstract manner because

the information depicted in Figure 2.1 suffices to explain why the degrees of

confirmation in our two pairs of contrasting inferences accord with the intuitive

judgements of the superiority and inferiority of evidence described above. For,

suppose that a given quadruple of sentences hg, eg, hp, ep stands in the same log-

ical relationships as depicted in Figure 2.1, in the sense that the arrows labeled

with the logical consequence relation |= are the same. It then follows from stan-

dard manipulations of P1-P3 that the degree of confirmation in inference (a) is

greater than or equal to the degree of confirmation in (b), and that likewise the

degree of confirmation in inference (c) is greater than or equal to the degree of

confirmation in inference (b).101 This, of course, accords entirely with the intu-

itive judgments described above, which held that inference (a) was superior to

inference (b), and likewise that inference (c) was superior to inference (b). Hence,


by taking into account not just confirmation but the degree of confirmation, it is

possible to distinguish between superior and inferior inferences in a manner which

agrees with our intuitive judgements.102
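The two comparisons can be spelled out along the following lines. This is a sketch of how the standard manipulations might go, offered under the assumptions that P(k) > 0 and P(e_g & k) > 0 so that the conditional probabilities are defined; it is not a reproduction of the accompanying footnote. The key fact used is that when a hypothesis together with k implies the evidence, the posterior reduces to a ratio of the form P(h & k) / P(e & k).

```latex
\begin{align*}
\text{(a) vs. (b):}\quad
& h_g \models e_g \models e_p \ \text{(given } k\text{)}
  \;\Longrightarrow\; P(e_g \,\&\, k) \le P(e_p \,\&\, k), \\
& P(h_g \mid e_g \,\&\, k) = \frac{P(h_g \,\&\, k)}{P(e_g \,\&\, k)}
  \;\ge\; \frac{P(h_g \,\&\, k)}{P(e_p \,\&\, k)} = P(h_g \mid e_p \,\&\, k). \\[6pt]
\text{(c) vs. (b):}\quad
& h_g \models h_p \models e_p \ \text{(given } k\text{)}
  \;\Longrightarrow\; P(h_g \,\&\, k) \le P(h_p \,\&\, k), \\
& \bigl[P(h_p \mid e_p \,\&\, k) - P(h_p \mid k)\bigr]
  - \bigl[P(h_g \mid e_p \,\&\, k) - P(h_g \mid k)\bigr] \\
&\qquad = \bigl(P(h_p \,\&\, k) - P(h_g \,\&\, k)\bigr)
  \left(\frac{1}{P(e_p \,\&\, k)} - \frac{1}{P(k)}\right) \;\ge\; 0.
\end{align*}
```

In the first comparison the same prior P(h_g | k) is subtracted from both posteriors, so the ordering of posteriors carries over to the degrees of confirmation. In the second, the final factor is non-negative because P(e_p & k) ≤ P(k).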

The challenge considered in this section is that in order for confirmation to be

a good guide to justification, it needs to accord with our intuitive judgments of

the superiority and inferiority of evidence. My response to this challenge is simply

to note that the comparisons of degree of confirmation agree entirely with these

intuitive judgments. However, it is not difficult to see that nothing in this response

hinged heavily upon considerations peculiar to the even numbers, in terms of which

the pseudo-antecedent and the pseudo-consequent were defined. Rather, all that

was important was the relations of logical implication depicted in Figure 2.1.

Hence, this response is quite general, and applies to other alternative inferences

which stand in similar relations of logical implication to the genuine antecedent

and the genuine consequent.
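The claim that only the implication pattern matters can be spot-checked numerically on toy models. The sketch below assumes, for illustration only, a finite set of "worlds" with positive weights, sentences modeled as sets of worlds, logical implication modeled as set inclusion, and background knowledge true at every world. It draws random quadruples satisfying the inclusions of Figure 2.1 and checks both comparisons with exact rational arithmetic. All names (degree, random_instance, check) are hypothetical; a passing check is evidence for the general claim on these toy models, not a proof of it.

```python
from fractions import Fraction
import random


def prob(event, weights):
    """Probability of an event, modeled as a set of worlds."""
    return sum(weights[w] for w in event)


def degree(h, e, k, weights):
    """Degree of confirmation P(h | e & k) - P(h | k)."""
    posterior = prob(h & e & k, weights) / prob(e & k, weights)
    prior = prob(h & k, weights) / prob(k, weights)
    return posterior - prior


def random_subset(rng, worlds):
    """A random subset of the worlds, each included with chance 1/2."""
    return frozenset(w for w in worlds if rng.random() < 0.5)


def random_instance(rng, n_worlds=8):
    """Random weights and events with hg <= eg <= ep and hg <= hp <= ep."""
    worlds = frozenset(range(n_worlds))
    raw = {w: rng.randint(1, 10) for w in worlds}
    total = sum(raw.values())
    weights = {w: Fraction(x, total) for w, x in raw.items()}
    k = worlds                                 # background true everywhere
    hg = random_subset(rng, worlds) | {0}      # non-empty, so P(hg) > 0
    eg = hg | random_subset(rng, worlds)       # hg implies eg
    hp = hg | random_subset(rng, worlds)       # hg implies hp
    ep = eg | hp | random_subset(rng, worlds)  # eg and hp each imply ep
    return weights, k, hg, eg, hp, ep


def check(trials=1000, seed=0):
    """Return True if (a) >= (b) and (c) >= (b) in every sampled instance."""
    rng = random.Random(seed)
    for _ in range(trials):
        weights, k, hg, eg, hp, ep = random_instance(rng)
        a = degree(hg, eg, k, weights)  # inference (a): eg to hg
        b = degree(hg, ep, k, weights)  # inference (b): ep to hg
        c = degree(hp, ep, k, weights)  # inference (c): ep to hp
        if not (a >= b and c >= b):
            return False
    return True
```

Using Fraction rather than floating point avoids spurious failures from rounding when two degrees of confirmation happen to coincide.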

To illustrate this generality, consider an example which differs from the exam-

ple of even numbers in that it concerns a class of numbers which are not distributed

uniformly throughout the natural numbers, but rather are clustered towards the

beginning of the natural numbers. In particular, suppose that one grants that

there is a class of natural numbers called the standard natural numbers, such that

probabilities can be assigned to all sentences in the signature of the Peano axioms

augmented by a unary predicate symbol for this class, and such that this class

has the following properties, all of which are incorporated into the background

knowledge: (1) all standard natural numbers are natural numbers, but there are

some natural numbers which are not standard natural numbers, (2) if n is a stan-

dard natural number and m is less than n, then m is a standard natural number,


(3) zero is a standard natural number and n + 1 is a standard natural number

whenever n is a standard natural number. Obviously, properties (1) and (3) imply

the falsity of various instances of the mathematical induction schema which are

expressible in the signature augmented by a unary predicate symbol for the stan-

dard natural numbers, but elementary compactness considerations indicate that

these three properties (1)-(3) do not require violations of mathematical induction

applied to properties which are expressible in the signature unaugmented by the

unary predicate symbol for this new class. That is, the supposition of such a class

is entirely consonant with mathematical induction being true for all arithmetical

predicates expressible purely in terms of addition and multiplication.103

Hence, just as the notions of the pseudo-antecedent ep and the pseudo-consequent hp

were defined above, so one can define the notions of the standard-antecedent and

the standard-consequent as follows: the standard antecedent es says that zero has

the given property and that n+1 has the property whenever the standard natural

number n has the property, while the standard consequent hs says that all stan-

dard natural numbers have the property in question. It is easy to see that relative

to the background knowledge consisting of Robinson’s Q and the claims (1)-(3)

from the previous paragraph, one has that (i) hg logically implies eg, es and hs,

(ii) hs logically implies es, and (iii) eg logically implies es. Hence, when the sub-

script p is replaced by the subscript s, one sees immediately that the quadruple

of sentences hg, eg, hs, es instantiates the logical implication relations from Fig-

ure 2.1. Hence, by what was said above, it follows that the degree of confirmation

in the inference from the genuine antecedent to the genuine consequent is greater

than or equal to the degree of confirmation from the standard antecedent to the

genuine consequent, and likewise that the degree of confirmation in the inference

from the standard antecedent to the standard consequent is greater than or equal

to the degree of confirmation from the standard antecedent to the genuine con-

sequent. This, it seems, accords with our intuitive judgments about the quality

of evidence: a universal hypothesis about natural numbers is better confirmed

by evidence about all the natural numbers than by evidence about the standard

natural numbers, and likewise evidence pertaining exclusively to standard natural

numbers better confirms a universal hypothesis about standard natural numbers

than a universal hypothesis about natural numbers.
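
The two comparisons can be checked directly. The following is a sketch under assumptions not stated in this passage: e_g, h_g denote the genuine antecedent and consequent and e_s, h_s the standard ones; degree of confirmation is measured by the posterior probability; P is already conditioned on the background knowledge, so that k is suppressed from the notation; and P(e_g) > 0.

```latex
% Assumptions: relative to the background knowledge,
%   h_g \models e_g, \quad e_g \models e_s, \quad h_g \models h_s, \quad h_s \models e_s,
% and P(e_g) > 0 (hence also P(e_s) > 0).
\begin{align*}
P(h_g \mid e_g) &= \frac{P(h_g)}{P(e_g)} \;\ge\; \frac{P(h_g)}{P(e_s)} = P(h_g \mid e_s)
  && \text{since } P(e_g) \le P(e_s), \\
P(h_s \mid e_s) &= \frac{P(h_s)}{P(e_s)} \;\ge\; \frac{P(h_g)}{P(e_s)} = P(h_g \mid e_s)
  && \text{since } P(h_g) \le P(h_s).
\end{align*}
```

The same two inequalities go through if one instead measures confirmation by the difference between posterior and prior, since in each line both sides are then multiplied by a common non-negative factor of the form 1/P(e) - 1.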

This last point seems particularly important to mention, because one might

be sympathetic to the intuition that, as far as we know, there are natural num-

bers which are not standard natural numbers, whereas all our evidence concerns

standard natural numbers. For, on the one hand, the assumptions on the class

of standard natural numbers articulated in (1)-(3) of the penultimate paragraph

require that the natural numbers 0, 1, 2, 3, 4, 100, 2000, 30000, 400000 are all

standard natural numbers, and such considerations suggest the plausibility of the

thought that all our evidence pertains to standard natural numbers. But, on the

other hand, if one takes probabilities to be indicative of degrees of confidence, one

might be sympathetic to assigning a non-zero probability to the claim that

some natural numbers are non-standard natural numbers.

However, what I want to emphasize is that if the evidence in question is all rel-

ativized to the standard natural numbers, then this does not significantly impact

the viability of the thought underlying amplificatory empiricism. For, whereas

amplificatory empiricism is a claim about the epistemic implications of the

confirmation of the genuine consequent by the genuine antecedent, there is an obvious

analogue which would concern the epistemic implications of the confirmation of

the standard consequent by the standard antecedent. Further, as noted in the last

paragraph, given evidence consisting of the standard antecedent, both our

intuitive judgments and considerations of the degree of confirmation would suggest

that it is better to infer from such evidence to the standard consequent than to

the genuine consequent. Hence, if one thinks that it is plausible that there are

natural numbers which are not standard natural numbers and if one thinks that

it is plausible that everything in our evidentiary store concerns standard natural

numbers, then one can employ an obvious analogue of amplificatory empiricism

as a means by which to confirm universal statements about the natural numbers

in our evidentiary store.

Much of the philosophical literature on the Peano axioms is preoccupied with

considerations about non-standard integers, and indeed rightly so, as much of

this literature is concerned with the extent to which first-order axiomatizations

such as the Peano axioms are capable of uniquely describing or characterizing the

subject-matter of the natural numbers.104 However, part of what I have tried

to underscore in the preceding paragraphs is that the distinction between stan-

dard and non-standard natural numbers is orthogonal to the tenability of the

probability-based conception of arithmetical knowledge which I consider here. On

some level, this is not a surprising result– for instance, I take it that it is not

obvious that every important distinction in the metaphysics of middle-sized ob-

jects will automatically have implications for the epistemology of such objects.

However, given the predominance of considerations of non-standard integers in

the philosophical literature on the Peano axioms, it seems prudent to have ex-

plicitly made a case for this orthogonality here in this section. My suggestion is

that alternative inferences centered around non-standard natural numbers are no

more difficult to deal with than alternative inferences centered around even nat-

ural numbers, since elementary considerations about the degree of confirmation

accord with our intuitive judgments about the superiority and inferiority of the

evidence in these alternative inferences.

2.5 Conclusions and Directions for Future Research

This essay has been a preliminary defense of an empiricism according to which

our knowledge of arithmetic is of a piece with the knowledge by which we infer from

the past to the future or from the observed to the unobserved. It is a preliminary

defense in that I have not adduced positive arguments for the types of empiricism

which I consider here– namely inceptive and amplificatory empiricism– but rather

have defended the tenability of these forms of empiricism against various challenges

and objections. One overarching feature of this defense has been to argue that

if one of these challenges tells decisively against empiricism, then by parity of

reasoning it tells decisively against confirmation as a source of justification in the

setting of the physical sciences (e.g. the discussion of reliability in § 2.3.1) or

against deduction as a source of justification in the setting of mathematics (e.g.

the discussion of complete computable extensions in § 2.2.2). Similarly, I have tried

to explain why issues apparently peculiar to the setting of arithmetic, such as the

discussion of alternative inferences centered around non-standard numbers in § 2.4,

can in fact be treated by recourse to purely probabilistic considerations pertaining

to degree of confirmation. Another distinctive feature of the defense presented here

has been the precise delineation of the aretaic notion of stability (e.g. § 2.3.2) and

the examination of whether this is preferable (or even attainable) in the type of

reasoning which occurs under the aegis of inceptive and amplificatory empiricism.

The primary task for future work on these forms of empiricism lies in develop-

ing and articulating an account of the sources of arithmetical probabilities. For,

these forms of empiricism reduce knowledge of the Peano axioms to knowledge

of confirmation, and such confirmation is contingent upon being able to ascertain

when the evidence in conjunction with the background knowledge has non-zero

probability strictly less than the probability of the background knowledge (or in

the case of confirmation tout court, it is contingent upon being able to ascertain

when the evidence is assigned a non-zero probability strictly less than one). I

view this as a difficult challenge because this essay has in effect ruled out various

potential sources of justification. For instance, the discussion in §§ 2.2.1-2.2.2

rules out appealing to ω-additivity or to the computability of the ambient proba-

bility assignment. Further, it is not obvious that ideas which are used to ascertain

probabilities in the setting of the physical sciences– such as de Finetti’s notion of

exchangeability (cf. [21] § 8, [119])– can be likewise used in the setting of arith-

metic. Hence, it seems that what is required to complete this task is some new

idea about the source of arithmetical probabilities, or at least the means by which

these probabilities can be ascertained.

A secondary task for future work on inceptive and amplificatory empiricism

lies in carefully delineating the type of arithmetical reasoning which features in

the Peano axioms from the type of arithmetical reasoning which features in the

addition and subtraction of probabilities. For, were one unable to do this, then

it would seem hopeless to try to ground our knowledge of the Peano axioms on our

knowledge of the probability axioms P1-P3, since the latter may very well im-

plicate or presuppose the former. Of course, Tarski’s work on the decidability

of the theory of real closed fields is clearly relevant here. This work tells us

that the natural numbers, with which inceptive and amplificatory empiricism are

concerned, cannot be defined in the real numbers, with which probability assign-

ments are concerned. Further, this work provides us with a complete decidable

axiomatization of the real numbers, which suggests the idea that our knowledge of

real numbers figuring in probability assignments can be taken to be based purely

on this axiomatization, and not on an underlying conception of natural number

(cf. Marker [107] § 3.3 pp. 93 ff). However, this is at best a start to a response,

since it does not address the chief difficulty, namely, of spelling out in some more

precise manner what it means for one proposition or axiomatization to presuppose

or implicate another. This is connected to the widely recognized resistance of the

concept of circularity to conceptual analysis. For instance, it is widely recognized

that it is highly challenging to provide any conceptual analysis of circularity on

which some but not all valid arguments are regarded as circular (cf. [79] p. 26,

[132]). Hence, subsequent to some progress on the conceptual analysis of circular-

ity, the task for inceptive and amplificatory empiricism would be to see if Tarski’s

work can support the claim that knowledge of the arithmetic of the real numbers

does not presuppose knowledge of the arithmetic of the natural numbers.

These are the primary tasks which inceptive and amplificatory empiricism

must fulfill if they are to be ultimately endorsed. However, the task of this essay

has been more humble in character: in particular, this essay has merely sought to

defend inceptive and amplificatory empiricism against some particularly pressing

objections. Specifically, I have argued that countable additivity and the

non-computability of probability assignments are not genuine barriers to our access

to probability in the setting of arithmetic (§§ 2.2.1-2.2.2). Likewise, I have ar-

gued that the arithmetical instance confirmation on which inceptive empiricism

relies is not vitiated by considerations of unreliability, insufficient diversity, or

instability (§§ 2.3.1-2.3.2). Finally, I have argued that elementary considerations

pertaining to the degree of confirmation can explain various intuitions about the

quality of alternative confirming inferences centered around non-standard num-

bers (§ 2.4). These objections are by no means the only objections which one

could mount against inceptive and amplificatory empiricism, but in my view they

are the objections which are the most pressing, precisely because they concern

various apparent difficulties which emerge when one begins to take seriously the

idea that just as enumerative induction can be justified by recourse to informed

judgments of probability, so too can mathematical induction and the other Peano

axioms.

2.6 Notes

46 The theory Robinson’s Q consists of the following eight axioms:

(Q1) s(x) ≠ 0

(Q2) s(x) = s(y) → x = y

(Q3) x ≠ 0 → ∃w x = s(w)

(Q4) x + 0 = x

(Q5) x + s(y) = s(x + y)

(Q6) x · 0 = 0

(Q7) x · s(y) = x · y + x

(Q8) x ≤ y ↔ ∃z x + z = y.

The mathematical induction schema is the following schema, where ϕ(x) ranges over formulas in the language (and which may contain additional free variables):

(Iϕ) [ϕ(0) & ∀n ϕ(n) → ϕ(n + 1)] → [∀n ϕ(n)]

The system of first-order Peano arithmetic consists of the axioms of Robinson’s Q and the mathematical induction schema Iϕ, and this is the system studied in e.g. Hájek and Pudlák [59]. This is to be distinguished from the system of second-order Peano arithmetic, as studied in e.g. Simpson [138], wherein the mathematical induction schema Iϕ is replaced by both the mathematical induction axiom

(MI) ∀F [F(0) & ∀n F(n) → F(n + 1)] → [∀n F(n)]

and the comprehension schema, where ϕ(x) is a formula (which may contain additional second-order quantifiers and which may contain additional free variables):

(Cϕ) ∃F ∀n (ϕ(n) ↔ Fn)

It is easy to see that second-order Peano arithmetic is equivalent to the system consisting of Robinson’s Q, the comprehension schema Cϕ, and the mathematical induction schema Iϕ, where ϕ is allowed to contain second-order quantifiers. Hence, since the mathematical induction principle can be represented by the mathematical induction schema Iϕ in both first- and second-order Peano arithmetic, in this chapter I shall focus on the mathematical induction schema, and examine types of justification which can be imparted on instances of this schema by considerations of probability. However, one drawback of this approach is that nothing will be said here about the status of the comprehension schema Cϕ, and it is not obvious that its existential claims can be justified by recourse to judgments of probability in the same way that the axioms of Robinson’s Q and the mathematical induction schema can be so justified. Hence, one who was sympathetic to the conclusions of this chapter but who was nonetheless committed to second-order Peano arithmetic as opposed to first-order Peano arithmetic would have to provide a supplemental justification of the comprehension schema Cϕ. This of course is a non-trivial task, since it is known that the comprehension schema is not proof-theoretically innocuous: for instance, second-order Peano arithmetic proves the consistency of first-order Peano arithmetic. Finally, it bears mentioning that the issue of the epistemic warrant for the comprehension schema Cϕ is in principle separable from issues surrounding the purported logicality of second-order logic. For, there are semantics for second-order logic in which all the ordinary theorems of first-order predicate logic hold, i.e. the so-called Henkin semantics (cf. Shapiro [135] Chapter 4 or Enderton [32] Chapter 4). But even for one who accepts the Henkin semantics, and hence for whom second-order quantifiers are no less logical in character than first-order quantifiers, there still remains the question of the epistemic warrant of the comprehension schema.

47 I examine and discuss this issue at length in Chapter 1.

48 For instance, in the chapter “Epistemology and Reference” of his book on structuralism, Shapiro suggests that “pattern recognition [...] can lead to knowledge of small infinite structures, such as the natural-number structure and perhaps the continuum” ([136] p. 112). Summarizing his account, Shapiro says: “To briefly reiterate, then, we first contemplate the finite structures as objects in their own right. Then we form a system that consists of the collection of these finite structures with an appropriate order. Finally, we discuss the structure of this system” ([136] p. 118). Hence, it seems that Shapiro is suggesting that our knowledge of natural number is based on our knowledge of the class or system of all finite structures. However, this class will obey various axioms which are similar to the Peano axioms. For instance, the class of finite structures satisfies the following inductive principle, which for the sake of disambiguation can be called structure induction: all finite structures have a given property if the zero-element structure has the property and if whenever a given structure has the property, then so do all structures which contain exactly one more element than this given structure. Presumably the idea behind Shapiro’s account is that our knowledge of mathematical induction is based on our knowledge of structure induction. But then it can be asked how one knows structure induction. One might respond to this objection by suggesting that all that is required for the epistemology of arithmetic is an account of the psychological mechanism which facilitates or otherwise underlies cognition of natural numbers. But while an account of this mechanism would no doubt be invaluable, MacBride’s remark in his discussion of Shapiro also seems apposite here: “It is also necessary to undertake the distinctively normative project of coming to an understanding of our justification for holding the mathematical beliefs we do, the justification which in favourable cases distinguishes mathematical knowledge from mere true belief” ([103] p. 159).

49 The background to this is Gödel’s brief remark that “[...] the law of complete induction [...] I perceive to be true on the basis of my understanding (that is, perception) of the concept of integer” ([55] p. 320). Recently, an elaboration and defense of Gödel’s remark has been offered by Leitgeb [98] § 3. Leitgeb’s main idea is that the natural number structure can be mentally represented as a “fixed point” of a certain operation on graphs: “[...] our agent has some sort of mental representation available which represents the natural number structure as being a fixed point under this mental remove-the-initial-node operation” ([98] p. 278). Leitgeb argues that one can thus “see” that the Peano axioms are true of this structure: “Intuitions might give us direct or indirect evidence for the satisfiability of concepts and thus support existence axioms [...]: e.g., while it would be hard to ‘see’ that [second-order Peano arithmetic] is satisfiable just by consulting the conceptual structure of that concept, this is no longer so once we gain intuitive access to the natural number structure as sketched above” ([98] p. 279).

50 At one point, Leitgeb himself asks: “So the really interesting question at this point is [¶] How is this Anschauung der Begriffe achieved?” ([98] p. 281). However, what I have tried to urge in these brief remarks is that an additional question also needs to be asked, namely: “Why should this Anschauung der Begriffe be regarded as a source of justification, given that it is so manifestly different from our normal modes of perception?” It is no doubt obvious, but is worth explicitly noting, that a similar question could be asked about the types of arithmetical probability which I consider here. Indeed, part of what I seek to do in this chapter is to explain why, even though arithmetical probability is different in certain ways from ordinary notions of probability, it nevertheless can be viewed as a source of justification.

51 Kästner wrote several essays on the philosophy of mathematics for Eberhard’s anti-Kant journal Philosophisches Magazin (cf. [9] p. 219), and the essay in which Kästner expresses his skepticism about the connection between mathematical induction and axiomhood is entitled “On Geometrical Axioms” ([82]). Here is a selection from the essay, with the key remark on the axiomatic status of mathematical induction appearing at the outset of section 20:

15) If induction means to observe something in individual cases and to form from this a general proposition, then one has to be able to run through all the cases, and show that what one claims occurs in each case. I did this in my Geometry (p. 4) in that I showed that two triangles are similar when the sides are similar, and on p. 21 I showed that the angle on the perimeter of the triangle is half as large as that in the midpoint.

16) This induction in fact says nothing new, but rather only combines in a general proposition what a collection of particular propositions said, and so is also only reliable in so far as each of the particular propositions is true.

Mercury, Venus, the Earth, Mars, Jupiter, and Saturn, and their moons each get their light from the sun. The induction: all planets have their light from the sun, is only secure when one knows that the individual proposition is true of each planet which one is acquainted with or which one can be acquainted with. If the planet which Herschel discovered has its own light, as Hell believes, then the induction is not permitted.

17) Another type of induction is as follows: various cases are derived from one another, and one shows that what occurs in one case must also occur in the case which immediately succeeds it; and so it is enough to show of the first case, the second case, the third case, in short, of some of the first cases, what one claims of all of them, perhaps continuing without end.

18) I learned this method first from Hausen. It is used in Proposition 23 of his Elementary Arithmetic in series of figurative numbers; Hausen’s method is, so far as I know, based on Jacob Bernoulli’s method (Iacobi Bernoullii Ars Conjectandi Bas. 1713. Pas II cap. 3 pag. 87). This method is very useful to show of laws which one has perceived through experience in several cases that they are general. So I have used this method frequently in my Introduction to Analysis, and it is also used in applied mathematics, for example by Hebel Introduction to Political Science.

19) Since one here proceeds from the n-th case to the (n + 1)-st case, a good friend of mine in Leipzig, M. Orchliz, called it in jest: the (n + 1) method.

But one can give it a more noble name. In a genealogy, when someone is gentrified, so all his descendants are, in accordance with law, gentrified. So it is therefore the ancestral method, and draws on such men as Nepern, l’Hospitals, Tschirnhausen, and many others who know something of n and n + 1.

20) But none of these types of induction is the way through which one comes to mathematical axioms. This way is that of abstraction. Two sticks laid across one another are for the understanding a picture in which it recognizes that a pair of straight lines can only intersect each other once. This perception is based on the capacity of the understanding to abstract, to think something with the two sticks, which it would likewise think by timbers, as by strings strung straight across each other, or by drawing lines himself.

This capacity of the understanding, called common reason by Leibniz on p. 416, makes it that one implicitly recognizes axioms, as Leibniz says, whether or not they are abstractly expressed when they are so recognized. The axiom, as Leibniz says, is the embodiment of the exemplar ([82] pp. 426-428).

Of course, here Kästner refers to Leibniz’s well-known remark in his New Essays that axioms are “known implicitly, so to speak, though not at first in an abstract and isolated way. The instances derive their truth from the embodied axiom, and the axiom is not grounded in the instances” ([97] p. 449). This remark is in effect Leibniz’s way of avoiding the following dilemma posed by Locke in the Essay: “[...] which is known first and clearest by most people, the particular instance, or the general rule; and which is it that gives life and birth to the other” ([101] IV.xii.3).

Kästner’s discussion was not unknown in the 19th Century. For instance, Fries suggests that the difference between enumerative induction and mathematical induction is that only the latter involves a “secure overview” of all the relevant cases:

Induction is a kind of inference which is entirely appropriate to mathematics, so long as it proceeds from a mathematical division which grants us a secure overview of all cases falling under the rule. But this overview will not always be directly verifiable, but often in many ways indirectly verifiable. For example, the latter is the case with Bernoulli’s induction, which one uses in analysis to such a great effect. I mean here the inference, which by an overview of a series of perhaps infinitely many cases, proves a law for one case and then shows that when it is valid for any of these cases, so it must be valid for the case immediately following, whereby the proof then infers immediately to all cases ([46] pp. 46-47).

Likewise, Trendelenburg records his disagreement with Kästner, stating without further explanation that “The apriori sciences don’t recognize any genuine induction; when they employ induction, they intertwine with it a deduction, a synthetic procedure” ([146] vol. 2 p. 283).

52 Reid’s remarks from Essays on the Intellectual Powers of Man occur in a discussion of Wallis:

The field of demonstration, as has been observed, is necessary truth; the field of probable reasoning is contingent truth, not what necessarily must be at all times, but what is, or was, or shall be.

No contingent truth is capable of strict demonstration; but necessary truths may sometimes have probable evidence.

Dr. Wallis discovered many important mathematical truths, by that kind of induction which draws a general conclusion from particular premises. This is not strict demonstration, but, in some cases, gives as full conviction as demonstration itself; and a man may be certain, that a truth is demonstrable before it ever has been demonstrated. In other cases, a mathematical proposition may have such probable evidence from induction or analogy, as encourages the Mathematician to investigate its demonstration. But still the reasoning proper to mathematical and other necessary truths, is demonstration; and that which is proper to contingent truths, is probable reasoning ([126] VII.ii.1).

The results of Wallis to which Reid refers are found in Wallis’ The Arithmetic of Infinitesimals, where Wallis finds polynomial expressions for the sums $S_m(n) = 1^m + 2^m + \cdots + (n-1)^m$ by the method which Wallis calls “induction.” For instance, Wallis showed that

$$\frac{S_2(n)}{n(n-1)^2} = \frac{1}{3} + \frac{1}{6(n-1)}$$

by writing out both sides of the equation for small values of n and saying that “the investigation may be done by the method of induction” ([152] p. 26 Proposition 19). It is interesting to note that today these results are most readily proved by mathematical induction (cf. Graham et al. [57] § 6.5), although other methods of proof are also now known (cf. Ireland and Rosen [75] § 15.1). Wallis’ methods were controversial in his own day, and were criticized by both Hobbes and Fermat (cf. [68] pp. 45-46, [39] pp. 27-28). For a representative sample of Wallis’ responses to Hobbes and Fermat, see Wallis’ Due Correction for Mr. Hobbes ([150] p. 42) and Chapter 78 of Wallis’ Treatise of Algebra ([151] pp. 298 ff).
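
To illustrate what writing out both sides for small values of n amounts to, here is a short check of the identity S_2(n)/(n(n-1)^2) = 1/3 + 1/(6(n-1)), with S_2(n) = 1^2 + 2^2 + ... + (n-1)^2 as defined in this note. The particular values n = 2, 3, 4 and the closing general verification are additions for illustration and are not taken from Wallis; the general step uses the standard closed form for the sum of the first n-1 squares.

```latex
% Small cases, computed directly from S_2(n) = 1^2 + 2^2 + \cdots + (n-1)^2.
\begin{align*}
n = 2:&\quad \frac{S_2(2)}{2 \cdot 1^2} = \frac{1}{2},
      & \frac{1}{3} + \frac{1}{6 \cdot 1} &= \frac{1}{2}, \\
n = 3:&\quad \frac{S_2(3)}{3 \cdot 2^2} = \frac{5}{12},
      & \frac{1}{3} + \frac{1}{6 \cdot 2} &= \frac{5}{12}, \\
n = 4:&\quad \frac{S_2(4)}{4 \cdot 3^2} = \frac{14}{36} = \frac{7}{18},
      & \frac{1}{3} + \frac{1}{6 \cdot 3} &= \frac{7}{18}.
\end{align*}
% General case for n \ge 2, from the closed form S_2(n) = (n-1)\,n\,(2n-1)/6.
\[
\frac{S_2(n)}{n(n-1)^2} = \frac{2n-1}{6(n-1)} = \frac{2(n-1) + 1}{6(n-1)}
  = \frac{1}{3} + \frac{1}{6(n-1)}.
\]
```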

53 The quotation is from Rips and Asmuth [130] p. 205. Rips and Asmuth conclude that empirical induction “plays a mediating role in reminding us of a property of the natural number system but provides no independent justification” ([130] p. 254). My only criticism of this conclusion would be that Rips and Asmuth do not operate with a sufficiently broad conception of empirical induction, and so do not consider the possible relevance of considerations of probability. It should also be mentioned that, in other work, Rips and Asmuth develop a positive proposal concerning the psychology of the number concept, saying: “[...] our top-down approach suggests that these principles [the Peano axioms] (or logically equivalent ones) are acquired as such– that is, as generalizations– rather than being induced from facts about physical objects” ([131] p. 638). If this is understood purely as a psychological hypothesis, then I have nothing to say for or against it. However, if the suggestion is that an innate “intuition” of number can provide us with arithmetical knowledge, then it seems that many of the same criticisms discussed in endnote 50 would apply equally well here.

54 Randomized algorithms are algorithms which appeal at some point in their implementation to a probabilistic process like the tossing of a coin, so that the probability of the correctness of the algorithm can be calculated explicitly relative to assumptions about e.g. the fairness of the coin (cf. [114]). The recent debate about the propriety of these algorithms stems from Fallis ([33, 34]), and in the course of their contribution to this debate, both Gaifman and Easwaran have suggested that one could consider the more general project of assigning probabilities to arithmetical sentences based on less objective conceptions of probability. For instance, Easwaran says that “Most mathematicians are quite convinced that [Goldbach’s conjecture] is true, because no counterexamples have been found among the first several million integers” and adds: “While there may be good reason to consider these sorts of arguments in a more general study of probabilistic proofs, I will not focus on them here” ([31] p. 347). Likewise, while Gaifman focuses on the probabilities which emerge from randomized algorithms, he suggests that in principle there is nothing preventing one from considering less objective probabilities in the case of mathematics: “The methodology of eliciting probabilities by considering bets applies in the case of mathematics as it applies in general. Not that I find the methodology unproblematic, but its problems have little to do with the distinction between the types of statements [...] Here, again, there should be no distinction between the empirical and the purely deductive. In principle, there can be experts who specialize in certain types of combinatorial problems, just as there are experts that provide probabilities for finding oil” ([48] pp. 107-108).

55 See Howson and Urbach [74] pp. 119 ff, and Earman [30] pp. 63 ff. However, for the sake of completeness, I include here a proof of both of these elementary facts. Suppose that $h \& k \models e$ and $0 < P(e \& k) < P(k)$. Then the hypotheses $0 < P(e \& k)$ and $0 < P(k)$ imply that the conditional probabilities

$$P(h \mid e \& k) = \frac{P(h \& e \& k)}{P(e \& k)} \quad\text{and}\quad P(h \mid k) = \frac{P(h \& k)}{P(k)}$$

are defined. Likewise, note that the hypothesis $h \& k \models e$ implies $P(h \& e \& k) = P(h \& k)$, so that

$$\begin{aligned}
P(h \mid e \& k) > P(h \mid k)
  &\iff \frac{P(h \& e \& k)}{P(e \& k)} > \frac{P(h \& k)}{P(k)} \\
  &\iff \frac{P(h \& k)}{P(e \& k)} > \frac{P(h \& k)}{P(k)} \\
  &\iff \frac{1}{P(e \& k)} > \frac{1}{P(k)} \\
  &\iff P(k) > P(e \& k)
\end{aligned}$$

The case of confirmation tout court follows directly from these considerations by taking the background knowledge k to consist of a tautology.
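
A small numerical instance of confirmation tout court may help to fix ideas. The example is purely illustrative and rests on assumptions not in the text: a fair six-sided die, k a tautology, h the hypothesis that the outcome is 2, and e the evidence that the outcome is even. It is also worth observing that the step in the derivation which cancels P(h & k) appears to presuppose P(h & k) > 0; the example satisfies this.

```latex
% Illustrative assumptions: fair six-sided die, k a tautology,
% h = "the outcome is 2", e = "the outcome is even".
\begin{align*}
h \models e, \qquad 0 < P(e) &= \tfrac{1}{2} < 1 = P(k), \qquad P(h) = \tfrac{1}{6} > 0, \\
P(h \mid e) = \frac{P(h \& e)}{P(e)} = \frac{1/6}{1/2} &= \tfrac{1}{3} \;>\; \tfrac{1}{6} = P(h).
\end{align*}
```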

56 For the sake of succinctness, in the main body of the text I stated amplificatory empiricism as follows:

Amplificatory empiricism contends that one is justified in inferring from the antecedent of an instance of mathematical induction to its consequent, relative to the background knowledge consisting of the conjunction of the eight axioms of Robinson’s Q, because the consequent is confirmed by the antecedent relative to this background knowledge.

Precisely due to its succinctness, there are potentially several ambiguities in such a formulation, which I seek to quickly dispel in this endnote. In particular, formulated as such, amplificatory empiricism instantiates the following schema:

One is justified in inferring from p to q relative to background knowledge k because p, q, k stand in relation R.

I take it as obvious that for such a schema to be interesting, it needs to be understood as an abbreviation for the following schema:

One is justified in inferring from p & k to q if one is justified in believing p & k and if one is justified in believing that p, q, k stand in relation R. Further, there is a preponderance of examples of p, q such that p, q, k stand in relation R. Moreover, for every normal agent operating in normal circumstances there are several examples of p, q such that the agent is justified in believing that p, q, k stand in relation R. Finally, normal agents operating in normal circumstances are justified in believing k.

Since such a formulation is so overly verbose, I will avoid it in the main body of the text.

57 I have stated amplificatory empiricism as a thesis about justification as opposed to knowledge. This is because knowledge is typically assumed to be true, whereas part of the idea here is to articulate epistemic relationships of an agent to propositions which the agent regards as being not true but merely probable. For instance, my idea is that the agent who successfully employs amplificatory empiricism to justify a belief in the consequent of an instance of mathematical induction could simultaneously deem the probability of this consequent to be somewhere between 75% and 85%.

58 Remarks similar to those made in endnotes 56-57 about amplificatory empiricism can of course be made here in regard to inceptive empiricism.

59 One sense in which two propositions could be independent of one another is for the proposition that expresses that they each materially imply one another to be false. However, this is clearly not the appropriate sense of independence for this setting, since in this sense two true propositions could not be independent of one another. There are at least two other associated notions of independence which are more germane to the type of propositions which are entertained and discussed in the philosophical literature. On the one hand, one could say that two propositions are independent of one another if there is some objection or problem which pertained to the one but which did not obviously pertain to the other. On the other hand, one could say that two propositions are independent of one another if one but not the other could be rationally endorsed by one and the same person at one and the same time.

60 For instance, in Chapter 3 Theorem 22, it is shown that the axioms of Robinson’s Q, together with the full comprehension schema, interpret the Peano axioms. Hence, by appealing to some version of the Logicist Template described in Chapter 1, one could infer to the Peano axioms from the axioms of Robinson’s Q, which themselves might be justified by inceptive empiricism.

61 Standard references for $L_{\omega_1\omega}$ include Keisler [85] and Nadel [116]. In particular, for the $L_{\omega_1\omega}$-completeness theorem, see Keisler [85] Theorem 3 p. 16 and Nadel [116] Theorem 3.2.1 p. 280. However, in spite of the $L_{\omega_1\omega}$-completeness theorem, the most straightforward version of compactness for $L_{\omega_1\omega}$-sentences is false, since there is an $L_{\omega_1\omega}$-theory which does not have a model but such that every finite subtheory has a model. There is of course a more attenuated version of compactness for $L_{\omega_1\omega}$-sentences known as Barwise compactness (cf. Keisler [85] Theorem 11 p. 45, Nadel [116] Theorem 5.6.1 p. 295).

62 For the sake of completeness, I include here a proof of this fact, which for the sake of convenience I restate as follows:

Proposition 1. Suppose that $L$ is the signature of Peano arithmetic and that $\varphi_Q$ is the conjunction of the eight axioms of Robinson’s Q (cf. endnote 46). Suppose that $0 < \varepsilon < \frac{1}{2}$. Suppose that $P$ is an $\omega$-additive probability assignment such that $P(\varphi_Q) > 1 - \varepsilon$. Then $(\omega, 0, s, +, \times) \models \varphi$ if and only if $P(\varphi) > 1 - \varepsilon$.

First note that it suffices to prove the left-to-right direction. For suppose that we knew the left-to-right direction, i.e. we knew that $(\omega, 0, s, +, \times) \models \varphi$ implies $P(\varphi) > 1 - \varepsilon$. To prove the right-to-left direction, suppose for the sake of contradiction that we are given a sentence $\varphi$ such that $P(\varphi) > 1 - \varepsilon$ and $(\omega, 0, s, +, \times) \models \neg\varphi$. Then by the left-to-right direction, it follows that $P(\neg\varphi) > 1 - \varepsilon$, and from P1-P3 it follows that $1 - P(\varphi) > 1 - \varepsilon$ and hence $P(\varphi) < \varepsilon < \frac{1}{2} < 1 - \varepsilon < P(\varphi)$, which is a contradiction.

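Hence it suffices to prove the left-to-right direction. Note that it suffices to show that $(\omega, 0, s, +, \times) \models \varphi$ implies $P(\varphi \,\&\, \varphi_Q) = P(\varphi_Q)$. For, it would then follow from P1-P3 and our hypothesis that $P(\varphi) \geq P(\varphi \,\&\, \varphi_Q) = P(\varphi_Q) > 1 - \varepsilon$. Hence, we now show by induction on the complexity of $\varphi$ that $(\omega, 0, s, +, \times) \models \varphi$ implies $P(\varphi \,\&\, \varphi_Q) = P(\varphi_Q)$. (i) First we appeal to a well-known fact about Robinson’s Q, namely that it proves all true $\Sigma^0_1$-sentences: if $\varphi$ is $\Sigma^0_1$ then $(\omega, 0, s, +, \times) \models \varphi$ implies that $\varphi$ is provable from Robinson’s Q. This fact is sometimes called the $\Sigma^0_1$-completeness of Robinson’s Q (cf. Hajek and Pudlak [59] Theorem I.1.8 pp. 30-31). Now, by the $\Sigma^0_1$-completeness of Robinson’s Q and by P1-P3, it follows that if $\varphi$ is $\Sigma^0_1$ and $(\omega, 0, s, +, \times) \models \varphi$ then $P(\varphi \,\&\, \varphi_Q) = P(\varphi_Q)$. (ii) Second, if $\varphi(x)$ is $\Delta^0_0$ and $(\omega, 0, s, +, \times) \models \forall x\, \varphi(x)$, then by $\omega$-additivity and (i) it follows that

\[
\begin{aligned}
P([\forall x\, \varphi(x)] \,\&\, \varphi_Q) &= P(\forall x\, [\varphi(x) \,\&\, \varphi_Q]) \\
&= \lim_n P\Big(\bigwedge_{i=0}^{n} [\varphi(s^i(0)) \,\&\, \varphi_Q]\Big) \\
&= \lim_n P\Big(\Big[\bigwedge_{i=0}^{n} \varphi(s^i(0))\Big] \,\&\, \varphi_Q\Big) \\
&= P(\varphi_Q)
\end{aligned}
\tag{2.3}
\]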

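(iii) Suppose that the statement is true for all $\Sigma^0_n$- and $\Pi^0_n$-formulas for $n \geq 1$. Suppose that $\varphi(x)$ is $\Sigma^0_n$ or $\Pi^0_n$. Suppose that $(\omega, 0, s, +, \times) \models \exists x\, \varphi(x)$. Then $(\omega, 0, s, +, \times) \models \varphi(s^m(0))$ for some $m \in \omega$. Then by the induction hypothesis and $\omega$-additivity it follows that

\[
\begin{aligned}
P([\exists x\, \varphi(x)] \,\&\, \varphi_Q) &= P(\exists x\, [\varphi(x) \,\&\, \varphi_Q]) \\
&= \lim_n P\Big(\bigvee_{i=0}^{n} [\varphi(s^i(0)) \,\&\, \varphi_Q]\Big) \\
&\geq P(\varphi(s^m(0)) \,\&\, \varphi_Q) \\
&= P(\varphi_Q)
\end{aligned}
\tag{2.4}
\]

But it then follows from P1-P3 that $P([\exists x\, \varphi(x)] \,\&\, \varphi_Q) \leq P(\varphi_Q)$, so that in fact we have $P([\exists x\, \varphi(x)] \,\&\, \varphi_Q) = P(\varphi_Q)$. Alternatively, suppose that $(\omega, 0, s, +, \times) \models \forall x\, \varphi(x)$. Then by the induction hypothesis and $\omega$-additivity it follows that

\[
\begin{aligned}
P([\forall x\, \varphi(x)] \,\&\, \varphi_Q) &= P(\forall x\, [\varphi(x) \,\&\, \varphi_Q]) \\
&= \lim_n P\Big(\bigwedge_{i=0}^{n} [\varphi(s^i(0)) \,\&\, \varphi_Q]\Big) \\
&= \lim_n P\Big(\Big[\bigwedge_{i=0}^{n} \varphi(s^i(0))\Big] \,\&\, \varphi_Q\Big) \\
&= P(\varphi_Q)
\end{aligned}
\tag{2.5}
\]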

Hence, the result is now proven.

63 Again for the sake of completeness, I include here a proof of this fact, which for the sake of convenience I restate as follows:

Proposition 2. Suppose that $L$ is the signature of Peano arithmetic and that $\varphi_Q$ is the conjunction of the eight axioms of Robinson’s Q (cf. endnote 46). Suppose that $\varphi$ is an $L$-sentence such that there is a model $M$ such that $M \models \varphi_Q \,\&\, \neg\varphi$. Then there is an $\omega_1$-additive probability assignment $P$ on $L_{\omega_1\omega}$-sentences such that $P(\varphi_Q) = 1$ and $P(\varphi) = 0$.

The proof of this fact is relatively simple. For, given an $L_{\omega_1\omega}$-sentence $\psi$, let $P(\psi) = 1$ if $M \models \psi$ and let $P(\psi) = 0$ if $M \models \neg\psi$. Then $P$ is an $\omega_1$-additive probability assignment: for, the semantics of $L_{\omega_1\omega}$-sentences are defined so that if $M \models \bigwedge_{i=1}^{n} \varphi_i$ for all $n > 0$, then $M \models \bigwedge_n \varphi_n$. Further, by construction, it follows that $P(\varphi_Q) = 1$ and $P(\varphi) = 0$.

Note that by choosing any $\varphi$ such that $(\omega, 0, s, +, \times) \models \varphi$ and such that $\varphi$ is not provable from Robinson’s Q, one can obtain the following result, which explicitly contrasts with Proposition 1 proved in the preceding endnote:

Proposition 3. Suppose that $L$ is the signature of Peano arithmetic and that $\varphi_Q$ is the conjunction of the eight axioms of Robinson’s Q (cf. endnote 46). Then there is an $\omega_1$-additive probability assignment $P$ on $L_{\omega_1\omega}$-sentences such that for any $0 < \varepsilon < \frac{1}{2}$ it is not the case that $(\omega, 0, s, +, \times) \models \psi$ if and only if $P(\psi) > 1 - \varepsilon$ for all $L_{\omega_1\omega}$-sentences $\psi$.

64 Sometimes this ordinary language expression of the Dutch Book Theorem is given as the official statement of the theorem. For instance, Howson and Urbach describe the theorem as follows: “if the [ϕi] do not satisfy the probability axioms, then there is a betting strategy and a set of stakes [si] such that whoever follows this betting strategy will lose a finite sum whatever the truth-values of the hypotheses turn out to be” ([74] p. 79). Likewise, Earman says: “[. . .] Dutch book, a finite series of bets such that no matter what happens, your net is negative (a violation of what is called coherence for degrees of belief). The Dutch-book theorem shows that if any one of the axioms [(P1)-(P3)] is violated, then Dutch book can be made” ([74] p. 39). These sorts of statements have the advantage of making manifestly transparent the primary philosophical application of the theorem, namely, as facilitating an inference from invulnerability to a Dutch book to the satisfaction of the probability axioms. However, they have the disadvantage of rendering opaque the precise manner in which the modal notions implicit in the concept of invulnerability are characterized in the theorem (“whatever the truth-values of the hypotheses turn out to be,” “no matter what happens”). To the best of my knowledge, the Dutch Book Theorem can only be proven if the implicit modal notion of necessity is characterized in terms of the non-existence of a complete consistent theory (or an ersatz thereof, such as a model in the language of the theory).

65 For instance, this argument may be found in Williamson [156] pp. 411-412. However, one persistent feature of the literature on countable additivity is that while the formal calculations are done with what I have called $\omega_1$-additivity, it is typically presumed that this type of countable additivity provides one with a way in which to assign probabilities to universal statements, a role which in my terminology is exclusively the province of $\omega$-additivity. Hence, for this reason and for the sake of completeness, I include here the proof of the $\omega_1$-additive version of the Dutch Book Theorem. This theorem follows readily from the following fact:

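Proposition 4. Suppose that $P$ is a probability assignment on $L_{\omega_1\omega}$-sentences. Then $P$ is an $\omega_1$-additive probability assignment if and only if for every sequence of $L_{\omega_1\omega}$-sentences $\varphi_1, \ldots, \varphi_n, \ldots$ such that $\models \neg(\varphi_i \,\&\, \varphi_j)$ for $i \neq j$ and $\models \bigvee_n \varphi_n$, it is the case that $\sum_n P(\varphi_n) = 1$.

Here is the proof of this fact. First suppose that $P$ is an $\omega_1$-additive probability assignment. Then $\models \bigvee_n \varphi_n$ implies that $P(\bigvee_n \varphi_n) = 1$. Then by $\omega_1$-additivity and P1-P3, it follows that

\[
1 = P\Big(\bigvee_n \varphi_n\Big) = \lim_n P\Big(\bigvee_{i=1}^{n} \varphi_i\Big) = \lim_n \sum_{i=1}^{n} P(\varphi_i) = \sum_n P(\varphi_n).
\]

Now we assume that $P$ has this feature and we attempt to show that it also satisfies $\omega_1$-additivity. Let $\psi_0 = \neg\bigvee_n \varphi_n$ and let $\psi_n = \varphi_n \,\&\, \bigwedge_{i=1}^{n-1} \neg\varphi_i$. Then $\models \bigvee_n \psi_n$ and $\models \neg(\psi_i \,\&\, \psi_j)$ for $i \neq j$. Then $P(\neg\bigvee_n \varphi_n) + P(\bigvee_n \varphi_n) = 1 = \sum_n P(\psi_n)$, and of course

\[
\sum_n P(\psi_n) = P\Big(\neg\bigvee_n \varphi_n\Big) + \sum_n P\Big(\varphi_n \,\&\, \bigwedge_{i=1}^{n-1} \neg\varphi_i\Big),
\]

so that one finally has

\[
P\Big(\bigvee_n \varphi_n\Big) = \sum_n P\Big(\varphi_n \,\&\, \bigwedge_{i=1}^{n-1} \neg\varphi_i\Big) = \lim_n P\Big(\bigvee_{i=1}^{n} \varphi_i\Big).
\]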
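The disjointification step in this proof, which replaces each $\varphi_n$ by $\psi_n = \varphi_n \,\&\, \bigwedge_{i<n} \neg\varphi_i$, can be illustrated in a finite toy setting. The following is only a sketch under stated assumptions: sentences are modelled as finite sets of "worlds", the probability of a sentence is the total weight of its worlds, and the names `disjointify`, `prob`, `weights`, and `phis` are hypothetical and not part of the text.

```python
from fractions import Fraction

def disjointify(events):
    """Return psi_n = phi_n minus (phi_1 ∪ ... ∪ phi_{n-1}) for each n."""
    seen = set()
    result = []
    for phi in events:
        result.append(set(phi) - seen)  # keep only worlds not covered earlier
        seen |= set(phi)
    return result

def prob(event, weights):
    """Probability of an event as the sum of the weights of its worlds."""
    return sum((weights[w] for w in event), Fraction(0))

# Toy model (assumption): four equally weighted worlds.
weights = {w: Fraction(1, 4) for w in "abcd"}
phis = [{"a", "b"}, {"b", "c"}, {"c"}]   # overlapping "sentences"
psis = disjointify(phis)                 # pairwise exclusive versions
union = set().union(*phis)

# The psi_n are pairwise exclusive and have the same disjunction as the
# phi_n, so their probabilities add up to the probability of the union.
total = sum(prob(psi, weights) for psi in psis)
```

In this finite case additivity is automatic; the point of the proposition is that the same bookkeeping carries over to countable sequences exactly when $P$ is $\omega_1$-additive.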

Now, for ease of reference, recall that the version of the Dutch Book Theorem currently under consideration reads as follows:

Theorem 5. Dutch Book Theorem, $\omega_1$-additive Version: Suppose that $P$ is a function from $L_{\omega_1\omega}$-sentences to real numbers. Then $P$ is an $\omega_1$-additive $L_{\omega_1\omega}$-probability assignment if for every infinite sequence of real numbers $s_n$ and every infinite sequence of $L_{\omega_1\omega}$-sentences $\varphi_n$ such that the series $\sum_n s_n P(\varphi_n)$ is absolutely convergent, there is a complete consistent $L_{\omega_1\omega}$-theory $T$ such that $\sum_n s_n (T(\varphi_n) - P(\varphi_n)) \geq 0$.

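It is now easy to see how this theorem follows from the fact which we just proved, and here we follow Williamson [156] pp. 411-412. For, by the standard version of the Dutch Book Theorem, it follows that $P$ is a probability assignment. So suppose that $P$ is not $\omega_1$-additive. Then by the above fact, there is a sequence of $L_{\omega_1\omega}$-sentences $\varphi_1, \ldots, \varphi_n, \ldots$ such that $\models \neg(\varphi_i \,\&\, \varphi_j)$ for $i \neq j$ and $\models \bigvee_n \varphi_n$ and $\sum_n P(\varphi_n) < 1$. Choose $s_n = -1$, so that $\sum_n |s_n P(\varphi_n)| = \sum_n P(\varphi_n) < 1$. Let $T$ be a complete consistent $L_{\omega_1\omega}$-theory. Since the $\varphi_n$ are pairwise exclusive and jointly exhaustive, exactly one of them is true according to $T$, so that $\sum_n T(\varphi_n) = 1$. Then $\sum_n s_n (T(\varphi_n) - P(\varphi_n)) = -1 + \sum_n P(\varphi_n) < 0$, which contradicts our assumption on $P$.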
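The arithmetic behind this last step can be checked on a finite analogue. The sketch below is an illustration only, and it rests on assumptions not in the text: three mutually exclusive, jointly exhaustive sentences stand in for the infinite sequence, and the prices (which sum to 9/10 rather than 1) and the name `net_payoff` are hypothetical. In the theorem itself the failure is one of countable additivity, which no finite example can exhibit.

```python
from fractions import Fraction

def net_payoff(stakes, truth_values, prices):
    """Compute sum_n s_n * (T(phi_n) - P(phi_n)) for one complete theory T."""
    return sum(s * (t - p) for s, t, p in zip(stakes, truth_values, prices))

# Assumed prices for three exclusive and exhaustive sentences; they sum to
# 9/10, so additivity fails on this partition.
prices = [Fraction(3, 10), Fraction(3, 10), Fraction(3, 10)]
stakes = [-1, -1, -1]                      # the choice s_n = -1 from the proof

# Each "complete consistent theory" makes exactly one sentence true.
worlds = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
payoffs = [net_payoff(stakes, w, prices) for w in worlds]
```

Every entry of `payoffs` equals $-1 + \sum_n P(\varphi_n)$, which is negative whichever sentence turns out true; this is the sure loss that the theorem's hypothesis rules out.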

66 It is important because one is also often interested in proving the converse to the Dutch Book Theorems, and in doing so, it will be necessary to include such a convergence criterion. Technically, the convergence criterion is not necessary for the Dutch Book Theorem itself: one can remove it and the theorem will still hold.

67 More formally, here we are appealing to the $\Sigma^0_1$-completeness of Robinson’s Q, which was defined in endnote 62.

68 Given the prominence of Dutch Book arguments in the literature on probability, there are naturally many such concerns and objections. For instance, one objection is that the inference from rationality to invulnerability presupposes that the value attributed to currency is additive, so that the entire inference from invulnerability to the satisfaction of probability axioms (such as the additive axiom P3) displays a subtle kind of circularity: one obtains additivity with respect to probability only because one surreptitiously presupposes additivity in the setting of value (cf. Armendt [2]). Another objection is based on the observation that the type of invulnerability in question in the theorem is an invulnerability to net loss with respect to a finite number of bets. This objection then alternatively suggests that rationality only requires invulnerability to loss with respect to individual bets taken one-by-one (cf. Maher [104] § 4.6 pp. 94 ff).

69 By computable, I am here adverting to the notion of Turing computation and relative Turing computation. To put it very roughly, one set of natural numbers $X$ is Turing computable from another set of natural numbers $Y$ if there is a fixed program which, given any input $n$ and allowed access to arbitrarily large initial segments of $Y$, can determine if $n$ is in $X$. (For more details, see the definition of $X \leq_T Y$ in Soare [139] § III.1.) Hence, if neither $X$ nor $Y$ is a subset of the natural numbers, then the formal notions of Turing computability and relative Turing computability are simply not applicable. However, if $X$ and $Y$ are countable, then by definition there are injective maps $f : X \to \omega$ and $g : Y \to \omega$ and hence the notions of Turing computability and relative Turing computability apply to $f(X)$ and $g(Y)$. If the maps $f : X \to \omega$ and $g : Y \to \omega$ are somehow suitably natural or canonical, then it would be commonplace in working mathematics to extend the notions of Turing computation and relative Turing computation to $X$ and $Y$ themselves. For instance, if $X$ and $Y$ are sets of rational numbers, and rational numbers are defined as certain sorts of equivalence classes of natural numbers, then the functions $f$ and $g$ might simply pick out representatives from these equivalence classes. It is this sort of example that I have in mind when I say that the predicates of Turing computation and relative Turing computation can apply by proxy to countable objects. But it is by no means trivial to say something more precise about the conditions under which these predicates can be extended to countable objects, precisely because it is difficult to say something more precise about the sense in which the maps $f : X \to \omega$ and $g : Y \to \omega$ may be said to be natural or canonical.
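The rough description of relative computability given above can be made concrete with a small sketch. Everything named here is an assumption made for illustration: the oracle is modelled as a function returning an initial segment of $Y$ as a list of booleans, and the particular sets (with $Y$ the numbers not divisible by three and $X = \{n : n \in Y \text{ and } n+1 \in Y\}$) are hypothetical examples rather than ones drawn from the text.

```python
from typing import Callable, List

# An oracle for Y answers queries about initial segments: oracle(k) lists
# whether each of 0, 1, ..., k-1 belongs to Y.
Oracle = Callable[[int], List[bool]]

def make_oracle(member: Callable[[int], bool]) -> Oracle:
    """Wrap a membership test for Y as an initial-segment oracle."""
    return lambda k: [member(i) for i in range(k)]

def in_X(n: int, oracle: Oracle) -> bool:
    """Fixed program deciding n in X = {n : n in Y and n + 1 in Y}.

    It consults only the initial segment of Y of length n + 2.
    """
    segment = oracle(n + 2)
    return segment[n] and segment[n + 1]

# Hypothetical Y: the natural numbers not divisible by three.
oracle_Y = make_oracle(lambda i: i % 3 != 0)
members_of_X = [n for n in range(10) if in_X(n, oracle_Y)]
```

The same program `in_X` works unchanged for any oracle, which is the sense in which the reduction is uniform in $Y$; here $Y$ happens to be computable, but nothing in `in_X` depends on that.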

70 There are several ways to characterize the real algebraic numbers. On the one hand, they can be defined as the field of all those real numbers which are roots of polynomial equations with rational-valued coefficients. Using this definition, one can produce concrete examples and non-examples of real algebraic numbers: for instance, $\sqrt{2}$ is real algebraic, while $e$ and $\pi$ are not real algebraic. On the other hand, it follows from work of Tarski that the real algebraic numbers are the smallest field which is elementarily equivalent to the real numbers, in that every first-order sentence which is true of the real algebraic numbers is likewise true of the real numbers, and vice-versa (cf. Marker [107] § 3.3). This definition has the advantage that it tells one that the real algebraic numbers satisfy various laws which are partially characteristic of the real numbers: for instance, it tells one that the intermediate value theorem is true for definable continuous functions. In discussing computability and non-computability of probability assignments with values in the real algebraic numbers $(K, 0, 1, +, \times, \leq)$, it shall be supposed that the real algebraic numbers are identified with an isomorphic copy $(M, 0, 1, \oplus, \otimes, \preceq)$ where $M$ is a computable subset of the natural numbers and where $\oplus$, $\otimes$, and $\preceq$ are computable. That such a computable copy exists can be seen either by directly effectivizing the normal proof of the existence of real closures (cf. Simpson [138] Theorem II.9.7 p. 98) or by using the Effective Completeness Theorem (cf. Harizanov [61] Theorem 4.1 p. 18). The application of the Effective Completeness Theorem presupposes Tarski’s proof that the complete theory of the real numbers (and hence the real algebraic numbers) is decidable (cf. Marker [107] Corollary 3.3.16 p. 96). Tarski’s proof has been previously employed in the probabilistic setting: for instance, Fitelson uses it to prove that there is a computable procedure for determining whether there are probability assignments on a finite number of propositional letters which satisfy various probabilistic constraints (cf. [42] p. 114).

71 A sequence of rationals $q_n$ is called a Cauchy sequence if the members of the sequence eventually get arbitrarily close to one another, in the following sense: for every $K > 0$, there is $N > 0$ such that $|q_n - q_m| < 2^{-K}$ for all $n, m \geq N$. A sequence of rationals $q_n$ is called a quickly-converging Cauchy sequence if the rate at which they get close to each other is fixed in advance, in the following sense: for every $K > 0$ it is the case that $|q_K - q_{K+n}| < 2^{-K}$ for all $n \geq 0$ (cf. Simpson [138] Definition II.4.4 p. 74). Two Cauchy sequences $q_n$ and $q'_n$ are said to be equivalent if the absolute value of their difference eventually approaches zero, in the following sense: for every $K > 0$ there is $N > 0$ such that $|q_n - q'_n| < 2^{-K}$ for all $n \geq N$. Likewise, two quickly-converging Cauchy sequences $q_n$ and $q'_n$ are said to be equivalent if the absolute value of their difference approaches zero at a fixed rate, in the following sense: $|q_n - q'_n| \leq 2^{-n+1}$ for all $n \geq 0$ (cf. Simpson [138] Definition II.4.4 p. 74). Just as one can prove that the set of all equivalence classes of Cauchy sequences is a field that is isomorphic to the real numbers, so one can prove that the set of all equivalence classes of quickly-converging Cauchy sequences is a field which is isomorphic to the real numbers (cf. Simpson [138] Theorem II.4.5 p. 76).
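The difference between the two definitions can be seen on a small example. The following sketch is offered only as an illustration under stated assumptions: `sqrt2_approx` is a hypothetical bisection routine for $\sqrt{2}$, and `quickly_converging_on_window` checks the defining inequality $|q_K - q_{K+n}| < 2^{-K}$ only on a finite window, which can refute quick convergence but can never establish it.

```python
from fractions import Fraction

def sqrt2_approx(k: int) -> Fraction:
    """Lower endpoint after k + 1 bisection steps for sqrt(2) on [1, 2]."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(k + 1):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo

def quickly_converging_on_window(q, max_k: int, max_n: int) -> bool:
    """Check |q(K) - q(K+n)| < 2^-K for 1 <= K <= max_k and 0 <= n <= max_n."""
    return all(
        abs(q(k) - q(k + n)) < Fraction(1, 2 ** k)
        for k in range(1, max_k + 1)
        for n in range(max_n + 1)
    )

# Bisection halves the interval each step, so the bound holds on the window.
bisection_ok = quickly_converging_on_window(sqrt2_approx, 8, 8)

# The sequence 1/(k+1) is Cauchy but converges too slowly to meet the bound.
harmonic_ok = quickly_converging_on_window(lambda k: Fraction(1, k + 1), 5, 10)
```

After $k+1$ bisection steps every later approximation lies within $2^{-(k+1)}$ of the current one, which is why the bisection sequence satisfies the fixed rate, whereas $1/(k+1)$ already violates it at $K = 2$.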

72 The precise details of this step of the computation actually vary depending on which of the two aforementioned means of representation of real numbers we choose to employ. If we use the representation as real algebraic numbers described in endnote 70, then this is trivial, since the relation $0 \prec P(\varphi)$ is directly computable from $P$ and the computable structure $(M, 0, 1, \oplus, \otimes, \preceq)$ mentioned in that endnote. If we use the representation of quickly-converging Cauchy sequences described in endnote 71, then we appeal to the fact that the relation $P(\varphi) > 0$ is computably enumerable in $P$ or $\Sigma^{0,P}_1$, in the following sense: there is a $P$-computable predicate $R$ such that $P(\varphi) > 0$ if and only if $\exists n\, R(\varphi, n)$ (cf. Simpson [138] p. 76, Soare [139] § I.4 pp. 18 ff). Hence, if we antecedently know the disjunction $P(\varphi) > 0 \vee P(\psi) > 0$, then we can begin to step through the natural numbers, $P$-computably testing whether $R(\varphi, n)$ or $R(\psi, n)$ along the way, knowing that there will eventually be some $n$ such that we compute $R(\varphi, n)$ or we compute $R(\psi, n)$.
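The search just described can be sketched as follows. This is a minimal illustration under assumptions that go beyond the text: the probabilities are given by quickly-converging Cauchy sequences in the sense of endnote 71, and the predicate $R$ is taken to be "$q_n > 2^{-n}$", which is one possible witness condition (if $q_n > 2^{-n}$ then the limit is positive, and if the limit is positive then some such $n$ exists). The names `witness` and `find_positive` are hypothetical.

```python
from fractions import Fraction
from itertools import count

def witness(q, n: int) -> bool:
    """One possible R(., n): the n-th approximation exceeds its error bound."""
    return q(n) > Fraction(1, 2 ** n)

def find_positive(q_phi, q_psi):
    """Step through n, testing R(phi, n) and R(psi, n) in turn.

    Terminates whenever at least one of the two limits is positive; if both
    limits are zero the loop runs forever, as one would expect of a search
    that is only computably enumerable.
    """
    for n in count(1):
        if witness(q_phi, n):
            return ("phi", n)
        if witness(q_psi, n):
            return ("psi", n)

# Hypothetical inputs: P(phi) = 0 and P(psi) = 1/100, as constant sequences.
result = find_positive(lambda n: Fraction(0), lambda n: Fraction(1, 100))
```

The search does not decide which disjunct holds in advance; it only relies on the antecedent knowledge that at least one of them does, which is what guarantees termination.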

73 For one expression of the idea that outside of the halting set there is a lack of natural examples of non-computable computably enumerable sets, see Simpson [137] p. 287.

74 To the best of my knowledge, there is no extant literature on the relationship between the epistemology of arithmetical principles (like the Peano axioms) and the epistemology of computation. There is an obvious sense in which both are sources of arithmetical justification, and there is a certain sense in which each source implicates the other. For instance, computations are typically shown to be correct by virtue of induction and other of the Peano axioms. Hence, the question “How do you know that algorithm $e$ computes antecedently specified function $f$?” always seems to be answered by recourse to induction and the Peano axioms, in that one formally proves by recourse to these axioms that $\forall n\, \varphi_e(n) = f(n)$. Likewise, the axioms of Robinson’s Q define addition and multiplication in terms of their recursive defining equations, and if these axioms were written out in an entirely relational language (i.e. with no function symbols), then they would include axioms saying that addition and multiplication are total functions. Hence, the Peano axioms as described in the functional language of endnote 46 do not provide an answer to the question: “How do you know that addition and multiplication are total functions?” Outside of the inceptive empiricism defined in § 2.1 and discussed further in § 2.3, it seems that one possible answer to this question would be that one has some primitive “algorithmic knowledge” of these facts. Of course, any satisfactory version of this answer would have to say something more definitive about the nature of algorithmic knowledge and how it ultimately differs from knowledge of arithmetical axioms.

75 For instance, just to point to one recent source, Joyce has suggested that the judgements of probability that feature in justification might be better represented by a family of probability assignments than a single probability assignment (cf. [81] p. 171). This suggestion comes as a response to the objection that it is unrealistic to suppose that degrees of belief or assent can be represented in terms of a single probability assignment.

76 The arithmetic hierarchy and its extension, the projective hierarchy, pervade contemporary mathematical logic and occur in both computability-theoretic and set-theoretic settings. For instance, see Rogers [133] Chapters 14-16, Odifreddi [118] Chapters IV.1-2, Jech [80] Chapters 11, 25, 32-33, and Moschovakis [113], especially Chapter 3.

77 Baker explicitly casts the discussion in these comparative terms. For instance, he takes himself to be answering the following question: “(C) Is the use of induction in mathematics more or less rationally justified than its use in the empirical case?” ([5] p. 64). Further, he explicitly assumes for the sake of argument that “we do have good rational grounds for trusting inductive inference in the empirical case” ([5] p. 64). Hence, Baker’s idea is that there is something peculiar to arithmetical instance confirmation that renders it less trustworthy than physical instance confirmation, and hence in what follows I shall focus on examining this facet of Baker’s essay. However, it should be mentioned that there is much in Baker’s essay which is independent of this point. For instance, Baker’s essay includes an examination of what various mathematicians have said about the status of arithmetical instance confirmation. Baker calls this the “descriptive question” (cf. [5] § 3 pp. 61-63), which he distinguishes from the “normative question” of “Is enumerative induction in mathematics rationally justified” (cf. [5] § 5 pp. 65-68). Baker’s answer to the normative question is “no,” and his primary reason for this is what I have called Baker’s thesis, namely, that arithmetical instance confirmation is biased in a way in which physical instance confirmation is not because the samplings in arithmetical instance confirmation are small. This is the thesis that I reconstruct from the argumentation adduced on pp. 67-68 of Baker’s essay, although it should be mentioned that Baker never explicitly states the thesis in this particular manner. However, for textual evidence for this reconstruction, see the following endnote (endnote 78).

78 The relevant passage in Baker’s essay is the following:

The problem, in the case of GC [the Goldbach Conjecture] and in all other cases of induction in mathematics, is that the sample we are looking at is biased. [. . .] [¶] Definition: a positive integer, n, is minute just in case n is within the range of numbers we can (given our actual physical and mental capabilities) write down using ordinary decimal notation, including (non-iterated exponentiation). [¶] Verified instances of GC to date are not just small, they are minute. [. . .] [¶] Hence the sample of positive instances of GC is biased, and unavoidably so. Imagine, for example, that mathematicians had only looked at even numbers divisible by 4 when checking GC, or only (even) square numbers. Presumably such evidence would carry less weight since the range of instances is comparatively unvaried ([5] pp. 67-68).

It seems that there are two dimensions of bias which can be seen in the above paragraph. On the one hand, the example of verifying the Goldbach Conjecture on the even numbers divisible by four suggests that the relevant dimension of bias is that of unreliability, since if this procedure were in general followed, then one would end up confirming a large number of falsehoods on the basis of an examination of a large number of truths: for instance, if one’s samples were drawn exclusively from the even numbers, then one could confirm the obviously false statement that all numbers are even numbers. On the other hand, Baker does speak of “comparatively unvaried” samplings at the close of the above quotation. However, this is the only place where he mentions this notion, and he does not explicitly say that what he intends by “bias” is such lack of variation. Indeed, one of the difficulties of interpreting this key section of Baker’s essay is that he does not explicitly say what he intends by “bias.” It seems that the best that can be said on the basis of the texts at hand is that unreliability and insufficient diversity are two dimensions of bias which can be found in Baker’s text.

79 One might also have the intuition that “being a small natural number” is a vague term, and that if n is a small natural number, then n+1 is a small natural number. In conjunction with the natural assumption that zero is a small natural number, such an additional supposition would result in the predicate of “being a small natural number” constituting a violation of mathematical induction. Nothing which I will say depends on this additional supposition, and in fact one minor point that I discuss further in the subsequent endnote depends on the negation of this additional supposition. Amongst those who have discussed this matter, the consensus view appears to be that such a violation of mathematical induction is merely apparent and that any satisfactory theory of vagueness must ultimately explain why this violation is merely apparent. See Williamson [157] p. 42 fn 15, and the references therein. However, as Williamson there notes, at least one author has instead endorsed the contrary view that mathematical induction must be suitably modified and restricted.

80 For instance, if the small natural numbers are exclusively the numbers 0, 1, . . . , N, then the set of all small natural numbers is the set {0, 1, . . . , N}, which has cardinality N + 1, which is not a small natural number. However, it is worth pointing out that one might potentially object to the supposition that there is a “greatest” small natural number N. In particular, suppose that (as discussed in the previous endnote) one suspects that “being a small natural number” is vague and violates mathematical induction. Then one might be sympathetic to the thought that typically the way that one proves the existence of greatest elements of finite sets is by noting that their complements have least elements, and that the least number principle serves many of the same roles as mathematical induction, and indeed across a suitable base theory will be provably equivalent to mathematical induction.

81 As discussed two paragraphs previously, all the pointwise-small sets are setwise-small, with the exception of the set of all small natural numbers itself. Let us call pointwise-small sets which are distinct from the set of all small natural numbers proper pointwise-small sets. Then it follows that all proper pointwise-small sets are setwise-small. Consider then the following two claims:

(i) If the samplings in arithmetical instance confirmation are proper pointwise-small sets, then this sampling is biased.

(ii) If the samplings in arithmetical instance confirmation are proper setwise-small sets, then this sampling is biased.

From the fact that all proper pointwise-small sets are setwise-small, it follows trivially that (ii) implies (i), but one cannot infer in the same manner that (i) implies (ii), since there are many setwise-small sets which are not pointwise-small. This is an admittedly more precise albeit excessively more belabored version of the point which I was seeking to make in the main body of the text, which I expressed there in the manner in which I did merely for the sake of not having to introduce the notion of proper pointwise-small sets into the main body of the text.

82 See the following endnote for the relevant quotation.

83 For instance, see Baker’s remark about “good rational grounds” quoted in endnote 77.

84 This quotation comes from the following key paragraph:

A defender of induction in mathematics might respond that matters are no worse than in the empirical case. There are many distinctive features that are common to all observed emeralds, ravens, electrons, and so on; for example, they have all been observed before the present, and they are all within the past light cone of the Earth. So why not argue, on analogous grounds, that empirical induction is biased? The disanalogy, as already mentioned, is that the position of a number in the ordering of integers often does make a difference to its mathematical properties. There are no corresponding systematic differences between past and future or between inside and outside the Earth’s light cone. Indeed, insofar as there are any general theoretical principles they tend to concern the spatial and temporal invariance – other things being equal – of fundamental physical properties. Of course there is still room for a purely sceptical worry concerning induction in the empirical case, but it seems to lack the specific motivation for worry which afflicts induction in mathematics ([5] p. 68).

85 This analysis was articulated by Howson and Urbach, who say: “This idea of the similarity between items of evidence is expressed naturally in probabilistic terms by saying that e1 and e2 are similar provided P (e2|e1) is higher than P (e2); and one might add that the more the first probability exceeds the second, the greater the similarity” ([74] p. 160). In evaluating this version, it is thus helpful to distinguish between three quantities which may potentially gauge the degree of similarity, namely:

\[
\begin{aligned}
s_1(e_1, e_2) &\equiv \frac{P(e_1 \,\&\, e_2)}{P(e_1) \cdot P(e_2)} = \frac{P(e_2 \mid e_1)}{P(e_2)} \\
s_2(e_1, e_2) &\equiv \frac{P(e_1 \,\&\, e_2)}{P(e_1)} - P(e_2) = P(e_2 \mid e_1) - P(e_2) \\
s_3(e_1, e_2) &\equiv P(e_1 \,\&\, e_2) - P(e_1) P(e_2)
\end{aligned}
\]

On the basis of the above quotation from Howson and Urbach, who talk of one probability “exceeding” a second, it might seem that s2 is the intended measure of similarity. However, as is natural, Howson and Urbach also seek to articulate a notion of similarity that is symmetric in that “if e2 is (dis)similar to e1, then e1 is (dis)similar to e2” ([74] p. 160). It is easy to see that s2 is not symmetric in this sense, while s1 and s3 are symmetric in this sense. In his discussion of Howson and Urbach on this matter, Wayne employs s1 and says “For Howson and Urbach, degree of diversity simply is degree of probabilistic independence” ([153] p. 113). However, I do not see any obvious reason to prefer s1 to s3, and hence in the main body of the text, I express Howson and Urbach’s analysis disjunctively, as gauging degree of similarity either in terms of the quotient from s1 or the difference from s3.
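The symmetry claims about s1, s2, and s3 can be checked numerically. The sketch below is only an illustration: the three functions take the probabilities P(e1), P(e2), and P(e1 & e2) as inputs, and the particular values chosen (1/2, 1/4, and 1/5) are hypothetical, selected merely so that they form a coherent assignment.

```python
from fractions import Fraction

def s1(p1, p2, p12):
    """Quotient measure: P(e1 & e2) / (P(e1) * P(e2))."""
    return p12 / (p1 * p2)

def s2(p1, p2, p12):
    """Difference measure: P(e2 | e1) - P(e2)."""
    return p12 / p1 - p2

def s3(p1, p2, p12):
    """Covariance-style measure: P(e1 & e2) - P(e1) * P(e2)."""
    return p12 - p1 * p2

# Hypothetical values for P(e1), P(e2), and P(e1 & e2).
p_e1, p_e2, p_both = Fraction(1, 2), Fraction(1, 4), Fraction(1, 5)

# Swapping the roles of e1 and e2 swaps the first two arguments.
s1_sym = s1(p_e1, p_e2, p_both) == s1(p_e2, p_e1, p_both)
s2_sym = s2(p_e1, p_e2, p_both) == s2(p_e2, p_e1, p_both)
s3_sym = s3(p_e1, p_e2, p_both) == s3(p_e2, p_e1, p_both)
```

With these values s2 takes the value 3/20 in one direction and 3/10 in the other, while s1 and s3 are unchanged by the swap, in line with the observation that only s1 and s3 are symmetric.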

86 More formally, one needs to invoke the $\Sigma^0_1$-completeness of Robinson’s Q here, which was defined and employed in endnote 62.

87 There is admittedly something quite unsatisfactory about this example, namely that since the evidence is by construction given a high prior probability, this evidence will only be able to confirm the universal hypothesis to a low degree, at least if degree of confirmation of a hypothesis h by evidence e is measured by the quantity P (h|e) − P (h). Hence, one might naturally try to seek out an example where (i) the samplings are pointwise-small, where (ii) the evidence is close to being probabilistically independent, and where (iii) the evidence confirms the hypothesis to a high degree. For, Baker naturally could suggest that his thesis was only intended to cover arithmetical instance confirmation in which the evidence confirms the hypothesis to a high degree, in which case the example described in the main body of the text would not vitiate the thesis.
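The claim that evidence with a high prior probability can confirm only to a low degree can be made explicit with a short calculation. The following is a sketch of one way to see it, assuming only that $P(e) > 0$ and using the difference measure mentioned in this endnote; the numerical value at the end is a hypothetical illustration.

```latex
\begin{aligned}
P(h \mid e) - P(h)
  &= \frac{P(h \,\&\, e) - P(h)\,P(e)}{P(e)} \\
  &\leq \frac{P(h) - P(h)\,P(e)}{P(e)}
     && \text{since } P(h \,\&\, e) \leq P(h) \\
  &= \frac{P(h)\,\bigl(1 - P(e)\bigr)}{P(e)} \\
  &\leq \frac{1 - P(e)}{P(e)}
     && \text{since } P(h) \leq 1 .
\end{aligned}
% Illustration (assumed value): if P(e) = 0.95, then
% P(h | e) - P(h) <= 0.05 / 0.95 = 1/19, roughly 0.053.
```

So as the prior probability of the evidence approaches one, the upper bound on the degree of confirmation approaches zero, whatever the hypothesis.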

88 This analysis is due to Horwich, who says: “[. . . ] I want to suggest thatevidence is significantly diverse to the extent that its likelihood is low, relativeto many of the most plausible competing hypotheses” ([73] p. 118). Horwichis concerned to prove that it follows from his analysis that diverse evidenceconfirms to a higher degree than less diverse evidence, and in the course of hisproof of this fact, he requires that the hypotheses are “mutually exclusive andit is known that one of them is true” ([73] p. 118). Further, it is evident fromthis proof that one can neglect hypotheses which do not have a substantialprior probability, and hence Horwich notes that he needs only insist upon lowlikelihood with respect to those hypotheses which do have a substantial priorprobability (cf. Horwich’s discussion of the example of curve-fitting on [73]p. 120).

89 I am speculating somewhat on whether Horwich would associate his notion of diversity with some notion of complexity, and so this should merely be regarded as my attempt to motivate the connection between diversity of evidence and low-likelihood. He is explicit that his notion of diversity of evidence is dependent on some antecedent specification of the pool of competing hypotheses, saying: “[. . . ] I deny that a data set can be evaluated with respect to significant diversity unless this is done in relation to a particular class of alternative hypotheses and prior assessment of the plausibility of those hypotheses” ([73] pp. 121-122).

90 For instance, Howson and Urbach employ such counterfactual language, saying: “[. . . ] we can gloss your conditional degree in a given b to be what you believe the fair betting-rate on a would be relative to the same information stock augmented by the additional information consisting of the statement that b is true” ([74] p. 82).

91 For instance, imagine that one has recently discovered a proof of ¬h, but that this proof does not generate or provide an explicit counterexample. Then it seems that one would have no reason to suspect that the counterexamples were small rather than large, and hence no reason to treat eX and eY differently.

92 This analysis is due to Glymour, and the full quotation is the following:

The only means available for guarding against such errors is to have a variety of evidence, so that as many hypotheses as possible are tested in as many different ways as possible. What makes one way of testing relevantly different from another is that the hypotheses used in one computation are different from the hypotheses used in the other computation ([54] p. 140, cf. [52] pp. 419-420, [53] p. 234).

93 There is a large literature on alternative measures of degree of confirmation, and in his discussion of these, Fitelson notes that absent a compelling case that one of these measures corresponds better to our pre-theoretic notions about the quality of evidence, any argument that did not appeal to features particular to some but not others of these measures would obviously be preferable to one which did so appeal (cf. Fitelson [41] p. S 364). Hence, in this chapter, I shall adopt the procedure of employing the quantity P(h|e & k) − P(h|k) as the degree of confirmation in the main body of the text, but then discuss in the endnotes the extent to which the arguments given in the main body of the text also hold for alternative measures of the degree of confirmation. For these purposes, it will be helpful to here enumerate some of these alternative measures of confirmation and note some of their elementary properties. In particular, following Fitelson [41] p. S 363, consider the following four measures of the degree of confirmation of a hypothesis h by evidence e relative to background knowledge k:

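\[ d(h \mid e; k) = P(h \mid e \& k) - P(h \mid k) \]

\[ r(h \mid e; k) = \ln\left[\frac{P(h \mid e \& k)}{P(h \mid k)}\right] \]

\[ \ell(h \mid e; k) = \ln\left[\frac{P(e \mid h \& k)}{P(e \mid \neg h \& k)}\right] \]

\[ s(h \mid e; k) = P(k) \cdot P(e \& k) \cdot d(h \mid e; k) \]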
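As a small illustration of the four measures just listed, the following Python sketch evaluates them on a toy probability space. Everything about the setup is an assumption made for illustration only: the four equiprobable "worlds", the representation of h, e and k as sets of worlds, and the names `WEIGHTS`, `prob` and `confirmation_measures` do not come from the text.

```python
from fractions import Fraction
from math import log

# Hypothetical toy space: four equiprobable "worlds"; events are sets of worlds.
WEIGHTS = {w: Fraction(1, 4) for w in range(4)}

def prob(event):
    """Probability of an event, i.e. the total weight of its worlds."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))

def confirmation_measures(h, e, k):
    """Return (d, r, l, s) for hypothesis h, evidence e, background k."""
    not_h = frozenset(WEIGHTS) - h
    p_h_given_ek = prob(h & e & k) / prob(e & k)
    p_h_given_k = prob(h & k) / prob(k)
    p_e_given_hk = prob(e & h & k) / prob(h & k)
    p_e_given_not_hk = prob(e & not_h & k) / prob(not_h & k)
    d = p_h_given_ek - p_h_given_k                      # difference measure
    r = log(float(p_h_given_ek / p_h_given_k))          # log-ratio measure
    l = log(float(p_e_given_hk / p_e_given_not_hk))     # log-likelihood-ratio measure
    s = prob(k) * prob(e & k) * d                       # the measure s, as defined above
    return d, r, l, s

k = frozenset({0, 1, 2, 3})   # background knowledge holds in every world
h = frozenset({0})            # hypothesis
e = frozenset({0, 1})         # evidence, entailed by h & k
d, r, l, s = confirmation_measures(h, e, k)
```

In this toy case all four quantities are positive, so each measure counts e as confirming h relative to k, although they assign different magnitudes.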

In some of what follows, various elementary properties of these measures of the degree of confirmation will be appealed to, which for the sake of completeness are stated and proven here. The motivation for the subscripts in what follows is that our primary application is to Figure 2.1, where the g stands for “genuine” and the p stands for “pseudo.” Hence, when reading the proof of parts C2-C3 of the following proposition, it is helpful to keep in mind the mnemonic “genuine implies pseudo.”

Proposition 6. Suppose that c(h|e; k) is one of d(h|e; k), r(h|e; k), ℓ(h|e; k) or s(h|e; k). Then

(C1) If h & k |= eg, ep and c(h|eg; k), c(h|ep; k) > 0, then

(a) c(h|eg; k) > c(h|ep; k) if and only if P(ep & k) > P(eg & k), and

(b) c(h|eg; k) ≥ c(h|ep; k) if and only if P(ep & k) ≥ P(eg & k), and

(c) c(h|eg; k) = c(h|ep; k) if and only if P(ep & k) = P(eg & k).

(C2) If h & k |= eg and eg & k |= ep and c(h|eg; k), c(h|ep; k) > 0, then

(a) c(h|eg; k) ≥ c(h|ep; k).

(b) If c(h|eg; k) = c(h|ep; k) then P(ep & k) = P(eg & k).

(C3) If hg & k |= hp and hp & k |= e and c(hg|e; k), c(hp|e; k) > 0, then

(a) c(hp|e; k) ≥ c(hg|e; k).

(b) If c = r then $c(h_p \mid e; k) = c(h_g \mid e; k) = \ln\left[\frac{P(k)}{P(e \& k)}\right]$.

(c) If c ∈ {d, s, ℓ} and c(hp|e; k) = c(hg|e; k) then P(hp & k) = P(hg & k).

First we establish (C1), proving only (a), since the proofs for (b)-(c) are identical. Further, in the proof, we use ei to range over eg, ep and we use hi to range over hg, hp. Suppose that h & k |= eg, ep. In the case of d, we have

\[ d(h \mid e_i; k) = P(h \mid e_i \& k) - P(h \mid k) = P(h \& k)\left(\frac{1}{P(e_i \& k)} - \frac{1}{P(k)}\right), \]

so that

\[ d(h \mid e_g) > d(h \mid e_p) \Leftrightarrow \frac{1}{P(e_g \& k)} - \frac{1}{P(k)} > \frac{1}{P(e_p \& k)} - \frac{1}{P(k)} \Leftrightarrow P(e_p \& k) > P(e_g \& k) \]

In the case of r, it follows that $\frac{P(h \mid e_i \& k)}{P(h \mid k)} = \frac{P(k)}{P(e_i \& k)}$, so that

\[ r(h \mid e_g) > r(h \mid e_p) \Leftrightarrow \frac{P(k)}{P(e_g \& k)} > \frac{P(k)}{P(e_p \& k)} \Leftrightarrow P(e_p \& k) > P(e_g \& k) \]

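In the case of ℓ, it follows that $\frac{P(e_i \mid h \& k)}{P(e_i \mid \neg h \& k)} = \frac{P(\neg h \& k)}{P(e_i \& \neg h \& k)}$, so that

\[ \ell(h \mid e_g) > \ell(h \mid e_p) \Leftrightarrow \frac{P(\neg h \& k)}{P(e_g \& \neg h \& k)} > \frac{P(\neg h \& k)}{P(e_p \& \neg h \& k)} \Leftrightarrow P(e_p \& \neg h \& k) > P(e_g \& \neg h \& k) \]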


But note that

P (ep & ¬h & k) > P (eg & ¬h & k)

⇔ P (ep & ¬h & k) + P (h & k) > P (eg & ¬h & k) + P (h & k)

⇔ P (ep & ¬h & k) + P (ep & h & k) > P (eg & ¬h & k) + P (eg & h & k)

⇔ P (ep & k) > P (eg & k)

In the case of s, we have

s(h|ei; k) = P (k)·P (ei & k)·[P (h|ei & k)−P (h|k)] = P (h & k)·[P (k)−P (ei & k)]

Hence we have

s(h|eg) > s(h|ep)⇔ P (k)−P (eg & k) > P (k)−P (ep & k)⇔ P (ep & k) > P (eg & k)

For (C2)(a), note that it follows directly from (C1)(b). For, suppose that h & k |= eg and eg & k |= ep. Then eg & k |= ep & k implies that P(ep & k) ≥ P(eg & k). Then by the left-to-right direction of (C1)(b), it follows that c(h|eg; k) ≥ c(h|ep; k). Likewise, (C2)(b) follows directly from (C1)(c).
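Part (C1)(a) can also be spot-checked by brute force on a small finite probability space. The Python sketch below is only a sanity check under stated assumptions, not a substitute for the proof: the four weighted worlds and the names `WEIGHTS`, `scores` and `check_c1a` are hypothetical, and for r and ℓ the code compares the argument of the logarithm minus 1 rather than the logarithm itself, which preserves both positivity and ordering because ln is strictly increasing.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical weights on four worlds; any positive weights summing to 1 would do.
WEIGHTS = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}
ALL = frozenset(WEIGHTS)
EVENTS = [frozenset(c) for n in range(len(ALL) + 1) for c in combinations(sorted(ALL), n)]

def P(event):
    """Exact probability of an event (a set of worlds)."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))

def scores(h, e, k):
    """Exact order-preserving stand-ins for d, r, l, s (None if undefined)."""
    not_h = ALL - h
    if 0 in (P(e & k), P(h & k), P(not_h & k), P(e & not_h & k)):
        return None
    d = P(h & e & k) / P(e & k) - P(h & k) / P(k)
    r = (P(h & e & k) / P(e & k)) / (P(h & k) / P(k)) - 1   # > 0 iff ln(ratio) > 0
    l = (P(e & h & k) / P(h & k)) / (P(e & not_h & k) / P(not_h & k)) - 1
    s = P(k) * P(e & k) * d
    return {"d": d, "r": r, "l": l, "s": s}

def check_c1a():
    """Check (C1)(a) for every choice of k, h, e_g, e_p; return the number of cases checked."""
    checked = 0
    for k in EVENTS:
        for h in EVENTS:
            for e_g in EVENTS:
                for e_p in EVENTS:
                    # Hypothesis of (C1): h & k entails both e_g and e_p.
                    if not (h & k <= e_g and h & k <= e_p):
                        continue
                    c_g, c_p = scores(h, e_g, k), scores(h, e_p, k)
                    if c_g is None or c_p is None:
                        continue
                    for name in c_g:
                        if c_g[name] > 0 and c_p[name] > 0:
                            assert (c_g[name] > c_p[name]) == (P(e_p & k) > P(e_g & k))
                            checked += 1
    return checked
```

Calling `check_c1a()` raises an `AssertionError` if any case violates the equivalence and otherwise returns the number of cases it examined.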

For (C3), the proofs of (b)-(c) follow more or less directly from (a), and so we will present the proof for (a) and make a few brief remarks indicating how this proof gives the proofs for (b)-(c). So suppose that hg & k |= hp and hp & k |= e. In the case of d, we have

\[ d(h_i \mid e; k) = P(h_i \mid e \& k) - P(h_i \mid k) = P(h_i \& k)\left(\frac{1}{P(e \& k)} - \frac{1}{P(k)}\right), \]

so that

\[ d(h_p \mid e) \geq d(h_g \mid e) \Leftrightarrow P(h_p \& k)\left(\frac{1}{P(e \& k)} - \frac{1}{P(k)}\right) \geq P(h_g \& k)\left(\frac{1}{P(e \& k)} - \frac{1}{P(k)}\right) \Leftrightarrow P(h_p \& k) \geq P(h_g \& k) \]

Since hg & k |= hp, it follows that P(hp & k) ≥ P(hg & k), and so we are done. In the case of r, it follows that $\frac{P(h_i \mid e \& k)}{P(h_i \mid k)} = \frac{P(k)}{P(e \& k)}$, so that

\[ r(h_p \mid e; k) = \ln\left[\frac{P(k)}{P(e \& k)}\right] \geq \ln\left[\frac{P(k)}{P(e \& k)}\right] = r(h_g \mid e; k). \]

In the case of ℓ, it follows that $\frac{P(e \mid h_i \& k)}{P(e \mid \neg h_i \& k)} = \frac{P(\neg h_i \& k)}{P(e \& \neg h_i \& k)}$, so that

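\begin{align*}
& \ell(h_p \mid e; k) \geq \ell(h_g \mid e; k) \tag{2.6} \\
\Leftrightarrow\ & \frac{P(\neg h_p \& k)}{P(e \& \neg h_p \& k)} \geq \frac{P(\neg h_g \& k)}{P(e \& \neg h_g \& k)} \\
\Leftrightarrow\ & \frac{P(\neg h_p \& k)}{P(e \& \neg h_p \& k)} \geq \frac{P(\neg h_g \& h_p \& k) + P(\neg h_g \& \neg h_p \& k)}{P(e \& \neg h_g \& h_p \& k) + P(e \& \neg h_g \& \neg h_p \& k)} \\
\Leftrightarrow\ & \frac{P(\neg h_p \& k)}{P(e \& \neg h_p \& k)} \geq \frac{P(\neg h_g \& h_p \& k) + P(\neg h_p \& k)}{P(e \& \neg h_g \& h_p \& k) + P(e \& \neg h_p \& k)} \\
\Leftrightarrow\ & \frac{P(\neg h_p \& k)}{P(e \& \neg h_p \& k)} \geq \frac{P(\neg h_g \& h_p \& k) + P(\neg h_p \& k)}{P(\neg h_g \& h_p \& k) + P(e \& \neg h_p \& k)} \\
\Leftrightarrow\ & P(\neg h_p \& k) \cdot P(\neg h_g \& h_p \& k) + P(\neg h_p \& k) \cdot P(e \& \neg h_p \& k) \\
& \quad \geq P(e \& \neg h_p \& k) \cdot P(\neg h_g \& h_p \& k) + P(e \& \neg h_p \& k) \cdot P(\neg h_p \& k) \\
\Leftrightarrow\ & P(\neg h_p \& k) \cdot P(\neg h_g \& h_p \& k) \geq P(e \& \neg h_p \& k) \cdot P(\neg h_g \& h_p \& k) \tag{2.7}
\end{align*}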

But we have P(¬hp & k) ≥ P(e & ¬hp & k) since e & ¬hp & k |= ¬hp & k, and so we are done.

For (C3)(c) in the case where c = ℓ, note that the equivalence between equation (2.6) and equation (2.7) remains if the inequality is replaced by an equality throughout. Assuming that P(¬hg & hp & k) > 0, one can then infer from ℓ(hp|e; k) = ℓ(hg|e; k) that P(¬hp & k) = P(e & ¬hp & k), so that

\[ \ell(h_p \mid e; k) = \ln\left[\frac{P(e \mid h_p \& k)}{P(e \mid \neg h_p \& k)}\right] = \ln\left[\frac{P(\neg h_p \& k)}{P(e \& \neg h_p \& k)}\right] = \ln(1) = 0, \]

which contradicts the hypothesis of C3 that ℓ(hp|e; k) > 0. Hence, one must rather have that P(¬hg & hp & k) = 0, so that P(hp & k) = P(¬hg & hp & k) + P(hg & hp & k) = P(hg & hp & k) = P(hg & k).

In the case of s, we have s(hi|e; k) = P(k) · P(e & k) · [P(hi|e & k) − P(hi|k)] = P(hi & k) · [P(k) − P(e & k)], so that

s(hp|e) ≥ s(hg|e) ⇔ P(hp & k) · [P(k) − P(e & k)] ≥ P(hg & k) · [P(k) − P(e & k)]

⇔ P(hp & k) ≥ P(hg & k)

Since hg & k |= hp, it follows that P(hp & k) ≥ P(hg & k), and so we are done.

94 This terminology is not entirely consonant with standard model-theoretic terminology. Instead of saying that the set D is A-indiscernible, in model theory it would rather be said that all the elements of D have the same complete type over A (cf. Marker [107] § 4.1 pp. 115 ff). Further, in model theory, saying that a set D is A-indiscernible would express a more general property, namely that any two finite sequences of elements of the same length from D have the same complete type over A (cf. Marker [107] Definition 5.2.1 p. 178). I have co-opted the appellation of indiscernibility because it allows me to avoid introducing the terminology of complete types and because it resonates with well-understood philosophical notions like the indiscernibility of identicals.

95 However, one might legitimately ask how such invariance is implicated by reasoning which is specifically geometrical in character. One response to this question is related to a point that Ken Manders has made in a series of recent essays. Manders’s idea is that diagrammatic reasoning is facilitated by the fact that inferences that are licensed by the diagram concern co-exact features: features such as incidence which are invariant under perturbations of the diagram ([105] pp. 69 ff, [106] § 4.2.2 pp. 91 ff). In particular, invariance under perturbations blocks the potential objection that what one has inferred from the diagram is merely an artifact of the particular manner in which one has drawn the diagram, since invariance precisely means that this feature will persist in the face of a wide variety of alterations to the diagram. For instance, Manders quotes Felix Klein as expressing the concern that “there is real danger that a pupil of Euclid may, because of a falsely drawn figure, come to a false conclusion” (quoted on [105] p. 88). I take it that Manders is responding to such a concern when he speaks of the ‘threat of disarray’ in the following quotation:

Co-exact attributions either arise by suitable entries in the discursive text (the setting-out of a claim, the application of a prior result or a postulate, such as that licensing entry of a circle in the proof of I.1); or are licensed directly by the diagram; for example, an intersection point of the two circles in Euclid I.1. This poses no immediate threat of disarray, because co-exact attributes (again, by definition) are ‘locally invariant’ under variation of the diagram: they are shared by a range of perturbed diagrams ([106] p. 94).

So in regard to the question of how invariance relates to geometry, the suggestion here would be that geometric reasoning, insofar as it is both rigorous and diagrammatic, must be such that its primitive non-logical relations (e.g. incidence) display a high level of invariance.

96 It seems prudent to mention that there are alternative presentations of the Euclidean plane, some of which display high levels of indiscernibility and some of which display no indiscernibility. For an example of the latter, consider the real numbers R with the usual addition and multiplication functions and the usual ordering, and consider the Euclidean plane to be the definable set R × R in this structure. This is the presentation of the Euclidean plane with which one operates in standard analytic geometry, e.g. in that type of analytic geometry in which one develops and deploys such formulas as the quadratic formula. It is easy to see that any subset D ⊆ Rn with more than one element is not indiscernible, for the simple reason that with addition and multiplication, one can define each of the rational numbers and hence use the ordering to distinguish between different elements of the set. For instance, if D ⊆ R and a, b ∈ D with a < b, then there is a rational q ∈ Q such that a < q < b, and hence b satisfies the formula ψ(x) ≡ x > q while a does not. So this is why subsets of Rn are not indiscernible relative to the basic structure given by addition, multiplication, and the usual ordering on the real numbers R.
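The separating rational q in the preceding argument can be produced explicitly. The Python sketch below does so for floating-point inputs; the function name `rational_between` is hypothetical, and since floats only approximate real numbers this is an illustration of the idea rather than a construction on R itself.

```python
from fractions import Fraction
from math import floor, sqrt, pi, e

def rational_between(a, b):
    """Return a rational q with a < q < b, for floats a < b."""
    if not a < b:
        raise ValueError("need a < b")
    n = floor(1 / (b - a)) + 1           # n > 1/(b - a), so the grid step 1/n is below b - a
    q = Fraction(floor(n * a) + 1, n)    # a multiple of 1/n lying just above a
    if not a < q < b:                    # guard against floating-point rounding
        raise ArithmeticError("rounding error; try inputs that are further apart")
    return q

q1 = rational_between(sqrt(2), 1.5)   # separates sqrt(2) from 1.5
q2 = rational_between(e, pi)          # separates e from pi
```

With such a q in hand, the formula ψ(x) ≡ x > q is satisfied by b but not by a, which is the sense in which the ordering discerns the two points.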

For an alternative presentation of the Euclidean plane which does display high levels of indiscernibility, consider the complex numbers C with addition and multiplication. This is a presentation of the Euclidean plane in which points may be added, multiplied, and otherwise treated as numerical quantities (albeit without an ordering). If A ⊆ C is a set of parameters and B ⊆ Cn is an A-definable set, then the set D of A-independent points of B is an A-indiscernible set, where c = (c1, . . . , cn) ∈ B is A-independent if ci is not in an A ∪ {c1, . . . , ci−1, ci+1, . . . , cn}-definable finite set. In the case where A = ∅ and n = 1 and B = C, this just says that any two transcendental numbers are indiscernible with respect to all parameter-free first-order formulas, where a transcendental number is simply one which is not the root of any algebraic equation with rational coefficients. For instance, √2 and i = √−1 are not transcendental, while π and e are transcendental.

In this presentation of the Euclidean plane, indiscernibility follows from the fact that the complex numbers are uncountably categorical: any other structure that satisfies the same first-order sentences and which has the same cardinality as it is isomorphic to it. Indeed, any uncountably categorical structure M which is itself uncountable is similar to C in this respect: there is some set C ⊆ Mk definable from parameters A such that if B ⊆ Cn is an A-definable set, then the set D of A-independent points of B is an A-indiscernible set (cf. Marker [107] Lemma 6.1.16 p. 209). This fact emerges in contemporary proofs of Morley’s Theorem, which says that if a first-order theory has only one model of some uncountable cardinality (such as the size continuum), then it only has one model of any uncountable cardinality (cf. Marker [107] Theorem 6.1.18 pp. 212-213). Hence, what generates indiscernibility in this type of reasoning is the presence of first-order descriptions of structures of size continuum, descriptions that are unique in that any two structures of size continuum which meet this description are isomorphic.

Here one can again ask why we should think that the type of reasoning involved in uncountable categoricity is geometrical in character. Boris Zilber has suggested that what is geometrical in uncountable categoricity is that the independence relation described above gives rise to a notion of dimension, which like in linear algebra or algebraic geometry may be defined in terms of maximal independence. For a formal definition of dimension in this sense, see e.g. Zilber [163] Appendix B.1.1 and in particular Definition B.1.12 on p. 223, or Marker [107] § 8.1 and in particular the definition subsequent to Lemma 8.1.3 on p. 290. In speaking of Morley’s resolution of Los’s conjecture, Zilber says: “As a matter of fact the main logical problem after answering the question of J. Los was what properties of M make it κ-categorical for uncountable κ? [¶] The answer is now reasonably clear: The key factor is that we can measure definable sets by a rank-function (dimension) and the whole construction is highly homogenous” ([163] Appendix B.2.1 p. 236). Hence, in regard to the question of how reasoning about uncountably categorical structures is geometrical in character, the suggestion is that in any non-trivial uncountably categorical structure, there is a notion of dimension that behaves like the notion of dimension from canonical geometrical settings like linear algebra and algebraic geometry.

97 Assuming that h is confirmed by en, this is a direct application of Proposition 6 (C1)(a) from endnote 93, so that the analogous result will hold if the other measures of degree of confirmation discussed in that endnote are employed.

98 This of course is not to say that stability is a pragmatic notion. Obviously, stability is defined explicitly in epistemic terms. Hence, what I am suggesting is that the argument from the previous paragraph of the main body of the text shows that stable reasoning, an epistemic notion, enjoys a certain ceteris paribus pragmatic advantage.

99 Of course, the extra axiom of supplemented Robinson’s Q is only needed to establish (iii), which will be important in what follows. Clearly, to the extent that one considers analogues of amplificatory empiricism presented in terms of supplemented Robinson’s Q as opposed to Robinson’s Q itself, one will likewise need to consider analogues of inceptive empiricism that incorporate the extra axiom of supplemented Robinson’s Q.

100 As stated, this example is not entirely apposite. For the evidence eF in this example is the pseudo-antecedent associated to the property F(n) which says that n is an even number (i.e. a number equal to 2m for some m), and the hypothesis hF is the pseudo-consequent associated to F, and the background knowledge k is the conjunction of axioms of supplemented Robinson’s Q. Since in this case one has hF & k |= eF and k |= eF, it follows that P(hF|eF & k) = P(hF|k), so that the hypothesis hF is in fact not confirmed by the evidence eF relative to the background knowledge k. However, this example can be easily augmented in a way that preserves the underlying thought that the inference from the evidence to the hypothesis is unjustified. In particular, choose any sentence ψ that is not provable or disprovable from supplemented Robinson’s Q, and which one has no evidence for or against. Then consider the property G(n) which says that H(n) and ψ. Then there are probability assignments P on which the hypothesis hG is confirmed by the evidence eG relative to background knowledge k. For instance, choose structures Mi for 1 ≤ i ≤ N such that n of these structures model Robinson’s Q and ψ and the other N − n of these structures model Robinson’s Q and ¬ψ for some 1 ≤ n < N. Then define $P(\varphi) = N^{-1} \cdot |\{i \in [1, N] : M_i \models \varphi\}|$, so that P is the “counting measure” on M1, . . . , MN. It is easy to see that P satisfies P1-P3 and hence is a probability assignment. Further, it is easy to see that P(eG & k) = n/N and P(k) = 1. Hence, since hG & k |= eG and 0 < P(eG & k) < P(k), one has that the pseudo-consequent hG is confirmed by the pseudo-antecedent eG relative to background knowledge k. Hence, amplificatory empiricism, or rather a contention similar to it but centered around supplemented Robinson’s Q, would claim that one is justified in inferring from eG to hG, which intuitively seems problematic.
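The counting-measure construction in this endnote can be mimicked in a few lines of Python. The sketch below is purely schematic and rests on labelled assumptions: each "structure" is a dictionary recording which of the sentences k, ψ, eG and hG it satisfies, eG is taken to hold exactly where ψ does (as the value P(eG & k) = n/N suggests), and hG is stipulated to hold in one of those structures. The numbers N = 5 and n = 2 and the names `STRUCTURES` and `P` are hypothetical.

```python
from fractions import Fraction

# Hypothetical finite family of N = 5 structures, n = 2 of which satisfy psi.
# Truth values of e_G and h_G are stipulated for illustration, with h_G & k |= e_G.
STRUCTURES = [
    {"k": True, "psi": True,  "e_G": True,  "h_G": True},
    {"k": True, "psi": True,  "e_G": True,  "h_G": False},
    {"k": True, "psi": False, "e_G": False, "h_G": False},
    {"k": True, "psi": False, "e_G": False, "h_G": False},
    {"k": True, "psi": False, "e_G": False, "h_G": False},
]

def P(*sentences):
    """Counting measure: the fraction of structures satisfying all the given sentences."""
    hits = sum(1 for m in STRUCTURES if all(m[name] for name in sentences))
    return Fraction(hits, len(STRUCTURES))

p_e_and_k = P("e_G", "k")                        # n/N
p_k = P("k")                                     # 1
d = P("h_G", "e_G", "k") / p_e_and_k - P("h_G", "k") / p_k
```

Here d is positive, so on this assignment the pseudo-antecedent counts as confirming the pseudo-consequent in the difference sense, as the endnote describes.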

101 Assuming that the hypotheses in inferences (a)-(c) are confirmed by the evidence in these inferences relative to the background knowledge, these two facts follow immediately from Proposition 6 (C2)-(C3) from endnote 93, so that the analogous results will hold if the other measures of degree of confirmation discussed in that endnote are employed.

102 It is also possible to explain why the inference (a) is equally justifiable as the inference (b) when the degree of confirmation in inference (a) is equal to the degree of confirmation in inference (b). For, suppose that degree of confirmation in inference (a) is equal to the degree of confirmation in inference (b). Assuming that the hypotheses in inferences (a)-(b) are confirmed by the evidence in these inferences relative to the background knowledge, it then follows from Proposition 6 (C2) (b) that P(ep & k) = P(eg & k) and then it follows from k & eg |= ep and standard manipulations of P1-P3 that P(k) ≤ P(ep ↔ eg). This suggests that if one were confident in k, then one should be confident that ep and eg have the same truth value. Given this parity between ep and eg, this then suggests that were one to be justified in k, then one would be equally justified in inferring from ep to hg as from eg to h. Hence, this is why, when the degree of confirmation in inference (a) is equal to the degree of confirmation in inference (b), the inference (a) is equally justifiable as the inference (b).

However, it is not obvious that the inference (b) is equally justifiable as the inference (c) when the degree of confirmation in inference (b) is equal to the degree of confirmation in inference (c). For, the plausibility of this depends heavily on the measure of the degree of confirmation that one employs. Suppose that the degree of confirmation in inference (b) is equal to the degree of confirmation in inference (c), and suppose that the hypotheses in inferences (b)-(c) are confirmed by the evidence in these inferences relative to the background knowledge. If the degree of confirmation is measured by the function d(h|e; k) = P(h|e & k) − P(h|k) or s(h|e; k) = P(k) · P(e & k) · d(h|e; k) or $\ell(h \mid e; k) = \ln\left[\frac{P(e \mid h \& k)}{P(e \mid \neg h \& k)}\right]$ from endnote 93, then it follows from Proposition 6 (C3) (c) that P(hp & k) = P(hg & k), so that from hg & k |= hp and standard manipulations of P1-P3 we can conclude that P(k) ≤ P(hp ↔ hg). Thus, as above, if one was confident in k, then one would likewise be confident that hp and hg have the same truth value, and given this parity, it seems that one would be equally justified in inferring to hg from ep as to hp from ep.

However, if the degree of confirmation is measured by the function $r(h \mid e; k) = \ln\left[\frac{P(h \mid e \& k)}{P(h \mid k)}\right]$ from endnote 93, then as noted in Proposition 6 (C3) (b), one has $r(h_g \mid e_p; k) = r(h_p \mid e_p; k) = \ln\left[\frac{P(k)}{P(e_p \& k)}\right]$. Hence, using this measure of the degree of confirmation, we can say nothing about the difference between inference (b) and inference (c), since in each case the degree of confirmation is purely a function of ep and k. Hence, this is why it is not obvious that the inference (b) is equally justifiable as the inference (c) when the degree of confirmation in inference (b) is equal to the degree of confirmation in inference (c): for, while an argument can be made for this if degree of confirmation is measured by the function d, s or ℓ from endnote 93, it is not obvious that an argument can be made for this if the degree of confirmation is measured by the function r from endnote 93.
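The contrast drawn in this endnote between r and the other measures can be seen in a toy case. In the Python sketch below, everything is an illustrative assumption: four equiprobable worlds, and nested hypotheses with hg & k |= hp and hp & k |= ep. Because ln is strictly increasing, the code compares the argument of the logarithm in r instead of r itself.

```python
from fractions import Fraction

WORLDS = frozenset(range(4))   # hypothetical equiprobable worlds

def P(event):
    """Uniform probability of an event (a set of worlds)."""
    return Fraction(len(event), len(WORLDS))

def d(h, e, k):
    """Difference measure d(h|e; k)."""
    return P(h & e & k) / P(e & k) - P(h & k) / P(k)

def r_ratio(h, e, k):
    """The argument of the logarithm in r(h|e; k)."""
    return (P(h & e & k) / P(e & k)) / (P(h & k) / P(k))

k = WORLDS
e_p = frozenset({0, 1, 2})
h_p = frozenset({0, 1})        # h_p & k entails e_p
h_g = frozenset({0})           # h_g & k entails h_p

same_under_r = r_ratio(h_g, e_p, k) == r_ratio(h_p, e_p, k) == P(k) / P(e_p & k)
differs_under_d = d(h_p, e_p, k) > d(h_g, e_p, k)
```

In this example r assigns the two hypotheses the same degree of confirmation even though P(hp & k) and P(hg & k) differ, whereas d separates them, in line with Proposition 6 (C3).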

103 For instance, consider a model N of the Peano axioms. By compactness, there is a model M which satisfies all the same first-order sentences as N (and hence which is likewise a model of the Peano axioms), but in which there is an element c such that $M \models c > s^n(0)$ for all n ∈ ω, where e.g. $s^2(0) = s(s(0))$. Then define the standard natural numbers to be the set $S = \{d \in M : \exists\, n \in \omega \; M \models d = s^n(0)\}$. Then (M, S) will have the three properties (1)-(3) mentioned in the main body of the text, while M itself will satisfy the Peano axioms and hence every instance of mathematical induction.

104 For instance, see Parsons’ uniqueness thesis discussed in endnote 35 of Chapter 1 or the papers of Dean and Halbach-Horsten mentioned in endnote 39 of Chapter 1.


CHAPTER 3

COMPARING PEANO ARITHMETIC, BASIC LAW V, AND HUME’S PRINCIPLE

3.1 Introduction, Definitions, and Overview of Main Results

3.1.1 Introduction

Second-order Peano arithmetic and its subsystems have been studied for many decades by mathematical logicians (cf. [138]), and the resulting theory continues to be the subject of current research and a source of open problems. More recently, philosophers of mathematics have begun to study systems closely related to second-order Peano arithmetic (cf. [15]). One of these systems, namely, Hume’s Principle, constitutes an axiomatization of cardinality which is similar to the notion of cardinality defined in Zermelo-Fraenkel set theory. The contemporary philosophical interest in this principle stems from Crispin Wright’s suggestion that it can serve as the centerpiece of a revitalized version of Frege’s logicism (cf. [158], [161], [102], and Chapter 1). Frege himself focused his logicism around a principle called Basic Law V, which in effect codified an alternative conception of set. While Russell’s paradox shows that Basic Law V is inconsistent with the unrestricted comprehension schema (cf. Proposition 10), this principle has garnered renewed attention due to Ferreira and Wehmeier’s recent proof that it is consistent with the hyperarithmetic comprehension schema ([40], cf. [154, 155] and Remark 59).


The goal of this chapter is to apply methods from the subsystems of second-order Peano arithmetic to the subsystems of Basic Law V and Hume’s Principle. In particular, we use methods from hyperarithmetic theory to build models of subsystems of Basic Law V (§ 3.3), and we use recursively saturated models and ideas from the model theory of fields to build models of subsystems of Hume’s Principle and Basic Law V (§ 3.4). Our primary application of these new constructions is to compare the interpretability strength of the subsystems of second-order Peano arithmetic to the subsystems of Basic Law V and Hume’s Principle. For, one of the few known ways to show that one theory is of strictly greater interpretability strength than another theory is to show that the first proves the consistency of the second (cf. Proposition 13). Hence, by formalizing our constructions, we can compare the interpretability strength of subsystems of Hume’s Principle and Basic Law V to subsystems of Peano arithmetic. Our main results about interpretability are summarized in § 3.1.5 and in Figure 3.2. Prior to summarizing these results, we first present formal definitions of the theories and subsystems of Hume’s Principle and Basic Law V (§§ 3.1.2-3.1.4) and then describe what is and is not known about the provability relation among these subsystems (§ 3.1.4 and Figure 3.1).

3.1.2 Definition of Signatures and Theories of PA2, BL2 and HP2

The signature of PA2 is a many-sorted signature, with sorts for numbers as well as a sort for sets of numbers. The theory PA2 is a natural set of axioms for the following many-sorted structure in this signature:

(ω, 0, s,+,×,≤, P (ω)) (3.1)


This structure satisfies the eight axioms of Robinson’s Q

(Q1) s(x) ≠ 0

(Q2) s(x) = s(y) → x = y

(Q3) x ≠ 0 → ∃ w x = s(w)

(Q4) x + 0 = x

(Q5) x + s(y) = s(x + y)

(Q6) x · 0 = 0

(Q7) x · s(y) = x · y + x

(Q8) x ≤ y ↔ ∃ z x + z = y

and the mathematical induction axiom

∀ F [F(0) & ∀ n (F(n) → F(s(n)))] → [∀ n F(n)] (3.2)

and each instance of the comprehension schema (where F is not free in ϕ)

∃ F ∀ n [F(n) ↔ ϕ(n)] (3.3)

Here, the formula ϕ is allowed to contain free object variables (in addition to n) and free set variables (with the exception of F). Hence, what an instance of this comprehension schema says is that if ϕ(n) is a formula with parameters, then there is a set F corresponding to it. This all in place, we are now in a position to define:

Definition 7. The theory PA2 or CA2 or second-order Peano arithmetic consists of Q1-Q8, the mathematical induction axiom (3.2), and each instance of the comprehension schema (3.3) (cf. [138] p. 4).

The name CA2 is also given to PA2 because it reminds us of comprehension.
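One way to read axioms Q4-Q8 is as recursive clauses that fix addition, multiplication, and the ordering in terms of zero and successor. The Python sketch below takes that reading literally on the standard natural numbers; the function names are hypothetical, and the comparison with Python's built-in arithmetic over a finite range is only a sanity check, not a proof that the structure (3.1) satisfies the axioms.

```python
def s(x):
    """Successor."""
    return x + 1

def add(x, y):
    # Q4: x + 0 = x;  Q5: x + s(y) = s(x + y)
    return x if y == 0 else s(add(x, y - 1))

def mul(x, y):
    # Q6: x * 0 = 0;  Q7: x * s(y) = x * y + x
    return 0 if y == 0 else add(mul(x, y - 1), x)

def leq(x, y):
    # Q8: x <= y iff some z satisfies x + z = y; on the naturals any such z is at most y
    return any(add(x, z) == y for z in range(y + 1))

def agrees_with_builtin(bound):
    """Compare the recursively defined operations with Python's own below `bound`."""
    return all(
        add(x, y) == x + y and mul(x, y) == x * y and leq(x, y) == (x <= y)
        for x in range(bound)
        for y in range(bound)
    )
```

For instance, `agrees_with_builtin(15)` checks all pairs of numbers below 15.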

The signature of HP2 and BL2 is likewise a many-sorted signature, with sorts for objects as well as sorts for n-ary relations on objects, and with an additional function symbol from the unary relation sort to the object sort. The unary relations are written as A, B, C, F, G, H, X, Y, Z and will be called sets, and the n-ary relation symbols for n > 1 are written as f, g, h, P, Q, R, S and will be called relations. Occasionally when we want to say something about both sets and relations, we will talk about all n-ary relations for n ≥ 1. The additional function symbol is denoted by # in the case of HP2 and by ∂ in the case of BL2. So the signatures of HP2 and BL2 are exactly the same: it is merely for the sake of convenience and clarity that we use # in the context of HP2 and ∂ in the context of BL2. Hence, structures in this signature have the form

(M,S1, S2, . . . ,#) (3.4)

where M is a set, Sn ⊆ P(Mn) and # : S1 → M. Note that the function # only goes from S1 to M, so that the relations from Sn for n > 1 are not in the domain of this function.

It is worth pausing for a moment to dwell on a technical point. Formally, the signature of PA2 also contains a binary relation symbol E which holds between an object and a set and which, in the standard model from (3.1), is interpreted by the ∈ relation from the ambient set-theory. In structures where this holds, let us say that the symbol E is interpreted absolutely. It is easy to see that every structure in the signature of PA2 is isomorphic to a structure that interprets this symbol absolutely, and it is for this reason that this symbol is typically suppressed when describing structures. Likewise, formally the signature of HP2 and BL2 contains (n + 1)-ary relation symbols En, which hold between n-tuples of objects and n-ary relations. Further, there is an obvious generalization of the notion of absoluteness for structures in this signature, such that the structure from (3.4) interprets En absolutely, and such that every structure in this signature is isomorphic to a structure which interprets En absolutely. Hence, as in the case of second-order Peano arithmetic, in what follows, these symbols will be suppressed when describing structures, and it will be assumed that every structure in this signature has the form of (3.4).

Hume’s Principle and Basic Law V can now be defined. Hume’s Principle is the following axiom in the signature of structure (3.4):

#X = #Y ⇐⇒ ∃ bijection f : X → Y (3.5)

Here, the notion of bijectivity is defined in terms of functionality, injectivity, and surjectivity in the obvious manner. The axiom Basic Law V is the following sentence in this signature:

∂X = ∂Y ⇐⇒ X = Y (3.6)

Here, two sets are said to be equal if they are coextensive; formally, the equality of coextensive sets can be taken to be an axiom of all the theories considered in this chapter. The important thing to note here is that (M, S1, S2, . . . , ∂) is a model of Basic Law V if and only if the function ∂ : S1 → M is an injection. That is, Basic Law V mandates that a very simple relation holds between S1 and M. There is no analogue of this in the case of Hume’s Principle, since the right-hand side of (3.5) contains a higher-order quantifier.
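The observation that a structure models Basic Law V exactly when ∂ : S1 → M is an injection can be explored by brute force on small finite examples. The Python sketch below is an illustration under stated assumptions: M is a three-element set, S1 is either the full power set of M or a hand-picked family of three subsets, the relation sorts Sn for n > 1 are ignored because Basic Law V does not mention them, and all function names are hypothetical.

```python
from itertools import combinations, product

def powerset(M):
    """All subsets of M, as frozensets."""
    items = sorted(M)
    return [frozenset(c) for n in range(len(items) + 1) for c in combinations(items, n)]

def satisfies_blv(S1, ext):
    """Basic Law V for the map ext : S1 -> M, i.e. ext[X] == ext[Y] iff X == Y."""
    return all((ext[X] == ext[Y]) == (X == Y) for X in S1 for Y in S1)

def blv_assignments(M, S1):
    """All maps ext : S1 -> M that satisfy Basic Law V."""
    M = sorted(M)
    maps = (dict(zip(S1, values)) for values in product(M, repeat=len(S1)))
    return [ext for ext in maps if satisfies_blv(S1, ext)]

M = {0, 1, 2}
full = powerset(M)                                              # 8 sets, only 3 objects
small = [frozenset(), frozenset({0}), frozenset({0, 1, 2})]     # 3 sets, 3 objects

on_full = blv_assignments(M, full)
on_small = blv_assignments(M, small)
```

With the full power set there are more sets than objects, so no map can be injective and `on_full` is empty; with the smaller family, the maps in `on_small` are exactly the injections from the three chosen sets into M.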

Nevertheless, there are many natural models of Hume’s Principle, and examining these models is the easiest way to define the theories HP2 and BL2. In particular, if α is an ordinal which is not a cardinal, and if # is interpreted as cardinality, then the following structure is a model of Hume’s Principle:

(α, P (α), P (α2), . . . ,#) (3.7)

Restricting attention to ordinals α that are not cardinals serves the purpose of ensuring that #(α) < α, so that dom(#α) is P(α) and so that rng(#α) is a subset of α. For all n-ary relation variables R and all n ≥ 1, this structure also satisfies each instance of the following comprehension schema (where R does not occur free in ϕ(z))

∃ R ∀ n [n ∈ R↔ ϕ(n)] (3.8)

This comprehension schema is simply the generalization of the comprehension schema from PA2, namely (3.3), to the n-ary relations for all n ≥ 1. Here, as with (3.3), the formula ϕ is allowed to include free object variables (in addition to n) and free relation variables of any arity m ≥ 1 (with the exception of R). Hence, we can now define the following theories:

Definition 8. The theory HP2 is the theory that is given by Hume’s Principle (3.5) and the comprehension schema (3.8).

Definition 9. The theory BL2 is the theory which is given by Basic Law V (3.6) and the comprehension schema (3.8).


The primary focus of this chapter is on subsystems of HP2 and BL2 that are generated by restrictions on the complexity of the formulas appearing in the comprehension schema (3.8). This is due to the fact that we seek to compare the interpretability strength of these subsystems to those of second-order Peano arithmetic. However, unlike in the case of PA2 and HP2, attention must be restricted to these subsystems in the case of BL2. For, it is not difficult to see that Russell’s paradox shows that BL2 is inconsistent:

Proposition 10. BL2 is inconsistent.

Proof. By applying the comprehension schema (3.8) to the formula

ϕ(x) ≡ ∃ Y ∂ Y = x & x ∉ Y (3.9)

it follows that BL2 proves that there is a set X that satisfies

∀ x (x ∈ X)⇐⇒ (∃ Y ∂ Y = x & x ∉ Y ) (3.10)

There are then two cases: either ∂(X) ∈ X or ∂(X) ∉ X. Case one: suppose that ∂(X) ∈ X. Then by the left-to-right direction of equation (3.10), it follows that there is Y such that ∂(Y ) = ∂(X) and ∂(X) ∉ Y . But ∂(Y ) = ∂(X) and Basic Law V imply that Y = X, so that ∂(X) ∉ X, which contradicts our case assumption. Case two: suppose that ∂(X) ∉ X. Then by the right-to-left direction of equation (3.10), it follows that for any Y we have that ∂(Y ) = ∂(X) implies ∂(X) ∈ Y . But then ∂(X) = ∂(X) implies ∂(X) ∈ X, which contradicts our case assumption.

Hence BL2 is inconsistent and does not have any models, unlike the theories PA2

and HP2, which respectively have the canonical models (3.1) and (3.7).
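The argument of Proposition 10 can be checked by brute force in a finite setting. The following Python sketch is an illustration only and is not part of the formal development: it assumes a small finite domain, represents a candidate extension operator ∂ as a dictionary (called ext here, a hypothetical name) from subsets of the domain to elements, forms the Russell set of (3.10), and searches for a second set with the same extension, which witnesses that ∂ is not injective.

```python
from itertools import combinations, product


def powerset(domain):
    """All subsets of `domain`, as frozensets, ordered by size."""
    items = sorted(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def russell_set(domain, ext):
    """The set X of (3.10): all x such that ext[Y] == x and x not in Y for some Y."""
    return frozenset(x for x in domain
                     if any(ext[Y] == x and x not in Y for Y in ext))


def collision(domain, ext):
    """Return (X, Y) with X != Y and ext[X] == ext[Y], where X is the Russell set."""
    X = russell_set(domain, ext)
    for Y in ext:
        if Y != X and ext[Y] == ext[X]:
            return X, Y
    return None  # by the reasoning of Proposition 10 this is never reached


def every_assignment_collides(domain):
    """Check every map from the power set of `domain` into `domain`."""
    subsets = powerset(domain)
    return all(collision(domain, dict(zip(subsets, values))) is not None
               for values in product(sorted(domain), repeat=len(subsets)))
```

On a three-element domain this examines all 3^8 candidate operators. The finite check is of course no substitute for the proof, which applies to arbitrary structures satisfying the comprehension schema (3.8).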

3.1.3 Definition of Subsystems of PA2, BL2 and HP2

So if one wants to study Basic Law V, one needs to pass to subsystems of

Basic Law V that do not allow instances of the comprehension schema (3.8) applied

to formulas like the one in (3.9). To this end, let us introduce the following natural

hierarchy of formulas in the signature of BL2 and HP2. A formula ϕ, perhaps with

free object variables z and free relation variables R of different arities m ≥ 1, is

called arithmetical or Π10 or Σ10 if it does not contain any bound m-ary relation

variables for any m ≥ 1. Further, if m ≥ 1 and R is an m-ary relation variable

and ϕ(R) is a Σ1n-formula, then ∃ R ϕ(R) is a Σ1n-formula and ∀ R ϕ(R) is a

Π1n+1-formula. Likewise, if m ≥ 1 and R is an m-ary relation variable and ϕ(R)

is a Π1n-formula, then ∃ R ϕ(R) is a Σ1n+1-formula and ∀ R ϕ(R) is a Π1n-formula.

That is, in this hierarchy of formulas, one is allowed to accumulate arbitrarily

many existential relation quantifiers of different arities m ≥ 1 in front of a Σ1n-

formula and still remain Σ1n, and likewise one is allowed to accumulate arbitrarily

many universal relation quantifiers of different arities m ≥ 1 in front of a Π1n-

formula and still remain Π1n. It is only the change from a universal relation

quantifier of some arity m ≥ 1 to an existential relation quantifier of some arity

m ≥ 1 (or vice-versa) which increases the complexity of the formula in this

hierarchy. For instance, if X is a set variable and R and S are binary relation

variables, then the following formulas are respectively Σ11, Π11, Σ12, Π12:

∃ X ∀ x R(x,#X) (3.11)

∀ R ∀ X ∃ y [∀ x R(x, y)→ y = ∂X] (3.12)

∃ X ∀ R [∃ x R(x, x)→ R(#X,#X)] (3.13)

∀ R ∃ X ∃ S ∀ y [(∀ x x ∈ X ↔ ¬Sxy)→ R(∂X, y)] (3.14)
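The counting of quantifier blocks in this hierarchy can be sketched in Python. The sketch is only illustrative and makes simplifying assumptions: a formula is represented solely by its prefix of relation quantifiers, written as a string of "E" (existential) and "A" (universal), and the matrix is assumed to be arithmetical. The function name classify is hypothetical, and it returns the least level that the stated rules assign to such a prefix.

```python
def classify(prefix):
    """Classify a prefix of relation quantifiers over an arithmetical matrix.

    `prefix` is a string over {"E", "A"} listing the relation quantifiers in
    order. Returns ("arithmetical", 0) for the empty prefix, and otherwise
    ("Sigma", n) or ("Pi", n), where n is the number of quantifier blocks.
    """
    if any(q not in ("E", "A") for q in prefix):
        raise ValueError("prefix may contain only 'E' and 'A'")
    if not prefix:
        return ("arithmetical", 0)
    # Repeating the same kind of quantifier does not raise the level;
    # each change between "E" and "A" starts a new block.
    blocks = 1 + sum(1 for a, b in zip(prefix, prefix[1:]) if a != b)
    return ("Sigma" if prefix[0] == "E" else "Pi", blocks)


# Relation-quantifier prefixes of the formulas (3.11)-(3.14).
EXAMPLES = {"3.11": "E", "3.12": "AA", "3.13": "EA", "3.14": "AEE"}
LEVELS = {label: classify(p) for label, p in EXAMPLES.items()}
```

Under these assumptions the four prefixes receive the levels stated in the text, since for instance the prefix ∀ R ∃ X ∃ S of (3.14) consists of two blocks and begins with a universal quantifier.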

Finally, it is worth explicitly noting that not all formulas are included in our

hierarchy of formulas. For instance, we have said nothing about the complexity of

formulas which include alternations of object quantifiers and set quantifiers, such

as the following formula:

∀ X ∃ y ∀ Z [R(#X,#Z)→ R(y,#Z)] (3.15)

However, this is not a serious omission, since so long as one includes enough of the

comprehension schema (3.8) to guarantee the existence of the singleton set {n}

for each element n, the above formula is equivalent to the following Π13-formula

∀ X ∃ Y ∀ Z [∃ y ∈ Y & ∀ z ∈ Y z = y] & [R(#X,#Z)→ R(y,#Z)] (3.16)

That is, we can correct for this omission by treating object quantifiers as set quan-

tifiers over singleton sets when they occur in alternation of object quantifiers and

set quantifiers.

Using this hierarchy of formulas, one can define the subsystems of BL2 and

HP2 by restricting the complexity of formulas which appear in the comprehension

schema (3.8). For the following definition, let us recall that CA2 is another name

for PA2 (cf. Definition 7). The idea behind the following definition is then that AC

reminds us of the axiom of choice and is the result of inverting the letters in CA,

which reminds us of comprehension. So with the exception of the choice schema,

each of the schemas which figure in the below definition asserts the existence of a

certain class of definable sets and relations:

Definition 11. Suppose that XY2 is one of CA2, BL2, or HP2. Then we can define

the following four subsystems of XY2:

(i) The subsystem AXY0 is XY2 but with the comprehension scheme (3.8) restricted

to arithmetical formulas.

(ii) The subsystem ∆11 − XY0 is XY2 but with the comprehension scheme (3.8) re-

placed by the following schema, which is called the ∆11-comprehension schema or

hyperarithmetic comprehension schema, wherein ϕ is a Σ11-formula and ψ is a

Π11-formula:

[∀ n ϕ(n)↔ ψ(n)]→ [∃ R ∀ n n ∈ R↔ ϕ(n)] (3.17)

(iii) The subsystem Σ11 − YX0 is AXY0 and the following schema, which is called the

Σ11-choice schema, wherein ϕ is a Σ11-formula:

[∀ n ∃ P ϕ(n, P )]→ [∃ R ∀ n ∀ P (∀ m (m ∈ P ↔ nm ∈ R))→ ϕ(n, P )] (3.18)

(iv) The subsystem Π1n − XY0 is XY2 but with the comprehension schema (3.8)

restricted to Π1n-formulas.

Further, in all these schemata, ϕ and ψ are allowed to contain free object vari-

ables (in addition to n) and free relation variables of any arity m ≥ 1 (with the

exception of R).

The intuition behind the choice schema (3.18) can be made clearer as follows.

Suppose that a structure (M,S1, S2, . . . ,#) is a model of Σ11 − PH0 and that the

antecedent of a given instance of the Σ11-choice schema (3.18) holds. Then Σ11 − PH0

asserts the existence of a relation R, which for the sake of simplicity we can

assume to be a binary relation. For each object n in M , the following set is

then guaranteed to exist in S1 by the arithmetic comprehension schema (which is

included in Σ11 − PH0):

Rn = {m : Rnm} (3.19)

So it follows that (M,S1, S2, . . . ,#) |= ϕ(n,Rn) for every n in M . Hence, in the

situation where for every n there is a choice of P such that ϕ(n, P ), the Σ11-choice

schema asserts that there is a uniform way to make these choices, in that there is

an R such that its columns Rn satisfy ϕ(n,Rn) for each n.
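The coding of a family of sets by the columns of a single binary relation can be illustrated by the following Python sketch. It is a toy example on a finite domain, where choice is of course trivial, and all names in it (column, uniform_choice, phi, WITNESSES) are hypothetical; its only purpose is to show how one relation R packages one witness Rn for each n.

```python
def column(R, n):
    """The column R_n = {m : (n, m) in R} of a binary relation R, as in (3.19)."""
    return frozenset(m for (k, m) in R if k == n)


def uniform_choice(domain, witnesses):
    """Glue one chosen witness per n into a relation R with R_n == witnesses[n]."""
    return {(n, m) for n in domain for m in witnesses[n]}


def phi(n, P):
    """Toy condition standing in for phi(n, P): P has exactly n elements, all below n."""
    return len(P) == n and all(m < n for m in P)


DOMAIN = range(4)
# For each n, some P with phi(n, P); this plays the role of the antecedent of (3.18).
WITNESSES = {n: frozenset(range(n)) for n in DOMAIN}
R = uniform_choice(DOMAIN, WITNESSES)
```

Here each column of R recovers the chosen witness, so that phi(n, column(R, n)) holds for every n in the domain, which is the shape of the consequent of (3.18).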

Note, however, that the map (R, n) ↦ #(Rn) is not a function symbol in the

signature of HP2 or BL2. For instance, given a binary relation R, the comprehension

schema (3.8) restricted to arithmetical formulas does not in general guarantee the

existence of the binary relation

{(n,m) : #(Rn) = m} = {(n,m) : ∃ X (∀ x x ∈ X ↔ Rnx) & #X = m}

                    = {(n,m) : ∀ X (∀ x x ∈ X ↔ Rnx) → #X = m}   (3.20)

For, as these definitions make evident, one will in general need the hyperarithmetic

comprehension schema (3.17) in order to show that this relation exists (cf. Propo-

sitions 55-56). This example underscores an important fact: intuitively simple

relations expressible via the maps # or ∂ may be quite complex when explicitly

written out in terms of the primitives of the signature. Since our interest in this

chapter is on restrictions of the comprehension schema, this fact will be particu-

larly important to keep in mind throughout this chapter. (In § 3.5, we raise the

question of what happens when one does include function symbols (R, n) ↦ #(Rn)

in the signature, so that relations like the one defined in equation (3.20) would

count as arithmetical.)

[Diagram: provability relations among Π11 − CA0, Σ11 − AC0, ∆11 − CA0, ACA0 (subsystems of PA2); Σ11 − LB0, ∆11 − BL0, ABL0 (subsystems of BL2); and Π11 − HP0, Σ11 − PH0, ∆11 − HP0, AHP0 (subsystems of HP2).]

Figure 3.1. Provability Relation in Subsystems of BL2, PA2, and HP2

3.1.4 Summary of Results about the Provability Relation

Our primary concern in this chapter is with the interpretability relation be-

tween subsystems of PA2, HP2, and BL2, and we summarize our results in the next

section (§ 3.1.5). However, since provability implies interpretability, and since the

provability relation is intrinsically interesting, in this section we record what is

known about this relation among the subsystems of PA2, HP2, and BL2. This is

summarized in Figure 3.1, where the double arrows indicate that the provability

implication is irreversible, and where the negated arrows indicate that the prov-

ability implication fails, and where the arrows with question marks beside them

indicate that the provability implication is unknown.

Each of the positive provability relations in Figure 3.1 follows immediately from the definitions, except for the fact that Π11 − CA0 proves Σ11 − AC0 and the fact that Σ11-choice implies ∆11-comprehension. For the former, see Simpson [138] Theorem V.8.3 pp. 205-206. For the latter, the proof from Simpson [138] Theorem VII.6.6 (i) p. 295 carries over to the setting of HP2 and BL2, as we verify now:

Proposition 12. Σ11 − AC0 → ∆11 − CA0, and Σ11 − PH0 → ∆11 − HP0, and Σ11 − LB0 →

∆11 − BL0

Proof. Let ℳ = (M, S, . . .) be a model of Σ11 − AC0 (resp. Σ11 − PH0, Σ11 − LB0). By standard conventions, the structure ℳ is non-empty. However, nothing in these conventions requires that the first-order domain M be non-empty, as opposed to, say, S. But in the case of Σ11 − AC0 we have that 0 ∈ M , and in the case of Σ11 − PH0 we have that #∅ ∈ M , and likewise in the case of Σ11 − LB0 we have that ∂∅ ∈ M . Hence, for the remainder of the proof, fix a parameter a ∈ M . Suppose that ℳ |= ∀ z ϕ(z) ↔ ψ(z), where ϕ is Σ11 and ψ is Π11. Then ℳ |= ∀ z ϕ(z) ∨ ¬ψ(z). Then by the arithmetical comprehension schema, ℳ |= ∀ z ∃ Z (ϕ(z) ∧ a ∈ Z) ∨ (¬ψ(z) ∧ a ∉ Z). By the Σ11-choice schema, there is R such that

ℳ |= ∀ z ∀ Z (∀ x x ∈ Z ↔ Rzx) → [(ϕ(z) ∧ a ∈ Z) ∨ (¬ψ(z) ∧ a ∉ Z)] (3.21)

By the arithmetical comprehension schema, there is W such that z ∈ W if and only if Rza. Then we claim that z ∈ W if and only if ϕ(z). For, suppose that z ∈ W , so that Rza. Then Z = {x : Rzx} exists by the arithmetical comprehension schema, and we have a ∈ Z. Then by (3.21), it follows that ϕ(z). Conversely, suppose that z ∉ W , so that ¬Rza. Then Z = {x : Rzx} exists by the arithmetical comprehension schema, and we have a ∉ Z. Then by (3.21), it follows that ¬ψ(z) and hence ¬ϕ(z). Hence, in fact we have established that z ∈ W if and only if ϕ(z). So ℳ models ∆11 − CA0 (resp. ∆11 − HP0, ∆11 − BL0).
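The coding device in this proof, namely recording the truth value of ϕ(z) as the membership of a fixed parameter a in the z-th column of a relation R, can be illustrated on a finite domain. The following Python sketch is only a toy model of the argument under that assumption: the predicates phi and psi are ordinary Python functions standing in for the Σ11- and Π11-formulas, and the function name delta_comprehension is hypothetical.

```python
def delta_comprehension(domain, a, phi, psi):
    """Toy version of the proof of Proposition 12 on a finite domain.

    `phi` and `psi` are predicates assumed to agree on `domain`; `a` is a fixed
    parameter. Returns (R, W), where the z-th column of R contains `a` exactly
    when phi(z) holds, and W = {z : R(z, a)}.
    """
    domain = list(domain)
    if any(phi(z) != psi(z) for z in domain):
        raise ValueError("phi and psi must agree on the domain")
    # Choice step: for each z pick Z_z with (phi(z) and a in Z_z)
    # or (not psi(z) and a not in Z_z).
    chosen = {z: (frozenset({a}) if phi(z) else frozenset()) for z in domain}
    # Glue the choices into one relation R whose z-th column is Z_z.
    R = {(z, x) for z in domain for x in chosen[z]}
    # Comprehension step: W is defined from R and a without further quantifiers.
    W = frozenset(z for z in domain if (z, a) in R)
    return R, W


R_TOY, W_TOY = delta_comprehension(
    range(6), 0, lambda z: z % 2 == 0, lambda z: z % 2 != 1)
```

In this toy instance W_TOY is the set of even numbers below six, so that membership in W coincides with phi, as in the final step of the proof.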

The known non-provability relations in Figure 3.1 are not difficult to verify.

In the case of the subsystems of HP2, we can read these results off of the results

for the subsystems of PA2, as the proof of Proposition 53 indicates. In the case

of the subsystems of BL2, the only known result we have is that ABL0 does not

prove ∆11 − BL0, and this is shown in Proposition 51. In § 3.5, we list the remaining

unknown questions about the provability relation, namely, the question of whether

∆11 − BL0 implies Σ11 − LB0 and whether Π11 − HP0 implies Σ11 − PH0.

3.1.5 Summary of Results about the Interpretability Relation

Most of the formal work done on the subsystems of PA2, HP2, BL2 has concerned

the interpretability strength of these theories. A theory T0 is interpretable

in a theory T1 (T0 ≤I T1) if every model M1 of T1 uniformly defines without

parameters some model M0 of T0, where “uniform” has the sense that e.g. a

binary relation symbol R in the signature of T0 is defined by one and the same

formula ϕ(x, y) in each model M1 of T1. (For a more syntactic definition, see

Lindström [99] p. 96 or Hájek and Pudlák [59] pp. 148-149). Since this relation is

reflexive and transitive, one can define the associated notions

T0 ≡I T1 ⇐⇒ T0 ≤I T1 & T1 ≤I T0 (3.22)

T0 <I T1 ⇐⇒ T0 ≤I T1 & T1 ≰I T0 (3.23)
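The passage from a reflexive and transitive relation to the associated equivalence and strict order, as in (3.22) and (3.23), can be sketched in Python. The sketch is illustrative only: the theory labels T0, T1, T2 and the list of known interpretations are hypothetical data, and the sketch treats any pair that does not follow from the listed ones as a failure of interpretability, which is a modelling assumption rather than a claim about actual theories.

```python
def closure(pairs, theories):
    """Reflexive and transitive closure of `pairs` on `theories`."""
    leq = set(pairs) | {(t, t) for t in theories}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(leq):
            for (c, d) in list(leq):
                if b == c and (a, d) not in leq:
                    leq.add((a, d))
                    changed = True
    return leq


def equivalent(leq, s, t):
    """Mutual interpretability, as in (3.22)."""
    return (s, t) in leq and (t, s) in leq


def strictly_below(leq, s, t):
    """Strict interpretability, as in (3.23)."""
    return (s, t) in leq and (t, s) not in leq


THEORIES = ["T0", "T1", "T2"]
# Hypothetical data: (s, t) means that s is interpretable in t.
KNOWN = {("T0", "T1"), ("T1", "T0"), ("T1", "T2")}
LEQ = closure(KNOWN, THEORIES)
```

With this data T0 and T1 fall into the same equivalence class, while T0 lies strictly below T2 by transitivity.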

The relation ≤I is then a partial order on the set of equivalence classes of theories

under the equivalence relation ≡I. Since this partial order is in fact a linear order

in many natural cases, it can be intuitively conceived as a measure of the strength

of the theory. This order is also connected to the formal notion of consistency

strength by the following proposition:

Proposition 13. Suppose T1 is a finitely axiomatizable theory such that ACA0 ⊆

T1 ⊆ PA2, and suppose that T0 is a computable theory in a computable signature.

Then

T1 ⊢ Con(T0) =⇒ T1 ≰I T0 (3.24)

[T0 ≤I T1 & T1 ⊢ Con(T0)] =⇒ T0 <I T1 (3.25)

Proof. (Sketch) For (3.24), note that if T1 ⊢ Con(T0), then T1 proves that there is

a model M0 of T0 (cf. Simpson [138] Theorem IV.3.3 p. 140). But if T1 ≤I T0 and

T1 is finitely axiomatizable, then this interpretation is due to a finite number of the

axioms of T0. Further, since T0 is computable, this can be accurately represented in

T1, so that inside T1 the model M0 of T0 defines a model M1 of T1, which likewise

exists since the theory inside which we are working (namely T1 itself) includes

arithmetical comprehension. But then T1 would prove Con(T1), which contradicts

Gödel’s Second Incompleteness Theorem. (For a formal proof, see Lindström [99]

Chapter 7 Corollary 1 p. 97). Note that (3.25) follows immediately from (3.24)

and definition (3.23).

In what follows, we will apply this proposition to T1 = ACA0 itself or T1 =

Π11 − CA0, both of which are known to be finitely axiomatizable (cf. Simpson [138]

Lemma VIII.1.5 pp. 311-312 and Lemma VI.1.1 pp. 217-218).

The major previous results on the interpretability strength of the subsystems

of PA2, HP2, BL2 can be described as follows. In the 19th Century, Frege in essence

showed that PA2 ≤I HP2 (cf. Frege [44], [11], Boolos and Heck [14]), and recently

Heck ([67] p. 192) and Linnebo ([100] p. 161) noted that Frege’s proofs in fact show

that Π11 − CA0 ≤I Π11 − HP0 (cf. § 3.2.2, Corollary 27). Further, Boolos ([10]) showed

[Diagram: interpretability relations among Π11 − CA0, Π11 − HP0, Σ11 − LB0 + Inf, Σ11 − AC0, Σ11 − LB0, ∆11 − CA0, ∆11 − BL0, ACA0, ABL0, Q, and Σ11 − PH0, with arrows attributed to Frege/Boolos, Heck/Ganea/Visser, Burgess, and Walsh.]

Figure 3.2. Interpretability Relation in Subsystems of BL2, PA2, and HP2

that the converse holds (cf. Corollary 29), so that one has Π11 − CA0 ≡I Π11 − HP0

(cf. Corollary 30). Heck ([65]) then showed that ABL0 interprets Robinson’s Q, and

Ganea and Visser ([50], [149]) independently showed that the converse holds, so

that ABL0 ≡I Q. Likewise, Burgess ([15]) showed that AHP0 interprets Robinson’s Q.

Finally, Ferreira and Wehmeier ([40]) showed that ∆11 − BL0 is consistent and a

slight modification of their proof shows that Σ11 − LB0 is consistent, and inspection

of this proof shows that Σ11 − LB0 <I Π11 − CA0. These previous results and our

new results are summarized in Figure 3.2, where the double arrows indicate that

the interpretability relation is irreversible, and where the single arrows indicate that

the interpretability relation may or may not be irreversible. That is, in the diagram

T1 ⇒ T0 means T0 <I T1 and T1 → T0 means T0 ≤I T1.

Our new results establish upper and lower bounds on consistent subsystems of

BL2 and HP2 by (i) finding new constructions of models of these theories, (ii) noting

that the constructions can be formalized in theories such as ACA0 and Π11 − CA0,

and (iii) applying Proposition 13. Our first main new result, Theorem 60, is a

construction of a model M of Σ11 − LB0 using ideas from higher recursion theory

(cf. Sacks [134] Part A). This structure M models a finite extension of Σ11 − LB0

called Σ11 − LB0+Inf which interprets Σ11 − AC0. Moreover, since this construction is

formalizable in Π11 − CA0, we have that Proposition 13 implies that Σ11 − LB0+Inf <I

Π11 − CA0.

Our second set of results concerns new constructions of models of ∆11 − LB0 and

Σ11 − PH0 and ∆11 − HP0 + ¬Σ11 − PH0. These results are all based on a generaliza-

tion of a theorem of Barwise-Schlipf and Ferreira-Wehmeier which allows us to

build models of these theories on top of various recursively saturated structures

(cf. Theorem 70). In particular, we show that if k is a countable recursively

saturated o-minimal expansion of a real-closed field, then there is a function

# : D(k) → k, where D(kn) denotes the definable subsets of kn, such that the

structure

(k,D(k), D(k2), . . . ,#) (3.26)

is a model of Σ11 − PH0. Moreover, we show that this construction can be formalized

in ACA0, so that by Proposition 13, we have Σ11 − PH0 <I ACA0 (cf. Corollary 99).

Further, we show that if k is a countable saturated algebraically closed field, then

there is a function # : D(k) → k, where D(kn) denotes the definable

subsets of kn, such that the structure

(k,D(k), D(k2), . . . ,#) (3.27)

is a model of ∆11 − HP0+¬Σ11 − PH0. Further, we can use this construction to answer

an open question of Linnebo (cf. Remark 81 and Proposition 83). However, we do

not presently know whether this construction can be formalized in ACA0, although

we have reduced it to the question of whether Ax’s Theorem can be formalized in

ACA0 (cf. Remark 78 and Question 111). Finally, we show that if k is a countable

recursively saturated separably closed field of finite imperfection degree, then there

is a function ∂ : D(k)→ k, where D(kn) denotes the definable subsets of kn, such

that the structure

(k,D(k), D(k2), . . . , ∂) (3.28)

is a model of ∆11 − LB0 (cf. Theorem 108). However, we do not presently know

whether this construction can be formalized in ACA0, although we have reduced

this question to the question of whether the proof of the elimination of imagi-

naries for separably closed fields can be formalized in ACA0 (cf. Remark 109 and

Question 112).

3.2 Standard Models of HP2 and Associated Results

Prior to turning to the primary results of this chapter in §§ 3.3-3.4, the re-

lationship between PA2 and HP2 is briefly explored in this section. On the one

hand, in § 3.2.2, a brief self-contained proof of Frege and Boolos’s result that

PA2 and HP2 are mutually interpretable is presented (cf. Corollary 30). Then, in

§ 3.2.1, some of the ways in which the standard models of HP2 are similar to and

different from the standard models of PA2 are examined. The standard model of

PA2 is the structure from equation (3.1), namely, (ω, 0, s,+,×,≤, P (ω)), while the

standard models of HP2 are the structures from equation (3.7), namely, structures

of the form (α, P (α), P (α2), . . . ,#α), where α is an ordinal which is not a cardinal

and where #α : P (α) → α denotes cardinality. In § 3.2.1, it is shown that

these standard models of HP2 depend only on the cardinality of α for α ≥ ω + ω

(Proposition 16 (i)), and further that they can have many automorphisms, unlike

the standard model of PA2 (cf. Proposition 17 (iv)). Finally, it is shown that

there is an analogue of the relative categoricity of PA2 in the setting of HP2 (cf.

Proposition 20 and Remark 21).

3.2.1 Models of HP2 from Infinite Cardinals

Proposition 14. Suppose α, β are ordinals that are not cardinals, and consider

the structures (α, P (α), P (α2), . . . ,#α) and (β, P (β), P (β2), . . . ,#β), where #α :

P (α)→ α and #β : P (β)→ β denote cardinality.

(i) The structures (α, P (α), P (α2), . . . ,#α) and (β, P (β), P (β2), . . . ,#β) model

HP2.

(ii) If α = ω + k + 1 where k ≥ 0, then |α− rng(#α)| = k.

(iii) If α ≥ ω + ω, then |α− rng(#α)| = |α|.

(iv) The structures (α, P (α), P (α2), . . . ,#α) and (β, P (β), P (β2), . . . ,#β) are

isomorphic if and only if α = β or α, β ≥ ω + ω and |α| = |β|.

Proof. For (i), note that restricting attention to ordinals α which are not cardinals

serves the purpose of ensuring that #(α) < α, so that dom(#α) is P (α) and so

that rng(#α) is a subset of α. Further, note that (α, P (α), P (α2), . . . ,#α) satisfies

Hume’s Principle by the definition of cardinality. Further, note that by the Power

Set Axiom and the Separation Axiom, the structure (α, P (α), P (α2),#α) satisfies

the full comprehension schema. Hence, in fact (α, P (α), P (α2),#α) is a model of

HP2.

For (ii), note that α− rng(#α) = {ω + 1, . . . , ω + k}, which has cardinality k.

For (iii), note that since α ≥ ω + ω, we have that α − ω is infinite, and

hence |α| = |α− ω|. Case One: α is a limit ordinal. Then the mapping from

α − ω to α − rng(#α) given by β ↦ β + 1 is an injection. Case Two: α is a

successor ordinal. Then α = γ + n where n > 0 and γ is a limit ordinal. Then

|α| = |α− ω| = |γ − ω|. Then the mapping from γ − ω to α − rng(#α) given by

β ↦ β + 1 is an injection. Hence in both cases we have |α− rng(#α)| = |α|.

For (iv), suppose that the two structures are isomorphic. Then this isomor-

phism induces a bijection from α onto β, and hence α and β have the same

cardinality. Further, suppose for the sake of contradiction that α ≠ β and it is

not the case that α, β ≥ ω + ω. If α < β < ω + ω, then by part (ii) we have that

|α− rng(#α)| < |β − rng(#β)| < ω, and so the two structures are not elementarily

equivalent and hence not isomorphic, which is a contradiction. If α < ω + ω ≤ β,

then by parts (ii) and (iii) we have that |α− rng(#α)| < ω ≤ |β − rng(#β)|, and

so the two structures are not elementarily equivalent and hence not isomorphic,

which is a contradiction. Hence, in fact, we must have that α = β or α, β ≥ ω+ω

and |α| = |β|.

Conversely, suppose that α, β ≥ ω + ω have the same cardinality, so that rng(#α) = rng(#β) by definition, and hence that |α− rng(#α)| = |α| = |β| = |β − rng(#β)| by part (iii). Hence choose a bijection f : α → β such that f is the identity on rng(#α). Extend f to a bijection f̄ : P (α) → P (β) by setting f̄(X) = {f(x) : x ∈ X}. Since f is the identity on rng(#α) and since f is a bijection, we have that

f(#α(X)) = f(|X|) = |X| = |{f(x) : x ∈ X}| = |f̄(X)| = #β(f̄(X)) (3.29)

Hence, f is an isomorphism.

Definition 15. If κ is a cardinal, then define the ordinal

Hκ = ω + κ + 1   if κ < ω,
     ω + ω       if κ = ω,
     κ + 1       if κ > ω,      (3.30)

and define the structure

Hκ = (Hκ, P (Hκ), P (H2κ), . . . ,#κ) (3.31)

where #κ : P (Hκ)→ Hκ denotes cardinality.

Proposition 16.

(i) For every ordinal α that is not a cardinal, there is exactly one cardinal κ such

that the structure Hκ is isomorphic to the structure (α, P (α), P (α2), . . . ,#α),

where #α : P (α)→ α denotes cardinality.

(ii) If κ is a cardinal then |Hκ − rng(#κ)| = κ.

(iii) If κ, λ are cardinals, then Hκ and Hλ are isomorphic if and only if κ = λ.

Proof. For (ii), there are three cases. First, suppose that κ = k < ω. Then Hκ −

rng(#κ) = {ω+1, . . . , ω+k}. Second, suppose that κ = ω. Then Hκ−rng(#κ) =

{ω + n : 0 < n < ω}. Third, suppose that κ > ω. Then by Proposition 14 (iii),

|Hκ − rng(#κ)| = |κ+ 1− rng(#)| = |κ+ 1| = κ.

For (iii), note that the right-to-left direction is trivial. For the left-to-right

direction, suppose for the sake of contradiction that Hκ and Hλ are isomorphic

and that κ ≠ λ. Then without loss of generality, κ < λ. First suppose that

κ < λ < ω. Then part (ii) implies that Hκ and Hλ are not elementarily equivalent,

since Hκ models that there are exactly κ elements not in the range of #, whereas

Hλ models that there are exactly λ elements not in the range of #. Second suppose

that κ < ω ≤ λ. Then likewise the structures Hκ and Hλ are not elementarily

equivalent, since Hκ models that there are exactly κ many elements not in the

range of #, whereas Hλ models that there are at least κ + 1 many elements not

in the range of #. Third, suppose that κ = ω < λ. In fact, this cannot happen,

since the isomorphism from Hκ to Hλ would induce a bijection between the

first-order parts of these structures, which, respectively, have cardinality ω and

λ > ω. Fourth, suppose that ω < κ < λ. Again this cannot happen, since the

isomorphism from Hκ to Hλ would induce a bijection between the first-order

parts of these structures, which, respectively, have cardinality κ and λ > κ.

For (i), note that uniqueness follows from part (iii). For existence, there are

two cases. If α < ω + ω, then α = ω + k + 1 where k ≥ 0. Then of course the

structure (α, P (α), P (α2), . . . ,#α) is identical with the structure Hk. If α ≥ ω+ω,

then by Proposition 14 (iv), we have that (α, P (α), P (α2), . . . ,#α) is isomorphic

to H|α|.

Proposition 17. Suppose that κ is a cardinal.

(i) If β, γ ∈ (Hκ − rng(#κ)) then there is f ∈ Aut(Hκ) such that f(β) = γ.

(ii) If X ⊆ Hκ is ∅-definable in Hκ then X ⊆ rng(#κ) or (Hκ − rng(#κ)) ⊆ X.

(iii) If β ∈ rng(#κ) and f ∈ Aut(Hκ) then f(β) = β.

(iv) Aut(Hκ) and Aut(κ) are isomorphic, where we view κ as a structure in the

empty signature.

Proof. (i) Define f : Hκ → Hκ by setting f(γ) = β, f(β) = γ, and letting f be the identity otherwise, so that f is a bijection of Hκ. Extend f to a mapping f̄ of the structure Hκ to itself by setting f̄(X) = {f(x) : x ∈ X}. Then f̄ is clearly a bijection since f is a bijection. To show that it is an automorphism of the structure Hκ, it suffices to show that f̄(#κX) = #κ f̄(X). But, since f is the identity on rng(#κ), we have that f̄(#κX) = f(#κX) = #κX, and since f is a bijection, we have that f ↾ X : X → f̄(X) is a bijection, and so #κX = #κ f̄(X). Hence, in fact f̄ is an automorphism of Hκ which sends β to γ.

(ii) Suppose that X ⊆ Hκ is ∅-definable in Hκ, but it is not the case that

X ⊆ rng(#κ) or (Hκ − rng(#κ)) ⊆ X. Then there is β ∈ X ∩ (Hκ − rng(#κ))

and γ ∈ (Hκ − rng(#κ)) ∩ (Hκ − X). By part (i), there is f ∈ Aut(Hκ) such

that f(β) = γ. But since X is ∅-definable, we have that β ∈ X if and only if

γ = f(β) ∈ X, which is a contradiction.

(iii) Suppose that β ∈ rng(#κ) and f ∈ Aut(Hκ) and f(β) ≠ β. Since

rng(#κ) is ∅-definable and β ∈ rng(#κ), we have that f(β) ∈ rng(#κ). Case One:

f(β) < β. Note that the relation < on rng(#κ) is ∅-definable, since on rng(#κ)

we have

λ ≤ λ′ ⇐⇒ Hκ |= ∃ X ∃ Y #κ(X) = λ & #κ(Y ) = λ′ & ∃ injective f : X → Y

(3.32)

Then our case assumption f(β) < β implies f(f(β)) < f(β) < β and so we obtain

an infinite decreasing sequence of ordinals, which is a contradiction. Case Two:

β < f(β). Since f ∈ Aut(Hκ) we have that f−1 ∈ Aut(Hκ), and since β < f(β)

we have f−1(β) < β, since again the relation < on rng(#κ) is ∅-definable. Hence,

by iterating f−1(f−1(β)) < f−1(β) < β as before, we again obtain an infinite

decreasing sequence of ordinals, which is a contradiction.

(iv) If X is a set viewed as a structure in the empty signature, then Aut(X) is

just the set of permutations of X, and hence if X and Y have the same cardinality,

then Aut(X) and Aut(Y ) are isomorphic as groups. Hence by Proposition 16 (ii),

we have that Aut(κ) and Aut(Hκ−rng(#)) are isomorphic as groups. So it suffices

to find a group isomorphism F : Aut(Hκ − rng(#))→ Aut(Hκ).

To this end, given a bijection f : Hκ → Hκ, extend f to a mapping f̄ of the structure Hκ to itself by setting f̄(X) = {f(x) : x ∈ X}, so that f̄ is a bijection. Then we claim that

f̄ ∈ Aut(Hκ) ⇐⇒ f ↾ rng(#κ) = id_rng(#κ) (3.33)

The left-to-right direction follows directly from part (iii). For the right-to-left direction, it suffices to show that f̄(#κX) = #κ f̄(X). Since f is the identity on rng(#κ), we have that f̄(#κX) = f(#κX) = #κX, and since f is a bijection, we have that f ↾ X : X → f̄(X) is a bijection, and so #κX = #κ f̄(X). Hence, equation (3.33) does hold, and so we can define F : Aut(Hκ − rng(#κ)) → Aut(Hκ) by setting F (g) = f̄, where f is g on Hκ − rng(#κ) and where f is the identity on rng(#κ). Since F (g1 ◦ g2) = F (g1) ◦ F (g2), we have that F witnesses the group isomorphism between Aut(Hκ − rng(#κ)) and Aut(Hκ).

Remark 18. The proof of the theorem above shows one how to construct many

natural examples of sentences that are independent of HP2. For instance, in equa-

tion (3.32), it was shown how to define the ordering in Hκ. Using this, one can

form a sentence ϕ such that Hκ |= ϕ if and only if κ is an infinite successor car-

dinal, so that Hω2 |= HP2 + ϕ and Hωω |= HP2 + ¬ϕ. This contrasts starkly with

the case of PA2, where there are comparatively few known examples of natural

independent sentences.

Remark 19. The structures Hκ for κ < ω from Definition 15 are on one level very

different: for, they are not elementarily equivalent since Hκ models that there are

exactly κ-many elements that are not in the range of the #-function. However,

on another level, these structures are very similar to each other: for, when κ < ω,

it is easy to see that Hκ is isomorphic to the structure (ω, P (ω), P (ω2), . . . ,#∗κ),

where #∗κ(X) = 0 if X is infinite and where #∗κ(X) = κ + 1 + |X| if X is finite.

Further, when one restricts to the ranges of the #∗κ-functions, the induced struc-

tures (rng(#∗κ), P (ω) ∩ P (rng(#∗κ)), S2 ∩ P (rng(#∗κ)2), . . . ,#∗κ) are all isomorphic

to the structure (ω, P (ω), P (ω2), . . . ,#∗) where #∗(X) = 0 if X is infinite and

where #∗(X) = 1 + |X| if X is finite. As the next theorem indicates, this is

a very general phenomenon among models of HP2: namely, so long as different

#-functions on one and the same underlying set can in some sense see each other,

they yield isomorphic structures when one restricts attention to their ranges.

Proposition 20. Suppose that (M,S1, S2, . . . ,#1,#2) is a structure where Sn ⊆

P (Mn) and where #i : S1 →M . Suppose further that the structures (M,S1, S2, . . . ,#i)

are models of HP2 for i ∈ {1, 2}, and further that the structure (M,S1, S2, . . . ,#1,#2)

satisfies every instance of the comprehension schema (3.8), in the signature that

includes both of the function symbols #1,#2. Finally, for i ∈ {1, 2}, define the

following induced structure:

Ni = (rng(#i), S1 ∩ P (rng(#i)), S2 ∩ P (rng(#i)2), . . . ,#i) (3.34)

Then N1 and N2 are isomorphic models of HP2.

Proof. First we define a bijection Γ : rng#1 → rng#2. If #1X ∈ rng#1 where

X ∈ S1, then we define Γ(#1X) = #2X. Note that Γ : rng#1 → rng#2 is well-defined:

if #1X = #1Y then we need to show that #2X = #2Y . This follows,

since

#1X = #1Y =⇒ [∃ bijection f : X → Y ] =⇒ #2X = #2Y (3.35)

Next, note that Γ : rng#1 → rng#2 is injective:

Γ(#1X) = Γ(#1Y ) =⇒ #2X = #2Y =⇒ [∃ bijection f : X → Y ] =⇒ #1X = #1Y

(3.36)

Finally, note that Γ : rng#1 → rng#2 is surjective: if #2X ∈ rng#2 then by

definition Γ(#1X) = #2X. Hence, in fact Γ : rng#1 → rng#2 is a bijection.

Further, note that the graph of Γ is in S2 since one has the equality

graph(Γ) = {(x, y) ∈M2 : ∃ Z #1(Z) = x & #2(Z) = y} (3.37)

and since it was assumed that the structure (M,S1, S2, . . . ,#1,#2) satisfies every

instance of the comprehension schema (3.8) in the signature that includes both

of the function symbols #1,#2. Now, extend Γ to Γ̄ : N1 → N2 by setting Γ̄(X) =

{Γ(x) : x ∈ X}, which exists in S1 since the graph of Γ is in S2. Then Γ̄ : N1 → N2

is an isomorphism, because

Γ̄(#1X) = Γ(#1X) = #2X = #2{Γ(x) : x ∈ X} = #2Γ̄(X), (3.38)

where the first and second equalities follow respectively from the definitions of Γ̄

and Γ, and where the third equality follows from the fact that Γ : X → {Γ(x) :

x ∈ X} is a bijection whose graph is in S2, and where the last equality follows

from the definition of Γ̄.
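The definition of the graph of Γ in (3.37) can be illustrated with the functions #∗κ from Remark 19, restricted to finite sets. The following Python sketch is a finite toy only: it assumes the two operators #∗1 and #∗3 (written sharp1 and sharp2, hypothetical names), evaluates them on the subsets of a four-element universe, and checks that the resulting set of pairs is the graph of a bijection between the two ranges. A third operator that violates Hume's Principle is included to show how the construction fails without it.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def gamma_graph(sets, sharp_a, sharp_b):
    """The pairs (sharp_a(Z), sharp_b(Z)) for Z in `sets`, as in (3.37)."""
    return {(sharp_a(Z), sharp_b(Z)) for Z in sets}


def is_bijection(graph):
    """True if `graph` is single-valued and injective."""
    xs = {x for x, _ in graph}
    ys = {y for _, y in graph}
    return len(xs) == len(graph) == len(ys)


def sharp1(Z):
    """#*_1 on finite sets: 1 + 1 + |Z|."""
    return 2 + len(Z)


def sharp2(Z):
    """#*_3 on finite sets: 3 + 1 + |Z|."""
    return 4 + len(Z)


def sharp_bad(Z):
    """Violates Hume's Principle: sets of size 2, 3, 4 all receive the value 2."""
    return min(len(Z), 2)


SETS = subsets(range(4))
GRAPH = gamma_graph(SETS, sharp1, sharp2)
BAD_GRAPH = gamma_graph(SETS, sharp_bad, sharp1)
```

In this toy case GRAPH pairs 2 + n with 4 + n for n from 0 to 4, whereas BAD_GRAPH is not single-valued, which mirrors the role that Hume's Principle plays in (3.35) and (3.36).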

Remark 21. The previous proposition can be thought of as an analogue of the rel-

ative categoricity results for models of PA2. In the 19th Century, Dedekind showed

that any two models (M,+,×, P (M), P (M2), . . .) and (N,⊕,⊗, P (N), P (N2), . . .)

of PA2 are isomorphic ([22] § 132, cf. Shapiro [135] Theorem 4.8 p. 82). How-

ever, it is not difficult to see that Dedekind’s result can be relativized, in the

following way: if (M,+,×,⊕,⊗, S1, S2, . . .) is a structure where Sn ⊆ P (Mn)

such that (M,+,×, S1, S2, . . .) and (M,⊕,⊗, S1, S2, . . .) are models of PA2 and

such that (M,+,×,⊕,⊗, S1, S2, . . .) satisfies every instance of the comprehen-

sion schema (3.8) in the signature of +,×,⊕,⊗, then (M,+,×, S1, S2, . . .) and

(M,⊕,⊗, S1, S2, . . .) are isomorphic (cf. Parsons [120] § 49 pp. 279 ff). The pre-

vious proposition is simply the analogue of this phenomenon in the setting of

HP2.

3.2.2 The Mutual Interpretability of PA2 and HP2

The goal of this section is to present a brief and self-contained proof of the

result that PA2 is mutually interpretable with HP2 (Corollary 30). One half of this

result, namely, the interpretability of HP2 in PA2, is due to Boolos (Theorem 29).

The other half of the result, namely, the interpretability of PA2 in HP2, is now

called Frege’s Theorem (Corollary 27). The proof of Frege’s Theorem

can be broken down into two steps: first, the proof that PA2 is interpretable

in the theory consisting of (Q1)-(Q2) and the comprehension schema (3.3) (cf.

Theorem 22), and second the argument that this theory is interpretable in HP2 (cf.

Theorem 26). Elements of the first step can be found in Dedekind (cf. [22] § 72),

and elements of this second step can be traced back to Frege (cf. Boolos and Heck

[14]). However, the modern presentation stems from Wright [158] pp. 154-169 (cf.

also Boolos [12]). The warrant for including a proof of this result here is two-fold:

(i) the proof presented here is much briefer than other published presentations, and

(ii) the proof presented here is slightly different from other published presentations

in that it is centered around the notion of Dedekind-finiteness, defined in terms of

the lack of injective non-surjective functions, as opposed to Frege’s ancestral notion

(cf. the relation X ⊀ X in Proposition 24 and Theorem 26). The observations

recorded in this section about the Π1n-comprehension schema are due to Heck ([67]

p. 192) and Linnebo ([100] p. 161). The trick of defining the graph of addition

and multiplication in terms of its initial segments in the proof of Theorem 22 is

adapted from Burgess and Hazen [16] pp. 6-10, although their concern there was

not with Frege’s Theorem.

Theorem 22. PA2 is interpretable in the theory consisting of (Q1)-(Q2) and

the comprehension schema (3.3). More generally, Π1n − CA0 is interpretable in the

theory consisting of (Q1)-(Q2) and the comprehension schema (3.3) restricted to

Π1n-formulas for n > 0.

Proof. Suppose that we are working with a structure M = (M, S1, S2, . . . , 0, s) that

satisfies (Q1)-(Q2) and the comprehension schema (3.3) restricted to Π1n-formulas

for n > 0. In what follows, we will refer respectively to the element 0 and the

function s as “zero” and “successor.” It must be shown how to uniformly define

a model of Π1n − CA0 within this structure. We say that X in S1 is inductive if it

contains zero and is closed under successor. Let N be the intersection of all the

inductive sets X in S1, which exists in S1 by Π11-comprehension. Note that zero is

in N by construction, and note that N is closed under successor: for, if a is in N

then a is contained in every inductive set X, and by definition of inductive sets,

it follows that the successor of a is contained in every inductive set X, which is

to say that the successor of a is in N .

Hence, we can define the structure N = (N,S1 ∩ P (N), S2 ∩ P (N2), . . . , 0, s)

uniformly within M. This structure then satisfies (Q1)-(Q2) since M satisfies

(Q1)-(Q2). Further, N satisfies the Mathematical Induction Axiom (3.2), since

if F ∈ S1 ∩ P (N) contains zero and is closed under successor, then F ∈ S1

contains zero and is closed under successor, and so by definition of N , it follows

that N ⊆ F ⊆ N . For (Q3), let X be the subset of N for which the conclusion

holds, i.e., X = {a ∈ N : a ≠ 0 → ∃ w ∈ N a = sw}. Clearly zero is in X,

and suppose that a ∈ X ⊆ N : then of course sa = sw for some w ∈ N , namely

w = a, and hence sa ∈ X. Hence, by the Mathematical Induction Axiom (3.2),

it follows that X = N . Finally, before turning to the remainder of the axioms

of Robinson’s Q, note that since M satisfies Π1n-comprehension, we have that N

satisfies Π1n-comprehension as well, since the second-order parts of N are just the

second-order parts of M restricted to subsets of N .

To verify axioms Q4-Q5 of Robinson’s Q, we must first define addition. Let

x + y = z if and only if there is a graph of a partial function G ⊆ N3 such that

(x, y, z) ∈ G ⊆ N3 and

(x, 0, x) ∈ G & [(x, sy, z) ∈ G→ ∃ w sw = z & (x, y, w) ∈ G] (3.39)

That is, we define the graph of addition as the union of its initial segments. Note

that this graph of addition exists by the Π11-Comprehension Schema. Further,

note that addition is well-defined on its domain. Suppose that G0 and G1 are

partial functions which satisfy equation (3.39) and fix an arbitrary x and let

Y = {y ∈ N : ∀ z0, z1 (x, y, z0) ∈ G0 & (x, y, z1) ∈ G1 → z0 = z1}. Clearly, 0 ∈ Y

and if y ∈ Y and (x, sy, z0) ∈ G0 and (x, sy, z1) ∈ G1 then there are w0, w1 such

that sw0 = z0 and sw1 = z1 and (x, y, w0) ∈ G0 and (x, y, w1) ∈ G1. Then since

y ∈ Y we have w0 = w1 and hence z0 = sw0 = sw1 = z1, so that sy ∈ Y . By the

Mathematical Induction Axiom (3.2), it follows that Y = N . Hence, in fact, addition

is a well-defined function on its domain. To show that it is a total function, fix

an arbitrary x and let Y = {y ∈ N : ∃ z x+ y = z}. Clearly, 0 ∈ Y , since we can

choose G = {(x, 0, x)}. Suppose that y ∈ Y , say, with (x, y, z) ∈ G. To see that

sy ∈ Y , set G′ = G ∪ {(x, sy, sz)}. Then clearly G′ also satisfies equation (3.39).

Hence, in fact, addition is a total function. Finally, the verification of Q4 and Q5

follows directly from our construction in equation (3.39). To verify Q6-Q7, just

define multiplication analogously.
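
As an informal illustration of the definition in equation (3.39), the following Python sketch works over the standard natural numbers rather than over an arbitrary structure M. It builds the finite initial segments G used in the totality argument and checks the two clauses of (3.39) on them. The names succ, satisfies_339, initial_segment and add are hypothetical and do not come from the text, and only triples whose first coordinate is the fixed x are inspected.

```python
def succ(n):
    # Successor on the standard natural numbers (an assumption of this sketch).
    return n + 1

def satisfies_339(g, x):
    """Check both clauses of (3.39) for a finite set g of triples."""
    if (x, 0, x) not in g:
        return False
    for (a, b, z) in g:
        if a == x and b > 0:
            y = b - 1  # b = s(y)
            # Need some w with s(w) = z and (x, y, w) in g.
            if not any(succ(w) == z and (x, y, w) in g for w in range(z)):
                return False
    return True

def initial_segment(x, k):
    """Build G step by step, as in the totality argument: G' = G ∪ {(x, sy, sz)}."""
    y, z = 0, x
    g = {(x, 0, x)}
    for _ in range(k):
        y, z = succ(y), succ(z)
        g.add((x, y, z))
    return g

def add(x, y):
    """Return the z with (x, y, z) in an initial segment satisfying (3.39)."""
    g = initial_segment(x, y)
    assert satisfies_339(g, x)
    (z,) = {c for (a, b, c) in g if (a, b) == (x, y)}
    return z
```

A set of triples such as {(2, 0, 2), (2, 1, 5)} fails the second clause, since no w with sw = 5 has (2, 0, w) in the set; this is the finite shadow of the uniqueness argument above.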

Remark 23. Hence, it remains to show that the theory consisting of (Q1)-(Q2)

and the comprehension schema (3.3) is interpretable in HP2. In preparation for

this result (Theorem 26), we first record some elementary considerations in the

following proposition.

Proposition 24. Suppose that (M,S1, S2, . . . ,#) models AHP0. For X, Y in S1,

define X ≺ Y if and only if there is an injective non-surjective function f : X → Y

such that graph(f) is in S2. Then for a, b ∈ M and X,U,A,B in S1, it follows

that

(i) If a ∉ X and X ∪ {a} ≺ X ∪ {a} then X ≺ X.

(ii) If a ∉ X and U ≺ X ∪ {a} then U ≺ X or #U = #X.

(iii) If a ∈ A, b ∈ B and #A = #B then #(A − {a}) = #(B − {b})

(iv) If X ≠ ∅ then ∅ ≺ X

(v) X ⊀ ∅

Proof. For (i), suppose that f : X ∪ {a} → X ∪ {a} is an injection that is not a

surjection. If f(X) ⊆ X then f(a) = a and so f(X) ⊊ X, and hence X ≺ X. If

f(X) ⊈ X then say f(y) = a where y ∈ X and f(a) = z ∈ X, and hence define

g : X → X by g(y) = z and g = f otherwise. Then g is injective and misses

the same point that f does. Further, the graph of g exists by the arithmetical

comprehension schema.

For (ii), suppose that f : U → X∪{a} is an injection which is not a surjection.

If f(U) ⊆ X then #U = #X when f : U → X is a bijection and U ≺ X otherwise.

If f(U) ⊈ X then say f(y) = a and f misses b ∈ X, in which case we define an

injective function g : U → X by g(y) = b and g = f otherwise. The graph

of g exists by the arithmetical comprehension schema. If g is a bijection, then

#U = #X and U ≺ X otherwise.

For (iii), suppose that f : A → B is a bijection. If f(a) = b then f ↾ (A − {a})

is the desired bijection. If f(a) = d for d ≠ b and f(c) = b for c ≠ a, then define a

bijection g : (A− {a})→ (B − {b}) by g(c) = d and g = f otherwise. The graph

of this function g then exists by the arithmetical comprehension schema.

For (iv), note that the “empty” binary relation witnesses that there is an

injective non-surjective function from ∅ to X.

For (v), note that if X ≺ ∅, then there would be an injective non-surjective

function f : X → ∅, which would imply that there was an element in ∅ \ rng(f),

which would imply that there was some element in ∅.

Remark 25. It is well-known that the chief difficulty in the proof of the following

theorem is establishing the totality of the successor function (cf. remarks to this

effect in Wright [158] p. 161). Prior to looking at the proof, it is helpful to

think about what happens on the standard models (α, P (α), P (α2), . . . ,#) from

§ 3.2.1, where α is an ordinal which is not a cardinal and where # : P (α) → α

is cardinality. It is easy to see that ω is uniformly definable in each of these

structures. Further, it is easy to see that for each n ∈ ω, it follows that

{#W : W ≺ {0, . . . , n}} = {0, . . . , n} (3.40)

where as in the previous proposition, X ≺ Y if and only if there is an injective

non-surjective function f : X → Y . From this we see that

{0, . . . , n} ⊀ {0, . . . , n} & #{0, . . . , n} = #{#W : W ≺ {0, . . . , n}} (3.41)

as well as

s(#{0, . . . , n}) = s(n+ 1) = n+ 2 = #({0, . . . , n} ∪ {n+ 1})

= #({#W : W ≺ {0, . . . , n}} ∪ {#({0, . . . , n})}) (3.42)

The entire idea of the below proof is to show that we can replicate these consid-

erations in arbitrary models of HP2. So in such an arbitrary model, we will define

an analogue N of ω, and for analogues X of {0, . . . , n}, we will find that

s(#X) = #({#W : W ≺ X} ∪ {#X}) (3.43)

This, in any case, is the heuristic explanation of the proof of the totality of the

successor function in the following theorem.
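
The finite computations in equations (3.40)-(3.42) can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: it replaces the standard model by the finite universe {0, . . . , n + 1}, takes # to be ordinary cardinality, and decides W ≺ X by brute-force search over injections. The function names precedes, subsets and check are hypothetical.

```python
from itertools import combinations, permutations

def precedes(w, x):
    """W ≺ X: some injection f : W → X fails to be a surjection."""
    w, x = sorted(w), sorted(x)
    # An injection W → X is an ordered choice of |W| distinct elements of X.
    return any(set(image) != set(x) for image in permutations(x, len(w)))

def subsets(universe):
    items = sorted(universe)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def check(n):
    """Test the analogues of (3.40), (3.41) and (3.42) for X = {0, ..., n}."""
    x = frozenset(range(n + 1))
    universe = frozenset(range(n + 2))  # finite ambient universe (an assumption)
    smaller = {len(w) for w in subsets(universe) if precedes(w, x)}
    eq_340 = smaller == set(x)
    eq_341 = (not precedes(x, x)) and len(x) == len(smaller)
    eq_342 = len(x) + 1 == len(smaller | {len(x)})
    return eq_340, eq_341, eq_342
```

Since the search is exhaustive, it is only feasible for very small n; the point is merely to see the pattern that the proof of the following theorem replicates in arbitrary models.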

Theorem 26. The theory consisting of (Q1)-(Q2) and the comprehension schema (3.3)

is interpretable in HP2. More generally, the theory consisting of (Q1)-(Q2) and the

comprehension schema (3.3) restricted to Π1n-formulas is interpretable in Π1n − HP0

for n > 0.

Proof. Suppose that we are working with a structure M = (M, S1, S2, . . . , #) that

satisfies Π1n − HP0. It must be shown how to uniformly define a model of (Q1)-(Q2)

and the comprehension schema (3.3) restricted to Π1n-formulas. Define 0 = #∅

and define s(x, y) if and only if there is X, Y in S1 such that #X = x,#Y = y,

and there is b ∈ Y such that #X = #(Y − {b}). That is, s(x, y) says that x, y

are respectively cardinalities of sets X, Y and the cardinality of X is equal to the

cardinality of Y minus one point. Note that the relation s exists in S2 by the Π11-

comprehension schema. In what follows, we will respectively refer to the element

0 and the relation s as “zero” and “successor,” keeping in mind that formally s

is a binary relation. Then say that X in S1 is inductive if it contains zero and is

closed under successors, that is, if x ∈ X and s(x, y) then y ∈ X. Then define

N to be the intersection of all the inductive sets, so that N is in S1 by the Π11-

comprehension schema. Now we show that (i) s is a well-defined function on its

domain and that (ii) s is a total function on N that (iii) maps elements of N to

elements of N and that (iv) satisfies axioms Q1-Q2 on N .

For (i), to see that s is well-defined, suppose that s(x, y) and s(x, z). Then

x = #X, y = #Y , z = #Z and there exists b ∈ Y, c ∈ Z such that #X =

#(Y − {b}) = #(Z − {c}). Then there is a bijection f : (Y − {b}) → (Z − {c})

whose graph is in S2. Define f̄ : Y → Z by setting f̄ ↾ (Y − {b}) = f and f̄(b) = c.

Then the graph of f̄ is in S2 by the arithmetical comprehension schema. Further,

since f̄ : Y → Z is a bijection, it follows that y = #Y = #Z = z. Hence, s is a

well-defined function on its domain.

For (ii), recall from Proposition 24 that for X, Y in S1, we say X ≺ Y if and

only if there is an injective non-surjective function f : X → Y such that graph(f)

is in S2. Then by iterated applications of Π11-comprehension, the following exist

in S2 and S1 respectively

R = {(#W,#X) : W ≺ X} (3.44)

Z = {#X : X ⊀ X & ∃ Y (∀ w w ∈ Y ↔ (w,#X) ∈ R) & #X = #Y } (3.45)

Note that

Z = {#X : X ⊀ X & #X = #({#W : W ≺ X})} (3.46)

(It may be heuristically helpful to compare this with equation (3.41)). Suppose

that #X is in Z. Then X ⊀ X and #X = #({#W : W ≺ X}). Then

s(#X,#({#W : W ≺ X} ∪ {#X})) (3.47)

(Likewise, it may be helpful to compare this with equation (3.43)). Hence, we

have the inclusion Z ⊆ {x : ∃ y s(x, y)}, and so it suffices to show that Z is

inductive.

Clearly, 0 ∈ Z. Suppose that #X is in Z, so that X ⊀ X and #X =

#({#W : W ≺ X}). Then s(#X,#({#W : W ≺ X} ∪ {#X})). Since successor

is well-defined on its domain by part (i), it suffices to show that #({#W : W ≺

X} ∪ {#X}) is in Z. Since X ⊀ X and X is bijective with {#W : W ≺ X}, we have

{#W : W ≺ X} ⊀ {#W : W ≺ X}. Since #X ∉ {#W : W ≺ X}, it follows from

Proposition 24 (i) that {#W : W ≺ X} ∪ {#X} ⊀ {#W : W ≺ X} ∪ {#X}. Hence,

#({#W : W ≺ X} ∪ {#X}) satisfies

the first conjunct of Z in equation (3.46). To see that #({#W : W ≺ X}∪{#X})

satisfies the second conjunct of Z in equation (3.46), it suffices to show that

{#W : W ≺ X} ∪ {#X} = {#U : U ≺ {#W : W ≺ X} ∪ {#X}} (3.48)

For the left-to-right direction, suppose first that W ≺ X. Since X is bijective

with {#W : W ≺ X}, we have that W ≺ {#W : W ≺ X} ∪ {#X}. Continuing

with the left-to-right direction, suppose that #U = #X. Since X is bijective with

{#W : W ≺ X}, we have that #U = #({#W : W ≺ X}) and hence U ≺ {#W :

W ≺ X} ∪ {#X}. For the right-to-left direction, suppose that U ≺ {#W : W ≺

X} ∪ {#X}. Since #X ∉ {#W : W ≺ X}, we have by Proposition 24 (ii) that

#U = #({#W : W ≺ X}) = #X or U ≺ {#W : W ≺ X}. Hence, in fact

equation (3.48) holds. It follows that #({#W : W ≺ X}∪{#X}) is in Z. Hence,

Z is an inductive set, and as mentioned at the close of the above paragraph, it

thus follows that successor is a total function on N .

(iii) Now we show that successor maps elements of N to elements of N . Sup-

pose that a is in N . Then by definition, a is contained in every inductive set, and

by parts (i)-(ii), it follows that there is unique b such that s(a, b), from which it

follows that b is contained in every inductive set, so that b is contained in N as

well. Hence, successor maps elements of N to elements of N .

(iv) Finally, we note that the successor function s satisfies axioms (Q1)-(Q2).

To see that it satisfies (Q1), note that if s#X = 0 = #∅, then ∅ would be

bijective with a non-empty set, which is a contradiction. To see that it satisfies

(Q2), suppose that s#X = s#Y . Then s#X = #A where #X = #(A − {a})

for some a ∈ A and s#Y = #B where #Y = #(B − {b}) for some b ∈ B. Then

Proposition 24 (iii) implies that #X = #(A− {a}) = #(B − {b}) = #Y .

Putting this all together, we can uniformly define the structure N = (N, S1 ∩

P (N), S2 ∩ P (N2), . . . , 0, s) which satisfies (Q1)-(Q2). Finally, note that since M

satisfies Π1n-comprehension, we have that N satisfies Π1n-comprehension as well,

since the second-order parts of N are just the second-order parts of M restricted

to subsets of N .

Corollary 27. PA2 is interpretable in HP2. More generally, Π1n − CA0 is inter-

pretable in Π1n − HP0 for n > 0.

Proof. This follows immediately from Theorem 26 and Theorem 22.

Remark 28. The following theorem was first noted by Boolos ([10]). We include it

here for the sake of having a relatively self-contained presentation of the main

results in this area, and because we will use Boolos’ construction to transfer facts

about the provability relation from subsystems of PA2 to subsystems of HP2 (cf.

the proofs of Proposition 53 and Proposition 55).

Theorem 29. HP2 is interpretable in PA2. More generally, Π1n − HP0 is interpretable

in Π1n − CA0 for n > 0, and Σ11 − PH0 is interpretable in Σ11 − AC0 and AHP0 is inter-

pretable in ACA0.

Proof. We begin with the proof of the interpretability of AHP0 in ACA0. We will

note how this proof yields all the other results as well. Let us work in a model

M = (M,S1, S2, . . . ,⊕,⊗) of ACA0, where Sn ⊆ P (Mn). We must show how to

uniformly define a model of AHP0. Consider the model N = (M,S1, S2, . . . ,#)

where #(X) = n + 1 if |X| = n, and where #(X) = 0 if X is infinite. Then

N is clearly definable in M since the graph of # is arithmetically definable.

Further, since this graph is arithmetically definable, it follows that N satisfies

the arithmetical comprehension schema. Further, by Simpson [138] Lemma II.3.6

p. 70, ACA0 proves that any two infinite sets are bijective, so that N is a model

of AHP0. Hence, in fact we have that AHP0 is interpretable in ACA0. Further,

it is obvious from this construction that N will satisfy whatever comprehension

schemas M satisfies.
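
To illustrate how the function # of the preceding proof interacts with the definitions of zero and successor from the proof of Theorem 26, the following Python sketch computes the successor relation s(x, y) on a small sample of sets. It is a sketch under stated assumptions: only finite and cofinite subsets of ω are represented, witnesses b are searched below a fixed bound, and the names card, remove_point, members, sample_sets and successor_relation are hypothetical.

```python
from itertools import combinations

# A subset of ω is modeled as ("fin", S) for a finite set S, or as
# ("cofin", E) for the cofinite set ω − E. This restriction is an
# assumption of the sketch, not part of the construction in the text.

def card(x):
    """The function # of the proof: |X| + 1 for finite X, 0 for infinite X."""
    kind, s = x
    return len(s) + 1 if kind == "fin" else 0

def remove_point(x, b):
    """Return X − {b} in the same representation."""
    kind, s = x
    return (kind, s - {b}) if kind == "fin" else (kind, s | {b})

def members(x, bound):
    """Elements of X below `bound`, used as candidate witnesses b."""
    kind, s = x
    return [n for n in range(bound) if (n in s) == (kind == "fin")]

def sample_sets(bound):
    out = []
    for k in range(bound + 1):
        for c in combinations(range(bound), k):
            out.append(("fin", frozenset(c)))
            out.append(("cofin", frozenset(c)))
    return out

def successor_relation(bound):
    """Pairs (x, y) with x = #(Y − {b}) and y = #Y for some sampled Y and b in Y."""
    rel = set()
    for y_set in sample_sets(bound):
        for b in members(y_set, bound + 1):
            rel.add((card(remove_point(y_set, b)), card(y_set)))
    return rel
```

On this sample the relation consists of the pairs (k, k + 1) for k ≥ 1 together with the pair (0, 0). Zero is #∅ = 1, nothing has zero as a successor, and the code 0 of the infinite sets is its own successor but is not reached from zero.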

Corollary 30. PA2 is mutually interpretable with HP2. More generally, Π1n − CA0

is mutually interpretable with Π1n − HP0 for n > 0.

Proof. This follows immediately from Corollary 27 and Theorem 29.

Remark 31. The notion of faithful interpretability is a modification of inter-

pretability in the following respect: whereas the interpretability of one theory in

another only requires that translations of theorems of the interpreted theory are

theorems of the interpreting theory, faithful interpretability additionally requires

that translations of non-theorems of the interpreted theory are non-theorems of

the interpreting theory (cf. Lindstrom [99] § 6.2 pp. 106 ff). It is not difficult to

see, using the ideas from the proof of Corollary 27 and Theorem 29, that PA2 is

faithfully interpretable in HP2. However, the converse is not obvious, that is, it is

not obvious whether or not HP2 is faithfully interpretable in PA2 (cf. Question 113).

In § 3.2.1, and in particular at Remark 18, we noted that there are numerous nat-

ural sentences that are independent of HP2, whereas there are comparatively few

natural sentences which are known to be independent of PA2. This leads one to

suspect that HP2 is not faithfully interpretable in PA2 or that any such faithful in-

terpretation is comparatively unnatural, since such a faithful interpretation would

allow us to turn all the independent sentences of HP2 into independent sentences

of PA2.

3.3 Standard Models of Subsystems of BL2 and Associated Results

The primary goal of this section is to study models of subsystems of BL2 that

are standard in the sense that they have the form (ω, S1, S2, . . . , ∂), where the sets

Sn ⊆ P (ωn) all come from some antecedently fixed computational class (e.g. the

recursive sets, the arithmetical sets, the hyperarithmetical sets, etc.). The main

result of this section is Theorem 60 which gives a construction of a standard model

of the hyperarithmetic subsystem of BL0 in terms of the hyperarithmetic subsets

of natural numbers. Further, this construction isolates a certain sentence Inf (cf.

Definition 58) such that Σ11 − AC0 ≤I Σ11 − LB0 + Inf <I Π11 − CA0 (cf. Corollary 61

and Figure 3.2).

In the preliminary section § 3.3.1, we record some elementary facts about arbi-

trary models of subsystems of BL2, focusing in particular on the fact that arbitrary

models of the hyperarithmetic subsystems of BL2 require the existence of injective

non-surjective functions (cf. Proposition 38). Such functions are important both

because they are used to define the sentence Inf (cf. Definition 58) and because

such functions are not required to exist by the hyperarithmetic subsystems of HP2

(cf. Remark 37). Further, in the preliminary section § 3.3.2, we review some

elementary facts about hyperarithmetic theory, which we will employ in § 3.3.3.

We also use these facts to fill in some parts of the provability relation (cf. Propo-

sitions 47-53 and Figure 3.1). Finally, in § 3.3.3, we turn to the main results of

this section, namely the aforementioned Theorem 60 and Corollary 61.

3.3.1 Generalities on Models of Subsystems of BL2

Proposition 32. Suppose that Y ⊆ M is definable with parameters by an

arithmetical formula in the structure (M,S1, S2, . . . , ∂) (resp. in the structure

(M,S1, S2, . . . ,#)). Then Y is definable with parameters by an arithmetical for-

mula that does not contain any instances of ∂ (resp. does not contain any instances

of #).

Proof. If Y ⊆ M is definable in (M,S1, S2, . . . , ∂) by an arithmetical formula ϕ,

and if ∂(P ) appears in ϕ, then P is not free in ϕ but rather is a parameter from

S1 and hence a = ∂(P ) is a parameter from M . So, replacing parameters from

S1 with parameters from M , it follows that the set Y is also definable by an

arithmetical formula that does not contain any instances of ∂.

Proposition 33. Suppose that M is a structure and ∂ : D(M) → M is an injection,

where D(Mn) is the set of definable subsets of Mn. Then (M, D(M), D(M2), . . . , ∂)

is a model of ABL0.

Proof. It is a model of Basic Law V since ∂ is an injection (cf. discussion subse-

quent to (3.6)). Further, it satisfies the arithmetical comprehension schema, since

if X ⊆ M is defined by an arithmetical formula, then by Proposition 32 it is de-

fined by an arithmetical formula which does not include any instances of ∂. Hence,

since D(M) is closed under arithmetical comprehension, it follows that X is in

D(M), so that the structure (M,D(M), D(M2), . . . , ∂) satisfies the arithmetical

comprehension schema.

Proposition 34. Suppose that (M,S1, S2, . . . , ∂) is a model of ∆11 − BL0. (a)

Then there is an injective function s : M → M such that s(x) = ∂({x}) and such

that graph(s) is in S2. (b) Further, there is a function s : Mn → M such that

s(x1, . . . , xn) = ∂({x1, . . . , xn}) and such that graph(s) is in Sn+1.

Proof. The proof of (b) is identical to the proof of (a), so we present only the proof

of (a). It suffices to show three things: first, that the graph of this function is ∆11,

second, that this function is well-defined and total, and third, that the function is

injective. Note that the following Σ11- and Π11-definitions of s(x) = y agree:

[∃ X (∀ z z ∈ X ↔ z = x) & ∂X = y]⇐⇒ [∀ Y (∀ z z ∈ Y ↔ z = x)→ ∂Y = y]

(3.49)

Suppose that the left-hand-side of this equation holds and that Y = {x}. Then

Y = X and hence ∂(Y ) = ∂(X) = y. Conversely, suppose that the right-hand-

side of this equation holds. By arithmetical comprehension, form the set X =

{x}. Then by the right-hand-side it is the case that ∂(X) = y. Hence, by

∆11-comprehension, there is an s such that s(x, y) if and only if both the left-

hand-side and the right-hand-side of the above equation holds with respect to x

and y. To see that the function is well-defined, suppose that the left-hand-side

holds both of x and y and of x and z. By arithmetical comprehension, form the

set Y = {x}. Then the right-hand-side implies that y = ∂(Y ) = z. Hence, the

function is well-defined. Further, it is everywhere defined because given x one

can use arithmetical comprehension to form X = {x}, and hence x and ∂(X)

will satisfy the right-hand-side. Finally, to see that the function s is injective,

suppose that s(x) = s(y). Then ∂({x}) = ∂({y}). By Basic Law V, it follows

that {x} = {y} and hence that x = y.

Remark 35. The following proposition generalizes the construction in the Russell

Paradox (cf. Proposition (10)). Note that in the following proposition, the term

rng∂ is employed to designate the range of the function ∂. However, this set need

not exist in the second-order parts of any of the models under consideration, even

though it is defined by a Σ11-formula in these models.

Proposition 36. Suppose that (M,S1, S2, . . . , ∂) is a model of ∆11 − BL0. For every

A in S1 such that A ⊆ rng∂, there is B in S1 such that B ⊆ A and ∂B ∈ rng∂−A.

Proof. First we claim that for all x it is the case that

[∃ X x ∈ A & ∂X = x & x ∉ X] ⇐⇒ [∀ Y x ∈ A & (∂Y = x → x ∉ Y )] (3.50)

Suppose that the left-hand-side holds, i.e., suppose that x ∈ A & ∂X = x & x ∉

X, and further suppose that Y is such that ∂Y = x. Then ∂X = x = ∂Y and

Basic Law V implies that X = Y , and hence x ∉ Y . Conversely, suppose that the

right-hand-side holds, i.e., suppose it is the case that ∀ Y x ∈ A & (∂Y = x → x ∉ Y ).

Since x ∈ A ⊆ rng∂, there is X such that ∂X = x, and hence x ∉ X. The claim

is proved, and, hence, by the ∆11-Comprehension Schema, there exists B such

that x ∈ B if and only if both the left-hand-side and right-hand-side of (3.50)

hold with respect to x. Note that it follows automatically from the left-hand-side

that B ⊆ A. So it remains to show that ∂B ∈ rng∂ − A. Suppose not. Then

∂B ∈ rng∂ ∩ A. Then either ∂B ∈ B or ∂B ∉ B. If ∂B ∈ B then by the

right-hand-side we have ∂B ∉ B, which is a contradiction. If ∂B ∉ B, then by the

failure of the left-hand-side we have that ∀ X ∂B ∉ A ∨ ∂X ≠ ∂B ∨ ∂B ∈ X. Applying this

to X = B we have that ∂B ∉ A ∨ ∂B ≠ ∂B ∨ ∂B ∈ B. Since by hypothesis we

have that ∂B ∈ rng∂ ∩ A, we must conclude that ∂B ∈ B, which again contradicts

our supposition. Hence, in fact, ∂B ∈ rng∂ − A.
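
The diagonal argument in the preceding proof uses only the injectivity of ∂ and the existence of B. As a hedged illustration, the following Python sketch exhaustively checks a finite shadow of the argument: for every small family S1 of subsets of M = {0, 1, 2}, every injection ∂ : S1 → M, and every A in S1 with A ⊆ rng∂, if the diagonal set B happens to belong to S1 then ∂B ∉ A. Such finite families need not satisfy any comprehension schema, so they are not models of ∆11 − BL0; the names diagonal and check_all are hypothetical.

```python
from itertools import combinations, permutations

def diagonal(a, d):
    """B = {x in A : x = ∂X for some X with x not in X}; d maps sets to codes."""
    return frozenset(x for x in a
                     for s, code in d.items()
                     if code == x and x not in s)

def check_all(m=3, max_family=3):
    """Return True if no sampled injection has B in S1 with ∂B in A."""
    universe = range(m)
    power = [frozenset(c) for k in range(m + 1)
             for c in combinations(universe, k)]
    for size in range(1, max_family + 1):
        for family in combinations(power, size):
            for codes in permutations(universe, size):
                d = dict(zip(family, codes))  # an injection ∂ : S1 → M
                rng = set(codes)
                for a in family:
                    if not a <= rng:
                        continue
                    b = diagonal(a, d)
                    assert b <= a  # B ⊆ A holds by construction
                    if b in d and d[b] in a:
                        return False
    return True
```

The search finds no counterexample, as the proof predicts: whenever B is available, its code ∂B lies outside A.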

Remark 37. The following corollary is important because it shows that satisfying

∆11 − BL0 requires the existence of injective non-surjective functions. As we note

in Proposition 39 and later in Corollary 80, this is not the case with ABL0 and

∆11 − HP0.

Corollary 38. Suppose that (M,S1, S2, . . . , ∂) is a model of ∆11 − BL0. Then there

is an injective non-surjective function s : M → M such that graph(s) is in S2 and

such that s(x) = ∂({x}).

Proof. By Proposition 34 there is an injective function s : M → M such that

rng(s) ⊆ rng∂ and such that graph(s) is in S2 and such that s(x) = ∂({x}). By

Proposition 36, there is B in S1 such that B ⊆ rng(s) and ∂B ∈ rng∂ − rng(s).

Hence, s : M →M is not surjective.

Proposition 39. There is a structure (M,S1, S2, . . .) such that

(i) For any injection ∂ : S1 → M it is the case that (M,S1, S2, . . . , ∂) mod-

els ABL0.

(ii) There is no injection ∂ : S1 → M such that (M,S1, S2, . . . , ∂) models

∆11 − BL0.

Proof. Let M be an algebraically closed field (cf. Marker [107] Example 4.3.10

p. 140) and let Sn = D(Mn), i.e. the definable subsets of Mn. Suppose that

s : M → M was an injective non-surjective function whose graph was in S2 = D(M2).

Then this implies that there is a definable injective non-surjective function s :

M →M , which contradicts Ax’s Theorem (cf. Theorem 72). For (i), note that by

Proposition 33, the structure (M,S1, S2, . . . , ∂) is a model of ABL0 for any injection

∂ : D(M) → M . For (ii), note that if there was such an injection ∂ : S1 → M , then

by Corollary 38, there would be an injective non-surjective s : M →M such that

graph(s) is in S2, which is a contradiction.

3.3.2 Hyperarithmetic Theory and Related Results

Definition 40. Suppose that X, Y ∈ 2^ω. Then X ≤T Y if X is Turing computable

from Y , or equivalently if X is ∆^{0,Y}_1. Further, X ≤a Y if X is arithmetical in Y ,

or equivalently if there is n > 0 such that X is ∆^{0,Y}_n. Finally, X ≤h Y if X is

hyperarithmetic in Y , or equivalently if X is ∆^{1,Y}_1. (For computational definitions

of these reducibilities and proofs that they correspond with the relevant definability

notion, see respectively Soare [139] p. 64, Odifreddi [118] p. 375, Sacks [134] p. 44.)

Definition 41. Suppose that Y ∈ 2ω. Then define

REC(Y ) = {X ∈ 2ω : X ≤T Y } (3.51)

ARITH(Y ) = {X ∈ 2ω : X ≤a Y } (3.52)

HYP(Y ) = {X ∈ 2ω : X ≤h Y } (3.53)

Further, let REC = REC(∅) and ARITH = ARITH(∅) and HYP = HYP(∅) (cf.

Simpson [138] Remark I.7.5. p. 25, Example I.11.2 p. 39).

Remark 42. Recall that structures in the language of HP2 and BL2 have the form

(M,S1, S2, . . . ,#), where Sn ⊆ P (Mn) and # : S1 → M (cf. equation (3.4)).

If # : HYP(Y ) → ω, then (ω,HYP(Y ),#) will be used as an abbreviation for

the structure (ω, S1, S2, . . . ,#), where Sn ⊆ P (ωn) is the set of n-ary relations

whose graph is in HYP(Y ). Similarly, in what follows, we will sometimes use the

abbreviations (ω,REC(Y ),#) and (ω,ARITH(Y ),#).

Proposition 43. The relation X ≤h Y is Π11.

Proof. See Sacks [134] p. 45.

Theorem 44. (Kleene’s Theorem on Restricted Quantification) Suppose that

ϕ(X, Y ) is a Π11-predicate. Then ∃ X ≤h Y ϕ(X, Y ) is a Π11-predicate. Moreover,

this is provable in Π11 − CA0.

Proof. See Kleene [92] and Moschovakis [113] Theorem 4D.3 p. 220. That this

theorem is provable in Π11 − CA0 was noted by Simpson [138] VIII.3.20 p. 330.

Theorem 45. (Spector-Gandy Theorem) Suppose that ϕ(Y ) is a Π11-predicate.

Then there is an arithmetic predicate ψ(X, Y ) such that ϕ(Y ) ↔ ∃ X ≤h

Y ψ(X, Y ).

Proof. See Spector and Gandy ([140], [49]), Sacks [134] Theorem III.3.5 p. 61 and

Exercise III.3.13 p. 62.

Remark 46. The following proposition is non-trivial only because the second-

order quantifiers must be evaluated with respect to the second-order part S1 ⊆

P (ω) of the structure (ω, S1) and not with respect to P (ω) itself. For instance,

one cannot infer that (ω,HYP(Y )) |= ¬Π11 − CA0 simply from the fact that OY is

Π11 but not Σ11, since to say this is merely to say that OY is Π11-definable but not

Σ11-definable on the structure (ω, P (ω)).

Proposition 47. Suppose that Y ∈ 2ω. Then (ω,ARITH(Y )) |= ACA0+¬∆11 − CA0

and (ω,HYP(Y )) |= Σ11 − AC0 + ¬Π11 − CA0.

Proof. For the fact that (ω,ARITH(Y )) |= ACA0, see Simpson [138] Theorem VIII.1.13

p. 313. Suppose for the sake of contradiction that (ω,ARITH(Y )) |= ∆11 − CA0. But note that

(n,m) ∈ Y^(ω) ⇐⇒ ∃ X ∈ ARITH(Y ) X = ⊕_{i=1}^{n} Y^(i) & (n,m) ∈ X

⇐⇒ ∀ X ∈ ARITH(Y ) X = ⊕_{i=1}^{n} Y^(i) → (n,m) ∈ X (3.54)

and hence Y^(ω) ∈ ARITH(Y ), which would contradict Tarski’s Theorem on Truth.

Hence, in fact (ω,ARITH(Y )) |= ¬∆11 − CA0. For the fact that (ω,HYP(Y )) |=

Σ11 − AC0, see Simpson [138] Theorem VIII.4.5 p. 334 and Theorem VIII.4.8 p. 335.

This proof uses Kleene’s Theorem on Restricted Quantification 44, and below in

Theorem 60 we will emulate this proof in the setting of BL2. Suppose for the

sake of contradiction that (ω,HYP(Y )) |= Π11 − CA0. Since OY is Π1,Y1 , by the

Spector-Gandy Theorem (45), there is an arithmetic predicate ψ(n,X, Y ) such

that n ∈ OY ⇐⇒ ∃ X ≤h Y ψ(n,X, Y ). Then OY is Σ11-definable on (ω,HYP(Y ))

and hence exists in HYP(Y ) by Π11 − CA0, which contradicts that OY is not in

HYP(Y ).

Corollary 48. Suppose that there is a Π11-formula θ(X, Y, Z) such that for all

Z ∈ 2ω the set GZ = {(X, Y ) ∈ 2ω × 2ω : θ(X, Y, Z)} is the graph of a function

gZ : HYP(Z) → HYP(Z). Then the graph GZ of gZ is Σ11-definable in the

structure (ω,HYP(Z)) uniformly in Z.

Proof. Note that since gZ : HYP(Z)→ HYP(Z), we have that for all X, Y, Z ∈ 2ω

θ(X, Y, Z) =⇒ X ⊕ Y ≤h Z (3.55)

By the Spector-Gandy Theorem (45), there is an arithmetical predicate ψ(X, Y, Z,W )

such that for all X, Y, Z ∈ 2ω

θ(X, Y, Z)⇐⇒ ∃ W ≤h X ⊕ Y ⊕ Z ψ(X, Y, Z,W ) (3.56)

Putting the two previous equations together, we have that for all X, Y, Z ∈ 2ω

θ(X, Y, Z)⇐⇒ ∃ W ≤h Z ψ(X, Y, Z,W ) (3.57)

Then for all X, Y, Z ∈ 2ω

gZ(X) = Y ⇐⇒ (ω,HYP(Z)) |= ∃ W ψ(X, Y, Z,W ) (3.58)

Hence, in fact the graph GZ of gZ is Σ11-definable in the structure (ω,HYP(Z))

uniformly in Z.

Theorem 49. (Kondo’s Uniformization Theorem) Suppose that ϕ(X, Y ) is a Π11

predicate. Then there is a Π11-predicate ϕ′(X, Y ) such that

∀ X, Y [ϕ′(X, Y )→ ϕ(X, Y )] (3.59)

∀ X [∃ Y ϕ(X, Y )]→ [∃!Y ϕ′(X, Y )] (3.60)

Moreover, this is provable in Π11 − CA0.

Proof. See Moschovakis [113] pp. 235-236. Simpson notes that Kondo’s theorem

is provable in Π11 − CA0 (cf. [138] Theorem VI.2.6 p. 225).

Remark 50. The following two propositions use some of the preceding material

to fill in some information about the provability relation (cf. Figure 3.1).

Proposition 51. There are models of ABL0 + ¬∆11 − BL0.

Proof. Choose any injection ∂ : ARITH→ ω. Then by Proposition 33 the struc-

ture (ω,ARITH, ∂) is a model of ABL0. Further, since the graphs of addition and

multiplication are in ARITH, if (ω,ARITH, ∂) |= ∆11 − BL0, then one would have

that ∅(ω) ∈ ARITH (cf. equation (3.54)), which would contradict Tarski’s theorem

on truth.

Remark 52. The construction in the following proposition is the same construction

as Boolos used to prove the interpretability of HP2 in PA2 (cf. the proof of

Theorem 29).

Proposition 53. There are models of AHP0 +¬∆11 − HP0 and Σ11 − PH0 +¬Π11 − HP0

and ∆11 − HP0 + ¬Σ11 − PH0.

Proof. Define a function # : ARITH → ω by #X = 0 if X is infinite and

#X = |X| + 1 if X is finite. By Simpson [138] Lemma II.3.6 p. 70, ACA0 proves

that any two infinite sets are bijective, and hence (ω,ARITH,#) is a model of

Hume’s Principle. Further, it satisfies the arithmetical comprehension schema,

since if X ⊆ ω is defined by an arithmetical formula, then by Proposition 32 it

is defined by an arithmetical formula that does not include any instances of #.

Hence, since ARITH is closed under arithmetical comprehension, it follows that X

is in ARITH, so that the structure (ω,ARITH,#) satisfies the arithmetical comprehension

schema. Since ∅(ω) ∉ ARITH but ∅(ω) is ∆11-definable over ARITH using

the graphs of addition and multiplication as parameters (cf. equation (3.54)),

we have that (ω,ARITH,#) is a model of AHP0 + ¬∆11 − HP0. Similarly, using the

fact that the graph of # is arithmetical, we can argue that (ω,HYP,#) is a model

of Σ11 − PH0 + ¬Π11 − HP0. Likewise, Steel constructs a sequence of reals Gn such

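that $(\omega, \bigcup_{n=1}^{\infty} \mathrm{HYP}^{G_1 \oplus \cdots \oplus G_n})$ is a model of ∆11 − CA0 + ¬Σ11 − AC0 ([141] Theorem 4

pp. 68 ff), and we can argue as before that $(\omega, \bigcup_{n=1}^{\infty} \mathrm{HYP}^{G_1 \oplus \cdots \oplus G_n}, \#)$ is a model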

of ∆11 − HP0 + ¬Σ11 − PH0.
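The first step of this proof, the Boolos-style assignment #X = 0 for infinite X and #X = |X| + 1 for finite X, can be illustrated on a toy universe. The following is only a sketch under stated assumptions: the tag `INFINITE` and the helper names are hypothetical, the universe is a handful of finite sets plus one stand-in for an infinite set, and nothing here touches ARITH or HYP themselves.

```python
from itertools import combinations

# Hypothetical tag standing in for "some infinite subset of omega".
INFINITE = "infinite"

def boolos_number(X):
    """#X = 0 if X is infinite and #X = |X| + 1 if X is finite."""
    return 0 if X == INFINITE else len(X) + 1

def equinumerous(X, Y):
    """Finite sets biject iff they have the same size; any two infinite
    subsets of omega biject (the fact cited from Simpson, Lemma II.3.6)."""
    if X == INFINITE or Y == INFINITE:
        return X == INFINITE and Y == INFINITE
    return len(X) == len(Y)

# Toy universe: every subset of {0, 1, 2} together with one infinite set.
universe = [frozenset(c) for r in range(4) for c in combinations(range(3), r)]
universe.append(INFINITE)

# Hume's Principle: #X = #Y exactly when X and Y are equinumerous.
hume_holds = all(
    (boolos_number(X) == boolos_number(Y)) == equinumerous(X, Y)
    for X in universe
    for Y in universe
)
```

The shift by one for finite sets is what frees the value 0 for the infinite sets, so that the assignment respects equinumerosity in both directions.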

Remark 54. The following two propositions use elementary considerations about

arithmetical sets (cf. Definition 41) to record some observations about natural

functions whose existence cannot be proven in ABL0 or AHP0. For the motivation

for these propositions, see § 2.2, and in particular around equation (3.20). The only

reason for including these propositions here is that it seemed prudent to delay their

proof until the arithmetical sets had been introduced, which we did earlier in this

section (cf. Definition 41). Note that the construction in the following proposition

is analogous to the construction used by Boolos to prove the interpretability of

HP2 in PA2 (cf. the proof of Corollary 29).

Proposition 55. There is a structure M and a function # : D(M)→M , where

D(Mn) is the definable subsets of Mn, such that (M,D(M), D(M2), . . . ,#) is a

model of AHP0, and further there is a binary relation R in D(M2) such that the set

{(n,m) : #(Rn) = m} does not exist in D(M2), where Rn = {x : Rnx}.

Proof. Let M be the standard model of first-order arithmetic (ω,+,×) so that

D(M) are the arithmetical sets ARITH. Choose a real Z ∉ ARITH, such as

∅(ω), and enumerate Z as z0, z1, z2, . . .. Define the function # : ARITH → ω by

#(X) = zn if X is finite and |X| = n and define #(X) = z∞ for some z∞ ∉ Z if X

is infinite. This structure satisfies arithmetical comprehension, since if X ⊆M is

defined by an arithmetical formula, then by Proposition 32 it is defined by an arith-

metical formula which does not include any instances of #. Hence, since D(M)

is closed under arithmetical comprehension, it follows that X is in D(M), so that

the structure (M,D(M), D(M2), . . . ,#) satisfies the arithmetical comprehension

schema. Further, by Simpson [138] Lemma II.3.6 p. 70, ACA0 proves that any two

infinite sets are bijective, and hence the structure (M,D(M), D(M2), . . . ,#) is a

model of Hume’s Principle. Hence, (M,D(M), D(M2), . . . ,#) is a model of AHP0.

Consider now the set R = {(n,m) : m < n}, which is clearly arithmetical and so

exists in D(M2). Then Rn = {x : Rnx} = {0, . . . , n− 1} and #(Rn) = zn. Then

the set

{(n,m) : #(Rn) = m} = {(n,m) : zn = m} (3.61)

is equal to the graph of n ↦ zn, which is not arithmetical: for, if it were arithmetical,

then its range Z would be arithmetical, which contradicts the hypothesis

on Z.
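The combinatorial core of this argument is that the sections Rn = {0, . . . , n − 1} have size n, so the map n ↦ #(Rn) simply re-enumerates Z. A minimal sketch follows, assuming a computable stand-in for Z (the odd numbers), which of course is not a non-arithmetical set; the function names are hypothetical and only the bookkeeping is illustrated.

```python
def enumerate_Z(count):
    """Stand-in for z_0, z_1, ...; here Z is the odd numbers.
    Assumption: a genuine Z would be non-arithmetical, which no program can list."""
    return [2 * n + 1 for n in range(count)]

# z_infinity: a value outside Z, reserved for infinite sets.
Z_INFINITY = 0

def number_of(X, z):
    """#(X) = z_n when X is finite and |X| = n."""
    return z[len(X)]

def section(n):
    """R_n = {x : x < n} = {0, ..., n - 1} for R = {(n, m) : m < n}."""
    return frozenset(range(n))

z = enumerate_Z(10)

# The set {(n, m) : #(R_n) = m}, restricted to n < 10.
graph = {(n, number_of(section(n), z)) for n in range(10)}

# Projecting the graph onto its second coordinate recovers Z.
recovered = sorted(m for (_, m) in graph)
```

Since the range of the graph is Z itself, any definition of the graph would yield a definition of Z, which is the contradiction used in the proof.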

Proposition 56. There is a structure M and an injection ∂ : D(M)→M , where

D(Mn) is the definable subsets of Mn, such that (M,D(M), D(M2), . . . , ∂) is a

model of ABL0, and further there is a binary relation R in D(M2) such that the set

{(n,m) : ∂(Rn) = m} does not exist in D(M2), where Rn = {x : Rnx}.

Proof. Let M be the standard model of first-order arithmetic (ω,+,×) so that

D(M) are the arithmetical sets ARITH. Choose a real Z ∉ ARITH, such as

∅(ω), and enumerate Z as z0, z1, z2, . . .. Choose an injection ∂ : ARITH → ω

such that ∂({n}) = zn, which we can do since Z is coinfinite (since it is not

arithmetical). Then by Proposition 33, the structure (M,D(M), D(M2), . . . , ∂)

is a model of ABL0. Consider now the diagonal R = {(n,m) : n = m} which is

clearly arithmetical and so exists in D(M2). Then Rn = {x : Rnx} = {n} and

∂(Rn) = ∂({n}) = zn. Then the set

{(n,m) : ∂(Rn) = m} = {(n,m) : zn = m} (3.62)

is equal to the graph of n ↦ zn, which is not arithmetical: for, if it were arithmetical,

then its range Z would be arithmetical, which contradicts the hypothesis

on Z.

3.3.3 Standard Models of the Hyperarithmetic Subsystems of BL2

Remark 57. Recall from Proposition 34 that ∆11 − BL0 proves the existence of the

graph of an injective function s : M →M such that s(x) = ∂({x}). This function

is mentioned in the following axiom.

Definition 58. The following sentence Inf is a sentence in the signature of BL2:

Inf ≡ ∃ s : M → M [∀ x s(x) = ∂({x})] &

∃ N [∂(∅) ∈ N & ∀ x x ∈ N → sx ∈ N ]

& ∀ N ′ [∂(∅) ∈ N ′ & ∀ x x ∈ N ′ → sx ∈ N ′] → N ⊆ N ′

& ∃ ⊕ : N2 → N ∃ ⊗ : N2 → N ∃ ⪯ ⊆ N2

[(N, ∂(∅), s, ⊕, ⊗, ⪯) |= (Q1)−(Q8)] (3.63)

Intuitively, Inf says that there is a smallest set N which contains the zero element

∂(∅) and which is closed under the successor function s(x) = ∂({x}) and

which has addition and multiplication functions ⊕ and ⊗ and an ordering relation

⪯ which satisfy the eight axioms of Robinson’s Q.
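To see what the set N looks like for one concrete injection, here is a small sketch. It assumes a hypothetical toy map `extension` that codes a finite set of naturals as a sum of powers of two; this is only a stand-in for ∂ on finite sets and is not the map constructed in Theorem 60.

```python
def extension(X):
    """Toy injection on finite sets of naturals: code X as the sum of 2**x.
    Assumption: stands in for the map called partial in the text."""
    return sum(2 ** x for x in X)

def successor(x):
    """s(x) = extension({x}), the successor function mentioned in Inf."""
    return extension({x})

def initial_segment_of_N(steps):
    """Start at the zero element extension(empty set) and iterate s."""
    chain = [extension(set())]
    for _ in range(steps):
        chain.append(successor(chain[-1]))
    return chain

chain = initial_segment_of_N(4)  # [0, 1, 2, 4, 16]
```

The elements of N are not the usual numerals, but the zero element and the successor map still generate a copy of the natural numbers, which is all that Inf asks for.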

Remark 59. The following theorem and its corollary are the main results of § 3.3.

Recall that the Russell paradox showed that BL0 and Π11 − BL0 are inconsistent (cf.

Proposition (10)). Recently Ferreira and Wehmeier ([40]) showed that ∆11 − BL0 is

consistent, using Barwise and Schlipf’s recursively-saturated model construction.

In § 3.4.1, we present a generalization of this construction (cf. Theorem 70), which

we apply to ∆11 − BL0 and ∆11 − HP0 (cf. Proposition 83, Corollary 99, Theorem 108,

and Remark 109). However, the recursively-saturated model construction does not

provide one with natural models, simply because most natural structures are not

recursively saturated (unless of course they are saturated tout court). Hence, this

raises the question of whether there are natural models of ∆11 − BL0. The following

theorem constructs a model of ∆11 − BL0 which is mutually interpretable with the

minimal ω-model of ∆11 − CA0, namely, the model whose second-order part consists

of the hyperarithmetic sets.

Theorem 60. For any real Y ∈ 2ω, there is a map ∂Y : HYP(Y ) → ω with $\Pi^{1,Y}_1$-graph

such that (i) the structure MY = (ω,HYP(Y ), ∂Y ) is a model of (a) Σ11 − LB0

and (b) the sentence Inf, and such that (ii) the structures MY = (ω,HYP(Y ), ∂Y )

and (ω, 0, S,+,×,≤,HYP(Y )) are mutually interpretable uniformly in Y , in the

following sense:

(a) The map ∂Y : HYP(Y ) → ω is definable in (ω,HYP(Y ), 0, s,+,×,≤) uni-

formly in Y .

(b) An isomorphic copy HY of the structure (ω,HYP(Y ), 0, s,+,×,≤) is defin-

able in the structure MY = (ω,HYP(Y ), ∂Y ) uniformly in Y .

Moreover, all these facts are provable in Π11 − CA0.

Proof. Define P (Y ⊕X,n) iff X ∈ HYP(Y ) and n = 〈a, e〉 is a hyperarithmetical-

in-Y index of X:

$$P(Y \oplus X, \langle a, e \rangle) \equiv X \in \mathrm{HYP}(Y) \;\&\; a \in \mathcal{O}^Y \;\&\; X = \{e\}^{H^Y_a} \qquad (3.64)$$

Since the relation X ∈ HYP(Y ) is Π11 and membership in $H^Y_a$ is $\Delta^{1,Y}_1$ for $a \in \mathcal{O}^Y$,

we have that P (Y ⊕ X, n) is a Π11-predicate. By Kondo uniformization (Theorem 49),

there is a Π11-uniformization P ′ of P . For Y ∈ 2ω, define ∂Y (X) = n if

and only if P ′(Y ⊕X,n). Since ∂Y (X) = n implies that n is a hyperarithmetical-

in-Y index of X, we have that ∂Y : HYP(Y )→ ω is an injection and hence MY =

(ω,HYP(Y ), ∂Y ) is a model of Basic Law V. Note that since ∂Y : HYP(Y )→ ω has

a $\Pi^{1,Y}_1$-graph, the Corollary to the Spector-Gandy Theorem (cf. Corollary 48) implies

that ∂Y : HYP(Y ) → ω is definable in the structure (ω,HYP(Y ), 0, S,+,×,≤),

and this establishes (ii)(a).
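The reason that selecting one index per set yields an injection can be seen in a finite toy case. The sketch below assumes a hypothetical finite list `enumeration` in place of the hyperarithmetical-in-Y indices, and it substitutes "least index" for the Π11-uniformization supplied by Kondo's theorem, so it illustrates only the injectivity step and not the definability claims.

```python
def least_index_map(enumeration):
    """Send each set to the first position at which it is listed.
    Assumption: 'least index' replaces the Kondo uniformization of the text."""
    table = {}
    for index, X in enumerate(enumeration):
        table.setdefault(X, index)  # keep only the first index seen for X
    return table

# A toy enumeration in which some sets are listed more than once.
enumeration = [
    frozenset({1}),
    frozenset(),
    frozenset({1}),
    frozenset({0, 2}),
    frozenset(),
]

index_of = least_index_map(enumeration)

# Each chosen number is an index of its set, so distinct sets get distinct numbers.
is_injective = len(set(index_of.values())) == len(index_of)
decodes_back = all(enumeration[n] == X for X, n in index_of.items())
```

Injectivity follows for the same reason as in the proof: the chosen number is an index of the set, and an index determines the set it indexes.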

To establish (i)(a), note that since ∂Y : HYP(Y ) → ω is an injection, it

follows that MY = (ω,HYP(Y ), ∂Y ) is a model of ABL0 (as in the proof of Proposition 33).

To see that it also models the Σ11-choice schema (3.18), suppose that

MY |= ∀ z ∃ X ϕ(z,X, ∂Y (X)), where ϕ is an arithmetical formula. (The

proof for the case where z is replaced by a tuple z, or where there are mul-

tiple existential set quantifiers and multiple existential relation quantifiers, or

where there are parameters from the model present in ϕ is exactly similar). Then

MY |= ∀ z ∃ X ∃ e [∂Y (X) = e ∧ ϕ(z,X, e)]. Define a relation Q(Y ⊕ {z}, X) as

follows:

Q(Y ⊕ {z}, X)⇐⇒ X ∈ HYP(Y ) & ∃ e [∂Y (X) = e ∧ ϕ(z,X, e)] (3.65)

Then Q is a Π11-predicate. By Kondo uniformization, there is a Π11-uniformization

Q′ of Q. For Y ∈ 2ω, define qY (z) = X if and only if Q′(Y ⊕ {z}, X) and let

RY = {(z, x) : ∃ X ∈ HYP(Y ) qY (z) = X ∧ x ∈ X} (3.66)

Then by Kleene’s Theorem on Restricted Quantification 44, RY is Π1,Y1 -definable.

Moreover, since Q′ is a uniformization, we also have

RY = {(z, x) : ∀ X ∈ HYP(Y ) qY (z) = X → x ∈ X} (3.67)

Again, by Kleene’s Theorem on Restricted Quantification (44), the set RY is

Σ1,Y1-definable. Hence RY is ∆1,Y1 and so RY ∈ HYP(Y ). Finally, since Q′ is a

uniformization, we have that MY |= ∀ z ϕ(z, (RY )z, ∂Y ((RY )z)), so in fact MY is

a model of Σ11 − LB0 and this establishes (i)(a).

To show (i)(b) and (ii)(b), we first prove (ii)(b) and then note how our proof of

(ii)(b) in fact establishes (i)(b). Recall that by Proposition 34, there is an injective

function sY : ω → ω whose graph is in HYP(Y ) such that sY (n) = ∂Y ({n}) for

all n ∈ ω. Define an sY -recursive function fY : ω → ω:

fY (0) = ∂Y (∅) & fY (n+ 1) = sY (fY (n)) (3.68)

Let NY be the range of fY , so that both the graph of fY and its range NY are

in HYP(Y ). Since NY = rng(fY ) and dom(fY ) = ω, the following induction

principle holds:

∀ P [fY (0) ∈ P & ∀ n ∈ ω fY (n) ∈ P → fY (n+ 1) ∈ P ]→ NY ⊆ P (3.69)

Using this form of induction, one can show that fY : ω → NY is injective, so that

its inverse f−1Y : NY → ω is likewise in HYP(Y ). Further, one can arithmetically

define from NY , fY and f−1Y the functions ⊕Y : N2Y → NY and ⊗Y : N2Y → NY as

follows:

$$x \oplus_Y y = f_Y(f_Y^{-1}(x) + f_Y^{-1}(y)), \qquad x \otimes_Y y = f_Y(f_Y^{-1}(x) \cdot f_Y^{-1}(y)) \qquad (3.70)$$

and then arithmetically define a relation ⪯Y on N2Y by

x ⪯Y y ⇐⇒ ∃ z ∈ NY x ⊕Y z = y (3.71)

Further one can extend the map to f̄Y : HYP(Y ) → (P (NY ) ∩ HYP(Y )) by setting

f̄Y (X) = {fY (n) : n ∈ X} (3.72)

and define the following structure in the signature of (ω, 0, S,+,×,≤,HYP(Y )):

HY = (NY , ∂Y (∅), sY , ⊕Y , ⊗Y , ⪯Y , f̄Y (HYP(Y ))) (3.73)

Then the function fY and its extension f̄Y to sets witness that the two structures

(ω, 0, S,+,×,≤,HYP(Y )) and HY are isomorphic.

Further, note that HY is definable within MY : for, by the induction princi-

ple (3.69) one can show that NY is the unique smallest set containing ∂Y (∅) and

closed under sY , and using equation (3.70) and the induction principle (3.69) one

can show that ⊕Y and ⊗Y are the unique functions on NY satisfying the following

recursion clauses

x⊕Y ∂Y (∅) = x x⊕Y (sY (z)) = sY (x⊕Y z) (3.74)

x⊗Y ∂Y (∅) = ∂Y (∅) x⊗Y (sY (z)) = (x⊗Y z)⊕Y x (3.75)

Hence, since HY and (ω,HYP(Y ), 0, s,+,×,≤) are isomorphic and since HY is

definable in MY , we have established (ii)(b). Finally, note by construction that

the structure HY witnesses that MY is a model of the axiom Inf, so that we have

established (i)(b).
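The transport of structure in this part of the proof can be checked mechanically on a small fragment. The sketch below assumes a hypothetical toy injection `extension` on finite sets (coding a set as a sum of powers of two) in place of ∂Y, defines f by the recursion (3.68), transports addition and multiplication along f, and then checks the recursion clauses (3.74)-(3.75) on a few elements. The ranges are kept small because the toy successor grows very quickly.

```python
def extension(X):
    """Toy injection on finite sets of naturals (assumption: stands in for the map of the text)."""
    return sum(2 ** x for x in X)

def s(x):
    """Successor on the copy: s(x) = extension({x})."""
    return extension({x})

def f(n):
    """f(0) = extension(empty set), f(n + 1) = s(f(n)), as in (3.68)."""
    value = extension(set())
    for _ in range(n):
        value = s(value)
    return value

# Inverse of f on the first few elements of its range.
f_inv = {f(n): n for n in range(6)}

def oplus(x, y):
    """Addition transported along f."""
    return f(f_inv[x] + f_inv[y])

def otimes(x, y):
    """Multiplication transported along f."""
    return f(f_inv[x] * f_inv[y])

zero = f(0)

# Check the recursion clauses (3.74)-(3.75) on a small sample.
clauses_hold = all(
    oplus(x, zero) == x
    and oplus(x, s(z)) == s(oplus(x, z))
    and otimes(x, zero) == zero
    and otimes(x, s(z)) == oplus(otimes(x, z), x)
    for x in (f(n) for n in range(3))
    for z in (f(m) for m in range(2))
)
```

The check succeeds for the same reason as in the proof: f commutes with successor by construction, so the ordinary recursion equations for + and × are carried over to the copy.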

Corollary 61. Σ11 − AC0 ≤I Σ11 − LB0 + Inf <I Π11 − CA0.

Proof. Note that Σ11 − AC0 ≤I Σ11 − LB0 + Inf because the sentence Inf (cf. Defini-

tion 58) literally provides an interpretation. To see that Σ11 − LB0+Inf <I Π11 − CA0,

note that since the previous theorem can be proven in Π11 − CA0, it follows that

Π11 − CA0 proves the consistency of Σ11 − LB0 + Inf. The construction above also

provides an interpretation of Σ11 − LB0 + Inf in Π11 − CA0, so that the result follows

from Proposition 13.

3.4 Barwise-Schlipf Models of Subsystems of BL2 and HP2

In this section, we turn to building models of subsystems of BL2 and HP2 on

top of various recursively saturated fields. In particular, § 3.4.1 is devoted to

the statement and proof of a generalization of a theorem of Barwise-Schlipf and

Ferreira-Wehmeier (Theorem 70). Then in §§ 3.4.2-3.4.4 three applications of this

theorem are presented. The major result here is Corollary 99, which says that

Σ11 − PH0 <I ACA0, and this fills in a key piece of Figure 3.2 about the inter-

pretability relation.

3.4.1 Generalized Barwise-Schlipf/Ferreira-Wehmeier Theorem

The main theorem of this section (Theorem 70) is a generalization of the way in

which Barwise-Schlipf ([6]) built models of ∆11 − CA0 on top of recursively saturated

models of Peano arithmetic, and the way in which Ferreira-Wehmeier ([40]) built

models of ∆11 − BL0 on top of recursively saturated structures. The new addition

is the concept of a uniformly definable function ∂ : D(M) → M (Definition 62).

Subsequent to defining this notion, the definitions of definable skolem functions

and recursively saturated structures are recalled, and then Theorem 70 is stated

and proven.

Definition 62. Suppose that M is an L-structure and let D(Mn) be the definable

subsets of Mn. Then ∂ : D(M) → M is uniformly definable if for every L-formula

θ(x, y) with all free variables displayed and with a non-empty set y of parameter

variables, there is an L-formula θ′(x, y) with the same free variables, such that

{∂(θ(·, a))} = {x : M |= θ′(x, a)} for all a ∈M .

Definition 63. Suppose that L is countable and that M is an L-structure and

that B ∈ 2ω. Then ∂ : D(M) → M is B-computably uniformly definable if it is

uniformly definable and the map θ ↦ θ′ is B-computable.

Definition 64. Suppose that M is an L-structure. Then M has definable skolem

functions if for every definable set P ⊆Mm+n there is a definable set P ′ ⊆Mm+n

such that

M |= ∀x, y [P ′xy → Pxy] (3.76)

M |= ∀ x [∃ y Pxy]→ [∃! y P ′xy] (3.77)

Remark 65. Note that in this definition, the parameters used to define P ′ may

exceed those used to define P . Note also the obvious similarity between definable

skolem functions and the uniformization results, such as Kondo’s Uniformization

Theorem 49, which we employed in Theorem 60. In particular, equations (3.76)-

(3.77) are nearly identical to equations (3.59)-(3.60).
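The two conditions (3.76)-(3.77) can be made concrete for a finite relation. The following sketch is an illustration only: it assumes the second coordinate is linearly ordered, so that "least y" is available as a selection rule, and the name `uniformize` is hypothetical. The availability of such a definable selection rule is exactly what can fail in a given structure.

```python
def uniformize(P):
    """Return P' contained in P that keeps, for each x, only the least y with (x, y) in P.
    Assumption: the y-values are linearly ordered, so 'least' makes sense."""
    chosen = {}
    for x, y in P:
        if x not in chosen or y < chosen[x]:
            chosen[x] = y
    return {(x, y) for x, y in chosen.items()}

P = {(0, 3), (0, 1), (2, 5), (2, 2), (2, 9)}
P_prime = uniformize(P)  # {(0, 1), (2, 2)}

# Condition (3.76): P'xy implies Pxy.
refines = P_prime <= P

# Condition (3.77): every x with some y in P has exactly one y in P'.
domain = {x for x, _ in P}
unique_witness = all(
    len([y for (x2, y) in P_prime if x2 == x]) == 1 for x in domain
)
```

In a structure without a definable ordering, such as an algebraically closed field, no analogue of the "least y" rule need exist, which is the phenomenon behind Proposition 76.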

Definition 66. Suppose that M is an L-structure and A ⊆ M . A set of A-formulas

p(v) in finitely many variables v is realized in M if there is a b in M

such that M |= θ(b) for every A-formula θ(v) in p(v). A set of A-formulas p(v)

is finitely realized in M if every finite subset p0(v) of p(v) is realized in M . The

structure M is saturated if for every A ⊆ M with |A| < |M | and every set of

A-formulas p(v), if p(v) is finitely realized in M then p(v) is realized in M .

Definition 67. Suppose that L and M are countable and B ∈ 2ω. Then M is

B-recursively saturated if for every finite A ⊆ M and every B-computable set of

A-formulas p(v), if p(v) is finitely realized in M then p(v) is realized in M .

Remark 68. The following proposition records the very elementary observation

that saturated structures (resp. B-recursively saturated structures) have a kind

of compactness property, in that each covering of Mn by definable sets has a finite

sub-covering (resp. each B-recursive covering of Mn by definable sets has a finite

sub-covering).

Proposition 69. Suppose that M is a saturated L-structure (resp. B-recursively

saturated L-structure) and that A ⊆ M with |A| < |M |. Further, suppose that

{θi(v)}i∈I is a set of A-formulas (resp. B-computable set of A-formulas). Then

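$$[M \models \forall v \bigvee_{i \in I} \theta_i(v)] \Longrightarrow [\exists \text{ finite } I_0 \subseteq I \;\; M \models \forall v \bigvee_{i \in I_0} \theta_i(v)] \qquad (3.78)$$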

Proof. The contrapositive of equation (3.78) says that if the set of A-formulas

p(v) = {¬θi(v) : i ∈ I} is finitely realized, then it is realized.
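For readers who want the contrapositive written out, the following display is one way to unfold it. It is a sketch in the notation of Definitions 66 and 67, where the infinitary conjunction on the right is only shorthand for "some tuple realizes every formula in p(v)" and is not itself a formula of L.

```latex
\[
\Bigl[\,\forall \text{ finite } I_0 \subseteq I \;\;
  M \models \exists v \bigwedge_{i \in I_0} \neg\theta_i(v)\Bigr]
\;\Longrightarrow\;
\Bigl[\,\exists b \in M \;\; \forall i \in I \;\;
  M \models \neg\theta_i(b)\Bigr]
\]
```

The left-hand side says that p(v) = {¬θi(v) : i ∈ I} is finitely realized and the right-hand side says that it is realized, so the implication is an instance of saturation (resp. B-recursive saturation, using that the set of formulas is B-computable).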

Theorem 70. Suppose that M is an L-structure and ∂ : D(M) → M such that

the structure N = (M,D(M), D(M2), . . . , ∂) models ABL0 (resp. AHP0). Suppose

that B ∈ 2ω. Then

(i) If ∂ : D(M) → M is uniformly definable and M is saturated, then the

structure N models ∆11 − BL0 (resp. ∆11 − HP0).

(ii) If ∂ : D(M) → M is uniformly definable and M is saturated, then the

structure N models Σ11 − LB0 (resp. Σ11 − PH0) if and only if M has definable

skolem functions.

(iii) If ∂ : D(M) → M is B-computably uniformly definable and M is B-

recursively saturated, then the structureN models ∆11 − BL0 (resp. ∆11 − HP0).

(iv) If ∂ : D(M) → M is B-computably uniformly definable and M is B-

recursively saturated, then the structure N models Σ11 − LB0 (resp. Σ11 − PH0)

if and only if M has definable skolem functions.

Proof. In all four parts of this proof, the proof is identical between Basic Law V

and Hume’s Principle, and so we only include the proofs for the case of Basic

Law V. Further, the proofs of (i) and (iii) are parallel and the proofs of (ii)

and (iv) are parallel, and so we present the proofs of (i) and (iii) simultaneously

and the proofs of (ii) and (iv) simultaneously. For (i) and (iii), suppose that

∂ : D(M)→ M is uniformly definable (resp. B-computably uniformly definable)

and M is saturated (resp. B-recursively saturated). To see that N is a model

of ∆11 − BL0, suppose that there is a subset Z of Mn which is defined on N by a

Σ11-formula ϕ(z) and by a Π11-formula ψ(z). Let us suppose that ϕ(z) and ψ(z)

use exactly one set parameter A ∈ D(M) where

A = {w ∈M : M |= ρ(w, a)} (3.79)

and where ρ(w, v) is an ∅-formula with a ∈ M , since the proof in the case where

there are multiple parameters, with some being objects, some sets, and some

binary relations etc., is exactly identical. Further, let us suppose that ϕ(z) ≡

∃ X ϕ0(z,X, ∂(X), A) and that ψ(z) ≡ ∀ X ψ0(z,X, ∂(X), A), since the proof in

the case where there are multiple existential (resp. universal) set-quantifiers or

relation-quantifiers in ϕ(z) (resp. ψ(z)) is exactly identical. Then

z ∈ Z ⇐⇒ N |= ∃ X ϕ0(z,X, ∂(X), A)⇐⇒ N |= ∀ X ψ0(z,X, ∂(X), A) (3.80)

Then

N |= ∀ z ∃ X ϕ0(z,X, ∂(X), A) ∨ ¬ψ0(z,X, ∂(X), A) (3.81)

Let us abbreviate

ξ0(z,X, ∂(X), A) ≡ ϕ0(z,X, ∂(X), A) ∨ ¬ψ0(z,X, ∂(X), A) (3.82)

so that equation (3.81) becomes

N |= ∀ z ∃ X ξ0(z,X, ∂(X), A) (3.83)

Then this translates into M as

$$M \models \forall z \bigvee_{\theta(x,y)} \exists b \; \xi_0(z, \theta(\cdot, b), \partial(\theta(\cdot, b)), \rho(\cdot, a)) \qquad (3.84)$$

where θ(x, y) ranges over ∅-formulas with non-empty set of parameter variables

y. Since the map ∂ : D(M) → M is uniformly definable (resp. B-computably

uniformly definable) via the map θ ↦ θ′, we have

$$M \models \forall z \bigvee_{\theta(x,y)} \exists b \, \exists c \; (\theta'(c, b) \;\&\; \xi_0(z, \theta(\cdot, b), c, \rho(\cdot, a))) \qquad (3.85)$$

Since M is saturated (resp. B-recursively saturated), an application of Proposi-

tion 69 implies that there is K > 0 and there are ∅-formulas θ1(x, y), . . . , θK(x, y)

such that

$$M \models \forall z \bigvee_{i=1}^{K} \exists b \, \exists c \; (\theta'_i(c, b) \;\&\; \xi_0(z, \theta_i(\cdot, b), c, \rho(\cdot, a))) \qquad (3.86)$$

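Then by definition of ξ0 (cf. equation (3.82)), we have:

$$M \models \forall z \bigvee_{i=1}^{K} \exists b \, \exists c \; (\theta'_i(c, b) \;\&\; (\varphi_0(z, \theta_i(\cdot, b), c, \rho(\cdot, a)) \vee \neg\psi_0(z, \theta_i(\cdot, b), c, \rho(\cdot, a)))) \qquad (3.87)$$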

It follows from equation (3.80) that

$$Z = \{z \in M^n : M \models \bigvee_{i=1}^{K} \exists b \, \exists c \; (\theta'_i(c, b) \;\&\; \varphi_0(z, \theta_i(\cdot, b), c, \rho(\cdot, a)))\} \qquad (3.88)$$

Hence Z ∈ D(Mn) and so N satisfies ∆11 − BL0. Hence, this completes the proof

of parts (i) and (iii).

We turn to the proofs of parts (ii) and (iv). First, we handle the proof of the

right-to-left direction, which is quite similar to the proof from the above para-

graph. Suppose that ∂ : D(M)→M is uniformly definable (resp. B-computably

uniformly definable) and M is saturated (resp. B-recursively saturated) and has

definable skolem functions. To see that N is a model of Σ11 − LB0, suppose that

N |= ∀ z ∃ X ξ0(z,X, ∂(X), A) (3.89)

where ξ0 is arithmetical and where A ∈ D(M) is a set parameter with

A = {w ∈M : M |= ρ(w, a)} (3.90)

and where ρ(w, v) is an ∅-formula with a ∈ M . (As in the proof in the previous

paragraph, the case of multiple parameters or multiple set or relation quantifiers

is exactly similar). Then equation (3.89) translates into M as

$$M \models \forall z \bigvee_{\theta(x,y)} \exists b \; \xi_0(z, \theta(\cdot, b), \partial(\theta(\cdot, b)), \rho(\cdot, a)) \qquad (3.91)$$

where θ(x, y) ranges over ∅-formulas with non-empty set of parameter variables

y. Since ∂ : D(M) → M is uniformly definable (resp. B-computably uniformly

definable) via the map θ ↦ θ′, we have

$$M \models \forall z \bigvee_{\theta(x,y)} \exists b \, \exists c \; (\theta'(c, b) \;\&\; \xi_0(z, \theta(\cdot, b), c, \rho(\cdot, a))) \qquad (3.92)$$

Since M is saturated (resp. B-recursively saturated), an application of Proposi-

tion 69 implies that there is K > 0 and there are ∅-formulas θ1(x, y), . . . , θK(x, y)

such that

$$M \models \forall z \bigvee_{i=1}^{K} \exists b \, \exists c \; (\theta'_i(c, b) \;\&\; \xi_0(z, \theta_i(\cdot, b), c, \rho(\cdot, a))) \qquad (3.93)$$

Then by adding dummy variables if need be, we can move the disjunction to the

right as follows:

$$M \models \forall z \, \exists b \, \exists c \bigvee_{i=1}^{K} (\theta'_i(c, b) \;\&\; \xi_0(z, \theta_i(\cdot, b), c, \rho(\cdot, a))) \qquad (3.94)$$

and one can take the first such i as follows:

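$$M \models \forall z \, \exists b \, \exists c \bigvee_{i=1}^{K} \Bigl[(\theta'_i(c, b) \;\&\; \xi_0(z, \theta_i(\cdot, b), c, \rho(\cdot, a))) \;\&\; \bigwedge_{j<i} \neg(\theta'_j(c, b) \;\&\; \xi_0(z, \theta_j(\cdot, b), c, \rho(\cdot, a)))\Bigr] \qquad (3.95)$$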

Then since M has definable skolem functions, there is a possibly larger finite set

of parameters a′ ⊇ a and a′-definable functions f, g such that

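$$M \models \forall z \bigvee_{i=1}^{K} \Bigl[(\theta'_i(g(z), f(z)) \;\&\; \xi_0(z, \theta_i(\cdot, f(z)), g(z), \rho(\cdot, a))) \;\&\; \bigwedge_{j<i} \neg(\theta'_j(g(z), f(z)) \;\&\; \xi_0(z, \theta_j(\cdot, f(z)), g(z), \rho(\cdot, a)))\Bigr] \qquad (3.96)$$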

Then there is a partition of Mn into the a′-definable sets P1, . . . , PK which are

defined as follows:

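$$P_i = \Bigl\{z \in M^n : M \models \Bigl[(\theta'_i(g(z), f(z)) \;\&\; \xi_0(z, \theta_i(\cdot, f(z)), g(z), \rho(\cdot, a))) \;\&\; \bigwedge_{j<i} \neg(\theta'_j(g(z), f(z)) \;\&\; \xi_0(z, \theta_j(\cdot, f(z)), g(z), \rho(\cdot, a)))\Bigr]\Bigr\} \qquad (3.97)$$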

Then define the a′-definable relation

$$R = \{(z, w) : \bigvee_{i=1}^{K} [z \in P_i \;\&\; \theta_i(w, f(z))]\} \qquad (3.98)$$

so that

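$$z \in P_i \Longrightarrow R_z = \{w \in M : (z, w) \in R\} = \{w \in M : M \models \theta_i(w, f(z))\} = \theta_i(\cdot, f(z)) \qquad (3.99)$$

$$z \in P_i \Longrightarrow \{\partial(R_z)\} = \{\partial(\theta_i(\cdot, f(z)))\} = \{c \in M : M \models \theta'_i(c, f(z))\} = \{g(z)\} \qquad (3.100)$$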

z ∈ Pi =⇒ ∂(Rz) = g(z) (3.101)

Putting these things together and glancing back at the definition of Pi in equation (3.97), we have

z ∈ Pi =⇒M |= (θ′i(g(z), f(z)) & ξ0(z, θi(·, f(z)), g(z), ρ(·, a)))

=⇒ N |= ξ0(z,Rz, ∂(Rz), A) (3.102)

Since the sets P1, . . . , PK partition Mn we have

N |= ∀ z ξ0(z,Rz, ∂(Rz), A) (3.103)

and this implies that N models Σ11 − LB0. Hence we have established the right-to-left

direction of (ii) and (iv).
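The step from (3.94) to (3.97), in which each z is assigned to the first disjunct that works for it, is an instance of a general bookkeeping device that can be sketched as follows. The function name and the sample predicates are hypothetical; in the proof the i-th predicate is the i-th disjunct evaluated at the witnesses g(z) and f(z).

```python
def first_witness_partition(points, predicates):
    """Assign each point to the least i whose predicate holds at it.
    Assumption: every point satisfies at least one predicate, as (3.96) guarantees."""
    parts = [set() for _ in predicates]
    for z in points:
        for i, holds in enumerate(predicates):
            if holds(z):
                parts[i].add(z)  # the first i that works, so earlier ones failed
                break
        else:
            raise ValueError(f"no predicate holds at {z!r}")
    return parts

# Toy disjuncts on the points 0..9; the last one always holds.
predicates = [
    lambda z: z % 2 == 0,
    lambda z: z % 3 == 0,
    lambda z: True,
]

parts = first_witness_partition(range(10), predicates)
```

Because each point lands in exactly one cell, the cells are pairwise disjoint and cover the whole domain, which is what allows the relation R to be defined by cases over the sets Pi.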

We want to establish the left-to-right direction of (ii) and (iv). Suppose that

∂ : D(M)→ M is uniformly definable (resp. B-computably uniformly definable)

and M is saturated (resp. B-recursively saturated) and that N models Σ11 − LB0.

Suppose that P ⊆ Mm+n is definable, perhaps with a finite set a of parameters

from M . Note that for every x ∈ Mm with a tuple y ∈ Mn such that Pxy, we

can arbitrarily choose one such y ∈ Mn and form the y-definable singleton {y}.

This implies that

N |= ∀ x ∃ R [∃ y Pxy]→ [(∃! y Ry) & (∀ y Ry → Pxy)] (3.104)

Since N |= Σ11 − LB0, one then has

N |= ∃ P ′ ∀ x [∃ y Pxy]→ [(∃! y P ′xy) & (∀ y P ′xy → Pxy)] (3.105)

Since P ′xy if and only if P ′xy, this implies that

N |= ∃ P ′ ∀ x [∃ y Pxy]→ [(∃! y P ′xy) & (∀ y P ′xy → Pxy)] (3.106)

Finally, let P ′′ = P ′ ∩ P . Then

M |= ∀x, y [P ′′xy → Pxy] (3.107)

M |= ∀ x [∃ y Pxy]→ [∃! y P ′′xy] (3.108)

Hence, M has definable skolem functions.

3.4.2 Application to Algebraically Closed Fields

Remark 71. In this section, we apply Theorem 70 to construct models of ∆11 − HP0

on top of certain algebraically closed fields (cf. Theorem 77). The primary applica-

tion of this construction is to answer a question posed by Linnebo (cf. Remark 81

and Theorem 83). Prior to doing this, we recall Ax’s Theorem and note one

elementary consequence of this theorem.

Theorem 72. (Ax’s Theorem) Suppose that k is an algebraically closed field and

f : k → k is a definable injective function. Then f is surjective.

Proof. See Ax [4] Theorem C pp. 241, 270 or Poizat [123] Lemma 4.3 pp. 70-71,

in which is proved the stronger result wherein k is replaced by a definable subset

of kn.

Proposition 73. Suppose that k is an algebraically closed field and that X, Y ⊆ k

are definable. Then the following are equivalent:

(i) There is a definable bijection f : X → Y

(ii) Either both X and Y are finite and of the same cardinality, or both X and

Y are cofinite and k \X and k \ Y are of the same cardinality.

Proof. Suppose that (i) holds. Then by strong minimality and the fact that an

infinite set cannot be bijective with a finite set, either both X and Y are finite or

both X and Y are cofinite. If X and Y are both finite then the fact that there is a

definable bijection between them implies that X and Y have the same cardinality.

If X and Y are both cofinite but k \X and k \ Y are not of the same cardinality,

then without loss of generality k \ X = {a1, . . . , am} and k \ Y = {b1, . . . , bn}

where m < n. Then define a function f̄ : k → k by f̄ ↾ X = f and f̄(ai) = bi for

i ≤ m. Then f̄ : k → k is an injection that is not a surjection, since bn is not

in the range of f̄ . This contradicts Ax’s Theorem 72. So, in fact, k \X and k \ Y

are of the same cardinality. Then (ii) holds.

Conversely, suppose that (ii) holds. If both X and Y are finite of the same

cardinality, then simply enumerate the elements of X and Y and use these elements

as parameters to define a bijection f : X → Y . If X and Y are both cofinite and

k \ X and k \ Y are of the same finite cardinality, then enumerate k \ X =

{y1, . . . , yn} and k \ Y = {x1, . . . , xn}. By renumbering, we can assume without

loss of generality that (k\X)∩(k\Y ) = {x1, . . . , xm} = {y1, . . . , ym} where m ≤ n

and x1 = y1, . . . , xm = ym. If m = n then this implies that (k \X) = (k \ Y ) and

X = Y , and we can choose the definable bijection f : X → Y to be the identity

map. If m < n, then note that {xm+1, . . . , xn} ⊆ X and {ym+1, . . . , yn} ⊆ Y and

X \ {xm+1, . . . , xn} ⊆ Y and Y \ {ym+1, . . . , yn} ⊆ X. Then we can choose the

definable bijection f : X → Y which is given by the identity on X \{xm+1, . . . , xn}

and by f(xi) = yi on {xm+1, . . . , xn}.
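The bijection built in the cofinite case moves only finitely many points, and this can be sketched by representing a cofinite set through its finite complement. The following is an illustration under stated assumptions: the function name is hypothetical, the elements are taken to be integers so that they can be sorted, and the final check is carried out inside a finite window rather than in a field.

```python
def cofinite_bijection(comp_X, comp_Y):
    """Given the finite complements of X and Y, of equal size, return a dict `moves`
    such that x -> moves.get(x, x) maps X bijectively onto Y."""
    if len(comp_X) != len(comp_Y):
        raise ValueError("complements must have the same cardinality")
    # Points of X that are missing from Y (the x_{m+1}, ..., x_n of the proof).
    in_X_not_Y = sorted(comp_Y - comp_X)
    # Points of Y that are missing from X (the y_{m+1}, ..., y_n of the proof).
    in_Y_not_X = sorted(comp_X - comp_Y)
    return dict(zip(in_X_not_Y, in_Y_not_X))

comp_X = {1, 2, 3}   # X is everything except 1, 2, 3
comp_Y = {3, 4, 5}   # Y is everything except 3, 4, 5
moves = cofinite_bijection(comp_X, comp_Y)  # {4: 1, 5: 2}

# Sanity check inside the finite window {0, ..., 9}.
window = set(range(10))
X = window - comp_X
Y = window - comp_Y
image = {moves.get(x, x) for x in X}
```

The map is the identity away from the finitely many exceptional points, so it is definable with those points as parameters, as the proof requires.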

Definition 74. A structure k is strongly minimal if every definable X ⊆ k is finite

or cofinite.

Proposition 75. Every algebraically closed field is strongly minimal.

Proof. See Marker [108] p. 5.

Proposition 76. Algebraically closed fields do not have definable skolem func-

tions.

Proof. Let ϕ(x, y) ≡ x = y2. Then k |= ∀ x ∃ y x = y2. If k has definable skolem

functions or parametrically definable skolem functions, then there is a definable

function f : k → k such that k |= ∀ x x = (f(x))2. Then rng(f) is a definable

set which includes exactly one square root for each x ∈ k. Then rng(f) is infinite

and coinfinite, which contradicts strong minimality.

Theorem 77. Suppose that k is a saturated algebraically closed field of charac-

teristic zero. Then there is a uniformly definable function # : D(k)→ k such that

(k,D(k), D(k2), . . . ,#) is a model of ∆11 − HP0 + ¬Σ11 − PH0 + ¬Π11 − HP0. Further,

there is no function ∂ : D(k)→ k such that (k,D(k), D(k2), . . . , ∂) is a model of

∆11 − BL0.

Proof. Since k is a field of characteristic zero, the prime field of k is Q and

the integers Z are hence embedded into k via Q. Using this embedding, de-

fine # : D(k) → k by #X = |X| if X is finite and #X = −(|k \X| + 1) if

X is cofinite. Then by Proposition 73, the structure (k,D(k), D(k2), . . . ,#) is a

model of Hume’s Principle. To apply Theorem 70 (i)-(ii), we need to show that

# : D(k) → k is uniformly definable. Suppose that θ(x, y) is an ∅-formula with

non-empty set y of parameter variables. Then by strong minimality, for any a we

have that θ(·, a) is finite or ¬θ(·, a) is finite. Then

$$k \models \forall a \bigvee_{N \geq 0} [\,|\theta(\cdot, a)| \leq N \vee |\neg\theta(\cdot, a)| \leq N\,] \qquad (3.109)$$

Since k is saturated, by Proposition 69, there is an integer Nθ > 0 such that

$$k \models \forall a \bigvee_{i=0}^{N_\theta} [\,|\theta(\cdot, a)| \leq i \vee |\neg\theta(\cdot, a)| \leq i\,] \qquad (3.110)$$

Then for each such formula θ(x, y) we define the following ∅-formula θ′(x, y) as

follows:

$$\theta'(x, y) \equiv \bigvee_{i=0}^{N_\theta} [\,|\theta(\cdot, y)| = i \;\&\; x = i\,] \vee [\,|\neg\theta(\cdot, y)| = i \;\&\; x = -(i + 1)\,] \qquad (3.111)$$

Hence, by definition, we have that for any a

{#(θ(·, a))} = {c : k |= θ′(c, a)} (3.112)

The map # : D(k) → k is uniformly definable. Hence, by Theorem 70 (i)-(ii)

and Proposition 76, we have that (k,D(k), D(k2), . . . ,#) is a model of ∆11 − HP0 +

¬Σ11 − PH0. Further, since the set rng(#) = Z is definable by a Σ11-formula in

the structure (k,D(k), D(k2), . . . ,#) but is not definable in k since k is strongly

minimal, we have that (k,D(k), D(k2), . . . ,#) is a model of ¬Π11 − HP0.

Now let us note why there is no function ∂ : D(k)→ k such that the structure

(k,D(k), D(k2), . . . , ∂) is a model of ∆11 − BL0. If there was such a function, then

by Corollary 38 it would follow that there was an injective non-surjective function

s : k → k whose graph is in D(k2), which would contradict Ax’s Theorem (72).

Remark 78. If we knew that all the parts of the proof of the above theorem were

formalizable in ACA0, then we could infer from the proof of the above theorem

and Proposition 13 that ∆11 − HP0 <I ACA0. It is clear from the proof that this

comes down to determining whether or not Ax’s Theorem 72 is provable in ACA0.

However, note that in the next subsection, we will prove Corollary 99, which

assures us that ∆11 − HP0 <I ACA0.

Remark 79. In conjunction with Corollary 38, the following corollary shows that

there is a stark contrast between ∆11 − HP0 and ∆11 − BL0 on the score of whether

they require the existence of injective non-surjective functions.

Corollary 80. There is a model (M,S1, S2, . . . ,#) of ∆11 − HP0 such that there is

no injective non-surjective function s : M →M such that graph(s) is in S2.

Proof. This follows immediately from the construction in Theorem 77 and Ax’s

Theorem 72.

Remark 81. Linnebo presented a description of properties that models of AHP0

and ∆11 − HP0 must have if they fail to model a certain sort of successor axiom

([100] pp. 164-165), and he additionally showed that there was a model of AHP0

which did not model this successor axiom ([100] Theorem 2 p. 164). Linnebo then

remarked that it was unknown whether there was a model of ∆11 − HP0 that did not

model the successor axiom (cf. [100] Remark 6 p. 168). Subsequent to defining

this successor axiom, we now show that the model from the previous theorem

does not model this axiom. We also explain why certain properties identified by

Linnebo hold in this model.

Definition 82. The following are formulas in the language of HP2 (cf. Lin-

nebo [100] pp. 158-160):

(i) P (n,m)⇐⇒ ∃ X, Y #X = n & #Y = m & ∃ y ∈ Y X = Y \ {y}

(ii) F is hereditary if Fn and P (n,m) implies Fm

(iii) F is closed if P (#∅,m) implies Fm

(iv) n is a pseudo-number if n = #∅ or n is contained in all hereditary, closed F .

(v) The successor axiom (SA) says that for any pseudo-number n, there is m

such that P (n,m).

Proposition 83. Suppose that k is a saturated algebraically closed field of char-

acteristic zero. Suppose that # : D(k)→ k by #X = |X| if X is finite and #X =

−(|k \X|+ 1) if X is cofinite. Then (k,D(k), D(k2), . . . ,#) |= ∆11 − HP0 + ¬SA.

Proof. Before we begin, it is perhaps helpful to informally state the definition of

# given above and describe how it interacts with the predicate P (n,m). If X is a

finite set with n elements, then #X = n, and if X is a cofinite set with n elements

in its complement, then #X = −(n + 1). So, for example, the set X = {√2, −1}

has #X = 2, and the set X = {a ∈ k : k |= a² + 1 ≠ 0} has #X = −(2 + 1) = −3,

and the set X = k has #X = −1, and the set X = ∅ has #X = 0. Further,

if X is finite, then by choosing an element y ∉ X, we have P (#X,#(X ∪ {y})).

For example, if X is finite and has n elements and y ∉ X, we have that #X = n

and #(X ∪ {y}) = n + 1, so that P (n, n + 1). Conversely, if X is cofinite and

has n > 0 elements in its complement and y ∉ X, then we have that X ∪ {y}

has n − 1 elements in its complement, so that #X = −(n + 1) = −n − 1 and

#(X∪{y}) = −((n−1)+1) = −n and hence so that P (−n−1,−n). For example,

we have P (0, 1), P (1, 2), P (2, 3), . . . and . . . , P (−4,−3), P (−3,−2), P (−2,−1).
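The informal description above can be checked on the examples from the text with a short sketch. It assumes a hypothetical representation in which a finite set is stored as itself and a cofinite set is stored through its finite complement, with field elements replaced by string labels; none of these names come from the source.

```python
def number_of(kind, elements):
    """#X = |X| for finite X, and #X = -(|complement of X| + 1) for cofinite X.
    For kind == 'cofinite', `elements` is the finite complement of X."""
    if kind == "finite":
        return len(elements)
    if kind == "cofinite":
        return -(len(elements) + 1)
    raise ValueError(kind)

def add_point(kind, elements, y):
    """Return X together with one new point y (y must not already be in X)."""
    if kind == "finite":
        if y in elements:
            raise ValueError("y is already in X")
        return kind, elements | {y}
    if y not in elements:
        raise ValueError("y is already in X")
    return kind, elements - {y}  # adding y to X removes it from the complement

def predecessor_pair(kind, elements, y):
    """Return (#X, #(X with y added)), a pair (n, m) for which P(n, m) holds."""
    bigger_kind, bigger_elements = add_point(kind, elements, y)
    return number_of(kind, elements), number_of(bigger_kind, bigger_elements)

two_points = number_of("finite", frozenset({"sqrt2", "-1"}))   # 2
no_roots = number_of("cofinite", frozenset({"i", "-i"}))       # -3
whole_field = number_of("cofinite", frozenset())               # -1
empty_set = number_of("finite", frozenset())                   # 0
```

Adding a point raises the value by one in both cases, which gives the two chains P(0, 1), P(1, 2), . . . and . . . , P(−3, −2), P(−2, −1) listed above; the value −1 of the whole field has no successor because there is no point left to add.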

Now we begin the proof. In particular, we want to begin by describing what

the hereditary, closed sets F ∈ D(k) look like. So suppose that F ∈ D(k) is

hereditary and closed. First we claim that N \ {0} ⊆ F . For, by the definition


of P (n,m) and #, we have that F ’s being closed implies that P (0, 1) and hence

1 ∈ F . So suppose that n ∈ (N \ {0}) ∩ F . Then by the definition of P (n,m)

and #, we have that F ’s being hereditary implies that P (n, n + 1) and hence

n+ 1 ∈ F . By induction, we have that if F ∈ D(k) is hereditary and closed then

N \ {0} ⊆ F .

We want to claim that {n ∈ Z : n ≠ 0} ⊆ F . Suppose not. That is, suppose

that there are some negative integers that are not in F . Then, since F ∈ D(k) is

infinite, strong minimality implies that F is co-finite. So there are at most finitely

many negative integers that are not in F . Suppose that we write these negative

integers in increasing order as a1 < a2 < · · · < an. (E.g. if Z\F = {−5,−10,−12}

then a1 = −12, a2 = −10 and a3 = −5). This implies that a1 − 1 ∈ F . But then

by the definition of P (n,m) and #, we have that F ’s being hereditary implies

that P (a1−1, a1) and hence Fa1, which is a contradiction. Hence, in fact we have

that {n ∈ Z : n ≠ 0} ⊆ F . So, what we have shown in this paragraph is that if

F ∈ D(k) is hereditary and closed, then {n ∈ Z : n ≠ 0} ⊆ F .

This, of course, implies that every element of Z is a pseudo-number. Conversely,

it is not difficult to see that all the pseudo-numbers are elements of Z.

Suppose that a ∈ k is not an integer. Then the set F = k \ {a} is a hereditary

closed set that does not contain a. Hence, what we have shown in this paragraph

is that the pseudo-numbers in the structure (k,D(k), D(k2), . . . ,#) are precisely

the integers.

Now we are in a position to show that (k,D(k), D(k2), . . . ,#) |= ¬SA. For,

consider the set k ∈ D(k). By definition #k = −(|k \ k| + 1) = −1. Hence,

by the results of the previous paragraph, we have that #k is a pseudo-number.

So suppose that SA held on the structure (k,D(k), D(k2), . . . ,#). Then there


would be m such that P (#k,m). Then by definition, there would be sets X, Y ∈

D(k) such that #k = #X and m = #Y and ∃ y ∈ Y X = Y \ {y}. Since

Hume’s Principle holds on the structure (k,D(k), D(k2), . . . ,#), we have that

#k = #X implies that there is a bijection f : k → X that is definable in the

structure k. By Proposition 73, we have that k \ k and k \ X are of the same

cardinality, so that X = k. But then the condition that y ∈ Y \ X implies that

y ∈ k \ k, which is a contradiction. So, in fact, SA does not hold on the structure

(k,D(k), D(k2), . . . ,#).

Remark 84. In the course of his proof of the existence of a model of AHP0 +

¬SA, Linnebo noted several properties which must be had by such models ([100]

pp. 164-165). Since models of ∆11 − HP0 +¬SA are automatically models of AHP0 +

¬SA, Linnebo’s results predict several properties of the model from the previous

proposition. In this remark, we briefly explain why the properties identified by

Linnebo hold on this structure. First, Linnebo notes that the example of a pseudo-number

n witnessing that SA fails on the structure (k,D(k), D(k2), . . . ,#) must

be such that n = #k. In the last paragraph of the previous proposition, we showed

that n = #k was such a counterexample. Second, Linnebo notes that the example

of a structure (k,D(k), D(k2), . . . ,#) |= ¬SA must be such that k \X ≠ ∅ implies

#k ≠ #X. In the context of the model constructed in the previous proposition,

this is a consequence of Ax’s Theorem (or Proposition 73). Finally, Linnebo notes

that the example of a structure (k,D(k), D(k2), . . . ,#) |= ¬SA must in effect

contain a copy of both ω and ω∗ ordered by the P -relation, that is, this structure

must contain a copy of the positive integers and the negative integers ordered by

the P -relation. In the model constructed in the previous proposition, this is reflected

in the fact that the pseudo-numbers are precisely the integers.


3.4.3 Application to O-Minimal Expansions of Real-Closed Fields

Remark 85. In this section, we apply Theorem 70 to construct models of Σ11 − PH0

on top of certain o-minimal expansions of real-closed fields (cf. Theorem 97). The

primary application of this construction is to note that an effectivization of this

construction allows us to conclude that Σ11 − PH0 <I ACA0 (cf. Corollary 99), thus

filling in a key piece of the interpretability relation (cf. Figure 3.2). Prior to doing

this, we recall some basic notions pertaining to the model theory of o-minimal

expansions of real-closed fields, such as dimension and Euler characteristic; the

reader who is already familiar with these notions may wish to proceed directly to

Theorem 97.

Definition 86. Suppose that L is a signature extending the signature of linear

orders, and suppose that M is an L-structure such that (M,≤) is a dense linear

order. Then M is o-minimal if every definable set is a finite union of points and

intervals.

Proposition 87. Every real-closed ordered field is o-minimal.

Proof. See Marker [108] Corollary 2.5 p. 11.

Definition 88. Suppose that M is an o-minimal structure. If X is a definable

subset of Mn, then let C(X) be the set of definable continuous functions f : X →

M , and let C∞(X) be C(X) plus the two constant functions −∞,∞. Further, if

f, g ∈ C∞(X) and f < g on X, then let

(f, g)X = {(x, r) ∈ X × M : f(x) < r < g(x)} (3.113)

Then inductively define the notion of a σ-cell, where σ ∈ 2<ω is a finite sequence

of zeros and ones. First, 0-cells are points and 1-cells are open intervals, including


(−∞, a), (a,+∞). Second, given a σ-cell X, the σ0-cells are graphs of functions

f ∈ C(X), and the σ1-cells are sets (f, g)X where f, g ∈ C∞(X).

Definition 89. Suppose that M is an o-minimal structure. A decomposition of

Mn is defined inductively as follows. A decomposition of M1 is a finite partition

of M with the following form:

{(−∞, a1), (a1, a2), . . . , (ak,+∞), {a1}, . . . , {ak}} (3.114)

where a1 < a2 < · · · < ak. A decomposition of Mm+1 = Mm × M is a fi-

nite partition of Mm+1 into cells {A1, . . . , An} such that the set of projections

{π(A1), . . . , π(An)} is a decomposition of Mm, where π : Mm+1 → Mm by

π(x1, . . . , xm+1) = (x1, . . . , xm). A decomposition of Mm is said to partition a

definable set X ⊆ Mm if X can be written as a finite union of pairwise disjoint

cells in the decomposition.

Theorem 90. (Cell Decomposition Theorem) Suppose that M is an o-minimal

structure. For any finite sequence of B-definable sets A1, . . . , Ak ⊆ Mm, there is

a decomposition of Mm partitioning each of the Ai. Moreover, the cells in the

decomposition are B-definable.

Proof. See van den Dries [147] Theorem 2.11 p. 52.

Definition 91. Suppose that M is an o-minimal structure and that X ⊆ Mn.

Then define

dim(X) = max{i1 + · · ·+ in : X contains a (i1, . . . , in)-cell} (3.115)

E(X) = k0 − k1 + k2 − · · · = ∑_{d=0}^{n} (−1)^d kd (3.116)


where kd is the number of d-dimensional cells contained in some cell decomposition

of X.

Remark 92. Note that if X ⊆M , then dim(X) > 0 if and only if X contains an

open interval. Note that the above definition of Euler characteristic can be shown to

be independent of the choice of the cell decomposition (cf. [147] Proposition 2.2

p. 70).

Proposition 93. Suppose that M is an o-minimal structure and that θ(x, y) is

a ∅-formula. Then there is a positive integer Nθ > 0 such that for all b ∈M , it is

the case that

|dim(θ(·, b))| , |E(θ(·, b))| < Nθ (3.117)

Further, for each integer k, it is the case that the sets

{b ∈M : dim(θ(·, b)) = k} & {b ∈M : E(θ(·, b)) = k} (3.118)

are ∅-definable. Moreover, the formulas that define these sets and the positive

integer Nθ can be uniformly computed from θ.

Proof. See van den Dries [147] Proposition 1.5 p. 65 and Proposition 2.10 p.

72.

Proposition 94. Suppose that M is an o-minimal expansion of a real-closed

field, and suppose that X ⊆Mn and Y ⊆Mm are definable sets. Then there is a

definable bijection f : X → Y if and only if dim(X) = dim(Y ) and E(X) = E(Y ).

Proof. See van den Dries [147] p. 132.

Remark 95. As a simple illustration of this fact, consider the example of the two


sets

X = (−2,−1) ⊔ {0} ⊔ (1, 2),   Y = (−1, 1) (3.119)

Both have dimension 1, since they both contain intervals, and their Euler charac-

teristics are the same, namely, E(X) = 1−2 = −1 and E(Y ) = 0−1 = −1. Hence,

the above proposition predicts that there is a definable bijection f : X → Y , and

in fact this is the case: one simply sends (−2,−1) to (−1, 0) and one sends 0 to

0 and one sends (1, 2) to (0, 1).
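The computation in this remark can be replayed mechanically for one-dimensional sets. The following is a minimal sketch under stated assumptions: a non-empty definable subset of the line is given as a list of points together with a list of pairwise disjoint open intervals (so the pieces already form a cell decomposition), and the names `dim_and_euler` and `definably_equinumerous` are hypothetical. The second function simply applies the criterion of Proposition 94.

```python
def dim_and_euler(points, intervals):
    """Dimension and Euler characteristic of a non-empty disjoint union of cells.

    `points` lists the 0-cells and `intervals` lists the 1-cells as (lo, hi)
    pairs; the pieces are assumed to be pairwise disjoint.
    """
    k0, k1 = len(points), len(intervals)
    dim = 1 if k1 > 0 else 0          # some 1-cell is present iff dim is 1
    return dim, k0 - k1               # E = k0 - k1


def definably_equinumerous(A, B):
    """Criterion of Proposition 94: same dimension and same Euler characteristic."""
    return dim_and_euler(*A) == dim_and_euler(*B)


# The two sets from equation (3.119), plus the closed interval [0, 1].
X = ([0], [(-2, -1), (1, 2)])
Y = ([], [(-1, 1)])
closed_unit = ([0, 1], [(0, 1)])
```

On this encoding X and Y receive the same pair of invariants, whereas the closed interval [0, 1] has two points and one open interval and so differs from Y in its Euler characteristic.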

Proposition 96. O-minimal expansions of real-closed fields have definable Skolem

functions.

Proof. See van den Dries [147] p. 94 for details.

Theorem 97. Suppose that k is a recursively-saturated o-minimal expansion

of a real-closed field. Then there is a computably uniformly definable function

# : D(k)→ k such that (k,D(k), D(k2), . . . ,#) is a model of Σ11 − PH0+¬Π11 − HP0.

Further, there is no function ∂ : D(k) → k such that (k,D(k), D(k2), . . . , ∂) is a

model of ∆11 − BL0.

Proof. Since k is a field of characteristic zero, the prime field of k is Q and the

integers Z are hence embedded into k via Q. Choose a recursive bijection 〈·, ·〉 :

Z2 → Z. Using this embedding and this bijection, define # : D(k)→ k by #X =

〈dimX,E(X)〉. Then by Proposition 94, the structure (k,D(k), D(k2), . . . ,#) is

a model of Hume’s Principle. To apply Theorem 70 (iii)-(iv), we need to show that

# : D(k) → k is computably uniformly definable. So suppose that θ(x, y) is an

∅-formula with non-empty set y of parameter variables. Then by Proposition 93,

from the formula θ(x, y) we can uniformly compute a positive integer Nθ > 0 such


that

k |= ∀ b [ |dim(θ(·, b))| , |E(θ(·, b))| < Nθ ] (3.120)

as well as ∅-formulas defining the sets {b : dim(θ(·, b)) = n} and {b : E(θ(·, b)) = n}.

Then for each such formula θ(x, y) we define the following ∅-formula θ′(x, y) as

follows:

θ′(x, y) ≡ ⋁_{i=0}^{Nθ} ⋁_{j=0}^{Nθ} [dim(θ(·, y)) = i & E(θ(·, y)) = j] → x = 〈i, j〉 (3.121)

Hence, by definition, we have that for any a

{#(θ(·, a))} = {c : k |= θ′(c, a)} (3.122)

Hence, by Theorem 70 (iii)-(iv) and Proposition 96, we have that the structure

(k,D(k), D(k2), . . . ,#) is a model of Σ11 − PH0. Further, since the set rng(#) = Z

is definable by a Σ11-formula in the structure (k,D(k), D(k2), . . . ,#) but is not de-

finable in k since k is o-minimal, we have that (k,D(k), D(k2), . . . ,#) is a model

of ¬Π11 − HP0.
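The proof only requires that some recursive bijection 〈·, ·〉 : Z2 → Z be fixed. One possible choice, given here purely as an illustrative sketch, composes the usual zig-zag bijection between Z and N with the Cantor pairing function on N. The function names are hypothetical, and nothing in the argument depends on this particular choice.

```python
from math import isqrt


def z_to_n(z):
    """Zig-zag bijection Z -> N: 0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ..."""
    return 2 * z if z >= 0 else -2 * z - 1


def n_to_z(n):
    """Inverse of z_to_n."""
    return n // 2 if n % 2 == 0 else -(n + 1) // 2


def cantor(a, b):
    """Cantor pairing function, a bijection N x N -> N."""
    return (a + b) * (a + b + 1) // 2 + b


def uncantor(n):
    """Inverse of the Cantor pairing function."""
    w = (isqrt(8 * n + 1) - 1) // 2   # index of the diagonal containing n
    b = n - w * (w + 1) // 2
    return w - b, b


def pair(i, j):
    """A recursive bijection Z x Z -> Z playing the role of <i, j>."""
    return n_to_z(cantor(z_to_n(i), z_to_n(j)))


def unpair(z):
    """Inverse of pair."""
    a, b = uncantor(z_to_n(z))
    return n_to_z(a), n_to_z(b)


def card(dim, euler):
    """#X computed from the two invariants of X, as in #X = <dim X, E(X)>."""
    return pair(dim, euler)
```

With such a pairing, two definable sets receive the same value under # exactly when they agree on both dimension and Euler characteristic, which is the condition appearing in Proposition 94.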

Now let us note why there is no function ∂ : D(k)→ k such that the structure

(k,D(k), D(k2), . . . , ∂) is a model of ∆11 − BL0. If there was such a function, then

by Proposition 34, there would be a function s : k2 → k whose graph was in D(k2)

and which satisfied s(x, y) = ∂({x, y}). Consider the definable set X = {(x, y) ∈

k2 : x < y}, and note that dim(X) = 2. Then s ↾ X : X → k is an injection. For,

suppose s(x, y) = s(x′, y′) for (x, y), (x′, y′) ∈ X. Then ∂({x, y}) = ∂({x′, y′}) and

x < y and x′ < y′. Then by Basic Law V, {x, y} = {x′, y′} and x < y and x′ < y′.

Then x = x′ and y = y′. Hence, in fact, s ↾ X : X → k is an injection. Then

trivially s ↾ X : X → rng(s ↾ X) is a bijection whose graph is in D(k2). Then by

the left-to-right direction of Proposition 94, it would follow that

2 = dim(X) = dim(rng(s ↾ X)) ≤ dim(k) = 1 (3.123)

which is a contradiction.

Remark 98. It is our claim that all of the results quoted and proved in this

subsection can be proven in ACA0 for o-minimal structures M with ACA0-provable

quantifier-elimination, such as real-closed fields (cf. Marker [109] Theorem 2.3

p. 10, Simpson [138] Lemma II.9.6 p. 98). The reason for this is that (i) the proofs

from van den Dries [147] all concern properties of definable sets, as opposed to

properties of the defining formula, and (ii) the proofs from van den Dries [147] all

proceed by induction on the cartesian power of the definable set. It is worthwhile

to say a little bit more about each of these points.

In regard to (i), the proofs in this section from van den Dries [147] are all

concerned with properties of a definable set X, so that the definable set X has

the property regardless of which particular formula is used to define X. For

instance, the property of X’s being a cell has this feature, since a definable set

X ⊆ M is e.g. an interval or a point regardless of whether the formula ϕ or the

formula ψ is being used to define it (where ϕ and ψ are two formulas that do in fact

define X). By the same token, the proofs in this section from van den Dries [147]

are not concerned with the syntactic complexity of given formulas, for instance,

whether or not they are Π^0_3-formulas or Π^0_4-formulas. Hence, if M has quantifier-elimination,

then for the purposes of the proofs in this section from van den Dries

[147], we can take the quantifier-free formulas as representatives for the definable

sets. For instance, in proving the Cell Decomposition Theorem in this manner, we

would in fact prove that e.g. for every finite sequence of quantifier-free formulas


ϕ1(x), . . . , ϕk(x) in m-free variables, there is a quantifier-free decomposition of

Mm partitioning each of the ϕi(x).

In regard to (ii), the proofs in this section from van den Dries [147] all proceed

by induction, where it is first shown that the definable subsets of M have a given

property, and then it is shown that if the definable subsets of Mn have a given

property, then the definable subsets of Mn+1 have this given property. Given our

discussion in the previous paragraph, when proving these theorems in ACA0, we

would in fact prove that the quantifier-free formulas ϕ(x) have a given property,

and that if the quantifier-free formulas ϕ(x1, . . . , xn) have a given property, then

the quantifier-free formulas ϕ(x1, . . . , xn+1) have a given property. Since ACA0 has

the mathematical induction axiom for all sets X, it suffices to note that ACA0

has enough comprehension to show that the sets X on which it is doing induc-

tion exist. Here it suffices to note that the proofs in this section from van den

Dries [147] all concern properties of the definable sets that can be expressed by

(iii) finitely many quantifiers over quantifier-free definable sets and by (iv) finitely

many quantifiers over the structure M . For instance, to reiterate the point made

in the last paragraph, in proving the Cell Decomposition Theorem in this fashion,

we must show that for every m and every finite sequence of quantifier-free formu-

las ϕ1(x), . . . , ϕk(x) in m-free variables, there is a quantifier-free decomposition of

Mm partitioning each of the ϕi(x). In terms of (iii), this involves a universal quan-

tifier over quantifier-free definable sets followed by an existential quantifier over

quantifier-free definable sets. In terms of (iv), this involves a universal quantifier

to say that e.g. the cells in the decomposition are disjoint and another universal

quantifier to say that e.g. ϕi(x) can be written as a finite union of pairwise dis-

joint cells in the decomposition. Since the number of quantifiers in (iii) and (iv)


is fixed in advance, ACA0 can prove that the set on which one is doing induction

exists. In this way, the proofs from van den Dries [147] can be translated word-

for-word into proofs in ACA0 for o-minimal structures M which have ACA0-provable

quantifier-elimination, such as real-closed fields.

Corollary 99. Σ11 − PH0 <I ACA0.

Proof. This follows from Proposition 13, the fact that ACA0 proves the existence

of recursively saturated elementary extensions (cf. Simpson [138] Lemma IX.4.2

p. 379), and the fact that the proof of the previous theorem can be formalized in

ACA0 for o-minimal expansions of real-closed fields with ACA0-provable quantifier-

elimination, such as real-closed fields.

3.4.4 Application to Separably Closed Fields

Remark 100. In the two previous subsections, we applied Theorem 70 to con-

struct models of ∆11 − HP0 on top of various fields, such as certain algebraically

closed fields and o-minimal expansions of real-closed fields. We noted in both

Theorem 77 and Theorem 97 that this construction cannot result in models of

∆11 − BL0. Hence, this raises the question of whether there is some natural field

such that one can apply Theorem 70 to it to obtain models of ∆11 − BL0. In this

section, we isolate certain model-theoretic conditions on a field (such as uniform

elimination of imaginaries) which suffice to ensure that such a construction can

succeed (cf. Theorem 106). Then we note that separably closed fields of finite

imperfection degree satisfy these model-theoretic conditions (cf. Theorem 108).

Definition 101. Suppose that M is an L-structure. Then M has uniform elimi-

nation of imaginaries if for every ∅-definable equivalence relation E on Mn there


is an ∅-definable function f : Mn →Mm for some m > 0 such that

zEy ⇐⇒ f(z) = f(y) (3.124)

Definition 102. Suppose that M is an L-structure. Then M has a ∅-definable

pairing function if there is an ∅-definable injection ι : M2 →M .

Proposition 103. Suppose that M has uniform elimination of imaginaries and

an ∅-definable pairing function. Then for every ∅-definable equivalence relation E

on Mn there is an ∅-definable function f : Mn →M such that

zEy ⇐⇒ f(z) = f(y) (3.125)

Proof. By hypothesis, M has an ∅-definable pairing function ι : M2 → M . Then

define injections jn : Mn →M recursively as follows:

j1(x1) = x1 (3.126)

j2(x1, x2) = ι(x1, x2) (3.127)

jn+1(x1, . . . xn, xn+1) = ι(jn(x1, . . . , xn), xn+1) (3.128)

Finally, given a function f : Mn → Mm for some m > 0 which witnesses the

uniform elimination of imaginaries, simply define f ∗ = jm ◦ f .
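The recursion defining the injections jn can be made concrete once a pairing function is fixed. In the following minimal sketch the Cantor pairing function on the natural numbers stands in for the ∅-definable injection ι; this stand-in, and the names `iota`, `j` and `compose_with_j`, are assumptions made only for illustration.

```python
from itertools import product


def iota(a, b):
    """Stand-in for the pairing injection: the Cantor pairing on N x N."""
    return (a + b) * (a + b + 1) // 2 + b


def j(xs):
    """j_1(x1) = x1 and j_{n+1}(x1, ..., x_{n+1}) = iota(j_n(x1, ..., xn), x_{n+1})."""
    code = xs[0]
    for x in xs[1:]:
        code = iota(code, x)
    return code


def compose_with_j(f):
    """Given f with tuple values, return f* = j_m o f with values in a single sort."""
    return lambda *args: j(f(*args))


# Example: a hypothetical f : M^2 -> M^3 and its one-sorted version f*.
f = lambda x, y: (x, y, x + y)
f_star = compose_with_j(f)

# All codes of triples drawn from {0, 1, 2, 3}.
triple_codes = {j(t) for t in product(range(4), repeat=3)}
```

Since each jm is an injection, f(z) = f(y) holds exactly when f*(z) = f*(y), so f* represents the same equivalence relation as f while taking values in M rather than in a cartesian power of M.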

Proposition 104. Suppose that M has an ∅-definable pairing function and that

dcl(∅) has at least two elements. Then there is a uniformly computable sequence

of injections ιn : M →M such that n ≠ m implies rng(ιn) ∩ rng(ιm) = ∅.

Proof. Suppose that ι : M2 →M is the ∅-definable injection and that b, c ∈ dcl(∅)


are distinct. Then define injections ιn : M →M recursively as follows:

ι0(x) = ι(c, ι(c, x)) (3.129)

ι2s+1(x) = ι(b, ι2s(x)) (3.130)

ι2s+2(x) = ι(c, ι2s+1(x)) (3.131)

By construction, all the functions ιn : M → M are injections. So it remains to

show by induction on m ≤ n that rng(ιn) ∩ rng(ιm) = ∅ when m ≠ n. Clearly

this holds for n = 0. So suppose it holds for n. If n is even then n = 2s and

n + 1 = 2s + 1. Suppose that m < n + 1 is such that rng(ιn+1) ∩ rng(ιm) ≠ ∅.

Then there are x, y such that ιn+1(x) = ιm(y). Expanding this equation on the

left, we have ι(b, ι2s(x)) = ι2s+1(x) = ιn+1(x) = ιm(y). Then by construction,

ιm(y) = ι(b, ι2t(y)) for some 2t + 1 = m. Then ιm−1(y) = ι2t(y) = ι2s(x) = ιn(x),

which contradicts our induction hypothesis on n. On the other hand, if n is odd

then n = 2s + 1 and n + 1 = 2s + 2. Suppose that m < n + 1 is such that

rng(ιn+1) ∩ rng(ιm) ≠ ∅. Then there are x, y such that ιn+1(x) = ιm(y). Then

expanding this equation on the left we have ι(c, ι2s+1(x)) = ι2s+2(x) = ιn+1(x) =

ιm(y). There are then two cases. First suppose that m = 0. Then by construction

ιm(y) = ι(c, ι(c, y)). Then ι(b, ι2s(x)) = ι2s+1(x) = ι(c, y), and so b = c, which

is a contradiction. Second, suppose that m > 0. Then by construction, ιm(y) =

ι(c, ι2t+1(y)) for some 2t + 2 = m. Then ιm−1(y) = ι2t+1(y) = ι2s+1(x) = ιn(x),

contradicting our induction hypothesis on n.
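The construction of the injections ιn can be tested on a finite sample. The following is a minimal sketch under stated assumptions: the Cantor pairing function on the natural numbers stands in for ι, and the constants 0 and 1 stand in for the two distinct elements b, c of dcl(∅). The names `iota`, `B`, `C` and `iota_n` are hypothetical.

```python
def iota(a, b):
    """Stand-in for the pairing injection: the Cantor pairing on N x N."""
    return (a + b) * (a + b + 1) // 2 + b


B, C = 0, 1  # stand-ins for the distinct definable elements b and c


def iota_n(n, x):
    """iota_0(x) = iota(c, iota(c, x)); then alternate an outer b (odd n) and c (even n)."""
    value = iota(C, iota(C, x))                 # iota_0
    for k in range(1, n + 1):
        tag = B if k % 2 == 1 else C            # iota_{2s+1} uses b, iota_{2s+2} uses c
        value = iota(tag, value)
    return value


def sample_range(n, size=30):
    """The values of iota_n on the sample inputs 0, ..., size - 1."""
    return {iota_n(n, x) for x in range(size)}
```

A finite check of this kind cannot replace the induction in the proof, but it does exercise both the even and the odd case of the recursion: the sampled ranges of the first few ιn are pairwise disjoint, and each ιn is injective on the sample.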

Remark 105. The intuitive idea of the proof of the following theorem is very clear.

For, suppose that M has uniform elimination of imaginaries and a ∅-definable

pairing function. Then given a formula θ(x, y) with a set of parameter variables


y of length ℓ > 0, these assumptions yield an ∅-definable function ∂θ : M^ℓ → M

such that

M |= [∀ x θ(x, a) ↔ θ(x, b)] ⇐⇒ ∂θ(a) = ∂θ(b) (3.132)

Intuitively, the idea is to build a model (M,D(M), D(M2), . . . , ∂) of Basic Law V

by setting

∂(θ(·, a)) = ∂θ(a) (3.133)

However, there are two potential problems. First, such a function will not be

well-defined, since a given set X ∈ D(M) will be defined by many formulas

θ1(·, a), θ2(·, b), . . .. Second, it is not obvious that such a function will be in-

jective, which is required by Basic Law V. Overcoming these problems is the only

thing that makes the below proof non-trivial. In particular, the first problem is

overcome simply by fixing beforehand an enumeration of all the potential defining

formulas θ1(x, y), . . . , θn(x, y), . . ., and then defining ∂(X) to be ∂θn(a) for the first

θn(x, a) in the enumeration that defines X for some tuple a. The second problem

is overcome by including additional hypotheses on M which ensure that we can

partition M = ⊔_n M_n and likewise ensure that ∂θn(a) always takes values in M_n.

The previous proposition was in effect devoted to explaining why the hypothesis

of a ∅-definable pairing function with |dcl(∅)| > 1 ensures that we can construct

such a partition.
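The two devices described in this remark (taking the first formula in a fixed enumeration, and tagging values by the index of that formula) can be illustrated on a finite toy example. The following is only a sketch under strong simplifying assumptions: the universe is {0, 1, 2, 3}, there are four hypothetical one-parameter formulas, indices are 0-based, the least parameter in an equivalence class plays the role of the function witnessing elimination of imaginaries, and the codes are pairs (n, a) rather than elements of the structure itself.

```python
UNIVERSE = range(4)

# A fixed enumeration of hypothetical formulas theta_n(x, a) with one parameter a.
FORMULAS = [
    lambda x, a: x == a,        # singletons
    lambda x, a: x <= a,        # initial segments
    lambda x, a: x <= a % 2,    # redefines sets already defined above
    lambda x, a: x != a,        # complements of singletons
]


def extension(n, a):
    """The set defined by theta_n(., a)."""
    return frozenset(x for x in UNIVERSE if FORMULAS[n](x, a))


def code(X):
    """Code X by the first n such that theta_n(., a) defines X for some a.

    The first component n keeps codes from different formulas apart (the role
    of the disjoint ranges); the least such a is a canonical representative
    of the class of parameters defining X via theta_n.
    """
    for n in range(len(FORMULAS)):
        params = [a for a in UNIVERSE if extension(n, a) == X]
        if params:
            return (n, min(params))
    raise ValueError("X is not defined by any formula in the enumeration")


DEFINABLE = {extension(n, a) for n in range(len(FORMULAS)) for a in UNIVERSE}
```

Because `code` depends only on the set X and not on the formula used to present it, it is well-defined, and distinct sets receive distinct codes since the pair (n, a) determines the set defined by the n-th formula at a.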

Theorem 106. Suppose that M is a Th(M)-computably saturated structure

such that (i) M has uniform elimination of imaginaries, (ii) M has an ∅-definable

pairing function, and (iii) dcl(∅) has at least two elements. Then there is a

Th(M)-computably uniformly definable function ∂ : D(M) → M such that

(M,D(M), D(M2), . . . , ∂) is a model of ∆11 − BL0.


Proof. To apply Theorem 70 (iii)-(iv), we need to define an injection ∂ : D(M)→

M that is Th(M)-computably uniformly definable. Choose a fixed computable

enumeration of the ∅-formulas θ(x, y) with non-empty set y of parameter variables

of length ℓ_n as θ1(x, y), . . . , θn(x, y), . . .. For each n > 0 and 0 < m ≤ n, consider

the following ∅-definable sets Un,m ⊆ M^{ℓ_n}, where again ℓ_n is the length of the

tuple y in θn(x, y):

U1,1 = M^{ℓ_1} (3.134)

U2,1 = {a ∈ M^{ℓ_2} : ∃ b ∈ M^{ℓ_1} [∀ x θ2(x, a) ↔ θ1(x, b)]} (3.135)

U2,2 = M^{ℓ_2} \ U2,1 (3.136)

U3,1 = {a ∈ M^{ℓ_3} : ∃ b ∈ M^{ℓ_1} [∀ x θ3(x, a) ↔ θ1(x, b)]} (3.137)

U3,2 = {a ∈ M^{ℓ_3} : ∃ b ∈ M^{ℓ_2} [∀ x θ3(x, a) ↔ θ2(x, b)]} \ U3,1 (3.138)

U3,3 = M^{ℓ_3} \ (U3,1 ∪ U3,2) (3.139)

Note that for a fixed n > 0 the sets Un,1, . . . , Un,n partition M^{ℓ_n} and that

the formulas defining these sets are uniformly computable from n. Then define

∅-definable equivalence relations on M^{ℓ_n} as follows:

yEnz ⇐⇒M |= [∀ x θn(x, y)↔ θn(x, z)] (3.140)

Note by definition that any two elements y and z which are En-equivalent are in

the same member of the partition Un,1, . . . , Un,n of M^{ℓ_n}. By Proposition 103, from

θn(x, y) we can uniformly Th(M)-compute a ∅-definable function fn : M^{ℓ_n} → M

such that

M |= [∀ x θn(x, y)↔ θn(x, z)]⇐⇒ yEnz ⇐⇒ fn(y) = fn(z) (3.141)


By Proposition 104, we can uniformly compute a sequence of injections ιn : M →

M with disjoint ranges, and we can define gn = ιn◦fn. Finally, define ∂ : D(M)→

M by setting ∂(θn(·, a)) = c if and only if

⋀_{m=1}^{n} [a ∈ Un,m → (∃ b ∈ M^{ℓ_m} & ∀ x θn(x, a) ↔ θm(x, b) & c = gm(b))] (3.142)

First let us show that ∂ : D(M) → M is a well-defined function. So suppose

that θn(·, a) and c satisfy the right-hand side of equation (3.142) and that θn′(·, a′)

and c′ also satisfy the right-hand side of equation (3.142), and suppose that θn(·, a)

and θn′(·, a′) define the same set. Then we must show that c = c′. Without loss of

generality, n′ ≤ n. If n′ = n, then since θn(·, a) and θn′(·, a′) define the same set,

we have that a and a′ are En-equivalent and hence are in the same set Un,m. Then

by the right-hand side of equation (3.142), we have that there are b, b′ ∈ M^{ℓ_m} such

that

M |= ∀ x θm(x, b)↔ θn(x, a)↔ θn(x, a′)↔ θm(x, b′) (3.143)

c = gm(b) (3.144)

c′ = gm(b′) (3.145)

But by equation (3.143), we have that b and b′ are Em-equivalent, and hence by

equation (3.141), we have that fm(b) = fm(b′) and so by equations (3.144)-(3.145)

we have that

c = gm(b) = ιm ◦ fm(b) = ιm ◦ fm(b′) = gm(b′) = c′ (3.146)

In the case where n′ < n, we have that a ∈ Un,m and a′ ∈ Un′,m′ and so by the


right-hand side of equation (3.142), we have that there are b ∈ M^{ℓ_m}, b′ ∈ M^{ℓ_m′} such

that

M |= ∀ x θm(x, b) ↔ θn(x, a) ↔ θn′(x, a′) ↔ θm′(x, b′) (3.147)

c = gm(b) (3.148)

c′ = gm′(b′) (3.149)

Then by equation (3.147) and the definition of the sets Un,m, we must have that

m = m′. Then by equation (3.147) again, we have that b and b′ are Em-equivalent,

and, hence, by equation (3.141), we have that fm(b) = fm(b′), and so by equa-

tions (3.148)-(3.149) we have that

c = gm(b) = ιm ◦ fm(b) = ιm ◦ fm(b′) = gm(b′) = c′ (3.150)

Therefore, ∂ : D(M)→M is a well-defined function.

Now let us show that ∂ : D(M)→M is an injection. Suppose that θn(·, a) and

c satisfy the right-hand side of equation (3.142) and that θn′(·, a′) and c′ satisfy

the right-hand side of equation (3.142) and suppose that c = c′. Then we must

show that θn(·, a) and θn′(·, a′) define the same set. We have that a ∈ Un,m and

a′ ∈ Un′,m′ , and by the right-hand side of equation (3.142), we have that there are

b ∈ M^{ℓ_m}, b′ ∈ M^{ℓ_m′} such that

M |= ∀ x θn(x, a)↔ θm(x, b) (3.151)

M |= ∀ x θn′(x, a′)↔ θm′(x, b′) (3.152)

gm(b) = c = c′ = gm′(b′) (3.153)


Since gm = ιm ◦ fm and since the functions ιm have pairwise disjoint ranges, equation (3.153)

implies that m = m′ and since gm = ιm ◦ fm and ιm is an injection, we have

that equation (3.153) implies that fm(b) = fm′(b′), which by equation (3.141)

implies that θm(·, b) and θm′(·, b′) define the same set. This in turn implies with

equations (3.151)-(3.152) that θn(·, a) and θn′(·, a′) define the same set, which is

what we wanted to show. Hence, in fact ∂ : D(M)→M is an injection.

So, ∂ : D(M) → M is well-defined and indeed an injection. Note that by

its very definition in equation (3.142), we have that ∂ : D(M) → M is Th(M)-

computably uniformly definable. Hence, by Theorem 70 (iii)-(iv), we have that

(M,D(M), D(M2), . . . , ∂) is a model of ∆11 − BL0.

Definition 107. Suppose that k is a field of characteristic p > 0. Then k is a

separably closed field of finite imperfection degree if (i) there is a finite set B ⊆ k

such that the set of monomials {b_1^{m_1} · · · b_e^{m_e} : 0 ≤ m_i < p & b_1, . . . , b_e ∈ B} is a

basis for k over k^p, and if (ii) every f ∈ k[x] such that f′ ≠ 0 has a root in k.

Theorem 108. Suppose that k is a recursively saturated separably closed field

of finite imperfection degree. Then there is a computably uniformly definable

function ∂ : D(k)→ k such that (k,D(k), D(k2), . . . , ∂) is a model of ∆11 − BL0.

Proof. This follows immediately from the fact that such fields satisfy the an-

tecedents of the previous theorem and have a computable theory when names

are added for the finite set B from the previous definition (cf. Messmer [110]

Proposition 4.2 p. 140, p. 143, Remark 4.4 p. 141).

Remark 109. If we knew that all the elements of the proof of the previous

theorem were formalizable in ACA0, then we could infer from the proof of the

above theorem and Proposition 13 that we have ∆11 − BL0 <I ACA0. It is clear


from the proof that this comes down to determining whether or not the uniform

elimination of imaginaries for separably closed fields of finite imperfection degree

is provable in ACA0.

3.5 Further Questions

Question 110. In Figure 3.1, we summarized what is known about the provability

relation. Two questions which remain open are the following: does ∆11 − BL0 imply

Σ11 − LB0 and does Π11 − HP0 imply Σ11 − PH0?

Question 111. In Remark 78, we noted that if Ax’s Theorem 72 is provable in

ACA0, then we would have another proof of ∆11 − HP0 <I ACA0 besides the proof

from Corollary 99. Hence, is Ax’s Theorem 72 provable in ACA0?

Question 112. In Remark 109, we noted that if the uniform elimination of

imaginaries for separably closed fields is provable in ACA0, then we would have

∆11 − BL0 <I ACA0. Hence, is the uniform elimination of imaginaries for separably

closed fields provable in ACA0?

Question 113. Is HP2 faithfully interpretable in PA2? See the discussion of this

question at Remark 31.

Question 114. The results in Heck [64], Ganea [50], and Visser [149] imply that

ABL0 is mutually interpretable with Robinson’s Q. Is ∆11 − BL0 mutually inter-

pretable with Robinson’s Q?

Question 115. What is the exact interpretability strength of AHP0 and ∆11 − HP0?

Are these theories interpretable in Robinson’s Q?

Question 116. In § 2.2, and in particular around equation (3.20), we pointed out

that there is no function symbol in our language for the mapping (R, n) 7→ #(Rn),


where R is a binary relation and Rn = {m : Rnm}. The inclusion of such

a function symbol will not affect systems which contain the ∆11-comprehension

schema, since the graph of this function is ∆11-definable (cf. equation (3.20)).

However, in Propositions 55-56, we pointed out that AHP0 and ABL0 do not prove

the existence of the graph of this function (R, n) 7→ #(Rn), in the sense that

AHP0 and ABL0 do not prove that the binary relation {(n,m) : #(Rn) = m}

exists for every binary relation R. Does the addition of this function symbol

affect the interpretability strength of AHP0 and ABL0? In particular, do the Heck-

Visser-Ganea results about the mutual interpretability of ABL0 and Robinson’s Q

mentioned in § 3.1.5 still hold if we add a function symbol for (R, n) 7→ #(Rn)?


CHAPTER 4

DENJOY INTEGRATION: DESCRIPTIVE SET THEORY AND MODEL

THEORY

4.1 Introduction

One of the classical results of descriptive set theory is Mazurkiewicz’s result

that Diff[a, b], the set of everywhere differentiable real-valued functions on [a, b], is a

Π11-complete subset of the Polish space C[a, b] of continuous real-valued functions

on [a, b] (cf. Kechris [84] § 33.D Theorem 33.9 p. 248). The set {F ∈ Diff[a, b] :

F (a) = 0} is in one-one correspondence with Derv[a, b], the set of derivatives

of everywhere differentiable real-valued functions on [a, b]. Building on unpublished

work of Ajtai, in the 1980s it was shown by Dougherty and Kechris that Derv[a, b],

viewed as a subspace of the countable product space (C[a, b])ω of the Polish space

C[a, b], is co-analytic but not analytic (and indeed not even analytic on the co-

analytic subspace of (C[a, b])ω of sequences which converge pointwise) (cf. [28]

Theorem 1-2 p. 147, [83] Theorem 3.1-3.2 p. 310). This work raises the natural

question of the descriptive set theory complexity of integration, and this is how

Dougherty and Kechris frame the question:

A second problem is related to the definability aspects of the so-called “descriptive definitions of integrals” [. . . ]. These are essentially implicit definitions like the original one of the primitive. For example, the Lebesgue integral F of an integrable function f can be defined as the unique (up to a constant) F such that (i) F is absolutely continuous and (ii) F ′ = f(x) for almost all x. By replacing in (i) absolute continuity by more general conditions, one can obtain descriptive definitions of integrals involving any derivative. The question is whether these conditions can possibly be Borel ([28] p. 166).

The “more general” condition which Dougherty and Kechris refer to is known as

“generalized absolute continuity in the restricted sense” or ACG∗[a, b] (cf. Defi-

nition 121 and Theorems 128 and 134), and the resulting integrals are known as

Denjoy integrals. These turn out to be equivalent to the Henstock-Kurzweil inte-

grals and the Perron integrals (cf. Theorem 138 and Gordon [56] esp. Chapter 11

and Swartz [142]). In this chapter, Dougherty and Kechris’ question is answered

by showing that ACG∗[a, b] is a coanalytic but not analytic subset of the Polish

space of real-valued continuous functions C[a, b] (cf. Corollary 197). Using the

same methods, it is also shown that the operation of indefinite Denjoy integration

is coanalytic but not analytic. In particular, it is shown that the relation “f is

Denjoy integrable and F is equal to its indefinite integral” is a co-analytic but not

analytic relation on the product space M [a, b]×C[a, b], where M [a, b] is the Polish

space of real-valued measurable functions on [a, b] and where C[a, b] is again the

Polish space of real-valued continuous functions on [a, b] (cf. Corollary 195 and

Figure 4.1).

Dougherty and Kechris’ question was essentially a question of how difficult it

is to define the Denjoy integral. One can also ask about the complexity of the sets

which are defined by this integral. Here the appropriate setting seems to be that

of model theory, where one asks what can be defined in a first-order way from the

Denjoy integral, and a natural language for this is the language of R[X]-modules,

where the indeterminate X is interpreted as the indefinite Denjoy integral, so

that the atomic formulas are a very elementary type of integral equation. One

of the basic questions to ask here is whether there is any first-order difference

between the Denjoy integrable functions, the Lebesgue integrable functions, and

the continuous functions with the Riemann integral. This question is answered

here in the negative, in that it is shown that these R[X]-modules are elementar-

ily equivalent, and taken as Q[X]-modules their complete theory is computable

(Corollary 226). Hence, the conclusion of this paper is that from a descriptive set

theory standpoint, the Denjoy integrals are much more difficult to describe than

the Lebesgue or Riemann integrals, while from an admittedly elementary model-

theoretic standpoint, these integrals are indistinguishable. For suggestions as to

less elementary model-theoretic standpoints, see the further questions in § 4.5.

4.2 Background

The primary goal of this section is to review and collate the background ma-

terial on Denjoy integration which will be employed in subsequent sections. The

basic idea of the Denjoy integral is that it generalizes the Lebesgue integral by

replacing the notion of absolute continuity with the notion of “generalized abso-

lute continuity in the restricted sense”. Hence in § 4.2.1 this notion is defined and

several of its basic properties are recorded. Then in § 4.2.2, the definition of the

Denjoy integral is stated, and the manner in which this integral generalizes the

Lebesgue integral is explicitly discussed (cf. Theorems 128 and 134). In § 4.2.2,

several equivalent characterizations of the Denjoy integral are also recalled, such

as its equivalence with the Henstock-Kurzweil integral, and several of the basic

properties of this integral are noted. Finally, in § 4.2.3, two important lemmas

about the Denjoy integral are noted: namely, the Improper Integrals Lemma and

Lebesgue’s Lemma (Lemmas 143 and 146 and 149). These two lemmas allow for

the definition of a sequence of subspaces of the Denjoy integrable functions which

will prove important in what follows (cf. Definition 153 and Figure 4.1). Further,

in § 4.2.3, attention is paid to the exact closure properties possessed by these

subspaces, as these properties will be particularly important for the model theory

results discussed later (cf. Remark 199).

Definition 117. Let M [a, b] be the space of measurable real-valued functions on

[a, b] under the equivalence relation of almost everywhere equality. Let C[a, b] be

the space of continuous real-valued functions on [a, b]. Let K[a, b] be the space of

closed subsets of [a, b]. Let L1[a, b] be the space of Lebesgue Integrable functions

on [a, b]. (See Figure 4.1, and see Remarks 180-181 for the Polish structure on

C[a, b], K[a, b] and M [a, b]).

Remark 118. In what follows we always identify almost everywhere equal el-

ements of M [a, b]. Of course, several of the results also hold for the pointwise

functions, but especially when we consider the topology on M [a, b] in § 4.3.3, it

will be important to identify almost everywhere equal elements of M [a, b].

4.2.1 Absolutely Continuous Functions and Generalizations

Definition 119. Suppose that $K \subseteq [a, b]$. Then a $K$-edged sub-partition $D$ of $[a, b]$ is a finite non-empty collection $J_1, \ldots, J_n$ of non-overlapping closed sub-intervals of $[a, b]$ which have both their endpoints in $K$. A sub-partition $D$ of $[a, b]$ is called a partition if $[a, b] = \bigcup_{J \in D} J$. The length of a closed interval $J$ will be denoted by its Lebesgue measure $\mu(J)$.

Remark 120. The above terminology is introduced purely for the purpose of not having to explicitly write out sub-partitions as $[a_1, b_1], \ldots, [a_n, b_n]$, which can be quite cumbersome when one is quantifying over such sub-partitions, as in the following definitions.

Figure 4.1. Containment Diagram for Subsets of M [a, b] and C[a, b]
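To make the terminology of Definition 119 concrete, the following small instance lists every sub-partition for a three-point set. It is an illustrative sketch rather than part of the original text: the choice of $K$ is arbitrary, and it assumes that sub-intervals are non-degenerate and that "non-overlapping" means having disjoint interiors.

```latex
% Illustrative instance of Definition 119 (not from the source):
% [a,b] = [0,1] and K = \{0, \tfrac12, 1\}.
% Assumptions: intervals are non-degenerate; "non-overlapping" = disjoint interiors.
\begin{align*}
  &\text{Closed intervals with both endpoints in } K: &&
     [0,\tfrac12], \quad [\tfrac12, 1], \quad [0,1]. \\
  &\text{$K$-edged sub-partitions of } [0,1]: &&
     \{[0,\tfrac12]\}, \quad \{[\tfrac12,1]\}, \quad \{[0,\tfrac12], [\tfrac12,1]\}, \quad \{[0,1]\}. \\
  &\text{Those that are partitions:} &&
     \{[0,\tfrac12], [\tfrac12,1]\}, \quad \{[0,1]\}. \\
  &\text{Total length of the first one:} &&
     \sum_{J \in \{[0,\frac12]\}} \mu(J) = \tfrac12 .
\end{align*}
% [0,1/2] and [1/2,1] share only an endpoint, so they are non-overlapping;
% [0,1] overlaps both, so it cannot be combined with either.
```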

Definition 121. Suppose that $F : [a, b] \to \mathbb{R}$ and $K \subseteq [a, b]$.

(i) Then $F$ is said to be absolutely continuous on $K$, written $F \in \mathrm{AC}(K)$, if for every $\varepsilon > 0$ there is $\delta > 0$ such that for all $K$-edged sub-partitions $D$ of $[a, b]$, if $\sum_{J \in D} \mu(J) < \delta$ then $\sum_{J \in D} |F(\max(J)) - F(\min(J))| < \varepsilon$.

(ii) Further, $F$ is said to be absolutely continuous in the restricted sense on $K$, written $F \in \mathrm{AC}_*(K)$, if for every $\varepsilon > 0$ there is $\delta > 0$ such that for all $K$-edged sub-partitions $D$ of $[a, b]$, if $\sum_{J \in D} \mu(J) < \delta$ then $\sum_{J \in D} \omega(F, J) < \varepsilon$, where $\omega(F, J) = \sup\{|F(x) - F(y)| : x, y \in J\}$.

(iii) Finally, $F$ is said to be generalized absolutely continuous in the restricted sense on $K$, written $F \in \mathrm{ACG}_*(K)$, if there are $K_n \in \mathcal{K}[a, b]$ such that $K = \bigcup_n K_n$ and $F \in \mathrm{AC}_*(K_n)$ for each $n$.
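The difference between clauses (i) and (ii) is that AC(K) only inspects F at the endpoints of the intervals, whereas AC∗(K) inspects the oscillation of F across each whole interval. The following worked example is an illustrative sketch and is not taken from the source: the set $K$, the tent heights $1/n$, and the function $F$ are all assumptions chosen for the illustration, and AC∗ is written as $\mathrm{AC}_*$.

```latex
% Illustrative example (not from the source): F \in AC(K) but F \notin AC_*(K).
% Take [a,b] = [0,1], K = \{0\} \cup \{1/n : n \geq 1\}, and J_n = [1/(n+1), 1/n].
% Let F be continuous with F = 0 on K and, on each J_n, a tent of height 1/n.
\begin{align*}
  &\text{Endpoints in } K: && |F(\max(J)) - F(\min(J))| = 0 \ \text{for every $K$-edged } J,
     \ \text{so } F \in \mathrm{AC}(K). \\
  &\text{Oscillation:} && \omega(F, J_n) = \tfrac{1}{n}, \qquad \mu(J_n) = \tfrac{1}{n(n+1)}. \\
  &\text{For } M \le M': && \sum_{n=M}^{M'} \mu(J_n) = \tfrac{1}{M} - \tfrac{1}{M'+1} < \tfrac{1}{M},
     \qquad \sum_{n=M}^{M'} \omega(F, J_n) = \sum_{n=M}^{M'} \tfrac{1}{n}.
\end{align*}
% Given any \delta > 0, choose M > 1/\delta and then M' so large that the harmonic
% block \sum_{n=M}^{M'} 1/n exceeds 1. The sub-partition \{J_M, \dots, J_{M'}\} has
% total length < \delta but total oscillation > 1, so F \notin AC_*(K) (take \varepsilon = 1).
```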

Remark 122. In the above definitions, note that no topological restrictions are

placed on F or K, although typically we will restrict ourselves to F ∈ C([a, b])

and K ∈ K[a, b].

Remark 123. Note that $F \in \mathrm{AC}([a, b])$ or $F \in \mathrm{AC}_*([a, b])$ trivially implies $F \in C[a, b]$, but $F \in \mathrm{ACG}_*([a, b])$ does not in general imply that $F \in C([a, b])$. For example, consider $F = 0$ on $[0, \frac{1}{2})$ and $F = 1$ on $[\frac{1}{2}, 1]$. Let $K_1 = [\frac{1}{2}, 1]$ and let $K_n = [0, \frac{1}{2} - \frac{1}{n+1}]$ for $n > 1$. Then $[0, 1] = \bigcup_n K_n$ and $F \in \mathrm{AC}_*(K_n)$ and hence $F \in \mathrm{ACG}_*([0, 1])$, but by construction $F \notin C([0, 1])$. However, it turns out that functions in $\mathrm{ACG}_*([a, b])$ are differentiable almost everywhere:

Proposition 124. If F ∈ ACG∗([a, b]) then F is differentiable almost everywhere

on [a, b].

Proof. See Gordon [56] Corollary 6.19 p. 100.

Remark 125. The following proposition enumerates several elementary proper-

ties of absolutely continuous functions which shall be appealed to at various points

in what follows.

Proposition 126. Suppose that F ∈ C([a, b]).

(i) If E = [c, d] then F ∈ AC(E) if and only if F ∈ AC∗(E).

(ii) If $E \in \mathcal{K}[a, b]$ and $F \in \mathrm{AC}_*(E)$ and $(a, b) - E = \bigsqcup_n (c_n, d_n)$, then $\sum_n \omega(F, [c_n, d_n]) < \infty$.

(iii) If E ∈ K[a, b] and Q ⊆ E is dense in E, then F ∈ AC∗(E) if and only if

F ∈ AC∗(Q).

(iv) If k ∈ R and F ∈ AC∗(K) then F + k ∈ AC∗(K).

(v) If k ≠ 0 then F ∈ AC∗(K) if and only if kF ∈ AC∗(K).

(vi) If K ∈ K[a, b] and F ∈ AC∗(K) then F ∈ AC∗(K ∪ {a} ∪ {b}).

(vii) If L ⊆ K then F ∈ AC∗(K) implies F ∈ AC∗(L).

(viii) If K ⊆ [c, d] ⊆ [a, b], F ∈ AC∗(K), G = F on [c, d] for G ∈ C([a, b]) then

G ∈ AC∗(K).

Proof. For (i), the proof splits into two directions. For the left-to-right direction of (i), since $F$ is continuous on the bounded interval $[a, b]$, given $J \in D$ we define the subinterval $I_J \subseteq J$ so that $|F(\max(I_J)) - F(\min(I_J))| = \omega(F, J)$. For the right-to-left direction of (i), simply note that $|F(\max(J)) - F(\min(J))| \leq \omega(F, J)$.

For (ii), simply apply the definition of $\mathrm{AC}_*(E)$ for $\varepsilon = 1$ to obtain a $\delta > 0$. Then there is $N > 0$ such that $\sum_{i=n}^{\infty} \mu([c_i, d_i]) < \delta$ for all $n \geq N$. Then $\sum_n \omega(F, [c_n, d_n]) = \sum_{n < N} \omega(F, [c_n, d_n]) + \sum_{n \geq N} \omega(F, [c_n, d_n])$, and this sum in turn is $\leq \sum_{n < N} \omega(F, [c_n, d_n]) + 1$.

For (iii), this elementary fact is stated without proof in Gordon [56] Theorem 6.2 (d) pp. 90-91, and a proof is included here merely for the sake of completeness. First, note that the left-to-right direction holds trivially, since any $Q$-edged sub-partition is automatically an $E$-edged sub-partition. For the right-to-left direction of (iii), suppose that $\varepsilon > 0$. Choose $\delta > 0$ such that for all $Q$-edged sub-partitions $D$ of $[a, b]$ we have that $\sum_{J \in D} \mu(J) < \delta$ implies that $\sum_{J \in D} \omega(F, J) < \frac{\varepsilon}{8}$. Let $D$ be an $E$-edged sub-partition of $[a, b]$ such that $\sum_{J \in D} \mu(J) < \delta$. Since $F \in C([a, b])$, choose $\eta > 0$ such that $|F(x) - F(y)| < \frac{\varepsilon}{8|D|}$ whenever $|x - y| < \eta$. Partition $D$ into two $E$-edged sub-partitions $D_0$ and $D_1$ such that $D_i$ has the property that no two intervals in it have common endpoints, i.e., list out the elements of $D$ in order and let $D_0$ be the even ones in the list and let $D_1$ be the odd ones. Clearly it suffices to show that $\sum_{J \in D_i} \omega(F, J) < \frac{\varepsilon}{2}$. For each $J \in D_i$, let $\mathrm{mid}(J)$ be its midpoint and choose non-overlapping $Q$-edged intervals $I_J$ so that

(a) $\min(I_J) \in (\min(J) - \eta, \mathrm{mid}(J)) \cap (\min(J) - \eta, \min(J) + \eta)$

(b) and $\max(I_J) \in (\mathrm{mid}(J), \max(J) + \eta) \cap (\max(J) - \eta, \max(J) + \eta)$

(c) and $\sum_{J \in D_i} \mu(I_J) < \delta$.

Then $\{I_J : J \in D_i\}$ is a $Q$-edged sub-partition of $[a, b]$ such that $\sum_{J \in D_i} \mu(I_J) < \delta$, and hence by choice of $\delta$ we have that $\sum_{J \in D_i} \omega(F, I_J) < \frac{\varepsilon}{8}$. It suffices to show that for all $J \in D_i$ we have $\omega(F, J) \leq \omega(F, I_J) + \frac{\varepsilon}{4|D|}$, since this implies

$$\sum_{J \in D_i} \omega(F, J) \leq \sum_{J \in D_i} \Big( \omega(F, I_J) + \frac{\varepsilon}{4|D|} \Big) = \Big( \sum_{J \in D_i} \omega(F, I_J) \Big) + \Big( \sum_{J \in D_i} \frac{\varepsilon}{4|D|} \Big) < \frac{\varepsilon}{8} + \frac{\varepsilon}{4} < \frac{\varepsilon}{2}.$$

So we let $J \in D_i$ and show that $\omega(F, J) \leq \omega(F, I_J) + \frac{\varepsilon}{4|D|}$. There are four cases, depending on how $J$ and $I_J$ relate:

(C1) $\min(I_J) \leq \min(J) < \mathrm{mid}(J) < \max(J) \leq \max(I_J)$.

(C2) $\min(J) < \min(I_J) < \mathrm{mid}(J) < \max(J) \leq \max(I_J)$.

(C3) $\min(I_J) \leq \min(J) < \mathrm{mid}(J) < \max(I_J) < \max(J)$.

(C4) $\min(J) < \min(I_J) < \mathrm{mid}(J) < \max(I_J) < \max(J)$.

In case C1, we have $I_J \supseteq J$ and hence $\omega(F, J) \leq \omega(F, I_J)$. In case C2, we can conclude that $\omega(F, J) \leq \omega(F, [\min(J), \min(I_J)]) + \omega(F, [\min(I_J), \max(J)]) \leq \frac{\varepsilon}{8|D|} + \omega(F, I_J)$. In case C3, $\omega(F, J) \leq \omega(F, [\min(J), \max(I_J)]) + \omega(F, [\max(I_J), \max(J)]) \leq \omega(F, I_J) + \frac{\varepsilon}{8|D|}$. In case C4, we have that $\omega(F, J) \leq \omega(F, [\min(J), \min(I_J)]) + \omega(F, [\min(I_J), \max(I_J)]) + \omega(F, [\max(I_J), \max(J)]) \leq \frac{\varepsilon}{8|D|} + \omega(F, I_J) + \frac{\varepsilon}{8|D|} \leq \omega(F, I_J) + \frac{\varepsilon}{4|D|}$. Hence, in all four cases, we are done, and so we have established the right-to-left direction of (iii).

For (iv), note that ω(F, J) = ω(F + k, J) since one has |F (x)− F (y)|=

|(F (x) + k)− (F (y) + k)|. Likewise, for (v), note that ω(kF, J) = |k|ω(F, J)

since |kF (x)− kF (y)| = |k| |F (x)− F (y)|.

For (vi), suppose that $K \in \mathcal{K}[a, b]$ and $F \in \mathrm{AC}_*(K)$. It suffices to show that $F \in \mathrm{AC}_*(K \cup \{a\})$, since the argument for $b$ is symmetric. Since $K$ is closed, if $a$ is a limit point of $K$, then $a$ is already in $K$, and we are done. Hence, we can suppose that there is some $\eta > 0$ such that $(a, a + \eta) \cap K = \emptyset$. For $\varepsilon > 0$, choose $\delta > 0$ corresponding to $F \in \mathrm{AC}_*(K)$ from Definition 121 (ii) such that $\delta < \eta$. Suppose that $D$ is a $(K \cup \{a\})$-edged sub-partition of $[a, b]$ such that $\sum_{J \in D} \mu(J) < \delta < \eta$. Since each $\mu(J) < \eta$, it cannot be the case that one of the endpoints of $J$ is $a$. Hence $D$ is a $K$-edged sub-partition of $[a, b]$, from which we conclude that $\sum_{J \in D} \omega(F, J) < \varepsilon$ by the hypothesis on $\delta > 0$. Hence, in fact $F \in \mathrm{AC}_*(K \cup \{a\})$.

For (vii), this follows immediately from the definitions. For (viii), simply note

that any K-edged sub-partition D is such that if J ∈ D then J ⊆ [c, d], and since

the values of F and G are the same on this interval, it follows that F ∈ AC∗(K)

implies G ∈ AC∗(K).

4.2.2 Basic Properties of the Denjoy Integral

Remark 127. The following is a version of the Fundamental Theorem of Calculus

for Lebesgue Integrals (cf. Folland [43] Theorem 3.35 p. 106).

Theorem 128. Suppose that f ∈ M [a, b] and F ∈ C[a, b] and F (a) = 0. Then

the following are equivalent:

(i) $f \in L^1[a, b]$ and $F(x) = \int_a^x f$

(ii) $F \in \mathrm{AC}([a, b])$ and $F' = f$ a.e.
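A concrete instance may help fix how the two clauses of Theorem 128 line up. The function below is an illustrative choice and does not appear in the source; it is picked because $f$ is unbounded yet Lebesgue integrable, so that the absolute continuity of $F$ is not simply a consequence of a Lipschitz bound.

```latex
% Illustrative instance of Theorem 128 on [a,b] = [0,1] (example not from the source).
\begin{align*}
  f(x) &= \frac{1}{2\sqrt{x}} \ \text{for } x \in (0,1], &
  \int_0^1 |f| &= \Big[\sqrt{x}\,\Big]_0^1 = 1 < \infty, \ \text{so } f \in L^1[0,1], \\
  F(x) &= \int_0^x \frac{dt}{2\sqrt{t}} = \sqrt{x}, &
  F'(x) &= \frac{1}{2\sqrt{x}} = f(x) \ \text{for all } x \in (0,1].
\end{align*}
% Clause (i) holds with F(0) = 0, so by the theorem F(x) = \sqrt{x} is absolutely
% continuous on [0,1], even though it is not Lipschitz near 0 since f is unbounded there.
```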

Remark 129. By Proposition 126 (i), it follows that F ∈ AC([a, b]) if and only

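if F ∈ AC∗([a, b]). Hence, the above theorem can be restated as follows:

Theorem 130. Suppose that f ∈ M [a, b] and F ∈ C[a, b] and F (a) = 0. Then the following are equivalent:

(i) $f \in L^1[a, b]$ and $F(x) = \int_a^x f$

(ii) $F \in \mathrm{AC}_*([a, b])$ and $F' = f$ a.e.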

Remark 131. Stated in this way, this theorem motivates the following definition

of the Denjoy integral (cf. Gordon [56] Definition 7.1 p. 108, Peng-Yee [121]

Definition 6.8 p. 30).

Definition 132. Suppose that f : [a, b] → R. Then f is Denjoy integrable or

f ∈ Den[a, b] if and only if there is F ∈ C([a, b]) ∩ ACG∗([a, b]) such that F ′ = f

a.e.

Remark 133. It can be shown that if F ∈ C([a, b]) ∩ ACG∗[a, b] is such that

F ′ = 0 a.e. then F is constant everywhere on [a, b] (cf. Gordon [56] p. 108

and Corollary 6.26 p. 104, Peng-Yee [121] Theorem 6.11 p. 30). From this it

follows that for each f ∈ M [a, b], there is at most one F ∈ C([a, b]) ∩ ACG∗[a, b]

such that F′ = f a.e. and F(a) = 0. Hence, when such a function exists, it is called the indefinite Denjoy integral of f, and we write $F(x) = \int_a^x f$. It then follows trivially from these definitions that we have the following analogue of Theorem 128:

Theorem 134. Suppose that f ∈ M [a, b] and F ∈ C[a, b] and F (a) = 0. Then the following are equivalent:

(i) $f \in \mathrm{Den}[a, b]$ and $F(x) = \int_a^x f$

(ii) $F \in \mathrm{ACG}_*([a, b])$ and $F' = f$ a.e.

Remark 135. This analogy between Theorems 130 and 134 may not be enough

to convince one that the Denjoy integral is in fact deserving of the name of the

integral. It turns out that the Denjoy integral is equivalent to the Henstock-

Kurzweil integral, which directly generalizes the notion of the Riemann integral:

Definition 136. Suppose that $f : [a, b] \to \mathbb{R}$ and suppose that $a \leq x \leq b$. Then $f$ on $[a, x]$ is Henstock-Kurzweil integrable with value $F(x)$ if for every $\varepsilon > 0$ there is a sequence of strictly positive values $\{\delta_t\}_{t \in [a, x]}$ such that $\left| \sum_{i=1}^{N} f(t_i)(b_i - a_i) - F(x) \right| < \varepsilon$ for all partitions $[a_1, b_1], \ldots, [a_N, b_N]$ of $[a, x]$ with $t_i - \delta_{t_i} < a_i \leq t_i \leq b_i < t_i + \delta_{t_i}$.

Remark 137. Hence, one sees immediately from this definition that every Rie-

mann integrable function f is Henstock-Kurzweil integrable with all the δt set

equal to a fixed constant δ > 0. It is easy to see the motivation for the Henstock-

Kurzweil integral by pursuing this analogy: for, the basic idea of the Riemann

integral is that given an error threshold ε > 0, there is a fixed width δ > 0 such

that so long as one takes boxes with width less than this fixed width, then the

estimates for the area under the curve in terms of these boxes will be within the

error threshold. Likewise, the idea of the Henstock-Kurzweil integral is that one

is allowed to vary the width-estimates along the domain of the integrable func-

tion, perhaps requiring greater precision on those areas of the domain where the

integrable function oscillates more frequently between large positive and nega-

tive values. This idea is very different from the motivating idea of the Denjoy

integral, which was defined to generalize the Fundamental Theorem of Calculus.

It is thus surprising that one can demonstrate that the Denjoy integral and the

Henstock-Kurzweil integral are one and the same:

Theorem 138. Suppose that f : [a, b] → R and F : [a, b] → R. Then the

following are equivalent:

(i) $f$ is Denjoy integrable with $F(x) = \int_a^x f$.

(ii) f is Henstock-Kurzweil integrable with value F (x).

Proof. See Gordon [56] Chapter 11, and in particular Theorems 11.3-11.4 pp. 171-

173. See also Peng-Yee [121] Theorem 6.12-6.13 pp. 31-32.
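To see how the varying widths $\delta_t$ of Definition 136 achieve more than a single fixed width, one can consider the indicator function of the rationals. This is a standard textbook illustration and is not discussed in the source; the enumeration $q_1, q_2, \ldots$ and the particular choice of $\delta_t$ below are assumptions of the sketch.

```latex
% Standard illustration (not from the source): f = \chi_{\mathbb{Q}} on [0,1] has
% Henstock-Kurzweil value F(1) = 0, although f is not Riemann integrable.
% Fix an enumeration q_1, q_2, \dots of \mathbb{Q} \cap [0,1] and \varepsilon > 0.
\begin{align*}
  \delta_t &=
    \begin{cases}
      \varepsilon / 2^{k+2} & \text{if } t = q_k, \\
      1 & \text{if } t \notin \mathbb{Q}.
    \end{cases}
\end{align*}
% For a tagged partition with t_i - \delta_{t_i} < a_i \le t_i \le b_i < t_i + \delta_{t_i}:
%  - irrational tags contribute f(t_i)(b_i - a_i) = 0;
%  - a rational tag q_k tags at most two intervals, each of length < 2\delta_{q_k}.
\begin{align*}
  \Big| \sum_{i=1}^{N} f(t_i)(b_i - a_i) - 0 \Big|
    \;<\; \sum_{k \ge 1} 2 \cdot 2\,\delta_{q_k}
    \;=\; \sum_{k \ge 1} \frac{4\,\varepsilon}{2^{k+2}}
    \;=\; \sum_{k \ge 1} \frac{\varepsilon}{2^{k}}
    \;=\; \varepsilon .
\end{align*}
```

No constant width could work here, since every interval contains both rational and irrational tags; the shrinking widths at the rationals are what force the sums below $\varepsilon$.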

Remark 139. There is a certain infelicity in the statement of the above theo-

rem, in that part (ii) of the theorem is stated in terms of the “values” from the

Definition 136 of the Henstock-Kurzweil integral. This was done merely for the

sake of not having to introduce subscripted integral signs. That is, one could

have stated the above theorem by subscripting the indefinite Henstock-Kurzweil

integral with “HK”, subscripting the indefinite Denjoy integral with “Den”, and

then stating in the above theorem that these two indefinite integrals are the same.

Such a formalism tends to obscure the main point: if one proceeds on the basis

of the definition of the Denjoy integral from Definition 132 and the definition of

the Henstock-Kurzweil integral from Definition 136, then one obtains one and the

same class of integrable functions and one and the same values for these integrals.

Given this equivalence, one can quickly enumerate several elementary properties of

the Denjoy integral, many of which are easily proven using the Henstock-Kurzweil

characterization:

Proposition 140.

(i) $L^1[a, b] \subseteq \mathrm{Den}[a, b]$ and the values of the integrals are the same.

(ii) $\mathrm{Derv}[a, b] \subseteq \mathrm{Den}[a, b]$ and $\int_a^b F' = F(b) - F(a)$.

(iii) $\mathrm{Den}[a, b] \subseteq M[a, b]$.

(iv) If $f \in \mathrm{Den}[a, b]$ then there are $K_n \in \mathcal{K}[a, b]$ with $[a, b] = \bigcup_n K_n$ and $f\chi_{K_n} \in L^1[a, b]$.

(v) If $f \in \mathrm{Den}[a, b]$ and $F(x) = \int_a^x f$ then $F \in C[a, b]$.

(vi) If $f \in \mathrm{Den}[a, b]$ and $F(x) = \int_a^x f$ then $F' = f$ a.e.

Proof. For (i) see Pfeffer [122] Proposition 4. For (ii) see Swartz [142] Theo-

rem 5 p. 6. For (iii) see Peng-Yee [121] Theorem 5.10 pp. 23-24. For (iv) see

Gordon [56] Theorem 9.18 pp. 148-149. For (v), see either Definition 132 and

the subsequent remark, or Swartz [142] Corollary 2 p. 25. For (vi) see either

Definition 132 and the subsequent remark, or Swartz [142] Theorem 2 p. 135.
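Clauses (i) and (ii) of Proposition 140 together suggest where to look for a Denjoy integrable function that is not Lebesgue integrable: among derivatives. The following classical example is not taken from the source, and it assumes that Derv[a, b] denotes the class of derivatives of functions differentiable at every point of [a, b], a reading which this section does not spell out.

```latex
% Classical example (not from the source) of a Denjoy integrable function on [0,1]
% that is not Lebesgue integrable.
\begin{align*}
  F(x) &= x^{2} \sin\!\big(x^{-2}\big) \ \text{for } x \in (0,1], \qquad F(0) = 0, \\
  F'(x) &= 2x \sin\!\big(x^{-2}\big) - \frac{2}{x} \cos\!\big(x^{-2}\big) \ \text{for } x \in (0,1], \\
  F'(0) &= \lim_{h \to 0^{+}} \frac{h^{2} \sin(h^{-2})}{h} = \lim_{h \to 0^{+}} h \sin\!\big(h^{-2}\big) = 0 .
\end{align*}
% F is differentiable at every point of [0,1], so f = F' is a derivative and
% clause (ii) of the proposition gives
\begin{align*}
  \int_0^1 f = F(1) - F(0) = \sin(1).
\end{align*}
% However \int_0^1 |f| = \infty because of the (2/x)\cos(x^{-2}) term, so f \notin L^1[0,1]:
% the inclusion in clause (i) is proper.
```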

4.2.3 Lebesgue’s Lemma and the Subspaces

Definition 141. A subset X ⊆ M [a, b] is called subinterval-closed if f ∈ X and

(c, d) ⊆ (a, b) implies fχ(c,d) ∈ X .

Remark 142. Note that Den[a, b] is subinterval-closed. See, for example, Swartz

[142] Theorem 7 p. 16.

Lemma 143. (Improper Integrals Lemma) Suppose $f \in M[a, b]$. If $f\chi_{[c, b]} \in \mathrm{Den}[a, b]$ for every $c \in (a, b)$, then $f \in \mathrm{Den}[a, b]$ with $\int_a^b f = L$ if and only if $\lim_{c \searrow a^+} \int_c^b f$ exists and is equal to $L$. Likewise, if $f\chi_{[a, c]} \in \mathrm{Den}[a, b]$ for every $c \in (a, b)$, then $f \in \mathrm{Den}[a, b]$ with $\int_a^b f = L$ if and only if $\lim_{c \nearrow b^-} \int_a^c f$ exists and is equal to $L$.

Proof. Cf. Swartz [142] Chapter 3 Theorem 4 pp. 25-26.
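As a sketch of how the Improper Integrals Lemma is applied, consider a function with a single oscillating singularity at the left endpoint. The function is an illustrative choice and is not from the source; the value of the limit uses the classical fact that $\int_0^{\infty} \frac{\sin u}{u}\,du = \frac{\pi}{2}$.

```latex
% Illustrative application of the Improper Integrals Lemma on [a,b] = [0,1]
% (example not from the source).
\begin{align*}
  f(x) &= \frac{1}{x} \sin\!\Big(\frac{1}{x}\Big) \ \text{for } x \in (0,1],
  \qquad f\chi_{[c,1]} \in L^{1}[0,1] \subseteq \mathrm{Den}[0,1] \ \text{for every } c \in (0,1), \\
  \int_c^1 f &= \int_1^{1/c} \frac{\sin u}{u}\, du \qquad (u = 1/x), \\
  \lim_{c \searrow 0^{+}} \int_c^1 f &= \int_1^{\infty} \frac{\sin u}{u}\, du
     = \frac{\pi}{2} - \int_0^1 \frac{\sin u}{u}\, du .
\end{align*}
% The limit exists, so f \in Den[0,1] with \int_0^1 f equal to this limit.
% By the same substitution, \int_0^1 |f| = \int_1^{\infty} |\sin u| / u \, du = \infty,
% so f is not Lebesgue integrable on [0,1].
```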

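Definition 144. Suppose that $\mathcal{X} \subseteq \mathrm{Den}[a, b]$. Then $f \in \mathrm{Den}[a, b]$ is an improper integral of $\mathcal{X}$ if there is a countable sequence $(a_n, b_n) \subseteq (a, b)$ such that (i) $(a, b) = \bigcup_n (a_n, b_n)$ and $(a_n, b_n) \subseteq (a_{n+1}, b_{n+1})$, and (ii) $f\chi_{(a_n, b_n)} \in \mathcal{X}$, and (iii) $\lim_{c \searrow a^+} \int_c^{b_1} f$ exists, and (iv) $\lim_{c \nearrow b^-} \int_{a_1}^c f$ exists. Further, let $\mathrm{Lim}(\mathcal{X})$ be the set of improper integrals of $\mathcal{X}$.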

Proposition 145. If $\mathcal{X}$ is a subset of $\mathrm{Den}[a, b]$ which is subinterval-closed and which is closed under scalar multiplication, then $\mathrm{Lim}(\mathcal{X})$ is a subinterval-closed subset of $\mathrm{Den}[a, b]$ which contains $\mathcal{X}$ and which is closed under scalar multiplication. Further, if $\mathcal{X}$ is closed under addition, then so is $\mathrm{Lim}(\mathcal{X})$.

Proof. First we show that $\mathrm{Lim}(\mathcal{X})$ contains $\mathcal{X}$. Since $\mathcal{X}$ is a subset of $\mathrm{Den}[a, b]$, if $f \in \mathcal{X}$ then the left-to-right direction of the Improper Integrals Lemma implies that $\lim_{c \searrow a^+} \int_c^{b_1} f$ and $\lim_{c \nearrow b^-} \int_{a_1}^c f$ exist for any countable sequence $(a_n, b_n) \subseteq (a, b)$ such that $(a, b) = \bigcup_n (a_n, b_n)$ and $(a_n, b_n) \subseteq (a_{n+1}, b_{n+1})$. Hence, by choosing any such sequence as a witness, and by using the fact that $\mathcal{X}$ is subinterval-closed, we have that $\mathrm{Lim}(\mathcal{X})$ contains $\mathcal{X}$.

Second we note that $\mathrm{Lim}(\mathcal{X})$ is closed under scalar multiplication. So suppose that $f$ is in $\mathrm{Lim}(\mathcal{X})$ with the associated sequence of intervals $(a_n, b_n)$ and suppose that $s \in \mathbb{R}$ is the scalar multiple. Since $f\chi_{(a_n, b_n)} \in \mathcal{X}$ and since $\mathcal{X}$ is by hypothesis closed under scalar multiplication, it follows that $sf\chi_{(a_n, b_n)} \in \mathcal{X}$, and $\lim_{c \searrow a^+} \int_c^{b_1} sf = s \cdot \lim_{c \searrow a^+} \int_c^{b_1} f$ exists, and $\lim_{c \nearrow b^-} \int_{a_1}^c sf = s \cdot \lim_{c \nearrow b^-} \int_{a_1}^c f$ exists. Hence, $\mathrm{Lim}(\mathcal{X})$ is closed under scalar multiplication.

Third we show that it is subinterval-closed. So suppose that $f$ is in $\mathrm{Lim}(\mathcal{X})$ with associated sequence of intervals $(a_n, b_n)$. Let $(u, v) \subseteq (a, b)$. Since $\mathcal{X}$ is subinterval-closed, it follows that $(f\chi_{(u, v)})\chi_{(a_n, b_n)} = (f\chi_{(a_n, b_n)})\chi_{(u, v)}$ is in $\mathcal{X}$. Since $\mathrm{Den}[a, b]$ is subinterval-closed and $\mathcal{X}$ is a subset of $\mathrm{Den}[a, b]$, we further have that $f\chi_{(u, v)}$ is in $\mathrm{Den}[a, b]$. Applying the left-to-right direction of the Improper Integrals Lemma, we then have that $\lim_{c \searrow a^+} \int_c^{b_1} f\chi_{(u, v)}$ and $\lim_{c \nearrow b^-} \int_{a_1}^c f\chi_{(u, v)}$ exist. Hence, it follows that $f\chi_{(u, v)}$ is in $\mathrm{Lim}(\mathcal{X})$, so that $\mathrm{Lim}(\mathcal{X})$ is subinterval-closed.

Finally, supposing that $\mathcal{X}$ is closed under addition, we show that $\mathrm{Lim}(\mathcal{X})$ is closed under addition. So suppose that $f, g$ are in $\mathrm{Lim}(\mathcal{X})$ with associated sequences of intervals $(a_n, b_n)$ and $(c_n, d_n)$. It must be shown that $f + g$ is in $\mathrm{Lim}(\mathcal{X})$. Choose $N > 0$ such that $(a_N, b_N) \cap (c_N, d_N) \neq \emptyset$, and define a sequence of intervals by $(u_n, v_n) = (a_{N+n}, b_{N+n}) \cap (c_{N+n}, d_{N+n})$. Since $\mathcal{X}$ is subinterval-closed and contains $f\chi_{(a_n, b_n)}$ and $g\chi_{(c_n, d_n)}$, it likewise contains $f\chi_{(u_n, v_n)}$ and $g\chi_{(u_n, v_n)}$. Since $\mathcal{X}$ is closed under addition, it contains $(f + g)\chi_{(u_n, v_n)}$. Since $\mathcal{X}$ is a subset of $\mathrm{Den}[a, b]$, the left-to-right direction of the Improper Integrals Lemma implies that each of $\lim_{c \searrow a^+} \int_c^{v_1} f$ and $\lim_{c \nearrow b^-} \int_{u_1}^c f$ and $\lim_{c \searrow a^+} \int_c^{v_1} g$ and $\lim_{c \nearrow b^-} \int_{u_1}^c g$ exist, which in turn implies that $\lim_{c \searrow a^+} \int_c^{v_1} (f + g)$ and $\lim_{c \nearrow b^-} \int_{u_1}^c (f + g)$ exist. Hence, in fact $\mathrm{Lim}(\mathcal{X})$ is closed under addition whenever $\mathcal{X}$ is closed under addition.

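Lemma 146. (Lebesgue’s Lemma, first version) Suppose that $f \in M[a, b]$ and $K \in \mathcal{K}[a, b]$ and $(a, b) - K = \bigsqcup_{n=1}^{\infty} (c_n, d_n)$. Further suppose that $f\chi_K \in L^1[a, b]$, $f\chi_{[c_n, d_n]} \in \mathrm{Den}[a, b]$ and $\sum_{n=1}^{\infty} \omega(\int_{c_n}^x f, [c_n, d_n]) < \infty$. Then $f \in \mathrm{Den}[a, b]$ and $\int_a^b f = \int_K f + \sum_{n=1}^{\infty} \int_{c_n}^{d_n} f$.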

Proof. Cf. Pfeffer [122] Lemma 8, Peng-Yee [121] Theorem 7.1 and Corollary 7.11

and Gordon [56] Theorem 9.22 pp. 151-152.

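Definition 147. Suppose that $\mathcal{X} \subseteq \mathrm{Den}[a, b]$. Then $f \in \mathrm{Den}[a, b]$ is given by the first version of Lebesgue’s Lemma from $\mathcal{X}$ if there is a $K \in \mathcal{K}[a, b]$ with $(a, b) - K = \bigsqcup_{n=1}^{\infty} (c_n, d_n)$ such that $f\chi_K \in L^1[a, b]$ and $f\chi_{(c_n, d_n)} \in \mathcal{X}$, and $\sum_{n=1}^{\infty} \omega(\int_{c_n}^x f, [c_n, d_n]) < \infty$. Further, let $\mathrm{Leb}(\mathcal{X})$ be the set of elements which are given by the first version of Lebesgue’s Lemma from $\mathcal{X}$.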


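Proposition 148. If $\mathcal{X}$ is a subset of $\mathrm{Den}[a, b]$ which is subinterval-closed and which is closed under scalar multiplication, then $\mathrm{Leb}(\mathcal{X})$ is a subinterval-closed subset of $\mathrm{Den}[a, b]$ which contains $\mathcal{X}$ and which is closed under scalar multiplication.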

Proof. First we note that Leb(X ) contains X . Since X is a subset of Den[a, b], we

can choose the closed set K = ∅ to witness that any element f ∈ X is contained in

Leb(X ). Hence Leb(X ) contains X . Second we note that Leb(X ) is closed under

scalar multiplication, simply because if f is in Leb(X ) via the closed set K, then

the scalar multiple kf is in Leb(X ) via the closed set K.

Finally we note that $\mathrm{Leb}(\mathcal{X})$ is subinterval-closed. So suppose that $f$ is in $\mathrm{Leb}(\mathcal{X})$ via the closed set $K$, and suppose that $(c, d) \subseteq (a, b)$ is the sub-interval. Now, note that since $(a, b) - K = \bigcup_n (c_n, d_n)$ it follows that

$$(a, b) - (K \cap [c, d]) = (a, c) \cup (d, b) \cup \bigcup_n (c_n, d_n) \tag{4.1}$$

Without loss of generality, we may assume that none of the (cn, dn) are subsets

of (a, c) or subsets of (d, b), since otherwise these (cn, dn) can be omitted without

affecting the equation (4.1). There are now four cases, depending on whether

(a, c) and (d, b) intersect any of the (cn, dn).

First suppose that there are no intersections, so that

$$(a, b) - (K \cap [c, d]) = (a, c) \sqcup (d, b) \sqcup \bigsqcup_n (c_n, d_n) \tag{4.2}$$

Then fχ(c,d)χ(cn,dn) ∈ X since fχ(cn,dn) ∈ X by hypothesis and X is sub-interval

closed. Likewise fχ(c,d)χ(a,c), fχ(c,d)χ(d,b) ∈ X since these functions are equal to

zero and since X is closed under scalar multiplication. Finally, we have that

fχ(c,d)χK ∈ L1[a, b] since fχK ∈ L1[a, b] by hypothesis. Hence, putting all these

elements together, one has fχ(c,d) ∈ Leb(X ).

Second suppose that $(a, c)$ intersects $(c_\ell, d_\ell)$ but that $(d, b)$ does not intersect any of the $(c_n, d_n)$. Then $a \leq c_\ell < c < d_\ell \leq d$ and

$$(a, b) - (K \cap [c, d]) = (a, d_\ell) \sqcup (d, b) \sqcup \bigsqcup_{n \neq \ell} (c_n, d_n) \tag{4.3}$$

Then for $n \neq \ell$ we have $f\chi_{(c, d)}\chi_{(c_n, d_n)} \in \mathcal{X}$ since $f\chi_{(c_n, d_n)} \in \mathcal{X}$ by hypothesis and $\mathcal{X}$ is subinterval-closed. Similarly, $f\chi_{(c, d)}\chi_{(a, d_\ell)} = f\chi_{(c, d_\ell)} \in \mathcal{X}$ since $f\chi_{(c_\ell, d_\ell)} \in \mathcal{X}$ by hypothesis and $\mathcal{X}$ is subinterval-closed and $c_\ell < c < d_\ell$ by our case hypothesis. Likewise $f\chi_{(c, d)}\chi_{(d, b)} \in \mathcal{X}$ since this function is equal to zero and since $\mathcal{X}$ is closed under scalar multiplication. Finally, we have that $f\chi_{(c, d)}\chi_K \in L^1[a, b]$ since $f\chi_K \in L^1[a, b]$ by hypothesis. Hence, putting all these elements together, we see that $f\chi_{(c, d)} \in \mathrm{Leb}(\mathcal{X})$.

The proofs of the remaining two cases are similar to this second case.

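Lemma 149. (Lebesgue’s Lemma, second version) Suppose that $f \in M[a, b]$ and $K \in \mathcal{K}[a, b]$ and $(a, b) - K = \bigsqcup_{n=1}^{\infty} (c_n, d_n)$. Further suppose that $f\chi_K \in L^1[a, b]$ and $f\chi_{[c_n, d_n]} \in \mathrm{Den}[a, b]$ and that there is $F \in \mathrm{AC}_*(K)$ such that $F(x) - F(c_n) = \int_{c_n}^x f$ on $[c_n, d_n]$. Then $f \in \mathrm{Den}[a, b]$ and $\int_a^b f = \int_K f + \sum_{n=1}^{\infty} \int_{c_n}^{d_n} f$.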

Proof. This follows from the first version by Proposition 126 (ii).

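Definition 150. Suppose that $\mathcal{X} \subseteq \mathrm{Den}[a, b]$. Then $f \in \mathrm{Den}[a, b]$ is given by the second version of Lebesgue’s Lemma from $\mathcal{X}$ if there is a $K \in \mathcal{K}[a, b]$ with $(a, b) - K = \bigsqcup_{n=1}^{\infty} (c_n, d_n)$ such that $f\chi_K \in L^1[a, b]$ and $f\chi_{(c_n, d_n)} \in \mathcal{X}$, and $F \in \mathrm{AC}_*(K)$ where $F(x) = \int_a^x f$. Further, let $\mathrm{Leb}^*(\mathcal{X})$ be the set of elements which are given by the second version of Lebesgue’s Lemma from $\mathcal{X}$.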


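Proposition 151. If $\mathcal{X}$ is a subset of $\mathrm{Den}[a, b]$ which is subinterval-closed and which is closed under scalar multiplication, then $\mathrm{Leb}^*(\mathcal{X})$ is a subinterval-closed subset of $\mathrm{Den}[a, b]$ which contains $\mathcal{X}$ and which is closed under scalar multiplication.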

Proof. The argument that Leb∗(X ) contains X and that Leb∗(X ) is closed under

scalar multiplication is exactly identical to the argument from Proposition 148.

For the argument that $\mathrm{Leb}^*(\mathcal{X})$ is subinterval-closed, suppose that $f$ is in $\mathrm{Leb}^*(\mathcal{X})$ via the closed set $K$, and suppose that $(c, d) \subseteq (a, b)$ is the sub-interval. Let $F(x) = \int_a^x f$ so that $F \in \mathrm{AC}_*(K)$ and let $G(x) = \int_a^x f\chi_{(c, d)}$. Note that on $[c, d]$ we have $F(x) = \int_a^x f = \int_a^x f\chi_{(c, d)} + \int_a^c f = G(x) + \int_a^c f$, so that $G$ and $F$ differ by the constant $\int_a^c f$ on $[c, d]$. Since $F \in \mathrm{AC}_*(K)$ implies $F \in \mathrm{AC}_*(K \cap [c, d])$ (cf. Proposition 126 (vii)), and since $F$ and $G$ differ by a constant on $[c, d]$, it follows from Proposition 126 (iv) and (viii) that $G \in \mathrm{AC}_*(K \cap [c, d])$. The argument then proceeds exactly as in the proof of Proposition 148.

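Definition 152. $\mathrm{Den}_0[a, b] = L^1[a, b]$ and $\mathrm{Den}_\alpha[a, b] = \mathrm{Leb}(\mathrm{Lim}(\bigcup_{\beta < \alpha} \mathrm{Den}_\beta[a, b]))$ when $\alpha > 0$.

Definition 153. $\mathrm{Den}^*_0[a, b] = L^1[a, b]$ and $\mathrm{Den}^*_\alpha[a, b] = \mathrm{Leb}^*(\mathrm{Lim}(\bigcup_{\beta < \alpha} \mathrm{Den}^*_\beta[a, b]))$ when $\alpha > 0$.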

Remark 154. The sets Denα[a, b] are very standard: they are

the usual way of understanding the Denjoy totalization process (cf. Gordon [56]

pp. 117 ff). However, it seems as though the sets Den∗α[a, b] are more amenable

to descriptive set theory analysis (cf. Theorem 172, Corollary 177, Corollary 196),

and hence in what follows we typically work with Den∗α[a, b] as opposed to Denα[a, b].
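To illustrate the first step of the hierarchy in Definition 152, the following sketch places one concrete function in the level with index 1 but not in the level with index 0. The function and the witnessing intervals are illustrative assumptions and are not from the source; that $f$ is Denjoy integrable in the first place follows from the Improper Integrals Lemma (Lemma 143), since $\lim_{c \searrow 0^+} \int_c^1 f$ exists.

```latex
% Illustrative sketch (not from the source): f(x) = (1/x) \sin(1/x) on [0,1]
% lies in Den_1[0,1] but not in Den_0[0,1] = L^1[0,1].
\begin{align*}
  &(a_n, b_n) = \Big( \tfrac{1}{n+2},\ 1 - \tfrac{1}{n+2} \Big), \qquad
    (a_n, b_n) \subseteq (a_{n+1}, b_{n+1}), \qquad \bigcup_{n} (a_n, b_n) = (0,1), \\
  &|f| \le n+2 \ \text{on } (a_n, b_n), \ \text{so } f\chi_{(a_n,b_n)} \in L^{1}[0,1], \\
  &\lim_{c \searrow 0^{+}} \int_c^{b_1} f \ \text{exists, since } \int_1^{\infty} \frac{\sin u}{u}\,du
     \ \text{converges (substitute } u = 1/x\text{)}, \\
  &\lim_{c \nearrow 1^{-}} \int_{a_1}^{c} f \ \text{exists, since } f \text{ is bounded near } 1 .
\end{align*}
% Hence f \in Lim(L^1[0,1]). Since Leb(X) contains X for the X covered by
% Proposition 148, f \in Leb(Lim(Den_0[0,1])) = Den_1[0,1].
% On the other hand \int_0^1 |f| = \int_1^\infty |\sin u|/u \, du = \infty, so f \notin Den_0[0,1].
```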

Proposition 155. For all α ≥ 0, it is the case that Denα[a, b] (resp. Den∗α[a, b])

is (i) a subinterval-closed subset of Den[a, b], and (ii) contains Denβ[a, b] (resp.

Den∗β[a, b]) for β < α, and (iii) is closed under scalar multiplication. Further, for

all α ≥ 0, it is the case that (iv) Denα[a, b] contains Den∗α[a, b].

Proof. By induction on α, using Proposition 145, Proposition 148, and Proposi-

tion 151 and the fact that L1[a, b] is sub-interval closed and is closed under scalar

multiplication. For (iv), this follows from the fact that Leb∗(X ) is a subset of

Leb(X ) by Proposition 126 (ii).

Definition 156. Suppose that X ⊆M [a, b]. Then let 〈X 〉 ⊆M [a, b] be the vector

subspace of M [a, b] generated by X .

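Proposition 157. For all $\alpha \geq 0$, it is the case that $\langle \mathrm{Den}_\alpha[a, b] \rangle$ (resp. $\langle \mathrm{Den}^*_\alpha[a, b] \rangle$) is (i) a subinterval-closed vector subspace of $\mathrm{Den}[a, b]$, and (ii) is equal to the set of $\sum_{i=1}^n f_i$ for $f_i \in \mathrm{Den}_\alpha[a, b]$ (resp. the set of $\sum_{i=1}^n f_i$ for $f_i \in \mathrm{Den}^*_\alpha[a, b]$). Further, for all $\alpha \geq 0$, it is the case that (iii) $\langle \mathrm{Den}_\alpha[a, b] \rangle$ contains $\langle \mathrm{Den}^*_\alpha[a, b] \rangle$ (cf. Figure 4.1).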

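Proof. The proof is parallel between $\langle \mathrm{Den}_\alpha[a, b] \rangle$ and $\langle \mathrm{Den}^*_\alpha[a, b] \rangle$, and so we give the proof for $\langle \mathrm{Den}^*_\alpha[a, b] \rangle$. For (ii), note that formally $\langle \mathrm{Den}^*_\alpha[a, b] \rangle$ is equal to the set of $\sum_{i=1}^n k_i f_i$ for $k_i \in \mathbb{R}$ and $f_i \in \mathrm{Den}^*_\alpha[a, b]$. But since $\mathrm{Den}^*_\alpha[a, b]$ is closed under scalar multiplication by Proposition 155, it follows that $k_i f_i \in \mathrm{Den}^*_\alpha[a, b]$. For (i), it must be shown that if $f \in \langle \mathrm{Den}^*_\alpha[a, b] \rangle$ then $f\chi_{(c, d)} \in \langle \mathrm{Den}^*_\alpha[a, b] \rangle$. So by (ii), suppose that $f = \sum_{i=1}^n f_i$ for $f_i \in \mathrm{Den}^*_\alpha[a, b]$. Since $\mathrm{Den}^*_\alpha[a, b]$ is subinterval-closed by Proposition 155, it follows that $f_i\chi_{(c, d)} \in \mathrm{Den}^*_\alpha[a, b]$, from which it follows that $f\chi_{(c, d)} = \sum_{i=1}^n f_i\chi_{(c, d)} \in \langle \mathrm{Den}^*_\alpha[a, b] \rangle$. Finally, part (iii) is a trivial consequence of Proposition 155 (iv).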

4.3 Descriptive Set Theory

The goal of this section is to prove that ACG∗[a, b] is coanalytic but not

analytic (cf. Corollary 197). Using the same methods, it is also shown that the

operation of indefinite Denjoy integration is coanalytic but not analytic. In par-

ticular, it is shown that the relation “f is Denjoy integrable and F is equal to its

indefinite integral” is a co-analytic but not analytic relation on the product space

M [a, b]×C[a, b], where M [a, b] is the Polish space of real-valued measurable func-

tions on [a, b] and where C[a, b] is again the Polish space of real-valued continuous

functions on [a, b] (cf. Corollary 195 and Figure 4.1). The proofs here proceed

by identifying three derivatives in § 4.3.1 which measure the extent to which a

measurable function f and a continuous function F deviate from satisfying the

Fundamental Theorem of Calculus for the Lebesgue Integral (cf. Theorem 128).

Likewise, associated with these derivatives is an ordinal-valued rank, and in § 4.3.1

it is shown that there are functions of arbitrarily high countable rank (cf. Corol-

lary 166). Then, in § 4.3.2, it is shown that these ranks are correlated with the

entry into the subsets Den∗α[a, b] (cf. Theorem 172). Finally, in § 4.3.3, it is shown

that the derivatives are Borel (cf. Corollary 192), which allows us to apply an

important theorem linking the vanishing of Borel derivatives and coanalyticity

(cf. Kechris [84] Theorem 34.10 and Exercise 34.13).

4.3.1 Three Derivatives and Functions of Arbitrarily High Rank

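Definition 158. Let $\mathcal{K}[a, b]$ be the set of closed subsets of $[a, b]$. Suppose that $\mathcal{B} \subseteq \mathcal{K}[a, b]$ is closed under subsets, i.e., if $K \in \mathcal{B}$ and $L \in \mathcal{K}[a, b]$ with $L \subseteq K$ then $L \in \mathcal{B}$. Let $K \in \mathcal{B}_\sigma$ if $K$ is the countable union of elements from $\mathcal{B}$. Define a map $D_{\mathcal{B}} : \mathcal{K}[a, b] \to \mathcal{K}[a, b]$ by

$$D_{\mathcal{B}}(K) = \{x \in K : \overline{U \cap K} \notin \mathcal{B} \text{ for any open } U \ni x\} \tag{4.4}$$

Define maps $D^\alpha_{\mathcal{B}} : \mathcal{K}[a, b] \to \mathcal{K}[a, b]$ inductively by

$$D^0_{\mathcal{B}}(K) = K, \qquad D^{\alpha+1}_{\mathcal{B}}(K) = D_{\mathcal{B}}(D^\alpha_{\mathcal{B}}(K)), \qquad D^\alpha_{\mathcal{B}}(K) = \bigcap_{\beta < \alpha} D^\beta_{\mathcal{B}}(K) \text{ when } \alpha \text{ limit} \tag{4.5}$$

Let $|K|_{\mathcal{B}}$ be the least $\alpha$ such that $D^\alpha_{\mathcal{B}}(K) = D^{\alpha+1}_{\mathcal{B}}(K)$ and let $D^\infty_{\mathcal{B}}(K) = D^{|K|_{\mathcal{B}}}_{\mathcal{B}}(K)$.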
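A small instance may help in reading Definition 158. Taking the family of finite sets recovers the familiar operation of discarding isolated points, and the iteration can be computed by hand. This example is illustrative and not from the source; the family $\mathcal{B}$ and the set $K$ are assumptions chosen for simplicity.

```latex
% Illustrative instance of the derivative D_B (not from the source).
% Take [a,b] = [0,1], let B be the family of finite subsets of [0,1]
% (closed under subsets), and let K = \{0\} \cup \{1/n : n \ge 1\}.
\begin{align*}
  D^{0}_{\mathcal{B}}(K) &= K, \\
  D^{1}_{\mathcal{B}}(K) &= \{0\}
     && \text{(each } 1/n \text{ has a neighbourhood } U \text{ with } U \cap K = \{1/n\} \in \mathcal{B}; \\
  &  && \ \text{every neighbourhood of } 0 \text{ meets } K \text{ in an infinite set)}, \\
  D^{2}_{\mathcal{B}}(K) &= D_{\mathcal{B}}(\{0\}) = \emptyset
     && \text{(} U \cap \{0\} = \{0\} \in \mathcal{B} \text{)}, \\
  D^{3}_{\mathcal{B}}(K) &= \emptyset .
\end{align*}
% The least \alpha with D^\alpha_B(K) = D^{\alpha+1}_B(K) is \alpha = 2, so |K|_B = 2 and
% D^\infty_B(K) = \emptyset. This agrees with K being a countable union of finite sets,
% i.e. K \in B_\sigma.
```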

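Proposition 159. Suppose that $\mathcal{B} \subseteq \mathcal{K}[a, b]$ is closed under subsets. Then

(i) If $L \subseteq K$ then $D^\alpha_{\mathcal{B}}(L) \subseteq D^\alpha_{\mathcal{B}}(K)$

(ii) $D^\alpha_{\mathcal{B}}(K) \cap U \subseteq D^\alpha_{\mathcal{B}}(\overline{K \cap U})$ for any open $U$

(iii) $|K|_{\mathcal{B}} < \omega_1$

(iv) $D_{\mathcal{B}}(K) = \{x \in K : \overline{(p, q) \cap K} \notin \mathcal{B} \text{ for any rational } p, q \in \mathbb{Q} \text{ with } (p, q) \ni x\}$

(v) $D^\infty_{\mathcal{B}}(K) = \emptyset$ if and only if $K \in \mathcal{B}_\sigma$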

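Proof. For (i) and (iii)-(v), see Kechris [84] pp. 271-272. For (ii), suppose that $U$ is open. Suppose that $\alpha = 0$. Then $D^0_{\mathcal{B}}(K) \cap U = K \cap U \subseteq \overline{K \cap U} = D^0_{\mathcal{B}}(\overline{K \cap U})$. Suppose that $\alpha = \beta + 1$. Suppose that $x \in D^\alpha_{\mathcal{B}}(K) \cap U$ but $x \notin D^\alpha_{\mathcal{B}}(\overline{K \cap U})$. Then there is open $V \ni x$ such that $\overline{V \cap D^\beta_{\mathcal{B}}(\overline{K \cap U})} \in \mathcal{B}$. Since $x \in U \cap V \cap D^\beta_{\mathcal{B}}(K) \subseteq V \cap D^\beta_{\mathcal{B}}(\overline{K \cap U}) \subseteq \overline{V \cap D^\beta_{\mathcal{B}}(\overline{K \cap U})}$ by the induction hypothesis, we have that $\overline{U \cap V \cap D^\beta_{\mathcal{B}}(K)} \subseteq \overline{V \cap D^\beta_{\mathcal{B}}(\overline{K \cap U})} \in \mathcal{B}$, and since $\mathcal{B}$ is closed under subsets we have that $\overline{U \cap V \cap D^\beta_{\mathcal{B}}(K)} \in \mathcal{B}$. But since $U \cap V \ni x$ this contradicts that $x \in D^\alpha_{\mathcal{B}}(K)$. Suppose that $\alpha$ is a limit. Then $D^\alpha_{\mathcal{B}}(K) \cap U = \bigcap_{\beta < \alpha} (D^\beta_{\mathcal{B}}(K) \cap U) \subseteq \bigcap_{\beta < \alpha} D^\beta_{\mathcal{B}}(\overline{K \cap U}) = D^\alpha_{\mathcal{B}}(\overline{K \cap U})$.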

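Definition 160. For each $f \in M[a, b]$ define

$$\mathcal{B}_f = \{K \in \mathcal{K}[a, b] : f\chi_K \in L^1[a, b]\} \tag{4.6}$$

For each $F \in C[a, b]$ define

$$\mathcal{B}_F = \{K \in \mathcal{K}[a, b] : F \in \mathrm{AC}_*(K)\} \tag{4.7}$$

For each pair $f \in M[a, b]$ and $F \in C[a, b]$ define

$$\mathcal{B}_{f,F} = \mathcal{B}_f \cap \mathcal{B}_F = \{K \in \mathcal{K}[a, b] : f\chi_K \in L^1[a, b] \text{ and } F \in \mathrm{AC}_*(K)\} \tag{4.8}$$

Since $\mathcal{B}_f$, $\mathcal{B}_F$, and $\mathcal{B}_{f,F}$ are closed under subsets (cf. Proposition 126 (vii)), define $D_f(K) = D_{\mathcal{B}_f}(K)$, $D_F(K) = D_{\mathcal{B}_F}(K)$, and $D_{f,F}(K) = D_{\mathcal{B}_{f,F}}(K)$ and define $D^\alpha_f(K)$, $D^\alpha_F(K)$, and $D^\alpha_{f,F}(K)$ as in Definition 158. Likewise, define $|K|_f = |K|_{\mathcal{B}_f}$, $|K|_F = |K|_{\mathcal{B}_F}$ and $|K|_{f,F} = |K|_{\mathcal{B}_{f,F}}$. Finally define $|f| = |[a, b]|_f$, $|F| = |[a, b]|_F$ and $|f, F| = |[a, b]|_{f,F}$.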

Remark 161. For ease of future reference, it is helpful to unpack some of the previous definition. So suppose that $f \in M[a, b]$ and $F \in C[a, b]$. Then it follows from Definition 160 and Equation 4.4 of Definition 158 that

$$D_f(K) = \{x \in K : f\chi_{\overline{U \cap K}} \notin L^1[a, b] \text{ for any open } U \ni x\} \tag{4.9}$$

$$D_F(K) = \{x \in K : F \notin \mathrm{AC}_*(\overline{U \cap K}) \text{ for any open } U \ni x\} \tag{4.10}$$

$$D_{f,F}(K) = \{x \in K : [f\chi_{\overline{U \cap K}} \notin L^1[a, b] \text{ or } F \notin \mathrm{AC}_*(\overline{U \cap K})] \text{ for any open } U \ni x\} \tag{4.11}$$

That is, $D_f(K)$ is the set of points of $K$ where $f$ is not locally Lebesgue integrable, while $D_F(K)$ is the set of points of $K$ where $F$ is not locally absolutely continuous in the restricted sense. Comparing this to the Fundamental Theorem of Calculus for Lebesgue Integrals (cf. Theorem 128), one sees that these derivatives record the points at which the Fundamental Theorem locally fails for a measurable function $f$ and a continuous function $F$.

After enumerating some of the elementary properties of these derivatives in the next proposition, the goal in this section is to prove two facts about these derivatives. First, that these derivatives vanish for each Denjoy integrable function and its indefinite integral (cf. Proposition 163), and second that there are Denjoy integrable functions whose indefinite integrals have derivatives that vanish only at arbitrarily high stages (cf. Corollary 166). Intuitively, these two results tell us that after countably many stages, each Denjoy integrable function becomes Lebesgue integrable on its derivatives, and that for each countable stage, there is some Denjoy integrable function on $[a,b]$ which has yet to become Lebesgue integrable on its derivatives.

Proposition 162. Suppose $f \in M[a,b]$ and $F \in C[a,b]$. Then

(i) $D_{f,F}(K) = D_f(K) \cup D_F(K)$

(ii) $D^\alpha_f(K) \subseteq D^\alpha_{f,F}(K)$

(iii) $D^\alpha_F(K) \subseteq D^\alpha_{f,F}(K)$

(iv) If $D^\infty_{f,F}(K) = \emptyset$ then $D^\infty_f(K) = \emptyset$ and $D^\infty_F(K) = \emptyset$

(v) If $D^\infty_{f,F}(K) = \emptyset$ then $|K|_f, |K|_F \le |K|_{f,F}$

(vi) If $D^\infty_{f,F}([a,b]) = \emptyset$ then $|f|, |F| \le |f,F|$

(vii) If $k \in \mathbb{R}$ then $D_F(K) = D_{F+k}(K)$ and $D^\alpha_F(K) = D^\alpha_{F+k}(K)$

(viii) If $k \ne 0$ then $D_F(K) = D_{kF}(K)$ and $D^\alpha_F(K) = D^\alpha_{kF}(K)$

Proof. (i) Suppose that $x \in D_{f,F}(K)$ but $x \notin (D_f(K) \cup D_F(K))$. Then there is open $U \ni x$ and open $V \ni x$ such that $K \cap U \in \mathcal{B}_f$ and $K \cap V \in \mathcal{B}_F$. Then $U \cap V$ is open and contains $x$, and $K \cap U \cap V \subseteq K \cap U \in \mathcal{B}_f$ and $K \cap U \cap V \subseteq K \cap V \in \mathcal{B}_F$, and hence since $\mathcal{B}_f$ and $\mathcal{B}_F$ are closed under subsets, we have that $K \cap U \cap V \in \mathcal{B}_f \cap \mathcal{B}_F = \mathcal{B}_{f,F}$, which contradicts that $x \in D_{f,F}(K)$. Conversely, suppose that $x \in D_f(K)$ but $x \notin D_{f,F}(K)$. Then there is open $U \ni x$ such that $K \cap U \in \mathcal{B}_{f,F} = \mathcal{B}_f \cap \mathcal{B}_F \subseteq \mathcal{B}_f$, which contradicts that $x \in D_f(K)$. The case $x \in D_F(K)$ is symmetric.

(ii) Suppose that $\alpha = 0$. Then by the previous item, $D^\alpha_f(K) = K \subseteq K \cup K = D^\alpha_{f,F}(K)$. Suppose that $\alpha = \beta + 1$. Then $D^\alpha_f(K) = D_f(D^\beta_f(K)) \subseteq D_f(D^\beta_{f,F}(K)) \subseteq D_f(D^\beta_{f,F}(K)) \cup D_F(D^\beta_{f,F}(K)) = D_{f,F}(D^\beta_{f,F}(K)) = D^\alpha_{f,F}(K)$. Suppose that $\alpha$ is a limit ordinal. Then $D^\alpha_f(K) = \bigcap_{\beta<\alpha} D^\beta_f(K) \subseteq \bigcap_{\beta<\alpha} D^\beta_{f,F}(K) = D^\alpha_{f,F}(K)$.

(iii) The proof is identical to the previous item.

(iv) This follows directly from the previous two items.

(v) If $D^\infty_{f,F}(K) = \emptyset$, then by (ii) we have $D^{|K|_{f,F}+1}_f(K) \subseteq D^{|K|_{f,F}}_f(K) \subseteq D^{|K|_{f,F}}_{f,F}(K) = D^\infty_{f,F}(K) = \emptyset$. Since $|K|_f$ is by definition the least $\alpha$ such that $D^{\alpha+1}_f(K) = D^\alpha_f(K)$, we have that $|K|_f \le |K|_{f,F}$. Similarly, we have $|K|_F \le |K|_{f,F}$.

(vi) This follows directly from the previous item and Definition 160, which said that e.g. $|f| = |[a,b]|_f$.

For (vii) and (viii), note that these follow directly from Proposition 126 (iv)-(v).

Proposition 163. Suppose that $f \in \mathrm{Den}[a,b]$ and $F(x) = \int_a^x f$ and $K \in \mathcal{K}[a,b]$. Then (i) $D^\infty_f(K) = \emptyset$, (ii) $D^\infty_F(K) = \emptyset$, and (iii) $D^\infty_{f,F}(K) = \emptyset$.

Proof. For (i), note that Proposition 159 (v) implies that $D^\infty_f(K) = \emptyset$ if and only if $K \in (\mathcal{B}_f)_\sigma$, i.e. if there are $K_n \in \mathcal{K}[a,b]$ such that $K = \bigcup_n K_n$ and $f\chi_{K_n} \in L^1[a,b]$. But Proposition 140 (iv) says that this happens when $f \in \mathrm{Den}[a,b]$. (ii) Likewise, by Proposition 159 (v), we have that $D^\infty_F(K) = \emptyset$ if and only if $K \in (\mathcal{B}_F)_\sigma$, i.e. if there are $L_m \in \mathcal{K}[a,b]$ such that $K = \bigcup_m L_m$ and $F \in AC_*(L_m)$. But this is just to say that $F \in ACG_*[a,b]$, and so this follows immediately from the Fundamental Theorem of Calculus for Denjoy Integrals (cf. Theorem 134). (iii) Now, retaining the closed sets $K_n$ from part (i) and the closed sets $L_m$ from part (ii), consider the sequence of closed sets $C_{n,m} = K_n \cap L_m$. Then we have that $K = K \cap K = (\bigcup_n K_n) \cap (\bigcup_m L_m) = \bigcup_{n,m} K_n \cap L_m = \bigcup_{n,m} C_{n,m}$. Further, since $f\chi_{K_n} \in L^1[a,b]$ and $F \in AC_*(L_m)$, we have that $f\chi_{C_{n,m}} \in L^1[a,b]$ and $F \in AC_*(C_{n,m})$. This is to say that $K \in (\mathcal{B}_{f,F})_\sigma$, so that by Proposition 159 (v) it follows that $D^\infty_{f,F}(K) = \emptyset$.

Remark 164. The construction in the successor step of the following example is based on the example discussed in Gordon [56] pp. 117-118, although this discussion does not treat the derivatives $D^\alpha_F([a,b])$ which we introduced above in Definition 160.

Proposition 165. For every $\alpha < \omega_1$ and every $[a,b]$ and $r > 0$ there is $f \in \mathrm{Den}[a,b]$ with $F(x) = \int_a^x f$ and $\int_a^b f = 0$ and $f(a) = f(b) = 0$ and $a, b \in D^\alpha_F([a,b])$ and $\omega(F,[a,b]) = r$.

Proof. Suppose that $\alpha = 0$. Let $f(x) = \sin(2\pi(b-a)^{-1}(x-a))$. Since $\omega(F,[a,b]) = \frac{b-a}{\pi} > 0$ where $F(x) = \int_a^x f$, to ensure that for any $r > 0$ we can obtain $\omega(F,[a,b]) = r$, simply multiply $f$ by $\frac{r}{\omega(F,[a,b])}$ if need be. Here of course we tacitly appeal to Proposition 162 (viii), which says that multiplying by non-zero scalars does not affect the closed sets $D^\alpha_F([a,b])$.

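Suppose now that $\alpha = \beta + 1$. Let $C$ be the Cantor 1/3-set on $[a,b]$ and let $(a,b) - C = \bigsqcup_{n>0}(c_n,d_n)$ and let $C_n$ be the Cantor 1/3-set on $[c_n,d_n]$ and let $(c_n,d_n) - C_n = \bigsqcup_{m>0}(c_{nm},d_{nm})$. Choose $f_{nm} \in \mathrm{Den}[c_{nm},d_{nm}]$ with $F_{nm}(x) = \int_{c_{nm}}^x f_{nm}$ and $\int_{c_{nm}}^{d_{nm}} f_{nm} = 0$ and $f_{nm}(c_{nm}) = f_{nm}(d_{nm}) = 0$ and $c_{nm}, d_{nm} \in D^\beta_{F_{nm}}([c_{nm},d_{nm}])$ and $\omega(F_{nm},[c_{nm},d_{nm}]) = 2^{-n}$ if $m < 2^n$ and $\omega(F_{nm},[c_{nm},d_{nm}]) = 2^{-n}2^{-m+2^n-1}$ otherwise. Then by fixing $n$ we have

$$\sum_{m>0} \omega(F_{nm},[c_{nm},d_{nm}]) = (2^n - 1)2^{-n} + 2^{-n}\sum_{m \ge 2^n} 2^{-m+2^n-1} = 1 \tag{4.12}$$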

Still fixing $n$, let $f_n = f_{nm}$ on $[c_{nm},d_{nm}]$ and $f_n = 0$ otherwise, so that $f_n \in \mathrm{Den}[c_n,d_n]$ with $\int_{c_n}^{d_n} f_n = 0$ by the first version of Lebesgue's Lemma 146, and set $F_n(x) = \int_{c_n}^x f_n$. Fixing $n$ for the remainder of the paragraph, we claim that $\omega(F_n,[c_n,d_n]) \le 2 \cdot 2^{-n}$. For, let $\varepsilon > 0$ and let $[x,y] \subseteq [c_n,d_n]$. Since $F_n$ is continuous (cf. Proposition 140 (v)), choose $\delta > 0$ such that $0 < u - x < \delta$ implies $\left|\int_x^u f_n\right| < \frac{\varepsilon}{2}$ and such that $0 < y - v < \delta$ implies $\left|\int_v^y f_n\right| < \frac{\varepsilon}{2}$. Choose $u, v \notin C_n$ such that $c_n \le x < u < v < y \le d_n$ and $0 < u - x < \delta$ and $0 < y - v < \delta$. If $[u,v] \subseteq [c_{nm},d_{nm}]$ then $\left|\int_u^v f_n\right| \le \omega(F_{nm},[c_{nm},d_{nm}]) \le 2^{-n}$ and hence $\left|\int_x^y f_n\right| \le \varepsilon + 2^{-n}$. Otherwise, we have that $c_{n\ell} \le u \le d_{n\ell} < c_{nm} \le v \le d_{nm}$, and then estimating as before we have $\left|\int_x^y f_n\right| \le \varepsilon + 2 \cdot 2^{-n} + \left|\int_{d_{n\ell}}^{c_{nm}} f_n\right|$, and so it suffices to show that $\int_{d_{n\ell}}^{c_{nm}} f_n = 0$, which follows as above from the first version of Lebesgue's Lemma 146. Hence we have in fact shown that, fixing $n$, we have $\omega(F_n,[c_n,d_n]) \le 2 \cdot 2^{-n}$.

This of course implies that $\sum_{n>0} \omega(F_n,[c_n,d_n]) \le \sum_{n>0} 2 \cdot 2^{-n} \le 2$, and so letting $f = f_n$ on $[c_n,d_n]$ and $f = 0$ otherwise, we have that $f \in \mathrm{Den}[a,b]$ with $\int_a^b f = 0$ by the first version of Lebesgue's Lemma 146, and set $F(x) = \int_a^x f$. To see that $a, b \in D^\alpha_F([a,b])$, note that by hypothesis $c_{nm}, d_{nm} \in D^\beta_{F_{nm}}([c_{nm},d_{nm}])$ and hence $c_{nm}, d_{nm} \in D^\beta_F([a,b])$, since $D^\beta_{F_{nm}}([c_{nm},d_{nm}]) = D^\beta_{F \restriction [c_{nm},d_{nm}]}([c_{nm},d_{nm}]) \subseteq D^\beta_F([a,b])$ respectively by Proposition 162 (vii) and Proposition 159 (i). Since a subsequence of the $c_{nm}$ converge to $c_n$ and since a subsequence of the $d_{nm}$ converge to $d_n$ we have that $c_n, d_n \in D^\beta_F([a,b])$. Then we claim that $c_n, d_n \in D_F(D^\beta_F([a,b]))$. For, otherwise there is open $U \ni c_n$ such that $F \in AC_*(U \cap D^\beta_F([a,b]))$ and hence $F \in AC_*(U \cap D^\beta_F([a,b]))$ by Proposition 126 (vii). Since $F \in AC_*(U \cap D^\beta_F([a,b]))$, choose $\delta > 0$ corresponding to $\varepsilon = \frac{1}{2}$ in the definition of $AC_*(K)$ from Definition 121 (ii). Since $U$ is open and intersects $C$, and since $C$ is perfect and nowhere dense, $U$ contains infinitely many intervals $(c_\ell,d_\ell)$ and hence an interval $(c_\ell,d_\ell)$ with $d_\ell - c_\ell < \delta$. By equation 4.12, choose a subsequence $(c_{\ell 1},d_{\ell 1}), \ldots, (c_{\ell M},d_{\ell M})$ such that $\sum_{m=1}^M \omega(F_{\ell m},[c_{\ell m},d_{\ell m}]) > \frac{1}{2}$. But this is a contradiction, since $(c_{\ell 1},d_{\ell 1}), \ldots, (c_{\ell M},d_{\ell M})$ is a $U \cap D^\beta_F([a,b])$-edged sub-partition with $\sum_{m=1}^M d_{\ell m} - c_{\ell m} \le d_\ell - c_\ell < \delta$. Hence in fact $c_n, d_n \in D_F(D^\beta_F([a,b]))$ for all $n$ which of course implies that $a, b \in D_F(D^\beta_F([a,b])) = D^\alpha_F([a,b])$, since there is a subsequence of the $c_n$ converging to $a$ and likewise a subsequence of the $d_n$ converging to $b$.

Finally, note that $\omega(F,[a,b]) > 0$ since $0 < \frac{1}{2} = \omega(F_{1,1},[c_{1,1},d_{1,1}]) \le \omega(F,[a,b])$. Hence, to ensure that for any $r > 0$ we can obtain $\omega(F,[a,b]) = r$, simply multiply $f$ by $\frac{r}{\omega(F,[a,b])}$ if need be. Here we are appealing to Proposition 162 (viii), which says that multiplying by non-zero scalars does not affect the closed sets $D^\alpha_F([a,b])$.

Suppose that $\alpha < \omega_1$ is a limit ordinal. Let $\alpha_n$ be an enumeration of the ordinals less than $\alpha$. Let $w$ be the midpoint of $[a,b]$. Choose $u_n \searrow a^+$ from above with $u_0 = w$ and $v_n \nearrow b^-$ from below with $v_0 = w$. Choose $h : \omega \to \omega$ such that $h^{-1}(n)$ is infinite for all $n$. Choose $f_n \in \mathrm{Den}[u_{n+1},u_n]$ with $F_n(x) = \int_{u_{n+1}}^x f_n$ and $\int_{u_{n+1}}^{u_n} f_n = 0$ and $f_n(u_{n+1}) = f_n(u_n) = 0$ and $u_{n+1}, u_n \in D^{\alpha_{h(n)}}_{F_n}([u_{n+1},u_n])$ and $\omega(F_n,[u_{n+1},u_n]) = \frac{1}{n}$. Likewise, choose $g_n \in \mathrm{Den}[v_n,v_{n+1}]$ with $G_n(x) = \int_{v_n}^x g_n$ and $\int_{v_n}^{v_{n+1}} g_n = 0$ and $g_n(v_n) = g_n(v_{n+1}) = 0$ and $v_{n+1}, v_n \in D^{\alpha_{h(n)}}_{G_n}([v_n,v_{n+1}])$ and $\omega(G_n,[v_n,v_{n+1}]) = \frac{1}{n}$.

Let $f = f_n$ on $[u_{n+1},u_n]$ and $f = g_n$ on $[v_n,v_{n+1}]$ and $f(a) = f(b) = 0$. Since $\omega(F_n,[u_{n+1},u_n]) = \omega(G_n,[v_n,v_{n+1}]) = \frac{1}{n}$, we claim that $f \in \mathrm{Den}[a,b]$ with $\int_a^b f = 0$ by the Improper Integrals Lemma 143. For, to apply this lemma in this way, it must be shown that $\lim_{c \searrow a^+} \int_c^w f$ and $\lim_{c \nearrow b^-} \int_a^c f$ exist and are equal to zero, where recall that $w$ is the midpoint of $[a,b]$. Without loss of generality, consider the case of the first limit $\lim_{c \searrow a^+} \int_c^w f$. Let $\varepsilon > 0$. Choose $N$ such that $\frac{1}{N} < \varepsilon$ and set $\delta = u_N - a$. Suppose that $0 < c - a < \delta$, so that $a < c < u_N$. Let $n \ge N$ such that $a < u_{n+1} \le c < u_n \le u_N$. Since $\omega(F_n,[u_{n+1},u_n]) = \frac{1}{n}$ and $\int_{u_{i+1}}^{u_i} f = 0$, it follows that

$$\left|\int_c^w f\right| \le \left|\int_c^{u_n} f\right| + \sum_{i=0}^{n-1} \left|\int_{u_{i+1}}^{u_i} f\right| \le \frac{1}{n} + 0 \le \frac{1}{N} < \varepsilon \tag{4.13}$$

Hence, in fact $f \in \mathrm{Den}[a,b]$ with $\int_a^b f = 0$ by the Improper Integrals Lemma 143, and so we define $F(x) = \int_a^x f$.

To show that $a \in D^\alpha_F([a,b])$, it suffices to show that $a \in D^{\alpha_n}_F([a,b])$ for all $n$. So, fixing $n$ and recalling that $h^{-1}(n)$ is infinite, choose a sequence $u_{n_k} \searrow a^+$ from above such that $u_{n_k} \in D^{\alpha_n}_{F_{n_k}}([u_{n_k+1},u_{n_k}])$. Since $u_{n_k} \in D^{\alpha_n}_{F_{n_k}}([u_{n_k+1},u_{n_k}])$ and $D^{\alpha_n}_{F_{n_k}}([u_{n_k+1},u_{n_k}]) = D^{\alpha_n}_{F \restriction [u_{n_k+1},u_{n_k}]}([u_{n_k+1},u_{n_k}]) \subseteq D^{\alpha_n}_F([a,b])$ respectively by Proposition 162 (vii) and Proposition 159 (i), it follows that $u_{n_k} \in D^{\alpha_n}_F([a,b])$. Since $u_{n_k} \searrow a^+$ from above, it follows that $a \in D^{\alpha_n}_F([a,b])$. Since the $\alpha_n$ enumerated the ordinals below the limit ordinal $\alpha$, it follows that $a \in D^\alpha_F([a,b])$. An analogous argument shows that $b \in D^\alpha_F([a,b])$.

Finally, note that $\omega(F,[a,b]) > 0$ since $0 < 1 = \omega(F_1,[u_1,u_0]) \le \omega(F,[a,b])$. Hence, to ensure that for any $r > 0$ we can obtain $\omega(F,[a,b]) = r$, simply multiply $f$ by $\frac{r}{\omega(F,[a,b])}$ if need be. Here again we are appealing to Proposition 162 (viii), which says that multiplying by non-zero scalars does not affect the closed sets $D^\alpha_F([a,b])$.
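The bookkeeping behind equation (4.12) in the successor step can be checked with exact rational arithmetic. The following is only a minimal sketch: the function names `omega`, `partial_sum`, and `tail_gap` are hypothetical helpers introduced here for illustration, and `omega(n, m)` simply encodes the oscillation values prescribed in the proof (namely $2^{-n}$ for $m < 2^n$ and $2^{-n}2^{-m+2^n-1}$ otherwise).

```python
from fractions import Fraction


def omega(n, m):
    """Oscillation prescribed for the m-th gap at level n in the successor step."""
    if m < 2**n:
        return Fraction(1, 2**n)
    # For m >= 2^n the value is 2^{-n} * 2^{-m + 2^n - 1}.
    return Fraction(1, 2**n) * Fraction(1, 2**(m - 2**n + 1))


def partial_sum(n, terms):
    """Sum of omega(n, m) for m = 1, ..., terms."""
    return sum(omega(n, m) for m in range(1, terms + 1))


def tail_gap(n, extra):
    """Distance from 1 of the partial sum that includes `extra` tail terms."""
    return 1 - partial_sum(n, 2**n - 1 + extra)


if __name__ == "__main__":
    for n in range(1, 5):
        # The gap halves with each extra tail term, so the series sums to 1.
        print(n, [tail_gap(n, extra) for extra in range(4)])
```

Under these assumptions the gap after `extra` tail terms is exactly $2^{-n}2^{-\mathrm{extra}}$, which tends to zero, in agreement with the value 1 on the right-hand side of (4.12).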

Corollary 166. For all $\alpha < \omega_1$ there is $f \in \mathrm{Den}[a,b]$ with $F(x) = \int_a^x f$ and $\alpha < |F| \le |f,F|$.

Proof. For each $\alpha < \omega_1$, choose $f \in \mathrm{Den}[a,b]$ with $F(x) = \int_a^x f$ from the previous proposition so that $D^\alpha_F([a,b]) \ne \emptyset$. By Proposition 163, we have that $D^\infty_F([a,b]) = \emptyset$ and $D^\infty_{f,F}([a,b]) = \emptyset$. Since $D^\alpha_F([a,b]) \ne \emptyset$, we have $\alpha < |F|$. By Proposition 162 (vi), we have $|F| \le |f,F|$.

4.3.2 Totalization: Calibrating Rank and Entry into Subspaces

Remark 167. Recall that in Remark 154, we noted that we would be working with the less traditional subsets $\mathrm{Den}^*_\alpha[a,b]$ as opposed to the more traditional subsets $\mathrm{Den}_\alpha[a,b]$. In this section, we prove that the normal totalization result for $\mathrm{Den}_\alpha[a,b]$ (in particular, the equivalence of (i) and (iv) in Corollary 174) also holds for $\mathrm{Den}^*_\alpha[a,b]$ (in particular, the equivalence of (i) and (ii) in Corollary 174). Hence, the subsets $\mathrm{Den}^*_\alpha[a,b]$ can be regarded as constituting a reasonable variant on Denjoy totalization. Further, as indicated previously in Remark 154, it is the subsets $\mathrm{Den}^*_\alpha[a,b]$ which seem more amenable to a descriptive set-theoretic analysis (cf. Corollary 196). In this section, the groundwork for this analysis is laid, in that we precisely calibrate entry into $\mathrm{Den}^*_\alpha[a,b]$ with the vanishing of the derivative $D_{f,F}$ introduced in the previous section (cf. Theorem 172). Later in § 4.3.3, it will be proven that this derivative is Borel (cf. Corollary 192), which will allow us to characterize the descriptive set-theoretic complexity of the subsets $\mathrm{Den}^*_\alpha[a,b]$ (cf. Corollary 196).

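Proposition 168. Suppose $f \in M[a,b]$ and $K \in \mathcal{K}[a,b]$. If $D_f(K) = \emptyset$ then $f\chi_K \in L^1[a,b]$.

Proof. If $D_f(K) = \emptyset$ then for every $x \in K$ there is an open neighborhood $U_x \ni x$ such that $f\chi_{U_x \cap K} \in L^1[a,b]$. By the compactness of $K$, there is a finite subcovering $U_{x_1}, \ldots, U_{x_N}$. Then $|f\chi_K| \le \sum_{i=1}^N \left|f\chi_{U_{x_i} \cap K}\right|$ and $\left|\int f\chi_K\right| \le \int |f\chi_K| \le \sum_{i=1}^N \int \left|f\chi_{U_{x_i} \cap K}\right| < \infty$, and so $f\chi_K \in L^1[a,b]$.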


Remark 169. In the estimates in the proof of the above proposition, it is important to note that we are working with the Lebesgue integral, since in general

it is not true that the absolute value of a Denjoy integrable function is Denjoy

integrable (cf. Swartz [142] Example 12 pp. 18-19). Indeed, it can be shown that

the Denjoy integrable functions whose absolute values are Denjoy integrable are

exactly the Lebesgue integrable functions (cf. Peng-Yee [121] p. 22).

Proposition 170. Suppose $F \in C[a,b]$ and $K \in \mathcal{K}[a,b]$. If $D_F(K) = \emptyset$ then $F \in AC_*(K)$.

Proof. If $D_F(K) = \emptyset$ then for every $x \in K$ there is $(c_x,d_x) \ni x$ such that $F \in AC_*((c_x,d_x) \cap K)$. By the compactness of $K$, there is a finite subcovering $(c_1,d_1), \ldots, (c_N,d_N)$ such that $F \in AC_*((c_i,d_i) \cap K)$. Define $a_i = \inf((c_i,d_i) \cap K)$ and $b_i = \sup((c_i,d_i) \cap K)$ so that $a_i, b_i \in (c_i,d_i) \cap K$. Let $\eta > 0$ be strictly less than all the nonzero $|a_i - b_j|$, $|b_i - a_j|$ for $i \ne j$. Let $\varepsilon > 0$ and choose $\delta_i > 0$ such that for every $(c_i,d_i) \cap K$-edged sub-partition $\mathcal{D}$ of $[a,b]$ if $\sum_{J \in \mathcal{D}} \mu(J) < \delta_i$ then $\sum_{J \in \mathcal{D}} \omega(F,J) < N^{-1} \cdot \varepsilon$. Choose $\delta > 0$ such that $\delta < \delta_i$ and $\delta < \eta$. Suppose that $\mathcal{D}$ is a $K$-edged sub-partition of $[a,b]$ with $\sum_{J \in \mathcal{D}} \mu(J) < \delta$.

First we establish the claim that if some closed interval $J \in \mathcal{D}$ is not $(c_j,d_j) \cap K$-edged for any $j$, then there are non-overlapping closed intervals $I_J, L_J$ such that $J = I_J \cup L_J$ and $I_J$ is $(c_i,d_i) \cap K$-edged and $L_J$ is $(c_k,d_k) \cap K$-edged for some $i \ne k$. So suppose that $J \in \mathcal{D}$ is not $(c_j,d_j) \cap K$-edged for any $j$. Then for some $i \ne k$ we have $\min(J) \in (c_i,d_i) \cap K$ and $\max(J) \in (c_k,d_k) \cap K$ such that $\min(J) \le c_k$ and $d_i \le \max(J)$ and $a_i \le \min(J) \le b_i$ and $a_k \le \max(J) \le b_k$. If $b_i < a_k$ then $i \ne k$ implies $\eta < a_k - b_i \le \max(J) - \min(J) = \mu(J) < \delta < \eta$, a contradiction. Hence $a_k \le b_i$. It suffices to show that $b_i \in (c_k,d_k) \cap K$ since then we may set $I_J = [\min(J), b_i]$ and $L_J = [b_i, \max(J)]$. If $a_k = b_i$ then $b_i = a_k \in (c_k,d_k) \cap K$. If $a_k < b_i$ then $c_k \le a_k < b_i \le d_i \le \max(J) < d_k$ and so $b_i \in (c_k,d_k)$. Since $b_i = \sup((c_i,d_i) \cap K)$, choose a sequence $x_n \in ((c_i,d_i) \cap K)$ which converges upwards to $b_i$, so that the sequence $x_n$ is eventually in $(c_k,d_k)$ and hence $b_i \in (c_k,d_k) \cap K$. Hence our claim is established.

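Let $\mathcal{K}$ be a $[c,d] \cap K$-edged sub-partition of $[a,b]$ which (i) contains $J$ where $J \in \mathcal{D}$ is $(c_j,d_j) \cap K$-edged for some $j$, and which (ii) contains $I_J, L_J$ where $J \in \mathcal{D}$ is not $(c_j,d_j) \cap K$-edged for any $j$. Then for every $J \in \mathcal{K}$ there is some $j$ such that $J$ is $(c_j,d_j) \cap K$-edged. Let $\mathcal{K}_j$ be a $(c_j,d_j) \cap K$-edged sub-partition of $[a,b]$ which consists of those $J \in \mathcal{K}$ such that $J$ is $(c_j,d_j) \cap K$-edged. Then $\mathcal{K}_j$ is a $(c_j,d_j) \cap K$-edged sub-partition of $[a,b]$ such that $\sum_{J \in \mathcal{K}_j} \mu(J) < \delta < \delta_j$ so that $\sum_{J \in \mathcal{K}_j} \omega(F,J) < N^{-1} \cdot \varepsilon$. Then $\sum_{J \in \mathcal{D}} \omega(F,J) \le \sum_{J \in \mathcal{K}} \omega(F,J) = \sum_{j=1}^N \sum_{J \in \mathcal{K}_j} \omega(F,J) < \sum_{j=1}^N N^{-1}\varepsilon = \varepsilon$.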

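Corollary 171. Suppose $f \in M[a,b]$, $F \in C[a,b]$, and $K \in \mathcal{K}[a,b]$.

(i) If $D^{\alpha+1}_f(K) = \emptyset$ then $f\chi_{D^\alpha_f(K)} \in L^1[a,b]$.

(ii) If $D^{\alpha+1}_F(K) = \emptyset$ then $F \in AC_*(D^\alpha_F(K))$.

(iii) If $D^{\alpha+1}_{f,F}(K) = \emptyset$ then $f\chi_{D^\alpha_{f,F}(K)} \in L^1[a,b]$ and $F \in AC_*(D^\alpha_{f,F}(K))$.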

Theorem 172. For all $f \in \mathrm{Den}[a,b]$ with $F(x) = \int_a^x f$ and all $\alpha < \omega_1$ and all $[c,d] \subseteq [a,b]$ we have $D^{\alpha+1}_{f,F}([c,d]) = \emptyset$ if and only if $f\chi_{[c,d]} \in \mathrm{Den}^*_\alpha[a,b]$.

Proof. Suppose that $\alpha = 0$. First suppose that $D^{\alpha+1}_{f,F}([c,d]) = \emptyset$. By Corollary 171 (iii), we have that $f\chi_{[c,d]} \in L^1[a,b] = \mathrm{Den}^*_0[a,b]$. Second, suppose that $f\chi_{[c,d]} \in \mathrm{Den}^*_0[a,b] = L^1[a,b]$. By the Fundamental Theorem of Calculus for Lebesgue Integrals (Theorem 128), $F \in AC([c,d])$ and hence $F \in AC_*([c,d])$ by Proposition 126 (i). Then $D^{\alpha+1}_{f,F}([c,d]) = D_{f,F}([c,d]) = \emptyset$.


Suppose now that $\alpha > 0$. First suppose that $D^{\alpha+1}_{f,F}([c,d]) = \emptyset$. By Corollary 171 (iii), we have that $f\chi_{D^\alpha_{f,F}([c,d])} \in L^1[a,b]$ and $F \in AC_*(D^\alpha_{f,F}([c,d]))$. Suppose $(c,d) - D^\alpha_{f,F}([c,d]) = \bigsqcup_n (c_n,d_n)$. If $[c',d'] \subseteq (c_n,d_n)$, then

$$D^\alpha_{f,F}([c',d']) \subseteq [c',d'] \subseteq (c_n,d_n) \subseteq (c,d) - D^\alpha_{f,F}([c,d]) \tag{4.14}$$

Hence $D^\alpha_{f,F}([c',d']) = \emptyset$. Then there is $\beta < \alpha$ such that $D^{\beta+1}_{f,F}([c',d']) = \emptyset$ and hence by induction hypothesis $f\chi_{[c',d']} \in \mathrm{Den}^*_\beta[a,b]$. Hence, since we are supposing that $f \in \mathrm{Den}[a,b]$ it follows from the left-to-right direction of the Improper Integrals Lemma 143 that $f\chi_{[c_n,d_n]} \in \mathrm{Lim}(\bigcup_{\beta<\alpha} \mathrm{Den}^*_\beta[a,b])$. Since by definition we have $(c,d) - D^\alpha_{f,F}([c,d]) = \bigsqcup_n (c_n,d_n)$ and since we have already established that $F \in AC_*(D^\alpha_{f,F}([c,d]))$, it follows from Definition 150 that $f\chi_{[c,d]} \in \mathrm{Leb}^*(\mathrm{Lim}(\bigcup_{\beta<\alpha} \mathrm{Den}^*_\beta[a,b])) = \mathrm{Den}^*_\alpha[a,b]$.

Second, suppose that $f\chi_{[c,d]} \in \mathrm{Den}^*_\alpha[a,b] = \mathrm{Leb}^*(\mathrm{Lim}(\bigcup_{\beta<\alpha} \mathrm{Den}^*_\beta[a,b]))$. By Definition 150, there is a closed set $K \in \mathcal{K}[a,b]$ with $(a,b) - K = \bigsqcup_n (c_n,d_n)$ such that $f\chi_{[c,d]}\chi_K \in L^1[a,b]$ and $f\chi_{[c,d]}\chi_{(c_n,d_n)} \in \mathrm{Lim}(\bigcup_{\beta<\alpha} \mathrm{Den}^*_\beta[a,b])$ and $G \in AC_*(K)$ where $G(x) = \int_a^x f\chi_{[c,d]}$. Note that by Proposition 126 (vi), we may assume without loss of generality that $a, b \in K$. Note that on $[c,d]$ we have $F(x) = \int_a^x f = \int_a^x f\chi_{[c,d]} + \int_a^c f = G(x) + \int_a^c f$, so that $G$ and $F$ differ by the constant $\int_a^c f$ on $[c,d]$. Since $G \in AC_*(K)$ implies $G \in AC_*(K \cap [c,d])$ (cf. Proposition 126 (vii)), and since $G$ and $F$ differ by a constant on $[c,d]$, it follows from Proposition 126 (iv) & (viii) that $F \in AC_*(K \cap [c,d])$.

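Further, since $f\chi_{[c,d]}\chi_{(c_n,d_n)} \in \mathrm{Lim}(\bigcup_{\beta<\alpha} \mathrm{Den}^*_\beta[a,b])$, it follows from Definition 144 that $(a,b) = \bigcup_m (c_{nm},d_{nm})$ and $f\chi_{[c,d]}\chi_{(c_n,d_n)}\chi_{(c_{nm},d_{nm})} \in \mathrm{Den}^*_{\beta_{nm}}[a,b]$ for some $\beta_{nm} < \alpha$. Let $[c'_{nm},d'_{nm}] = [c,d] \cap [c_n,d_n] \cap [c_{nm},d_{nm}]$, so that $f\chi_{[c'_{nm},d'_{nm}]} \in \mathrm{Den}^*_{\beta_{nm}}[a,b]$ for some $\beta_{nm} < \alpha$. By induction hypothesis, $D^{\beta_{nm}+1}_{f,F}([c'_{nm},d'_{nm}]) = \emptyset$ and so $D^\alpha_{f,F}([c'_{nm},d'_{nm}]) = \emptyset$. Since $a, b \in K$, it follows that

$$[c,d] \subseteq (K \cap [c,d]) \cup (a,b) - K = (K \cap [c,d]) \cup \bigcup_{nm} (c_n,d_n) \cap (c_{nm},d_{nm}) \tag{4.15}$$

and by successively applying this, Proposition 159 (ii), and Proposition 159 (i), we obtain

\begin{align}
D^\alpha_{f,F}([c,d]) &\subseteq (K \cap [c,d]) \cup \bigcup_{nm} D^\alpha_{f,F}([c,d]) \cap (c_n,d_n) \cap (c_{nm},d_{nm}) \tag{4.16}\\
&\subseteq (K \cap [c,d]) \cup \bigcup_{nm} D^\alpha_{f,F}([c,d] \cap (c_n,d_n) \cap (c_{nm},d_{nm})) \tag{4.17}\\
&\subseteq (K \cap [c,d]) \cup \bigcup_{nm} D^\alpha_{f,F}([c,d] \cap [c_n,d_n] \cap [c_{nm},d_{nm}]) \tag{4.18}\\
&= (K \cap [c,d]) \cup \bigcup_{nm} D^\alpha_{f,F}([c'_{nm},d'_{nm}]) \tag{4.19}\\
&= (K \cap [c,d]) \tag{4.20}
\end{align}

From this and the fact that $f\chi_{[c,d]}\chi_K \in L^1[a,b]$ and $F \in AC_*(K \cap [c,d])$ it follows that

$$D^{\alpha+1}_{f,F}([c,d]) \subseteq D_{f,F}(K \cap [c,d]) = D_f(K \cap [c,d]) \cup D_F(K \cap [c,d]) = \emptyset \tag{4.21}$$

which is what we wanted to establish.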

Corollary 173. Suppose that $f \in \mathrm{Den}[a,b]$ and $F(x) = \int_a^x f$. Then

(i) $D^\infty_{f,F}([a,b]) = \emptyset \implies f \in \mathrm{Den}^*_{|f,F|}[a,b]$

(ii) $f \in \mathrm{Den}^*_\alpha[a,b]$ and $F(x) = \int_a^x f \implies |f,F| \le \alpha$

(iii) $\mathrm{Den}^*_{\omega_1}[a,b] = \mathrm{Den}[a,b]$.

(iv) $\mathrm{Den}^*_\alpha[a,b] \subseteq \mathrm{Den}^*_\beta[a,b] \subsetneq \mathrm{Den}[a,b]$ for all $\alpha < \beta < \omega_1$

Proof. For (iii), this follows from the left-to-right direction of the previous theorem and Proposition 163. For (iv), this follows from the right-to-left direction of the previous theorem and Corollary 166.

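Corollary 174. Let $f \in M[a,b]$ and $F \in C[a,b]$ with $F(a) = 0$. Then the following are equivalent:

(i) $f \in \mathrm{Den}[a,b]$ and $F(x) = \int_a^x f$

(ii) There is $\alpha < \omega_1$ such that $f \in \mathrm{Den}^*_\alpha[a,b]$ and $F(x) = \int_a^x f$

(iii) There is $\alpha < \omega_1$ such that $f \in \langle \mathrm{Den}^*_\alpha[a,b] \rangle$ and $F(x) = \int_a^x f$

(iv) There is $\alpha < \omega_1$ such that $f \in \mathrm{Den}_\alpha[a,b]$ and $F(x) = \int_a^x f$

(v) There is $\alpha < \omega_1$ such that $f \in \langle \mathrm{Den}_\alpha[a,b] \rangle$ and $F(x) = \int_a^x f$

(vi) There is $\alpha < \omega_1$ such that $D^{\alpha+1}_{f,F}([a,b]) = \emptyset$ and $F' = f$ a.e.

(vii) There is $\alpha < \omega_1$ such that $D^{\alpha+1}_F([a,b]) = \emptyset$ and $F' = f$ a.e.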

Proof. The proof proceeds by showing (i) $\Rightarrow$ (ii) $\Rightarrow$ (vi) $\Rightarrow$ (vii) $\Rightarrow$ (i) and (ii) $\Rightarrow$ (iii) $\Rightarrow$ (i) and (ii) $\Rightarrow$ (iv) $\Rightarrow$ (i) and (iii) $\Rightarrow$ (v) $\Rightarrow$ (i).

For (ii) $\Rightarrow$ (iv) $\Rightarrow$ (i) and (iii) $\Rightarrow$ (v) $\Rightarrow$ (i), note that this follows trivially from the fact that $\mathrm{Den}^*_\alpha[a,b] \subseteq \mathrm{Den}_\alpha[a,b] \subseteq \mathrm{Den}[a,b]$ by Proposition 157.

For (i) $\Rightarrow$ (ii), note that this follows from item (iii) of the previous corollary. For (ii) $\Rightarrow$ (vi), note that this follows from the right-to-left direction of Theorem 172 and Proposition 140 (vi). For (vi) $\Rightarrow$ (vii), note that by Proposition 162, $D^{\alpha+1}_F([a,b]) \subseteq D^{\alpha+1}_{f,F}([a,b]) = \emptyset$. For (vii) $\Rightarrow$ (i), by Proposition 159 (v), there are $E_n \in \mathcal{K}[a,b]$ such that $[a,b] = \bigcup_n E_n$ and $F \in AC_*(E_n)$, so that $F \in ACG_*[a,b]$, and hence $f \in \mathrm{Den}[a,b]$ and $F(x) = \int_a^x f$ by Definition 132 and the subsequent remark.

For (ii) $\Rightarrow$ (iii), note that this follows trivially from the fact that $\langle \mathrm{Den}^*_\alpha[a,b] \rangle$ contains $\mathrm{Den}^*_\alpha[a,b]$. For (iii) $\Rightarrow$ (i), this follows from the fact that $\langle \mathrm{Den}^*_\alpha[a,b] \rangle$ is a subspace of $\mathrm{Den}[a,b]$ (cf. Proposition 157).

Remark 175. One sees from the previous corollary that there is an asymmetry between the derivatives $D^\alpha_F(K)$ and $D^\alpha_f(K)$. The following proposition shows that this asymmetry is in a certain sense necessary. The construction below is based on Gordon [56] Exercise 9 p. 119, which was used to produce an example of a non-Denjoy integrable function, and so what we do is essentially just verify that we can obtain $D^{\alpha+1}_f([a,b]) = \emptyset$ with this construction.

Proposition 176. Let $f \in M[a,b]$ and $F \in C[a,b]$ with $F(a) = 0$. Then the following are not equivalent, and in particular, while (i) implies (ii), it is not the case that (ii) implies (i):

(i) $f \in \mathrm{Den}[a,b]$ and $F(x) = \int_a^x f$

(ii) There is $\alpha < \omega_1$ such that $D^{\alpha+1}_f([a,b]) = \emptyset$ and $F' = f$ a.e.

Proof. That (i) implies (ii) follows immediately from Proposition 163 (i) and Proposition 140 (vi). To see that (ii) does not imply (i), let $C$ be the Cantor 1/3-set on $[a,b]$ with $(a,b) - C = \bigsqcup_{n>0}(c_n,d_n)$. Define $F = 0$ on $C$ and on $[c_n,d_n]$ define $F$ to be everywhere differentiable with $F(c_n) = F(d_n) = 0$ and $\omega(F,[c_n,d_n]) = 2^{-k}$ if $d_n - c_n = 3^{-k}$. Then $F$ is differentiable a.e. and hence we can choose $f \in M[a,b]$ such that $F' = f$ a.e. Further, consider the closed sets $E_{-1} = \{a\}$, $E_0 = \{b\}$, $E_n = [c_n,d_n]$, so that $[a,b] = \bigcup_n E_n$. Then $f\chi_{E_n} \in L^1[a,b]$ for each $n \ge -1$, so that $D^\infty_f([a,b]) = \emptyset$ by Proposition 159 (v), and hence $D^\alpha_f([a,b]) = \emptyset$ for some $\alpha < \omega_1$. Hence, the example of $f$ and $F$ satisfies the hypotheses of (ii).

To see that this example does not satisfy the hypotheses of (i), it suffices to show that $F \notin ACG_*[a,b]$. So suppose that $F \in ACG_*[a,b]$ with $[a,b] = \bigcup_n K_n$ and $F \in AC_*(K_n)$. Then $C \cap [a,b] = \bigcup_n C \cap K_n$. By the Baire Category Theorem, there is $n$ and open $U$ such that $C \cap U \ne \emptyset$ and $C \cap U \subseteq C \cap K_n$. Since $F \in AC_*(K_n)$, we have $F \in AC_*(C \cap K_n)$ by Proposition 126 (vii). Let $\delta > 0$ correspond to $\varepsilon = \frac{1}{2}$ from the definition of $F \in AC_*(C \cap K_n)$ in Definition 121 (ii). Choose a ball $V$ of radius $< \frac{\delta}{2}$ such that $C \cap V \ne \emptyset$ and $C \cap V \subseteq C \cap K_n$. Then there is $N > 0$ and a $C \cap K_n$-edged sub-partition $\mathcal{D}_N = \{(c_{n_{k,j}}, d_{n_{k,j}}) : 0 < k \le 2^N,\ 0 < j \le 2^k,\ d_{n_{k,j}} - c_{n_{k,j}} = 3^{-N-k}\}$ such that $J \in \mathcal{D}_N$ implies $J \subseteq V$. That is, just as in the Cantor set there is one middle third, two "middle" ninths, four "middle" 1/27ths, etc. so within $V$ there will be one "middle" $3^{-N}$-ths, two "middle" $3^{-N-1}$-ths, four "middle" $3^{-N-2}$-ths, etc. Then $\sum_{J \in \mathcal{D}_N} \omega(F,J) = \sum_{k=1}^{2^N} \sum_{j=1}^{2^k} 2^{-N-k} = \sum_{k=1}^{2^N} 2^k 2^{-N-k} = \sum_{k=1}^{2^N} 2^{-N} = 2^N 2^{-N} = 1$. This contradicts that $\sum_{J \in \mathcal{D}_N} \mu(J) < \delta$.
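The final computation in the preceding proof can be replayed with exact rational arithmetic. The sketch below is illustrative only: it assumes, as in the displayed double sum, that $k$ ranges over $1, \ldots, 2^N$ and $j$ over $1, \ldots, 2^k$, that each interval at level $k$ has length $3^{-N-k}$ and oscillation $2^{-N-k}$, and the function names `oscillation_sum` and `total_length` are hypothetical helpers not taken from the text.

```python
from fractions import Fraction


def oscillation_sum(N):
    """Sum of omega(F, J) over the sub-partition: 2^k intervals at level k,
    each contributing 2^{-N-k}, for k = 1, ..., 2^N."""
    total = Fraction(0)
    for k in range(1, 2**N + 1):
        for _ in range(2**k):
            total += Fraction(1, 2**(N + k))
    return total


def total_length(N):
    """Sum of mu(J) over the same intervals, each of length 3^{-N-k}."""
    return sum(Fraction(2**k, 3**(N + k)) for k in range(1, 2**N + 1))


if __name__ == "__main__":
    for N in range(1, 4):
        print(N, oscillation_sum(N), float(total_length(N)))
```

On these assumptions the oscillation sum is exactly 1 for every $N$, while the total length stays below $2 \cdot 3^{-N}$, so the lengths can be made smaller than any $\delta$ while the oscillations do not drop below $\frac{1}{2}$.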



(i) $f \in \mathrm{Den}^*_\alpha[a,b]$ and $F(x) = \int_a^x f$

(ii) $D^{\alpha+1}_{f,F}([a,b]) = \emptyset$ and $F' = f$ a.e.

Proof. (i) $\Rightarrow$ (ii). By the right-to-left direction of Theorem 172 and Proposition 140 (vi). (ii) $\Rightarrow$ (i). By the equivalence of (vi) and (i) in Corollary 174, we have that $f \in \mathrm{Den}[a,b]$ and $F(x) = \int_a^x f$. By the left-to-right direction of Theorem 172, we have that $f \in \mathrm{Den}^*_\alpha[a,b]$.




(i) $f \in \langle \mathrm{Den}^*_\alpha[a,b] \rangle$ and $F(x) = \int_a^x f$

(ii) There are $f_1, \ldots, f_n \in M[a,b]$ and $F_1, \ldots, F_n \in C[a,b]$ with $F_i(a) = 0$ such that $D^{\alpha+1}_{f_i,F_i}([a,b]) = \emptyset$ and $F_i' = f_i$ a.e. and $f = \sum_{i=1}^n f_i$ and $F = \sum_{i=1}^n F_i$

Proof. By the previous corollary and Proposition 157.

4.3.3 Definability: The Derivatives are Borel

Remark 179. The goal of this subsection is to prove that the three derivatives $D_f(K)$, $D_F(K)$ and $D_{f,F}(K)$ introduced in § 4.3.1 and Definition 160 are Borel, in the sense that as maps from the Polish space $\mathcal{K}[a,b]$ to $\mathcal{K}[a,b]$ they are Borel. Hence, after briefly recalling the Polish structure on $\mathcal{K}[a,b]$, $C[a,b]$, and $M[a,b]$, the goal here is to show that these three derivatives are Borel (cf. Corollaries 188, 191, and 192). Since these derivatives are Borel, an important theorem linking coanalyticity and the vanishing of Borel derivatives may be applied (cf. Kechris [84] Theorem 34.10 and Exercise 34.13). In particular, using this theorem, it can be shown that the relation $f \in \mathrm{Den}[a,b]$ and $F(x) = \int_a^x f$ is coanalytic but not analytic on $M[a,b] \times C[a,b]$ (cf. Corollary 195), and likewise it is shown that $F \in ACG_*[a,b]$ is coanalytic but not analytic on $C[a,b]$ (cf. Corollary 197 and Figure 4.1).

Remark 180. Recall that $\mathcal{K}[a,b]$, the space of compact (or closed) subsets of $[a,b]$, is a Polish space, where the topology is generated by the "miss" sets $\{K \in \mathcal{K}[a,b] : K \cap U^c = \emptyset\}$ and the "hit" sets $\{K \in \mathcal{K}[a,b] : K \cap U \ne \emptyset\}$, where $U \subseteq [a,b]$ is open (cf. Kechris [84] § 4.F pp. 24 ff). Likewise $C[a,b]$, the space of continuous real-valued functions on $[a,b]$, is a Polish space, where the topology is given by the sup-metric $\|F - G\|_u = \sup\{|F(x) - G(x)| : x \in [a,b]\}$ (cf. Kechris [84] § 4.E p. 24).

Remark 181. The Polish space structure on $M[a,b]$, the space of real-valued measurable functions on $[a,b]$ (where functions which are equal a.e. are identified), is less familiar. It is given by the following metric:

$$d(f,g) = \inf\{\varepsilon > 0 : \mu(\{x \in [a,b] : |f(x) - g(x)| > \varepsilon\}) < \varepsilon\} \tag{4.22}$$

This metric is defined so that $f_n \to f$ in $M[a,b]$ if and only if $f_n \to f$ in measure, that is $\lim_n \mu(\{x \in [a,b] : |f_n(x) - f(x)| > \varepsilon\}) = 0$ for all $\varepsilon > 0$ (cf. Folland [43] § 2.4 pp. 60 ff and Doob [27] § V.12 pp. 67 ff). This Polish space does not appear in standard references on descriptive set theory, such as Kechris [84]. Hence, for the sake of completeness, we record here a proof that $M[a,b]$ is a Polish space, where this proof is based on Doob [27] § 12 pp. 67-68, but where we key whatever results we can to the standard real analysis textbook Folland [43].
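To get a feel for the metric in equation (4.22), it may help to compute it exactly when $|f - g|$ is a step function. The following is a minimal sketch under that assumption: the input format (a list of `(length, value)` pairs, meaning that $|f - g|$ equals `value` on a set of measure `length`) and the function name `measure_distance` are hypothetical conventions introduced only for this illustration.

```python
from fractions import Fraction


def measure_distance(pieces):
    """Return inf{eps > 0 : mu({|f - g| > eps}) < eps} when |f - g| takes the
    value `value` on a set of measure `length`, for each (length, value) pair."""
    levels = sorted({Fraction(abs(v)) for _, v in pieces} | {Fraction(0)})
    best = None
    for idx, lo in enumerate(levels):
        hi = levels[idx + 1] if idx + 1 < len(levels) else None
        # For eps in [lo, hi) the set {|f - g| > eps} has the constant measure m.
        m = sum((Fraction(l) for l, v in pieces if abs(v) > lo), Fraction(0))
        # On this range the admissible eps are those exceeding m.
        candidate = max(lo, m)
        if hi is None or candidate < hi:
            best = candidate if best is None else min(best, candidate)
    return best


if __name__ == "__main__":
    # f_n = n on a set of measure 1/n and 0 elsewhere on [0, 1]; g = 0.
    for n in (1, 2, 10, 100):
        pieces = [(Fraction(1, n), n), (1 - Fraction(1, n), 0)]
        print(n, measure_distance(pieces))
```

In this example the distance to the zero function is $\frac{1}{n}$, even though the sup-norm of $f_n$ grows without bound: a large difference on a small set counts as small in this metric, which is the behaviour one expects from convergence in measure.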

Proposition 182. $M[a,b]$ is a Polish space, and in particular, the countable dense set can be taken to be the rational-valued simple functions formed from open intervals with rational endpoints.

Proof. For completeness, see Folland [43] Theorem 2.30. For separability, fix $\varepsilon > 0$ and choose a sequence of simple functions $s_n$ which converge a.e. to $f$ (see Folland [43] Theorem 2.10b). By Egoroff's Theorem [43] 2.33, there is a measurable set $E$ such that $\mu(E) < \varepsilon$ and $s_n \to f$ uniformly on $E^c$. Choose $N$ such that $|s_N - f| < \varepsilon$ on $E^c$. Then $\mu(\{x \in [a,b] : |s_N - f| > \varepsilon\}) \le \mu(E) < \varepsilon$. Choose a sequence $\varphi_m$ of rational-valued simple functions formed from open intervals with rational endpoints such that $\varphi_m \to s_N$ a.e. (see Folland [43] Theorem 2.26 & Corollary 2.32). By Egoroff's Theorem [43] 2.33, there is a measurable set $D$ such that $\mu(D) < \varepsilon$ and $\varphi_m \to s_N$ uniformly on $D^c$. Choose $M$ with $|\varphi_M - s_N| < \varepsilon$ on $D^c$. Then $\mu(\{x \in [a,b] : |\varphi_M - s_N| > \varepsilon\}) \le \mu(D) < \varepsilon$ and $\mu(\{x \in [a,b] : |f - \varphi_M| > 2\varepsilon\}) \le \mu(\{x \in [a,b] : |f - s_N| > \varepsilon\}) + \mu(\{x \in [a,b] : |s_N - \varphi_M| > \varepsilon\}) < 2\varepsilon$. Then $d(\varphi_M, f) \le 2\varepsilon$. Hence, the countable dense set can be chosen to be the rational-valued simple functions formed from open intervals with rational endpoints.

Proposition 183. The map $E \mapsto \chi_E$ from $\mathcal{K}[a,b]$ into $M[a,b]$ is Borel.

Proof. By the previous proposition, it suffices to show that the set

$$X_f = \{D \in \mathcal{K}[a,b] : d(\chi_D, f) < \varepsilon\} \tag{4.23}$$

is Borel in $\mathcal{K}[a,b]$ for every rational-valued simple function $f \in M[a,b]$ formed from open intervals with rational endpoints. However, note that

$$X_f = \bigcup_{r \in (0,\varepsilon) \cap \mathbb{Q}} X_{f,r} \tag{4.24}$$

where we define

$$X_{f,r} = \{D \in \mathcal{K}[a,b] : \mu(\{x \in [a,b] : |\chi_D - f| > r\}) < r\} \tag{4.25}$$

To see equation (4.24), note that the right-to-left containment follows trivially from the definition of the metric in equation (4.22). To see the left-to-right containment, suppose that $d(\chi_D, f) < \varepsilon$. Then there is $\eta$ such that $d(\chi_D, f) \le \eta < \varepsilon$ and $\mu(\{x \in [a,b] : |\chi_D - f| > \eta\}) < \eta$. Choose a strictly decreasing sequence of rational values $r_n$ which converge to $\eta$ from above and which are all strictly less than $\varepsilon$: that is, $r_n \searrow \eta^+$ and $\eta < r_{n+1} < r_n < \varepsilon$ and $r_n \in \mathbb{Q}$. Then $\{x \in [a,b] : |\chi_D - f| > r_n\} \subseteq \{x \in [a,b] : |\chi_D - f| > r_{n+1}\}$ and $\{x \in [a,b] : |\chi_D - f| > \eta\} = \bigcup_n \{x \in [a,b] : |\chi_D - f| > r_n\}$. Then

$$\lim_n \mu(\{x \in [a,b] : |\chi_D - f| > r_n\}) = \mu(\{x \in [a,b] : |\chi_D - f| > \eta\}) < \eta = \lim_n r_n \tag{4.26}$$

Then $0 < \lim_n [r_n - \mu(\{x \in [a,b] : |\chi_D - f| > r_n\})]$, and hence there is $N > 0$ such that $0 < r_N - \mu(\{x \in [a,b] : |\chi_D - f| > r_N\})$ or $\mu(\{x \in [a,b] : |\chi_D - f| > r_N\}) < r_N$, and hence $D \in X_{f,r_N}$ as defined in equation (4.25).

Hence, it suffices to show that the sets $X_{f,r}$ from equation (4.25) are Borel, where $f$ is a fixed rational-valued simple function formed from open intervals with rational endpoints and where likewise the rational $r$ is fixed. Since functions in $M[a,b]$ are identified a.e., we have that $f = \sum_{i=1}^N b_i \chi_{[a_i,a_{i+1}]}$ where $a = a_1 < a_2 < \cdots < a_{N+1} = b$. Define constants $c_i$ by $c_i = 1$ if $|1 - b_i| > r$ and $c_i = 0$ otherwise, and likewise define constants $d_i$ by $d_i = 1$ if $|b_i| > r$ and $d_i = 0$ otherwise. Note that the values $a_i, b_i, c_i, d_i$ depend only on $f$ and $r$, which are fixed. Then note that for arbitrary $D \in \mathcal{K}[a,b]$, we have

\begin{align}
\mu(\{x \in [a,b] : |\chi_D(x) - f(x)| > r\}) &= \sum_{i=1}^N \mu(\{x \in [a_i,a_{i+1}] : |\chi_D(x) - f(x)| > r\}) \notag\\
&= \sum_{i=1}^N \mu(\{x \in [a_i,a_{i+1}] : |\chi_D(x) - b_i| > r\}) \notag\\
&= \sum_{i=1}^N \mu(\{x \in [a_i,a_{i+1}] \cap D : |\chi_D(x) - b_i| > r\}) + \mu(\{x \in [a_i,a_{i+1}] \setminus D : |\chi_D(x) - b_i| > r\}) \notag\\
&= \sum_{i=1}^N \mu(\{x \in [a_i,a_{i+1}] \cap D : |1 - b_i| > r\}) + \mu(\{x \in [a_i,a_{i+1}] \setminus D : |b_i| > r\}) \notag\\
&= \sum_{i=1}^N c_i \mu([a_i,a_{i+1}] \cap D) + d_i \mu([a_i,a_{i+1}] \setminus D) \notag\\
&= \sum_{i=1}^N (c_i - d_i)\mu([a_i,a_{i+1}] \cap D) + d_i(a_{i+1} - a_i) \tag{4.27}
\end{align}

Then it follows from the definition in equation (4.25) that

$$X_{f,r} = \Big\{D \in \mathcal{K}[a,b] : \sum_{i=1}^N (c_i - d_i)\mu([a_i,a_{i+1}] \cap D) < r - \Big[\sum_{i=1}^N d_i(a_{i+1} - a_i)\Big]\Big\} \tag{4.28}$$

But this is Borel, since the maps $(D,E) \mapsto D \cap E$ and $E \mapsto \mu(E)$ are Borel (cf. Kechris [84] Exercise 11.4 ii p. 71 and Exercise 17.29 p. 114).
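The identity in equation (4.27) can be sanity-checked on a concrete example. The sketch below assumes, purely for illustration, that the closed set $D$ is a finite union of disjoint closed intervals and that all data are rational; the helper names `overlap`, `measure_by_formula`, and `measure_by_sampling` are hypothetical. The first function evaluates the last line of (4.27), while the second estimates the left-hand side directly by counting midpoints of a uniform grid.

```python
from fractions import Fraction


def overlap(lo, hi, D):
    """Lebesgue measure of [lo, hi] intersected with D (disjoint closed intervals)."""
    return sum((max(Fraction(0), min(hi, d1) - max(lo, d0)) for d0, d1 in D),
               Fraction(0))


def measure_by_formula(breaks, values, D, r):
    """Last line of (4.27): sum of (c_i - d_i) mu([a_i, a_{i+1}] ∩ D) + d_i (a_{i+1} - a_i)."""
    total = Fraction(0)
    for i, b in enumerate(values):
        lo, hi = breaks[i], breaks[i + 1]
        c = 1 if abs(1 - b) > r else 0
        d = 1 if abs(b) > r else 0
        total += (c - d) * overlap(lo, hi, D) + d * (hi - lo)
    return total


def measure_by_sampling(breaks, values, D, r, n=1000):
    """Midpoint-grid estimate of mu({x : |chi_D(x) - f(x)| > r})."""
    a, b = breaks[0], breaks[-1]
    hits = 0
    for k in range(n):
        x = a + (b - a) * Fraction(2 * k + 1, 2 * n)
        chi = 1 if any(d0 <= x <= d1 for d0, d1 in D) else 0
        i = max(j for j in range(len(values)) if breaks[j] <= x)
        if abs(chi - values[i]) > r:
            hits += 1
    return (b - a) * Fraction(hits, n)


if __name__ == "__main__":
    breaks = [Fraction(0), Fraction(1, 2), Fraction(1)]   # a_1 < a_2 < a_3
    values = [Fraction(0), Fraction(1)]                   # b_1, b_2
    D = [(Fraction(1, 4), Fraction(3, 4))]
    r = Fraction(1, 2)
    print(measure_by_formula(breaks, values, D, r),
          measure_by_sampling(breaks, values, D, r))
```

Here $f$ is the characteristic function of $[\frac{1}{2}, 1]$ and $D = [\frac{1}{4}, \frac{3}{4}]$, so the set where $\chi_D$ and $f$ differ by more than $\frac{1}{2}$ is their symmetric difference, and both computations are meant to agree on its measure.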

Remark 184. By e.g. Folland [43] p. 63, addition and multiplication are continuous functions on $M[a,b]$. Note also that absolute value is continuous on $M[a,b]$ since if $f_n \to f$ in measure, then $|f_n| \to |f|$ in measure because $\{x \in [a,b] : ||f_n(x)| - |f(x)|| \ge \varepsilon\} \subseteq \{x \in [a,b] : |f_n(x) - f(x)| \ge \varepsilon\}$.

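Proposition 185. The following sets are Borel:

(i) $\{f \in M[a,b] : f \ge 0\}$

(ii) $\{f \in M[a,b] : \exists \text{ measurable } E \subseteq [a,b]\ \ f = \chi_E\}$

(iii) $\{(f,g) \in (M[a,b])^2 : \exists \text{ disjoint measurable } D, E \subseteq [a,b]\ \ f = \chi_D,\ g = \chi_E\}$

(iv) $\{(f,g) \in (M[a,b])^2 : \exists \text{ measurable } E \subseteq [a,b]\ \ f = \chi_E,\ g = \chi_{[a,b] \setminus E}\}$

(v) $\{(f,r) \in M[a,b] \times \mathbb{R} : \exists \text{ measurable } E \subseteq [a,b]\ \ f = \chi_E \ \&\ \int |f| > r\}$

(vi) $\{(f,r) \in M[a,b] \times \mathbb{R} : \exists \text{ measurable } E \subseteq [a,b]\ \ f = \chi_E \ \&\ \int |f| < r\}$

(vii) $\{(f,r) \in M[a,b] \times \mathbb{R} : \exists \text{ measurable } E \subseteq [a,b]\ \ f = \chi_E \ \&\ \int |f| = r\}$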

Proof. In the proof of this proposition, it is helpful to keep in mind that elements of

the Polish space M [a, b] are identified when they are a.e. equal. For (i), note that

f ≥ 0 a.e. if and only if |f | = f a.e., and recall that absolute value is continuous

by Remark 184. For (ii), note that f is a.e. a characteristic function if and only

if f · f = f a.e., and recall that multiplication is continuous by Remark 184. For

(iii), note that two characteristic functions f, g represent disjoint sets a.e. if and

only if f · g = 0 a.e. Likewise, for (iv), note that two characteristic functions f, g

represent complementary sets a.e. if and only if f · g = 0 a.e. and f + g = 1 a.e.,

where recall that addition and multiplication are continuous by Remark 184.

For (v), note that if f is a.e. the characteristic function χE, then∫|f | = µ(E).

Further µ(E) > r if and only if there is a closed set K ∈ K[a, b] such that K ⊆ E

and µ(K) > r. That is, µ(E) > r if and only if there is a closed set K ∈ K[a, b]

such that 0 ≤ χE − χK a.e. and µ(K) > r, which is analytic by the previous

proposition and Kechris [84] Exercise 17.29 p. 114. Hence, by Souslin’s Theorem

([84] Theorem 14.11), it suffices to show that the relation µ(E) ≤ r is analytic. But

note that µ(E) ≤ r if and only if (b− a)− µ([a, b] \E) ≤ r, which happens if and

only if µ([a, b] \ E) > q for every rational q < (b − a) − r, which is analytic by

part (iv) of this proposition and what was said previously in this paragraph.

For (vi), it again suffices to note that µ(E) < r if and only if (b−a)−µ([a, b]\

E) < r if and only if (b − a) − r < µ([a, b] \ E), so that the result follows from

parts (iv)-(v) of this proposition. Finally, for (vii), note that µ(E) = r if and only

if for every rational q > 0, it follows that r− q < µ(E) < r+ q, so that the result

follows immediately from parts (v)-(vi).

Proposition 186. L1[a, b] is Borel in M [a, b].

Proof. First note that by the previous proposition, we can talk of rational-valued

simple functions as finite sums of pairwise disjoint characteristic functions mul-

tiplied by a rational value. Hence on the one hand, f ∈ L1[a, b] if and only if

there is rational M > 0 such that for any rational-valued simple function ϕ with

|ϕ| ≤ |f |, it is the case that∫|ϕ| < M , and so L1[a, b] is co-analytic in M [a, b]. On

the other hand, to show that L1[a, b] is analytic in M [a, b], it suffices to show that

f ∈ L1[a, b] if and only if there is a sequence of rational-valued simple functions

|ϕn| → |f| in M[a, b] with |ϕn| ≤ |ϕn+1| ≤ |f| a.e. such that $\lim_n \int |\varphi_n|$ exists and is finite.

To see this equivalence, first suppose that f ∈ L1[a, b]. Then |f| ∈ L1[a, b], and hence we may choose a sequence of rational-valued simple functions 0 ≤ ϕn ≤ ϕn+1 ≤ |f| with ϕn → |f| a.e. Let ψn = |f| − ϕn. Then ψn ∈ L1[a, b] and ψn → 0 a.e. and |ψn| ≤ |f| + |ϕn| ≤ 2|f|. By the Dominated Convergence Theorem (Folland [43] Theorem 2.24 p. 54), we have that $0 = \int_a^b 0 = \lim_n \int_a^b (|f| - \varphi_n)$. Hence $\lim_n \int_a^b \varphi_n = \int_a^b |f|$ and so $\lim_n \int_a^b \varphi_n$ exists. Further, ϕn → |f| in L1[a, b] and so ϕn → |f| in M[a, b] by Folland [43] Proposition 2.29 p. 61. Second, suppose that there is a sequence of rational-valued simple functions |ϕn| → |f| in M[a, b] with |ϕn| ≤ |ϕn+1| ≤ |f| a.e. such that $\lim_n \int |\varphi_n|$ exists. By Folland [43] Theorem 2.30, there is a subsequence $\varphi_{n_k}$ such that $|\varphi_{n_k}| \to |f|$ a.e. By the Monotone Convergence Theorem (Folland [43] Theorem 2.14), we have that $\int_a^b |f| = \lim_k \int_a^b |\varphi_{n_k}| < \infty$, so that in fact f ∈ L1[a, b].

Proposition 187. Suppose that f ∈ M[a, b] and K ∈ K[a, b]. Then (p, q) ∩ Df(K) = ∅ if and only if for all rational [r, s] ⊆ (p, q) it is the case that $f\chi_{[r,s] \cap K} \in L^1[a, b]$.

Proof. The left-to-right direction follows immediately from Corollary 171. For the right-to-left direction, suppose for the sake of contradiction that x ∈ (p, q) ∩ Df(K). Choose rational (r, s) ∋ x such that [r, s] ⊆ (p, q). By hypothesis $f\chi_{[r,s] \cap K} \in L^1[a, b]$ and so $f\chi_{(r,s) \cap K} \in L^1[a, b]$. But this contradicts that x ∈ Df(K).

Corollary 188. The map (f, K) ↦ Df(K) is Borel from M[a, b] × K[a, b] to K[a, b].

Proof. First recall that being a Borel map is the same as having a Borel graph

(cf. Kechris [84] Theorem 14.12). Hence, it suffices to show that the following set

is Borel

\[
G = \{(f, K, E) \in M[a,b] \times (K[a,b])^2 : E = D_f(K)\} \tag{4.29}
\]

But note that since K, E are closed sets, it follows that

\[
G = \{(f, K, E) \in M[a,b] \times (K[a,b])^2 : \forall\, (p,q) \in \mathbb{Q}^2 \text{ with } p < q\ \ [(p,q) \cap E = \emptyset \iff (p,q) \cap D_f(K) = \emptyset]\} \tag{4.30}
\]

But the left-hand side of this biconditional is Borel in K[a, b] by definition of

the topology on K[a, b] (cf. Remark 180), while the right-hand side of this bi-

conditional is Borel in M [a, b]×K[a, b] by Proposition 187, Proposition 186, and

Proposition 183.

Proposition 189. The relation F ∈ AC∗(E) is Borel on C[a, b]×K[a, b].

Proof. By definition, F ∈ AC∗(E) if for every ε > 0 there is a δ > 0 such that for all E-edged sub-partitions D of [a, b], if the intervals J ∈ D have total length less than δ, then $\sum_{J \in D} \omega(F, J) < \varepsilon$.

Since F is continuous, Proposition 126 (iii) says that we may replace E by a countable dense subset {dn(E)} of E. Further, the maps dn : K[a, b] → [a, b] may be chosen to be Borel (see Kechris [84] Theorem 12.13 p. 76). Moreover, consider the closed subset △ = {(c, d) ∈ R × R : c ≤ d}, which is thus a Polish space, and note that the map (F, c, d) ↦ ω(F, [c, d]) from C[a, b] × △ to R is clearly a continuous map. Finally, let $\omega^{<\omega}$ denote the set of finite strings of natural numbers, where |σ| denotes the length of the string σ. Then it follows that

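\[
\{(F, E) \in C[a,b] \times K[a,b] : F \in \mathrm{AC}_*(E)\} = \bigcap_{\varepsilon \in \mathbb{Q} \cap (0,\infty)} \ \bigcup_{\delta \in \mathbb{Q} \cap (0,\infty)} \ \bigcap_{\ell > 0} \ \bigcap_{|\sigma| = 2\ell} X_{\varepsilon,\delta,\ell,\sigma} \tag{4.31}
\]

(The two inner operations are intersections, since the definition quantifies universally over sub-partitions.)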

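where for σ = 〈σ(1), . . . , σ(2ℓ)〉 in $\omega^{<\omega}$ of length 2ℓ we define

\[
\begin{aligned}
X_{\varepsilon,\delta,\ell,\sigma} = \Big\{ (F, E) \in C[a,b] \times K[a,b] :\ & \Big[ \bigwedge_{i=1}^{\ell} d_{\sigma(2i-1)}(E) < d_{\sigma(2i)}(E) \\
&\ \ \&\ \bigwedge_{i=1}^{\ell-1} d_{\sigma(2i)}(E) \le d_{\sigma(2i+1)}(E) \\
&\ \ \&\ \sum_{i=1}^{\ell} \big(d_{\sigma(2i)}(E) - d_{\sigma(2i-1)}(E)\big) < \delta \Big] \\
&\Rightarrow \sum_{i=1}^{\ell} \omega\big(F, [d_{\sigma(2i-1)}(E), d_{\sigma(2i)}(E)]\big) < \varepsilon \Big\}
\end{aligned} \tag{4.32}
\]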

Since the maps E ↦ dn(E) and (F, c, d) ↦ ω(F, [c, d]) are Borel, it thus follows that $X_{\varepsilon,\delta,\ell,\sigma}$ is Borel and hence that the relation F ∈ AC∗(E) is Borel.

Proposition 190. Suppose that F ∈ C[a, b] and K ∈ K[a, b]. Then (p, q) ∩

DF (K) = ∅ if and only if for all rational [r, s] ⊆ (p, q) it is the case that F ∈

AC∗([r, s] ∩K).

Proof. The left-to-right direction follows immediately from Corollary 171. For

the right-to-left direction, suppose for the sake of contradiction that x ∈ (p, q) ∩

DF (K). Choose rational (r, s) ∋ x such that [r, s] ⊆ (p, q). By hypothesis F ∈

AC∗([r, s]∩K) and so F ∈ AC∗((r, s) ∩K). But this contradicts that x ∈ DF (K).

Corollary 191. The map (F, K) ↦ DF (K) is Borel from C[a, b] × K[a, b] to

K[a, b].

Proof. First recall that being a Borel map is the same as having a Borel graph

(cf. Kechris [84] Theorem 14.12). Hence, it suffices to show that the following set

is Borel

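\[
G = \{(F, K, E) \in C[a,b] \times (K[a,b])^2 : E = D_F(K)\} \tag{4.33}
\]

But note that since K, E are closed sets, it follows that

\[
G = \{(F, K, E) \in C[a,b] \times (K[a,b])^2 : \forall\, (p,q) \in \mathbb{Q}^2 \text{ with } p < q\ \ [(p,q) \cap E = \emptyset \iff (p,q) \cap D_F(K) = \emptyset]\} \tag{4.34}
\]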

But the left-hand side of this biconditional is Borel in K[a, b] by definition of the topology on K[a, b] (cf. Remark 180), while the right-hand side of this biconditional is Borel in C[a, b] × K[a, b] by Proposition 190, Proposition 189, and the fact that the map (D, L) ↦ D ∩ L from (K[a, b])² to K[a, b] is Borel (cf. Kechris [84] Exercise 11.4 ii p. 71).

Corollary 192. The map (f, F, K) ↦ Df,F (K) is Borel from M[a, b] × C[a, b] ×

K[a, b] to K[a, b].

Proof. This follows directly from the two previous corollaries, as well as the fact

that Df,F (K) = Df (K) ∪ DF (K) (cf. Proposition 162 (i)) and the fact that the map (D, E) ↦ D ∪ E is continuous (see Kechris [84] Exercise 4.29 iv p. 27).

Theorem 193. The following sets are co-analytic and the ranks |K|f , |K|F , and

|K|f,F are co-analytic ranks on these sets:

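(i) $\{(f, K) \in M[a,b] \times K[a,b] : \exists\, \alpha < \omega_1\ D^{\alpha}_{f}(K) = \emptyset\}$

(ii) $\{(F, K) \in C[a,b] \times K[a,b] : \exists\, \alpha < \omega_1\ D^{\alpha}_{F}(K) = \emptyset\}$

(iii) $\{(f, F, K) \in M[a,b] \times C[a,b] \times K[a,b] : \exists\, \alpha < \omega_1\ D^{\alpha}_{f,F}(K) = \emptyset\}$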

Proof. By Corollaries 188, 191 and 192, the derivatives Df, DF and Df,F are Borel derivatives. The result then follows immediately from Kechris [84] Theorem 34.10 & Exercise 34.13.

Proposition 194. The class of (F, f) in C[a, b]×M [a, b] such that F is differen-

tiable a.e. and F ′ = f is Borel.

Proof. Let us define

C = {(F, f) ∈ C[a, b]×M [a, b] : F ′ exists a.e. & F ′ = f} (4.35)

B = {F ∈ C[a, b] : F ′ exists a.e.} (4.36)


Then the proposition requires us to prove that C is Borel. Note that C is the graph of the function Γ : B → M[a, b] given by Γ(F) = F′. For fixed ε > 0 and function f ∈ M[a, b], define the sets

Bf = {F ∈ B : d(Γ(F), f) < ε} (4.37)

Af = {F ∈ C[a, b] : d(F, f) < ε} (4.38)

where d is the metric on M[a, b] from equation (4.22). We claim that to show that C is Borel, it suffices to show that (i) B is Borel, and that (ii) Bf is Borel for each f from a countable dense subset of M[a, b], and that (iii) Af is likewise Borel for these same f ∈ M[a, b]. For, suppose that (i)-(iii) have been established. Define $\overline{\Gamma} : C[a,b] \to M[a,b]$ by $\overline{\Gamma}(F) = \Gamma(F)$ on B and $\overline{\Gamma}(F) = F$ on C[a, b] \ B. Then it follows from (i)-(iii) that $\overline{\Gamma}$ is a Borel map. For, suppose that Uf = {g ∈ M[a, b] : d(g, f) < ε} where f is an element of the countable dense set, so that it suffices to show that $\overline{\Gamma}^{-1}(U_f)$ is Borel. But this follows immediately from our hypotheses (i)-(iii) and the equality

\[
\overline{\Gamma}^{-1}(U_f) = B_f \sqcup (A_f \setminus B) \tag{4.39}
\]

But since being a Borel map is the same as having a Borel graph (cf. Kechris [84] Theorem 14.12), it follows that $\overline{\Gamma}$ has a Borel graph, and hence we can infer that C is Borel from the following equality and the hypothesis (i):

\[
C = \mathrm{graph}(\overline{\Gamma}) \cap (B \times M[a,b]) \tag{4.40}
\]

Hence, in fact, it suffices to establish (i)-(iii).


For (i), we must establish that B from equation (4.36) is Borel. To this end,

define

D = {(F, x) ∈ C[a, b]× [a, b] : F ′(x) exists} (4.41)

Further, for F ∈ C[a, b], x ∈ [a, b] and |h| > 0, define $\triangle_{(F,x)}(h) = \frac{F(x+h) - F(x)}{h}$. Then D is analytic, since for F ∈ C[a, b] we have (where Q+ = Q ∩ (0, ∞))

\[
(F, x) \in D \iff \exists\, L \in \mathbb{R}\ \forall\, \varepsilon \in \mathbb{Q}^{+}\ \exists\, \delta \in \mathbb{Q}^{+}\ \forall\, |h| \in \mathbb{Q} \cap (0, \delta)\ \ \big|\triangle_{(F,x)}(h) - L\big| < \varepsilon \tag{4.42}
\]

Likewise, D is co-analytic, since for F ∈ C[a, b] we have

\[
(F, x) \in D \iff \forall\, h_n, h'_n \to 0\ \big[\triangle_{(F,x)}(h_n), \triangle_{(F,x)}(h'_n) \text{ Cauchy } \&\ \big|\triangle_{(F,x)}(h_n) - \triangle_{(F,x)}(h'_n)\big| \to 0\big] \tag{4.43}
\]

Hence, by Souslin’s Theorem ([84] Theorem 14.11), it follows that D is Borel. Since D is Borel, the following set is Borel by Kechris [84] Theorem 17.25:

\[
\{F \in C[a,b] : \mu(D_F) = b - a\} \tag{4.44}
\]

But this set is precisely equal to B, so that B too is Borel.

The proofs of (ii) and (iii) are nearly identical, and so we include only the

proof of (ii). For this, it must be shown that Bf from equation (4.37) is Borel for

f ∈ M[a, b] from some countable dense set in M[a, b]. So by Proposition 182, we may suppose that f ∈ M[a, b] is a rational-valued simple function formed from open intervals with rational endpoints, so that $f = \sum_{i=1}^{N} b_i \chi_{[a_i, a_{i+1}]}$ where a = a1 < a2 < · · · < aN+1 = b. Further note that, as in the proof of Proposition 183 (cf. equation (4.24)), it suffices to show that the following set is Borel:

\[
B_{f,r} = \{F \in B : \mu(\{x \in [a,b] : |F'(x) - f(x)| > r\}) < r\} \tag{4.45}
\]

Then in analogue to equation (4.41), define

Di,r = {(F, x) ∈ C[a, b]× [ai, ai+1] : F ′(x) exists & |F ′(x)− bi| > r} (4.46)

Then, just as in the proof of (i) in the above paragraph, it can be shown that Di,r

is Borel. Since it is Borel, the following set is Borel by Kechris [84] Theorem 17.25:

\[
\Big\{F \in B : \sum_{i=1}^{N} \mu\big((D_{i,r})_F\big) < r\Big\} \tag{4.47}
\]

But this set is precisely equal to Bf,r, which is what we wanted to show.

Corollary 195. The set of (f, F) in M[a, b] × C[a, b] such that f ∈ Den[a, b] and $F(x) = \int_a^x f$ is co-analytic but not analytic, and hence, assuming analytic determinacy, this set is $\Pi^1_1$-complete.

Proof. That this set is co-analytic follows immediately from the previous propo-

sition and theorem, as well as Corollary 174. That the set is not analytic follows

from the fact that if the set is Borel then there is α < ω1 such that |f, F | ≤ α

for all f, F in the class (see Kechris [84] Theorem 35.23). But this contradicts

Corollary 166. The last statement about determinacy is just Kechris [84] Theo-

rem 26.4.


Corollary 196. For all α < ω1 the set of (f, F) in M[a, b] × C[a, b] such that f ∈ Den∗α[a, b] and $F(x) = \int_a^x f$ is Borel, and the set Den∗α[a, b] is analytic.

Proof. That this set is Borel follows from the previous theorem and Corollary 177. That the set Den∗α[a, b] is analytic follows from the fact that its definition requires us to say that there are g ∈ Den∗β[a, b] and $G(x) = \int_a^x g$.

Corollary 197. The set of F in C[a, b] such that F ∈ ACG∗([a, b]) is co-analytic but not analytic, and hence, assuming analytic determinacy, this set is $\Pi^1_1$-complete.

Proof. That this set is co-analytic follows immediately from the previous theorem and Proposition 159 (v). That the set is not analytic follows from the fact that if the set is Borel then there is α < ω1 such that |F| ≤ α for all F in the class (see Kechris [84] Theorem 35.23). But this contradicts Corollary 166. The last statement about determinacy is just Kechris [84] Theorem 26.4.

4.4 Model Theory

There are many different languages in which one can view C[a, b], L1[a, b], 〈Den∗α[a, b]〉, 〈Denα[a, b]〉 and Den[a, b]. For instance, as abelian groups, they are all isomorphic since all are divisible torsion-free abelian groups of cardinality $2^{\omega}$, and divisible torsion-free abelian groups are uncountably categorical (cf. Marker [107] Corollary 3.1.11 p. 72). In this section, the relationship between C[a, b], L1[a, b], 〈Den∗α[a, b]〉, 〈Denα[a, b]〉 and Den[a, b] as Q[X]-modules (resp. R[X]-modules) is studied, where we interpret the map f ↦ Xf as the indefinite integral, so that $Xf = \int_a^x f$. More specifically, it is assumed that this indefinite integral is the Riemann integral on C[a, b], the Lebesgue integral on L1[a, b], and the Denjoy integral on 〈Den∗α[a, b]〉, 〈Denα[a, b]〉 and Den[a, b]. Note too that this integral is the indefinite integral, so that e.g. if Xf = 0, then f = 0.

Recall that the signature of R-modules is simply the signature of abelian groups equipped with linear maps r for each element r of R (cf. Marker [107] Example 1.2.7 p. 17, Hodges [70] p. 37 and Appendix A1 pp. 653 ff, Prest [124] p. 2).

Hence, e.g. the signature of R[X]-modules is uncountable, whereas the signature

of Q[X]-modules is countable. Likewise, since elements r of R correspond to lin-

ear maps in an R-module M , subsets of M such as rM = {ra : a ∈ M} and

ker(r) = {a ∈M : ra = 0} are definable without parameters in M .

There are two main results of this section. The first is that the structures

C[a, b], L1[a, b], 〈Den∗α[a, b]〉, 〈Denα[a, b]〉 and Den[a, b] are stable but not superstable as Q[X]-modules or R[X]-modules, and that the indefinite integral $f \mapsto \int_a^x f$ is not definable in these structures as Q-vector spaces or R-vector spaces (cf. Corollary 204). The second main result is that as Q[X]-modules, or

R[X]-modules, these structures are elementarily equivalent, and as Q[X]-modules

their complete theory is decidable (cf. Corollary 227).

4.4.1 Indexes of Subgroups and Non-Definability of the Integral

Remark 198. The goal of this subsection is to establish that the index $[X^kM : X^{k+1}M]$ of $X^{k+1}M$ in $X^kM$ is infinite, where M ⊆ M[a, b] is one of C[a, b], L1[a, b], 〈Den∗α[a, b]〉, 〈Denα[a, b]〉 or Den[a, b] (cf. Theorem 200). This result is used to show that these modules are stable but not superstable, which in turn shows us that the indefinite integral $Xf = \int_a^x f$ is not definable in M as a vector space over the reals or rationals (cf. Corollary 204). This information about the cardinality of $[X^kM : X^{k+1}M]$ will also be used in the next subsection to show that all of these modules are elementarily equivalent.

Remark 199. Recall from Definition 141 that a subset X ⊆ M[a, b] is said to be subinterval-closed if f ∈ X and (c, d) ⊆ (a, b) implies $f\chi_{(c,d)}$ ∈ X. Further, recall that it was shown in Proposition 157 that the subspaces 〈Denα[a, b]〉 and 〈Den∗α[a, b]〉 are subinterval-closed. Finally, note that L1[a, b] and Den[a, b] are subinterval-closed. Hence, the following theorem can be applied to all these modules.

Theorem 200. Suppose that M is a submodule of Den[a, b] which contains C[a, b]. Suppose further that one of the following conditions holds: (i) M = C[a, b] or (ii) M is subinterval-closed. Then $[X^kM : X^{k+1}M]$ is infinite.

Proof. First we show this for M satisfying hypothesis (i). For each f ∈ M we may choose g ∈ C[a, b] such that f = g a.e., and so M may be identified with C[a, b]. This implies that for k ≥ 0 we have

\[
X^kM = \{f \in C^k[a,b] : \forall\, i < k\ \ f^{(i)}(a) = 0\} \tag{4.48}
\]

where we stipulate $X^0M = M$ and $C^0[a,b] = C[a,b]$. For, in the case of k = 0, this follows by our stipulation. Suppose that (4.48) holds for k. To see it holds for k + 1, consider first the left-to-right containment. That is, suppose that $f \in X^{k+1}M$. Then $f = \int_a^x g$ where $g \in X^kM \subseteq M = C[a,b]$. Then since this is the Riemann integral applied to continuous functions, it follows that f is differentiable everywhere and that f′ = g, where g is by hypothesis a continuous function. Then $f^{(i+1)}(a) = g^{(i)}(a) = 0$ for all i < k by induction hypothesis and $f(a) = \int_a^a g = 0$, so that $f \in C^{k+1}[a,b]$ and $\forall\, i < k+1\ f^{(i)}(a) = 0$. For the right-to-left containment of (4.48) in the case of k + 1, suppose that $f \in C^{k+1}[a,b]$ and $\forall\, i < k+1\ f^{(i)}(a) = 0$. Let g = f′ which by hypothesis is in C[a, b] = M. Then by induction hypothesis, it follows that $g \in X^kM \subseteq M = C[a,b]$, so that $\int_a^x g = \int_a^x f' = f(x) - f(a) = f(x)$, so that $f \in X^{k+1}M$. Hence, in fact (4.48) holds for all k ≥ 0.

Now $C^k[a,b]$ is a Banach space with norm given by

\[
\|f\|_{u,k} = \sum_{0 \le i \le k} \|f^{(i)}\|_u \tag{4.49}
\]

where ‖ · ‖u is the sup-norm on C[a, b] (cf. Folland [43] Exercise 9 p. 155). From this and equation (4.48) it follows that $X^kM$ is a closed subgroup of $C^k[a,b]$ and hence is itself a Polish group (cf. Gao [51] Proposition 2.2.1 p. 45). Now, note that for all k ≥ 0, it is the case that $X^kM$ and $X^{k+1}M$ are homeomorphic by the map f ↦ Xf. This map is clearly bijective, and it is continuous since if $\|f - g\|_{u,k} < \min\{\frac{\varepsilon}{2}, \frac{\varepsilon}{2(b-a)}\}$, then

\[
\begin{aligned}
\|Xf - Xg\|_{u,k+1} &= \sum_{0 \le i \le k+1} \|(X(f-g))^{(i)}\|_u \le \sup_{x \in [a,b]} \int_a^x |f - g| + \sum_{0 \le i \le k} \|(f-g)^{(i)}\|_u \\
&< \Big[\sup_{x \in [a,b]} \int_a^x \frac{\varepsilon}{2(b-a)}\Big] + \frac{\varepsilon}{2} = \frac{\varepsilon}{2(b-a)} \cdot (b-a) + \frac{\varepsilon}{2} = \varepsilon
\end{aligned} \tag{4.50}
\]

Note that in this equation, it is important to remember that the integral is the Riemann integral, and hence it is permissible to infer from the integrability of a function to the integrability of its absolute value (cf. Remark 169). Further, this map is open since if $\|Xf - Xg\|_{u,k+1} < \varepsilon$ then

\[
\begin{aligned}
\|f - g\|_{u,k} &\le \|Xf - Xg\|_u + \sum_{0 \le i \le k} \|(f-g)^{(i)}\|_u \\
&= \|Xf - Xg\|_u + \sum_{1 \le i \le k+1} \|(Xf - Xg)^{(i)}\|_u = \|Xf - Xg\|_{u,k+1} < \varepsilon
\end{aligned} \tag{4.51}
\]

Hence, in fact $X^kM$ and $X^{k+1}M$ are homeomorphic via the map f ↦ Xf.

By induction on k ≥ 0, it follows from this that $X^{k+1}M$ is meager in $X^kM$. For k = 0, note that XM = XC[a, b] is meager in M = C[a, b] since the nowhere differentiable functions are comeager in M and contained in the set M \ XM (cf. Munkres [115] Theorem 49.1 p. 300). Suppose that it holds for k, that is suppose that $X^{k+1}M$ is meager in $X^kM$. Since meagerness is preserved under homeomorphisms, it follows that $X^{k+2}M$ is meager in $X^{k+1}M$, which is just to say that the statement holds for k + 1.

From this it easily follows that $[X^kM : X^{k+1}M]$ is infinite, and indeed uncountable. For, suppose that $[X^kM : X^{k+1}M]$ were countable. Then $X^kM = \bigsqcup_n (g_n + X^{k+1}M)$, where $g_n \in X^kM$. Since $X^kM$ is a Polish group and $X^{k+1}M$ is meager in $X^kM$, we have that each coset $g_n + X^{k+1}M$ is meager in $X^kM$ (since addition by a constant is a homeomorphism in any Polish group). Hence, the Polish space $X^kM$ is a countable union of meager subsets, contradicting the Baire Category Theorem. So $[X^kM : X^{k+1}M]$ is infinite (and indeed uncountable) for M satisfying hypothesis (i).

Now we show this for M satisfying hypothesis (ii). Suppose that this fails, and $[X^kM : X^{k+1}M]$ is finite. Then $X^kM = \bigsqcup_{i=1}^{n} (X^kf_i + X^{k+1}M)$, where $f_i \in M$. Choose a continuous nowhere differentiable function g ∈ C[a, b] ⊆ M. Choose a partition [a, b] = [a1, b1] ⊔ · · · ⊔ [an, bn], and let $h = X^k[g + \sum_{i=1}^{n} f_i \chi_{[a_i,b_i]}]$, which is in $X^kM$ since M is subinterval-closed. So, by hypothesis, there is j ∈ [1, n] such that $h \in X^kf_j + X^{k+1}M$. Then

\[
h - X^kf_j = X^k\Big[g + \Big(\sum_{i=1}^{n} f_i \chi_{[a_i,b_i]}\Big) - f_j\Big] \in X^{k+1}M \tag{4.52}
\]

From this it follows that

\[
g + \Big(\sum_{i=1}^{n} f_i \chi_{[a_i,b_i]}\Big) - f_j \in XM \tag{4.53}
\]

But then this function is differentiable a.e. and so differentiable a.e. on each [ai, bi]. But on the interval [aj, bj], this function is equal to g, which contradicts the choice of g. So $[X^kM : X^{k+1}M]$ is infinite when M satisfies hypothesis (ii).

Remark 201. It is a classical result that all modules are stable (cf. Prest [124]

Theorem 3.1 (a) p. 55, or Hodges [70] Theorem A.1.13 p. 660). However, on the

basis of the following proposition, it can be inferred from the previous theorem

that the modules of integrable functions which we are concerned with are not

superstable.

Proposition 202. A module M is superstable if and only if there is no infinite

descending sequence of definable subgroups, each of infinite index in its predeces-

sor.

Proof. See Prest [124] Theorem 3.1 (b) p. 55, or Ziegler [162] Theorem 2.1 p. 156.

Corollary 203. Suppose that M is a submodule of Den[a, b] which contains

C[a, b]. Suppose further that one of the following conditions holds: (i) M = C[a, b]


or (ii) M is subinterval-closed. Then M is stable but not superstable. Further,

the map X : M →M is not definable in M as a vector-space over R or Q.

Proof. The part about stability and superstability follows immediately from the

above proposition (Proposition 202) and the previous theorem (Theorem 200).

Suppose that the map X : M → M were definable in M as a vector space over R or Q. Then since M as a vector space over either of these fields is superstable (cf. Hodges [70] p. 330 and Exercise 6 p. 283), and since superstability is preserved downward under definability, it would follow that M as an R[X]-module or Q[X]-module would be superstable, contradicting the first part of this corollary.

Corollary 204. Suppose that M is C[a, b], L1[a, b], 〈Den∗α[a, b]〉, 〈Denα[a, b]〉 or Den[a, b], considered as an R[X]-module (resp. Q[X]-module), where $Xf = \int_a^x f$. Then M is stable but not superstable, and the map X : M → M is not definable in the structure of M as an R-vector space (resp. Q-vector space).

Proof. This follows immediately from the previous corollary, keeping in mind that

the conditions of the previous corollary are satisfied for these modules, as we noted

in Remark 199.

4.4.2 Elementary Equivalence and Decidability

Remark 205. Considered as abelian groups or as Q-modules, the structures C[a, b], L1[a, b], 〈Den∗α[a, b]〉, 〈Denα[a, b]〉, and Den[a, b] are elementarily equivalent and have decidable complete theories, since these theories are respectively the theories of divisible torsion-free abelian groups and Q-vector spaces. But considered in the language of rings, the theory of C[a, b] is very complex, since it is known to interpret full second-order arithmetic (cf. Cherlin [20] pp. 47-48), and it is known that Den[a, b] is not closed under multiplication (cf. Swartz [142] Example 14 p. 43). Hence, it is natural to ask after the elementary equivalence and decidability of the complete theories of C[a, b], L1[a, b], 〈Den∗α[a, b]〉, 〈Denα[a, b]〉, and Den[a, b] as Q[X]-modules, where again X is interpreted as the indefinite integral, so that $Xf = \int_a^x f$. In this section, we show that these structures are elementarily equivalent and that their complete theory is decidable (cf. Corollary 227).

Definition 206. If M is a module over a ring R, then a pp-formula ϕ(x1, . . . , xj) is a formula of the form $\exists\, y_1, \ldots, y_k \bigwedge_{i=1}^{n} \varphi_i(x_1, \ldots, x_j, y_1, \ldots, y_k)$ where each ϕi is an atomic formula. Further, since the language is that of R-modules, atomic formulas have the form $\psi(x_1, \ldots, x_j, y_1, \ldots, y_k) \equiv \sum_{\ell=1}^{j} r_\ell x_\ell + \sum_{\ell=1}^{k} s_\ell y_\ell = 0$, where $r_\ell, s_\ell \in R$. Hence, for a pp-formula ϕ(x1, . . . , xj), there is an n × j matrix A and an n × k matrix B with entries in R such that $M \models \exists\, y \bigwedge_{i=1}^{n} \varphi_i(x, y) \iff \exists\, y\ Ax + By = 0$, where we view x as a j × 1 matrix and y as a k × 1 matrix, and where 0 is the n × 1 matrix with entries all equal to 0.

Remark 207. Note that any subset G ⊆ M j defined by a pp-formula is a sub-

group of M j.

Definition 208. The invariant sentences of Th(M) are sentences of the form

[G : G ∩ H] = k or [G : G ∩ H] > k, where k ≥ 0 and where G,H ⊆ M are

pp-definable subgroups of M which are definable without parameters.

Theorem 209. (pp-Elimination of Quantifiers) (i) Every set definable without

parameters in an R-module M is a Boolean combination of pp-definable sets. (ii)

For an R-module M , the theory Th(M) is axiomatized by the R-module axioms

and the invariant sentences of M .

Proof. See Prest [124] Corollaries 2.16 & 2.19 p. 37 and Hodges [70] p. 655.


Definition 210. A pp-formula ϕ(x1, . . . , xj) is said to be basic if it can be written as $\exists\, y\ (\sum_{\ell=1}^{j} r_\ell x_\ell) + sy = 0$ or as $r_\ell x_\ell = 0$. That is, over an R-module M, the basic pp-formula definable sets are $r^{-1}sM^j$ or ker(r) since

\[
a \in r^{-1}sM^j \iff r \cdot a \in sM^j \iff \exists\, b \in M\ \ r \cdot a = sb \iff \exists\, b \in M\ \ r \cdot a + sb = 0
\]

\[
a \in \ker(r) \iff r \cdot a = 0
\]
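To make the two kinds of basic pp-definable sets concrete, here is a small sketch in the case j = 1. It uses the finite Z-module Z/12 purely as an illustrative stand-in (this module does not appear in the text, and the function names are hypothetical); it also computes an index of the form [G : G ∩ H] as in Definition 208.

```python
def multiples(s, n):
    """The subgroup sM of M = Z/n."""
    return {(s * b) % n for b in range(n)}


def preimage(r, s, n):
    """The set r^{-1} s M = {a in M : r*a lies in sM} for M = Z/n."""
    s_m = multiples(s, n)
    return {a for a in range(n) if (r * a) % n in s_m}


def kernel(r, n):
    """The set ker(r) = {a in M : r*a = 0} for M = Z/n."""
    return {a for a in range(n) if (r * a) % n == 0}


def index(G, H):
    """The index [G : G ∩ H] for finite subgroups G, H of the same group."""
    return len(G) // len(G & H)


G = preimage(2, 4, 12)   # elements a with 2a in {0, 4, 8}
H = kernel(3, 12)        # elements a with 3a = 0
print(sorted(G), sorted(H), index(G, H))
```

Here G consists of the even residues and H = {0, 4, 8}, so the invariant sentence [G : G ∩ H] = 2 holds in this toy module.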

Proposition 211. (i) If R is a PID, then every pp-formula is equivalent to a finite conjunction of basic pp-formulas. (ii) Further, if R is countable, then given a pp-formula one can compute from R the finite conjunction of basic pp-formulas.

Proof. The proof of (i) is from Prest [124] Theorem 2.Z.1 pp. 46-47, which we

include merely for the sake of noting that this proof also gives us a proof of (ii).

The pp-formula defines a set ∃ y Ax + By = 0. Since R is a PID, there is a diagonal matrix D and invertible matrices U, V such that UBV = D. Then

\[
\begin{aligned}
\exists\, y\ Ax + By = 0 &\iff \exists\, y\ UAx + UBVV^{-1}y = 0 \\
&\iff \exists\, y\ UAx + DV^{-1}y = 0 \iff \exists\, y\ UAx + Dy = 0
\end{aligned} \tag{4.54}
\]

Since D is diagonal, this is equivalent to a finite conjunction of basic pp-formulas. Moreover, since there is an algorithm for computing the matrices D, U, V from the matrix B and an oracle for R, this procedure is computable in R.

Definition 212. Suppose that M is a normed space. Then a compact linear

operator p : M → M is a linear operator which maps bounded sets to sets with

compact closure.

Theorem 213. (Riesz Theorem) Suppose that M is a normed space and p : M →


M is a compact linear operator, and consider the map 1+p given by a 7→ a+p(a).

Then 1 + p is surjective if and only if 1 + p is injective.

Proof. See Kress [94] Theorem 3.4 p. 32.

Definition 214. Suppose that U ⊆ C[a, b]. Then U is pointwise bounded if for every x ∈ [a, b] there is M > 0 such that |f(x)| ≤ M for all f ∈ U. Further, U is equicontinuous if for every ε > 0 there is δ > 0 such that |x − y| < δ implies |f(x) − f(y)| < ε for all f ∈ U.

Theorem 215. (Arzela-Ascoli) Suppose that U ⊆ C[a, b]. Then U has compact

closure if and only if U is pointwise bounded and equicontinuous.

Proof. See Folland [43] Theorem 4.43.

Proposition 216. The map X : C[a, b] → C[a, b] given by Xf =∫ xaf is a

compact linear operator.

Proof. Suppose that U ⊆ C[a, b] is bounded, say ‖f‖u < M for all f ∈ U. We must show that XU is relatively compact, which by the Arzela-Ascoli Theorem comes down to showing that it is pointwise bounded and equicontinuous. For pointwise boundedness, simply note that $|(Xf)(x)| = \left|\int_a^x f(t)\,dt\right| \le \int_a^x |f(t)|\,dt \le \int_a^x M\,dt \le M(b-a)$. For equicontinuity, suppose that ε > 0 and let $\delta < \frac{\varepsilon}{M}$. If 0 < x − y < δ then $|(Xf)(x) - (Xf)(y)| = \left|\int_a^x f(t)\,dt - \int_a^y f(t)\,dt\right| = \left|\int_y^x f(t)\,dt\right| \le \int_y^x |f(t)|\,dt \le \int_y^x M\,dt = M(x-y) < M\frac{\varepsilon}{M} = \varepsilon$.
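The two estimates in this proof can be spot-checked numerically. The sketch below is only a sanity check under stated assumptions: it takes [a, b] = [0, 1], the bounded family U = {cos(nt) : n = 1, ..., 5} with M = 1, approximates X by the trapezoid rule, and allows a small tolerance for the quadrature error; none of these choices comes from the text.

```python
import math


def X(f, a, x, steps=2000):
    """Trapezoid-rule approximation of (Xf)(x), the integral of f from a to x."""
    h = (x - a) / steps
    total = 0.5 * (f(a) + f(x)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h


def check_bounds(U, a, b, M, points=11, tol=1e-5):
    """Check |Xf(x)| <= M(b-a) and |Xf(x) - Xf(y)| <= M|x-y| on a grid."""
    xs = [a + (b - a) * i / (points - 1) for i in range(points)]
    for f in U:
        vals = [X(f, a, x) for x in xs]
        if any(abs(v) > M * (b - a) + tol for v in vals):
            return False
        for i, x in enumerate(xs):
            for j, y in enumerate(xs):
                if abs(vals[i] - vals[j]) > M * abs(x - y) + tol:
                    return False
    return True


# A hypothetical bounded family with sup-norm at most M = 1.
U = [lambda t, n=n: math.cos(n * t) for n in range(1, 6)]
print(check_bounds(U, 0.0, 1.0, 1.0))   # bounds hold with M = 1
print(check_bounds(U, 0.0, 1.0, 0.5))   # they fail for the too-small constant 0.5
```

The same Lipschitz constant M works for every member of XU at once, which is exactly what equicontinuity of XU requires.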

Proposition 217. Suppose that M is a module over a commutative ring R and

that r ∈ R such that r : M →M is a bijection. Further suppose that s ∈ R such

that (i) sa = 0 implies a = 0 for all a ∈M , and such that (ii) s is invertible in R.

Then sr : M →M is a bijection.


Proof. If (sr)a = 0 then s(ra) = 0 then by hypothesis (i) on s we have that ra = 0,

and by injectivity of r we have a = 0. Hence sr is injective. Suppose that b ∈M .

By surjectivity of r choose a ∈ M such that ra = b. By hypothesis (ii) on s and the commutativity of R, we have $(sr)(s^{-1}a) = (srs^{-1})a = (ss^{-1}r)a = ra = b$.

Hence sr is surjective.

Proposition 218. Suppose that p ∈ R[X] such that X ∤ p. Then p : C[a, b] → C[a, b] is a bijection.

Proof. Since X ∤ p, by the previous proposition we may without loss of generality write $p(X) = 1 + a_1X + \cdots + a_kX^k$. Further, set C = max{|ai| : i ∈ [1, k]}. Proposition 216 implies that $a_1X + \cdots + a_kX^k$ is a compact linear operator, and hence, by the Riesz Theorem 213, it suffices to show that p is injective. Before proceeding, note that if 1 denotes the real-valued function which is equal to 1 everywhere on [a, b], then $X^k(1) = \frac{(x-a)^k}{k!}$ for k ≥ 1. Recall also that if f ∈ C[a, b] then ‖f‖u denotes the supremum of |f| on [a, b]. From this it follows easily that $|X^kf| \le \|f\|_u \cdot \frac{(x-a)^k}{k!}$ for k ≥ 1 and f ∈ C[a, b].
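The identity for the iterated integral of the constant function 1 can be verified symbolically. In the following sketch (the representation and function names are illustrative assumptions), a polynomial in (x − a) is stored as its list of coefficients, and X integrates term by term from a to x.

```python
from fractions import Fraction


def X(coeffs):
    """Integrate sum_j coeffs[j] * (x - a)^j from a to x.

    The result is again a coefficient list in powers of (x - a),
    and its constant term is 0 because the integral starts at a.
    """
    return [Fraction(0)] + [Fraction(c) / (j + 1) for j, c in enumerate(coeffs)]


def X_power(coeffs, k):
    """Apply the integration operator X to the polynomial k times."""
    for _ in range(k):
        coeffs = X(coeffs)
    return coeffs


# X^3 applied to the constant function 1 should be (x - a)^3 / 3!.
print(X_power([1], 3))
```

The only nonzero coefficient of X_power([1], k) sits in degree k and equals 1/k!, in agreement with the formula above.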

So suppose that f ∈ ker p. We must show that f(x) = 0 for all x ∈ [a, b]. It suffices to show that there is $e_x \ge 0$ such that for all n ≥ 0

\[
|f(x)| \le \|f\|_u \frac{e_x^n}{n!} \tag{4.55}
\]

For, choose N > 0 such that $e_x < N$. Then for n ≥ N it follows that the ratio

\[
\frac{e_x^{n+1}/(n+1)!}{e_x^{n}/n!} = \frac{e_x}{n+1} < \frac{N}{n+1} < 1 \tag{4.56}
\]

Hence, by the ratio test, the series $\sum_{n=1}^{\infty} \frac{e_x^n}{n!}$ converges, from which it follows that $\lim_n \frac{e_x^n}{n!} = 0$, so that (4.55) implies that f(x) = 0.
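The ratio computation in (4.56) is easy to confirm with exact arithmetic. The sketch below uses the arbitrary illustrative value 10 for the constant; the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial


def term(e, n):
    """The n-th term e^n / n! of the series, computed exactly."""
    return Fraction(e) ** n / factorial(n)


e = 10  # an arbitrary illustrative value of the constant
# The ratio of consecutive terms is e / (n + 1), as in (4.56).
print(term(e, 21) / term(e, 20))
# Once n + 1 exceeds e the terms decrease, and they tend to 0.
print(float(term(e, 50)))
```

For this value the terms grow until n is about 10 and then shrink rapidly, which is the behaviour the ratio test exploits.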


The values of $e_x$ depend on whether x − a ≥ 1. So first suppose that x − a ≥ 1. Then define $e_x = Ck(x-a)^k$, so that it suffices to show for all n ≥ 1 that

\[
|f(x)| \le \|f\|_u \frac{(Ck(x-a)^k)^n}{n!} = \|f\|_u \frac{C^n k^n (x-a)^{kn}}{n!} \tag{4.57}
\]

For n = 1, this follows since 0 = pf implies $f = -a_1Xf - \cdots - a_kX^kf$ and hence since x − a ≥ 1 it follows that

\[
|f(x)| \le \sum_{i=1}^{k} C\|f\|_u \cdot \frac{(x-a)^i}{i!} \le \sum_{i=1}^{k} C\|f\|_u \cdot (x-a)^k = \|f\|_u Ck(x-a)^k \tag{4.58}
\]

Suppose that (4.57) holds for n. To show that it holds for n + 1, first note again that 0 = pf implies $f = -a_1Xf - \cdots - a_kX^kf$. From this, the fact that (4.57) holds for n, and x − a ≥ 1, it follows that

\[
\begin{aligned}
|f(x)| &\le \sum_{i=1}^{k} C\|f\|_u X^i\Big(\frac{C^n k^n (x-a)^{kn}}{n!}\Big) = \sum_{i=1}^{k} C\|f\|_u \frac{C^n k^n (x-a)^{kn+i}}{n! \cdot (kn+1) \cdots (kn+i)} && \text{(4.59)} \\
&\le \sum_{i=1}^{k} \|f\|_u \frac{C^{n+1} k^n (x-a)^{k(n+1)}}{(n+1)!} = \|f\|_u C^{n+1} k^{n+1} \frac{(x-a)^{k(n+1)}}{(n+1)!} && \text{(4.60)}
\end{aligned}
\]

Hence, we have that (4.57) holds for n + 1. Hence, by mathematical induction, (4.57) holds for all n ≥ 1, which is what we wanted to show.

Now consider the second case, where x − a < 1. Then define $e_x = Ck(x-a)$, so that it suffices to show for all n ≥ 1 that

\[
|f(x)| \le \|f\|_u \frac{(Ck(x-a))^n}{n!} = \|f\|_u \frac{C^n k^n (x-a)^{n}}{n!} \tag{4.61}
\]

For n = 1, this follows since 0 = pf implies $f = -a_1Xf - \cdots - a_kX^kf$ and hence since x − a < 1 it follows that

\[
|f(x)| \le \sum_{i=1}^{k} C\|f\|_u \cdot \frac{(x-a)^i}{i!} \le \sum_{i=1}^{k} C\|f\|_u \cdot (x-a) = \|f\|_u Ck(x-a) \tag{4.62}
\]

Suppose that (4.61) holds for n. To show that it holds for n + 1, first note again that 0 = pf implies $f = -a_1Xf - \cdots - a_kX^kf$. From this, the fact that (4.61) holds for n, and x − a < 1, it follows that

\[
\begin{aligned}
|f(x)| &\le \sum_{i=1}^{k} C\|f\|_u X^i\Big(\frac{C^n k^n (x-a)^{n}}{n!}\Big) = \sum_{i=1}^{k} C\|f\|_u \frac{C^n k^n (x-a)^{n+i}}{(n+i)!} && \text{(4.63)} \\
&\le \sum_{i=1}^{k} \|f\|_u \frac{C^{n+1} k^n (x-a)^{n+1}}{(n+1)!} = \|f\|_u C^{n+1} k^{n+1} \frac{(x-a)^{n+1}}{(n+1)!} && \text{(4.64)}
\end{aligned}
\]

Hence, we have that (4.61) holds for n + 1. Hence, by mathematical induction, (4.61) holds for all n ≥ 1, which is what we wanted to show.

Remark 219. The following trick of lifting the Riesz theory to Den[a, b] is more or

less explicit in the proof of Theorem 3.10 of Federson and Bianconi ([35] pp. 103 ff),

although they restrict themselves to the case of Den[a, b] and do not frame this in

the language of modules.

Proposition 220. Suppose that M is a submodule of Den[a, b] which contains C[a, b]. Suppose that p ∈ R[X] such that X ∤ p. Then p : M → M is a bijection.

Proof. By Proposition 217, we may assume that $p(X) = 1 + a_1X + \cdots + a_kX^k$. To see that p is injective, note that if pf = 0 then $f = -a_1Xf - \cdots - a_kX^kf$.

Since XM ⊆ C[a, b], we have that f ∈ C[a, b] and pf = 0 in C[a, b]. But by the

previous proposition, p : C[a, b]→ C[a, b] is an injection, and hence f = 0. So in

fact p : M →M is an injection.

To see that p : M → M is a surjection, suppose that g ∈ M . Since XM ⊆

C[a, b], we have that (p − 1)g ∈ C[a, b] and hence −(p − 1)g ∈ C[a, b]. By

the previous proposition, p : C[a, b] → C[a, b] is a surjection, and hence there is

f ∈ C[a, b] such that pf = −(p−1)g. Then p(f+g) = pf+pg = −(p−1)g+pg =

(−(p− 1) + p)(g) = g. Hence, in fact p : M →M is a surjection.

Proposition 221. Suppose that M is a module over a commutative ring R and

that r ∈ R is such that r : M → M is bijective. Then r is an automorphism of

M .

Proof. Let α : M → M be defined by α(a) = ra. By hypothesis, α is a bijection. Further,

by the definition of a module, α(a + b) = r(a + b) = ra + rb = α(a) + α(b), and

likewise since R is commutative we have α(sa) = r(sa) = (rs)a = (sr)a = s(ra) =

sα(a).


Corollary 222. Suppose that M is a submodule of Den[a, b] which contains C[a, b]. Suppose that p ∈ R[X] is such that X ∤ p. Then p : M → M is an automorphism of M as an R[X]-module (or Q[X]-module).


Proposition 223. Suppose that M is a submodule of Den[a, b] which contains C[a, b]. Suppose further that p, q ∈ R[X]. Then $p^{-1}qM$ is either M or $X^{\ell}M$ for some ℓ > 0. Further, there is a computable procedure which (i) given p, q ∈ Q[X] determines which of these occurs and which (ii) returns ℓ > 0 if the latter occurs.

Proof. Compute the largest k such that $X^k$ divides both p and q. Let $p = X^k p_0$ and $q = X^k q_0$. Then $p^{-1}qM = p_0^{-1}q_0M$ since pf + qg = 0 if and only if $X^k(p_0 f + q_0 g) = 0$ if and only if $p_0 f + q_0 g = 0$. Now either $X \mid q_0$ or $X \nmid q_0$, and we can compute which of these occurs.

If $X \mid q_0$ then by definition of k we have $X \nmid p_0$ and so by the previous proposition $p_0$ is an automorphism of M as an R[X]-module. Further, if $X \mid q_0$ then compute the largest ℓ > 0 such that $X^{\ell} \mid q_0$. Let $q_0 = X^{\ell} q_1$, where $X \nmid q_1$. Then by the previous proposition, $q_1$ is an automorphism of M as an R[X]-module. Then we have the following, where the last equality is due to the fact that automorphisms fix definable sets:

\[
p^{-1}qM = p_0^{-1}q_0M = p_0^{-1}X^{\ell}q_1M = p_0^{-1}X^{\ell}M = X^{\ell}M \tag{4.65}
\]

On the other hand, suppose that $X \nmid q_0$. Then by the previous proposition, $q_0$ is an automorphism of M as an R[X]-module. Then we have the following, where the last equality follows from the fact that $r^{-1}M = M$ for any R-module M and r ∈ R:

\[
p^{-1}qM = p_0^{-1}q_0M = p_0^{-1}M = M \tag{4.66}
\]
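
The procedure described in this proof can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's implementation: polynomials in Q[X] are represented as coefficient lists with the constant term first, the names `x_adic_valuation` and `classify_preimage` are hypothetical, and the case where p or q is the zero polynomial, which the proof does not treat, is rejected.

```python
from fractions import Fraction
from typing import List, Optional, Tuple

# Assumed representation: coefficients of a polynomial in Q[X],
# constant term first, e.g. [0, 0, 3] stands for 3*X^2.
Poly = List[Fraction]


def x_adic_valuation(p: Poly) -> Optional[int]:
    """Largest k such that X^k divides p, or None if p is zero."""
    for k, coeff in enumerate(p):
        if coeff != 0:
            return k
    return None


def classify_preimage(p: Poly, q: Poly) -> Tuple[str, Optional[int]]:
    """Follow the proof: return ("M", None) or ("X^l M", l) with l > 0."""
    vp, vq = x_adic_valuation(p), x_adic_valuation(q)
    if vp is None or vq is None:
        # Assumption of this sketch: zero polynomials are out of scope.
        raise ValueError("p and q must be non-zero")
    k = min(vp, vq)   # largest k with X^k dividing both p and q
    ell = vq - k      # largest l with X^l dividing q0 = q / X^k
    if ell > 0:       # X | q0, hence X does not divide p0
        return ("X^l M", ell)
    return ("M", None)  # X does not divide q0


# Example: p = 1 + 2X and q = 3X^2 give the subgroup X^2 M.
example = classify_preimage(
    [Fraction(1), Fraction(2)],
    [Fraction(0), Fraction(0), Fraction(3)],
)
```

Only the lowest non-zero coefficients of p and q matter, which reflects the fact that polynomials not divisible by X act as automorphisms.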


Proposition 224. Suppose that M is a submodule of Den[a, b] which contains C[a, b]. Suppose further that p ∈ R[X]. Then ker(p) is either 0 or M. Further, there is a computable procedure which given p ∈ Q[X] determines which of these occurs.

Proof. If p is zero then ker(p) = M, and we can compute whether this occurs. If p is non-zero, then compute the largest k such that $X^k$ divides p. Let $p = X^k p_0$. Then $\ker(p) = \ker(p_0)$ since $X^k p_0 f = 0$ if and only if $p_0 f = 0$. Then $X \nmid p_0$ and so by Corollary 222, we have that $p_0$ is an automorphism of M as an R[X]-module and so $\ker(p_0) = 0$.
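
As a minimal sketch of the decision procedure in this proof, assuming again that a polynomial in Q[X] is given as a list of coefficients (the function name `kernel_of` and the string labels are hypothetical), the computation reduces to a zero test:

```python
from fractions import Fraction
from typing import List


def kernel_of(p: List[Fraction]) -> str:
    """Return "M" if p is the zero polynomial and "0" otherwise."""
    # By the proof, any non-zero p factors as X^k * p0 with X not dividing p0,
    # and both X^k and p0 have trivial kernel on M.
    return "M" if all(coeff == 0 for coeff in p) else "0"


# Example: p = (1/2) X^2 is non-zero, so its kernel is the zero subgroup.
example_kernel = kernel_of([Fraction(0), Fraction(0), Fraction(1, 2)])
```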



or (ii) M is subinterval-closed. Suppose finally that G, H are pp-definable subgroups of M definable without parameters. Then [G : G ∩ H] = 1 or [G : G ∩ H] is infinite, and from formulas defining G and H we can compute which of these occurs. Further, this procedure is uniform in such M, in that formulas for G and H will return the same values for [G : G ∩ H] for all such M.

Proof. By Proposition 211 and the two previous propositions, G and H are finite conjunctions of the subgroups 0, $X^{\ell}M$, and M, and hence themselves are among the subgroups 0, $X^{\ell}M$, and M. Further, from Proposition 211 and the two previous propositions, given formulas defining G (resp. H) we can computably determine whether G (resp. H) is 0, $X^{\ell}M$, or M. So there are nine possible cases to consider. The cases in which 0 occurs are trivial, and so there are really only four interesting cases to consider. Case one: G = M and H = M. Then [G : G ∩ H] = 1. Case two: G = M and H = $X^kM$. Then [G : G ∩ H] is infinite by Proposition 200. Case three: G = $X^{\ell}M$ and H = M. Then [G : G ∩ H] = 1. Case four: G = $X^{\ell}M$ and H = $X^kM$. Then [G : G ∩ H] = 1 if ℓ ≥ k and [G : G ∩ H] is infinite if ℓ < k by Proposition 200.
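
The case analysis above can be summarized in a short Python sketch. The encoding is an assumption made for illustration: M is encoded as depth 0, $X^{\ell}M$ as depth ℓ ≥ 1, and the zero subgroup as depth `math.inf`, so that the subgroups form a descending chain ordered by depth. The name `index_in_intersection` is hypothetical, and the treatment of the cases involving 0 is one reading of what the proof calls trivial.

```python
import math
from typing import Union

# Assumed encoding: 0 for M, l >= 1 for X^l M, math.inf for the zero subgroup.
Depth = Union[int, float]


def index_in_intersection(g: Depth, h: Depth) -> Depth:
    """Return [G : G ∩ H] as 1 or math.inf for subgroups given by depth."""
    # The subgroups M, XM, X^2 M, ..., 0 form a descending chain, so G ∩ H
    # is whichever of G and H is deeper. If G is at least as deep as H then
    # G ∩ H = G and the index is 1; otherwise the index is infinite, as in
    # cases two and four of the proof.
    return 1 if g >= h else math.inf


# The four interesting cases from the proof, with l = 3 and k = 2 in case four.
case_one = index_in_intersection(0, 0)    # G = M, H = M
case_two = index_in_intersection(0, 2)    # G = M, H = X^2 M
case_three = index_in_intersection(3, 0)  # G = X^3 M, H = M
case_four = index_in_intersection(3, 2)   # G = X^3 M, H = X^2 M
```

Because the answer depends only on the two depths and not on M, the sketch also reflects the uniformity claim in the statement.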



or (ii) M is subinterval-closed. Then the theory of M as a Q[X]-module is computable, and any two such M have the same theory as R[X]-modules or Q[X]-modules.

Proof. Consider the following computable theory T of Q[X]-modules. This theory has all the axioms of Q[X]-modules, and if G and H are pp-definable subgroups then T has the axiom [G : G ∩ H] = 1 or the axioms [G : G ∩ H] > k (one for each k), according to what the computation in the previous corollary returns. By Theorem 209 (ii), this theory T is the complete theory of M as a Q[X]-module.

Corollary 227. The Q[X]-modules C[a, b], $L^1[a, b]$, $\langle \mathrm{Den}^{*}_{\alpha}[a, b] \rangle$, $\langle \mathrm{Den}_{\alpha}[a, b] \rangle$, and Den[a, b] have elementarily equivalent and decidable complete theories. Further, as R[X]-modules they are elementarily equivalent.

Proof. This follows immediately from the previous corollary and the fact, noted

in Remark 199, that we can apply Theorem 200 to all these modules.

4.5 Further Questions

Question 228. Can the assumptions of analytic determinacy be removed in Corollaries 195 & 197? By Corollary 195, we have that Den[a, b] is $\Sigma^1_2$. Is Den[a, b] co-analytic or $\Sigma^1_2$, perhaps $\Sigma^1_2$-complete? By Corollary 195, we have that $\mathrm{Den}^{*}_{\alpha}[a, b]$ is analytic. Is $\mathrm{Den}^{*}_{\alpha}[a, b]$ also Borel?

Question 229. Viewing Derv[a, b] as a subspace of $(C[a, b])^{\omega}$, Dougherty and Kechris [28] show that Derv[a, b] is co-analytic but not analytic (and indeed not even analytic on the co-analytic subspace of $C^{\omega}[a, b]$ of sequences which converge pointwise). Is Derv[a, b] co-analytic in M[a, b]?

Question 230. Do the stability, elementary equivalence, and decidability results from § 4.4 still hold if one views C[a, b], $L^1[a, b]$, $\langle \mathrm{Den}^{*}_{\alpha}[a, b] \rangle$, $\langle \mathrm{Den}_{\alpha}[a, b] \rangle$, and Den[a, b] as R[X] or Q[X]-modules, where $Xf \mapsto \int_a^b K(x, y) f(y)\,dy$ for appropriate real-valued continuous functions K(x, y)? Note that some care has to be exercised with respect to the choice of K, since Den[a, b] is not closed under multiplication (cf. Swartz [142] Example 14 p. 43).

BIBLIOGRAPHY

1. Aetas Kantiana. Culture et Civilisation, Bruxelles, 1968–1981.

2. Brad Armendt. Dutch Books, Additivity, and Utility Theory. Philosophical Topics, 21:1–20, 1993.

3. Emil Artin. Geometric Algebra. Interscience, New York, 1957.

4. James Ax. The Elementary Theory of Finite Fields. Annals of Mathematics, 88:239–271, 1968.

5. Alan Baker. Is there a Problem of Induction for Mathematics? In Mary Leng, Alexander Paseau, and Michael Potter, editors, Mathematical Knowledge, pages 59–72. Oxford University Press, Oxford, 2007.

6. Jon Barwise and John Schlipf. On Recursively Saturated Models of Arithmetic. In A. Dold and B. Eckmann, editors, Model Theory and Algebra, volume 498 of Lecture Notes in Mathematics, pages 42–55. Springer, Berlin, 1975.

7. Paul Benacerraf. Logicism, Some Considerations. PhD Thesis, Princeton University, 1960.

8. Paul Benacerraf. Frege: The Last Logicist. Midwest Studies in Philosophy, 6:17–35, 1981. Reprinted in [24].

9. Frederick C. Beiser. The Fate of Reason: German Philosophy from Kant to Fichte. Harvard University Press, Cambridge, 1987.

10. George Boolos. The Consistency of Frege’s Foundations of Arithmetic. In Judith Jarvis Thomson, editor, On Being and Saying: Essays in Honor of Richard Cartwright, pages 3–20. MIT Press, Cambridge, 1987. Reprinted in [13], [24].

11. George Boolos. Frege’s Theorem and the Peano Postulates. The Bulletin of Symbolic Logic, 1(3):317–326, 1995. Reprinted in [13].

12. George Boolos. On the Proof of Frege’s Theorem. In Adam Morton and Stephen P. Stich, editors, Benacerraf and His Critics, pages 143–159. Blackwell, 1996. Reprinted in [13].

13. George Boolos. Logic, Logic, and Logic. Harvard University Press, Cambridge, MA, 1998. Edited by Richard Jeffrey.

14. George Boolos and Richard G. Heck Jr. Die Grundlagen der Arithmetik, Sections 82–83. In Matthias Schirn, editor, Philosophy of Mathematics Today, pages 407–428. Clarendon Press, 1998. Reprinted in [13].

15. John P. Burgess. Fixing Frege. Princeton Monographs in Philosophy. Princeton University Press, Princeton, 2005.

16. John P. Burgess and A. P. Hazen. Predicative Logic and Formal Arithmetic. Notre Dame Journal of Formal Logic, 39(1):1–17, 1998.

17. Samuel R. Buss. Nelson’s Work on Logic and Foundations and Other Reflections on the Foundations of Mathematics. In William G. Faris, editor, Diffusion, Quantum Theory, and Radically Elementary Mathematics, volume 47 of Mathematical Notes, pages 183–208. Princeton University Press, Princeton, NJ, 2006.

18. Ernst Cassirer. Substanzbegriff und Funktionsbegriff. Cassirer, Berlin, 1910. Reprinted in [19] vol. 6.

19. Ernst Cassirer. Gesammelte Werke. Meiner, Hamburg, 1998–2009. 26 volumes. Edited by Birgit Recki.

20. Gregory Cherlin. Rings of Continuous Functions: Decision Problems. In L. Pacholski, J. Wierzejewski, and A.J. Wilkie, editors, Model Theory of Algebra and Arithmetic, volume 834 of Lecture Notes in Mathematics, pages 44–91. Springer, Berlin, 1980.

21. A.P. Dawid. Probability, Symmetry, and Frequency. British Journal for the Philosophy of Science, 36(2):107–128, 1985.

22. Richard Dedekind. Was sind und was sollen die Zahlen? Vieweg, Braunschweig, 1888. Second edition 1893. Reprinted in [23] vol. 3 pp. 335–391.

23. Richard Dedekind. Gesammelte mathematische Werke. Vieweg, Braunschweig, 1930–1932. Three volumes. Edited by Robert Fricke, Emmy Noether, and Øystein Ore.

24. William Demopoulos, editor. Frege’s Philosophy of Mathematics. Harvard University Press, Cambridge, 1995.

25. William Demopoulos and Peter Clark. The Logicism of Frege, Dedekind, and Russell. In Stewart Shapiro, editor, The Oxford Handbook of Philosophy of Mathematics and Logic, pages 129–165. Oxford University Press, 2005.

26. Keith J. Devlin. Constructibility. Perspectives in Mathematical Logic. Springer, Berlin, 1984.

27. J. L. Doob. Measure Theory, volume 143 of Graduate Texts in Mathematics. Springer, New York, 1994.

28. Randall Dougherty and Alexander S. Kechris. The Complexity of Antidifferentiation. Advances in Mathematics, 88(2):145–169, 1991.

29. Michael Dummett. Frege: Philosophy of Mathematics. Harvard University Press, Cambridge, 1991.

30. John Earman. Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory. MIT Press, Cambridge, 1992.

31. Kenny Easwaran. Probabilistic Proofs and Transferability. Philosophia Mathematica, 17:341–362, 2009.

32. Herbert B. Enderton. A Mathematical Introduction to Logic. Harcourt, Burlington, second edition, 2001.

33. Don Fallis. The Epistemic Status of Probabilistic Proof. The Journal of Philosophy, 94(4):165–186, 1997.

34. Don Fallis. The Reliability of Randomized Algorithms. British Journal for the Philosophy of Science, 51:255–271, 2000.

35. Marcia Federson and Ricardo Bianconi. Linear Fredholm Integral Equations and the Integral of Kurzweil. Journal of Applied Analysis, 8(1):83–110, 2002.

36. Solomon Feferman. Reflecting on Incompleteness. Journal of Symbolic Logic, 56(1):1–49, 1991.

37. Solomon Feferman. In the Light of Logic. Logic and Computation in Philosophy. Oxford University Press, New York, 1998.

38. Solomon Feferman, Harvey M. Friedman, Penelope Maddy, and John R. Steel. Does Mathematics need New Axioms? The Bulletin of Symbolic Logic, 6(4):401–446, 2000.

39. Pierre Fermat. Remarques sur l’Arithmetique des Infinis du S. J. Wallis. In Commercium epistolicum de quaestionibus quibusdam mathematicis nuper habitum, pages 24–29. Lichfield, 1656.

40. Fernando Ferreira and Kai F. Wehmeier. On the Consistency of the $\Delta^1_1$-CA fragment of Frege’s Grundgesetze. Journal of Philosophical Logic, 31(4):301–311, 2002.

41. Branden Fitelson. The Plurality of Bayesian Measures of Confirmation and the Problem of Measure Sensitivity. Philosophy of Science, 66:S362–S378, 1999.

42. Branden Fitelson. A Decision Procedure for Probability Calculus with Applications. Review of Symbolic Logic, 1(1):111–125, 2008.

43. Gerald B. Folland. Real Analysis. Pure and Applied Mathematics. John Wiley & Sons Inc., New York, second edition, 1999.

44. Gottlob Frege. Die Grundlagen der Arithmetik. Koebner, Breslau, 1884.

45. Gottlob Frege. Kleine Schriften. Olms, Hildesheim, 1967. Edited by Ignacio Angelelli.

46. Jakob Friedrich Fries. Die mathematische Naturphilosophie nach philosophischer Methode bearbeitet: ein Versuch. Mohr & Winter, Heidelberg, 1822. Reprinted in [47] vol. 13.

47. Jakob Friedrich Fries. Sämtliche Schriften. Scientia, Aalen, 1967–2004. 26 volumes. Edited by Gert König and Lutz Geldsetzer.

48. Haim Gaifman. Reasoning with Limited Resources and Assigning Probabilities to Arithmetical Statements. Synthese, 140:97–119, 2004.

49. R. O. Gandy. Proof of Mostowski’s Conjecture. Bulletin de l’Académie Polonaise des Sciences. Série des Sciences Mathématiques, Astronomiques et Physiques, 8:571–575, 1960.

50. Mihai Ganea. Burgess’ PV is Robinson’s Q. Journal of Symbolic Logic, 72(2):618–624, 2007.

51. Su Gao. Invariant Descriptive Set Theory. Pure and Applied Mathematics. CRC Press, Boca Raton, 2009.

52. Clark Glymour. Relevant Evidence. Journal of Philosophy, 72(14):403–426, 1975.

53. Clark Glymour. The Epistemology of Geometry. Noûs, 11(3):227–251, 1977.

54. Clark N. Glymour. Theory and Evidence. Princeton University Press, Princeton, 1980.

55. Kurt Gödel. Collected Works. Vol. III. Unpublished Lectures and Essays. Clarendon, New York, 1995. Edited by Solomon Feferman et al.

56. Russell A. Gordon. The Integrals of Lebesgue, Denjoy, Perron, and Henstock, volume 4 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 1994.

57. Ronald L. Graham, Donald E. Knuth, and Oren Patashnik. Concrete Mathematics. Addison-Wesley, Reading, second edition, 1994.

58. Robert E. Greene and Steven G. Krantz. Function Theory of One Complex Variable, volume 40 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, third edition, 2006.

59. Petr Hájek and Pavel Pudlák. Metamathematics of First-Order Arithmetic. Perspectives in Mathematical Logic. Springer, Berlin, 1998.

60. Bob Hale and Crispin Wright. The Reason’s Proper Study. Oxford University Press, Oxford, 2001.

61. Valentina S. Harizanov. Pure Computable Model Theory. In Yu. L. Ershov, S. S. Goncharov, A. Nerode, J.B. Remmel, and V. W. Marek, editors, Handbook of Recursive Mathematics, Vol. 1, volume 138 of Studies in Logic and the Foundations of Mathematics, pages 3–114. North-Holland, Amsterdam, 1998.

62. Robin Hartshorne. Geometry: Euclid and Beyond. Undergraduate Texts in Mathematics. Springer, New York, 2000.

63. Allen Hatcher. Algebraic Topology. Cambridge University Press, Cambridge, 2002.

64. Richard G. Heck, Jr. The Development of Arithmetic in Frege’s Grundgesetze der Arithmetik. The Journal of Symbolic Logic, 58(2):579–601, 1993.

65. Richard G. Heck, Jr. The Consistency of Predicative Fragments of Frege’s Grundgesetze der Arithmetik. History and Philosophy of Logic, 17(4):209–220, 1996.

66. Richard G. Heck, Jr. Frege’s Theorem: An Introduction. The Harvard Review of Philosophy, 7:56–73, 1999.

67. Richard G. Heck, Jr. Cardinality, Counting, and Equinumerosity. Notre Dame Journal of Formal Logic, 41(3):187–209, 2000.

68. Thomas Hobbes. Six Lessons to the Professors of Mathematiques, One of Geometry, the Other of Astronomy: in the Chaires set up by Sir Henry Savile in the University of Oxford. Crook, London, 1656. Reprinted in [69] vol. 7.

69. Thomas Hobbes. The English Works of Thomas Hobbes. Bohn, 1839. Eleven volumes.

70. Wilfrid Hodges. Model Theory, volume 42 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1993.

71. Walter Hoering. Anomalies of Reduction. In Wolfgang Balzer, David A. Pearce, and Heinz-Jürgen Schmidt, editors, Reduction in Science: Structure, Examples, Philosophical Problems, pages 33–50. Dordrecht, 1984.

72. Thomas Hofweber. Proof-Theoretic Reduction as a Philosopher’s Tool. Erkenntnis, 53:127–146, 2000.

73. Paul Horwich. Probability and Evidence. Cambridge University Press, Cambridge, 1982.

74. Colin Howson and Peter Urbach. Scientific Reasoning: The Bayesian Approach. Open Court, Chicago, second edition, 1993.

75. Kenneth Ireland and Michael Rosen. A Classical Introduction to Modern Number Theory, volume 84 of Graduate Texts in Mathematics. Springer, New York, second edition, 1990.

76. Daniel Isaacson. Arithmetical Truth and Hidden Higher-Order Concepts. In Paris Logic Group, editor, Logic Colloquium ’85 (Orsay, 1985), volume 122 of Studies in Logic and the Foundations of Mathematics, pages 147–169. North-Holland, Amsterdam, 1987.

77. Daniel Isaacson. Some Considerations on Arithmetical Truth and the ω-Rule. In Michael Detlefsen, editor, Proof, Logic and Formalization, pages 94–138. Routledge, London, 1992.

78. St. Iwan. On the Untenability of Nelson’s Predicativism. Erkenntnis, 53(1–2):147–154, 2000.

79. Frank Jackson. Petitio and the Purpose of Arguing. Pacific Philosophical Quarterly, 65:26–36, 1984.

80. Thomas Jech. Set Theory. Springer Monographs in Mathematics. Springer, Berlin, 2003. The Third Millennium Edition.

81. James Joyce. How Probabilities Reflect Evidence. Philosophical Perspectives, 19:153–178, 2005.

82. Abraham Gotthelf Kästner. Ueber die geometrischen Axiome. Philosophisches Magazin, 2(4):420–430, 1790. Reprinted in Aetas Kantiana [1] vol. 63.

83. Alexander S. Kechris. The Complexity of Antidifferentiation, Denjoy Totalization, and Hyperarithmetic Reals. In Proceedings of the International Congress of Mathematicians. Berkeley, California, August 3–11, 1986, volume 1, pages 307–313, Providence, RI, 1987. American Mathematical Society.

84. Alexander S. Kechris. Classical Descriptive Set Theory, volume 156 of Graduate Texts in Mathematics. Springer, New York, 1995.

85. H. Jerome Keisler. Model Theory for Infinitary Logic. Logic with Countable Conjunctions and Finite Quantifiers, volume 62 of Studies in Logic and the Foundations of Mathematics. North-Holland, Amsterdam, 1971.

86. Kevin T. Kelly. The Logic of Reliable Inquiry. Oxford University Press, New York, 1996.

87. Kevin T. Kelly. The Logic of Success. In Peter Clark and Katherine Hawley, editors, Philosophy of Science Today, pages 11–38. Oxford University Press, Oxford, 2000.

88. Kevin T. Kelly and Clark Glymour. Why Probability does not Capture the Logic of Success. In Christopher Hitchcock, editor, Contemporary Debates in Philosophy of Science, pages 94–114. Blackwell, Malden, 2004.

89. Kevin T. Kelly and Oliver Schulte. Church’s Thesis and Hume’s Problem. In Maria Luisa Dalla Chiara, editor, Logic and Scientific Methods, pages 159–177. Kluwer, Dordrecht, 1997.

90. Bakhadyr Khoussainov and Anil Nerode. Automata Theory and Its Applications, volume 21 of Progress in Computer Science and Applied Logic. Birkhäuser, Boston, MA, 2001.

91. Philip Kitcher. The Nature of Mathematical Knowledge. Oxford University Press, Oxford, 1984.

92. Stephen Cole Kleene. Quantification of Number-Theoretic Functions. Compositio Mathematica, 14:23–40, 1959.

93. Peter Koellner. Truth in Mathematics: The Question of Pluralism. In Otávio Bueno and Øystein Linnebo, editors, New Waves in the Philosophy of Mathematics, pages 80–116. Palgrave, 2009.

94. Rainer Kress. Linear Integral Equations, volume 82 of Applied Mathematical Sciences. Springer, second edition, 1999.

95. Serge Lang. Algebra, volume 211 of Graduate Texts in Mathematics. Springer, New York, third edition, 2002.

96. Shaughan Lavine. Something about Everything: Universal Quantification in the Universal Sense of Universal Quantification. In Agustín Rayo and Gabriel Uzquiano, editors, Absolute Generality, pages 98–148. Clarendon Press, Oxford, 2006.

97. Gottfried Wilhelm Leibniz. New Essays on Human Understanding. Cambridge Texts in the History of Philosophy. Cambridge University Press, Cambridge, 1996. Edited and translated by Peter Remnant and Jonathan Bennett. References are to the page numbers of the original, which are given in the margins of this edition.

98. Hannes Leitgeb. On Formal and Informal Provability. In Otávio Bueno and Øystein Linnebo, editors, New Waves in the Philosophy of Mathematics, pages 263–299. Palgrave, New York, 2009.

99. Per Lindström. Aspects of Incompleteness, volume 10 of Lecture Notes in Logic. Association for Symbolic Logic, Urbana, IL, second edition, 2003.

100. Øystein Linnebo. Predicative Fragments of Frege Arithmetic. The Bulletin of Symbolic Logic, 10(2):153–174, 2004.

101. John Locke. An Essay Concerning Human Understanding. Oxford University Press, Oxford, 1979. Edited by Peter H. Nidditch.

102. Fraser MacBride. Speaking with the Shadows: A Study of Neo-Logicism. The British Journal for the Philosophy of Science, 54(1):103–163, 2003.

103. Fraser MacBride. Can Ante Rem Structuralism Solve the Access Problem? Philosophical Quarterly, 58(230):155–164, 2008.

104. Patrick Maher. Betting on Theories. Cambridge University Press, Cambridge, 1993.

105. Kenneth Manders. Diagram-Based Geometric Practice. In Paolo Mancosu, editor, The Philosophy of Mathematical Practice, pages 65–79. Oxford University Press, Oxford, 2008.

106. Kenneth Manders. The Euclidean Diagram (1995). In Paolo Mancosu, editor, The Philosophy of Mathematical Practice, pages 80–133. Oxford University Press, Oxford, 2008.

107. David Marker. Model Theory, volume 217 of Graduate Texts in Mathematics. Springer, New York, 2002.

108. David Marker. Introduction to the Model Theory of Fields. In Model Theory of Fields, volume 5 of Lecture Notes in Logic, pages 1–37. Association for Symbolic Logic, La Jolla, CA, second edition, 2006.

109. David Marker, Margit Messmer, and Anand Pillay. Model Theory of Fields, volume 5 of Lecture Notes in Logic. Association for Symbolic Logic, La Jolla, CA, second edition, 2006.

110. Margit Messmer. Some Model Theory of Separably Closed Fields. In Model Theory of Fields, volume 5 of Lecture Notes in Logic, pages 135–152. Association for Symbolic Logic, La Jolla, CA, second edition, 2006.

111. William Mitchell. Beginning Inner Model Theory. In Matthew Foreman and Akihiro Kanamori, editors, Handbook of Set Theory, pages 1449–1496. Springer, Berlin, 2010.

112. Antonio Montalbán and André Nies. Borel Structures: A Brief Survey. Unpublished. Dated January 27, 2010.

113. Yiannis N. Moschovakis. Descriptive Set Theory, volume 100 of Studies in Logic and the Foundations of Mathematics. North-Holland, Amsterdam, 1980.

114. Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Cambridge University Press, Cambridge, 1995.

115. James R. Munkres. Topology. Prentice Hall, Upper Saddle River, second edition, 2000.

116. Mark Nadel. $L_{\omega_1\omega}$ and Admissible Fragments. In Jon Barwise and Solomon Feferman, editors, Model-Theoretic Logics, Perspectives in Mathematical Logic, pages 271–316. Springer, New York, 1985.

117. Edward Nelson. Predicative Arithmetic, volume 32 of Mathematical Notes. Princeton University Press, Princeton, NJ, 1986.

118. Piergiorgio Odifreddi. Classical Recursion Theory. Vol. I, volume 125 of Studies in Logic and the Foundations of Mathematics. North-Holland, Amsterdam, 1989.

119. J.B. Paris and P. Waterhouse. Atom Exchangeability and Instantial Relevance. Journal of Philosophical Logic, 38:313–332, 2009.

120. Charles Parsons. Mathematical Thought and Its Objects. Harvard University Press, Cambridge, 2008.

121. Lee Peng Yee. Lanzhou Lectures on Henstock Integration, volume 2 of Series in Real Analysis. World Scientific, Singapore, 1989.

122. Washek F. Pfeffer. A Note on the Generalized Riemann Integral. Proceedings of the American Mathematical Society, 103(4):1161–1166, 1988.

123. Bruno Poizat. Stable Groups, volume 87 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, RI, 2001.

124. Mike Prest. Model Theory and Modules, volume 130 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 1988.

125. Proclus. A Commentary on the First Book of Euclid’s Elements. Princeton University Press, Princeton, 1970. Translated by Glenn R. Morrow. Page references are to the critical edition, which are given in the margins of this edition.

126. Thomas Reid. Essays on the Intellectual Powers of Man. Bell, Edinburgh, 1785. Reprinted in [127].

127. Thomas Reid. The Works of Thomas Reid, D.D. MacLachlan and Stewart, Edinburgh, 1863. Two volumes. Edited by Sir William Hamilton.

128. Michael D. Resnik. Mathematics as a Science of Patterns: Ontology and Reference. Noûs, 15(4):529–550, 1981.

129. Michael D. Resnik. Mathematics as a Science of Patterns. Clarendon, Oxford, 1997.

130. Lance J. Rips and Jennifer Asmuth. Mathematical Induction and Induction in Mathematics. In Aidan Feeney and Evan Heit, editors, Inductive Reasoning: Experimental, Developmental, and Computational Approaches, pages 248–268. Cambridge University Press, 2007.

131. Lance J. Rips, Amber Bloomfield, and Jennifer Asmuth. From Numerical Concepts to Concepts of Number. Behavioral and Brain Sciences, 31:623–687, 2008.

132. Juho Ritola. Begging the Question: A Study of a Fallacy, volume 12 of Reports from the Department of Philosophy. Painosalama Oy, Turku, 2004. Dissertation, Department of Philosophy, University of Turku, Finland.

133. Hartley Rogers, Jr. Theory of Recursive Functions and Effective Computability. MIT Press, Cambridge, MA, second edition, 1987.

134. Gerald E. Sacks. Higher Recursion Theory. Perspectives in Mathematical Logic. Springer, Berlin, 1990.

135. Stewart Shapiro. Foundations without Foundationalism: A Case for Second-Order Logic, volume 17 of Oxford Logic Guides. The Clarendon Press, New York, 1991.

136. Stewart Shapiro. Philosophy of Mathematics: Structure and Ontology. Oxford University Press, Oxford, 2000.

137. Stephen G. Simpson. An Extension of the Recursively Enumerable Turing Degrees. Journal of the London Mathematical Society, 75(2):287–297, 2007.

138. Stephen G. Simpson. Subsystems of Second Order Arithmetic. Cambridge University Press, Cambridge, second edition, 2009.

139. Robert I. Soare. Recursively Enumerable Sets and Degrees. Perspectives in Mathematical Logic. Springer, Berlin, 1987.

140. Clifford Spector. Hyperarithmetical Quantifiers. Fundamenta Mathematicae, 48:313–320, 1959/1960.

141. John R. Steel. Forcing with Tagged Trees. Annals of Mathematical Logic, 15(1):55–74, 1978.

142. Charles Swartz. Introduction to Gauge Integrals. World Scientific, Singapore, 2001.

143. William Tait. Finitism. Journal of Philosophy, 78(9):524–546, 1981. This is reprinted in [144].

144. William Tait. The Provenance of Pure Reason. Logic and Computation in Philosophy. Oxford University Press, New York, 2005.

145. Alfred Tarski, Andrzej Mostowski, and Raphael M. Robinson. Undecidable Theories. Studies in Logic and the Foundations of Mathematics. North-Holland, 1953.

146. Adolf Trendelenburg. Logische Untersuchungen. Hirzel, Leipzig, second edition, 1862. Two volumes.

147. Lou van den Dries. Tame Topology and O-Minimal Structures, volume 248 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 1998.

148. Albert Visser. Categories of Theories and Interpretations. In Ali Enayat, Iraj Kalantari, and Mojtaba Moniri, editors, Logic in Tehran, volume 26 of Lecture Notes in Logic, pages 284–341. Association for Symbolic Logic, La Jolla, 2006.

149. Albert Visser. The Predicative Frege Hierarchy. Unpublished. Dated October 13, 2006.

150. John Wallis. Due Correction for Mr Hobbes, or Schoole Discipline, for not Saying his Lessons Right. Lichfield, Oxford, 1656.

151. John Wallis. A Treatise of Algebra. Davis, Oxford, 1685.

152. John Wallis. The Arithmetic of Infinitesimals. Sources and Studies in the History of Mathematics and Physical Sciences. Springer, New York, 2004. Translated by Jacqueline A. Stedall.

153. Andrew Wayne. Bayesianism and Diverse Evidence. Philosophy of Science, 62(1):111–121, 1995.

154. Kai F. Wehmeier. Consistent Fragments of Grundgesetze and the Existence of Non-Logical Objects. Synthese, 121(3):309–328, 1999.

155. Kai F. Wehmeier. Russell’s Paradox in Consistent Fragments of Frege’s Grundgesetze der Arithmetik. In One Hundred Years of Russell’s Paradox, volume 6 of de Gruyter Series in Logic and its Applications, pages 247–257. de Gruyter, Berlin, 2004.

156. Jon Williamson. Countable Additivity and Subjective Probability. British Journal for the Philosophy of Science, 50:401–416, 1999.

157. Timothy Williamson. Vagueness. Routledge, London and New York, 1994.

158. Crispin Wright. Frege’s Conception of Numbers as Objects, volume 2 of Scots Philosophical Monographs. Aberdeen University Press, Aberdeen, 1983.

159. Crispin Wright. On the Harmless Impredicativity of N= (Hume’s Principle). In Matthias Schirn, editor, Philosophy of Mathematics Today, pages 339–368. Clarendon Press, Oxford, 1998. Reprinted in [60].

160. Crispin Wright. Response to Dummett. In Philosophy of Mathematics Today, pages 389–405. Clarendon Press, Oxford, 1998. Reprinted in [60].

161. Crispin Wright. Is Hume’s Principle Analytic? Notre Dame Journal of Formal Logic, 40(1):6–30, 1999. Reprinted in [60].

162. Martin Ziegler. Model Theory of Modules. Annals of Pure and Applied Logic, 26(2):149–213, 1984.

163. Boris Zilber. Zariski Geometries: Geometry from a Logician’s Point of View, volume 360 of London Mathematical Society Lecture Note Series. Cambridge University Press, 2010.
