Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of...

121
Aspects of Ergodic Theory in Subsystems of Second-order Arithmetic Ksenija Simic Carnegie Mellon University Thesis advisor: Jeremy Avigad Submitted to The Department of Mathematical Sciences, in partial fulfillment of the requirements for the degree of Doctor of Philosophy on May 5, 2004 Revised August 20, 2004

Transcript of Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of...

Page 1: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Aspects of Ergodic Theory

in

Subsystems

of

Second-order Arithmetic

Ksenija Simic

Carnegie Mellon University

Thesis advisor: Jeremy Avigad

Submitted to

The Department of Mathematical Sciences,

in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy on May 5, 2004

Revised August 20, 2004

Page 2: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:
Page 3: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

i

The ordinary-sized stuff which is our lives, the things people writepoetry about - clouds - daffodils - waterfalls - and what happensin a cup of coffee when the cream goes in - these things are full ofmystery, as mysterious to us as the heavens were to the Greeks.We’re better at predicting events at the edge of the galaxy or in-side the nucleus of an atom than whether it’ll rain on auntie’sgarden party three Sundays from now. Because the problem turnsout to be different. We can’t even predict the next drip from adripping tap when it gets irregular. Each drip sets up conditionsfor the next, the smallest variation blows predictions apart, andthe weather is unpredictable the same way, will always be unpre-dictable. When you push numbers through the computer you cansee it on the screen. The future is disorder. A door like this hascracked open five or six times since we got up on our hind legs.It’s the best possible time to be alive, when almost everything youthought you knew is wrong.

(Tom Stoppard, Arcadia)

Page 4: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

ii

Page 5: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Acknowledgements

First and foremost, I wish to thank my advisor Jeremy Avigad who, through-out the time we worked together never ran out of ideas, patience, or wordsof encouragement, and whose passion for mathematics always inspires me.Many of the results in the dissertation came out of our joint work. I amalso grateful to my thesis committee members: James Cummings, WilliamHrusa, Richard Statman and Stephen Simpson, for careful reading of mywork and their suggestions, as well as to the mathematics department, inparticular Russ Walker and Bill Hrusa for their support, as well as to Stella,Ferna and Nancy for always being extraordinarily helpful and kind.

I am grateful (beyond words) to my parents and sister as well as to myoldest friends from the old country who never for a moment entertained thenotion that I could fail at anything I do.

Thanks to Chad for helping me through the first two years and to myoffice-mates from the math ghetto also known as the physical plant buildingfor friendship and great conversation, to Linda Hooper for endless emotionalsupport, to my students for making teaching great, to Dr. Hirsch, and allthe other wonderful people of Pittsburgh and the world who made the lastsix years not so bad at all.

iii

Page 6: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

iv ACKNOWLEDGEMENTS

Page 7: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Contents

Acknowledgements iii

Introduction vii

1 General Preliminaries 1

1.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Some Useful Facts . . . . . . . . . . . . . . . . . . . . . . . . 6

1.3 Brief Overview of Ergodic Theory . . . . . . . . . . . . . . . 7

1.4 Challenges of Reverse Mathematics . . . . . . . . . . . . . . . 9

1.5 Parallels With Constructive Mathematics . . . . . . . . . . . 11

2 Banach and Hilbert Spaces 15

2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2 Properties of Banach and Hilbert Spaces . . . . . . . . . . . . 19

2.3 Bounded Linear Functionals . . . . . . . . . . . . . . . . . . . 28

3 The Mean Ergodic Theorem 33

3.1 Proof of the Main Theorem . . . . . . . . . . . . . . . . . . . 33

3.2 Reversal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4 Aspects of Measure Theory 41

4.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.2 Products and Powers in Lp Spaces . . . . . . . . . . . . . . . 48

4.3 Further Properties of Lp Spaces . . . . . . . . . . . . . . . . . 56

4.4 Integrable Functions and Sets . . . . . . . . . . . . . . . . . . 59

4.5 The Monotone Convergence Theorem . . . . . . . . . . . . . 63

5 The Pointwise Ergodic Theorem 69

5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

5.2 Convergence in Norm . . . . . . . . . . . . . . . . . . . . . . 76

v

Page 8: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

vi CONTENTS

5.3 The Maximal Ergodic Theorem . . . . . . . . . . . . . . . . . 805.4 First Proof of the Main Theorem . . . . . . . . . . . . . . . . 825.5 Second Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . 845.6 Third Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . 885.7 Reversal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

6 Closing Remarks 99

Appendix A 101

Appendix B 103

Page 9: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Introduction

Formalizing mathematics within second-order arithmetic is far from beinga novel venture. On the contrary, the origin of this practice can be tracedback to the beginning of the twentieth century. The turn of the last cen-tury saw major changes in both practice and philosophy of mathematics,one of the main new developments being the introduction of formal meth-ods. The greatest mathematicians of the time, including Dedekind, Frege,Peano, Russell and Weyl, contributed, in different ways, to the developmentof axiomatic foundations for mathematics. It would be inaccurate to saythat these mathematicians began the development of mathematics withinsecond-order arithmetic, since for the most part they worked in higher-orderlogic or without an underlying formal system. It wasn’t until Hilbert thatfirst and second-order logic were clearly separated from higher-order logic.It was also Hilbert who first advocated second-order arithmetic as a for-mal system suitable for development of analysis. He wrote about first andsecond-order logic as early as 1917/18 in his lecture notes for the course“Prinzipien der Mathematik,” but the true pioneering work in this area wasGrundlagen der Mathematik that Hilbert wrote with Bernays, and whichwas first published in 1934. More than just recognizing second-order logicas a separate entity, they gave axioms for second-order arithmetic and ex-pressed the belief that analysis can be done naturally in this formal system.They proceeded to develop some of the theory and proved a number ofresults. Between the 1950’s and 1970’s this subject was taken up by a num-ber of mathematicians, including Kleene, Kreisel, Friedman, Feferman andTakeuti, who undertook an even finer analysis, dividing second-order arith-metic into subsystems of varying logical strengths. The question arose ofwhich of the standard mathematical theorems are provable in which frag-ments of second-order arithmetic, and has since been successfully answeredin a number of different areas. More on this subject, in the broader contextof historical development of proof theory, can be found in [1].

A question that often seems to arise in connection to doing mathematics

vii

Page 10: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

viii INTRODUCTION

in second-order mathematics is: what is the purpose of this endeavor? If settheory provides a sound foundation for the practice of mathematics, whyforsake it for a much weaker theory? The simple answer is this: using thecomplex machinery of ZFC for proving theorems of countable mathematicsis, in a sense, an overkill. It is frequently the case that only a fragment offull set theory is used in a proof. Our goal is to determine exactly what thisfragment is. We want to know not only whether a theorem can be formalizedand proved in second-order arithmetic, but also precisely how much of its fullstrength is used. The motivation for this approach is primarily foundational,though it should be noted that it also provides insight into the structure andcomplexity of proofs and the mathematical concepts involved.

It is true that the language of second-order arithmetic is limited: theonly objects it allows for are natural numbers and sets of natural numbers.But if we focus our attention on non set-theoretic mathematics, it turnsout that second-order arithmetic suffices for most considerations, both withrespect to representations and proofs.

Full second-order arithmetic consists of axioms for an ordered semiring,the induction axiom and the comprehension scheme

∃X∀n(n ∈ X ↔ ϕ(n))

where ϕ is any formula of the language in which X doesn’t occur freely.This implies that every set defined by a formula in the language exists. Inmost cases, however, full comprehension is more than is needed to provea theorem and can be replaced with weaker set existence axioms, yieldingdifferent subsystems of second-order arithmetic. Conveniently, there is noneed for a whole array of such subsystems: six suffice to prove most theoremsof countable mathematics, and I will for the most part be using only threeout of those six.

The name frequently used for this area of research is reverse mathemat-ics. It comes from the nature of the pursuit: not only do the axioms applycertain theorems, but oftentimes the reverse is also true, and most relevanttheorems are logically equivalent to set existence axioms used to prove them.For example, the Heine-Borel covering theorem is equivalent to weak Konig’slemma over the base system RCA0 and the Bolzano-Weierstrass theorem isequivalent to arithmetic comprehension (ACA) over RCA0. (The subsystemswill be defined in Chapter 1.)

Principal investigators in this area in the past twenty years or so havebeen Harvey Friedman, Stephen Simpson and Simpson’s students. Largeportions of mathematics have been formalized, including topology, measure

Page 11: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

ix

theory, algebra. My research builds on this work, especially that in measuretheory. At the same time, it deals with a number of topics that were notpreviously addressed, such as the theory of Hilbert spaces. Although thetitle of the thesis indicates that it is an investigation into ergodic theory,proofs of the mean and pointwise ergodic theorems are only the end resultof a broader work. Substantial effort needs to be put into setting the groundsfor proving them.

Outline

Chapter 1 provides general definitions and preliminaries, as well as a briefoverview of ergodic theory, a discussion of some of the peculiarities andpossible drawbacks of working in second-order arithmetic, and a comparisonwith constructive mathematics. Hilbert spaces are discussed in the secondchapter, in particular, existence of orthogonal projections and propertiesof bounded linear functionals, interesting in their own right, but essentialin the proof of the mean ergodic theorem. Next follows the proof of thistheorem and its reversal to (ACA) for a general Hilbert space. Then, beforemoving on to the pointwise ergodic theorem, an entire chapter (Chapter 4)is dedicated to measure theory. Though it is almost entirely self-contained,it lists properties of integrable functions (and more generally elements of Lp

spaces) needed to prove the pointwise ergodic theorem. The last chaptercontains three proofs of the pointwise ergodic theorem and its reversal to(ACA). The appendices contain proofs of a few facts that are used in thetext, but are not necessary for its understanding.

Revisions

This thesis was submitted to the Dean of the Mellon College of Science atCarnegie Mellon University on May 5, 2004. In August 2004 I encountereda few errors in the text. This is a revised version. The text is different fromthe original in the following ways:

1. Some stylistic alterations were made. These changes have no effect onany of the results.

2. Section 4.2 has been slightly modified. Lemma 4.2.2 was incorrect inthe original version, as it was stated and proved in RCA0. This neces-sitated a few changes in wording and in some conclusions throughoutthe section.

Page 12: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

x INTRODUCTION

Page 13: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Chapter 1

General Preliminaries

1.1 Definitions

Note: Most definitions in this and other chapters are contained in Simpson’smonograph [26], which is the most comprehensive account on the topic ofsecond-order arithmetic to date.

The language of second-order arithmetic (usually referred to as L2) isa two-sorted language. This means that there are two types of variables:number variables, denoted as n,m, . . . and intended to range over naturalnumbers, and set variables, denoted as X,Y, . . . intended to range over setsof natural numbers. Terms are built from variables and constant symbols 0and 1. Atomic formulas are of the form t1 = t2, t1 < t2 and t1 ∈ X, wheret1 and t2 are terms. Finally, formulas are built from atomic formulas bymeans of propositional connectives, and number and set quantifiers.

Full second-order arithmetic (usually referred to as Z2 or Π1∞-CA0) is

formalized by classical propositional logic and axioms for second-order arith-metic. The latter are divided into:

• Basic (ordered semiring) axioms:n+ 1 6= 0m+ 1 = n+ 1 → m = nm+ 0 = mm+ (n+ 1) = (m+ n) + 1m · 0 = 0m · (n+ 1) = (m · n) +m¬(m < 0)m < n+ 1 ↔ (m < n ∨m = n);

1

Page 14: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

2 CHAPTER 1. GENERAL PRELIMINARIES

• Induction axiom

(0 ∈ X ∧ ∀n (n ∈ X → n+ 1 ∈ X)) → ∀n (n ∈ X);

• Comprehension scheme

∃X (ϕ(n) ↔ n ∈ X),

where ϕ(n) is any formula in L2 and X is not free in ϕ(n).

Different subsystems of second-order arithmetic are obtained by restrict-ing comprehension and replacing it with weaker set existence axioms. Insome cases induction is also modified.

Definition 1.1.1 An L2 formula is said to be a bounded quantifier formulaif all the quantifiers occurring in it are of the form ∀n < t , ∃n < t , ∀n ≤ tor ∃n ≤ t .

An L2 formula is Σ01 if it is of the form ∃n ϕ where n is a number

variable and ϕ a bounded quantifier formula. Similarly, a Π01 formula is of

the form ∀n ϕ.More generally, a Σ0

k formula is of the form ∃n1 ∀n2 ∃n3 . . . nk ϕ, wheren1, . . . nk are number variables and ϕ is a bounded quantifier formula. Sim-ilarly, a Π0

k formula is of the form ∀n1 ∃n2 . . . nk ϕ.A formula of L2 is an arithmetical formula if it is equivalent to a Σ0

k orΠ0

k formula for some k.A formula is Π1

1 if it is of the form ∀X θ, where X is a set variable, andθ is an arithmetic formula.

Note: If a formula being considered has only first-order quantifiers, andif there is no confusion, the superscript 0 will be omitted. For example, Σ1

stands for Σ01 etc.

Most countable mathematics can be formalized in one of the five systemslisted below, ordered from weakest to strongest. The subscript 0 will standfor restricted induction, since no system allows induction over an arbitraryformula in the language.

• [RCA0]. Basic axioms, plus Σ1 induction and ∆1 comprehension.

• [WKL0]. RCA0 together with weak Konig’s Lemma (WKL), a compact-ness principle stating that every infinite subtree of 2<N has a path.

• [ACA0]. RCA0 plus arithmetic comprehension axiom (ACA).

Page 15: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

1.1. DEFINITIONS 3

• [ATR0]. ACA0 with arithmetical transfinite recursion.

• [Π1

1-CA0]. ACA0 plus Π1

1 comprehension (Π 1

1-CA).

Though Simpson doesn’t consider it one of the main subsystems, a sixthsystem, WWKL0, will be of great use to us. It lies between RCA0 and WKL0

and consists of the axioms for RCA0 and the weak-weak Konig’s Lemma(WWKL), stating that if T is a subtree of 2<N with no infinite path, then

limn|{σ∈T | lh(σ)=n}|

2n = 0.Much more could be said about the origin of each of these subsystems and

their relationships with pertinent mathematical programs, such as Weyl’spredicativism (ACA0) or Hilbert’s finitism (WKL0), but, since Chapter 1 of[26] provides all the necessary information, there is no need to discuss ithere.

All the remaining definitions in this section take place in RCA0.The basic building block of second-order arithmetic are natural numbers.

As is customary, integers and rational numbers are represented as pairs ofnatural numbers. Furthermore, a sufficient portion of number theory isformalizable in RCA0, most importantly, it is possible to code finite sets ofnatural numbers as single numbers within this system. This enables us todiscuss sequences of rational numbers, and consequently real numbers.

Definition 1.1.2 A real number is represented as a strong Cauchy sequenceof rational numbers, that is, a sequence 〈qn | n ∈ N〉 such that qn ∈ Q forall n, and ∀m ∀n (m < n→ |qm − qn| ≤ 2−m).

Two real numbers represented as 〈qn〉 and 〈q′n〉 are equal if ∀n (|qn − q′n| ≤2−k+1) and it can be shown that all algebraic operations are well-defined. Ifx and y are real numbers, the statements x = y, x ≤ y are Π1, while x 6= y,x < y are Σ1.

The set of real numbers is the completion of the countable dense sequenceQ. This definition can be generalized:

Definition 1.1.3 A (code for a) complete separable metric space A is pre-sented as a nonempty set A ⊆ N together with a sequence of real numbersA×A→ R such that

1. d(a, a) = 0,

2. d(a, b) = d(b, a),

3. d(a, b) + d(b, c) ≥ d(a, c),

Page 16: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4 CHAPTER 1. GENERAL PRELIMINARIES

for a, b, c ∈ A.

A point of A is a sequence 〈an | n ∈ N〉 such that d(an, am) < 2−m form < n. If a and b are two points in A, given as a = 〈an〉 and b = 〈bn〉,then d(a, b) = limn d(an, bn), and two points a and b are considered equal ifd(a, b) = 0, which makes d a metric. Each a ∈ A is identified with the point〈a | n ∈ N〉 ∈ A, and A is dense in A.

In other words, a complete separable metric space is the completion ofa countable dense subset A, and the elements of such a space are strongCauchy sequences of elements of A. This is a typical construction whichcorresponds to the standard procedure of completing a metric space, and wewill see later that the definitions for Banach and Hilbert spaces are rathersimilar to this one. Note that the set of real numbers does not formally existas an object in second-order arithmetic and neither does any metric spaceA.

The name complete metric space is justified, because RCA0 proves thefollowing: if 〈xn〉 is a sequence of elements of A with the property thatd(xn, xm) ≤ r(m) for m < n, where limn r(n) = 0, then there exists anelement x ∈ A such that d(x, xn) → 0.

Definition 1.1.4 A code for a (partial) continuous function φ : A → Bbetween two complete separable metric spaces A and B is given as a set ofquintuples Φ ⊆ N ×A× Q+ ×B × Q+ with the following properties (write(a, r)Φ(b, s) to abbreviate ∃n ((n, a, r, b, s) ∈ Φ)):

1. If (a, r)Φ(b, s) and (a, r)Φ(b′, s′) then d(b, b′) ≤ s+s′ (all neighborhoodsof Φ(x) intersect when x is within r from a).

2. If (a, r)Φ(b, s) and (a′, r′) < (a, r) (meaning that d(a, a′)+r′ < r) then(a′, r′)Φ(b, s) (consistency condition).

3. If (a, r)Φ(b, s) and (b, s) < (b′, s′) then (a, r)Φ(b′, s′) (if φ(x) is in anopen ball, it is also in all open balls containing it).

The formula (a, r)Φ(b, s) intuitively means that if x ∈ B(a, r), then φ(x) isin the closure of B(b, s) (provided it is defined) and all such formulas giveenough information about Φ(x).

A point x ∈ A is said to be in the domain of φ if for all ǫ there exists(a, r)Φ(b, s) such that d(x, a) < r and s < ǫ. It is provable within RCA0

that if x ∈ dom(φ), there is a unique point y ∈ B such that d(y, b) ≤ s forall (a, r)Φ(b, s) and d(x, a) < r.

Page 17: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

1.1. DEFINITIONS 5

We will have little use for this definition. A much more important spaceof functions will be the space C(X) defined in Section 4.1 on page 42, andcan be thought of as the space of all functions on a compact space withmoduli of uniform continuity.

Definition 1.1.5 A continuous function f : A → B between two metricspaces is uniformly continuous if for every ε > 0 there exists δ > 0 suchthat whenever d(x, y) < δ then d(f(x), f(y)) < ε. If there exists a functionh : N → N such that d(x, y) < h(n) → d(f(x), f(y)) < 2−n, the function his the modulus of uniform continuity.

Definition 1.1.6 (A code for) an open subset U of a complete separablemetric space is given as U ⊆ N×A×Q+ and a point x ∈ A is said to belongto U if

∃n ∃a ∃r (d(x, a) < r ∧ (n, a, r) ∈ U).

The formula encodes the open set which is the (countable) union of openballs B(a, r) with rational radii, and centered at the points of A. It isprovable in RCA0 that finite intersections and countable unions of open setsare open. The formula x ∈ U , where U is an open set, is Σ1.

A set is closed if it is the complement of an open set. A closed set C isrepresented with the same code as an open set U , and x ∈ C if and only ifx 6∈ U .

This representation of closed sets is not always useful, since it only pro-vides negative information. Another definition will be given later (Defini-tion 2.1.7, page 18), one that will be more convenient to work with. Toprove that the two notions coincide, stronger set existence axioms, (ACA)and (Π 1

1-CA) are needed.

An important property that a set may have is locatedness. It will pro-vide additional information about the set in question, as will be shown inSection 2.2. The definition applies to all sets, though for our purposes, theonly relevant case is when C is closed.

Definition 1.1.7 A subset C of a complete separable metric space X is saidto be located if there exists a continuous function f such that f(x) = d(x,C)for all x ∈ X The function f is called a locating function.

The Heine-Borel characterization of compactness and sequential compact-ness are both inconvenient from the point of view of second-order arithmetic,as the former is equivalent to (WKL) and the latter to (ACA) over RCA0,which means that having one or the other would be the same as having

Page 18: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

6 CHAPTER 1. GENERAL PRELIMINARIES

weak Konig’s lemma or arithmetic comprehension, respectively, available inRCA0, which is obviously undesirable. The definition below coincides withthat used by Bishop.

Definition 1.1.8 A compact metric space is a complete separable metricspace A such that there is an infinite sequence of finite sequences 〈xij | i ≤nj , j ∈ N〉 of points from A with the property that

∀z ∈ A ∀j ∈ N ∃i ≤ nj (d(xij , z) < 2−j).

1.2 Some Useful Facts

The results in this section were established earlier. Proofs of the first twocan be found in [2], as well as the proof of Proposition 1.2.3, except for(1) → (2), which is given in [26]. They are simple facts with straightforwardproofs that will be used throughout.

Lemma 1.2.1 (RCA0) Let ϕ(x, y) be any Σ1 formula, possibly with set andnumber parameters other than the ones shown. Then

∀x ∃y ϕ(x, y) → ∃f ∀x ϕ(x, f(x)).

We will also make use of least element principles. The following claimstates that in RCA0 the least element principle is available for Σ1 (and Π1)formulas. The class of ∆0(Σ1) formulas is defined to be the smallest class offormulas containing the Σ1 formulas and closed under Boolean operations(including negation) and bounded quantification.

Proposition 1.2.2 (RCA0) The following induction principles are deriv-able:

1. Ordinary induction for ∆0(Σ1) formulas.

2. Complete induction for ∆0(Σ1) formulas.

3. The least-element principle for ∆0(Σ1) formulas.

Reversals are always proved using one of the following equivalent char-acterizations.

The Turing jump of Z ⊆ N is defined as the set {x | ∃y θ(x, y, Z)}, whereθ is ∆0 and ∃y θ(x, y, Z) is a complete Σ1 formula.

Proposition 1.2.3 (RCA0) The following statements are equivalent:

Page 19: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

1.3. BRIEF OVERVIEW OF ERGODIC THEORY 7

1. Every increasing sequence 〈an〉 of real numbers in [0, 1] has a limit.

2. If 〈bn〉 is any sequence of nonnegative reals such that for each n,∑

i<n bi ≤ 1, then∑

n bn exists.

3. If 〈cn〉 is any sequence of reals such that for each n,∑

i<n c2i ≤ 1, then

n c2n exists.

4. If 〈dn〉 is any sequence of real numbers, the set D = {i ∈ N | di 6= 0}exists.

5. For every Z ⊆ N, the Turing jump of Z exists.

6. (ACA)

1.3 Brief Overview of Ergodic Theory

The original motivation for ergodic theory comes from the following problemfrom physics:

Consider a mechanical system with n degrees of freedom. . . . thesystem consists of k particles in a vessel in three-dimensionalspace. Assuming that the masses of these particles are com-pletely known, the instantaneous state of the system can be de-scribed by giving the values of the n coordinates of positiontogether with the corresponding n velocities.. . . The state of thesystem becomes from this point of view a point in 2n dimen-sional space. As time goes on, the state of the system changes inaccordance with the appropriate physical laws.. . . In accordancewith classical, deterministic, mechanics, that entire trajectorycan, in principle, be determined, once one point of it is given.In practice we almost never have enough information for such acomplete determination.. . . Instead of asking “what will the stateof the system be at time t?”, we should ask “what is the prob-ability that at time t the state of the system will belong to aspecified subset of phase space?”. The questions of greatest in-terest are the asymptotic ones: what will (probably) happen tothe system as t tends to infinity? (Halmos, [16, p.1-2])

To simplify matters, instead of considering continuous time flow in theproblem above, the system is observed at discrete moments. To rephrasethe last question: what do we know about the convergence of Sn(x) =

Page 20: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

8 CHAPTER 1. GENERAL PRELIMINARIES

1n

∑n−1k=0 f(T k(x)), where f represents “any numerical quantity determined

by the momentary state of the mechanical system (for instance, the force ex-erted by the given system, assumed to be a large collection of gas moleculescontained in a vessel)” (Dunford and Schwartz, [12, p.657]). Based on a the-orem by Liouville, “if the coordinates used in the description of the phasespace are appropriately chosen, then the flow in phase space leaves all vol-umes invariant. In other words, the transformations that constitute the floware measure preserving transformations...” (Halmos, [16, p.2]). It is there-fore reasonable to suppose that µ(T−1(E)) = µ(E) if E is a measurable set.If f is a characteristic function of a set E, the limit of the average above (ifit exists) can be thought of as the mean amount of time the point x spendsin E.

This question occupied physicists at the end of the 19th and beginning ofthe 20th century; as its appeal to them began to wane, it became increasinglyinteresting to mathematicians. Ergodic theory especially flourished between1930 and 1960. Two of the earliest results in ergodic theory were mean andpointwise ergodic theorems, the subject of this treatise. The mean ergodictheorem was proved first, by von Neumann in 1931 [28], but he did notpublish it until the following year. It established convergence in the meanof the aforementioned sequence of averages 〈Snf | n ∈ N〉 for all squareintegrable functions. The pointwise ergodic theorem (also referred to asBirkhoff’s, or sometimes as Birkhoff-Khinchine’s ergodic theorem) the more“prestigious” result (von Neumann’s biographer refers to the mean ergodictheorem as the “naive” version of the pointwise) was proved and publishedby Birkhoff [4] in 1931. It showed almost everywhere convergence of thesequence 〈Snf〉 for every integrable function f . Proofs of both theoremsconsidered standard today are due to Riesz [23].

Ergodic theory has become a prominent area of mathematics. Require-ments on the systems considered have been altered and weakened and thereare many modifications and variations of the theorems. Physicists are mostlyinterested in ergodic transformations, those in which the only invariant setshave measure 0 or 1, and much research has been done in finding necessaryand sufficient conditions for ergodicity.

This dissertation is only concerned with the two main theorems in theirmost basic form. The sources used can be divided into two groups: classicaland constructive. The relevant classical sources, apart from [16] are [3, 12,21, 29]. The most important constructive sources for the ergodic theoremsare [6, 5, 7, 20] and especially [27]. An extensive historical introduction intothe subject (and the controversies connected to it) is given in [13].

Page 21: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

1.4. CHALLENGES OF REVERSE MATHEMATICS 9

1.4 Challenges of Reverse Mathematics

The goal of a reverse mathematician is to code as large a portion of math-ematics as possible within second-order arithmetic, utilizing only naturalnumbers and sets of natural numbers. This endeavor begins with Godel’ssequencing function, which allows us to code sequences of natural numbers(hence integers and rationals), and consequently more complicated objectswe will deal with, with natural numbers. In the construction every finiteobject is assigned a single natural number, its code. It is to be expected thatthis alters and limits discourse, even if we are only interested in countablemathematics. A consequence of this limitation is that, because quantifica-tion is permitted only over numbers and sets, it is not possible to considerequivalence classes of infinitary objects, so for example, a real number is notthe equivalence class of all Cauchy sequences convergent to that number.Instead, any such sequence represents that number. Similarly, an integrablefunction has different representations. For our theory to be sound, it iscrucial that results do not depend on representations of objects. For mostconsiderations, this issue does not come up or it is trivial to resolve it. Theonly nontrivial occurrence of this concern we will come across here is whendealing with integrable functions, more specifically with their products andpowers, and we will deal with it accordingly.

The greatest challenge, however, lies in formalizing familiar concepts inthis restricted language. How do we choose definitions? Set theory often-times provides a number of classically equivalent definitions for a term, someof which are not compatible with the countability requirement, and even ifthey are, it may not be clear which one to use. How is this decided? Howdo we know that the definition we end up choosing is the best one? Andwhich criteria are used to decide what “best” means?

For example, there are multiple ways to introduce the real numbers. Wewill opt for Cauchy sequences with a fixed rate of convergence. What wouldhappen if another definition was chosen? Would the results obtained bedifferent? Most of the time these questions can be answered and differentapproaches can be compared. In the example of compact spaces (and moregenerally), we want definitions with least logical strength. Choosing a differ-ent definition of compactness from the one we have would change the resultsassociated with it. Or, consider closed sets. By choosing different defini-tions, we will have different outcomes regarding orthogonal projections. Ifthe set under consideration is closed in the sense given in Section 1.1, it isnot even clear if projection exists as an operator. Sometimes the questionis not which definition is preferable, but whether it can be used at all. For

Page 22: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

10 CHAPTER 1. GENERAL PRELIMINARIES

example, it is not possible to define an arbitrary function; and even whenwe restrict our attention to continuous functions, they are still not given viathe usual epsilon-delta characterization, but in terms of countable represen-tations. Though counterintuitive, this definition is consistent with the goalof assigning every object its code.

In reverse mathematics, as in the study of constructive and re-cursive mathematics, it is common to insist that objects comeequipped with such additional information, especially when suchinformation is typically available. In the prolog to Bishop andBridges’ Constructive Mathematics, Bishop refers to this practiceas the “avoidance of pseudogenerality.”

Of course, in many instances, the choice of formal definitionis more-or-less canonical, or various natural definitions can beshown to have equivalent properties in a weak base theory.. . .Wecontend, however, that in situations where there are a plurality ofinequivalent “natural” representations of mathematical notions,this should not be viewed as a bad thing. Indeed, the nuancesand bifurcations that arise constitute much of the subject’s ap-peal! Set theoretic foundations provide a remarkably uniformlanguage for communicating mathematical concepts, as well aspowerful principles to aid in their analysis; but from the pointof view of the mathematical logician, this uniformity and powercan sometimes obscure interesting methodological issues with re-spect to the way the concepts are actually used. (Avigad andSimic, [2, p.3])

It is to be expected that difficulties will arise with respect to the proofsof both ergodic theorems. Here are some of the problems:

(Mean ergodic theorem) Classically, each Hilbert space can be decomposedinto subspaces M and N (defined later) which are orthogonal to eachother. The limit of the sequence of operators 〈Sn〉 is the projectionoperator on M (denoted as PM ) and (classically) the existence of PM

implies the existence of the limit. Showing that N is the orthogonalcomplement of M is not trivial (or, generally, provable in RCA0), andthe existence of PM may not imply convergence of 〈Sn〉 at all.

(Pointwise ergodic theorem) Classically, the measure preserving transfor-mation T is defined on the points of the space X. For a number

Page 23: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

1.5. PARALLELS WITH CONSTRUCTIVE MATHEMATICS 11

of reasons, we are unable to do the same. Instead, we consider thetransformation that T induces on the space L1(X).

(Pointwise ergodic theorem) It may not be clear which properties the in-duced transformation need have. Classically, it is multiplicative, non-negative, preserving L1 and L∞ norm. With our definition it will turnout that some of these properties imply others, and that the assump-tions made also depend on the proof.

(Pointwise ergodic theorem) The standard proof of the pointwise ergodictheorem is more difficult to formalize than the other two. This proof,first given by Riesz, is classically short and simple. However, it requiresthe knowledge that a certain set is invariant, which is the main sourceof difficulties for us, because second-order arithmetic for the most parttakes a point-free approach to measure. Talking about invariant setsin our framework is difficult, and showing that a set (or function) isinvariant requires using methods borrowed from the classical theory ofintegrable functions, so the proof ends up being less natural than theother two.

These issues will be discussed in more detail in the appropriate chapters.

1.5 Parallels With Constructive Mathematics

There is a close connection between mathematics formalized in subsystemsof second-order arithmetic and constructive mathematics, especially betweenRCA0 and Bishop-style mathematics. Though the similarity between RCA0

and computable mathematics is probably even more pronounced, we con-sider the work of Bishop and Bridges, as it is more complete and covers alltopics we will discuss. Furthermore, a number of their proofs have beeninvaluable in my work.

Simpson pointed out the main differences between the two approachesin [26, p.31]. Apart from the immediate difference that constructivists pre-suppose intuitionist logic (which is why, for example the intermediate valuetheorem fails constructively, but is provable in RCA0), the main distinctionis that Bishop never specified a formal system to work in. There are norestrictions in the complexity of formulas used, induction is not restricted,and sets are defined somewhat vaguely. According to Bishop [6, p.7],

A set is defined by describing exactly what must be done in or-der to construct an element of the set and what must be done

Page 24: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

12 CHAPTER 1. GENERAL PRELIMINARIES

in order to show that two elements are equal.. . . There is alwaysambiguity, but it becomes less and less as the reader continuesto read and discover more and more of the author’s intent, mod-ifying his interpretations if necessary to fit the intentions of theauthor as they continue to unfold.

Bridges [9, p.5] states “Thus, if P (x) is some property of x, we can formthe class {x ∈ A | P (x)} of those x in A for which P (x) holds.” He thenconcludes “Some people find it hard to accept our freedom of construction ofsubclasses by abstraction.” In other words, constructivists allow themselvesfreedom in definitions of sets that we do not have. The maximal ergodictheorem is an example of a theorem that is constructively valid, but needsto be proved in ACA0 in our framework. Similarly, Bishop defines closuresof sets without hesitation. For us, even the statement that an open set in ageneral metric space has a closure is equivalent to (Π 1

1-CA0 ) .

The other cause for discrepancies comes from difference in definitions.To state just one example, a uniformly continuous function in [8] alwayscomes equipped with a modulus of uniform continuity, whereas we needWKL0 to show that a uniformly continuous function has a modulus. Moreoften than not, though, definitions in the two frameworks coincide, likethat for a real number or a compact space, and most constructive proofsare easily translated, for example existence of orthogonal projection or theRiesz representation theorem. In fact, encountering a result of Bishop andBridges which cannot be translated directly into RCA0 is an anomaly. Forthis reason, the question of differences between the two frameworks is moreinteresting than that of similarities.

Either way, the exploration of these issues is a worthy task, but one thatwould require much time and space. We will instead only look into somequestions concerning measure theory and the pointwise ergodic theorem, andwe will do so only in passing. An interesting situation occurs with respect tomeasure theory. Bishop and Bridges call a set that contains the domain ofan integrable function full and are able to show that the complement of a fullset has measure 0, which is quite similar to the statement that an integrablefunction is defined almost everywhere. In comparison, Simpson and Yushowed (see [30, 11]) that to show this fact in our framework, one needsto presuppose (WWKL). More precisely, RCA0 proves that an integrablefunction is defined outside of a Gδ set, while with (WWKL) it can be shownthat this set is null. Where does this difference stem from?

Both approaches to integrability are based on the classical Daniell in-tegral. In particular, Bishop lists axioms that every integration space need

Page 25: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

1.5. PARALLELS WITH CONSTRUCTIVE MATHEMATICS 13

satisfy and proves that C(X) (the space used to build integrable functions)satisfies them. Once this fact is established, it is easy to show that a com-plement of a full set has measure 0. What about our framework? It cannotbe shown in RCA0 that C(X) satisfies the classical definition of an integra-tion space and Appendix A will contain a proof of this fact, but it is stillunclear whether the proof by Bishop and Bridges that C(X) satisfies theconstructive definition can be formalized. It is likely that it cannot be useddirectly, as it relies on properties of sets that we do not have. The otherdiscrepancy is in the definition of complements: they replace the notion ofthe complement of a set with that of complemented sets, pairs of sets thatdo not intersect. They do not prove that the domain of an integrable func-tion is complemented, meaning that it may not always have a characteristicfunction. Before giving any definite answers, however, a much more carefulanalysis of the two approaches needs to be conducted.

The ergodic theorems are not provable constructively. In [6], Bishopgives an example of two equal tanks, with T representing the motion of thefluid inside them. Depending on whether there is a leak between the tanks ornot, the system will exhibit perceptibly different behavior after a sufficientamount of time. We do not know beforehand if there is a leak or not, sodifferent limits of the sequence 〈Sn〉 as defined in the previous section implythe law of the excluded middle (we will see later that even assuming the lawof the excluded middle does not save us, and that proofs of both the meanand pointwise ergodic theorem require arithmetic comprehension). Bishopgave three different proofs (all equally complicated) of the pointwise ergodictheorem. Each used upcrossing inequalities and provided what is called anequal hypothesis substitute for the pointwise ergodic theorem, meaning thatthe conclusion is classically equivalent to the conclusion of the ergodic theo-rem, but is constructively weaker. Nuber [20], his student, provided a proofwhich obtained the classical result, but with stronger assumptions. Theproof relied heavily on functional analysis, and was almost equally compli-cated. Finally, Spitters [27] provided a comprehensible constructive versionof the ergodic theorems. His proof of the mean ergodic theorem is similar tothe standard one and in a number of cases his results provided a foundationfor my proofs.

Page 26: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

14 CHAPTER 1. GENERAL PRELIMINARIES

Page 27: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Chapter 2

Banach and Hilbert Spaces

2.1 Preliminaries

Many of the relevant definitions were given in the introduction. Conceptsintroduced below pertain to Banach and Hilbert spaces that will be used inthis and the next chapter. As before, all definitions are made in RCA0.

Definition 2.1.1 A countable vector space A over a countable field K con-sists of a set |A| ⊆ N with operations + : |A|× |A| → |A| and · : |K|× |A| →|A|, and a distinguished element 0 ∈ |A| such that (|A|,+, ·) satisfies theusual properties of a vector space over K.

Definition 2.1.2 A (code for a) separable Banach space A consists of acountable vector space A over Q together with a sequence of real numbers‖ · ‖ : A→ R such that

1. ‖q · a‖ = |q| ‖a‖ for all q ∈ Q and a ∈ A,

2. ‖a+ b‖ ≤ ‖a‖ + ‖b‖ for all a, b ∈ A.

The construction is similar to that of a complete separable metric space. Apoint of A is a sequence 〈an | n ∈ N〉 of points in A such that ‖am − an‖ ≤2−m for all m < n. The function d(a, b) = ‖a− b‖ is a pseudometric on A.The norm and metric can be extended to all of A and the resulting space isa complete separable normed space.

Definition 2.1.3 A code for a bounded linear operator between separableBanach spaces A and B, is a sequence F : A → B of points of B indexedwith elements of A, such that

15

Page 28: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

16 CHAPTER 2. BANACH AND HILBERT SPACES

1. F (q1a1 + q2a2) = q1F (a1) + q2F (a2) for all q1, q2 ∈ Q and a1, a2 ∈ A,

2. There exists a real number α such that ‖F (a)‖ ≤ α‖a‖ for all a ∈ A.

Then, for x given as 〈an〉 define F (x) = limn F (an) (limit exists in RCA0

by virtue of boundedness of F ) and for x ∈ A, ‖F (x)‖ ≤ α‖x‖.

If continuous linear operators are defined as total continuous functions whichare further linear, RCA0 proves that the two notions coincide: a linear op-erator is bounded if and only if it is continuous.

The definition of a Hilbert space differs little from that of a Banachspace.

Definition 2.1.4 A real separable Hilbert space H consists of a countablevector space A over Q together with a function 〈·, ·〉 : A×A→ R satisfying

1. 〈x, x〉 ≥ 0,

2. 〈x, y〉 = 〈y, x〉,

3. 〈ax+ by, z〉 = a〈x, z〉 + b〈y, z〉,

for all x, y, z ∈ A and a, b ∈ Q.

Given a Hilbert space H as above, define a function d(x, y) on A by d(x, y) =〈x − y, x − y〉1/2. As before, d is a pseudometric and H is the completionof A under this pseudometric, which is often written as H = A. The innerproduct is extended to the whole space by letting 〈x, y〉 = limn〈xn, yn〉 forx and y represented as 〈xn〉 and y = 〈yn〉 respectively. The inner productcan be shown to be continuous. Once again, define x = y to mean thatd(x, y) = 0, making d a metric. The space H is separable and Cauchycomplete.

The norm function on a Hilbert space is defined by ‖x‖ = 〈x, x〉 1

2 =d(x, 0), so every Hilbert space gives rise to a Banach space, in the sense ofreverse mathematics.

Clearly, every statement that holds for all Banach spaces will hold forHilbert spaces as well. Because Hilbert spaces have a richer geometric struc-ture, however, the reverse is not true. There exist plenty of results that arevalid specifically for Hilbert spaces.

To prove the reversal of the mean ergodic theorem, complex Hilbertspaces need to be introduced. Although we haven’t explicitly defined whata complex number is, it should be clear that it can be defined as an orderedpair of reals, with operations defined in the standard way. Just as the

Page 29: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

2.1. PRELIMINARIES 17

real numbers are the completion of the rationals, complex numbers are acompletion of complex rationals, the countable set Q(i), which can also bethought of as the set of all ordered pairs of rational numbers. The definitionof a complex Hilbert space is then as follows:

Definition 2.1.5 A complex separable Hilbert space H consists of a count-able vector space A over Q(i) together with a function 〈·, ·〉 : A × A → R

satisfying

1. 〈x, x〉 ≥ 0,

2. 〈x, y〉 = 〈y, x〉,

3. 〈ax+ by, z〉 = a〈x, z〉 + b〈y, z〉,

for all x, y, z ∈ A and a, b ∈ Q(i).

The notion of a bounded linear operator can be lifted accordingly.

The standard examples of separable infinite dimensional Hilbert and Ba-nach spaces can be developed in RCA0. Examples of Hilbert spaces includethe space L2(X) of square-integrable real-valued functions on any compactseparable metric space X, and the space l2 of square-summable sequencesof reals. An example of a complex Hilbert space is the space of square-summable sequences of complex numbers l2(C). Standard examples of Ba-nach spaces are the space of uniformly continuous functions with a modulusof uniform continuity C(X), and L1(X), the space of integrable real-valuedfunctions over X. All of these spaces will occur in later chapters.

The mean ergodic theorem is usually stated for a transformation on aHilbert space which is an isometry, but it is enough that it be nonexpan-sive. An isometry is a bounded linear operator that preserves norm, so thedefinition is analogous to that of a bounded linear operator.

Definition 2.1.6 An isometry on a Banach (Hilbert) space H = A is amapping T : A→ H such that

1. T (αx+ βy) = αT (x) + βT (y) for α, β ∈ Q and x, y ∈ A,

2. ‖Tx‖ = ‖x‖ for all x ∈ A.

A transformation is nonexpansive if ‖Tx‖ ≤ ‖x‖ holds for all x ∈ A insteadof clause 2. above.

Since we will be looking at iterations of the transformation T , we willneed to know that Tn is also an isometry (or nonexpansive) for all n. Using

Page 30: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

18 CHAPTER 2. BANACH AND HILBERT SPACES

Lemma 1.2.1 we can make sense of the sequence of iterations Tn, and henceof the partial averages 〈Sn〉. For every x and y the following holds:

‖Tx− Ty‖ = ‖T (x− y)‖ ≤ ‖x− y‖.

Since T is a continuous function, the identity is a modulus of continuity forT . It is shown in [14] that if a function has a modulus of uniform continuity,then Tn is continuous for all n. Since the statement that Tn is linear (resp. anisometry, nonexpansive operator) is Π1, and since Π1 induction is equivalentto Σ1 induction, it can be proved by induction in RCA0 that if T is linear(resp. an isometry, nonexpansive linear operator) then so is Tn for each n.

Let us now return to closed sets. As was already mentioned, there ismore than one notion of closed to consider. A more useful concept from aclosed set being the complement of an open is that of a separably closed set.

Definition 2.1.7 The code for a separably closed subset S of a metric spaceX = A is given as a countable sequence 〈xn | n ∈ N〉 of points in A.

The set S is the completion of the sequence 〈xn〉, therefore x ∈ S means∀m ∃n (d(x, xn) < 2−m).

Definition 2.1.8 A closed subspace of a Hilbert (resp. Banach) space is asubset of the space which is a Hilbert (resp. Banach) space in itself.

Note: Upon inspection, it can be seen that a separably closed subset of aBanach or Hilbert space is a closed subspace of the space and conversely,i.e. the two definitions are equivalent and can be used interchangeably.

For orthogonal projection onto a set to exist, the set in question needsto be linear. Consider the following four notions for a Banach space X.

1. A closed linear set is a closed subset of X that is further linear.

2. A closed subspace is a subset of X which is a Banach space in itself.

3. A located closed linear set is a closed linear set with a locating function.

4. A located closed subspace is a closed subspace with a locating function.

These definitions are not vacuous, nor are the relationships among themtrivial. We showed in [2] that the third and fourth notion in the abovedefinition coincide in RCA0. All others are distinguishable in RCA0 andoccur in practice. For example, if f is a bounded linear functional, its kernel,kerf , is a closed linear set; it is provable in RCA0 that it is also a subspace,

Page 31: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

2.2. PROPERTIES OF BANACH AND HILBERT SPACES 19

and that it is located if and only if the norm of f exists. In the proof ofthe mean ergodic theorem we will consider the set {x | Tx = x}. This setis closed and linear. However, for an appropriately chosen T , (ACA) willbe required to show that it is a closed subspace. Similarly, the set spannedby {x − Tx | x ∈ A} is a closed subspace, but, as above, (ACA) may benecessary to prove it is a closed set.

Definition 2.1.9 Let M be a closed subspace of a Hilbert space H. Let xand y be elements of H, with y in M . If d(x, y) ≤ d(x, z) for any otherpoint z in M , y is said to be the projection of x on M . Let P be a boundedlinear operator from H to itself. If for every x in H, Px is the projectionof x on M , P is said to be the projection function for M .

Although existence of orthogonal projections is typically associated withHilbert spaces, there are certain Banach spaces that have this property, forexample, uniformly convex spaces. Classically, a Banach space is uniformlyconvex if it has the additional property that for every two sequences ofvectors 〈xn〉 and 〈yn〉, if ‖xn‖ → 1, ‖yn‖ → 1 and ‖xn + yn‖ → 2, then‖xn − yn‖ → 0. For our purposes, a modulus of convexity is needed, givingthe rate of convergence of the norms. The modified definition is:

Definition 2.1.10 A Banach space is uniformly convex if there exists afunction δ : Q → Q such that whenever ‖u‖ ≤ 1, ‖v‖ ≤ 1 and ‖u + v‖ ≥2 − δ(ε), then ‖u− v‖ ≤ ε.

This property is a substitute for the parallelogram identity, which does nothold in Banach spaces without an inner product.

2.2 Properties of Banach and Hilbert Spaces

In [2] we showed that every Hilbert space has an orthonormal basis in RCA0.Since it has no bearing on the proof of the mean ergodic theorem, this proofis omitted.

Let us now establish the relationship between the notions of “closed”and “separably closed”. In the case of metric spaces, the work has beendone by Brown [10].

Theorem 2.2.1 (RCA0) Each of the following statements is equivalent to(ACA):

1. In a compact metric space, every closed set is separably closed.

Page 32: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

20 CHAPTER 2. BANACH AND HILBERT SPACES

2. In [0, 1], every closed set is separably closed.

3. In an arbitrary separable metric space, every separably closed set isclosed.

4. In [0, 1], every separably closed set is closed.

Each of the following statements is equivalent to (Π 1

1-CA):

1. In an arbitrary space, every closed set is separably closed.

2. In Baire space (NN), every closed set is separably closed.

The following theorem collects some of the results obtained in [2]. In asense, it is the analogue of Theorem 2.2.1 for Banach (and Hilbert) spaces.

Theorem 2.2.2 (RCA0) With respect to Banach spaces, the following hold:

1. The statement that every closed subspace is a closed linear set is equiv-alent to (ACA).

2. The statement that every closed linear set is a closed subspace is im-plied by (Π 1

1-CA) and implies (ACA).

3. Every located closed subspace is a located closed linear set and con-versely.

4. The statement that every closed subspace is located is equivalent to(ACA).

5. The statement that every closed linear set is located is implied by(Π 1

1-CA) and implies (ACA).

Notice that the exact strength of 2. and 5. is not known. This question,along with a number of others, remains open. See also Chapter 6 or Section16 of [2].

The above theorem also shows that proving that a certain set is locatedusually requires strong set existence axioms. The following results hold formetric spaces and are available in [10].

Theorem 2.2.3 (RCA0) Each of the following statements is equivalent to(ACA):

1. Every nonempty closed set in a compact space is located.

2. Every nonempty closed set in [0, 1] is located.

Page 33: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

2.2. PROPERTIES OF BANACH AND HILBERT SPACES 21

3. Every nonempty separably closed set in an arbitrary space is located.

4. Every nonempty separably closed set in [0, 1] is located.

5. Every nonempty open set in an arbitrary space is located.

6. Every nonempty open set in [0, 1] is located.

Each of the following is equivalent to (Π 1

1-CA):

1. Every nonempty closed set in an arbitrary space is located.

2. Every nonempty closed set in Baire space is located.

Note: It was mentioned earlier that the code for a closed subspace pro-vides more information than the code for a closed set. The previous theoremjustifies that claim, as proving that a separably closed set is located requiresweaker axioms than proving the same for a closed set ((ACA) vs. (Π 1

1-CA)).

In [2] we were able to provide an improvement to the above theorem,showing that even existence of distances for individual points requires strongaxioms; more precisely, all relationships remain the same when the require-ment that a set is located is replaced with existence of distance from anypoint to that set.

Theorem 2.2.4 (RCA0) Each of the following is equivalent to (ACA):

1. In a compact space, if C is any nonempty closed set and x is any point,then d(x,C) exists.

2. If C is any nonempty closed subset of [0, 1], then d(0, C) exists.

3. In an arbitrary space, if O is a nonempty open set and x is any point,then d(x,O) exists.

4. If O is any nonempty subset of [0, 1], then d(0, O) exists.

Each of the following is equivalent to (Π 1

1-CA):

1. In an arbitrary space, if C is any closed set and x is any point, thend(x,C) exists.

2. In any compact space, if S is any Gδ set and x is any point, thend(x, S) exists.

3. If S is a Gδ subset of [0, 1], then d(0, S) exists.

Page 34: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

22 CHAPTER 2. BANACH AND HILBERT SPACES

We now restrict our attention to Hilbert spaces and uniformly convexBanach spaces and look into orthogonal projections. The notion of projec-tion makes sense for both separably closed and closed linear sets. We willattempt to determine the strength of the statements “orthogonal projectionof x onto M exists” and “orthogonal projection onto M exists” in bothcases.

Proposition 2.2.5 (RCA0) If the projection of x on M exists, it is unique(up to equality in the Hilbert or Banach space).

Proof. Note that if y is the projection of x on M , then ‖x − y‖ = d(x,M).The claim is proved using the parallelogram identity

‖x+ y‖2 + ‖x− y‖2 = 2‖x‖2 + 2‖y‖2.

Suppose y and y′ are both projections of x on M . Then ‖x− y‖ = ‖x− y′‖,and by linearity 1

2(y + y′) is also in M . But then

‖x− y‖ ≤ ‖x− 1

2(y + y′)‖ = ‖1

2(x− y) +

1

2(x− y′)‖

≤ 1

2‖x− y‖ +

1

2‖x− y′‖

= ‖x− y‖,

hence all inequalities are actually equalities. Let d = ‖x− y‖. The previousparagraph and the parallelogram identity imply

4d2 = ‖(x− y) + (x− y′)‖2

= 2(‖x− y‖2 + ‖x− y′‖2) − ‖y − y′‖2

= 4d2 − ‖y − y′‖2,

so ‖y − y′‖ = 0 and y = y′. �

The following criterion will be useful later:

Lemma 2.2.6 (RCA0) An element y is the projection of x on M if andonly if y is in M , x− y is orthogonal to M and y is the unique vector withthese properties.

Proof. If y is the projection of x onto M , then x − y is orthogonal to M .For, suppose that y = Px and x − y is not orthogonal to M. Let z ∈ M ,

Page 35: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

2.2. PROPERTIES OF BANACH AND HILBERT SPACES 23

such that 〈x− y, z〉 6= 0. Then for every a ∈ R, y − az is also an element ofM , and

〈x− y + az, x− y + az〉 ≥ d(x,M)2 = 〈x− y, x− y〉,

therefore|a|2‖z‖2 + 2a〈x− y, z〉 ≥ 0.

By taking a with sufficiently small absolute value the last inequality can becontradicted, so 〈x− y, z〉 = 0.

To show there can be at most one point y such that x− y⊥M , assumefor the sake of contradiction that there is a y′ with the same property. Then

〈x− y, y〉 = 0 → 〈x, y〉 = 〈y, y〉,〈x− y′, y′〉 = 0 → 〈x, y′〉 = 〈y′, y′〉,〈x− y, y′〉 = 0 → 〈x, y′〉 = 〈y, y′〉,〈x− y′, y〉 = 0 → 〈x, y〉 = 〈y, y′〉.

Therefore, 〈x, y〉 = 〈x, y′〉 = 〈y, y〉 = 〈y, y′〉 = 〈y′, y′〉 and

〈y − y′, y − y′〉 = 〈y, y〉 − 2〈y, y′〉 + 〈y, y′〉 = 0,

so y = y′.Conversely, if y is any vector in M with 〈x− y, z〉 = 0 for all z ∈M , for

any other vector y′ ∈M ,

‖x− y′‖2 = ‖x− y‖2 + ‖y − y′‖2 > ‖x− y‖2,

and y is the closest point in M to x; hence y = Px. �

We will use the notation PM to denote the projection function for M , so,for example, the statement “PM exists” is shorthand for the statement “thereexists a projection function for M .” The next theorem demonstrates therelationship between distances and projections in the context of a subspace.

Theorem 2.2.7 (RCA0) Let H be a Hilbert space, let M be a closed sub-space of H, and let x be any element of H. Then the following are equivalent:

1. The distance from x to M exists.

2. The projection from x to M exists.

Moreover, for any closed subspace M , the following are equivalent:

Page 36: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

24 CHAPTER 2. BANACH AND HILBERT SPACES

1. M is located.

2. The projection function, PM , exists.

Proof. In both cases, the direction (2) → (1) is immediate, since if y is theprojection of x on M , then d(x,M) = d(x, y).

For the first implication (1) → (2), suppose M is a closed subspacewith 〈wm〉 a dense sequence of points in M . By assumption, d(x,M) =inf{d(x, y) | y ∈M} exists.

The definition of infimum implies that

∀n ∃m (d(x,wm) < d(x,M) + 2−n).

By Lemma 1.2.1 there is a sequence of points yn from 〈wm〉 such that forevery n,

d(x,M) ≤ d(x, yn) < d(x,M) + 2−n.

It is enough to show that 〈yn〉 is a Cauchy sequence, because this impliesthat d(x, limn yn) = d(x,M). Using the parallelogram identity,

‖x+ y‖2 + ‖x− y‖2 = 2‖x‖2 + 2‖y‖2,

we have

‖yn − ym‖2 = ‖yn − x− (ym − x)‖2

= 2‖yn − x‖2 + 2‖ym − x‖2 − 4‖1

2(yn + ym) − x‖2

≤ 2 (d(x,M) + 2−n)2 + 2 (d(x,M) + 2−m)2 − 4d2

= (2−n+2 + 2−m+2) d(x,M) + 2−n−m.

The inequality follows from the fact that 12(yn + ym) ∈ M and d(1

2(yn +ym), x) ≥ d(x,M). A more careful algebraic manipulation of the last quan-tity would show that, given that m < n, (2−n+2+2−m+2) d(x,M)+2−n−m ≤r(m), and r(m) → 0 when m→ ∞. In RCA0 this is enough for 〈yn〉 to be aconvergent sequence.

The second implication (1) → (2) is just a uniform version of the preced-ing argument. To define the code of P as a bounded linear operator defineits value at each element of the countable dense subset A of H, as above. Itremains to check that P is linear and bounded. For linearity, given a, b ∈ Q

and x, y ∈ A,

〈ax+ by − (aPx+ bPy), z〉 = a〈x− Px, z〉 + b〈y − Py, z〉 = 0,

Page 37: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

2.2. PROPERTIES OF BANACH AND HILBERT SPACES 25

so by Lemma 2.2.6aPx+ bPy = P (ax+ by),

and P is linear. For any x ∈ H,

‖x‖2 = 〈x, x〉 = 〈x− Px+ Px, x− Px+ Px〉= ‖x− Px‖2 + ‖Px‖2,

and ‖Px‖ ≤ ‖x‖. Therefore, ‖P‖ ≤ 1. �

Note that if there is a nonzero element y in M , then Py = y so ‖P‖ = 1.If in Theorem 2.2.7 “closed subspace” is replaced by “closed linear sub-

set,” the situation changes. The uniform case stays the same: the existenceof a locating function is equivalent to the existence of a projection function.

Theorem 2.2.8 (RCA0) Suppose M is a closed linear subset in a Hilbertspace and x is a point in the space.

1. If PMx exists, then so does d(x,M).

2. M is located if and only if PM exists.

Proof. As before, it is easy to see that if PMx exists, then d(x,M) = ‖x −PMx‖, and similarly, if PM exists then f(x) = ‖x − PMx‖ is a locatingfunction.

It remains to show that if M is located, the projection y of x on M canbe obtained uniformly. The construction is similar to the preceding one, butinstead of choosing ym in a dense subset of M , we choose it from the densesubset A of H, ensuring that ym is close enough to M .

Let d be the locating function. By definition, for every n ∈ N there isxn ∈M such that

d ≤ d(x, xn) ≤ d+ 2−n−1.

By density of A in H there is y′ ∈ A with

d(xn, y′) ≤ 2−n−1,

which means that y′ is such that

d− 2−n ≤ d(x, y′) ≤ d+ 2−n.

Thus we have

∀n ∃y′ (d− 2−n < ‖y′ − x‖ < d+ 2−n ∧ d(y′,M) < 2−n). (2.1)

Page 38: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

26 CHAPTER 2. BANACH AND HILBERT SPACES

By Lemma 1.2.1 there is a sequence 〈yn〉 such that for each fixed n, yn

satisfies (2.1).The proof that 〈yn〉 is a Cauchy sequence is similar to that in the case

of a closed subspace.

‖yn − ym‖2 = 2‖yn − x‖2 + 2‖ym − x‖2 − 4‖1

2(yn + ym) − x‖2

≤ (d+ 2−n)2 + (d+ 2−m)2 − 4‖1

2(yn + ym) − x‖2.

The only difference is in finding the lower bound for ‖12(yn + ym) − x‖, as

yn and ym are not necessarily in M.Since d(yn,M) < 2−n, there is a w′ ∈M such that ‖yn −w′‖ < 2−(n−1),

and similarly there is a w′′ ∈ M such that ‖ym − w′′‖ < 2−(m−1). Then12(w′ + w′′) ∈M , so ‖1

2(w′ + w′′) − x‖ ≥ d. Thus we have

‖1

2(yn + ym) − x‖ ≥ ‖1

2(w′ + w′′) − x‖ − ‖1

2(yn + ym) − 1

2(w′ + w′′)‖

≥ d− 1

2‖yn − w′‖ − 1

2‖ym − w′′‖

≥ d− 2−n − 2−m.

The rest of the proof remains the same: 〈yn〉 is a Cauchy sequence thatconverges to a point y ∈ H. Since for each n, d(yn,M) ≤ 2−n, this meansd(y,M) = 0, and since M is closed, this implies y ∈ M . Furthermore, thisy is the projection of x onto M . �

The strength of the statement “if d(x,M) exists then so does PMx”remains open. It is implied by (Π 1

1-CA): this follows from part 2 of the above

theorem, since, by Theorem 2.2.3, (Π 1

1-CA) suffices to prove the existence

of a locating function. Whether Π11 comprehension is also necessary is not

known.We can use projections to give an alternative proof of one of the claims

made in Theorem 2.2.2 above:

Proposition 2.2.9 (RCA0) Suppose M is closed linear set in a Hilbertspace, and either M is located or PM exists. Then M is a closed subspace.

Proof. By Theorem 2.2.8, saying M is located is equivalent to saying thatPM exists. If A = 〈xn〉, then 〈PMxn〉 is a countable dense subset of M . �

Now that the relationship between distances and projections is known,we can ask the question of what it takes to show the existence of either. Forclosed subspaces of Hilbert spaces, it requires exactly (ACA):

Page 39: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

2.2. PROPERTIES OF BANACH AND HILBERT SPACES 27

Theorem 2.2.10 (RCA0) The following are equivalent:

1. Every closed subspace of a Hilbert space is located.

2. For every closed subspace M , the projection on M exists.

3. For every closed subspace M and every point x, the projection of x onM exists.

4. For every closed subspace M and every point x, d(x,M) exists.

5. (ACA).

Proof. By Theorem 2.2.7 1 and 2 are equivalent, as are 3 and 4, 2 clearlyimplies 3 and by Theorem 2.2.3, 5 implies 1. To close the chain it suffices toshow that either 2 or 4 implies 5; this will be a consequence of Theorem 3.1.3(page 36). �

If we replace “closed subspace” by “closed linear subset,” the answer isless precise:

Theorem 2.2.11 (RCA0) Each of the following statements is implied by(Π 1

1-CA) and implies (ACA):

1. For every closed linear subset M of a Hilbert space, the projection onM exists.

2. Every closed linear subset of a Hilbert space is located.

3. For every closed linear subset M and every point x, the projection ofx on M exists.

4. For every closed linear subset M and every point x, d(x,M) exists.

Proof. Again, we have seen that 1 and 2 are equivalent, and it follows fromTheorem 2.2.3 that they are implied by (Π 1

1-CA). Also, each of 1 and 2

implies 3, which in turn implies 4. The fact that 4 implies (ACA) is givenby Corollary 3.2.2 below (page 39). �

What can we say about Banach spaces? If B is a uniformly convexBanach space, one can still consider the projection of x on M . It will bethe closest point to x in M . The non-uniform part of Theorem 2.2.7 holdsmore generally for uniformly convex Banach spaces. The uniform convexityis needed in showing that 〈yn〉 is a Cauchy sequence, as the parallelogram

Page 40: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

28 CHAPTER 2. BANACH AND HILBERT SPACES

identity does not hold in Banach spaces. We cannot, however, construct abounded linear projection operator on a Banach space. The construction inTheorem 2.2.7 cannot be paralleled: defining PMx as the closest point to xin M when x in the dense countable subset of B does not work because toshow linearity of P , an inner product is necessary — it cannot be replacedwith uniform convexity.

2.3 Bounded Linear Functionals

A bounded linear functional on a Banach space is a bounded linear operatorfrom the space in question to R. Recall that in RCA0, every bounded linearfunctional on a Banach space is equivalent to a linear continuous function.If f is a bounded linear functional, kerf is a closed linear set, since it is theinverse image of the closed set {0}. It was mentioned above that it is also aclosed subspace, provably in RCA0, and that it is located if and only if thenorm of f exists. These facts are proved in the following section.

Unless specified otherwise, assume that B is the underlying Banachspace, spanned by the countable set A = 〈xn〉.Proposition 2.3.1 (RCA0) The kernel of a bounded linear functional in aBanach space is a closed subspace of that space.

Proof. If f is the constant zero function, the statement is trivial. Thus itcan be assumed that there is a y ∈ B such that f(y) 6= 0. Replacing y byy/f(y) if necessary, consider y such that f(y) = 1.

Define the sequence 〈an〉 by an = xn − f(xn)y. Then for each n

f(an) = f(xn − f(xn)y) = f(xn) − f(xn)f(y) = 0,

so an is in the kernel of f . It suffices to show that the sequence an is densein the kernel, that is, for every n and every x such that f(x) = 0, existsxm ∈ A such that ‖x− (xm − f(xm)y)‖ < 2−n.

Because A is dense in B, choose xm ∈ A such that ‖x− xm‖ < ε (whichwill be specified later). Then, since f is bounded, |f(xm)| ≤ Mε (for allx ∈ B, |f(x)| ≤M‖x‖). Therefore,

‖x− (xm − f(xm)y)‖ ≤ ‖x− xm‖ + |f(xm)|‖y‖ ≤ ε+M‖y‖ε.

The choice for ε is now clear: ε = min{ 12n+1 ,

12n+1M‖y‖}. �

When is the kernel of a functional located? The answer to this questionis provided by the next theorem. Its proof has been adapted from [8].

Page 41: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

2.3. BOUNDED LINEAR FUNCTIONALS 29

Theorem 2.3.2 (RCA0) Let f be a bounded linear functional. The follow-ing are equivalent:

1. f has a norm.

2. kerf is located.

Proof. (1) → (2): Let f be a bounded linear functional whose norm is known.If f ≡ 0 then d(x, kerf) = 0 for all x, so suppose f is not identically 0, inwhich case ‖f‖ > 0. We will show that for every x ∈ B,

d(x, kerf) =|f(x)|‖f‖ .

On the one hand, if f(z) = 0,

|f(x)| = |f(x− z)| ≤ ‖f‖ ‖x− z‖,

and ‖x− z‖ ≥ |f(x)|‖f‖ so d(x, kerf) ≥ |f(x)|

‖f‖ .On the other hand, since ‖f‖ > 0, there is ε′ such that ‖f‖ > ε′. Because

f is normable, the quantity

‖f‖ = sup{|f(x)|‖x‖ | x ∈ H} = sup{|f(x)|

‖x‖ | x ∈ A}

exists. Fix ε < ε′. By definition of supremum, there exists y ∈ A such that

|f(y)| > (‖f‖ − ε)‖y‖.

Let z = x− f(x)yf(y) . Then, f(z) = 0 and

‖x− z‖ =|f(x)|‖y‖|f(y)| <

|f(x)|‖y‖(‖f‖ − ε)‖y‖ =

|f(x)|‖f‖ − ǫ

.

Since ε can be made arbitrarily small, |f(x)|‖f‖ is an upper bound for the

distance: d(x, kerf) = inf{d(x, z) | f(z) = 0} = |f(x)|‖f‖ and it is a continuous

function, so the kernel is located.As for (2) → (1), because f is nonzero, there exists x0 such that f(x0) =

1, and, as f is continuous, d(x0, kerf) > 0.

d(x0, kerf) = inf{‖x0 − z‖ | z ∈ ker f} = inf{‖y‖ | f(y) = 1}.

(If y = x0 − z, then f(y) = f(x0)− f(z) = 1; on the other hand, if f(y) = 1,it can be written as y = x0 − z, as in the proof of (1) → (2) above.)

Page 42: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

30 CHAPTER 2. BANACH AND HILBERT SPACES

However,

‖f‖ = sup{|f(x)| | ‖x‖ = 1}= sup{‖y‖−1 | y =

x

f(x), ‖x‖ = 1, f(x) 6= 0} =

= sup{‖y‖−1 | f(y) = 1} = d(x0, kerf)−1,

and, as the kernel is located, the norm of f exists. �

For Hilbert spaces the following theorem holds. Note that clause 3 isone form of the Riesz Representation Theorem.

Theorem 2.3.3 (RCA0) Let f be a bounded linear functional on a Hilbertspace H. The following are equivalent:

1. f has a norm.

2. kerf is located.

3. There is a y in H such that for every x, f(x) = 〈x, y〉.

Proof. The equivalence of statements 1 and 2 has just been proved. Itremains to show (2) → (3) and (3) → (1).

(2) → (3): If f ≡ 0, simply take y = 0. Otherwise, there is an x0 suchthat f(x0) 6= 0. By Proposition 2.2.9 kerf is a closed subspace of H, and byTheorem 2.2.7 the projection Px0 of x0 on kerf exists. Let z = x0 − Px0,and recall from Lemma 2.2.6 that 〈x, z〉 = 0 whenever f(z) = 0. Finally, let

y = f(z)z‖z‖2 . We will show that this y has the required property.

Note that every x ∈ H can be written as x = u + αy, where f(u) = 0and α ∈ R, since

x = (x− f(x)

f(y)y) +

f(x)

f(y)y

and

f(x− f(x)

f(y)y) = f(x) − f(x) = 0.

Then,

〈x, y〉 = 〈u+ αy, y〉 = 〈u, y〉 + α‖y‖2

= α‖y‖2 = α|f(z)|2‖z‖2

‖z‖4

= α|f(z)|2‖z‖2

Page 43: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

2.3. BOUNDED LINEAR FUNCTIONALS 31

and

f(x) = x(u+ αy) = f(u) + αf(y) = αf(y) = α(f(z))2

‖z‖2,

so for all x ∈ H, f(x) = 〈x, y〉. It is not hard to show that this y is unique.(3) → (1). If f(x) = 〈x, y〉 for every x, it is straightforward to show that

‖f‖ = ‖y‖. �

The following theorem is the uniform version of the previous one andalso holds particularly for Hilbert spaces.

Theorem 2.3.4 (RCA0) The following are equivalent:

1. Every bounded linear functional has a norm.

2. For every bounded linear functional f , kerf is located.

3. Every bounded linear functional is representable (∃y ∀x f(x) = 〈x, y〉).

4. (ACA)

Proof. By the preceding theorem, 1–3 are equivalent. Thus it suffices toshow that 4 implies 1, and that 3 implies 4.

(4) → (1): ‖f‖ = sup{ |f(x)|‖x‖ | x ∈ A}. As 〈 |f(x)|

‖x‖ | x ∈ A〉 is a boundedsequence of real numbers it has a least upper bound in ACA0.

(3) → (4). Let 〈an〉 be a sequence of real numbers; we may assume thatfor all n, 0 ≤ an ≤ 1. Next, let H be l2 (as defined in [26]). H has anorthonormal basis: en = 〈δn,m〉.

Let bi =√ai+1 − ai, and define a functional f on the orthonormal basis

by f(ei) = bi and extend linearly to all of H. Show that it is bounded.For each fixed n,

|f(n∑

i=1

xiei)| = |n∑

i=1

xibi| ≤ (n∑

i=1

x2i )

1

2 (n∑

i=1

b2i )1

2 .

The last inequality uses Holder’s inequality for sums in Rn, which is easy toformalize in RCA0. Since

n∑

i=1

b2i = an ≤ 1

for all n,

|f(n∑

i=1

xiei)| ≤ (n∑

i=1

x2i )

1

2 ≤ ‖x‖ <∞.

Page 44: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

32 CHAPTER 2. BANACH AND HILBERT SPACES

In particular, for every x ∈ A (the space of all finite sequences of rationalnumbers; l2 is the completion of A under the l2 norm), f(x) is well-defined,f is linear, and |f(x)| ≤ ‖x‖ so f is a bounded linear functional.

By assumption, the Riesz representation theorem holds. Let y be thevector guaranteed to exist by the theorem. Then bi = f(ei) = 〈y, ei〉 = yi.Because the theorem also tells us that ‖y‖ exists, since ‖y‖2 = limn an, thetheorem is proved. �

In case of a Banach space the following holds.

Corollary 2.3.5 (RCA0) The following are equivalent:

1. Every bounded linear functional on a Banach space is normable.

2. For every bounded linear functional f , kerf is located.

3. (ACA).

Page 45: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Chapter 3

The Mean Ergodic Theorem

3.1 Proof of the Main Theorem

The mean ergodic theorem was initially stated and proved (by Von Neu-mann) for the space L2(X) of square integrable functions. In this form, thetheorem was closely related to Birkhoff’s pointwise ergodic theorem.

In the original statement of the mean ergodic theorem, the transfor-mation whose average effect on the system is examined acts on the space,i.e. the measure preserving transformation U is defined on the measure spaceX. This transformation induces a transformation T on L2(X), defined as(Tf)(x) = f(U(x)) (see [16]). The induced transformation is an isometry onL2(X). This motivated Riesz, who was the first to consider a more generalcase, to replace L2(X) with an arbitrary Hilbert space and to consider anisometry on that space. It is in this form that the mean ergodic theorem isstated and proved in standard sources today.

The statement of the theorem is:

Let T be an isometry of a Hilbert space, let x be any point, andconsider the sequence 〈Snx〉 given by

Snx =1

n(x+ Tx+ . . .+ Tn−1x).

Then the sequence converges.

The classical proof of the mean ergodic theorem hinges on the fact that theHilbert space can be represented as a direct sum of two subspaces. GivenT as above, let M = {x | Tx = x} be the set of fixed points, and let N bethe closure of the set {Tx − x | x ∈ H}. Classically, M and N are closed

33

Page 46: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

34 CHAPTER 3. THE MEAN ERGODIC THEOREM

subspaces and each other’s orthogonal complements, so H = M ⊕N . Theproof of the mean ergodic theorem examines the behavior of elements inboth subspaces with respect to T and shows that Snx, as described above,converges to the projection of x onto M .

Some of the difficulties connected to proving the mean ergodic theorem inweak subsystems of second-order arithmetic were mentioned in Section 1.3.We will now continue that discussion. A number of interesting problemsarise when one tries to translate all these facts into our framework. There isno suitable definition for closure of a set so N is represented with the code〈x − Tx | x ∈ A〉, with the understanding that this is a code for a closedsubspace of H and every point of N is represented with a Cauchy sequence〈xn−Txn〉. Since A is dense in H, the set thus defined is precisely N . Next,as was mentioned before, one can only conclude that M is a closed linearset (it is the kernel of the continuous function f(x) = ‖Tx−x‖), not that itis a closed subspace. Similarly, it cannot be deduced that N is a closed set.Finally, though it is not difficult to show that M = N⊥ (this will be donebelow), it is not possible in RCA0 to show the converse to this statement. Inorder to show x⊥M → x ∈ N , one needs to exhibit an element of N that xis equal to, which may not be possible.

The main result is that the mean ergodic theorem is equivalent to (ACA):

Theorem 3.1.1 (RCA0) The following statements are equivalent:

1. For every Hilbert space H, nonexpansive linear operator T , and pointx, the sequence of partial averages 〈Snx〉 converges.

2. For every Hilbert space H, isometry T , and point x, the sequence ofpartial averages 〈Snx〉 converges.

3. (ACA).

Clearly 1 implies 2. That 2 implies 3 is proved in the next section, and3 implies 1 is Corollary 3.1.3.

The proof of the statement presented below is constructive. Significantparts of it were adapted from [27]. It establishes the mean ergodic theoremand gives a finer analysis of conditions needed for it to hold.

Theorem 3.1.2 (RCA0) Let T be any nonexpansive linear operator on aHilbert space and let x be any point. With the notation above, the followingare provably equivalent in RCA0:

1. PNx exists.

Page 47: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

3.1. PROOF OF THE MAIN THEOREM 35

2. x can be written as x = xM + xN , where xM ∈M and xN ∈ N .

3. limn Sn(x) exists.

Furthermore, if these statements hold, then the decomposition in 2 is uniqueand PMx also exists. In fact, the following equalities hold:

limnSn(x) = PMx = xM = x− PNx.

Proof. First, let us show that M ∩ N = {0}. If x ∈ M ∩ N , then Tx = x(and so Snx = x for all n), and for every ε there is some y such that‖x− (y − Ty)‖ < ε, which implies that ‖Snx− Sn(y − Ty)‖ = ‖x− Sn(y −Ty)‖ < ε. But ‖Sn(y − Ty)‖ → 0 (proved in more detail below) so ‖x‖ < εand consequently, ‖x‖ = 0.

Next, show that if an element y ∈ H is orthogonal to N , it is in M . Lety⊥N . Then 〈y, x− Tx〉 = 0 for all x ∈ H. In particular, 〈y, y− Ty〉 = 0, or

〈y, Ty〉 = 〈y, y〉 = ‖y‖2,

so

‖Ty − y‖2 = 〈Ty − y, Ty − y〉 = ‖Ty‖2 − 2〈Ty, y〉 + ‖y‖2

≤ ‖y‖2 − 2‖y‖2 + ‖y‖2 = 0.

Therefore, y = Ty, and y ∈M , as required.Let us now prove the three statements above equivalent.(1) → (2): Write x = (x − PNx) + PNx. Clearly, PNx is in N. It

remains to show that x−PNx ∈M . Since x−PNx is orthogonal to N , theargument above shows that x − PNx ∈ M , as required. Note that the factthat M ∩N = {0} implies moreover that this decomposition is unique.

(2) → (3): If x ∈ H and x = xM + xN then Sn(xM ) = xM for all n, solimn Sn(xM ) = xM . Since xN ∈ N , for every m exists um ∈ A such that

‖xN − (um − Tum)‖ ≤ 2−m.

Then

Sn(um − Tum) =1

n

n−1∑

k=1

(T k−1um − T kum) =1

n(um − Tnum)

and

‖Sn(um − Tum)‖ ≤ 1

n(‖um‖ + ‖Tnum‖) ≤ 2‖um‖

n→ 0.

Page 48: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

36 CHAPTER 3. THE MEAN ERGODIC THEOREM

Also,

‖SnxN − Sn(um − Tum)‖ = ‖Sn(xN − (um − Tum))‖ ≤≤ ‖xN − (um − Tum)‖≤ 2−m,

and therefore Sn(xN ) → 0 as well. Finally, Sn(x) = Sn(xM ) + Sn(xN ) →xM + 0 = xM .

(3) → (1): If limn Sn(x) exists, then we expect it to be x − PN (x), sodefine PNx : A→ H by

PN (x) = x− limnSn(x)

and show that this works.

Use Lemma 2.2.6. Let limn Sn(x) = y. We need to show that x− y ∈ Nand that x− (x− y) = y is orthogonal to N.

Let yn = (n−1n I + n−2

n T + · · · + 1n T

n−1)x. Note that (I − T ) yn =(I −Sn)x→ x− y which shows that x− y is in N. For the second part, firstshow that y ∈M , i.e. Ty = y, which is true by the following argument:

TSnx =1

n(Tx+ . . . Tnx) = Snx+

1

n(Tnx− x),

and limn TSnx = limn Snx or Ty = y. Since y ∈M if z ∈ N , 〈y, z〉 = 0. �

Corollary 3.1.3 (ACA0) The mean ergodic theorem holds, i.e. for everyHilbert space H and nonexpansive mapping T , for every x ∈ H, 〈Snx〉converges.

Proof. Since N is a closed subspace, by Theorem 2.2.7 ACA0 proves that PN

exists. It was shown above that for all x, Snx converges to (I − PN )x. �

Just as knowing that N = M⊥ does not help us conclude that M = N⊥,the hypothesis that PM exists provides no help at all in proving the meanergodic theorem. These two facts are closely related. If PN exists, it isprovable in RCA0 that PM = I − PN . The other direction, however, is nottrue. Therefore, knowledge of PM does not provide us with PN , or withthe decomposition in the second item of the previous theorem, or for thatmatter, with the proof of convergence of 〈Snx〉, even though the limit, if itexists, is equal to PMx.

Page 49: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

3.2. REVERSAL 37

3.2 Reversal

This section will provide the conclusion to the proof of Theorem 3.1.1 andtie some other loose ends.

We will begin by listing two direct and simple reversals: one for the caseof nonexpansive transformation T and the other for an isometry. Naturally,the second reversal would suffice, but both are given for illustrative purposes.

First suppose 〈ai〉 is a sequence of reals in [0, 1], and letH be the space l2.Define an operator T on l2 by T (ei) = (1− ai)ei. Then T is a nonexpansivemapping because 0 ≤ an ≤ 1 for every n, and Snei = ei if ai = 0, while

Snei =1

n

n∑

m=0

(1 − ai)mei =

1

n· 1 − (1 − ai)

n+1

aiei

otherwise, which clearly converges to 0. Let x =∑

i(1/2i)ei, and let y =

limn Snx. Then for each i, 〈y, ei〉 6= 0 if and only if ai = 0, providing a Σ1

equivalent to the Π1 assertion ai = 0. By Lemma 1.2.3, this shows thatthe mean ergodic theorem for nonexpansive mappings implies arithmeticcomprehension.

The reversal when T is an isometry is somewhat more involved. If we stillwant to consider square-summable sequences and use a similar argument,the question arises of how to define T so that it preserves norm. If l2 istreated as a real Hilbert space, this cannot be done. Instead, consider l2(C).

Given any sequence of real numbers an in [0, 1], define the sequence ofcomplex numbers

zn =1 + ani

|1 + ani|,

so |zn| = 1 for each n and zn = 1 if and only if an = 0. Define a linearoperator T on l2(C) by T (ek) = zkek. The fact that |zk| = 1 for each kimplies that T is an isometry. If zk = 1, then Snek = ek; otherwise,

Snek =1

n(1 + zk + z2

k + . . .+ zn−1k )ek

=1 − zn

k

n(1 − zk)ek,

which converges to 0 as n increases, since

∣∣∣∣

1 − znk

n(1 − zk)

∣∣∣∣

≤ |1| + |znk |

n|1 − zk|= 2/(n|1 − zk|).

Page 50: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

38 CHAPTER 3. THE MEAN ERGODIC THEOREM

As above, if x =∑

k(1/2k)ek and y = limn Snx, then 〈y, ek〉 6= 0 if and only

if zk = 0, i.e. if and only if ak = 0. Once again, by Lemma 1.2.3, this impliesarithmetic comprehension.

The results stated below provide strengthenings for the reversal. Theremainder of the section contains results proved for the most part by JeremyAvigad, presented here for completeness. The proof of the next theorem andthe corollary that follows it has been taken from [2], while 3.2.3 is statedwithout proof.

Theorem 3.2.1 (RCA0) Each of the following statements is equivalent to(ACA):

1. For any nonexpansive mapping T on a Hilbert space, M is a closedsubspace.

2. For any isometry T on a Hilbert space, M is a closed subspace.

3. For any nonexpansive mapping T on a Hilbert space, N is a closedlinear set.

4. For any isometry T on a Hilbert space, N is a closed linear set.

5. For any nonexpansive mapping T on a Hilbert space and any x, theprojection PMx exists.

6. For any isometry T on a Hilbert space and any x, the projection PMxexists.

7. For any nonexpansive mapping T on a Hilbert space and any x, thedistance from x to M exists.

8. For any isometry T on a Hilbert space and any x, the distance from xto M exists.

Proof. By Corollary 3.1.3, (ACA) implies that both PM and PN exist forany nonexpansive mapping T . This implies 5 right away. Also, by Theo-rems 2.2.8 and 2.2.7 respectively, it implies that M and N are both located;and since under the assumption of locatedness, a set is closed if and onlyif it is separably closed, this, in turn, implies that M and N are separablyclosed and closed, respectively. Hence (ACA) proves 1, 3, and 5. Clearly 1implies 2, 3 implies 4, 5 implies 6–8, 6 implies 8, and 7 implies 8. Thus weonly need to establish reversals from each of 2, 4, and 8 to (ACA).

Page 51: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

3.2. REVERSAL 39

Let us first show that 2 implies (ACA). We will use statement 3. inLemma 1.2.3. Given a sequence of real numbers 〈ak | k ∈ N〉 in [0, 1], definean isometry T as described just before the statement of the lemma. By 2,the set M = {x | Tx = x} is a closed subspace, with a countable densesequence 〈yk | k ∈ N〉. But then for every k we have

ak = 0 ↔ ek ∈M

↔ ∃j (‖ek − yj‖ < 1),

providing a Σ1 definition of {k | ak = 0}. To see that ∃j (‖ek − yj‖ < 1)implies that ek ∈ M , note that if ek 6∈ M we have 〈ek, yj〉 = 0 for every yj .But then ‖ek − yj‖2 = 〈ek − yj , ek − yj〉 = ‖ek‖2 + ‖yj‖2 = 1 + ‖yj‖2 ≥ 1.

To show 4 implies (ACA), given 〈ak〉 we use the same T . Assuming 4,N is a closed set. But then ak = 0 ↔ ek 6∈ N again provides a Σ1 definitionof {k | ak = 0}.

The reversal from 8 to (ACA) is omitted and can be found in Section 15of [2]. �

Corollary 3.2.2 (RCA0) Each of the following implies (ACA):

1. If S is any closed subspace of a Hilbert space, then S is a closed linearset.

2. If S is any closed linear set in a Hilbert space, then S is closed subspace.

3. If S is any closed linear set in a Hilbert space and x is any point, theprojection of x on S exists.

4. If S is any closed linear set in a Hilbert space and x is any point, thedistance from x to S exists.

The second implication completes the proof of Theorem 2.2.2. The thirdand fourth implications complete the proof of Theorem 2.2.11.

Theorem 3.2.3 RCA0 proves that the following are equivalent:

1. For every x in a Hilbert space and isometry T , if PMx exists, thenlimn Snx exists.

2. For every x in a Hilbert space and isometry T , if PMx = 0, thenlimn Snx exists.

Page 52: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

40 CHAPTER 3. THE MEAN ERGODIC THEOREM

3. For every x in a Hilbert space and isometry T , if PMx = 0, thenlimn Snx = 0.

4. (ACA)

Of the 4 statements listed, statement 4 implies 1 by Corollary 3.1.3,clearly 1 implies 2, and 2 implies 3 by Theorem 3.1.2. So, the only partremaining to be shown is that 3 implies 4. The construction is based on astrategy employed in a different context by Pour-El and Richards [22]. Onceagain, for details see [2].

Page 53: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Chapter 4

Aspects of Measure Theory

4.1 Preliminaries

The classical theory of measure is highly nonconstructive. Bishop andBridges state in [8]:

Any constructive approach to mathematics will find a crucial testin its ability to assimilate the intricate body of mathematicalthought called measure theory, or the theory of integration.

Simpson writes in ([26], X.1):

Historically, the subject of measure theory developed hand inhand with the nonconstructive, set-theoretic approach to math-ematics.. . . Although Reverse Mathematics is quite different fromBishop-style constructivism, we see that Bishop’s remark raisesan interesting question: Which nonconstructive set existence ax-ioms are needed for measure theory?

In their work in measure theory, Simpson and his students (principally Yu),have utilized an approach similar to Bishop’s, modifying the axioms for theDaniell integral. The basic objects are functions, not sets. Just as metric,Banach and Hilbert spaces are presented as completions of certain countablesets, the space of integrable functions is given as the completion of the spaceof simple functions under the L1 norm: it is, in fact, a Banach space. Allspaces considered are compact probability spaces. Every measurable set isidentified with its characteristic function, so a set is measurable if and onlyif the characteristic function is measurable. In order to remain consistentwith the standard practice of measure theory, I am going to diverge from

41

Page 54: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

42 CHAPTER 4. ASPECTS OF MEASURE THEORY

the terminology in [26] and call functions in L1(X) integrable, not measur-able. Similarly, a set whose characteristic function is integrable will itselfbe called an integrable set. The classical notion of a measurable functionwill not occur in what follows. Classically, a measurable function that is notintegrable has infinite measure, which will never be the case in our setup.

ACA0 is a natural system to work with in measure theory, as it providesus with least upper bounds and greatest lower bounds of sequences, assuringexistence of measure for countable unions and intersections of integrable sets.However, WWKL0 turns out to be sufficient for a large portion of discourseabout integrable functions, as it proves some of their relevant pointwiseproperties. Yu points this out in [31]:

These results are expected, if we recall that Littlewood’s threeprinciples are centered at the idea of “nearly.”. . . the weak-weakKonig’s lemma provides means of doing things “nearly” contin-uous. The weak version of Heine-Borel theorem, which is equiv-alent to weak-weak Konig’s lemma, gives finite “subcovers” withsmall errors for any open covering. Hence it is reasonable toexpect to develop the theory of integration in the subsystemWWKL0.

If one is careful with definitions, it is possible to talk about these thingseven in RCA0. It is usually possible to formalize standard definitions orfind equivalent ones in RCA0. Though it can sometimes be shown thatdesired properties of these definitions hold in RCA0, those proofs are usuallymore complicated and circuitous. Oftentimes it is necessary to reason aboutpointwise properties of functions, and in those cases stronger axioms, like(WWKL) or (ACA) are employed.

Relevant definitions and facts about measures are listed below. Moreinformation can be found in [32, 31, 26, 11]. These sources provide anin-depth treatment of measure theory in second-order arithmetic. Unlessspecified otherwise, all definitions are stated in RCA0.

The Banach space C(X), elements of which are functions with a modulusof uniform continuity, is the completion of the countable set S(X) of simplefunctions over X under the sup norm. More precisely:

Definition 4.1.1 Let X = A and d a metric on X. Let B = A×Q+ ×Q+.Each b = (a, r, s) ∈ B is intended to represent a function φb as follows:

φb(x) =

1 d(a, x) ≤ xr−d(a,x)

r−s s < d(a, x) < r

0 d(a, x) ≥ r.

Page 55: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.1. PRELIMINARIES 43

Define C = Q×B. If c = {(q1, b1), . . . (qn, bn)} is a finite subset of C, defineφc : X → R by φc(x) =

∑nk=1 qkφbk

(x).

Let S(X) = {c | c is a finite subset of C}. Finally, C(X) = S(X) underthe sup norm given by ‖c‖ = supx∈X |φc(x)|.

The elements of S(X) will be referred to as simple functions, disregardingthe distinction between functions and their codes, as is customary whenworking in second-order arithmetic. Bishop and Bridges refer to elements ofC(X) as test functions, which is terminology we will also occasionally use.Notice that test functions are bounded, i.e. if f ∈ C(X), then there is someconstant Mf ∈ R such that |f(x)| ≤Mf for all x.

Definition 4.1.2 If X is a compact metric space, a Borel measure µ is anonnegative bounded linear functional µ : C(X) → R such that µ(1) = 1.

In the caseX = [0, 1], a natural interpretation for µ is the Lebesgue measure,where for f ∈ C([0, 1]), µ(f) =

∫ 10 f(x)dx. Measure of open and closed sets

is defined via test functions.

Definition 4.1.3 If U is an open set, and µ a Borel measure on C(X),then define

µ(U) = sup{µ(g) | g ∈ C(X), 0 ≤ g ≤ 1, g = 0 on X \ U}.

The definition in the case of closed sets is dual: µ(C) = inf{µ(h) | h ∈C(X), 0 ≤ h ≤ 1, h = 1 on C}.

These suprema may not exist; in fact, it is not difficult to show that thestatement that every open set has a measure is equivalent to (ACA). Moreproperties of this measure can be found in [31]. For example, it is provablein RCA0 that it is regular on open and closed sets.

The definition of almost everywhere (a.e.) convergence, which will beused in the statement of the pointwise ergodic theorem, requires the conceptof a null Gδ set. A Gδ set is coded by a sequence of open sets (meaning thatit is the intersection of this sequence). It can be assumed that this sequenceis decreasing. Let G be a Gδ set coded with a sequence 〈Un〉. If µ(Un) ≤ 2−n

for all n, then G is a null Gδ set. The complement of a null Gδ set is a fullFσ set. Almost everywhere convergence is defined as convergence outside ofa null Gδ set (or equivalently, convergence on a full Fσ set).

Definition 4.1.4 Let p ≥ 1 be a real number and X a compact metricspace. The separable Banach space Lp(X) is the completion of S(X) (the

Page 56: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

44 CHAPTER 4. ASPECTS OF MEASURE THEORY

space of simple functions) under the Lp norm: ‖f‖p = µ(|f |p)1/p. A functionf ∈ Lp(X) is a strong Cauchy sequence with respect to the norm of functionsin S(X). In other words, f = 〈fn | n ∈ N〉 and ‖fm−fn‖p ≤ 2−m for n > m.

Notice that the definition implies that if 〈fn〉 is a representation of f ∈Lp(X), then ‖fn‖p → ‖f‖p in R.

The definition works for all real p > 1, though this fact is not obvious:it is not obvious how to define |f |p when p is not a natural number, but itcan be shown, using the definition of power functions given in Appendix B,that if f ∈ S(X), |f |p ∈ C(X) so the definition above makes sense. Thecase of special interest is when p = 1. The elements of L1(X) are calledintegrable functions. We sometimes write µ(f) =

∫fdµ and usually omit

the subscript in the norm. It is important to point out the following result,which can be found in [30] and [11].

Proposition 4.1.5 (WWKL0) If f is an element of L1(X), represented as〈fn〉, then f(x) is defined a.e. and is equal to limn fn(x).

It will later be shown that Lp(X) ⊆ L1(X) for all p > 1, so elements of allLp spaces are pointwise defined in WWKL0. It is important to keep in mindthat functions in Lp(X) are elements of a Banach space. Their pointwiseproperties are not known in RCA0 and f = g means, by definition, equalityin the normed space: f = g ↔ ‖f − g‖p = 0.

Note: It is sometimes convenient to presuppose that if f ∈ Lp(X), then〈fn〉 is a sequence of test functions instead of simple functions.

If f, g ∈ Lp(X), then it is possible to define max(f, g) and min(f, g). Theformer is represented as 〈max(fn, gn)〉 and the latter as 〈min(fn, gn)〉 and itis easily shown that they are elements of the space Lp(X). Furthermore, iff = f ′ and g = g′, then max(f, g) = max(f ′, g′) and min(f, g) = min(f ′, g′).In particular, for f ∈ Lp(X), we can define f+ = max(f, 0), represented as〈max(fn, 0)〉, and f− = max(−f, 0), with the representation 〈max(−fn, 0)〉.It immediately follows (in RCA0) that f+, f− ∈ Lp(x) and f = f+ − f−. As|f | = f+ + f−, |f | is integrable whenever f is.

The following definitions are substitutes in RCA0 for the usual, pointwisedefinitions of properties in question.

Definition 4.1.6 (RCA0) A function f ∈ Lp(X) is nonnegative if |f | = fin Lp(X).

Note: Equivalently, f is nonnegative if and only if f = f+ if and only iff− = 0.

Page 57: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.1. PRELIMINARIES 45

Fact [RCA0]: It can be assumed that if 〈fn〉 is a nonnegative function,then fn(x) ≥ 0 for all n and all x, because otherwise fn can be replaced byf+

n .Fact [WWKL0]: A function f is nonnegative in the sense defined above

if and only if f(x) ≥ 0 a.e.

Similarly, the ≤ relation between two functions in Lp(X) can be intro-duced in RCA0,

Definition 4.1.7 (RCA0) Given two functions f and g in Lp(X), f ≤ g ifand only if max(f, g) = g.

Note: Equivalently, f ≤ g if and only if min(f, g) = f .Fact [RCA0]: f ≤ g if and only if |g−f | = g−f if and only if (g−f)+ =

(g − f) if and only if (g − f)− = 0 since f ≤ g ↔ g − f ≥ 0.Fact [WWKL0]: The ordering defined above coincides with the usual,

pointwise ordering a.e.

It is also possible to define f < g as ¬(g ≤ f).To discuss products of functions it is necessary to consider a class of

functions that is closed under products and powers: these are essentiallybounded functions. They correspond to the classical space L∞ whose ele-ments are measurable functions such that for some M ∈ R, |f(x)| ≤ Ma.e. As there is no natural analogue of the concept of measurable functionin reverse mathematics, here, as in a few other places, we will find a sensi-ble substitute for the stipulation of measurability. In fact, for our purposesmeasurable functions can be avoided all together. In this particular case,instead of the entire space L∞(X), we are only interested in the functionsin Lp(X)∩L∞(X) which can be characterized with the following definition:

Definition 4.1.8 (RCA0) A function f ∈ Lp(X) is said to be essentiallybounded if |f | ≤M for some M ∈ R+.

Note: We think of M above as the constant function M , and then interpret|f | ≤M in the sense of the definition of the ≤ relation.

Another convenient characterization is this: f is essentially bounded ifand only if f = max(min(f,M),−M) for some M ∈ R+.

We are also not interested in considering Lp,∞(X) as normed spaces,since we only look at individual essentially bounded functions. It is worth-while pointing out, however, that L1,∞ corresponds to the classical L∞(X),since X is a finite measure space, and for any p, every f ∈ Lp,∞(X) is inL1(X).

Page 58: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

46 CHAPTER 4. ASPECTS OF MEASURE THEORY

Proposition 4.1.9 (RCA0) The function f is essentially bounded if it isrepresented as 〈fn〉, where each fn is a simple function and there exists aconstant M such that |fn| ≤M for all n.

Proof. Let f be represented with 〈fn〉. Let M be the bound for f . Thenf = max(min(f,M),−M), and the right-hand side of this expression isrepresented as 〈max(min(fn,M),−M)〉. This exactly means that every |fn|is bounded by M , as required. �

Fact [RCA0]: If one representation of a function is essentially bounded,all representations are, and with the same bound, since if f = g, thenmax(f,M) = max(g,M) and min(f,M) = min(g,M).

Fact [WWKL0]: The definition above implies the usual characterizationof essential boundedness, since

∀n |fn| ≤M → ∀n |fn(x)| ≤M → |f(x)| ≤M

for almost all x.

We will write “f ∈ Lp,∞(X)” to mean “f is an essentially boundedfunction in Lp(X).”

Although this is a natural place to define products and analyze theirintegrability, the discussion about these issues is complex enough to merita section of its own, and is therefore postponed until Section 4.2. The nextorder of business is instead to define characteristic functions and integrablesets, though this also needs some results from the next section. The defini-tion found in [26] is made in WWKL0 because it presupposes that functionsare defined a.e. We first give this definition, and then show how it can bemodified to make sense in RCA0.

Definition 4.1.10 (WWKL0) A function f ∈ L1(X) is a characteristicfunction if f(x) ∈ {0, 1} a.e. A code for an integrable set E with respect tothe measure µ on X is defined to be a corresponding characteristic functionand µ(E) = µ(f).

Fact [WWKL0]: A characteristic function defined in such a way is es-sentially bounded: 0 ≤ χA ≤ 1. Because of this, χ2

A ∈ L1(X) (this will beproved in Lemma 4.2.3); also, χ2

A(x) = χA(x) for every x in the domain.Moreover, if f is any function such that f2 = f , then f ≥ 0 and it followsthat (1 − f)2 = 1 − f , so 1 − f ≥ 0, or f ≤ 1.

This motivates the following definition:

Page 59: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.1. PRELIMINARIES 47

Definition 4.1.11 (RCA0) A function f ∈ L1(X) is a characteristic func-tion if f2 = f . The definition of integrable set remains the same as in 4.1.11.

Fact [WWKL0]: The two definitions coincide.Proof. One direction is the previous fact. The other follows from Proposi-tion 4.4.1: f2 = f if and only if f2(x) = f(x) a.e. The latter means thatthere is a an Fσ set F = ∪nCn on which f2(x) = f(x), which is equivalentto saying f(x) ∈ {0, 1} for all x ∈ F .

If E = (a, b) is an open interval in [0, 1], its characteristic function can becoded with the sequence 〈pn〉 of simple functions, where pn = 〈a+b

2 , a−b2 , (1−

2−n)a−b2 〉. RCA0 proves that 〈pn〉 satisfies both definitions of a characteristic

function, and µ(E) = limn µ(pn) (compare to [30]).

Note: We are not interested in general Borel sets, and for this reasonentirely skip the definition and discussion of Borel codes. All the sets appear-ing in the theorems that follow are simple enough: they can be expressedwith up to four quantifiers. Just as a Gδ set can be coded as a sequenceof open set, with the understanding that it is an intersection of these sets,so can, for example, an infinite union of Gδ sets be coded as a sequence ofsequences of open sets. A similar procedure applies for all sets we will beconsidering later.

Integrable sets are identified with their characteristic functions. Whenworking in WWKL0, where integrable functions are pointwise defined a.e.,given a characteristic function f , we can define membership in the corre-sponding set A: x ∈ A ↔ f(x) = 1. Notice that there is no discrepancybetween this definition and the previous definition of measure for open sets:in general, to define the characteristic function of an open set, arithmeticcomprehension is necessary.

The complement A of an integrable set A is integrable, since if f is acharacteristic function of A, then a characteristic function of A is 1 − f .Similarly, if A and B are integrable sets with characteristic functions f1 andf2, characteristic functions for A∪B and A∩B are respectively max(f1, f2)and min(f1, f2). It is not difficult to show that the characteristic function forA∩B can equivalently be written as f1 ·f2 and the characteristic function forA∪B as f1 + fn − f1 · f2. To show that these functions really correspond tocomplements, unions and intersections, (WWKL) is needed. Finite unionsand intersections can be shown to be integrable in RCA0, but (ACA) isneeded for the infinite case. If fn is the characteristic function of An, thenthe characteristic function of ∪nAn is supn fn and the characteristic functionof ∩nAn is infn fn. A proof that suprema and infima of integrable functions

Page 60: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

48 CHAPTER 4. ASPECTS OF MEASURE THEORY

are integrable is provided in Proposition 4.5.4, and it is not difficult to show(in WWKL0, provided they exist) that suprema and infima of characteristicfunctions are characteristic functions themselves.

In particular, in ACA0, Gδ and Fσ sets, as well as their unions andintersections, are integrable, or, for that matter any set at a very low levelin the Borel hierarchy. If M is a Borel set of finite rank, ACA0 provesthat there exists a Gδ set G such that M ⊆ G and µ(G \M) = 0. Thisimplies that every null set that is “simple enough”, that is, at a low levelin the Borel hierarchy, is contained in a null Gδ set. This will allow us toprove a.e. convergence of a sequence of functions even if the set on whichthe sequence converges is not Fσ. The fact follows immediately from Yu’sobservation in [30] that ACA0 proves measure for Borel sets of finite rankto be well-defined and regular (this fact requires ATR0 in the general case).For, if µ(M) = infn{µ(U) | M ⊆ U and U is open}, then there exists asequence Un of open sets such that M ⊆ Un and µ(Un) ≤ µ(M) + 2−n forall n. The set G = ∩nUn is the required Gδ set.

In some cases it is possible to build a theory of measure and integrationfrom sets. When X is [0, 1]n or the Cantor space, there is a simple repre-sentation of measurable sets, as the completion of the Boolean algebra ofclopen sets with respect to symmetric differences (see [11]). In this case,the definitions are made in RCA0 and the equivalence between this and thefunctional approach is provable in WWKL0. If there is no suitable Booleanalgebra of sets whose completion yields measurable sets, the situation be-comes more complicated and the question of whether a similar procedureexists for arbitrary metric spaces still remains open.

4.2 Products and Powers in Lp Spaces

Giving a meaningful, viable definition of products and powers of elementsof Lp(X) may seem like an innocuous problem at first, but many difficultiessurround it, as we will show below. Even in classical measure theory, aproduct of integrable functions is not necessarily integrable. It is not atall obvious how to give a definition of this concept that will be natural insecond-order arithmetic, and at the same time capture classical propertiesof products. Such a definition is necessary, as knowing the behavior ofT (fg) for integrable functions f and g is necessary for proving the pointwiseergodic theorem. More importantly, products and powers are essential ifwe hope to emulate classical measure theory as closely as possible. Forexample, integrating a function f over a set A will require knowing that

Page 61: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.2. PRODUCTS AND POWERS IN LP SPACES 49

fχA is integrable, and to consider Lp spaces, we will need to understand thebehavior of |f |p when p ≥ 1 is a real number.

The goal is to develop as much theory as possible in RCA0, yet thissystem, for the most part, cannot reason about pointwise properties of func-tions. Worse yet, a natural definition of products can be sensitive to rep-resentations. It is possible to construct a function f represented with thesequence of simple functions 〈fn〉 such that f = 0, but 〈f2

n〉, though point-wise convergent to 0, does not converge in norm. Such a function can, forexample, be defined in the following way:

fn(x) =

2n3x 0 ≤ x ≤ 12n2

2n− 2n3x 12n2 < x ≤ 1

n2

0 otherwise(4.1)

Each fn is nonzero only on an interval of length 1n2 and peaks at x = 1

2n2 ,with y = n. The measure of each fn is 1

2n and a simple computation showsthat 〈fn〉 is a Cauchy sequence in L1([0, 1]) (converging to the zero function),while 〈f2

n〉 is not.This means that if one tries to define the product of two functions by the

products of their approximations, it can happen that f1 = f2 and g1 = g2,but f1g1 6= f2g2. The question that the function defined with (4.1) raisesis this: do we want to structure the definition of products so that f2 existsand coincides with the pointwise product (0 in this case) or do we wantit to somehow reflect differences in representations? I have opted for thefirst solution. After all, in classical measure theory products are definedpointwise.

It makes sense then to start from the pointwise product of two functions.Classically, if f , g, and h are elements of Lp(X), h is said to be the productof f and g if h(x) = f(x)g(x) a.e. This definition is meaningful in our frame-work, and it can be proved in WWKL0 that the product of two functions,if it exists, is unique and does not depend on representations. We will callthis the pointwise product of f and g. The shortcoming of this definition isthat it uses pointwise properties of functions. At the same time, it cannotbe modified to work in RCA0. The logical rewording would be this:

If f , g and h are elements of Lp(X), we say that h = fg ifthere exist some representations 〈fn〉, 〈gn〉 and 〈hn〉 of f , gand h respectively, such that h(x) = f(x)g(x) whenever f(x) =limn fn(x), g(x) = limn gn(x) and h(x) = limhn(x) all exist.

The problem with this definition is that it is too restrictive. Because it canhappen, for example, that h = h′ and both h(x) and h′(x) are defined, but

Page 62: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

50 CHAPTER 4. ASPECTS OF MEASURE THEORY

h(x) 6= h′(x), it is not possible to show that if h = fg and h′ = h thenh′ = fg.

We consider instead the following, stronger characterization.

Definition 4.2.1 (RCA0) If f, g and h are elements of Lp(X) and if thereexist representations 〈fn〉 of f and 〈gn〉 of g such that ‖h− fngn‖p → 0 witha fixed rate of convergence for some h ∈ Lp(X), then we say that h is thestrong product of f and g.

Notice that if for some representations 〈fn〉 and 〈gn〉 of f and g the sequence〈fngn〉 is Cauchy with a fixed rate of convergence, then the strong productfg exists.

Proposition 4.2.2 (WWKL0) If fg = h in the strong sense, then fg = hpointwise.

Proof. Assume 〈fngn〉 converges to h in Lp norm, and let x be such thatf(x), g(x) and h(x) are defined. Then, since ‖h − fngn‖p → 0 with a fixedrate of convergence, one can construct subsequences 〈f ′n〉 and 〈g′n〉 of 〈fn〉and 〈gn〉 such that ‖h− f ′ng

′n‖p ≤ 2−n for all n.

Let f ′ be the function given by 〈f ′n〉, let g′ be the function given by 〈g′n〉,and let h′ be the function represented as 〈f ′ng′n〉. Then f ′ = f , g′ = g,and h′ = h; furthermore, f ′(x) and g′(x) are defined because 〈f ′n〉 is asubsequence of 〈fn〉, and similarly for 〈g′n〉 and 〈gn〉. Therefore, f ′(x) = f(x),and g′(x) = g(x) implying

f(x)g(x) = f ′(x)g′(x) = limnf ′n(x) lim

ng′n(x) = lim

nf ′n(x)g′n(x) = lim

nh′n(x).

Because limn h′n(x) exists, it is equal to h(x), consequently, f(x)g(x) = h(x)

as required. �

This proof almost went through in RCA0. The only argument that re-quired weak-weak Konig’s lemma was in stating that h(x) = h′(x).

From now on, we are going to use Definition 4.2.1: the remainder of thesection will show that this is justified, at least in the case when one of thefunctions is essentially bounded. The strong product does not depend onrepresentations of f and g. Showing that the product, if it exists, is unique,requires WWKL0. In this case, the proof is easy, since if h1 = fg andh2 = fg, then h1(x) = f(x)g(x) = h2(x) a.e. which implies that h1 = h2.Notice that the definition naturally takes care of our counterexample (4.1)above, as for the representation of the zero function 〈0〉, its product withitself is also the zero function, as required.

Page 63: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.2. PRODUCTS AND POWERS IN LP SPACES 51

The main result of this section will be Proposition 4.2.5, which statesthat the strong product of two functions, one of which is essentially bounded,exists. First we show a weaker result.

Proposition 4.2.3 (RCA0) If f, g ∈ Lp,∞, then the strong product fg existsand is also in Lp,∞. In fact, for any choice of representations of f and gthe sequence 〈fngn〉 converges, and to the same function.

Proof. Let f be represented with 〈fn〉, such that for all n, |fn| ≤ Mf andg be represented with 〈gn〉, with |gn| ≤ Mg for all n. To show that thesequence 〈fngn〉 is convergent, we show it is Cauchy with a fixed rate ofconvergence. Let m < n.

‖fngn − fmgm‖p = ‖fngn − fngm + fngm − fmgm‖p

≤ ‖fn(gn − gm)‖p + ‖gm(fn − fm)‖p

≤ Mf2−m +Mg2−m.

Pulling out constants outside of the norm is justified, because all functionsin question are simple, and measure is monotonic on simple functions.

Showing that the product does not depend on the representations off and g is analogous. Let 〈fn〉 and 〈f ′n〉 represent f , and 〈gn〉 and 〈g′n〉represent g. This means that for all n, ‖fn−f ′n‖p ≤ 2−n+1 and ‖gn−g′n‖p ≤2−n+1. By assumption, ‖h − fngn‖p → 0 with a fixed rate of convergence.Recall also that all representations of an essentially bounded function havethe same bound.

‖h− f ′ng′n‖p ≤ ‖h− fngn‖p + ‖fngn − f ′ngn‖p + ‖f ′ngn − f ′ng

′n‖p

≤ ‖h− fngn‖p +Mg2−n+1 +Mf2−n+1,

which implies that ‖h− f ′ng′n‖p → 0 also (with a fixed rate of convergence).

It is immediate that |fg| ≤MfMg. �

To prove the more general claim, we will need the following (stronger)condition (⋆):

Functions f and g are elements of Lp(X), represented, respec-tively as 〈fn〉 and 〈gn〉 and there exist a sequence of functions〈hn〉 as well as a function h in Lp(X) such that fngm → hn whenm→ ∞ and hn → h when n→ ∞.

To be able to use this characterization, we need to show that it impliesDefinition 4.2.1.

Page 64: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

52 CHAPTER 4. ASPECTS OF MEASURE THEORY

Proposition 4.2.4 (RCA0) If two functions f and g in Lp(X) satisfy prop-erty (⋆), then fg exists in the sense of Definition 4.2.1.

Proof. Assume fngmm−→ hn

n−→ h. For each n let mn be such that ‖hn −fngmn‖p ≤ 2−n. Then it is clear that 〈fn〉 represents f and 〈gmn〉 represents

g and ‖h− fngmn‖pn−→ 0. �

If another assumption is added, namely that one of the functions isessentially bounded, then the other direction holds true as well. A sketch ofthe proof is as follows.

Suppose ‖h− fngn‖p → 0 for some representations of f and g, and thatf is essentially bounded. According to the first fact below, for a fixed n,there always exists hn such that ‖hn − fngm‖p

m−→ 0. Then

‖h− hn‖p ≤ ‖h− fngn‖p + ‖fngn − fngm‖p + ‖fngm − hn‖p

≤ ‖h− fngn‖p +Mf2−n + ‖fngm − hn‖p,

which, since m is arbitrary, implies that ‖h− hn‖p ≤ ‖h− fngn‖p +Mf2−n.Letting n→ ∞, it follows that hn → h in Lp(X). This still does not concludethe proof, because the conclusion should hold for any representation of fand g. Lemma 4.2.5 will show that this is indeed the case.

Observe now some facts about products implied by (⋆):

Fact [RCA0]: The sequence of functions 〈hn〉 in the definition abovealways exists, since ‖fngm − fngk‖p ≤Mfn

‖gm − gk‖p, where |fn| ≤Mfn.

Fact [RCA0]: The definition is independent of the order in which thelimits are taken: if fngm

n−→ hmm−→ h and fngm

m−→ wnn−→ w, then

h = w. Furthermore, one limit exists if and only if the other does.This is because we have ‖h−w‖p ≤ ‖h−hm‖p +‖hm−fngm‖p +‖fngm−

wn‖p + ‖wn − w‖p for all n and m. For every k it is possible to choose mand n large enough such that ‖h− w‖p ≤ 2−k.

The second part is proved by a slight modification of the argument, sincefor example ‖h−wn‖p ≤ ‖h− hm‖p + ‖hm − fngm‖p + ‖fngm −wn‖p for allm (by the previous fact the sequence 〈hn〉 always exists).

We now come to the promised main result of this section, which willenable us to integrate over arbitrary integrable sets. Notice that (⋆) isprecisely the characterization needed to prove the claim.

Proposition 4.2.5 (RCA0) Let f and g be in Lp(X) for some p. If g isessentially bounded, then fg exists as an element of Lp(X). Furthermore,all representations of f and g yield the same product.

Page 65: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.2. PRODUCTS AND POWERS IN LP SPACES 53

Proof. Let f be represented with 〈fn〉, and g with 〈gn〉. Assume that g isessentially bounded with |g| ≤Mg.

Fix n. Since fn ∈ S(X), it is bounded, and there exists Mfnsuch that

|fn(x)| ≤Mfnfor all x. Consider 〈fngk | k ∈ N〉. Let m > k.

‖fngk − fngm‖p ≤Mfn2−k.

Therefore, 〈fngk | k ∈ N〉 is a Cauchy sequence that represents fng.

Next show that the sequence 〈fng | n ∈ N〉 is strong Cauchy in Lp(X).For n > m

‖fmg − fng‖p = ‖(fm − fn) g‖p

≤ Mg‖fm − fn‖p

≤ Mg2−m.

The first inequality would be almost trivially true in WWKL0 (compareto Proposition 4.4.1). In RCA0, however, it requires some work. It followsfrom these two facts:

‖(fm − fn) gk‖p ≤Mg‖fm − fn‖p,

which is true by monotonicity of measure for simple functions, and

‖(fm − fn) gk‖p → ‖(fm − fn) g‖p. (4.2)

That this is enough follows from the more general fact (with a simple proof)that, if 〈xn〉 is a sequence of real numbers that converges to x, and if y issuch that for all n, xn ≤ y, then x ≤ y as well. It remains to show (4.2).

It is true that ‖g − gk‖p ≤ 2−k. Furthermore,

‖(fm − fn) g − (fm − fn) gk‖p = ‖(g − gk)(fn − fm)‖p

≤ (Mfn+Mfm

) 2−k,

since fn and fm are simple functions. This means that (fm − fn) gk →(fm − fn) g in Lp(X) for each fixed m and n when k → ∞, which alsoconcludes the proof of the above claim.

Consequently, there exists h ∈ Lp(X) such that ‖h − fng‖p → 0 asn→ ∞. By property (⋆), h = fg.

To prove the second part, let f and g be as above, such that fngmm−→

hnn−→ h for some representations of f and g (recall that, if one of the func-

tions is essentially bounded, then property (⋆) coincides with the definition

Page 66: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

54 CHAPTER 4. ASPECTS OF MEASURE THEORY

of products). We will show that if 〈f ′n〉 and 〈g′n〉 are different representa-tions of f and g, there exists a sequence of functions 〈h′n〉 in Lp(X) such

that f ′ng′m

m−→ h′nn−→ h.

Recall that for each n, h′n exists. Since each f ′n is a simple function, forevery n there exists a positive constant Mf ′

nsuch that |f ′n| ≤Mf ′

n. Next,

‖hn − h′n‖p ≤ ‖hn − fngm‖p + ‖fngm − f ′ngm‖p

+ ‖f ′ngm − f ′ng′m‖p + ‖f ′ng′m − h′n‖p

≤ ‖hn − fngm‖p +Mg2−n+1

+ Mfn2−m+1 + ‖f ′ng′m − h′n‖p.

Since m is arbitrary, it follows that ‖hn − h′n‖p ≤Mg2−n+1. Furthermore,

‖h− h′n‖p ≤ ‖h− hn‖p + ‖hn − h′n‖p ≤ ‖h− hn‖p +Mg2−n+1,

which approaches 0 when n→ ∞ and with a fixed rate of convergence. �

The following corollary contributes evidence to the claim that essentiallybounded functions have particularly nice properties with respect to prod-ucts.

Corollary 4.2.6 (WWKL0) If f or g is essentially bounded, and if fg = hpointwise a.e., then fg = h in the strong sense.

Proof. Since the strong product in this case exists, and is unique, the twohave to coincide. �

To summarize:In RCA0 we have the following:

• The strong product, if it exists, does not depend on representations.

• If one of the functions is essentially bounded, the strong product existsand is unique.

In WWKL0 the following hold:

• The strong product, if it exists, is unique.

• The pointwise product, if it exists, is unique and does not depend onrepresentations.

• If h is the strong product of f and g, then h = fg a.e.

Page 67: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.2. PRODUCTS AND POWERS IN LP SPACES 55

• If one of the functions is essentially bounded, then their pointwiseproduct exists.

• If one of the functions is essentially bounded, and h = fg a.e, thenh = fg in the strong sense.

Thus, when f or g is essentially bounded, Definition 4.2.1 provides acharacterization of the pointwise product that is useful in the absence ofWWKL0, but provably equivalent to the pointwise definition in the presenceof WWKL0. This situation is satisfactory. In general, however, the two maynot be equivalent for more general functions; in other words, when neitherf nor g is essentially bounded, they may have a pointwise product that isnot a product in the strong sense. This question is still open.

In light of these difficulties, we cannot safely say that the suggesteddefinition is necessarily the best one. The most important thing is thatit works, which it does, but it is not difficult to imagine that there is amore natural, more elegant, more accessible (with respect to RCA0) way todeal with products of integrable functions. For now, it may be the bestsolution to restrict the definition of products only to the case when one ofthe functions is essentially bounded.

Defining powers of functions is also tricky. (Before continuing, note thatthe discussion below makes sense only for nonnegative functions.) Evendefining powers of numbers is not trivial. We tend to forget that, whiledefining x2 is easy, x

√3, or even x1/2, requires some thought. All the defini-

tions and necessary properties of power functions are given in Appendix B.As with products, we can give two definitions. The first is pointwise.

Definition 4.2.7 (WWKL0) Let f ∈ Lp(x) and let k ∈ R. If there exists arepresentation 〈fn〉 of f and h ∈ Lp(X) such that fk(x) = h(x) a.e., we saythat h is the pointwise kth power of f .

The other alternative is to consider convergence in norm.

Definition 4.2.8 Let f ∈ Lp(X) and let k ∈ R. If there exists a represen-tation 〈fn〉 of f such that ‖h− fk

n‖p → 0 for some h ∈ Lp(X), we say thath is the strong kth power of f .

The relationships between the two definitions in RCA0 and WWKL0 are thesame as the ones for products.

Definition 4.2.8 is closely related to Definition 4.2.1, but is weaker. Al-though we would hope that the two would coincide when considering f2 forsome function f , it is not necessarily the case: if f2 exists in the sense just

Page 68: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

56 CHAPTER 4. ASPECTS OF MEASURE THEORY

defined, then f · f exists as well. The reverse is still an open question: itis conceivable that two different representations of f are taken to form theproduct, whereas for no representation is 〈f2

n〉 convergent. As before, if fis an element of Lp,∞, the definitions do coincide. Assuming 〈fn〉, 〈f ′n〉 and〈f ′′n〉 are all representations of an essentially bounded function with boundMf , and there is some function h such that ‖h−f ′nf ′′n‖p → 0, the conclusionfollows from the fact that

‖h− f2n‖p ≤ ‖h− f ′nf

′′n‖p + ‖f ′nf ′′n − f ′nfn‖p + ‖f ′nfn − f2

n‖p

≤ ‖h− f ′nf′′n‖p +Mf2−n+1 +Mf2−n+1.

For this reason, as in the case of products, it would be reasonable to restrictthe definition to essentially bounded functions only.

4.3 Further Properties of Lp Spaces

The next task is to determine which of the standard properties of Lp spacescarry over to our framework, and for those that do, whether the same proofscan be used and in which subsystem.

First we prove Holder’s inequality.

Lemma 4.3.1 (RCA0) If f ∈ Lp(X) and g ∈ Lq(X) (where p > 1, q >1 and 1

p + 1q = 1) are given with representations 〈fn〉 and 〈gn〉, then the

sequence 〈fngn〉 is strong Cauchy in L1(X), therefore fg is integrable and‖fg‖1 ≤ ‖f‖p‖g‖q.

Proof. In the standard proof, integrability of fg is not established separately,as it is a consequence of Holder’s inequality:

∫fg < ∞, which is enough

in the classical setup. We cannot follow this reasoning, since fg cannot beintegrated without the knowledge that it is integrable in the sense of one ofthe definitions of products.

We are able to show, however, that for every n

‖fngn‖1 ≤ ‖fn‖p ‖gn‖q, (4.3)

using the standard argument. The inequality ab ≤ ap

p + bq

q holds for allnonnegative real numbers a and b is provable in RCA0, since a basic theoryof differentiation is available in this system, and the proof consists of findingthe minimum of the function b 7→ ab− ap

p + bq

q (see Appendix B for a proof

Page 69: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.3. FURTHER PROPERTIES OF LP SPACES 57

of this inequality). Let u and v be any two simple functions. Let a = |u(x)|‖u‖p

and b = |v(x)|‖v‖q

in the above inequality and obtain

|u(x)|‖u‖p

· |v(x)|‖v‖q≤ 1

p

|u(x)|p‖u‖p

p+

1

q

|v(x)|q‖v‖q

q.

Integrating yields the desired inequality. In particular, when u = fn andv = gn, (4.3) follows.

Next, show that fg is integrable. Let m < n. Since fn − fm and gn − gm

are simple, Holder’s inequality for simple functions applies.

‖fngn − fmgm‖1 = ‖fngn − fngm + fngm − fmgm‖1

≤ ‖fn(gn − gm)‖1 + ‖gm(fn − fm)‖1

≤ ‖fn‖p ‖gn − gm‖q + ‖gm‖q ‖fn − fm‖p.

Every Cauchy sequence of real numbers is bounded: ‖fn‖p ≤ ‖f1‖p + 2−1 =M1 and ‖gn‖1 ≤ ‖g1‖ + 2−1 = M2. Then

‖fngn − fmgm‖1 ≤M12−m +M22

−m,

which is a Cauchy sequence with a fixed rate of convergence. The functionfg is integrable, and taking the limit in (4.3) is justified. The Holder’sinequality follows. �

It is worth pointing out that our definition of Lp spaces is different fromthat of most authors. Even in Bishop’s work, a function f is an element ofLp(X) if and only if f is measurable and |f |p ∈ L1(X). How are the twodefinitions related? An exact correspondence cannot be established, since wedo not have the notion of measurable function. The measurability conditionhas to be replaced with another, though, as it is easy to construct a sequenceof simple functions 〈fn〉 which is not Cauchy, whereas for some p, 〈fp

n〉 is (forexample, p = 2, f2n(x) = 1 and f2n+1(x) = −1). No such difficulty ariseswhen functions considered are nonnegative, so it is reasonable to give thefollowing characterization. Interestingly enough, its proof is not as obviousas one might imagine. Since we cannot discuss arbitrary functions, thefollowing lemma has to be stated for sequences instead.

Lemma 4.3.2 (RCA0) The sequence 〈fn〉 of simple functions is an elementof Lp(X) if and only if 〈(f+

n )p〉 and 〈(f−n )p〉 are elements of L1(X).

Page 70: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

58 CHAPTER 4. ASPECTS OF MEASURE THEORY

Proof. Let 〈fn〉 be given and assume 〈(f±)p〉 ∈ L1(X). The inequality|x− y|p ≤ |xp − yp| holds for all x, y ∈ R (see Section B) so

|f±n − f±m|p ≤ |(f±n )p − (f±m)p|

for all n,m. Applying µ to the inequality yields

‖f±n − f±m‖pp ≤ ‖(f±n )p − (f±m)p‖1 ≤ 2−m

when m < n. Therefore, 〈f+n 〉 and 〈f−n 〉 are Cauchy in Lp(X), meaning

that f+ and f− are elements of Lp(X); consequently 〈fn〉 defines a functionf ∈ Lp(X), since f = f+ − f− ∈ Lp(X).

For the other direction, assume 〈fn〉 represents a function f ∈ Lp(X).It suffices to show that if g is nonnegative and g ∈ Lp(X), then gp ∈ L1(X)(since f+ and f− are nonnegative and both are elements of Lp(X)).

Based on the inequality |xp−yp| ≤ p |(x−y)(xp−1 +yp−1)|, which is truefor all real x and y (see Section B), it follows that

|gpn − gp

m| ≤ p |(gn − gm)(gp−1n + gp−1

m )|,

for all n and m. Assume m < n. After integrating, and with the aid ofHolder’s inequality,

‖gpn − gp

m‖1 ≤ p ‖gn − gm‖p ‖gp−1n + gp−1

m ‖q

≤ p ‖gn − gm‖p (‖gp−1n ‖q + ‖gp−1

m ‖q),

where q is such that 1p + 1

q = 1.

Because g ∈ Lp(X), ‖gn − gm‖p ≤ 2−m. It remains to estimate ‖gp−1n ‖q

and ‖gp−1m ‖q. Notice that p = (p− 1)q.

‖gp−1n ‖q = [µ(g(p−1)q

n )]1/q = [µ(gpn)]1/q = ‖gn‖p/q

p .

The sequence 〈‖gn‖p〉 is Cauchy in R and thus bounded; let M be an upperbound. Then

‖gpn − gp

m‖1 ≤ pMp/q 2−n+1,

and 〈gpn〉 defines an integrable function, gp, therefore, both (f+)p and (f−)p

are integrable. �

Next we can determine the relationship among the Lp(X) spaces. Theproof is adapted from [27].

Lemma 4.3.3 (RCA0) For p < q, Lq(X) ⊆ Lp(X).

Page 71: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.4. INTEGRABLE FUNCTIONS AND SETS 59

Note: The spaces Lp(X) and Lq(X) don’t really exist as objects. Thestatement Lq(X) ⊆ Lp(X) is just an abbreviation for “every function f thatcan be represented as a Cauchy sequence in the Lq norm has an equivalentrepresentation (in the Lq sense) which is Cauchy in Lp(X).”

Proof. It is sufficient to show that if 〈fn〉 ∈ Lq(X), then 〈fn〉 ∈ Lp(X),that is, if ‖fm − fn‖q ≤ 2−m for all m and n such that m < n, then‖fm − fn‖p ≤ 2−m as well. It is clearly enough to show that ‖g‖p ≤ ‖g‖q

for g ∈ S(X). The claim is derived from the following: for x, y ∈ R+,

yq/p ≥ qp x

q−p

p (y− x) + xq/p. This inequality is provable in RCA0 (proved inAppendix B). In particular, let f = |g|p (f is in C(X)) and let y = f(t):

[f(t)]q/p ≥ q

px

q−p

p (f(t) − x) + xq/p.

Recall that µ(1) = 1. Since measure is monotonic on C(X), it follows that

µ(f q/p) ≥ q

px

q−p

p (µ(f) − x) + xq/p,

and since the above inequality is true for all x ∈ R, let x = µ(f), obtaining

µ(f q/p) ≥ (µ(f))q/p,

which after some algebraic manipulation turns into ‖g‖q ≥ ‖g‖p. �

The immediate consequence of the above proof is that L2(X) ⊆ L1(X),a fact needed in the passage from L2 convergence to L1 convergence in theergodic theorem. Furthermore, if p < q and f ∈ Lq(X), then ‖f‖p ≤ ‖f‖q

(proof: let f = 〈fn〉 and, as in the proof of 4.3.3, ‖fn‖q ≤ ‖fn‖p for each fn;to obtain the result, take the limit).

Lemma 4.3.3 also implies that every function in Lp(X), for any p > 1,is pointwise defined a.e. (as it is also an element of L1(X)).

4.4 Integrable Functions and Sets

In the remainder of the chapter we restrict our attention to the space L1(X).The proof of the following statements can be found in [31]:

Proposition 4.4.1 (WWKL0) For f, f ′ ∈ L1(X) the following propertieshold:

1. ‖f − f ′‖ = 0 if and only if f = f ′ a.e.

Page 72: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

60 CHAPTER 4. ASPECTS OF MEASURE THEORY

2. If f ≤ f ′ a.e., then µ(f) ≤ µ(f ′).

From now on, “f = g in L1(X)” and “f = g a.e. ” will be used interchange-ably when the underlying system is WWKL0 or ACA0.

Lemma 4.4.2 (ACA0) For f ∈ L1(X) and any c ∈ R the sets {x | f(x) >c} and {x | f(x) ≥ c} are integrable.

Proof. Set f = 〈fn〉, with fn ∈ S(X). For each n, {x | fn(x) > c} is open,and f(x) = limn fn(x) for all x in the domain of f , a full Fσ set F . Sincecharacteristic functions of sets that differ by a null set are equal in L1(X),there is no loss of generality in assuming that x ∈ X instead of x ∈ F .

{x | f(x) > c} = {x | ∃m ∃k ∀n ≥ k fn(x) > c+ 2−m}

=

∞⋃

m=1

∞⋃

k=1

n≥k

{x | fn(x) > c+ 2−m},

which is integrable (in ACA0). The second claim follows from the first, usingcomplements. �

It is not difficult to show that the previous claim reverses to ACA0.

Theorem 4.4.3 (RCA0) The following are equivalent:

1. For f ∈ L1(X) and any c ∈ R the set {x | f(x) > c} is integrable.

2. (ACA).

Proof. That (2) → (1) was proved in Lemma 4.4.2. To prove the otherdirection, we will use item 2. from Lemma 1.2.3. For that purpose, let asequence 〈an〉 of nonnegative numbers is given, such that for all n,

i<n ai ≤1. We are going to construct a function f ∈ L1([0, 1]) such that

µ({x | f(x) > 0}) =∑

n

an. (4.4)

For this purpose, define a Cauchy sequence of simple functions 〈fn〉 suchthat µ(fn) =

i≤nai

2i . This can be accomplished as follows. First define f1:

f1(x) =

2a1x 0 ≤ x < a1

2

1 − 2a1x a1

2 ≤ x < a1

0 a1 ≤ x.

Page 73: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.4. INTEGRABLE FUNCTIONS AND SETS 61

Clearly, µ(f1) = a1. Similarly, assuming fi is defined for i < n, assuming itsatisfies (4.4), define

fn(x) =

fi(x)∑

k≤i−1 ak ≤ x <∑

k≤i ai, i < n,2n

anx,

k≤n−1 ak ≤ x <∑

k≤n−1 ak + an

2 ,1

2n−1 − 2n

anx,

k≤n−1 ak + an

2 ≤ x <∑

k≤n ak,

0,∑

k≤n ak ≤ x.

It is probably easier to understand how fn is defined from its graph:

6

-��������A

AAAAAAA�

���A

AAA�

�AA�A

. . .

where the length of the side of the ith triangle lying on the x-axis is ai andeach triangle has area ai

2i , hence µ(fn) =∑

i≤nai

2i . The function correspond-ing to this graph is a simple function.

First let us show that 〈fn〉 is a strong Cauchy sequence. Notice that thesequence 〈fn〉 is increasing, so if m < n,

‖fn − fm‖ = µ(fn − fm) = µ(fn) − µ(fm)

=

n∑

k=m+1

ak

2k≤

n∑

k=m+1

1

2k≤ 1

2m,

as required. This means that there is a function f ∈ L1([0, 1]) representedwith this sequence. Furthermore, the set M = {x | f(x) > 0} is preciselythe domain of f with the exception of countably many points of the formx =

i≤n ai, where f(x) = 0, but countable sets don’t effect measure. SinceM is integrable, µ(M) exists, and it is not hard to show that µ(M) =

n an,which completes the proof. �

It is conceivable that, with more care, integrability of {x | f(x) > c}could be proved in a weaker system than ACA0, for “enough” values of c.Bishop developed the entire theory of profiles (see [8]) for this purpose.

Page 74: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

62 CHAPTER 4. ASPECTS OF MEASURE THEORY

He defined accessible numbers and showed that the sets of the above formare integrable if c is accessible. Without a doubt, it would be worthwhile toexamine Bishop’s approach more carefully and see if his ideas can be utilizedin our framework, but this falls outside of the scope of this work. Since theproof of the pointwise ergodic theorem requires arithmetic comprehensionin a number of places, this is not a great loss.

We also want to integrate over arbitrary integrable sets. Since g = fχG

is integrable by Proposition 4.2.5, it is valid to define∫

G f = µ(fχG).Similarly, to define a function by cases, for f to be fk on Mk, where

fk ∈ L1(X) and Mk a integrable set, define

f =n∑

k=1

fkχMk.

Linear combinations of integrable functions are integrable, and sincefkχMk

is integrable for each k, f is integrable.We now adapt two standard results from measure theory, the first of

which will be used in the first proof of the pointwise ergodic theorem, andthe second to prove one of its consequences. Both proofs are similar tostandard ones, though the second uses the monotone convergence theoremand is postponed until the next section.

Lemma 4.4.4 (ACA0) Let 〈fn〉 be a pointwise convergent sequence of func-tions in L1(X), which converges in norm to an integrable function f . Then〈fn〉 converges to f pointwise a.e.

Proof. First show that 〈fn〉 has a subsequence that converges pointwise tof . Since ‖f − fn‖ → 0, for each k let fnk

be such that ‖f − fnk‖ ≤ 2−k.

We are going to show that µ({x | limk fnk(x) 6= f(x)} = 0, which will,

by regularity of measure for Gδ sets, imply that limk fnk(x) = f(x) a.e.

Let U = {x | limk fnk(x) 6= f(x)}. Since

limkfnk

(x) 6= f(x) ↔ ∃m ∀l ∃k ≥ l (|f(x) − fnk(x)| ≥ 2−m),

the set U can be represented as

U =⋃

m

l

k≥l

Umk,

where Umk = {x | |f(x) − fnk(x)| ≥ 2−m}.

Let us evaluate the measure of Umk. Because Umk is integrable, we canintegrate over this set.

Page 75: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.5. THE MONOTONE CONVERGENCE THEOREM 63

On the one hand,∫

Umk|fnk

− f | ≤ ‖fnk− f‖ ≤ 2−k. On the other,

Umk|fnk

− f | ≥∫

Umk2−m = 2−mµ(Umk). Therefore, µ(Umk) ≤ 2m−k.

Next, µ(⋃

k≥l Umk) ≤∑

k≥l 2m−k = 2m−l+1, which implies

µ(⋂

l

k≥l

Umk) ≤ 2m−l+1

for all l, and hence this set has measure 0. The set U , consequently, is theunion of sets of measure 0, so itself has measure 0. As was mentioned earlier,since it is simple enough in structure, U is contained in a null Gδ set, so itcan be assumed that U is a Gδ set. Hence, the subsequence 〈fnk

〉 convergesa.e. to f .

This concludes the first part of the proof. It remains to show that theentire sequence converges pointwise to f . By pointwise convergence of 〈fn〉,for all x outside of a null Gδ set there exists some x such that fn(x) → x.It remains to show that x = f(x) a.e. Since

|x− f(x)| ≤ |x− fnk(x)| + |fnk

(x) − f(x)|,

it is easy to see that this is true. �

Compare this lemma to the result from Section 4.2 that if 〈fngn〉 con-verges strongly to h, then it converges to it pointwise as well. The proofof this fact only needed RCA0, whereas Lemma 4.4.4 is proved in ACA0.There is no contradiction between the two. The functions in Section 4.2were simple, and the convergence was strong: we have neither of the two inthe previous lemma.

With additional assumptions, the reverse also holds, i.e. pointwise con-vergence can imply convergence in norm.

Lemma 4.4.5 (ACA0) If 〈fn〉 is a sequence of integrable functions that con-verges pointwise to an integrable function f , and if in addition ‖fn‖ ≤ ‖f‖(or ‖fn‖ → ‖f‖) then fn → f in L1(X).

A proof is given in the next section.

4.5 The Monotone Convergence Theorem

The monotone convergence theorem will be used in a number of places inthe next chapter, directly, or through one of its consequences. It is also aconvenient tool for proving integrability of certain functions. In [31], Yu

Page 76: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

64 CHAPTER 4. ASPECTS OF MEASURE THEORY

showed that the dominated convergence theorem is equivalent to (ACA),while a weaker version of the monotone convergence theorem is equivalent to(WWKL) over RCA0. In the latter, the limit is a known integrable function.We have little use for that theorem, since in general the sequence of functionswill converge at almost every point, but the limit function will not be knownin advance. Since the proof of the main theorem is formalized in ACA0, itis not a great loss that this stronger version of the monotone convergencetheorem, presented below, is also stated and proved in ACA0.

Theorem 4.5.1 (ACA0) Assume 〈fn〉 is a monotonic sequence of functionsin L1(X) with bounded measure. Then there is f ∈ L1(X) such that ‖fn −f‖ → 0, and µ(fn) → µ(f).

Proof. Without loss of generality, let 〈fn〉 be increasing, and µ(fn) ≤M forall n. Since 〈fn〉 is increasing, the sequence 〈µ(fn)〉 is an increasing sequenceof real numbers (Proposition 4.4.1) and as it is bounded, it is convergentand therefore Cauchy so

∀ε ∃N ∀n > m > N (|µ(fn) − µ(fm)| < ε)

and

|µ(fn) − µ(fm)| = µ(fn) − µ(fm) = µ(fn − fm) = µ(|fn − fm|),

which means that

∀ε ∃N ∀n > m > N (‖fn − fm‖ ≤ ε).

The sequence 〈fn〉 is Cauchy, hence convergent in ACA0. Because L1(X)is complete, the sequence converges to an integrable function f such thatµ(fn) → µ(f). �

The proof of the theorem does not change if the sequence 〈fn〉 is assumedto be decreasing and bounded below instead. This would not be true werethe space X not of finite measure.

In both this reversal and that of the pointwise ergodic theorem, it willbe necessary to show that a function of the form

k ckχIk(where Ik are

disjoint, half-open intervals that cover [0, 1]) is integrable. In both cases,the following situation arises:

Lemma 4.5.2 (RCA0) Let 0 = a0 < a1 < . . . , define Ik = [ak, ak+1) and∪kIk = [0, 1]. Then χIk

∈ L1([0, 1]) and if 〈ck〉 is a sequence of ratio-nal numbers with |ck+1 − ck| ≤ M for all n and some constant M , then∑

n ckχIk∈ L1(x).

Page 77: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.5. THE MONOTONE CONVERGENCE THEOREM 65

Note: In examples that we will encounter, ak = 1−2−k, or the sequence〈ak〉 is finite.

Proof. It is necessary to find a Cauchy sequence (with a fixed rate of con-vergence) 〈gn〉 of functions in C(X) such that limn gn =

k ckχIk. Let

lk = ak+1 − ak. Fix n and define

gn(x) = ck, for ak + lk · 2−k−n ≤ x ≤ ak+1 − lk · 2−k−n,

(except when k = 0, in which case the condition is 0 ≤ x ≤ a1 − l0 · 2−n)and make it linear otherwise. Let m < n. Then gn and gm differ only onthe open interval

(ak+1 − lk · 2−k−m, ak+1 + lk+1 · 2−k−1−m)

for each k. Each of these intervals has length lk+1 · 2−k−1−m + lk · 2−k−m,which is less than 2−k−m−1, and |gn − gm| ≤M on each, hence

µ(|gn − gm|) ≤∑

k

M2−k−m−1 = M2−m+2.

The sequence 〈gn〉 therefore represents an integrable function: the function∑

k ckχIk. �

The reversal can now be proved.

Theorem 4.5.3 (RCA0) The following are equivalent:

1. Monotone convergence theorem for an arbitrary measure on a completeseparable metric space.

2. Monotone convergence theorem for the Lebesgue measure on [0, 1].

3. (ACA).

Proof. (3) → (1) was proved above and (1) → (2) is immediate. It remains toshow (2) → (3). Let 〈an〉 be a monotonic bounded sequence of real numbers.We may assume 0 ≤ a1 ≤ a2 ≤ · · · ≤ 1. The goal is to show that thissequence is convergent, which will in turn imply arithmetic comprehension(by Lemma 1.2.3). We will construct an increasing, pointwise convergentsequence of functions 〈fn〉 in L1[0, 1], such that µ(fn) = an. Assume themonotone convergence theorem holds. One of the conclusions of the theoremwas that µ(fn) converges. This implies that 〈an〉 is convergent as well.

It remains to construct the desired sequence of functions.

Page 78: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

66 CHAPTER 4. ASPECTS OF MEASURE THEORY

fn(x) =

a1 x ≤ 12

2a2 − a112 < x ≤ 3

4. . .

2n−1an − 2n−2an−1 − · · · − a12n−1−12n−1 < x ≤ 1.

For each fixed n, fn is a step function that jumps at every 2k−12k for 1 ≤

k ≤ n−1, with step size bounded by 2n−1. According to Lemma 4.5.2, fn ∈L1([0, 1]) for each n. Furthermore, µ(fn) = an, the sequence is increasingand converges everywhere except possibly at 1, and is therefore as required.This completes the proof. �

The monotone convergence theorem helps us establish integrability ofsuprema and infima of sequences of integrable functions in the case theirnorms are uniformly bounded (or if the functions are essentially boundedwith the same bound). This is really the only case of interest for thepointwise ergodic theorem, because the sequence under consideration willbe 〈Snf〉, where, since the transformation T preserves norm, ‖Snf‖ ≤ ‖f‖for all n.

By the supremum of a sequence of functions we mean the L1 limit ofthe sequence gn = max{f1, . . . fn} (the definitions of infn fn, lim infn fn andlim supn fn are analogous). This definition is not to be confused with thepointwise definition: in ACA0 for a.e. x, supn fn(X) exists, but it is notobvious that these pointwise values define an integrable function. Unlessotherwise specified, all limits, suprema or infima, and all equalities are meantin the L1 sense.

Proposition 4.5.4 (ACA0) If 〈fn〉 is a sequence of functions in L1 suchthat |µ(fn)| ≤ M , then supn fn, infn fn, lim supn fn and lim infn fn are allintegrable as well.

Proof. Consider only supn fn and lim supn fn since the argument for infn fn

and lim infn fn is analogous. Define gn = max{f1, . . . fn}. The sequence〈gn〉 is increasing and ‖gn‖ ≤M . Theorem 4.5.1 applies: limn gn = supn fn

is an integrable function.

The argument for lim supn fn is similar. Define hn,k = maxn≤m≤k fn.Then hn,k ∈ L1(X) and ‖hn,k‖ ≤M for all k, n and, for a fixed n, 〈hn,k | k ∈N〉 is increasing. By the monotone convergence theorem, for all n, limk hn,k

exists and is in L1(X). Set hn = limk hn,k = supk≥n fk, then hn ∈ L1(X)for all n.

Applying a similar argument to the sequence hn (decreasing sequence offunctions on L1(X), bounded in measure), it follows that limn hn exists and

Page 79: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

4.5. THE MONOTONE CONVERGENCE THEOREM 67

is in L1(X). But this limit is lim supn fn, hence lim supn fn ∈ L1(X), asrequired. �

Proposition 4.5.4 and Lemma 4.4.4, imply (in ACA0) that if a sequence〈fn〉 converges pointwise to an integrable function f , then lim infn fn = f .

Corollary 4.5.5 (ACA0) (Fatou’s Lemma) Let 〈fn〉 be a sequence of non-negative integrable functions with µ(fn) ≤M for all n. Then

µ(lim infn

fn) ≤ lim infn

µ(fn).

Proof. Let gn = infk≥n fk. Then by Proposition 4.5.4, gn ∈ L1(X) foreach n, so 〈gn〉 is an increasing sequence of integrable functions such thatµ(gn) ≤ µ(fn) ≤M for all n. Since

∀n (µ(gn) ≤ µ(fn)) → lim infn

µ(gn) ≤ lim infn

µ(fn)

and lim infn µ(gn) = limn µ(gn) = µ(lim infn fn), it follows that

µ(lim infn

fn) ≤ lim infn

µ(fn)

as required. �

Now we can complete the proof of Lemma 4.4.5, promised in the lastsection.

Proof. First observe that |fn|+ |f |− |f −fn| → 2|f | pointwise. According tothe comment after Proposition 4.5.4 on page 66, lim infn |fn|+|f |−|f−fn| =2|f | and by Fatou’s Lemma

2µ(|f |) = µ(lim infn

|fn| + |f | − |f − fn|) ≤ lim infn

µ(|fn| + |f | − |f − fn|)≤ µ(|f |) + lim

nµ(|fn|) + lim inf

n(−µ(|f − fn|))

= 2µ(|f |) − lim supn

µ(|f − fn|).

The last inequality yields 0 ≤ lim supn µ(|f−fn|) ≤ 0, and it is easy to showthat this implies limn µ(|f − fn|) = 0. �

Page 80: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

68 CHAPTER 4. ASPECTS OF MEASURE THEORY

Page 81: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Chapter 5

The Pointwise Ergodic

Theorem

The classical statement of the pointwise ergodic theorem is as follows:

If T is a measure preserving transformation on a metric spaceX and f a function in L1(X), then there exists a function f ∈L1(X) such that the average Snf(x) = 1

n

∑n−1k=0 f(T kx) converges

to f(x) ∈ L1(X) a.e. Furthermore, the limit function f is invari-ant, i.e. f ◦ T = f .

Before proving the theorem, we need to translate it into our framework.The first thing to consider is the definition of T . Because there is no suchthing as an arbitrary, pointwise defined function from the point of view ofsecond-order arithmetic, T cannot be defined on the points of the space.Instead, it will be represented as an operator on functions, following thesame reasoning as for the mean ergodic theorem. More precisely, instead ofthe operator on X, we will consider the operator induced on L1(X). Again,more information on the relationship between these representations can befound in Halmos [16]. The appropriate definition of T will be discussed fullyin Section 5.1.

Three proofs of the pointwise ergodic theorem are presented below. Thefirst is adapted partly from Spitters [27] and partly from Billingsley [3], thesecond is based on a proof by Katznelson and Weiss [19], which in turn isderived from a proof by Kamae [18] that used nonstandard analysis, whilethe third mostly follows Billingsley [3]. The first two proofs are in some sensemore natural than the third one, though that one is more commonly foundin textbooks. This is because the first two adopt a functional approach

69

Page 82: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

70 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

and avoid the use of invariant sets, whose treatment in our framework isproblematic. Nevertheless, as Section 5.6 will show, these difficulties can beovercome. Each proof is interesting in its own right and provides insightsinto the behavior of measure preserving transformations on the space ofintegrable functions over a compact metric space.

5.1 Motivation

In standard mathematical practice, if (X,F , µ) is a measure space, a measurepreserving transformation (m.p.t.) T is defined as a mapping of X to itselfsuch that:

• The inverse image of each measurable set is measurable, and

• for each A ∈ Fµ(A) = µ(T−1A),

or, equivalently,µ(f) = µ(f ◦ T )

for each f ∈ L1(X).

When defining a measure preserving transformation within weak sub-systems of second-order arithmetic, we cannot do so pointwise. Even if Tis defined on the dense subset A of X, as it is not linear, there is no wayof extending it to the entire space. Instead, there are two choices beforeus: to define T on functions, or on sets. For a number of reasons, thefunctional approach is preferable. In this case, T is first defined on simplefunctions, then extended in the usual way, and we will see later that WWKL0

(and sometimes even RCA0) will suffice to capture the properties that theclassical version of the transformation T has. On the other hand, we sawthat there is presently no satisfactory way to formalize integrable sets, otherthan via characteristic functions. Even when X is [0, 1] or [0, 1]n, where thecharacterization of integrable sets is relatively simple, there is no merit indefining T on sets. It does not make showing that a given set is invariantany easier; besides, the pointwise ergodic theorem is ultimately a statementabout points and functions and not about sets. In addition, this method isin tune with the treatment of measure theory by Bishop, and by Simpsonand his students. Finally, it matches the approach taken in the first part ofthe thesis.

To motivate the definition of T in the context of second-order arith-metic, let us employ briefly classical mathematical reasoning. Assume a

Page 83: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.1. MOTIVATION 71

transformation T is given, measure preserving in the sense specified above.Consider now the induced transformation T , as we did earlier, which is de-fined as T f(x) = f(Tx) for all x for which Tx is in the domain of f . ThenT has the following properties:

• T sends L1(X) to itself.

• T is linear.

• T is measure preserving: µ(T f) = µ(f) for all f ∈ L1(X).

• T is norm preserving (an isometry): ‖T f‖ = ‖f‖ for all f ∈ L1(X).

• T is multiplicative: T (fg) = T f · T g a.e.

• T is nonnegative: if f ≥ 0 a.e. then Tf ≥ 0 a.e.

• If |f | ≤M a.e. then |T f | ≤M a.e. (T preserves L1,∞ norm).

Note: Such a transformation will be called T and not T for the remainderof the text. It will be clear from the context that it is an operator on L1(X)and not on X.

The question presents itself of which of these properties characterize thetransformation. We will show in the next section that the properties ofbeing norm preserving and measure preserving are equivalent in WWKL0

and that nonnegativity is a consequence of linearity and multiplicativity. Inproofs of statements that require at least WWKL0, we are going to freelyswitch between the two notions, and assume T is an isometry in claims thatare proved in RCA0. The only definition given below is that of an isometry,with obvious modifications for a measure preserving transformation. In fact,in the pointwise ergodic theorem, it suffices that T be nonexpansive withrespect to norm, as this alteration does not affect any of the proofs.

Although the main theorem is concerned only with the space L1(X),other Lp spaces will also be of interest. For this reason, consider first amore general definition. It corresponds exactly to Definition 2.1.6.

Definition 5.1.1 An isometry on Lp(X) is a function T : S(X) → Lp(X)such that

1. T (q1f1 + q2f2) = q1Tf1 + q2Tf2 for q1, q2 ∈ Q and f1, f2 ∈ S(X),

2. ‖Tf‖p = ‖f‖p for every f ∈ S(X).

Page 84: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

72 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

Then, for f = 〈fn〉 ∈ Lp(X), the sequence 〈Tfn〉 is Cauchy in the spacebecause T is an isometry, and since Lp(X) is complete, Tf is well-definedand ‖Tf‖p = ‖f‖p for all f ∈ Lp(X).

(The alternative is for T to be nonexpansive, and to change clause 2. to‖Tf‖p ≤ ‖f‖p. All proofs remain the same with this modification, so assumefor simplicity from now on that T is an isometry.) A function is invariantif Tf = f .

We will mainly be concerned with the case p = 1.

How does one define a multiplicative transformation? This question isrelated to that of products of Lp functions. It seems logical to character-ize multiplicativity only on simple functions, and as long as fg ∈ Lp(X) itshould follow in RCA0 that T (fg) = Tf · Tg. This is not necessarily thecase. In fact, as with products, essential boundedness needs to be employed.To ensure that Tf · Tg is defined, we need to assume that T takes essen-tially bounded functions to essentially bounded functions (T preserves Lp,∞norm). This is a reasonable assumption, true classically of measure preserv-ing transformations. In practical terms, it guarantees that simple (and test)functions behave well with respect to the transformation.

Definition 5.1.2 (RCA0) A transformation T : Lp(X) → Lp(X) is said tobe multiplicative if the following two conditions are satisfied:

1. If f is a simple function with |f | ≤M , then |Tf | ≤M also.

2. For all f, g ∈ S(X), T (fg) = Tf · Tg in Lp(X).

There is no need to demand in 2. that Tf ·Tg be in Lp(X), as this is providedby the first condition.

In the preceding definition, product is meant in the sense of Defini-tion 4.2.1, that is, strong product.

It can easily be shown that this definition implies that T (fg) = Tf · Tgwhen both functions are in Lp,∞. In particular, T is multiplicative on C(X).

Proposition 5.1.3 (RCA0) If 〈fn〉 and 〈gn〉 are representations of f, g ∈Lp,∞(x), then T (fg) = Tf · Tg.

Proof. Assume |f | ≤Mf and |g| ≤Mg; then also |Tf | ≤Mf and |Tg| ≤Mg.Notice that Tf ·Tg exists, as do Tfn ·Tg, T (f − fn) ·Tg and Tf ·T (g− gm)(all functions are essentially bounded) and recall that T (fngm) = Tfn ·Tgm

Page 85: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.1. MOTIVATION 73

for all m and n. Using the triangle inequality,

‖T (fg) − Tf · Tg‖p ≤ ‖T (fg) − T (fng)‖p + ‖T (fng) − T (fngm)‖p

+ ‖Tfn · Tgm − Tfn · Tg‖p + ‖Tfn · Tg − Tf · Tg‖p

= ‖T ((f − fn)g)‖p + ‖T (fn(g − gm))‖p

+ ‖Tfn · T (gm − g)‖p + ‖T (fn − f) · Tg‖p

≤ Mg2−n +Mf2−m +Mf2−m +Mg2

−n

= 2Mf2−m + 2Mg2−n,

based on the (L1 and L1,∞) norm preserving property of T .

Since the inequality holds for all values of m and n, ‖T (fg)−Tf ·Tg‖p =0, as required. �

It is still unclear if this type of argument can be extended to a moregeneral class of functions.

A seemingly stronger property will be required when considering Lp(X)for p > 1, a more general multiplicativity principle:

T (fp) = (Tf)p,

whenever f ∈ S(X) and is nonnegative. This property follows from multi-plicativity when p is natural or rational. When p ∈ N, it can be proved byΣ1 induction (the statement “T is multiplicative on simple functions” is Π1)

and holds not only for simple, but also for test functions. Since f1

n ∈ C(X)

if f ∈ S(X), and Tf = T ([f1

n ]n) = [T (f1

n )]n,

(Tf)1

n = T (f1

n ),

so general multiplicativity holds for all rational numbers. Finally, if p is anarbitrary real number, fp = limn f

pn , where p = 〈pn〉 (see Appendix B fora justification of this fact) and

T (fp) = T (limnfpn) = lim

nT (fpn) = lim

n(Tf)pn = (Tf)p.

The second equality is true since T , as a continuous operator, commuteswith limits.

Definition 5.1.4 (RCA0) An operator T : Lp(X) → Lp(X) is nonnegativeif whenever f is a nonnegative simple function, Tf is nonnegative as well.

Page 86: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

74 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

We proceed to show that if T is nonnegative on simple functions, it willpreserve nonnegativity of arbitrary functions as well. This fact is almostimmediate, with our definition of nonnegativity.

To see this, let f be a nonnegative integrable function, represented as asequence of simple functions 〈fn〉. It was remarked earlier that it is safe toassume that each fn is nonnegative. By assumption, |Tfn| = Tfn for all nand

‖(|Tf | − Tf)‖p ≤ ‖(|Tf | − |Tfn|)‖p + ‖(|Tfn| − Tfn)‖p + ‖Tfn − Tf‖p

≤ 2−n + 0 + 2−n

for all n because T is norm preserving, so

‖Tfn − Tf‖p = ‖T (fn − f)‖p = ‖fn − f‖p ≤ 2−n,

and by the triangle inequality, ‖(|Tf | − |Tfn|)‖p ≤ ‖Tf − Tfn‖p. Because‖(|Tf | − Tf)‖p ≤ 2−n+1 for all n, Tf is nonnegative.

We showed earlier in the case of Hilbert spaces that for all n, Tn is con-tinuous, linear and norm preserving. The proof in the case of T : L1(X) →L1(X) is no different. Note also that the formula stating that Tn is multi-plicative is also Π1, so multiplicativity of Tn is provable using Lemma 1.2.1.

This may be a good place to examine the behavior of T , as defined inthe previous section, on sets. For example, can we prove that T takes sets tosets? Fortunately, this is the case whenever T is linear and multiplicative.(If T is an isometry, then it preserves measures of sets, too.) Let A be anintegrable set, i.e. let χA be a characteristic function. Since χA is essentiallybounded, multiplicativity applies, and since χ2

A = χA, T (χA) = T (χ2A) =

(TχA)2. In addition, TχA is essentially bounded by 1 so it is a characteristicfunction. In standard mathematical practice, with the pointwise definitionof T , if T is the transformation induced by T , then T χA = χT−1A. It may betempting to think of TχA in those terms, but we have no way of determiningthe action of T on sets, and therefore cannot make this claim.

Recall that the following identities hold in RCA0:

χA∩B = χA · χB,

χA∪B = χA + χB − χA · χB,

χAC = 1 − χA.

After applying T , they become:

T (χA∩B) = TχA · TχB,

Page 87: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.1. MOTIVATION 75

T (χA∪B) = TχA + TχB − TχA · TχB,

T (χAC ) = T (1) − TχA = 1 − TχA.

It remains to justify the last identity, i.e. to show that T (1) = 1 (the constant1 function). If one is careful, this fact can be shown in RCA0, assuming Tis measure preserving. Since 1 is a bounded function, T (1) is also bounded,and multiplicativity applies.

First, observe that T (1) = T (1 · 1) = [T (1)]2 and that consequently,

(1 − T (1))2 = 1 − 2T (1) + [T (1)]2 = 1 − T (1).

The multiplication above was legitimate, as 1−T (1) is a bounded function.We can conclude that |1 − T (1)| = 1 − T (1) and

‖1 − T (1)‖ = µ(1 − T (1)) = µ(1) − µ(T (1)) = 0,

or T (1) = 1 in L1(X).The above identities can also be interpreted in the following way. If M

and N are sets such that TχA = χM and TχB = χN , then

T (χA∩B) = χM∩N ,

T (χA∪B) = χM∪N ,

T (χAC ) = χMC .

Another useful property holds of T . It seems that pointwise propertiesof functions cannot be avoided in proving it.

Proposition 5.1.5 (WWKL0) Let T : L1(X) → L1(X) be a linear, mul-tiplicative transformation. Then T (f+) = (Tf)+ and T (f−) = (Tf)− inL1(X).

Proof. Let f be represented as 〈fn〉, where each fn is a simple function.The goal is to show that ‖T (f±) − (Tf)±‖ = 0, where ± stands for + or−. Notice that for all n, Tfn = T (f+

n ) − T (f−n ) = (Tfn)+ − (Tfn)− andall functions involved are defined a.e., are nonnegative on the domain, andat most one of the terms on each side of the identity is nonzero wheneverthey are all defined. All these facts immediately imply the claim. Forexample, if Tfn(x) ≥ 0, then T (f+

n )(x) ≥ 0, and (Tfn)+(x) ≥ 0, whileT (f−n )(x) = (Tf)−(x) = 0, therefore T (f+

n )(x) = (Tfn)+(x). The caseTfn(x) ≤ 0 is analogous. In WWKL0 this means that ‖T (f±n )−(Tfn)±‖ = 0for all n, which in turn implies the claim. �

Page 88: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

76 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

We can now (when working in WWKL0) without ambiguity write Tf+

or Tf−. This fact also implies that T |f | = T (f+ + f−) = (Tf)+ + (Tf)− =|Tf |, from which the following proposition follows:

Proposition 5.1.6 (WWKL0) Every measure preserving transformation isnonnegative.

Proof. Let |f | = f . Then Tf = T |f | = |Tf |. �

As promised in the previous section, we will show that a linear trans-formation T : L1(X) → L1(X) is measure preserving if and only if it is anisometry.

Proposition 5.1.7 (WWKL0) A linear transformation T on L1(X) is mea-sure preserving if and only if it is norm preserving.

Proof. If T is measure preserving, then µ(Tf+) = µ(f+) and µ(Tf−) =µ(f−), and

µ(|f |) = µ(f+ + f−) = µ(f+) + µ(f−)

= µ(Tf+) + µ(Tf−) = µ(|Tf |).

For the other direction, let T be an isometry. It should be clear that |f±| =f±. Similarly, because Tf± stands for (Tf)±, |Tf±| = Tf±. Therefore:

µ(f) = µ(f+) − µ(f−) = µ(|f+|) − µ(|f−|)= µ(|Tf+|) − µ(|Tf−|)= µ(Tf+) − µ(Tf−)

= µ(Tf).

Due to the two propositions just proved, from now on we can always as-sume that T is a multiplicative isometry when working in theories extendingWWKL0.

5.2 Convergence in Norm

For T defined as in Section 5.1, for n ≥ 1, let

Snf =1

n

n−1∑

k=0

T kf.

Page 89: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.2. CONVERGENCE IN NORM 77

If T is an isometry, then ‖Snf‖ ≤ ‖f‖ for all n. In fact, if T is measurepreserving (recall the two are equivalent in WWKL0), ‖Snf‖ = ‖f‖, thoughthe first property is sufficient for our purposes.

Recall that Chapter 3 established that the mean ergodic theorem isequivalent to (ACA). Since L2(X) is a real Hilbert space, arithmetic com-prehension implies convergence of Snf in L2 norm.

The next step in the proof of the pointwise ergodic theorem is to showthat L2 convergence implies L1 convergence, under some additional assump-tions. Section 5.4 provides a proof that, in turn, L1 convergence impliespointwise convergence, which will conclude the proof of the pointwise er-godic theorem (in ACA0). For the sake of generality, though, we are goingto consider general Lp spaces. Results in this section are largely adaptedfrom [27].

The transformation T is going to be an isometry throughout. In thenext claim also assume that it is multiplicative. The underlying systemis WWKL0, because the fact that T |f | = |Tf |, which is used both inLemma 5.2.1 and Lemma 5.2.2, requires (WWKL).

Lemma 5.2.1 (WWKL0) Let T : L1(X) → L1(X) be a multiplicative isom-etry. Then T , when restricted to Lp(X) for any p > 1, is an isometry onthat space. Moreover, T takes Lp(X) to Lp(X).

Proof. Let f = 〈fn〉 be in L1(X). Observe that, since fn ∈ C(X), all powersof fn are in Lp(X) for all p.

‖f‖pp = lim

n‖fn‖p

p = limnµ(|fn|p)

= limnµ(|T (|fn|p)|) = lim

nµ(|(T |fn|)p|)

= limnµ(|T (|fn|)|p) = lim

n‖Tfn‖p

p = ‖Tf‖pp.

The limit in the last step exists since T is an isometry. The step before lastis the consequence of T |f | = |Tf |.

For the second part of the claim, let f = 〈fn〉 ∈ Lp(X): ‖fm−fn‖p ≤ 2−m

for all m, n such that n > m. But since T is an isometry on Lp(X),

‖Tfm − Tfn‖p = ‖T (fm − fn)‖p = ‖fm − fn‖p ≤ 2−m.

Therefore, 〈Tfn〉 is strong Cauchy with respect to the Lp norm, so Tf ∈Lp(X). �

Lemma 5.2.2 (WWKL0) If T is an isometry on Lp(X) for some p > 1,then it is also an isometry on L1(X).

Page 90: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

78 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

Proof. It will suffice to show that, if f ∈ C(X), then ‖Tf‖ = ‖f‖. Byassumption, for every g ∈ C(X)

µ(|Tg|p) = µ(|g|p).

Let g be such that |f | = gp, i.e. g = |f |1/p. It is easily shown that g is inC(X). Therefore:

µ(|f |) = µ(|g|p) = µ(|Tg|p) = µ(|T (gp)|)= µ(|T |f ||) = µ(|Tf |).

Put together, the previous two lemmas yield the following:

Corollary 5.2.3 (WWKL0) If T is a multiplicative isometry on some Lp,with p ≥ 1, then it is an isometry for all Lq, q ≥ 1.

In particular, T is an isometry on L1(X) if and only if it is an isometryon L2(X).

Lemma 5.2.4 (RCA0) Let T be a multiplicative isometry on Lp(X). If〈Snf〉 converges in Lq norm for all f ∈ Lq(X), for some q > p, then 〈Snf〉converges in Lp norm for all f ∈ Lp(X).

Note: In WWKL0 there would be no need to specify where T is an isometry.This is due to the previous corollary: it is an isometry simultaneously on allLp(X) spaces.

Proof. Let f ∈ Lp(X) be represented by 〈fk〉. Then, for each k, fk ∈ S(X) ⊆Lq(X), and therefore for each k, 〈Snfk | n ∈ N〉 converges in Lq(X) to somegk ∈ Lq ⊆ Lp.

Find an estimate for ‖gk−gm‖p for any k,m, where k < m. The goal is toshow that the sequence 〈gk〉 is strong Cauchy in Lp(X) and thus convergentto an element in the space. This will precisely be the limit of 〈Snf〉.

For k < m, ‖Snfk − Snfm‖p = ‖Sn(fk − fm)‖p ≤ ‖fk − fm‖p ≤ 2−k.

Choose n so that ‖Snfk − gk‖q < 2−k, ‖Snfm − gm‖q < 2−k (this can bedone recursively, since the limit exists). Because p < q, the above inequali-ties also hold in Lp norm, based on the comment after Lemma 4.3.3 on page58. Then

Page 91: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.2. CONVERGENCE IN NORM 79

‖gk − gm‖p ≤ ‖gk − Snfk‖p + ‖Snfk − Snfm‖p + ‖Snfm − gm‖p

≤ 2−k + 2−k + 2−k = 3 · 2−k.

This is clearly a (strong) Cauchy sequence and since the space is complete,the proof is done. �

Corollary 5.2.5 (RCA0) Let T be an isometry on L1(X). Then L2 con-vergence of 〈Snf〉 for all f ∈ L2(X) implies L1 convergence of 〈Snf〉 for allf ∈ L1(X).

By Theorem 3.1.3, arithmetic comprehension proves L2 convergence of 〈Snf〉for all f ∈ L2(X). The following holds:

Corollary 5.2.6 (ACA0) For every f ∈ L1(X), 〈Snf〉 converges in L1

norm.

The converse to Lemma 5.2.4 holds. It is not relevant to the proof of theergodic theorem, however.

Lemma 5.2.7 (RCA0) If T is a multiplicative isometry on Lq(X), and if〈Snf〉 converges in Lp norm for all f ∈ Lp(X), then 〈Snf〉 converges in Lq

norm for all f ∈ Lq(X) whenever q > p.

Proof. We will show that 〈Snf〉 is a Cauchy sequence in Lq(X).

‖Snf − Smf‖q ≤ ‖Snf − Snfk‖q + ‖Snfk − Smfk‖q + ‖Smfk − Smf‖q,

and the first and third term are dominated by 2−k. As for the middle term,note that fk ∈ S(X), hence |fk| ≤ Mk for some real number Mk. Then|T ifk| ≤Mk for all i, which implies that for all n, also |Snfk| ≤Mk and

‖Snfk − Smfk‖qq = µ(|Snfk − Smfk|q)

= µ(|Snfk − Smfk|p |Snfk − Smfk|q−p)

≤ (2Mk)q−p ‖Snfk − Smfk‖p

p,

or‖Snfk − Smfk‖q ≤ (2Mk)

1−p/q ‖Snfk − Smfk‖p/qp .

The fact that 〈Snfk〉 converges in Lp(X) implies that 〈Snfk〉 is Cauchyin Lp(X), and therefore for every l, and m and n large enough, ‖Snfk −Smfk‖p ≤ 2−l, therefore

‖Snfk − Smfk‖p ≤ (2Mk)1−p/q2−lp/q,

Page 92: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

80 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

and it is not difficult to see that the sequence 〈Snfk〉 is strong Cauchy inLq(X), as is 〈Snf〉, and is therefore convergent. �

The next step is passing from convergence in norm to pointwise conver-gence. This requires the maximal ergodic theorem.

5.3 The Maximal Ergodic Theorem

The standard proof of the maximal ergodic theorem presents a problemsimilar to that encountered when trying to formalize the standard proofof the pointwise ergodic theorem. In this case the problem is resolved byadapting a constructive proof by Garsia [15], which (unlike the standardone) makes no use of inverse images of sets.

Recall that if a transformation is multiplicative, it is also nonnegative(in WWKL0 and stronger systems).

Theorem 5.3.1 (ACA0) Let T be a multiplicative isometry on L1(X), f ∈L1(X), and M = {x | supnSnf(x) > 0}. Then µ(fχM ) ≥ 0.

Proof. Proposition 4.5.4 shows that supnSnf is an integrable function, andby Lemma 4.4.2, M is an integrable set.

Define

A0f = 0, Anf =n−1∑

k=0

T kf,

and notice that Snf = 1nAnf for n ≥ 1. Also note that

{x | supnSnf(x) > 0} = {x | sup

nAnf(x) > 0},

so consider the latter set.Fix N ∈ N. Let hN = max0≤n≤N Anf . Then hN ∈ L1(X) and hN ≥ 0.

Let MN = {x | hN (x) > 0}. By Lemma 4.4.2, MN is an integrable set andM =

N MN . Since MN is integrable, χMN∈ L1(X).

For all n, 0 ≤ n ≤ N ,

hN ≥ Anf → ThN ≥ TAnf

→ ThN + f ≥ An+1f,

therefore,

ThN + f ≥ max0≤n≤N

An+1f

≥ max1≤n≤N

Anf.

Page 93: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.3. THE MAXIMAL ERGODIC THEOREM 81

This implies that ThNχMN+ fχMN

≥ max1≤n≤N AnfχMN, and this fact

is true in RCA0, assuming χMNis an integrable function. The next step,

however, requires WWKL0:

max1≤n≤N

AnfχMN= max

0≤n≤NAnfχMN

= hNχMN,

because hN (x) > 0 on MN , so the maximum of Anf is not attained at n = 0.Therefore,

fχMN≥ hNχMN

− ThNχMN.

After integrating, this becomes

µ(fχMN) ≥ µ(hNχMN

) − µ(ThNχMN) (5.1)

= µ(hN ) − µ(ThNχMN) (5.2)

≥ µ(hN ) − µ(ThN ). (5.3)

The equality in (5.2) follows from the fact that hN = hNχMN+hNχMN

andhN = 0 on X \MN , therefore hN = hNχMn

while (5.3) is a consequence ofnonnegativity of T : ThN ≥ 0 implying that ThNχMN

≤ ThN (the formeris provable in WWKL0, the latter in RCA0).

As T is norm preserving, and hN and ThN are nonnegative, µ(hN ) =‖hN‖ = ‖ThN‖ = µ(ThN ), and µ(fχMN

) ≥ 0.

Now pass to the limit. It is easy to show that χMN→ χM and fχMN

→fχM in L1(X), which implies that µ(fχMN

) → µ(fχM ). Consequently,µ(fχM ) ≥ 0. �

If G = {x | supn Snf(x) > λ}, then it immediately follows from theproof above that µ(χG) ≤ 1

λµ(fχG).

Although this proof is constructive from the standpoint of Bishop-stylemathematics, this is not the case in second-order arithmetic. First of all, theclaims that supnSnf is an integrable function, and M and MN for all N areintegrable sets require arithmetic comprehension. The operation of takingthe limit over N can be performed in RCA0 with the knowledge that χM andχMN

are integrable functions. Even if we suppose these facts beforehand,the proof of the maximal ergodic theorem still does not formalize in RCA0

because pointwise properties of the function hN are used in the proof.

We will need the following modification of the maximal ergodic theo-rem for the third proof of the pointwise ergodic theorem: we need to showthat the conclusion of the theorem still holds if we restrict ourselves to aninvariant subset of the space (A is invariant if TχA = χA).

Page 94: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

82 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

Corollary 5.3.2 (ACA0) Let T be a multiplicative isometry on L1(X), f ∈L1(X), and A an invariant set. If M = {x | (supnSnf(x))χA > 0}, thenµ(fχAχM ) ≥ 0.

Proof. Apply 5.3.1 to the integrable function fχA. It suffices to show that{x | (supnSnf)χA > 0} = {x | supnSn(fχA) > 0}, for then the state-ment immediately follows. This is where the invariance of A is used: sinceT k(fχA) = T kf · T kχA = T kf · χA for all k, Sn(fχA) = (Snf)χA and thetwo sets are equal. �

As above, if G = {x | supn Snf(x) > λ}, then µ(χG∩A) ≤ 1λµ(fχG∩A).

5.4 First Proof of the Main Theorem

Before proving the main theorem, one last auxiliary statement is needed.

Lemma 5.4.1 (ACA0) If 〈fk〉 is a sequence of functions in L1(X) that con-verges in norm to f ∈ L1(X), and if for all k, limn Snfk exists a.e., then〈Snf〉 converges a.e.

Note: The above statement can be interpreted as saying that the set ofall functions for which Snf converges a.e. is L1 closed.

Proof. Let G be the null set on which neither f , Snf for any n nor limn Snfk

for any k are defined (a countable union of null Gδ sets is a null Gδσ set, sointegrable). Due to regularity of measure, there is no loss in assuming thatG is a null Gδ set. If x is outside of this set, and n,m, k ∈ N are arbitrary,then

|Snf(x) − Smf(x)| ≤ |Snf(x) − Snfk(x)| + |Snfk(x) − Smfk(x)|+ |Smfk(x) − Smf(x)|.

Since limn Snfk(x) exists, there is Nk ∈ N such that |Snfk(x) − Smfk(x)| <2−k whenever m,n ≥ Nk. It remains to approximate the other two terms.Because |Sif(x) − Sifk(x)| ≤ supj |Sj(f(x) − fk(x))| for all i,

|Snf(x) − Smf(x)| ≤ 2supj |Sj(f(x) − fk(x))| + 2−k

for n,m ≥ Nk. According to the maximal ergodic theorem, for every λ > 0,

µ({x | supn|Sn(f(x) − fk(x))| > λ}) ≤

µ({x | supnSn|f(x) − fk(x)|) > λ}) ≤ 1

λ‖f − fk‖ ≤ 1

λ2−k.

Page 95: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.4. FIRST PROOF OF THE MAIN THEOREM 83

Let Ulk = {x | supn|Sn(f(x) − fk(x))| > 2−l} ∪ G. By Lemma 4.4.2, Ulk

is integrable. Its measure does not exceed 2l−k, while |Snf(x) − Smf(x)| ≤2 · 2−l + 2−k for n,m > Nk and outside of Ulk.

Let ε be arbitrary. Choose l such that 2−l+1 < ε/2 and consider U =⋂

k≥l Ulk: µ(U) ≤ 2l−k for all k, thus µ(U) = 0. If x 6∈ U , then x 6∈ Ulk for

some k. For this value supn|Sn(f(x) − fk(x))| ≤ 2−l and

|Snf(x) − Smf(x)| ≤ 2 · 2−l + 2−k < ε

for m,n ≥ Nk.Now for each i let Ui be the set U which corresponds to ε = 2−i above.

Define V = ∪iUi. As a union of null sets, V is itself a null set, at a low levelin the Borel hierarchy. We can therefore appeal to regularity of measure onceagain, and as V is contained in a null Gδ set, it may be assumed that V itselfis a Gδ set. For x 6∈ V , 〈Snf(x)〉 is a Cauchy sequence, thus convergent,which by definition means that 〈Snf〉 converges a.e. �

Theorem 5.4.2 (ACA0) (Pointwise ergodic theorem) For every multi-plicative isometry T and for every f ∈ L1(X), 〈Snf〉 converges a.e. to anintegrable function f . In addition, µ(f) = µ(f) and f is invariant.

Proof. Given f ∈ L1(X), represented as 〈fn〉, where fn ∈ S(X) for all n.Since fn ∈ L2(X), by Theorem 3.1.2, ACA0 proves that fn = fM

n +fNn , where

fMn and fN

n are in L2(X) and therefore integrable (follows from the proofof the mean ergodic theorem). Then TfM

n = fMn and fN

n is represented bythe sequence 〈gnk − Tgnk〉, with gnk ∈ S(X) for all k. 〈Smf

Mn 〉 converges

pointwise to fMn , as Smf

Mn (x) = fM

n (x) for all x for which it is defined.For any function of the form g − Tg, if g ∈ C(X) then |g(x)| ≤ K for

some K, and because T preserves L1,∞ norm

|Sng(x)| =1

n|g(x) − Tng(x)| ≤ 2K/n→ 0.

Thus, 〈Sng〉 converges pointwise to 0.As fN

n is the L1 limit of functions of the above form (an L2 limit is also anL1 limit), according to Lemma 5.4.1, 〈Smf

Nn 〉 is also pointwise convergent,

and, as a consequence, 〈Smfn〉 is pointwise convergent. But, once again, bythe same lemma, as f is the L1 limit of the sequence fn, 〈Snf〉 convergesa.e. At the same time, it was proved above that 〈Snf〉 converges in L1 toan integrable function f , which by Lemma 4.4.4 implies that limn Snf(x) =f(x) a.e. This establishes the first part of the claim. That µ(f) = µ(f)

Page 96: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

84 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

follows directly from µ(f−f) = µ(f−Snf)+µ(Snf−f), and µ(Snf−f) = 0while, since Snf → f in norm, ‖f − Snf‖ → 0, thus µ(f − Snf) → 0.

To prove invariance of f , it suffices to show that ‖T f − f‖ = 0. For alln ∈ N,

‖T f − f‖ ≤ ‖T f − TSnf‖ + ‖TSnf − Snf‖ + ‖Snf − f‖.

Consider each of the terms on the right-hand side of the inequality. Fixε > 0. Because ‖f−Snf‖ → 0, there exists N1(ε) such that ‖T f−TSnf‖ =‖f − Snf‖ < ε for n > N1(ε). Similarly, there exists N2(ε) such that‖TSnf − Snf‖ = 1

n‖Tnf − f‖ ≤ 2n‖f‖ < ε for n > N2(ε). For n >

max{N1, N2}, ‖T f − f‖ < 3ε. Since ε is arbitrary, this concludes the proof.�

We showed that L1 convergence of the sequence 〈Snf〉 implies its point-wise convergence. In fact, the reverse also holds. This follows directly fromLemma 4.4.5. If Snf → f a.e., since also ‖Snf‖ = ‖f‖, we have Snf → fin L1(X).

5.5 Second Proof

We now proceed to the second proof of the pointwise ergodic theorem. Theadvantage this proof has over the first one is that it requires fewer assump-tions on T . It is only required that it be a measure preserving (and thereforenonnegative) linear operator. The resulting proof, however, is longer andsomewhat less intuitive. It also relies heavily on arithmetic comprehension.The goal is to prove that lim infn Snf and lim supn Snf coincide. To evenbegin talking about these notions (suprema, infima, limits), arithmetic com-prehension is necessary. Most proofs in this section are formalized in ACA0.

For the sake of clarity, the proof is divided into a number of lemmas. Weretain the notation from the previous sections.

Since any function f can be written as f = f+ − f−, and T is linear, itcan safely be assumed that f ≥ 0. Consider the full Fσ set X on which f ,along with T kf for all k, is defined and nonnegative.

Recall that for each n, ‖Snf‖ ≤ ‖f‖. Set f(x) = lim supn Snf(x) andf(x) = lim infn Snf(x) for x ∈ X. By virtue of arithmetic comprehension,

both of these pointwise limits exist. Furthermore, by Proposition 4.5.4, fand f are integrable.

Fix ε > 0. Let

ψ(x, n) ≡ (n = min{k ≥ 1 | Skf(x) ≥ f(x) − ε}).

Page 97: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.5. SECOND PROOF 85

Then ψ(x, n) is an arithmetic formula which holds if the first average thatcomes within ε of f(x) is Snf(x).

Next, define a formula ϕ so that

ϕ(x,m, n) ↔ ∃〈k1, . . . km〉 (ψ(x, k1) ∧ ψ(T k1x, k2) ∧ . . .∧ ψ(T k1+···+km−1x, km) ∧ n = k1 + · · · + km).

In other words, once an average is reached that is within ε from f , we start“counting” over, until another such average is reached. If ϕ(x,m, n) holds,then by term with index n this has happened m times.

We will write n = R(x) when ψ(x, n) holds and n = Rm(x) whenϕ(x,m, n) holds. Both R(x) and Rm(x) exist a.e. and SRm(x)f(x) ≥ f(x)−εfor all m.

First consider a restricted case. In the following lemma use the assump-tion that T is nonnegative. So far in this section, arithmetic comprehensionhas been necessary for all considerations, but if we assume that Rm(x) isdefined a.e. for all m and that f ∈ L1(X), this lemma can be proved inWWKL0.

Lemma 5.5.1 (WWKL0) If f ∈ L1(X), and there exist R∗ and M in N

such that R(x) ≤ R∗ and f(x) ≤M a.e., then µ(f) ≤ µ(f) + 2ε.

Proof. Pick L such that MR∗

L < ε and fix x in the domain of f and for whichRm(x) is defined for all m. There exists n such that Rn(x) ≤ L < Rn+1(x)and

L · (f(x) − ε) ≤ Rn+1(x) · (f(x) − ε)

= Rn(x) · (f(x) − ε) + (Rn+1(x) −Rn(x)) · (f(x) − ε)

≤ Rn(x)SRnf(x) +MR∗ ≤ LSLf(x) +MR∗.

The inequality Rn+1(x) −Rn(x) ≤ R∗ is deduced from

Rn+1(x) −Rn(x) ≤ Rn+1(x) = R(Rn(x)) ≤ R∗.

In the last step nonnegativity of T was used to conclude that SRnf ≤ SLf .

Now divide by L. Since f(x) − ε ≤ SLf(x) + MR∗

L a.e., by Proposi-tion 4.4.1:

µ(f − ε) ≤ µ(fL) +MR∗

L≤ 1

L

L−1∑

k=0

µ(T kf) + ε = µ(f) + ε.

Use the fact that T is measure preserving in the last step.

Page 98: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

86 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

Finally, because µ(f − ε) = µ(f) − µ(ε) = µ(f) − ε (since µ(1) = 1) itfollows that µ(f) ≤ µ(f) + 2ε. �

Before proceeding to the general case, we prove the following statement.

Lemma 5.5.2 (ACA0) The set W = {x | R(x) ≥ K} is integrable for allK ∈ N.

Proof. This fact doesn’t follow immediately from Lemma 4.4.2 because R(x)is not defined as a function.

The formula R(x) ≥ K is an abbreviation for ψ(x, n) ∧ n ≥ K. LetWn = {x | Sn(x) − f(x) + ε ≥ 0} and Yn = Wn \ ⋃n−1

i=1 Wi. Then x is inWn if and only if Sn(x) is within ε from f(x) and is in Yn if and only if it isthere for the first time: x ∈ Yn ↔ R(x) = n. Hence,

W =∞⋃

n=K

Yn =∞⋃

n=K

Wn \K−1⋃

n=1

Wn.

Each Wi is integrable by Lemma 4.4.2, as is W , because infinite unions ofintegrable sets are integrable, as are differences of integrable sets. �

To conclude the proof, consider an arbitrary f , and “cut off” f at someM , i.e. make it essentially bounded to reduce to the previous case. Takinglimits will give us the desired result.

Lemma 5.5.3 (ACA0) For any f ∈ L1(X), µ(f) ≤ µ(f).

Proof. Fix M ∈ N and set fM = min(f,M). Choose R∗ ∈ N such thatµ({x | R(x) ≥ R∗}) ≤ ε

M . This is possible because R(x) is finite a.e. andµ(X) = 1.

Let ψM (x, n) ≡ (n = min{k ≥ 1 | Snf(x) ≥ fM − ε}). As before, whenψM (x, n) holds, denote n = RM (x). Note that RM (x) ≤ R(x) since fM ≤ f .Define f as

f(x) =

{f(x) R(x) ≤ R∗

max(f(x),M) R(x) > R∗,

and define R with

R(x) =

{RM (x) R(x) ≤ R∗

1 R(x) > R∗.

Then f ∈ L1(X) because f = fχR≤R∗ + max(f,M)χR>R∗ a.e. and both{R ≤ R∗} and {R > R∗} are integrable sets and max(f,M) is an integrable

Page 99: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.5. SECOND PROOF 87

function. Integrability of f follows from the comment in Section 4.4 on page59. In addition, f ≥ 0.

The following inequality holds a.e.:

fM (x) − ε ≤ 1

R(x)

R(x)−1∑

k=0

T kf(x).

There are two cases. If R(x) ≤ R∗, then R(x) = RM (x) and f(x) = f(x)and

1

RM

RM−1∑

k=0

T kf ≥ fM − ε

by definition.

The second case occurs when R(x) > R∗. Then R = 1 and

1

1

0∑

k=0

T kf = f = max(f,M) ≥M ≥ fM − ε.

The conditions of Lemma 5.5.1 are now satisfied. This lemma applies tofM in place of f and f in place of f and thus

µ(fM ) ≤ µ(f) + 2ε.

At the same time,

µ(f) = µ(fχR≤R∗) + µ(max(f,M)χR>R∗)

≤ µ(fχR≤R∗) + µ((M + f)χR>R∗)

≤ µ(f) +Mε

M= µ(f) + ε,

thus µ(fM ) ≤ µ(f) + 3ε.

Since ε was arbitrary, µ(fM ) ≤ µ(f). Since f = limM fM ,

µ(f) = µ(limMfM ) = lim

Mµ(fM ) ≤ µ(f).

The inequality is the consequence of the proof above. The limit commuteswith the integral due to the weaker version of monotone convergence the-orem, provable in WWKL0, as fM is an increasing sequence of integrablefunctions that converges to an integrable function f . �

Page 100: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

88 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

Theorem 5.5.4 (ACA0) (Pointwise ergodic theorem) Let T : L1(X) →L1(X) be a measure preserving transformation and f ∈ L1(X). Then thesequence of averages Snf(x) =

∑n−1k=0 T

kf(x) converges for almost all x ∈ X.

Proof. The following chain of inequalities holds:

µ(f) ≤ µ(f) ≤ µ(f) ≤ µ(f).

The middle inequality is trivial, the third is Lemma 5.5.3, while the first isanalogous to the first.

To see that this proves the theorem, note that

µ(f − f) ≤ µ(f − f) = 0.

By Proposition 4.4.1, ‖f‖ = 0 iff f = 0 a.e. Since f is nonnegative, µ(f −f) = 0 implies f = f outside of a null Gδ set G. For all x outside of G, due to

nested interval convergence, f(x) = f(x) = limn Snf(x). This implies that

the sequence 〈Snf〉 converges to an integrable function f a.e. as required.

5.6 Third Proof

In this section we consider the standard proof of the pointwise ergodic the-orem. In the classical mathematical setting this proof is short and straight-forward. In our framework, however, it becomes complicated, providing anexample of the difficulties in dealing with subsets of a measure space inthe context of second-order arithmetic. For that reason, this proof is pre-sented last. To simplify matters, the underlying space X is taken to be[0, 1], though it is likely that the argument easily generalizes to an arbitrarycompact metric space equipped with a Borel measure. The main idea of theproof is to show that given a, b ∈ Q such that a < b, the set

Aa,b = {x | lim infn

Snf(x) < a < b < lim supn

Snf(x)}

is invariant (T−1(A) = A), and then with the help of the maximal ergodictheorem to show that µ(Aa,b) = 0. Since

a,bAa,b is the set of all x forwhich the limit of Snf(x) does not exist, and since µ(

a,bAa,b) = 0, thissuffices to prove the claim. The main difficulty for us is in showing that Aa,b

is an invariant set. In our setting this means that TχAa,b= χAa,b

.Throughout the section we are going to work under the assumption that

T is a multiplicative isometry on L1([0, 1]). Recall that a multiplicativetransformation also preserves L1,∞ norm.

Page 101: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.6. THIRD PROOF 89

Notice that Aa,b = B ∩ C, where

B = {x | lim infn

Snf(x) < a},

andC = {x | b < lim sup

nSnf(x)},

and thatTχB∩C = T (χB · χC) = TχB · TχC .

If we show that B and C are both invariant, Aa,b will be invariant as well.As B and C have symmetric definitions, it is enough to prove the claim foronly one of the two, so from now on consider only the set B.

Since x ∈ B ↔ ∃ε ∀k ∃n ≥ k (Snf(x) < a− ε),

B =⋃

ε

k

n≥k

{x | Snf(x) < a− ε}.

For brevity, letBnε = {x | Snf(x) < a− ε},

where ε ranges over rational numbers, e.g. those of the form 2−m for somem.

The next thing we need to show is that T , in a sense, commutes withinfinite unions and intersections, i.e. that

T (χ∪ε∩k∪n≥kBnε) = χ∪ε∩k∪n≥kDnε

,

where Dnε is such that TχBnε= χDnε

.This fact is true in the finite case (see also comment on page 75 before

Proposition 5.1.5). It is not difficult to show that, due to continuity ofT , the extension to infinite unions and intersections can be made. Forexample, assume M = ∪∞

n=1Mn exists (meaning χM = limn χ∪nk=1

Mk), with

TχMn= χNn

. Then

TχM = T (χlimn ∪nk=1

Mk) = T (lim

nχ∪n

k=1Mk

)

(⋆)= lim

nT (χ∪n

k=1Mk

) = limnχ∪n

k=1Nk

= χ∪∞n=1

Nn

The step (⋆) follows from continuity of T .With this conclusion, and since

TχB = Tχ∪ε∩k∪n≥kBnε= χ∪ε∩k∪n≥kDnε

,

Page 102: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

90 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

it suffices to show that ∪ε ∩k ∪n≥kBnε = ∪ε ∩k ∪n≥kDnε. To make thisclaim, we need to see how T acts on χBnε

. For this purpose we consider analternative characterization of integrable functions.

Definition 5.6.1 (RCA0) A function f is called a step function if it is ofthe form

k≤n ckχIi, where I1, . . . , In are disjoint open intervals.

It can be shown in RCA0 that every step function is integrable, becausecharacteristic functions of open intervals can be defined in RCA0 (see page46) and linear combinations of integrable functions are integrable (commentbefore Lemma 4.4.4).

Lemma 5.6.2 (RCA0) Every function f ∈ L1(X) is the L1 limit of stepfunctions.

Proof. It is enough to show the claim for simple functions. Let f be asimple function. Let gn be the function defined as

∑n−1k=0 f( k

n)χIkn, where

Ikn = ( kn ,

k+1n ). It is clear that each gn is a step function. It can furthermore

be shown that 〈gn〉 is a Cauchy sequence and that it converges to f . �

Lemma 5.6.3 (WWKL0) Let g ∈ L1(X). Then

Tχ{x | g(x)>0} = χ{x | Tg(x)>0}. (5.4)

Proof. We will use the previous characterization of integrable functions toprove the claim. First assume that g is a characteristic function of an openinterval: call it χI .

This case is simple, since for all x, χI(x) > 0 ↔ χI(x) = 1 (if χI(x) is de-fined, it is either 0 or 1, according to the definition of characteristic functionsof intervals) and {x | χI(x) = 1} = I, so the left-hand side of (5.4) becomesTχI . As for the right-hand side, since TχI is the characteristic function ofsome set J , similarly, χJ(x) > 0 ↔ χJ(x) = 1 and χ{x | χJ (x)=1} = χJ , sothe two are equal.

Next assume that g =∑

k≤n ckχIk. Since the intervals Ik are disjoint,

{x |∑

k≤n

ckχIk> 0} = ∪k≤n{x | ckχIk

> 0}.

Observe that, if TχIk= χJk

, the sets J1, . . . , Jn can intersect only at a nullset, as

χJi∩Jk= χJi

· χJk= TχIi

· TχIk

= T (χIi∩Ik) = T (0) = 0.

Page 103: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.6. THIRD PROOF 91

Since two sets that differ only on a null set have the same characteristicfunction, and since T commutes with unions in the sense specified above, itis enough to consider only Tχ{x | ckχIk

>0} for a fixed k.

If ck ≤ 0, then {x | ckχIk> 0} = ∅, χ∅ = 0 and T (0) = 0. Following the

same reasoning, {x | T (ckχIk> 0)} = {x | ckχJk

> 0)} = ∅, and once again,the two sides of (5.4) coincide. If ck > 0, then {x | ckχIk

> 0} = {x | χIk>

0} and this reduces to the previous case.Finally, let g be represented as a limit of step functions 〈gn〉. Then

g(x) > 0 ↔ ∀ε ∃N ∀n ≥ N (gn(x) > ε),

and

{x | g(x) > 0} =⋂

ε

N

n

{x | gn(x) > ε}.

We can once again appeal to the fact that T commutes with unions andintersections. Since Tχ{x | gn(x)>ε} = χ{x | Tgn(x)>ε} for all n, it follows that

Tχ{x | g(x)>0} = Tχ∩ε∪N∩n{x | gn(x)>ε}= χ∩ε∪N∩n{x | Tgn(x)>ε}= χ{x | Tg(x)>0}.

The last step follows from the fact that T is continuous. �

Now we need to see why this is enough to prove our initial claim. Thelemma is stated and proved in WWKL0, with the assumption that all limitsexist.

Lemma 5.6.4 (WWKL0) The set B is invariant.

Proof. First assume that the function f is bounded. According to the pre-vious lemma, Tχ{x | Snf(x)<a−ε} = χ{x | TSnf(x)<a−ε} and

TχA = Tχ{x | lim infn Snf(x)<a}= Tχ∪ε∩k∪n≥k{x | Snf(x)<a−ε}= χ∪ε∩k∪n≥k{x | TSnf(x)<a−ε}= χ{x | lim infn TSnf(x)<a},

and it therefore suffices to show that

{x | lim infn

TSnf(x) < a} = {x | lim infn

Snf(x) < a}.

Page 104: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

92 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

Notice that TSnf(x) = Snf(x) + T nf(x)−f(x)n . Since f is bounded, |f | ≤

M , for some M and as T preserves L1,∞ norm,∣∣T nf(x)−f(x)

n

∣∣ ≤ 2M

n → 0.It is not difficult to show that, if 〈an〉 and 〈bn〉 are two sequences of realnumbers such that limn bn = 0, then lim infn(an + bn) = lim infn an, andtherefore

lim infn

TSnf(x) = lim infn

(Snf(x) +

Tnf(x) − f(x)

n

)

= lim infn

Snf(x).

It should be clear that this proves the claim in the case f is bounded.The general case follows, as f is represented with a sequence 〈fn〉 of

simple functions, which are by definition bounded. More precisely, we haveproved that for every k,

Tχ{x | lim infn Snfk(x)<a} = χ{x | lim infn Snfk(x)<a},

while

lim infn

Snf(x) < a↔ ∀ε ∃N ∀m ≥ N (lim infn

Snfm(x) < a− ε),

from which the claim follows, once again by the useful fact that T commuteswith unions and intersections. �

Finally, it remains to prove the actual pointwise ergodic theorem. Theproof has been adapted from [3].

Theorem 5.6.5 (ACA0) (Pointwise ergodic theorem) If T is a mul-tiplicative isometry on L1([0, 1]), then the sequence of averages Snf(x) =1n

∑n−1k=0 T

kf(x) converges for a.e. x in [0, 1].

Proof. As indicated at the beginning of the section, we need to show that forall rational numbers a and b with a < b, µ(Aa,b) = 0. Fix a and b. We showedthat Aa,b is an invariant set. Furthermore, if M = {x | supnSnf(x) > b},then Aa,b = Aa,b∩M and by Corollary 5.3.2 of the maximal ergodic theorem,

bµ(χAa,b) ≤ µ(fχAa,b

).

Similarly,µ(fχAa,b

) ≤ aµ(χAa,b),

andbµ(χAa,b

) ≤ aµ(χAa,b),

Page 105: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.7. REVERSAL 93

which, as a < b, is only possible when µ(χAa,b) = 0, which means that Aa,b

is a null set and

µ({x | limnSnf(x) does not exist} = µ(∪a,b∈QAa,b) = 0,

which implies that the limit exists a.e. �

5.7 Reversal

The introduction contained Bishop’s explanation as to why the ergodic the-orem of Birkhoff is not constructive. He gave an example of two vesselsbetween which a leak may or may not exist. Imagine now infinitely manysuch pairs, each half the size of the previous one, so that they all fit on a finiteline, and associate to the nth pair the nth Turing machine is some standardordering. A leak exists iff the corresponding Turing machine halts. Know-ing the behavior of the system after a sufficient amount of time provides thesolution to the halting problem, and we know that is not constructive. Infact, it is equivalent to arithmetic comprehension. Formalizing this heuristicargument will provide us with the reversal of the pointwise ergodic theorem.Recall that the Turing jump of Z ⊆ N is {x | ∃y θ(x, y, Z)}, where θ is ∆0

and ∃y θ(x, y, Z) is a complete Σ1 formula.

The transformation T defined below will preserve measure and norm,and will be multiplicative and nonnegative (recall that in RCA0 some ofthese notions may differ). This will imply that the equivalence between thepointwise ergodic theorem and (ACA) will hold regardless of the definition,regardless of which proof of the theorem we consider (recall that the secondproof uses slightly different assumptions from the other two). Having saidthis, we take the statement of the theorem to be the one below, with theunderstanding that even with modified assumptions, the reversal works.

IfX is a compact complete separable metric space, and T a multiplicativeisometry on L1(X), then the sequence of averages Snf(x) = 1

n

∑n−1k=1 T

kf(x)

converges a.e. to an invariant function f ∈ L1(X) such that µ(f) = µ(f).

Theorem 5.7.1 (RCA0) The following are equivalent:

1. Pointwise ergodic theorem.

2. (ACA).

Page 106: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

94 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

Proof. It only remains to prove the reversal. Let the space X be [0, 1] withthe standard metric. It is compact as required and the Borel measure onthis space coincides with the Riemann integral. Define a measure preservingtransformation T on L1(X) in the following way:

We split [0, 1] into intervals Ik = [xk, xk+1), where xk = 1 − 2−k andk ≥ 0, so that Ik has length 2−(k+1). Define

(ak)n =

{2−(k+m) m ≤ n ∧ θ(k,m,Z) ∧ ∀l < m ¬θ(k, l, Z)0 otherwise.

It is immediate that each ak is a computable real number and that ak isequal to 0 if and only if ¬∃m θ(k,m,Z).

Next, define T on simple functions. If f ∈ S(X) and x ∈ [xk−1, xk),define

Tf(x) =

{f(x+ ak) x+ ak < xk

f(x+ ak − 2−k) x+ ak > xk.

Note that Tf is not defined when x + ak = xk, but since there are onlycountably many such points, this presents no difficulty.

We will show that T satisfies all the properties previously associated withit. First show that T is well-defined, i.e. that Tf ∈ L1([0, 1]). This is due tothe fact that Tf is a piecewise continuous function, with at most countablymany jump discontinuities (they can occur at xk−1, xk−1 + ak and xk for allk), which is moreover bounded above; by Lemma 4.5.2, Tf is an integrablefunction.

Next show that T is measure preserving on S(X). Let f be a simple func-tion. Because µ(f) =

∫ 10 f(x)dx =

k

∫ xk

xk−1f(x)dx, it is enough to show

that∫ xk

xk−1Tf(x)dx =

∫ xk

xk−1f(x)dx, but this fact follows immediately from

invariance of the Riemann integral under translation. The same argumentcan be used to show that T is an isometry.

Showing that T is nonnegative is also straightforward, and it is clearthat if f ∈ S(X), Tf ∈ L1,∞, as the maximum of f is preserved. It remainsto show that if f, g ∈ S(X), then T (fg) = Tf · Tg. That T (fg)(x) =Tf(x)Tg(x) whenever all the quantities are defined is immediate. Howabout the strong product of Tf and Tg? We know that this product existsand is unique. By a previous remark, it has to coincide with the pointwiseproduct, which is T (fg).

Now extend the definition to all of L1([0, 1]). Because T is an isometryon simple functions, this will imply that T is well-defined on the entire space.

Pointwise properties of integrable functions are in general not availablein RCA0. However, the only choice for f of interest to us is that of the

Page 107: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.7. REVERSAL 95

identity function. In this case, we will be able to analyze both Tf and Snf .The definition of T implies that Tf(x) remains in the same subinterval asx, as do all the averages Snf(x). If ak = 0, T does not change f , henceSnf = f on [xk−1, xk) for all n. If ak 6= 0, each iteration moves f(x), andregardless of where in the interval x is, T 2m

f = f .

Consider the graphical representation of this. For the sake of conve-nience, assume k = 0 and m = 2.

f

6

-��

��

��

��

Tf

6

-

��

��

��

��

T 2f

6

-

��

��

��

��

T 3f

6

-

��

��

��

��

Algebraically, it can be shown that

S2mf(x) =1

2m[x′ + (x′ + ak) + (x′ + 2ak) + · · · + (x′ + (2m − 1)ak)]

= x′ +2m − 1

2ak = x′ + 2−(k+1) − 2−(k+m+1).

where x′ = x− i2−m, and i is the largest integer such that x− i2−m ≥ xk−1,that is, x′ is the leftmost iteration of x in the interval [xk−1, xk). This x′ iscomputable, based on Proposition 1.2.2. The above formula holds for all xfor which T kf(x) is defined for all k < 2m.

The function S2mf can graphically be represented as

Page 108: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

96 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

S2mf

6

-

��

��

��

��

Define Anf = nSnf . Because T 2m

f = f ,

Si2mf =Ai2m

i2m=iA2m

i2m= S2m .

Finally, for arbitrary n, since for some i, i2m ≤ n < (i+ 1)2m,

‖Snf − S2mf‖ = ‖Snf − Si2mf‖ = ‖Anf

n− Ai2mf

i2m‖

=‖i2m(Ai2mf +Rn) − nAi2mf‖

ni2m,

where Rn = An −Ai2m .This is in turn equal to

‖Ai2mf(i2m − n)

ni2m+Rn

n‖ ≤ | i2

m − n

n| ‖S2mf‖ +

‖Rn‖n

.

Since ‖S2m‖ ≤ ‖f‖ < 1, | i2m−nn | < 2m

i2m = 1i , and

‖Rn‖ = ‖T i2m

f + · · · + Tn−1f‖ ≤ (n− i2m)‖f‖ < 2m,

it follows that

‖Snf − S2mf‖ < 2

i.

Thus, if ak = 0, limn Snf = f . If ak 6= 0, then limn Snf = S2mf . Onceagain, graphically:

limn Snf

6

-��

��

��

��

��

��

��

��

Page 109: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

5.7. REVERSAL 97

It remains to find a formula that will distinguish between the two limitfunctions, and consequently between the cases when ak = 0 and ak 6= 0.We cannot directly compare integrals of limit functions, since the pointwiseergodic theorem states that the integral of the limit is equal to the integralof the original function. Instead, we can consider the restriction of limn Snfon the second half of each subinterval. It is clear from the above graph thatthe two have different measures. A brief computation shows this.

Let Ik = [xk + 2−(k+2), xk+1) and fk = limn Snf ↑ Ik = limn SnfχIk.

In general, (WWKL) is needed to prove that a product of an integrablefunction and a characteristic function is integrable. In this case, however,it can be shown directly that this is the case: fk is a piecewise continuousfunction and its integrability is shown using Lemma 4.5.2.

If ak = 0, then µ(fk) = 2−(k+2)(1 − 2−(k+1) − 2−(k+2) + 2−(k+3)).If ak 6= 0, then µ(fk) = 2−(k+2)(1 − 2−(k+1) − 2−(k+2)).Therefore

ak 6= 0 ↔ µ(fk) = 2−(k+2)(1 − 2−(k+1) − 2−(k+2)),

and the latter formula is Π1, which completes the proof.�

It can be shown using the same argument that the L1 ergodic theorem(statement that the sequence 〈Snf〉 converges in the L1 norm for all f) alsoreverses to (ACA). The transformation T remains the same, as does theproof. This is because we never used any of the pointwise properties of thelimit function in the above theorem.

Page 110: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

98 CHAPTER 5. THE POINTWISE ERGODIC THEOREM

Page 111: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Chapter 6

Closing Remarks

It is my hope that the reader has been convinced that doing analysis insecond-order arithmetic is both a fascinating and useful endeavor, that un-covers relationships that are easily overlooked in the standard mathematicalpractice.

This is by no means an attempt to formalize all of measure theory andall of functional analysis within second-order arithmetic. My purpose wasto give a self-contained account of the ergodic theorems and results requiredto prove them. Much more work can and should be done towards a morecomplete understanding of these spaces within second-order arithmetic. Anatural continuation of the work presented in Chapters 2 and 3, and in morebreadth with Jeremy Avigad in [2], would include the analysis of the spectraltheory of Hilbert spaces, as well as consideration of more general classes ofBanach spaces. As for measure theory, there is the question of whether thereare situations in which it is necessary to introduce measurable functions, andif so, how. It is also still unclear if it is possible to create a satisfactory theoryof measure starting from sets. Finally, it would be especially worthwhile toexplore and fully grasp the relationship between reverse and constructivemathematics.

Despite of my effort to make this presentation complete, a number ofquestions still remain open. Those pertaining to Hilbert spaces and themean ergodic theorem were discussed in [2]. The list given below can befound in Section 16 of that paper.

1. Every closed linear subset of a Banach space is located.

2. Every closed linear subset of a Hilbert space is located.

3. Every closed linear subset of a Banach space is a closed subspace.

99

Page 112: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

100 CHAPTER 6. CLOSING REMARKS

4. Every closed linear subset of a Hilbert space is a closed subspace.

5. If T is any bounded linear operator from a Banach space to itself andλ any real number, {x | Tx = λx} is a closed subspace.

6. If T is any bounded linear operator from a Banach space to itself,{x | Tx = x} is a closed subspace.

7. If T is any bounded linear operator from a Hilbert space to itself,{x | Tx = x} is a closed subspace.

In RCA0, all of these are implied by (Π 1

1-CA), and all, in turn, apply (ACA).

In addition, 1 implies all the statements below it; 2 implies 4 and 7; 3 impliesall the statements below it; 4 implies 7; 5 is equivalent to 6 (since if λ 6= 0,we can define T ′x = 1

λTx), and these in turn imply 7. It is possible, however,that all the statements are equivalent to (Π 1

1-CA), and it is also possible

that they are all equivalent to (ACA).Also left wide open is the strength of the statement:

• If M is any closed linear subset of a Hilbert space, and x is any point,and the distance from x to M exists, then the projection of x on Mexists.

There are also many unresolved questions in the realm of measure the-ory. I proved a number of results under the condition that the functionsconsidered are essentially bounded, but in some instances was not able toconclude anything about the general case. It is especially unfortunate thatthe definitions of f ·f and f2 in general cannot be shown to be equivalent inRCA0. Also, I tailored the definition of products and powers of Lp functionsto suit my particular needs in this work; the definition may be too restrictiveand may not recognize all functions that are classically integrable as such.This should also be addressed in the future.

Another unresolved issue is this: it is a known fact that (WWKL) provesthat the domain of an integrable function is a full set. What is not known,however, is whether this is sharp, that is, if that statement reverses to weak-weak Konig’s lemma.

It would also be quite interesting to figure out the nuances regarding theexact relationship between measure theory in Bishop-style mathematics andin our framework.

Page 113: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Appendix A

To pay off a debt from the introduction, we show that the statement “C(X)is an integration space” (in the classical sense) reverses to (WWKL). Theconstructive definition of an integration space is given in [8] and is as follows:

Definition A.0.2 An integration space is a triple (X,L, I) with the follow-ing properties:

i) If f, g,∈ L and α, β ∈ R, then αf + βg, |f | and f ∧ 1 belong to L andI(αf + βg) = αI(f) + βI(g).

ii) If f ∈ L and 〈fn〉 is a sequence of nonnegative functions in L such that∑

n I(fn) converges and∑

n I(fn) < I(f), then there exists x in Xsuch that

n fn(x) converges and∑

n fn(x) < f(x).

iii) There exists a function p in L with I(p) = 1.

iv) For each f in L, limn I(f ∧ n) = I(f) and limn I(|f | ∧ n−1) = 0.

The classical definition of an integration space in the Daniell integral theoryis almost the same as the one above, and can be found, for example, in [24],where Daniell integrals are discussed at length. The essential difference liesin item ii). In the standard definition, it is replaced with the continuitycondition (A):

Let 〈fn〉 be a sequence of functions in L such that fn(x) ↓ 0 forall x. Then I(fn) ↓ 0.

Classically, the two are equivalent, but constructively, ii) is stronger.We will show that the statement “(A) holds in every space C(X) induced

by a complete separable metric space X” implies (WWKL). In fact, thespace that provides this reversal is C([0, 1]), so it may be more accurate tosay that the statement “(A) holds in L = C([0, 1])” implies (WWKL). Theproof will be based on the following result [26, 32, 11]:

101

Page 114: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

102 APPENDIX A.

Theorem A.0.3 The following assertions are equivalent over RCA0:

1. (WWKL).

2. For any covering of the closed unit interval [0, 1] by a sequence of openintervals 〈(an, bn) | n ∈ N〉, we have

n |bn − an| ≥ 1.

We will show that (A) implies (2) above, or rather that ¬(2) → ¬(A).If (2) does not hold, there exists a covering of [0, 1] by 〈(an, bn)〉 such that∑

n |bn − an| < 1.Now consider C([0, 1]) and construct a sequence of functions 〈fn〉 that

fails (A), using the covering above. Let fn be 1 on [ai + bi−ai

2n+2 , bi − bi−ai

2n+2 ] fori ≤ n, 0 on [0, 1]\⋃i≤n[ai, bi] and linear everywhere else. This is an increasingsequence of functions in C([0, 1]). Furthermore, limn fn(x) = 1 for all x,because 〈(an, bn)〉 is a covering. At the same time, µ(fn) ≤ ∑

i≤n |bi − ai|.Because

n |bn − an| < 1, if limn µ(fn) exists, it is less than 1. In otherwords, fn(x) ↑ 1 for all x, but µ(fn) 6→ µ(1). This immediately implies ¬(A)and so the classical definition of an integration space implies the weak-weakKonig’s lemma.

However, this conclusion no longer holds if we replace the classical def-inition of integration spaces with that by Bishop and Bridges. It cannotbe proved in RCA0 that item ii) in Definition A.0.2 and property (A) areequivalent.

To prove this fact, suppose (A), and let 〈fn〉 be a sequence satisfying thehypotheses of ii). Define a sequence 〈gi〉 such that

g0 = f

gi+1 = gi − fi.

Then for each n, gn = f − ∑

i<n fn, and the hypothesis implies that µ(gn)is decreasing and converges, but not to 0. Applying (A), we conclude thatfor some x, gn(x) does not converge to 0. In other words, for some x, either∑

n fn(x) does not converge at all, or it converges to something other thanf(x). But this is different from conclusion of ii), which does not allow thefirst possibility.

This last argument makes it clear that Bishop and Bridges were verycareful in choosing definitions that go through constructively.

Page 115: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Appendix B

A number of proofs in Chapter 4 used properties of derivatives of powerfunctions. Instead of developing a theory of differentiation, this Appendix isintended only to convince the reader that power functions are well-defined,and that it is legitimate to differentiate them, that basic laws of differ-entiation hold, along with the mean value theorem and some other simpleresults. These are all concepts that are usually taken for granted, but requiresome effort in reverse and constructive mathematics The motivation comesfrom Bishop’s work [6, 8], and, more importantly, from Schwichtenberg, whoworked out most of these issues constructively in [25]. Most technical detailsof proofs will be omitted. Assume throughout that the underlying systemis RCA0.

The first question to answer is: how to define the function xp? Theanswer is clear for p ∈ N, but more difficult when p is rational, and evenmore so when it is an irrational number. All standard calculus and analysistextbooks define power functions via exponentiation, i.e. xp = ep ln x for allx > 0. This is the approach that we will also take, but prior to that we needto justify the validity of taking these operations, that is, we need to showthat ex and lnx (when x > 0) are continuous.

Based on Lemma II.6.5. in [26], power series give rise to continuous func-tions, so ex can be defined as the sum of the series

∑∞n=0

xn

n! . Schwichtenbergshows in [25] that ex · ey = ex+y. He then proceeds to define lnx =

∫ x1

1t dt,

for x > 0. It is not obvious that this integration is permitted in RCA0. Infact, according to Lemma IV.2.6 in [26], 1

x can be integrated only if it has amodulus of uniform continuity, but it is a well-known fact that this is true,as long as we restrict ourselves to a closed interval [a, b], with a > 0.

A function obtained via integration is continuous and differentiable (def-inition of derivatives will follow below), hence lnx is continuous for all x > 0.Furthermore, the reader can find a proof in [25] that lnx thus defined hasthe usual properties and that ex and lnx are inverse to each other.

103

Page 116: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

104 APPENDIX B.

This means that when x > 0, the function f(x) = ep ln x as a compositionof continuous functions is itself continuous ([26], Lemma II.6.4) and we letxp = ep ln x. Standard properties of power functions can be shown, for ex-ample that xp+q = xp · xq and (xp)q = xpq (and in particular, (xp)1/p = x).Also, if p ∈ N, xp = x · · · · · x

︸ ︷︷ ︸

p

.

There is only one problem with this definition, and that is that it onlyworks for x > 0. However, it is possible to show that limx→0 e

p ln x = 0, byshowing that lnx is negative and unbounded when x → 0 and that ex isunbounded when x→ ∞. We also know that xp is a continuous function forall x > 0, which means that there is a sequence of quintuples (as describedin Definition 1.1.4) that codes it. With some care, it can be shown that, ifwe add to this sequence a countable number of conditions specifying that0 7→ 0, we will obtain a code for another continuous function, the functionxp for all nonnegative numbers.

Another useful fact can be shown: suppose p is an arbitrary real number,hence represented as the limit of a strong Cauchy sequence. Then it followsfrom properties of exponents and logarithms that xp = limn x

pn when x > 0(the fact is immediate when x = 0).

We can now define fp, when f is a nonnegative simple function. Withsome effort it can be shown that if f ∈ S(X), then fp ∈ C(X), by showingit has a modulus of uniform continuity.

Next we need to discuss some basic concepts regarding differentiation.The definition of derivative is what one would expect. A function f isdifferentiable at x if limh→0

f(x+h)−f(x)h exists. This limit, if it exists, is

f ′(x). Since the goal is to formalize all this in RCA0, the rate of convergencehas to be computable. Bishop gives the following definition:

Definition B.0.4 Let f and g be continuous functions on a compact properinterval I such that for each ε there exists δ(ε) > 0 with

|f(y) − f(x) − g(x)(y − x)| ≤ ε|y − x|

whenever x, y ∈ I and |y − x| ≤ δ(ε).

and with it proceeds to prove the following:

Proposition B.0.5 With the notation Df = f ′ = dfdx , the following hold:

1. D(f1 + f2) = Df1 +Df2.

2. D(f1f2) = Df1f2 + f1Df2.

Page 117: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

105

3. D(f−11 ) = −f−2

1 Df1.

4. dxdx = 1.

5. dcdx = 0.

6. (g ◦ f)′ = (g′ ◦ f)f ′.

The standard proofs of these facts are constructive: they apply almost wordfor word in RCA0.

The statements about power functions that were mentioned in the maintext and require proper proofs are these:

Lemma B.0.6 (RCA0) 1. The inequality ab ≤ ap

p + bq

q holds for all non-negative real numbers a and b. (4.3.1)

2. The inequality |x− y|p ≤ |xp − yp| holds for all x and y. (4.3.2)

3. The inequality |xp − yp| ≤ p |(x − y)(xp−1 + yp−1)| is true for all realx and y. (4.3.2)

4. For all x, y ∈ R+, and q > p, yq/p ≥ qp x

q−p

p (y − x) + xq/p. (4.3.3)

To prove Lemma B.0.6, a number of facts are needed.

Proposition B.0.7 The function f(x) = xp is differentiable for all realnumbers p and f ′(x) = p xp−1.

Proof. If p is a natural number, this is proved by induction, using the productrule. The base case is item 4. in Proposition B.0.5. If p is rational, the resultis provided by the chain rule. Finally, if p is an arbitrary real number,

(xp)′ = (ep ln x)′ =p

xep ln x = p xp−1,

by another application of the chain rule and by properties of exponents andlogarithms. �

Proposition B.0.8 The function f(x) = xp is convex for all p > 1, thatis, it is above its tangent line at every point.

Proof. The standard proof that if f ′′ > 0, then f is convex, applies. It iseasy to see that when p > 1, (xp)′′ = p (p− 1)xp−2 > 0. �

Page 118: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

106 APPENDIX B.

Proposition B.0.9 The mean value theorem for xa: for every x and ythere is a ξ between them such that xa−ya

x−y = aξa−1.

Proof. The general form of the mean value theorem was proved by Hardinand Velleman in [17]. The proof of this theorem in RCA0 is not at all obvious.�

Proposition B.0.10 If f is differentiable at x and f ′(x) > 0, then f isincreasing in some neighborhood of x; if f ′(x) < 0, f is decreasing; con-sequently, if f ′(x0) = 0, f ′(x) < 0 (resp. f ′(x) > 0) for all x < x0 andf ′(x) > 0 (resp. f ′(x) < 0) for all x > x0, then f(x0) is the absolute mini-mum (resp. absolute maximum) of f .

Proof. If f ′(x) > 0, then based on properties of limits and continuous func-

tions, there is some neighborhood of x for which f(x)−f(y)x−y > 0. This means

that for y < x in that neighborhood, f(x) > f(y) and for y > x, f(x) < f(y)as required. Other facts are shown similarly. �

We can now outline the proof of Proposition B.0.6:Proof.

1. The proof consists of finding the maximum of the function b 7→ ab −ap

p − bq

q . Since f ′(x) = a − xq−1, and f ′(x) > 0 when x > ap/q, while

f ′(x) < 0 when x > ap/q, the maximum is attained when x = a1

q−1 =ap/q and is equal to a1+p/q − ap

p − ap

q = 0. Therefore, ab ≤ ap

p + bq

q .

2. Let x ≥ y ≥ 0 and divide the entire inequality by xp. Then it sufficesto show (1 − t)p ≤ (1 − tp) for 0 ≤ t ≤ 1. As in the previous item,we consider the derivative. In this case, f(t) = (1− t)p − 1 + tp, whilef ′(t) = −p(1 − t)p−1 + tp−1, so f ′(t) > 0 when t < 1

2 and f ′(t) < 0when t > 1

2 . Clearly, the minimum of the function on the interval isattained at t = 0 or t = 1. Since f(0) = f(1) = 0, f(t) ≥ 0 whichimplies the original claim.

3. This inequality is proved using the mean value theorem. If x, y ≥ 0,(at least one is nonzero, otherwise the claim is trivial),

xp − yp

x− y= pξp−1 ≤ p (xp−1 + yp−1).

4. Let f(x) = xq/p. Then the inequality can be written as f(y)−f(x)y−x ≥

f ′(x), which is true since f is a convex function, as qp > 1. �

Page 119: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

Bibliography

[1] Jeremy Avigad and Erich Reck. “Clarifying the nature of the infinite”:the development of metamathematics and proof theory. Technical Re-port CMU-PHIL-120, Carnegie Mellon University, 2001.

[2] Jeremy Avigad and Ksenija Simic. Fundamental notions of anal-ysis in subsystems of second-order arithmetic. Draft available athttp://www.andrew.cmu.edu/∼avigad/, 2003.

[3] Patrick Billingsley. Ergodic Theory and Information. Wiley, New York,1965.

[4] G.D. Birkhoff. Proof of the ergodic theorem. Proc. Nat. Acad. Sci.,17:656–660, 1931.

[5] Errett Bishop. An upcrossing inequality with applications. MichiganMath. J., 13:1–13, 1966.

[6] Errett Bishop. Foundations of Constructive Analysis. McGraw-Hill,New York, 1967.

[7] Errett Bishop. A constructive ergodic theorem. J. Math. Mech., 17:631–639, 1967/1968.

[8] Errett Bishop and Douglas Bridges. Constructive Mathematics.Springer, Berlin, 1985.

[9] Douglas Bridges. Constructive functional analysis. Research Notes inMathematics. Pitman Publishing Limited, 1979.

[10] Douglas K. Brown. Notions of closed subsets of a complete separablemetric space in weak subsystems of second-order arithmetic. In Logicand computation (Pittsburgh, PA, 1987), pages 39–50. Amer. Math.Soc., Providence, RI, 1990.

107

Page 120: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

108 BIBLIOGRAPHY

[11] Douglas K. Brown, Mariagnese Giusto, and Stephen G. Simpson. Vi-tali’s theorem and WWKL. Arch. Math. Logic, 41(2):191–206, 2002.

[12] Nelson Dunford and Jacob T. Schwartz. Linear operators. Part I. WileyClassics Library. John Wiley & Sons Inc., New York, 1988. Spectral op-erators, With the assistance of William G. Bade and Robert G. Bartle,Reprint of the 1971 original, A Wiley-Interscience Publication.

[13] Matthew Frank. Ergodic theory in the 1930s: a study in internationalmathematical activity.Draft available at http://math.uchicago.edu/∼mfrank/erg.pdf.

[14] Harvey Friedman, Stephen G. Simpson, and Xiaokang Yu. Periodicpoints and subsystems of second-order arithmetic. Ann. Pure Appl.Logic, 62(1):51–64, 1993. Logic Colloquium ’89 (Berlin).

[15] A.M. Garsia. A simple proof of Eberhard Hopf’s maximal ergodic the-orem. J. Math. Mech., 14:381–382, 1965.

[16] Paul R. Halmos. Lectures on ergodic theory. Chelsea Publishing Co.,New York, 1960.

[17] Christopher S. Hardin and Daniel J. Velleman. The mean value theoremin second order arithmetic. J. Symbolic Logic, 66(3):1353–1358, 2001.

[18] Teturo Kamae. A simple proof of the ergodic theorem using nonstan-dard analysis. Israel J. Math., 42(4):284–290, 1982.

[19] Yitzhak Katznelson and Benjamin Weiss. A simple proof of some er-godic theorems. Israel J. Math., 42(4):291–296, 1982.

[20] J. A. Nuber. A constructive ergodic theorem. Transactions of theAmerican Mathematical Society, 164:115–137, 1972.

[21] Karl Petersen. Ergodic Theory, volume 2 of Cambridge studies in ad-vanced mathematics. Cambridge University Press, Cambridge, 1983.

[22] Marian B. Pour-El and J. Ian Richards. Computability in analysis andphysics. Perspectives in Mathematical Logic. Springer-Verlag, Berlin,1989.

[23] Frederic Riesz. Sur la theorie ergodique. Comment. Math. Helv.,17:221–239, 1945.

Page 121: Aspects of Ergodic Theory in Subsystems of …simicmka/THESIS3.pdfdifferent subsystems of second-order arithmetic. Conveniently, there is no need for a whole array of such subsystems:

BIBLIOGRAPHY 109

[24] H. L. Royden. Real analysis. Macmillan Publishing Company, NewYork, third edition, 1988.

[25] Helmut Schwichtenberg. Constructive analysis with witnesses.Draft available athttp://www.mathematik.uni-muenchen.de/∼schwicht/.

[26] Stephen G. Simpson. Subsystems of Second-Order Arithmetic. Springer,Berlin, 1998.

[27] Bas Spitters. Constructive and intuitionistic integration theory andfunctional analysis. PhD thesis, University of Nijmegen, 2002.

[28] John von Neumann. Proof of the quasi-ergodic hypothesis. Proc. Nat.Acad. Sci., 18:70–82, 1932.

[29] Peter Walters. An introduction to ergodic theory, volume 79 of GraduateTexts in Mathematics. Springer-Verlag, New York, 1982.

[30] Xiaokang Yu. Riesz representation theorem, Borel measures and subsys-tems of second-order arithmetic. Ann. Pure Appl. Logic, 59(1):65–78,1993.

[31] Xiaokang Yu. Lebesgue convergence theorems and reverse mathematics.Math. Logic Quart., 40(1):1–13, 1994.

[32] Xiaokang Yu and Stephen G. Simpson. Measure theory and weakKonig’s lemma. Arch. Math. Logic, 30(3):171–180, 1990.