Abstract Algebra Jason Juett

  • Abstract Algebra

    Dr. Jason Juett

    April 7, 2014

  • Table of Contents

    1.1: Sets and Basic Notation

    1.2: Functions

    1.3: Equivalence Relations and Partial Orders

    1.4: Well-Ordered Sets and Induction/Recursion

    1.5: Ordinal Numbers

    1.6: Cardinal Numbers

    2.1: Semigroups, Monoids, and Groups

    2.2: Subgroups and Cosets

    2.3: Homomorphisms and Isomorphisms

    2.4: Cyclic Groups

    2.5: Cauchy's Theorem and Other Assorted Facts

    2.6: Permutations

    3.1: Rings, Integral Domains, and Fields

    3.2: More Ideals

    3.3: The Construction of Z and Q

    3.4: The Construction of R

    3.5: Polynomial Rings

    3.6: The Ascending Chain Condition

    3.7: Divisibility

    Appendix to 3.7: The Euclidean Algorithm

    3.8: Unique Factorization Domains

    3.8: Cohen's Theorem

    4.1: Vector Spaces

    4.2: Field Extensions

    4.3: Splitting Fields

  • Introduction

    These notes are more or less a self-contained introductory abstract algebra course. However, while only an introduction to this subject, this course is still fairly ambitious and expects a significant amount of effort from the student who wishes to master these topics. Very little prerequisite knowledge is assumed, beyond a general level of familiarity with pre-calculus mathematics that is necessary in order to understand references made in examples. We will start with a development of set theory, then move on to discuss groups, rings, and fields. Along the way, we will carefully define and construct the familiar number systems of the integers, rational numbers, real numbers, and complex numbers. Some things that will set this course apart from the typical one are: (i) we are developing topics with a fair degree of rigor, (ii) we spend longer on background topics than usual at the start but then make up for it by moving more efficiently through the main material, and (iii) my various biases about which topics are interesting or important.

    When references are made to Durbin, I am referring to the textbook Modern Algebra: An Introduction, 6th Edition, by Durbin [1]. At the beginning of each section of the notes, I have indicated the most closely corresponding section(s) of Durbin. It is not necessary to possess or read Durbin's book, but you may find it a useful additional learning aid, because it provides a greater quantity of concrete examples than I have, and is written in a different style. Be advised, though, that Durbin and I have developed our topics in a different order, and there are some small differences in notation and definitions, so you will have to account for this if reading Durbin.

    In order to be able to do the exercises, you will need to have some grasp of how to write proofs. I suggest that you use the style in the proofs and theorems in these notes (or in Durbin) as a model of roughly how your mathematical writing should look. I have written many hints to help guide you through the more challenging exercises, and we will be working with you during the discussion sections and office hours on helping you develop your mathematical reasoning skills and your fluency in writing mathematics. Here are a few more words of advice. When one reads your mathematical writing, it should flow naturally, as if reading a normal sentence, and it is usually easier to process written words than a huge mess of symbols. For this reason, you should also use complete sentences, proper grammar, well-organized thoughts, and so on.


  • Chapter 1

    Set Theory

    During the course of this introductory chapter, we will learn the basics about sets, as well as some slightly advanced things about the foundations of set theory and mathematics, building up to a development of the ordinal and cardinal numbers, and tools such as mathematical induction and Zorn's Lemma. The study of abstract algebra proper will begin in the next chapter. While this chapter will certainly provide a foundation for and have direct applications to the algebraic structures we study later, we admittedly could go into somewhat less depth and still have enough background to achieve a basic understanding of algebraic structures. Therefore, this chapter is not only about giving you information you will specifically need for the algebra to come, but also about exposing you to some (hopefully) interesting and challenging ideas and getting you used to thinking mathematically.

    1.1 Sets and Basic Notation

    [Durbin: Appendices A and B]

    Notation.

    We will occasionally use the following logic symbols.

    ⇒: "implies" or "only if"
    ⇐: "is implied by" or "if"
    ⇔: "if and only if" or "is equivalent to"

    Definition. Roughly speaking, a set is a collection of objects. Two sets are equal if they have the same members.

    Notation.

    1. We reserve special symbols for the following sets.

    ∅: empty set (the set with no elements)
    N: set of natural numbers (including 0)
    Z: set of integers
    Q: set of rational numbers
    R: set of real numbers
    C: set of complex numbers

    We will sometimes add a superscript + (resp., ∗) to denote the modified version of that set containing only the positive (resp., nonzero) elements. (The use of "resp.", which stands for "respectively", above illustrates another convention. That sentence consists of two very similar sentences combined together to save space. To interpret such a sentence, first read it without any of the "resp." things, then read it with the substitutions indicated by the places that say "resp.".)

    2. We use ∈ to indicate that an object is a member of a set, and ∉ to indicate that it is not. For example, 2 ∈ Z and √2 ∉ Q.

    3. There are three main ways to describe a set.

    (a) We can represent a set by listing its elements inside { }, separated by commas, e.g., S = {1, 2, 5} or T = {2, 4, 6, 8, . . .}.

    (b) Alternatively, we can represent sets by stating a rule describing which elements are in the set, e.g., T = {x ∈ Z+ | x is even}, where the symbol | means "such that" in this context. (Alternatively, you may use a colon, as Durbin does.) This notation is called set builder notation.

    (c) Finally, we may simply describe a set in words, such as "the set of all odd integers".

    Definition.

    1. If every member of a set X is a member of a set Y, we say that X is a subset of Y, and write X ⊆ Y. Alternatively, we say Y is a superset of X and write Y ⊇ X. (Note that X = Y if and only if X ⊆ Y and Y ⊆ X.) If X ⊆ Y and X ≠ Y, then X is a proper subset of Y, and we may write X ⊊ Y or Y ⊋ X. (Some people use the symbols ⊂ and ⊃, but we will avoid this, because it is not universally agreed upon if these mean ⊆ and ⊇ or ⊊ and ⊋.) The power set of X is the set P(X) of subsets of X.

    2. An indexed family is a set {x_λ}_{λ∈Λ} that contains an object x_λ for each λ in the index set Λ. Note that there is no requirement here that distinct indices give distinct elements. Also, every set can be written as an indexed family, because X = {x}_{x∈X} for any set X.

    3. The union of sets X and Y is X ∪ Y = {x | x ∈ X or x ∈ Y}. More generally, if {X_λ}_{λ∈Λ} is an indexed family of sets, then ⋃_{λ∈Λ} X_λ = {x | x ∈ X_λ for some λ ∈ Λ}. The empty union is the union with an empty index set; it equals ∅.

    4. The intersection of sets X and Y is X ∩ Y = {x | x ∈ X and x ∈ Y}. More generally, if {X_λ}_{λ∈Λ} is a nonempty indexed family of sets, then ⋂_{λ∈Λ} X_λ = {x | x ∈ X_λ for every λ ∈ Λ}. The empty intersection is the intersection with an empty index set, and what this equals is a matter of convention. (Like how the expression 0⁰ is undefined in general, but in some contexts may be defined to be 0 or 1.) We say X and Y are disjoint if X ∩ Y = ∅. A collection of sets is called pairwise disjoint if each pair of distinct members of the collection are disjoint.

    5. The complement of a set Y in a set X is X \ Y = {x ∈ X | x ∉ Y}. (Note that this is a backslash.) Thus X and Y are disjoint if and only if X \ Y = X.

    Notation. We will often use notation akin to sigma-notation as a shorthand when dealing with certain index sets. For example,

    ⋃_{i=1}^{n} X_i = ⋃_{i∈{1,...,n}} X_i   and   ⋂_{j=0}^{∞} Y_j = ⋂_{j∈N} Y_j.

    The approach to set theory described so far, where a set is simply any collection of objects, is called naive set theory, and in practical applications it is usually sufficient. However, if one is creative, then inconsistencies within naive set theory can be found, such as Russell's Paradox: if R is the collection of all sets that are not a member of themselves (this collection is called the Russell class) and R is a set, then R is a member of itself if and only if it is not. In order to avoid such problems, we need to place some restrictions on which collections of objects are allowed to be sets.

    This leads us to our refined set theory, which works as follows. Every object in our mathematical universe is a class, which is a collection of objects called sets. Sets are then also classes, hence also collections of sets. But not every class is a set; the ones that are not are called proper classes. Note then that a proper class cannot be a member of a class. Russell's Paradox is now no longer a paradox, but merely a proof that R is a proper class. (And there is no analogous paradox with a "class of all classes that are not a member of themselves", because that is not a valid definition of a class.) We want all the constructions above to give sets, provided that the objects we started with were sets, so we make axioms that this is so. More explicitly:

    1. If X is a set, then any subclass of X is a set. (Hence complements in sets and intersections of arbitrary nonempty families of sets are sets.)

    2. If Λ is a set and X_λ is a set for each λ ∈ Λ, then {X_λ}_{λ∈Λ} and ⋃_{λ∈Λ} X_λ are sets.

    3. If X is a set, then so is P(X).

    These laws give us ways to create new sets or verify that certain classes are sets, but in order to do anything at all, we have to make some axiom assuming that a set exists. We will do this with the Axiom of Infinity, which, roughly speaking, asserts that N is a set. (We will discuss this in more detail later.) In view of (1), this implies that ∅ is a set. Now you may be wondering, if N is a set, and members of sets are also sets in our foundational system, then how is a number such as 5 a set? For a look at what's to come, the way that these numbers are built up from literally nothing is to define 0 = ∅, 1 = {∅}, 2 = {∅, {∅}}, 3 = {∅, {∅}, {∅, {∅}}}, and so on, with each natural number being the set containing the previous ones. (We are able to define any given natural number in this fashion starting only with ∅, but the Axiom of Infinity is necessary in order for N itself to be a set.) In fact, we will see that every object in standard mathematics is definable as a set (or a proper class at worst), so nothing is really lost by our apparently restricted point of view that every object is a class.
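    The construction 0 = ∅, 1 = {∅}, 2 = {∅, {∅}}, . . . can be imitated on a computer for small numbers. The following is only a sketch: it uses Python's frozenset as a stand-in for a set, and the function name von_neumann is our own label, not notation from these notes.

```python
def von_neumann(n: int) -> frozenset:
    """Return the set that encodes the natural number n, starting from 0 = {}."""
    current = frozenset()  # 0 is the empty set
    for _ in range(n):
        # The next number is the set of all previous numbers: k + 1 = k ∪ {k}.
        current = current | frozenset([current])
    return current


two = von_neumann(2)
three = von_neumann(3)

print(len(three))    # 3 has exactly three members: 0, 1, and 2
print(two in three)  # True: 2 is a member of 3
print(two < three)   # True: 2 is also a proper subset of 3
```

    Each encoded number n has exactly n members, namely the encodings of 0, . . . , n − 1, which matches the description "the set containing the previous ones".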

    Remark.

    1. Most normal math that one would do does not necessitate dealing with proper classes, so throughout these notes I will generally frame our definitions in terms of sets rather than classes, and in those cases when we do wish to apply those terms to classes, I will trust that what is meant is clear.

    2. Proper classes can intuitively be thought of as classes that are "too big" to be a set. We will eventually see that a class is proper if and only if it is the same size as the class V of all sets. (We will make this more precise later.)

    3. This course (like almost all math courses) assumes the Global Axiom of Choice: Given any class of nonempty sets, there is a way to simultaneously choose one element from each set. (This is the "global" version because it allows the collection of sets to be a class. We will state the Global Axiom of Choice more precisely in the next section.) This statement is provable when there are only finitely many sets involved; it is the case where there are infinitely many when this cannot be proven and we need to take it as an axiom. In some specialized parts of mathematical logic, the (Global) Axiom of Choice is not assumed, but it is such an intuitive assumption that it is quite easy to use it without even realizing it, so we will use it freely without necessarily mentioning it each time.

    4. We will also use the Axiom of Regularity: Every nonempty set has a member that is disjoint from it, i.e., there is no nonempty set X each of whose members shares a member with X. This is actually a rather natural assumption, since we would not even be able to properly describe such a set anyway: in attempting to describe what one of its members was, you would be referred to one of the other members that was in that member, then in describing that second member, you would be referred to a third, and so on forever, without ever achieving a real description of what the members of your set actually are. In an exercise, you will see that a consequence of the Axiom of Regularity is that no set can be a member of itself, and thus V = R. (This is one explicit way to show that V is a proper class, but even without the Axiom of Regularity it would still be true, since any class containing a proper class must be proper.) Another exercise will show that, if you are given a set, then you pick a member of that set, then you pick a member of that member, and continue doing this, then in a finite number of steps you will always reach ∅.

    Exercises.

    1. Let A and X be sets. Show that X \ (X \ A) ⊆ A, and that equality holds if and only if A ⊆ X. (Hint: It suffices to prove that X \ (X \ A) = X ∩ A.)

    2. (Distributive Law) Show that, if {X_λ}_{λ∈Λ} and {Y_γ}_{γ∈Γ} are indexed families of sets, then (⋃_{λ∈Λ} X_λ) ∩ (⋃_{γ∈Γ} Y_γ) = ⋃_{λ∈Λ, γ∈Γ} (X_λ ∩ Y_γ).

    3. (De Morgan's Laws) Show that, if X is a set and {A_λ}_{λ∈Λ} is a nonempty family of sets, then X \ ⋃_{λ∈Λ} A_λ = ⋂_{λ∈Λ} (X \ A_λ) and X \ ⋂_{λ∈Λ} A_λ = ⋃_{λ∈Λ} (X \ A_λ).

    4. (a) Show that there is no set that is a member of itself. (Hint: Suppose X ∈ X. Note why {X} is a set, and then apply the Axiom of Regularity to {X} to get a contradiction.)

    (b) Show that there is no infinite sequence {X_n}_{n=0}^{∞} of sets with each X_{n+1} ∈ X_n. (Hint: Show that this sequence is a set that violates the Axiom of Regularity.)

    (c) Give an example of a set X_0 such that, for each N ∈ Z+, there is a sequence {X_n}_{n=0}^{N} with each X_{n+1} ∈ X_n. (This shows that, while no such sequence can go on forever, it may be the case that there is no upper bound on the length of such finite sequences.)

    5. A class is transitive if each of its members is a subset of it. We will denote the class of transitive sets by TR.

    (a) Prove that a class T is transitive ⇔ ⋃_{t∈T} t ⊆ T ⇔ a ∈ T whenever a ∈ b and b ∈ T. (This explains the name "transitive".)

    (b) Let X be a transitive set. Prove that P(X) is transitive. (Hint: The definition of transitive can be rephrased as "T is transitive ⇔ T ⊆ P(T)".)

    (c) Prove that every nonempty transitive class has ∅ as a member. (Hint: Use the Axiom of Regularity.)

    (d) Show that TR is not transitive, i.e., that members of transitive sets are not necessarily transitive. (Hint: The smallest possible counterexample has three members.)

    (e) Let {T_λ}_{λ∈Λ} be a family of transitive sets. Prove that ⋃_{λ∈Λ} T_λ and ⋂_{λ∈Λ} T_λ are transitive. (For the latter statement, assume Λ ≠ ∅.)

    1.2 Functions

    [Durbin: Sections 1-2]

    Definition.

    1. A function (or mapping or map) from a set X (called the domain) into a set Y (called a codomain) is a correspondence that assigns to each x ∈ X a unique element of Y, denoted f(x) and called the image of x. (Note that, although maps are often defined by a formula, this need not be the case. Also, some sources use the word "range" for codomain, but some use it to mean something else, so we avoid the word entirely.) We indicate that f is a function from X into Y by writing f : X → Y or X -f→ Y. We denote the set of maps X → Y by M(X, Y), and abbreviate M(X, X) = M(X).

    2. Let f : X → Y. For A ⊆ X, the image of A is f[A] = {f(a) | a ∈ A}; we say f maps A onto f[A]. It is also common to write f(A) instead of f[A] (as Durbin does), but we will occasionally encounter situations where both A ⊆ X and A ∈ X, so we have adopted the latter notation in order to avoid any possible confusion.

    3. For B ⊆ Y, the pre-image of B is f⁻¹[B] = {x ∈ X | f(x) ∈ B}. (This notation is defined even if f does not have an inverse function.)

    4. If f : X → Y and g : Z → W, then we say f = g if X = Z and f(x) = g(x) for each x ∈ X. Note that with our definition there are multiple different possible codomains for a function f : X → Y; any set containing f[X] will do. (This is different from Durbin's convention, but has certain advantages, since we are really only concerned with the domain and the correspondence, not the rather arbitrary choice of codomain.)

    Example. f : R → R : x ↦ x² is a function. (We use this notation as a shorthand to indicate that f is given by the formula f(x) = x².)

    1. f[R] = f[[0, ∞)] = f[(−∞, 0]] = [0, ∞).

    2. f[(0, 2)] = (0, 4).

    3. f⁻¹[(0, 4)] = (−2, 0) ∪ (0, 2).

    Example. If Y is a set, then there is a unique function ∅ → Y, called the empty function, which is the function that makes no assignments. However, there are no functions from a nonempty set into ∅.

    Remark.

    1. Functions from classes into classes can be defined in an analogous way. (Though M(X, Y) does not exist if X is a proper class, because in the technical definition of a function as a class, a function is a set if and only if its domain is. See exercises.)

    2. The indexed family of sets {X_λ}_{λ∈Λ} can be identified with the function λ ↦ X_λ, so the notion of an indexed family is thus just a notational convenience. Along these lines, the axiom that a family of sets indexed by a set is a set can be rephrased as: the image of a set under a function is a set. However, pre-images of sets are not necessarily sets (if the function's domain is a proper class). For example, if f : V → N is any function, then f⁻¹[N] = V.
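    For a function on a finite set, the image f[A] and pre-image f⁻¹[B] defined in this section can be computed directly from their definitions. The sketch below is an illustration only: the helper names image and preimage are ours, and the domain is restricted to the integers from −3 to 3 so that everything is finite.

```python
def image(f, A):
    """f[A] = {f(a) | a in A}."""
    return {f(a) for a in A}


def preimage(f, X, B):
    """f^{-1}[B] = {x in X | f(x) in B}, where X is the (finite) domain."""
    return {x for x in X if f(x) in B}


def square(x):
    return x * x


X = set(range(-3, 4))  # finite stand-in for the domain

print(image(square, X) == {0, 1, 4, 9})                 # True
print(preimage(square, X, {1, 4}) == {-2, -1, 1, 2})    # True
print(preimage(square, X, {2, 3}) == set())             # True: nothing squares to 2 or 3
```

    Note that the pre-image is computed without any inverse function, in line with the comment that the notation f⁻¹[B] is defined even when f has no inverse.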

    Definition.

    1. If X ⊆ Y, the inclusion map ι : X → Y is given by ι(x) = x. If we wish to think of this map as a map X → X, we refer to it as the identity map, and denote it id. If there is any danger of confusion about to which inclusion or identity map we are referring, we may add subscripts, e.g., ι_X or id_X.

    2. If f : X → Y and g : Y → Z, then the composition of g with f is the function g ∘ f : X → Z : x ↦ g(f(x)). Note that function composition is associative, i.e., if f : X → Y, g : Y → Z, and h : Z → W, then (h ∘ g) ∘ f = h ∘ (g ∘ f). (Exercise.)

    3. If A ⊆ X and f : X → Y, then the restriction of f to A is the function f|_A = f ∘ ι_A, i.e., f|_A : A → Y : x ↦ f(x). In this case, we say the former function is an extension of the latter to X.

    Definition. The Cartesian product (or direct product) of an indexed family {X_λ}_{λ∈Λ} of sets is the set ∏_{λ∈Λ} X_λ of all functions f : Λ → ⋃_{λ∈Λ} X_λ with each f(λ) ∈ X_λ.

    1. For n ∈ Z+, the Cartesian product X_1 × · · · × X_n = ∏_{i=1}^{n} X_i can be considered as the set of all ordered n-tuples (x_1, . . . , x_n) with each x_i ∈ X_i, by associating each f ∈ ∏_{i=1}^{n} X_i with the n-tuple (f(1), . . . , f(n)).

    2. Similarly, the Cartesian product ∏_{i=1}^{∞} X_i can be thought of as the set of sequences (x_1, x_2, . . .) with each x_i ∈ X_i.

    3. The empty Cartesian product by definition consists only of the empty function ∅. We can represent this as a 0-tuple: ∏_{λ∈∅} X_λ = {()}.

    4. When writing functions with an input of an ordered n-tuple or sequence, we use the abbreviations f((x_1, . . . , x_n)) = f(x_1, . . . , x_n) and f((x_1, x_2, . . .)) = f(x_1, x_2, . . .).
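    The definition of a Cartesian product as a set of functions on the index set can be made concrete for finite families. In the sketch below, which is only an illustration, a family is modeled as a Python dict from indices to sets and each member of the product is a dict sending every index to a chosen element; the function name cartesian_product is ours.

```python
from itertools import product


def cartesian_product(family: dict) -> list:
    """All functions f (as dicts) on the index set with f[i] in family[i]."""
    indices = list(family)
    return [
        dict(zip(indices, choice))
        for choice in product(*(family[i] for i in indices))
    ]


family = {1: {"a", "b"}, 2: {0, 1, 2}}
functions = cartesian_product(family)

# Each function f corresponds to the ordered pair (f(1), f(2)).
pairs = {(f[1], f[2]) for f in functions}

print(len(functions))         # 6 functions, one per ordered pair
print(("a", 2) in pairs)      # True
print(cartesian_product({}))  # [{}]: only the empty function
```

    If some member of the family is empty, the list returned is empty, which is the easy direction of the Axiom of Choice as stated in these notes.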

    Remark. We now have the terminology to more precisely state the two forms of the Axiom of Choice.

    1. Axiom of Choice: ∏_{λ∈Λ} X_λ = ∅ if and only if some X_λ = ∅.

    2. Global Axiom of Choice: There is a g : V \ {∅} → V with each g(x) ∈ x.

    (It is a good idea to ponder for a moment how these two statements represent "choice".) Recall that we have agreed to freely use these axioms without necessarily mentioning that we are doing so.

    Definition.

    1. A function f is injective (or one-to-one or an injection) if f(x) = f(y) ⇒ x = y.

    2. A surjection X → Y is a function f : X → Y with f[X] = Y.

    3. A bijection (or one-to-one correspondence) X → Y is an injection with f[X] = Y.

    4. In the phrases "surjection X → Y" and "bijection X → Y", we may omit the "X → Y" if it is clear from context what is meant, e.g., "let f : X → Y be a surjection".

    Example.

    1. f : R → R : x ↦ x² is neither an injection nor a surjection.

    2. f : R → R : x ↦ e^x is injective but not a surjection.

    3. f : R → R : x ↦ x³ − 3x is a non-injective surjection.

    4. f : R+ → R : x ↦ ln x is a bijection.

    Definition. Let f : X → Y be a function. A function g : Y → X is a left (resp., right) inverse function of f if g ∘ f = id_X (resp., f ∘ g = id_Y). We say g is an inverse function of f if it is both a left and a right inverse function of f. If a function f has an inverse, then the inverse is unique (we will prove this shortly), and we denote the inverse function by f⁻¹. (Note that in this case we have (f⁻¹)⁻¹ = f.)

    Example.

    1. The identity function on any set is its own inverse.

    2. The functions exp : R → R+ and ln : R+ → R are inverses.

    3. The empty function ∅ → ∅ is its own inverse.

    4. Consider the functions sin : R → [−1, 1] and arcsin : [−1, 1] → R. We have sin ∘ arcsin = id_{[−1,1]}, so sin is a left inverse of arcsin, and arcsin is a right inverse of sin, but the two functions are not inverses, because arcsin(sin π) = 0 ≠ π. However, one can modify the domains/codomains so that the functions are inverses: sin : [−π/2, π/2] → [−1, 1] and arcsin : [−1, 1] → [−π/2, π/2].
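    The sin/arcsin example can be checked numerically. This is a floating-point sketch using Python's standard math module, so equalities hold only up to rounding error; the helper name left_inverse_defect is ours.

```python
import math


def left_inverse_defect(x: float) -> float:
    """|arcsin(sin x) - x|, which is near 0 when arcsin undoes sin at x."""
    return abs(math.asin(math.sin(x)) - x)


# sin ∘ arcsin acts as the identity on [-1, 1] (up to rounding).
print(math.isclose(math.sin(math.asin(0.3)), 0.3))  # True

# arcsin ∘ sin is not the identity on R: at x = pi the result is about 0, not pi.
print(left_inverse_defect(math.pi) > 3)             # True

# Inside [-pi/2, pi/2] the two functions do undo each other.
print(left_inverse_defect(1.0) < 1e-12)             # True
```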

    Theorem 1. Let f : X → Y.

    1. f is injective ⇔ it has a left inverse Y → X or X = ∅.

    2. f is a surjection ⇔ it has a right inverse Y → X. Hence a right inverse function of f must have domain f[X].

    3. If f has a left inverse g : Y → X and a right inverse h : Y → X, then g = h and Y = f[X]. In particular, inverse functions are unique when they exist.

    4. f is a bijection ⇔ it has a left and a right inverse Y → X ⇔ it has an inverse Y → X.

    5. f has an inverse ⇔ it has an inverse f[X] → X ⇔ it is injective.

    Proof.

    1. (⇒): Assume f is injective and X ≠ ∅. Define g : Y → X so that, for each y ∈ f[X], g(y) is the unique element of X with f(g(y)) = y (for y ∉ f[X], let g(y) be any fixed element of X). Then for each x ∈ X we have g(f(x)) = x, so g ∘ f = id_X. (⇐): If f has a left inverse g, then f(x) = f(y) ⇒ x = g(f(x)) = g(f(y)) = y.

    2. (⇒): Assume f[X] = Y. For each y ∈ Y, define g(y) to be an element such that f(g(y)) = y. Then f ∘ g = id_Y. (⇐): If f has a right inverse g : Y → X, then for each y ∈ Y we have f(g(y)) = y, and hence f[X] = Y.

    3. In this case, we have g = g ∘ id_Y = g ∘ (f ∘ h) = (g ∘ f) ∘ h = id_X ∘ h = h.

    4. The second equivalence follows from (3). The (⇐) case of the first equivalence is immediate from (1) and (2), and, if the domain is nonempty, so is the (⇒) case. For the remaining case, assume f : ∅ → Y is a bijection. Then Y = f[∅] = ∅, and f is its own inverse.

    5. The first equivalence follows from (2), and the second follows from (4).
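    For maps between finite sets, the constructions in parts (1) and (2) of the proof can be carried out explicitly. The following sketch models a function as a Python dict; the names left_inverse and right_inverse and the default argument are our own choices, and the "choice" in the surjective case is simply the first pre-image encountered.

```python
def left_inverse(f: dict, codomain: set, default):
    """For injective f with nonempty domain, build g with g[f[x]] == x.

    `default` must be an element of the domain; it is the arbitrary value
    assigned to points of the codomain outside the image f[X].
    """
    if len(set(f.values())) != len(f):
        raise ValueError("f is not injective")
    g = {y: default for y in codomain}
    for x, y in f.items():
        g[y] = x  # the unique x with f(x) = y
    return g


def right_inverse(f: dict, codomain: set):
    """For surjective f, build g with f[g[y]] == y by choosing one pre-image."""
    if set(f.values()) != set(codomain):
        raise ValueError("f is not a surjection onto the given codomain")
    g = {}
    for x, y in f.items():
        g.setdefault(y, x)  # keep the first pre-image found for each y
    return g


f = {1: "a", 2: "b"}                       # injective, not onto {"a", "b", "c"}
g = left_inverse(f, {"a", "b", "c"}, default=1)
print(all(g[f[x]] == x for x in f))        # True: g ∘ f = id_X

h = {1: "a", 2: "a", 3: "b"}               # surjective, not injective
r = right_inverse(h, {"a", "b"})
print(all(h[r[y]] == y for y in r))        # True: h ∘ r = id_Y
```

    The need for the default value is exactly why part (1) of the theorem treats X = ∅ separately: with an empty domain there is nothing to send the extra points of Y to.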

    Remark. It follows that there is a bijection X → Y if and only if there is a bijection Y → X. In this case, we say that X and Y are in one-to-one correspondence.

    Theorem 2. Let f : X → Y and g : Y → Z.

    1. g ∘ f is injective ⇔ f and g|_{f[X]} are.

    2. g ∘ f is a surjection ⇔ g|_{f[X]} is.

    3. g ∘ f is a bijection ⇔ f is injective and g|_{f[X]} is a bijection.

    Proof.

    1. (⇒): Assume g ∘ f is injective. Then f(x_1) = f(x_2) ⇒ g(f(x_1)) = g(f(x_2)) ⇒ x_1 = x_2, so f is injective. Also, if g(y_1) = g(y_2) for some y_1, y_2 ∈ f[X], then there are x_1, x_2 ∈ X with f(x_1) = y_1 and f(x_2) = y_2, so g(f(x_1)) = g(y_1) = g(y_2) = g(f(x_2)), hence x_1 = x_2 and y_1 = f(x_1) = f(x_2) = y_2. Therefore g|_{f[X]} is injective. (⇐): If X ≠ ∅ and f and g|_{f[X]} are injective, then they have left inverses f′ : Y → X and g′ : Z → Y, respectively, and (f′ ∘ g′) ∘ (g ∘ f) = (f′ ∘ g′) ∘ ((g|_{f[X]}) ∘ f) = id_X, so g ∘ f is injective. On the other hand, if X = ∅, then f = g|_{f[X]} = g ∘ f is the empty function, which is injective.

    2. This follows from the observation that (g ∘ f)[X] = g[f[X]] = (g|_{f[X]})[f[X]].

    3. Follows from (1) and (2).

    Exercises.

    1. Prove that function composition is associative.

    2. Let f : X → Y, A_1, A_2 ⊆ X, B_1, B_2 ⊆ Y, {A_λ}_{λ∈Λ} be an indexed family of subsets of X, and {B_λ}_{λ∈Λ} be an indexed family of subsets of Y. Correctly replace the question marks with either ⊆, ⊇, or =. If you use one of the former two symbols, give an example where the inclusion is proper.

    (a) f[f⁻¹[B_1]] ? B_1.

    (b) f⁻¹[f[A_1]] ? A_1.

    (c) f[⋃_{λ∈Λ} A_λ] ? ⋃_{λ∈Λ} f[A_λ].

    (d) f[⋂_{λ∈Λ} A_λ] ? ⋂_{λ∈Λ} f[A_λ] (here Λ ≠ ∅).

    (e) f[A_1 \ A_2] ? f[A_1] \ f[A_2].

    (f) f⁻¹[⋃_{λ∈Λ} B_λ] ? ⋃_{λ∈Λ} f⁻¹[B_λ].

    (g) f⁻¹[⋂_{λ∈Λ} B_λ] ? ⋂_{λ∈Λ} f⁻¹[B_λ] (here Λ ≠ ∅).

    (h) f⁻¹[B_1 \ B_2] ? f⁻¹[B_1] \ f⁻¹[B_2].

    3. Let X and Y be classes. Define (x, y)_K = {{x}, {x, y}} for x ∈ X and y ∈ Y, and define X ×_K Y = {(x, y)_K | x ∈ X, y ∈ Y}. (The subscript K is to distinguish these from our definitions of the analogous concepts, where the K is in honor of these versions' inventor Kuratowski.) Our official definition of a function f : X → Y is an object of the form {(x, f(x))_K | x ∈ X}, where each f(x) ∈ Y. (Intuitively, we are defining a function to be its graph.)

    (a) Show that (a, b)_K = (c, d)_K ⇔ a = c and b = d. (Hint: You will probably need to break this down into at least a couple cases.)

    (b) Show that the above definition is equivalent to the more informal one given at the beginning of the section. (That is, show that two functions are equal under one definition if and only if they are equal under the other.)

    (c) Prove that f : X → Y is a set if and only if X is. (Hint: Show that they can each be written as a family indexed by the other.)

    (d) Let X be a set. Prove that M(X, Y) is a set if and only if Y is a set or X = ∅. (Hint: If X ≠ ∅, then Y can be indexed by the constant functions. If Y is a set, then show that X ×_K Y ⊆ P(P(X ∪ Y)) and M(X, Y) ⊆ P(X ×_K Y).)

    (e) Let {X_λ}_{λ∈Λ} be a family of sets indexed by a set. Show that ∏_{λ∈Λ} X_λ is a set.

    (Note: Even though this is our official definition of a function, we will not explicitly use it again, due to its incredible unwieldiness.)

    4. Show that the following are equivalent for a map f : X → Y.

    (a) f is a surjection.

    (b) f[f⁻¹[B]] = B for each B ⊆ Y.

    (c) f⁻¹[B] ⊊ f⁻¹[C] for each B ⊊ C ⊆ Y.

    5. Show that the following are equivalent for a map f : X → Y.

    (a) f is injective.

    (b) f⁻¹[f[A]] = A for each A ⊆ X.

    (c) f[⋂_{λ∈Λ} A_λ] = ⋂_{λ∈Λ} f[A_λ] for each indexed family {A_λ}_{λ∈Λ} of subsets of X (with Λ ≠ ∅).

    (d) f[A] ⊊ f[B] for each A ⊊ B ⊆ X.

    6. Let X and Y be classes, with X ≠ ∅. Show that there is an injection X → Y if and only if there is a surjection Y → X.

    1.3 Equivalence Relations and Partial Orders

    [Durbin: Sections 16 and 63]

    Definition.

    1. A relation on a set X is a subset ∼ of X × X. We write a ∼ b if (a, b) ∈ ∼, and otherwise we write a ≁ b.

    2. If ∼ is a relation on a set X and A ⊆ X, then the restriction of ∼ to A is ∼_A = (A × A) ∩ ∼. In other words, the relation ∼_A is defined by a ∼_A b ⇔ a ∼ b. (In the future, when we define a relation on a set, and then refer to it as a relation on some subset, what we are technically referring to is the restriction of that relation to that subset.) In this case, we say the former relation is an extension of the latter to X.

    3. A relation ∼ on a set X is:

    (a) reflexive if x ∼ x for all x ∈ X,

    (b) irreflexive if x ≁ x for all x ∈ X,

    (c) symmetric if x ∼ y ⇒ y ∼ x,

    (d) antisymmetric if x ∼ y and y ∼ x ⇒ x = y, and

    (e) transitive if x ∼ y and y ∼ z ⇒ x ∼ z.

    4. An equivalence relation is a reflexive, symmetric, and transitive relation.

    5. If ∼ is an equivalence relation on a set X, then the equivalence class of an element x ∈ X is [x] = {a ∈ X | a ∼ x} = {a ∈ X | x ∼ a}. (If necessary, we will add a subscript to avoid ambiguity, e.g., [x]_∼.) We denote the set of equivalence classes of ∼ by X/∼.

    Example.

    1. If X is any set, then = is an equivalence relation on X. The equivalence classes are the singleton subsets {x}.

    2. The relation "is in one-to-one correspondence with" is an equivalence relation on V. The equivalence classes consist of sets that are the same "size".

    3. For each n ∈ Z, congruence modulo n is an equivalence relation on Z. (Recall that a, b ∈ Z are congruent modulo n, written a ≡_n b or a ≡ b (mod n), if n | (b − a). Recall also that for a, b ∈ Z, we say a divides b, and write a | b, if b is a multiple of a.) This is not hard to verify directly, but we will prove it as a special case of a more general theorem later.

    4. The only relation on ∅ is the empty relation ∅, which is an equivalence relation. There are no equivalence classes.

    Proposition 3. Let ∼ be an equivalence relation on a class X. The following are equivalent for x, y ∈ X.

    1. [x] = [y].

    2. x ∼ y.

    3. x ∈ [y].

    4. y ∈ [x].

    5. [x] ∩ [y] ≠ ∅.

    Proof. Exercise.

    Definition. A partition of a set X is a collection P of pairwise disjoint nonempty subsets of X such that X = ⋃_{A∈P} A.

    Remark.

    1. If X is a nonempty set, then a collection P of subsets of X is a partition if and only if each element of X is a member of exactly one set in P.

    2. The only partition of ∅ is the empty partition ∅.

    Theorem 4. Let X be a set. Then the set of equivalence relations on X is in one-to-one correspondence with the set of partitions of X, via ∼ ↦ {[x]_∼}_{x∈X}. The inverse map is P ↦ ∼_P, where x ∼_P y ⇔ there is an A ∈ P with x, y ∈ A.

    Proof. We need to show three things: (1) the set described in the second sentence is a partition, (2) the relation described in the last sentence is an equivalence relation, and (3) the two maps described are inverses.

    (1) Let ∼ be an equivalence relation on X. Since each x ∈ [x], we have X = ⋃_{x∈X} [x], and {[x]}_{x∈X} is a pairwise disjoint collection of sets by Proposition 3, as desired.

    (2) Let P be a partition of X. The relation ∼_P is reflexive since each x ∈ X is a member of some A ∈ P, and the fact that ∼_P is symmetric is clear. It only remains to show transitivity. Assume x ∼_P y and y ∼_P z. Then there are A, B ∈ P with x, y ∈ A and y, z ∈ B. Because the elements of P are pairwise disjoint and y ∈ A ∩ B, we have A = B. So x, z ∈ A and x ∼_P z.

    (3) We need to show that ∼ = ∼_{{[x]_∼}_{x∈X}} and P = {[x]_{∼_P}}_{x∈X} for each equivalence relation ∼ on X and each partition P of X. The former equation states that two elements are ∼-related if and only if there is an equivalence class of ∼ in which they are both members; this follows from Proposition 3. The latter equation can be phrased as "the equivalence classes of ∼_P are the members of P", which is clear.
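    The two maps in Theorem 4 can be tried out on a small finite set. The sketch below is an illustration only, using congruence modulo 3 on {0, . . . , 9}; the function names classes and relation_from_partition are ours, and a relation is represented as a two-argument Python predicate rather than as a set of pairs.

```python
def classes(X, related):
    """The partition {[x] | x in X} for an equivalence relation `related` on X."""
    return {frozenset(a for a in X if related(a, x)) for x in X}


def relation_from_partition(P):
    """The relation x ~_P y: some member A of P contains both x and y."""
    return lambda x, y: any(x in A and y in A for A in P)


X = set(range(10))


def mod3(a, b):
    return (b - a) % 3 == 0


P = classes(X, mod3)
back = relation_from_partition(P)

print(sorted(sorted(A) for A in P))
# [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]

# Going relation -> partition -> relation returns the original relation.
print(all(back(a, b) == mod3(a, b) for a in X for b in X))  # True
```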

    Remark. We can state a version of the above theorem for classes as follows. If ∼ is an equivalence relation on a class X, then every element of X is a member of exactly one equivalence class.

    Definition.

    1. A partial order is a reflexive, antisymmetric, and transitive relation.

    2. We say x and y are comparable with respect to a partial order ≤ if x ≤ y or y ≤ x; otherwise, they are incomparable. A total order or linear order is a partial order for which every pair of elements are comparable.

    3. A partially (resp., totally) ordered set is a set together with a partial (resp., total) order on it. (Formally, we define a partially ordered set as an ordered pair (X, ≤), where X is a set and ≤ is a partial order on it, but we will often abbreviate this as simply X if there is no danger of confusion.) Sometimes a partially ordered set is called a poset for short. A totally ordered subset of a partially ordered set is called a chain.

    Example.

    1. ≤ and ≥ are total orders on R.

    2. Any set X is partially ordered by =. This is a total order if and only if X has at most one element.

    3. The relations ⊆ and ⊇ are partial orders on V. They are not total. The relation ⊆ is often referred to as the inclusion relation.

    4. The relation | is a partial order on N but not a total order. The chains are those sequences (either finite or infinite) where each term is a multiple of the previous one.

    5. If ≤ is a partial order, then the reverse partial order is the partial order ≥ given by a ≥ b ⇔ b ≤ a. Note that ≤ is the reverse partial order of ≥.

    Definition.

    1. A strict partial order is an irreflexive and transitive relation.

    2. If < is a partial order, then its corresponding strict partial order is thepartial order > given by a > b b < a.

    3. Given a partial order , its corresponding strict partial order is the relation< given by a < b a b and a 6= b. Conversely, given a strict partialorder

    Definition. Let $(X, \le)$ be a partially ordered set.

    1. We say $m \in X$ is maximal (resp., minimal) if $x \ge m$ (resp., $x \le m$) $\Rightarrow x = m$.

    2. A maximum (resp., minimum) element of a partially ordered set $X$ is an element $m$ such that $m \ge x$ (resp., $m \le x$) for all $x \in X$.

    3. If $x < y$ and there is no $z \in X$ with $x < z < y$, then we say $x$ is a predecessor of $y$, and $y$ is a successor of $x$. In a totally ordered set, successors and predecessors are unique when they exist. (Exercise.) When an element $x$ has a unique successor (resp., predecessor), we denote it by $s(x)$ (resp., $p(x)$).

    Example.

    1. It is important to understand the distinction between "maximal" and "maximum."

    (a) "Maximum" means that it is the largest element, but "maximal" simply means that there are no larger elements. Thus, in a totally ordered set, the notions of maximal and maximum are the same.

    (b) A partially ordered set can have at most one maximum element, but may have arbitrarily many maximal elements. For example, if $X$ is a set, then every element of $X$ is maximal with respect to $=$.

    The above comments hold with "maximal" and "maximum" replaced with "minimal" and "minimum," respectively.

    2. $\mathbb{Z}$, $\mathbb{Q}$, and $\mathbb{R}$ with the usual orders have no maximal or minimal elements.

    3. Let $X$ be a set. Then $(\mathcal{P}(X), \subseteq)$ has maximum element $X$ and minimum element $\emptyset$.

    4. $(\mathbb{N}, |)$ has minimum element 1 and maximum element 0. If you remove those two elements, then the minimal elements are the primes and there are no maximal elements.
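    The distinction between maximal and maximum elements can be checked by brute force on a small finite poset. The following Python sketch is only an illustration: the function names (is_partial_order, maximal_elements, maximum_element, divides) are invented for this example, and the finite ranges are truncations of $(\mathbb{N}, |)$ chosen so that the search terminates.

```python
from itertools import product

def is_partial_order(elements, leq):
    """Check reflexivity, antisymmetry, and transitivity of leq on a finite set."""
    elements = list(elements)
    reflexive = all(leq(a, a) for a in elements)
    antisymmetric = all(a == b or not (leq(a, b) and leq(b, a))
                        for a, b in product(elements, repeat=2))
    transitive = all(leq(a, c) or not (leq(a, b) and leq(b, c))
                     for a, b, c in product(elements, repeat=3))
    return reflexive and antisymmetric and transitive

def maximal_elements(elements, leq):
    """Elements m such that m <= x forces x == m (nothing is strictly larger)."""
    elements = list(elements)
    return [m for m in elements if all(x == m or not leq(m, x) for x in elements)]

def maximum_element(elements, leq):
    """The element m with x <= m for all x, or None if no such element exists."""
    elements = list(elements)
    for m in elements:
        if all(leq(x, m) for x in elements):
            return m
    return None

def divides(a, b):
    """a | b on the natural numbers (every natural number divides 0)."""
    return b == 0 if a == 0 else b % a == 0

numbers = range(2, 13)  # {2, ..., 12}: a finite piece of N with 0 and 1 removed
print(is_partial_order(numbers, divides))
print(maximal_elements(numbers, divides))   # several maximal elements
print(maximum_element(numbers, divides))    # but no maximum element
print(maximum_element(range(0, 13), divides))
```

    In the truncated set {2, ..., 12} there are several maximal elements only because multiples beyond 12 have been cut off; in all of $\mathbb{N}$ with 0 and 1 removed there are none, as stated above.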

    Definition.

    1. A map $f : X \to Y$ between partially ordered sets is an order embedding if $f(a) \le f(b) \Leftrightarrow a \le b$. Order embeddings are injective. (Exercise.)

    2. An order isomorphism between partially ordered sets is an order embedding of one onto the other. (Important: Note the use of "onto" here.) We say a partially ordered set $X$ is order isomorphic to a partially ordered set $Y$ if there is an order isomorphism $f : X \to Y$. In an exercise you will show that "order isomorphic to" is an equivalence relation on the class of partially ordered sets.

    Remark. If two partially ordered sets are order-isomorphic, then, as far as partial order properties go, we can think of these partially ordered classes as being the same, except merely for their elements being renamed. For example, the order-isomorphic sets $\{0, 1, 2\}$ and $\{-10, 6, 105\}$ (with the usual orders) have the same partial order properties, and we can think of the latter partially ordered set being the same as the former, but with 0 renamed to $-10$, 1 renamed to 6, and 2 renamed to 105. Thus, if two partially ordered sets are order-isomorphic, then any sort of partial order property of one set is also true for the other set, with the appropriate substitutions made if the property specifically names elements. On a similar note, if there is an order embedding $f : X \to Y$, we may think of $f[X]$ as being a copy of $X$ contained in $Y$ that possesses all the same partial order properties as $X$.

    Exercises.

    1. Prove Proposition 3.

    2. Let $<$ be a strict partial order. Prove that $a < b \Rightarrow b \not< a$.

    3. Prove that, in a totally ordered set, an element can have at most one successor and at most one predecessor.

    4. (a) Prove that order embeddings are injective.

    (b) Prove that a map $f : X \to Y$ between totally ordered classes is an order embedding $\Leftrightarrow$ $f(x) < f(y)$ whenever $x < y$.

    5. Show that "is order isomorphic to" is an equivalence relation on the class of partially ordered sets.

    1.4 Well-Ordered Sets and Induction/Recursion

    [Durbin: Appendix C]

    Definition. A well-ordered set is a totally ordered set in which every nonempty subset has a minimum element. A well-ordering on a set is a partial order with respect to which it is well-ordered. Note that every non-maximum element of a well-ordered set has a unique successor, and that every subset of a well-ordered set is well-ordered.

    Example. All sets in this example are given their usual orders.

    1. N is well-ordered.

    2. Q+ is not well-ordered, since it is non-empty and has no least element.

    3. $\mathbb{Q}^+ \cup \{0\}$ is also not well-ordered, because even though it has a minimum element, it has a nonempty subset $\mathbb{Q}^+$ that does not.

    4. $\emptyset$ is well-ordered (by the empty relation). It has no non-empty subsets, so it is vacuously true that every non-empty subset has a minimum element.

    Theorem 6 (Principle of Induction). Let $X$ be a nonempty well-ordered class, and for each $x \in X$ let $P(x)$ be a statement about $x$. Then $P(x)$ is true for all $x \in X$ if and only if the following two statements hold.

    1. (Base Case) $P(a)$ is true for the minimum element $a$ of $X$.

    2. (Inductive Step) If $b > a$ and $P(x)$ is true for all $x < b$, then $P(b)$ is true.

    Proof. ($\Rightarrow$): Clear. ($\Leftarrow$): By contrapositive. Assume that $P(b)$ is false for some $b \in X$. We can choose a minimum such $b$, and by minimality $P(x)$ is true for all $x < b$. Therefore (1) and (2) cannot both hold.

    Remark.

    1. The most common usage of induction is applying it to the well-ordered set $X = \mathbb{Z}^+$ to show that a statement is true for all positive integers.

    2. Statements (1) and (2) could equivalently be combined into "If $b \in X$ and $P(x)$ is true for all $x < b$, then $P(b)$ is true," because if this statement holds, then $P(a)$ must be true. However, in practice it is often simplest to verify the cases $b = a$ and $b > a$ separately, which is why induction is usually formulated as above.

    Example. It is essential that you are comfortable reading and writing proofs by induction, so I will explain in some detail how this is done. For an example, we prove the summation formula $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$. I will first write the proof in a manner that explicitly makes reference to the Principle of Induction, so you can see how the theorem is being used, and then I will rewrite it in an abbreviated form that is more like how mathematicians write in practice. (You can decide which method of writing suits you best.)

    1. (Explicit use of Principle of Induction:) For each $n \in \mathbb{N}$, let $P(n)$ be the statement: $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$. We will show by induction that $P(n)$ is true for all $n \in \mathbb{N}$. For the base case, we have $\sum_{k=1}^{0} k = 0 = \frac{0(0+1)}{2}$, so $P(0)$ is true. For the inductive step, if $n > 0$ and $P(m)$ is true for all $m < n$, then $\sum_{k=1}^{n} k = \sum_{k=1}^{n-1} k + n = \frac{(n-1)n}{2} + n = \frac{n^2 - n + 2n}{2} = \frac{n^2 + n}{2} = \frac{n(n+1)}{2}$, so $P(n)$ is true. By the Principle of Induction, the statement $P(n)$ is true for all $n \in \mathbb{N}$.

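    2. (Abbreviated form:) We will show that $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$ for all $n \in \mathbb{N}$ by induction on $n$. For the base case, we have $\sum_{k=1}^{0} k = 0 = \frac{0(0+1)}{2}$. So assume $n > 0$. By induction, we have $\sum_{k=1}^{n} k = \sum_{k=1}^{n-1} k + n = \frac{(n-1)n}{2} + n = \frac{n^2 - n + 2n}{2} = \frac{n^2 + n}{2} = \frac{n(n+1)}{2}$.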

    The abbreviated form may seem almost nonsensical if read literally and without being accustomed to such things, so I will explain the conventions/understandings behind writing proofs in this way. You verify the base case, and then assume that $n$ is larger than the base case. Then, for the rest of the proof, you are allowed to assume the statement is true for everything smaller than $n$. (The Principle of Induction justifies this.) Each time you use this assumption, you use a phrase like "by induction" so the reader understands what you are doing. If it is not obvious from context what the variable in your statement is, you should say it prior to commencing the inductive proof, e.g., "by induction on $n$." In the above example, it would also have been stylistically acceptable to omit the "by induction on $n$," and I could have also left out the words "for the base case." If the setup for your proof is exceptionally complicated (like if you are doing an induction within an induction within an induction or something similarly crazy), then it may be necessary to write out everything in a more explicit form so that what exactly you are doing is 100% clear.
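    A numerical check is of course not a proof, but it can help to see the two ingredients of the induction separately. The Python sketch below (the names partial_sum and closed_form are chosen for this illustration only) tests the base case, the algebraic identity used in the inductive step, and the formula itself for small $n$.

```python
def partial_sum(n):
    """Left-hand side: 1 + 2 + ... + n (the empty sum, 0, when n = 0)."""
    return sum(range(1, n + 1))

def closed_form(n):
    """Right-hand side: n(n+1)/2, which is always an integer."""
    return n * (n + 1) // 2

# Base case P(0): both sides are 0.
assert partial_sum(0) == closed_form(0) == 0

# Algebra of the inductive step: (n-1)n/2 + n = n(n+1)/2 for n > 0.
for n in range(1, 200):
    assert closed_form(n - 1) + n == closed_form(n)

# Direct comparison of both sides for small n.
print(all(partial_sum(n) == closed_form(n) for n in range(200)))
```

    The loop only confirms finitely many instances; it is the Principle of Induction that turns the same two facts into a statement about every $n \in \mathbb{N}$.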

    Theorem 7 (Recursive Definition). Let $X$ be a well-ordered class, let $X_b = \{x \in X \mid x < b\}$ for each $b \in X$, and let $G : V \to V$. Further assume that each $X_b$ is a set. Then there is a unique $f : X \to V$ with $f(x) = G(f|_{X_x})$ for each $x \in X$.

    Remark.

    1. In other words, in this case, it is valid and unambiguous to define functions $X \to V$ recursively, i.e., we can specify a value for the first point and a way to determine the value at a point given the values at the previous points. (We do not necessarily need to specify the value for the first point if our rule is phrased in such a way that it makes sense if there are no previous points.) The function $G$ represents the rule for determining $f(x)$ based on $f$'s values at previous points.

    2. If $X = \mathbb{N}$, then we may reword this theorem as: it is valid and unambiguous to define a sequence $\{a_n\}_{n=0}^{\infty}$ recursively.

    Proof. Uniqueness immediately follows from induction. To prove existence, it suffices to show that for each $b \in X$ there is such a function $f_b : X_b \cup \{b\} \to V$, because then by uniqueness these functions agree where their domains overlap and they can thus be extended to the desired function. By induction, there is such a function $f_a$ for each $a < b$, and by uniqueness these functions can be extended to such a function $f_b : X_b \to V$, and defining $f_b(b) = G(f_b|_{X_b})$ extends $f_b$ to the desired function.

    Example. The factorial function on $\mathbb{N}$ can be defined recursively by $0! = 1$ and $n! = n(n-1)!$ for $n > 0$. (Note that it is perfectly fine that our recursive rule has $n$ in it, because $n = (n-1) + 1$ can be derived from the numbers preceding it.)
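    For $X = \mathbb{N}$, the role of the rule $G$ in the Recursive Definition theorem can be sketched in Python. This is an informal illustration, not part of the theorem: the helper name define_recursively and the encoding of "$f$ restricted to $\{0, \dots, k-1\}$" as a tuple of earlier values are assumptions made for the example.

```python
def define_recursively(G, n):
    """Return [f(0), ..., f(n)], where f(k) = G(f restricted to {0, ..., k-1})."""
    values = []
    for _ in range(n + 1):
        # The tuple of earlier values plays the role of f restricted to X_k.
        values.append(G(tuple(values)))
    return values

def factorial_rule(previous):
    """G for the factorial: no previous values means 0! = 1; otherwise n! = n * (n-1)!."""
    n = len(previous)            # n is recovered from the values preceding it
    return 1 if n == 0 else n * previous[-1]

def fibonacci_rule(previous):
    """A rule that looks at more than one earlier value."""
    n = len(previous)
    return n if n < 2 else previous[-1] + previous[-2]

print(define_recursively(factorial_rule, 5))   # [1, 1, 2, 6, 24, 120]
print(define_recursively(fibonacci_rule, 7))   # [0, 1, 1, 2, 3, 5, 8, 13]
```

    Note that factorial_rule never receives $n$ explicitly; it reads $n$ off as the number of earlier values, which mirrors the parenthetical remark in the example above.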

    As an example of recursive definition, we prove the following useful result.

    Theorem 8 (Regularity of Classes). Every nonempty class has a member that is disjoint from it.

    Proof. By contradiction. Suppose that there is a nonempty class $A$ that has a nonempty intersection with each of its members. We recursively define a sequence $\{X_n\}_{n=0}^{\infty}$ of members of $A$ with each $X_{n+1} \in X_n$. Let $X_0 \in A$, and for $n > 0$, the set $X_{n-1} \cap A$ has some member $X_n \in A$. The sequence we have constructed violates the Axiom of Regularity (past exercise).

    Another way of phrasing the above theorem is the following: if there is a set satisfying a certain property $P$, then there is a set that satisfies $P$ but its members do not. Replacing $P$ in the above statement with its negation and then taking the contrapositive yields the following theorem describing a powerful proof technique.

    Theorem 9 (Hereditary Induction). Let $P$ be a statement about sets. If $P$ is true for a set whenever it is true for all of its members, then $P$ is true for all sets.

    One of the most famous, and perhaps surprising, results of set theory is the Well-Ordering Theorem: every class has a well-ordering. (If you do not consider this somewhat strange, try to imagine a way to place a well-ordering on $\mathbb{R}$. But do not try too hard, because it turns out that, even though one exists, there is not one that is explicitly definable and provably correct.) We will put off proving the Well-Ordering Theorem until the next section. For now, we will examine a couple of its consequences.

    Theorem 10 (Hausdorff Maximal Principle). Every chain in a partially ordered set is contained in a maximal chain.

    Proof. Let $C$ be a chain in a partially ordered set $(X, \le)$. By the Well-Ordering Theorem, there is a well-ordering $\preceq$ on $X$. Recursively define $f : X \to V$ by $f(x) = C \cup \{x\}$ if $C \cup \{x\} \cup \bigcup_{a \prec x} f(a)$ is $\le$-totally ordered, and $f(x) = \emptyset$ otherwise.

    We claim that $T = \bigcup_{x \in X} f(x)$ is a $\le$-chain. To see this, pick $x \preceq y$ in $T$. By the definition of $T$, each of its elements is $\le$-comparable to every element of $C$, so we may assume $x, y \notin C$. Then $x$ and $y$ are both in the $\le$-totally ordered set $C \cup \{y\} \cup \bigcup_{a \prec y} f(a)$, hence comparable, as desired.

    Now we show that $C \subseteq T$. If $T \ne \emptyset$, then some $f(x) \ne \emptyset$, which means $C \subseteq C \cup \{x\} = f(x) \subseteq T$. On the other hand, if $T = \emptyset$, then for each $x \in C$ the set $C \cup \{x\} \cup \bigcup_{a \prec x} f(a) = C$ is not totally ordered, and since this is impossible we conclude that $C = \emptyset = T$.

    The fact that $T$ is a maximal $\le$-chain follows once we observe that, if $x \notin T$, then $C \cup \{x\} \cup \bigcup_{a \prec x} f(a) \subseteq T \cup \{x\}$ is not $\le$-totally ordered.
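    For a finite partially ordered set, the recursion in the proof above reduces to a single pass through the elements in some fixed order. The following Python sketch illustrates this under that finiteness assumption; the names maximal_chain and divides are invented for the example, and the order in which the elements are listed stands in for the well-ordering supplied by the Well-Ordering Theorem.

```python
def divides(a, b):
    """a | b for positive integers."""
    return b % a == 0

def maximal_chain(elements, leq, chain=()):
    """Extend the chain `chain` to a maximal chain, scanning `elements` in order.

    An element is added exactly when it is comparable to everything collected
    so far, which mirrors the condition defining f(x) in the proof.
    """
    result = list(chain)
    for x in elements:
        comparable_to_all = all(leq(x, t) or leq(t, x) for t in result)
        if x not in result and comparable_to_all:
            result.append(x)
    return result

# Extend the chain {2} inside ({1, ..., 12}, |).
print(sorted(maximal_chain(range(1, 13), divides, chain=(2,))))  # [1, 2, 4, 8]
# A different starting chain gives a different maximal chain.
print(sorted(maximal_chain(range(1, 13), divides, chain=(3,))))  # [1, 3, 6, 12]
```

    Which maximal chain is produced depends on the scanning order, just as the chain $T$ in the proof depends on the chosen well-ordering.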

    The proof of the following very useful result will be an exercise.

    Theorem 11 (Zorn's Lemma). Let $X$ be a nonempty partially ordered set. If every nonempty chain in $X$ has an upper (resp., lower) bound, then $X$ has a maximal (resp., minimal) element.

    Exercises.

    1. Show that, given sets $X$ and $Y$, there is an injection or a surjection $X \to Y$. (Hint: Assume there is no surjection $X \to Y$. Well-order $X$, and recursively define an injection $f : X \to Y$.)

    2. An initial segment of a partially ordered set $X$ is a subset $A$ such that, for every $x \in X$ and $a \in A$, we have $x \le a \Rightarrow x \in A$.

    (a) Show that unions and intersections of families of initial segments are initial segments.

    (b) Show that every proper initial segment of a well-ordered class $X$ is of the form $\{a \in X \mid a < b\}$ for some $b \in X$.

    (c) Let $f : X \to Y$ be an order isomorphism and $A$ be an initial segment of $X$. Show that $f[A]$ is an initial segment of $Y$.

    (d) Prove that a well-ordered class cannot be order isomorphic to one of its proper initial segments. (Hint: Let $A$ be an initial segment of a well-ordered class $X$ and $f : A \to X$ be an order isomorphism. Use induction to show that $f = \mathrm{id}_A$.)

    3. (a) Prove the "upper" version of Zorn's Lemma. (Hint: Use the Hausdorff Maximal Principle to get a maximal chain.)

    (b) Let $X$ be a nonempty partially ordered set. Prove that, if each nonempty well-ordered subset of $X$ has an upper bound, then every element of $X$ is bounded above by a maximal element. (Hint: Pick $a \in X$ and consider the subset $X_a = \{x \in X \mid x \ge a\}$. Show that the set $C$ of well-ordered subsets of $X_a$ is partially ordered by the relation $A \preceq B \Leftrightarrow A$ is an initial segment of $B$, and apply the "upper" version of Zorn's Lemma to $(C, \preceq)$. Verifying the requirement about chains in $C$ having upper bounds can be reduced to showing that the union of such a chain is in $C$.)

    (c) Let $X$ be a nonempty partially ordered set. Show that, if each nonempty chain in $X$ has an upper (resp., lower) bound, then each element of $X$ is bounded above (resp., below) by a maximal (resp., minimal) element. (As a special case of this part, the "lower" version of Zorn's Lemma is now proved. Hint: The "upper" version follows immediately from part (b). For the "lower" version, apply the "upper" version to the reverse partial order.)

    1.5 Ordinal Numbers

    [Not in Durbin.]

    In the final two sections of this chapter we will begin the project of constructing the number systems. Roughly speaking, one extends $\mathbb{N}$ to $\mathbb{Z}$ by adding additive inverses, then extends that to $\mathbb{Q}$ by forming fractions, then extends that to $\mathbb{R}$ by "filling in the holes" between rational numbers, then extends that to $\mathbb{C}$ by adding an element $i = \sqrt{-1}$. (By far the biggest jump is from $\mathbb{Q}$ to $\mathbb{R}$.) Often the set $\mathbb{N}$ is taken as the starting point, and then everything else is built up out of that, but I would like to share with you a brilliant way that mathematicians have shown that this set (and hence all of standard mathematics) can be recursively constructed from a starting point of only the empty set. We will first define the ordinal numbers, which generalize the notion of natural numbers corresponding to an order, i.e., first, second, third, and so on. In the next section, we will define the cardinal numbers, which are numbers to measure the size of a set, and we will define arithmetic on the cardinal numbers, of which natural number arithmetic will be a special case. This way, we will be able to carefully and thoroughly prove the basic properties of the natural numbers (including the commutative, associative, distributive properties, and so on).

    Before we define what an ordinal number is, we will discuss the general idea of the ordinals. We want to define ordinal numbers so that the class ON of ordinals is strictly well-ordered by $\in$, and so that each ordinal number consists precisely of the smaller ordinals, or in other words is an initial segment of ON. So the smallest ordinal should be $\emptyset$, the next smallest should be $\{\emptyset\}$, then $\{\emptyset, \{\emptyset\}\}$, then $\{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$, and so on. In general, the successor of $\alpha$ should be $\alpha \cup \{\alpha\}$, which you will verify in the exercises. We will label the smallest ordinal as 0, the next smallest as 1, the one after that as 2, and so on. So we understand how to form the set representing each of the natural numbers, but the above description of how to construct the ordinals would not be a very precise definition of what an ordinal is. The following definition, due to von Neumann, very elegantly and unambiguously defines the ordinals without having to resort to any sort of (possibly vague or circular) recursive process.

    Definition. An ordinal number (or simply ordinal) is a transitive set of transitive sets. We denote the class of ordinal numbers by ON.
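    The finite ordinals can be modeled directly with Python's immutable frozenset type, which makes it possible to test the definition on small cases. This is a sketch for finite sets only; the function names (successor, ordinal, is_transitive, is_ordinal) are chosen for this illustration.

```python
def successor(alpha):
    """The successor of alpha: alpha together with alpha itself as a new member."""
    return alpha | frozenset([alpha])

def ordinal(n):
    """The set representing the natural number n, built up from the empty set."""
    alpha = frozenset()
    for _ in range(n):
        alpha = successor(alpha)
    return alpha

def is_transitive(s):
    """A set is transitive if every member of it is also a subset of it."""
    return all(member <= s for member in s)

def is_ordinal(s):
    """The definition above: a transitive set of transitive sets."""
    return is_transitive(s) and all(is_transitive(member) for member in s)

three = ordinal(3)
print(len(three))                                             # 3 members: 0, 1, 2
print(three == frozenset({ordinal(0), ordinal(1), ordinal(2)}))
print(ordinal(2) in three, ordinal(2) < three)                # member and proper subset
print(is_ordinal(three), is_ordinal(frozenset([ordinal(1)])))
```

    The last line contrasts the ordinal 3 with the set whose only member is 1; the latter is not transitive (its member 1 contains 0, which is not a member of it), so it is not an ordinal.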

    The next theorem will show that this very abstract definition is indeed the one we were looking for.

    Theorem 12 (Properties of Ordinals).

    1. The class ON is transitive, i.e., members of ordinals are ordinals.

    2. The class ON (and hence any ordinal) is strictly well-ordered by $\in$.

    3. The relations $\in$ and $\subsetneq$ are the same on ON.

    4. A set is an ordinal number if and only if it is an initial segment of ON.

    5. Every well-ordered set is order isomorphic to a unique ordinal number.

    Proof.

    1. It is immediate from the definition that any member of an ordinal is a transitive set consisting of transitive sets, hence an ordinal.

    2. The fact that $\in$ is a strict partial order on ON follows from the fact that ordinals are transitive. Furthermore, if $\alpha$ is an ordinal in a subclass $A$ of ON, then by regularity the set $\alpha \cap A$, if nonempty, has an $\in$-minimal element $\beta$ (otherwise $\alpha$ itself is $\in$-minimal in $A$), and by transitivity $\beta$ is in fact an $\in$-minimal element of $A$. So the proof will be complete once we show that $\in$ is a strict total order on ON. By two successive hereditary inductions, we reduce to proving that distinct ordinals $\alpha$ and $\beta$ are $\in$-comparable if each member of $\alpha$ (resp., $\beta$) is $\in$-comparable to $\beta$ (resp., $\alpha$). [Technical note: In this specific proof, we are not using the full version of the hereditary induction theorem as stated in the previous section. Using that theorem in this specific proof would be circular, because in the course of proving it we made implicit use of some properties of ordinals by recursively defining a sequence indexed by $\mathbb{N}$. Instead, our use of hereditary induction on ON is justified by the fact that each of its nonempty subclasses has an $\in$-minimal element.] Without loss of generality, let us say there is a $\gamma \in \alpha \setminus \beta$. Then $\gamma = \beta$ or $\beta \in \gamma$, and in either case we have $\beta \in \alpha$.

    3. If $\alpha \subsetneq \beta$ are ordinals, then by regularity we cannot have $\beta \in \alpha$, and from (2) we conclude $\alpha \in \beta$.

    4. By (1), an ordinal is the set of ordinals less than it, hence an initial segment of ON. Conversely, if $X$ is any set that is an initial segment of ON, then it consists of transitive sets, and each $\alpha \in X$ consists of smaller ordinals, hence is a subset of $X$.

    5. First we prove that every well-ordered set $X$ is order isomorphic to an ordinal number. For each $b \in X$ let $X_b = \{x \in X \mid x < b\}$. Recursively define $f : X \to V : x \mapsto f[X_x]$. We claim that $f$ is an order-isomorphism onto an ordinal. Since any pair of elements of $X$ is contained in a set of the form $X_b \cup \{b\}$, and since a union of a set of ordinals is an ordinal (exercise), it suffices to show that the restriction of $f$ to each of these sets is an order-isomorphism onto an ordinal. By induction, the set $f(b) = f[X_b] =$

    x

    1. The above definition of the natural numbers is consistent with our intuitive notion of them as "counting numbers," and is the standard precise definition used in rigorous mathematics.

    2. One way to state the Axiom of Infinity is: there is an ordinal $\alpha$ such that (i) $0 < \alpha$ and (ii) if $\beta < \alpha$, then so is its successor. (You will show in an exercise that every ordinal number has a successor.) A more precise way to state the definition of $\omega$ informally given above is that $\omega$ is the least such $\alpha$.

    3. The ordinal numbers are not merely indicative of the "size" of a well-ordered set, but also of how it is ordered. Finite sets (a concept we will define precisely in the next section) can only be well-ordered in one way (up to order isomorphism), hence only correspond to one ordinal, but an infinite set can correspond to infinitely many different ordinals, depending on how it is well-ordered. (Future exercise.) The numbers that are used to measure the size of a set are the cardinal numbers, and we will study them in the next section.

    Theorem 13 (Well-Ordering Theorem for Sets). Every set has a well-ordering.

    Proof. It suffices to show that every set is in one-to-one correspondence with an ordinal, since then the ordinal's well-ordering corresponds to a well-ordering on that set, and, since $\emptyset = 0$, we only need to consider nonempty sets. Let $X$ be a nonempty set, and let $g : \mathcal{P}(X) \to X$ be a function with $g(A) \in A$ for each $A \ne \emptyset$. Recursively define $f : \mathrm{ON} \to X$ by $f(\alpha) = g(X \setminus f[\alpha])$. Note that the definition implies that $f(\alpha) \ne f(\beta)$ for $\alpha < \beta$ with $f[\beta] \ne X$. If there is no $\alpha \in \mathrm{ON}$ with $f[\alpha] = X$, then $f$ is an injection and $X$ is a set containing the proper class $f[\mathrm{ON}]$, a contradiction. (An exercise shows that ON is a proper class.) Therefore there is some minimum $\alpha \in \mathrm{ON}$ with $f[\alpha] = X$, and $X$ is in one-to-one correspondence with $\alpha$.
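    For a finite set the recursion in this proof can be run literally: repeatedly apply the choice function to whatever has not been used yet. The Python sketch below illustrates this finite case only; the names well_order and shortest_first are invented for the example, and shortest_first is just one arbitrary choice function.

```python
def well_order(X, choose):
    """Enumerate X as f(0), f(1), ..., where f(n) = choose(X minus {f(0), ..., f(n-1)})."""
    remaining = set(X)
    enumeration = []
    while remaining:                      # stop once f[n] = X
        picked = choose(frozenset(remaining))
        enumeration.append(picked)
        remaining.remove(picked)
    return enumeration

def shortest_first(A):
    """A choice function on nonempty sets of strings: shortest word, ties alphabetical."""
    return min(A, key=lambda word: (len(word), word))

order = well_order({"pear", "fig", "apple"}, shortest_first)
print(order)   # ['fig', 'pear', 'apple']
```

    The position of each element in the returned list is the ordinal it corresponds to, and comparing positions gives the induced well-ordering. For infinite sets no such loop is available, which is why the proof needs both a choice function on all of $\mathcal{P}(X)$ and recursion over ON.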

    Another application of the ordinal numbers is showing that every set can be recursively built up from $\emptyset$.

    Theorem 14 (Rank of Sets). Recursively define $V_\alpha =$

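    1. By induction, any $V_\alpha$ is a union of a set of powersets of transitive sets, which is transitive by past exercises. The second statement is clear.

    2. By hereditary induction it suffices to show that if every member of a set $X$ is in $\bigcup_{\alpha \in \mathrm{ON}} V_\alpha$, then so is $X$. In this case, we have each $x \in X$ in some $V_{\alpha_x}$. Let $\alpha = \bigcup_{x \in X} \alpha_x$, which is a union of ordinals and hence an ordinal (exercise). Then each $\alpha_x \le \alpha$, so $X \subseteq \bigcup_{x \in X} V_{\alpha_x} \subseteq V_\alpha$ and $X \in \mathcal{P}(V_\alpha) = V_\beta$, where $\beta$ is the successor of $\alpha$.

    3. Let $X$ be a set of rank $\alpha$, and let $\beta$ be the least ordinal greater than the rank of all the elements of $X$. For each $x \in X$, we have $x \subseteq V_{\alpha_x}$ for some $\alpha_x < \beta$, so $X \subseteq \bigcup_{x \in X} \mathcal{P}(V_{\alpha_x}) \subseteq V_\beta$. Therefore $\alpha \le \beta$. On the other hand, the inclusion $X \subseteq V_\alpha =$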

    2. We can also make recursive definitions like defining a hereditarily finite set to be a finite set of hereditarily finite sets (we will give a precise definition of "finite" in the next section). This is because a definition for a kind of set could be viewed as a function $f : V \to V$, where $f(X) = 1$ if $X$ is that kind of set and $f(X) = 0$ if $X$ is not. The hereditarily finite sets are interesting from a logician's point of view, because they can be used to show that the Axiom of Infinity cannot be proved from the other axioms (hence must be taken as an axiom), but we will not talk about them any further in this class.

    Theorem 16 (Well-Ordering Theorem for Proper Classes). Every proper class has a well-ordering that makes it order isomorphic to ON.

    Proof. We show that $V$ has a well-ordering in which every proper initial segment is a set. This then induces such a well-ordering for all other classes, and by an exercise this proves the desired conclusion. For each ordinal $\alpha$, let $\le_\alpha$ be a well-ordering on the set of sets of rank $\alpha$. (We know that this is a set because by definition it is a subset of $\mathcal{P}(V_\alpha)$.) Define a relation $\preceq$ on $V$ by $X \preceq Y \Leftrightarrow \operatorname{rank}(X) < \operatorname{rank}(Y)$ or $\operatorname{rank}(X) = \operatorname{rank}(Y)$ and $X \le_{\operatorname{rank}(X)} Y$. It is simple to verify that $\preceq$ is a total order on $V$. If $A$ is any nonempty subclass of $V$, then it has an element of minimum rank $\alpha$, and the set of elements of $A$ of rank $\alpha$ has a $\le_\alpha$-minimum element, which is easy to verify is a $\preceq$-minimum element of $A$. Therefore $\preceq$ is a well-ordering on $V$. Finally, if $C$ is any proper initial segment of $V$, then $C$ has some upper bound $Y$, and $C \subseteq \mathcal{P}(V_{\operatorname{rank}(Y)})$, showing that $C$ is a set.

    An immediate corollary is the following, which gives the previously alluded to result that a class is proper if and only if it is the same "size" as $V$.

    Corollary 17 (Limitation of Size). A class is proper if and only if it is in one-to-one correspondence with $V$.

    Exercises.

    1. (a) Prove that the successor of an ordinal number $\alpha$ is $\alpha \cup \{\alpha\}$.

    (b) Explicitly write out what set the ordinal number 5 is, without using any numerals.

    (c) Explicitly write out what set $V_3$ is, this time using numerals to represent the ordinals in it.

    2. Prove that every union or intersection of a family of ordinal numbers is an ordinal number. (Hint: Refer to a past exercise.)

    3. Show that ON is a proper class. (Hint: Show that if ON is a set, then $\mathrm{ON} \in \mathrm{ON}$.)

    4. (a) Show that every nonzero ordinal falls under exactly one of these two classifications.

    i. Successor ordinal: a successor of an ordinal.

    ii. Limit ordinal: a nonzero ordinal that is the union of the ordinals less than it.

    (b) Prove that $\omega$ is the smallest limit ordinal. (Hint: Show that any limit ordinal satisfies conditions (i) and (ii) in the precise definition of $\omega$.)

    5. This exercise will show that ON can be characterized as the unique (up to order isomorphism) well-ordered proper class whose proper initial segments are sets.

    (a) Prove that every well-ordered proper class has an initial segment order isomorphic to ON. (Hint: Let $W$ be a well-ordered proper class and recursively define a function $f : \mathrm{ON} \to W$ so that $f(\alpha)$ is the least element of $W$ not in $f[\alpha]$. [Note that $f[\alpha] \ne W$ since $f[\alpha]$ is a set.] Show that $f[\mathrm{ON}]$ is an initial segment. To show that $f$ is an order embedding, by a previous exercise [cite it] it suffices to show that if $\alpha < \beta$, then $f(\alpha) < f(\beta)$.)

    (b) Prove that a well-ordered proper class is order isomorphic to ON if and only if each of its proper initial segments is a set. (Hint: For ($\Rightarrow$), use the fact that the initial segments of such a class are images of the initial segments of ON. Conversely, use induction to show that the order embedding from (a) is a surjection if the proper initial segments are sets. More specifically, if $x \in W$, then by induction $\{a \in W \mid a < x\}$ is the image of some subclass of ON. Show that this subclass is an initial segment that is a set, hence an ordinal $\alpha$, and that $x = f(\alpha)$.)

    (c) Prove that every proper class contained in ON is order-isomorphic to ON.

    (d) Give an example of a well-ordered proper class not order isomorphic to ON.

    1.6 Cardinal Numbers

    [Not in Durbin.]

    Definition. The cardinality of a set $X$ is the smallest ordinal number $|X|$ that is in one-to-one correspondence with $X$. (This exists because $X$ can be well-ordered, hence is in one-to-one correspondence with an ordinal, and ON is well-ordered, so there is a least such ordinal.) The class CN of cardinal numbers (or simply cardinals) consists of the ordinal numbers that are cardinalities of sets. Note that for each $\alpha \in \mathrm{ON}$ we have $|\alpha| \le \alpha$, and equality holds if and only if $\alpha \in \mathrm{CN}$.

    Remark.

    1. It is immediate that $|\emptyset| = 0$, and we will soon see that $|n| = n$ for $n \le \omega$. Thus all ordinal numbers $\le \omega$ are cardinal numbers.

    2. The way that mathematicians have defined cardinal and ordinal arithmetic, the expression $\omega + 1$ equals $\omega$ in cardinal arithmetic and the successor of $\omega$ in ordinal arithmetic. For this reason, it is customary to write $\aleph_0$ instead of $\omega$ when thinking of it as a cardinal number.

    3. With this definition, proper classes do not have a cardinality, because they cannot be in one-to-one correspondence with any set. We could generalize the definition of cardinality to classes by defining $|X|$ to mean the smallest initial segment of ON that is in one-to-one correspondence with $X$ (so all proper classes would have cardinality ON), but this is nonstandard.

    Theorem 18 (Cantor-Bernstein Theorem). The following are equivalent for classes $X$ and $Y$.

    1. $X$ and $Y$ are in one-to-one correspondence.

    2. There is an injection $X \to Y$ and a surjection $X \to Y$.

    3. There are injections (resp., surjections) $X \to Y$ and $Y \to X$.

    Proof. Note that the theorem is trivially true if $X$ or $Y$ is empty, so we may assume $X, Y \ne \emptyset$.

    (1) $\Rightarrow$ (2): Clear. (2) $\Leftrightarrow$ (3) $\Leftrightarrow$ (4): Immediate from a past exercise.

    (3) $\Rightarrow$ (1): Assume there are injections $f : X \to Y$ and $g : Y \to X$. Let $A_0 = X \setminus g[Y]$, and recursively define $A_n = g[f[A_{n-1}]]$ for $n \ge 1$. Let $A = \bigcup_{n=0}^{\infty} A_n$, and let $h : X \to Y$ be the function with $h(x) = f(x)$ for $x \in A$ and $h(x) = g^{-1}(x)$ for $x \notin A$. (Note that $X \setminus A \subseteq X \setminus A_0 = g[Y]$, so $g^{-1}$ is defined on $X \setminus A$.) We wish to show that $h$ is a bijection.

    For injectivity, assume $h(a) = h(b)$ for some $a, b \in X$. If exactly one of these elements is in $A$, say $a \in A$ and $b \notin A$, then $a \in A_n$ for some $n \ge 0$, and $b = g(g^{-1}(b)) = g(f(a)) \in g[f[A_n]] = A_{n+1} \subseteq A$, a contradiction. So $a, b \in A$ or $a, b \notin A$, and it follows from the injectivity of $f$ and $g^{-1}$ that $a = b$.

    For surjectivity, pick any $y \in Y$. If $y \in f[A]$, then there is an $x \in A$ with $f(x) = y$, and hence $h(x) = y$, so let us assume $y \notin f[A]$. If $g(y) \in A_n = g[f[A_{n-1}]]$ for some $n \in \mathbb{Z}^+$, then $y \in f[A_{n-1}] \subseteq f[A]$, a contradiction. Because $g(y) \notin A_0$, we conclude that $g(y) \notin A$, and hence $h(g(y)) = g^{-1}(g(y)) = y$.

    Corollary 19. Let $X$ and $Y$ be sets.

    1. $|X| = |Y| \Leftrightarrow$ $X$ and $Y$ are in one-to-one correspondence.

    2. $|X| \le |Y| \Leftrightarrow$ there is an injection $X \to Y$. In particular, if $X \subseteq Y$, then $|X| \le |Y|$.

    Proof.

    1. Follows directly from the definition of cardinality.

    2. Let $f : X \to |X|$ and $g : Y \to |Y|$ be bijections. ($\Rightarrow$): If $|X| \le |Y|$, then $|X| \subseteq |Y|$ and $g^{-1}|_{|X|} \circ f : X \to Y$ is an injection. ($\Leftarrow$): By contrapositive. Assume $|X| > |Y|$. Then $X$ and $Y$ are not in one-to-one correspondence, and by ($\Rightarrow$) there is an injection $Y \to X$, so by the Cantor-Bernstein Theorem there is no injection $X \to Y$.

    We want to regard two sets as being the same "size" if they are in one-to-one correspondence, so the above corollary shows that the cardinality of a set is the number one should think of as a measurement of the "size" of a set.
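    The bijection $h$ built in the proof of the Cantor-Bernstein Theorem can be computed explicitly in simple cases. The Python sketch below takes $X = Y = \mathbb{N}$ with the injections $f(x) = 2x$ and $g(y) = 2y + 1$; these two maps, and the helper names in_A and h, are choices made purely for illustration.

```python
def f(x):
    """An injection X -> Y (here X = Y = N): x -> 2x."""
    return 2 * x

def g(y):
    """An injection Y -> X: y -> 2y + 1."""
    return 2 * y + 1

def in_A(x):
    """Decide whether x lies in A, the union of A_0, A_1, ..., by tracing x backwards."""
    while True:
        if x % 2 == 0:        # x is not in g[Y], so x is in A_0
            return True
        y = (x - 1) // 2      # y is the unique preimage of x under g
        if y % 2 == 1:        # y is not in f[X], so x is in no A_n
            return False
        x = y // 2            # x = g(f(x')) is in A_n exactly when x' is in A_(n-1)

def h(x):
    """h = f on A and g^(-1) off A, as in the proof."""
    return f(x) if in_A(x) else (x - 1) // 2

values = [h(x) for x in range(200)]
print(values[:8])
print(len(set(values)) == len(values))      # no repeated values among these inputs
print(set(range(50)) <= set(values))        # every y < 50 is hit
```

    The backward trace in in_A terminates because each step strictly decreases x. The two printed checks only sample finitely many inputs; that $h$ is a bijection on all of $\mathbb{N}$ is what the proof establishes.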

    Theorem 20 (Characterization of Infinite Classes). The following are equivalent for a class $X$.

    1. There is an injection $X \to X$ that is not a surjection.

    2. There is a surjection $X \to X$ that is not an injection.

    3. There is a bijection between $X$ and one of its proper subsets.

    4. There is a surjection $X \to \mathbb{N}$.

    5. There is an injection $\mathbb{N} \to X$.

    Remark. We define a class to be infinite if it satisfies one (equivalently, all) of the above conditions, and finite otherwise. Observe that two classes in one-to-one correspondence are either both finite or both infinite. Since $V$ is clearly infinite (by (5)), it follows that every proper class is infinite and every finite class is a set.

    Proof. (2) $\Rightarrow$ (1): Assume there is a surjection $f : X \to X$ that is not injective. Let $g : X \to X$ be a right inverse of $f$. Then $f \circ g = \mathrm{id}$, so $g$ is injective. However, the function $g$ cannot be a surjection, because then it would be a bijection and so would $f = g^{-1}$.

    (1) $\Rightarrow$ (3): If there is a map $f : X \to X$ that is injective but not a surjection, then $f$ is a bijection between $X$ and its proper subset $f[X]$.

    (3) $\Rightarrow$ (4): The case where $X$ is a proper class (and hence in one-to-one correspondence with $V$) is clear, so let us assume $X$ is a set and that there is a bijection $f$ from $X$ onto a proper subset. Recursively define a sequence $\{X_n\}_{n=0}^{\infty}$ by $X_0 = X$ and $X_n = f[X_{p(n)}]$ for $n > 0$. We will prove by induction on $n$ that each $X_{s(n)} \subsetneq X_n$. For the base case, we have $X_1 = f[X_0] = f[X] \subsetneq X = X_0$. So assume $n \ge 1$. By induction, we have $X_n \subsetneq X_{p(n)}$, and thus $X_{s(n)} = f[X_n] \subsetneq f[X_{p(n)}] = X_n$, as desired. (Note that we are using the fact that $f$ is injective to conclude that the last containment is proper.) It follows that $\{X_n \setminus X_{s(n)}\}_{n=0}^{\infty}$ is a collection of pairwise disjoint nonempty subsets of $X$. Define $g : X \to \mathbb{N}$ to be a function that takes elements in $X_n \setminus X_{s(n)}$ to $n$ (it makes no difference where $g$ sends elements not in any $X_n \setminus X_{s(n)}$). Then $g$ is a surjection.

    (4) $\Rightarrow$ (5): Immediate from an exercise.

    (5) $\Rightarrow$ (1): Assume there is an injection $f : \mathbb{N} \to X$. Define $g : X \to X$ by $g(x) = f(s(f^{-1}(x)))$ for $x \in f[\mathbb{N}]$ and $g(x) = x$ for $x \notin f[\mathbb{N}]$. Note that $g$ takes $f[\mathbb{N}]$ into $f[\mathbb{N}]$ and takes $X \setminus f[\mathbb{N}]$ onto itself. If there is an $x \in X$ with $g(x) = f(0)$, then $x \in f[\mathbb{N}]$ and $f(0) = f(s(f^{-1}(x)))$, so $0 = s(f^{-1}(x))$, a contradiction to the fact that 0 is the smallest ordinal. Therefore $g$ is not a surjection. To show that $g$ is injective, it suffices to show that its restrictions to $f[\mathbb{N}]$ and $X \setminus f[\mathbb{N}]$ are injective, which follows from observing that the former is a composition of injections (note that the successor function is injective) and the latter is the identity map.

    (1) $\Rightarrow$ (2): Similar to (2) $\Rightarrow$ (1).

    Example.

    1. Examples of infinite sets include Z+ and R.

    2. Examples of finite sets include $\emptyset$ and $\{1, \ldots, n\}$, where $n \in \mathbb{Z}^+$.

    Proposition 21.

    1. An infinite cardinal number is a limit ordinal.

    2. $|n| = n$ for $n \leq \omega$.

    3. A set $X$ is infinite if and only if $|X| \geq \aleph_0$. In other words, a set is finite if and only if it is in one-to-one correspondence with some $n < \omega$.

    Proof.

    1. If $\alpha$ is any infinite successor ordinal, then $|\alpha| = |p(\alpha) \cup \{p(\alpha)\}| = |p(\alpha)| \leq p(\alpha) < \alpha$, since by an exercise removing a single point from an infinite set does not change its cardinality. Since 0 is finite (there is certainly no injection $\mathbb{N} \to \emptyset$), this means that any infinite cardinal number must be a limit ordinal.

    2. By a past exercise, the smallest limit ordinal is $\omega$, so any ordinal less than $\omega$ is finite. Since an infinite set cannot be in one-to-one correspondence with a finite set, it follows that $|\omega| = \omega$. On the other hand, if $n < \omega$, then $n$ is finite and not in one-to-one correspondence with a proper subset, so $|n| = n$.

    3. Follows from (2) and Theorem 20.

    Definition. In practice, when one encounters an infinite set, it is often not really important which infinite cardinality it has, but merely that it is infinite, so mathematicians have adopted the following lazy notation. For a cardinal number $\kappa$, we write $\kappa = \infty$ to indicate that it is infinite, and


    Definition. We define addition, multiplication, and exponentiation operations on the cardinal numbers as follows. (We will not discuss ordinal arithmetic in this course.)

    1. If $X$ and $Y$ are disjoint, then $|X| + |Y| = |X \cup Y|$. (One can always choose disjoint representative sets $X$ and $Y$, by replacing $X$ with $\{0\} \times X$ and $Y$ with $\{1\} \times Y$, if necessary.)

    2. $|X||Y| = |X \times Y|$.

    3. $|X|^{|Y|} = |M(Y, X)|$.

    (An exercise shows that these definitions are well-defined, i.e., they do not depend on which sets of a particular cardinality we choose.)

    Remark.

    1. The definitions of cardinal addition, multiplication, and exponentiation are consistent with the usual definitions of these operations on $\mathbb{N}$. (Pre-calculus counting techniques verify this.) Thus one may think of cardinal arithmetic as an extension of standard arithmetic.

    2. While the expression $0^0$ is considered to be an indeterminate form in calculus, in the context of cardinal arithmetic it is defined to be 1.

    3. One can extend the definitions of cardinal addition and multiplication to accommodate sums and products of arbitrarily many terms: $\sum_{\lambda \in \Lambda} |X_\lambda| = \left|\bigcup_{\lambda \in \Lambda} X_\lambda\right|$ and $\prod_{\lambda \in \Lambda} |X_\lambda| = \left|\prod_{\lambda \in \Lambda} X_\lambda\right|$, where in the former case the $X_\lambda$'s must be chosen to be disjoint. Induction shows that these definitions are consistent with the above ones.
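For finite sets, the three definitions above can be checked by direct counting. The following is a minimal sketch; the helper names `disjoint_union` and `functions` are made up for this illustration and are not notation from the notes.

```python
from itertools import product

def disjoint_union(X, Y):
    # Tag the elements so the two copies cannot collide: {0} x X and {1} x Y.
    return {(0, x) for x in X} | {(1, y) for y in Y}

def functions(Y, X):
    # All functions Y -> X, each encoded as a tuple of (input, output) pairs.
    Y = sorted(Y)
    return {tuple(zip(Y, values)) for values in product(sorted(X), repeat=len(Y))}

X, Y = {0, 1, 2}, {0, 1}
sum_card = len(disjoint_union(X, Y))      # |X| + |Y|
prod_card = len(set(product(X, Y)))       # |X||Y| = |X x Y|
exp_card = len(functions(Y, X))           # |X|^|Y| = |M(Y, X)|
empty_exp = len(functions(set(), set()))  # 0^0: functions from the empty set to itself
```

Here `X` and `Y` overlap, which is why the tagging step matters for the sum. The last line counts exactly one function (the empty function), matching the convention that $0^0 = 1$ in cardinal arithmetic.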

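    Theorem 22 (Addition and Multiplication of Infinite Cardinals). Let $\kappa$ and $\lambda$ be cardinal numbers, at least one of which is infinite. Then $\kappa + \lambda = \kappa\lambda = \max(\kappa, \lambda)$. (For the product, assume in addition that both are nonzero.)

    Proof. Without loss of generality, we may assume $\kappa \leq \lambda$, and thus $\lambda$ is infinite. It is simple to check that $\kappa + \lambda$ and $\kappa\lambda$ are bounded between $\lambda$ and $\lambda^2$, so it will suffice to show that $\lambda = \lambda^2$.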

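    Suppose to the contrary that there is an infinite cardinal $\lambda$ with $\lambda < \lambda^2$; because CN is well-ordered, we can pick $\lambda$ to be the least infinite cardinal with this property. Define a relation $\preceq$ on $\lambda \times \lambda$ by $(x, y) \preceq (z, w)$ if (i) $\max(x, y) < \max(z, w)$, or (ii) $\max(x, y) = \max(z, w)$ and $x < z$, or (iii) $\max(x, y) = \max(z, w)$, $x = z$, and $y \leq w$. It is slightly tedious but not hard to show that $\preceq$ is a well-ordering. (You may fill in the details if you wish.) Thus there is an order isomorphism $f$ from $\lambda \times \lambda$ onto some ordinal $\alpha$, and $\lambda < \lambda^2 = |\alpha| \leq \alpha$. Since $\lambda < \alpha$, there are $\beta, \gamma \in \lambda$ with $f(\beta, \gamma) = \lambda$. Let $\delta$ be the successor of $\max(\beta, \gamma)$, and note that $|\delta| < \lambda$ since $\lambda$ is a limit ordinal, so by the minimality of $\lambda$ we have either $|\delta|^2 = |\delta|$ or $|\delta|^2 < \aleph_0$, and in either case $|\delta|^2 < \lambda$. (The intuitively obvious fact that a finite cardinal number squared is finite is noted in the theorem below.) But $\lambda = f[\{(x, y) \mid (x, y) \prec (\beta, \gamma)\}] \subseteq f[\delta \times \delta]$, so $\lambda \leq |\delta|^2 < \lambda$, a contradiction.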
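The ordering used in the proof of Theorem 22 (compare maxima first, then first coordinates, then second coordinates) can be tried out on pairs of natural numbers. This is only an illustrative sketch on finite pieces of $\mathbb{N} \times \mathbb{N}$; the names `key` and `initial_segment` are assumptions of the example.

```python
def key(pair):
    x, y = pair
    # Compare by the maximum first, then by first coordinate, then by second.
    return (max(x, y), x, y)

def initial_segment(n):
    # All pairs in n x n, listed in increasing order for the relation above.
    return sorted(((x, y) for x in range(n) for y in range(n)), key=key)

order = initial_segment(3)
```

The feature the proof relies on is visible here: every pair preceding $(\beta, \gamma)$ has both coordinates below the successor of $\max(\beta, \gamma)$, so each square $n \times n$ forms an initial segment of the ordering.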

    Theorem 23 (Properties of Cardinal Arithmetic). The cardinal numbers satisfy the following arithmetical properties.

    1. Addition and multiplication are commutative and associative.

    2. The distributive property holds.

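    3. $\kappa + 0 = \kappa \cdot 1 = \kappa^1 = \kappa$.

    4. $\kappa \cdot 0 = 0$ and $\kappa^0 = 1$.

    5. If $\kappa \leq \lambda$ and $\mu \leq \nu$, then $\kappa + \mu \leq \lambda + \nu$, $\kappa\mu \leq \lambda\nu$, and $\kappa^\mu \leq \lambda^\nu$ (the last inequality excluding the degenerate case $\kappa = \lambda = \mu = 0 < \nu$, where $0^0 = 1 > 0 = 0^\nu$).

    6. $n\kappa = \underbrace{\kappa + \cdots + \kappa}_{n \text{ copies}}$ for $0 < n < \aleph_0$.

    7. $\kappa^n = \underbrace{\kappa \cdots \kappa}_{n \text{ copies}}$ for $0 < n < \aleph_0$.

    8. $(\kappa\lambda)^\mu = \kappa^\mu \lambda^\mu$.

    9. $\kappa^{\lambda + \mu} = \kappa^\lambda \kappa^\mu$.

    10. $(\kappa^\lambda)^\mu = \kappa^{\lambda\mu}$.

    11. If $m, n < \aleph_0$, then $m + n$ is the $n$th successor of $m$.

    12. If $m, n < \aleph_0$, then $m + n, mn, m^n < \aleph_0$.

    13. For $\kappa < \max(\aleph_0, \lambda, \mu)$, $\kappa + \lambda = \kappa + \mu \Rightarrow \lambda = \mu$.

    14. For $0 < \kappa < \max(\aleph_0, \lambda, \mu)$, $\kappa\lambda = \kappa\mu \Rightarrow \lambda = \mu$.

    15. If $\kappa < \lambda$ or $\kappa \leq \lambda < \aleph_0$, then there is a unique $\mu$ with $\kappa + \mu = \lambda$.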

    Remark. You would not be expected to memorize all of these properties. Just look them over once and be content in the knowledge that most of the familiar facts about natural number arithmetic are now solidly proven.

    Proof. [I have only written a sketch of this proof, since it is very tedious but not too difficult once the key observations are made.] Properties (1)-(5) follow from the definitions in a straightforward manner. Properties (6) and (7) can be proved by observing that $n\kappa = \left|\bigcup_{k=1}^{n} (\{k\} \times \kappa)\right|$ and $\kappa^n = \left|\prod_{k=1}^{n} \kappa\right|$.

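    8. It is straightforward to verify that the map $\varphi : M(\mu, \kappa) \times M(\mu, \lambda) \to M(\mu, \kappa \times \lambda)$ given by $\varphi(f, g)(x) = (f(x), g(x))$ is a bijection.

    9. Follows from (7) if $\lambda$ and $\mu$ are finite, and the case where $\kappa \leq 1$ is trivial. Otherwise, we have $\kappa^{\lambda + \mu} = \kappa^{\max(\lambda, \mu)} = \max(\kappa^\lambda, \kappa^\mu) = \kappa^\lambda \kappa^\mu$.

    10. It is straightforward to verify that the map $\varphi : M(\lambda \times \mu, \kappa) \to M(\mu, M(\lambda, \kappa))$ given by $\varphi(f)(x)(y) = f(y, x)$ is a bijection.

    11. Because $\omega$ contains the successor of each of its elements, we have $m + 1 = |m \cup \{m\}| = m \cup \{m\}$, which is the successor of $m$ (past exercise). The full result can be proven with induction.

    12. The $m + n$ case follows from (11) and induction. One can now use this result and (6) to prove the $mn$ case by induction, and in turn use that result and (7) to prove the $m^n$ case by induction.

    13. Assume $\lambda \leq \mu$. If $\mu < \aleph_0$, the result follows from (11) and induction, so assume $\aleph_0 \leq \mu$, and hence $\kappa < \mu$. Then $\max(\kappa, \lambda) = \kappa + \lambda = \kappa + \mu = \mu$, and hence $\lambda = \mu$.

    14. Assume $\lambda \leq \mu$. The cases $\lambda = 0$ and $\kappa = 1$ are trivial, and the case $\aleph_0 \leq \kappa$ is dealt with as in the proof of (13), so let us assume $\lambda \geq 1$ and $2 \leq \kappa < \aleph_0$. If $\mu \geq \aleph_0$, then (11) implies that $\lambda \geq \aleph_0$ and $p(\kappa)\lambda = p(\kappa)\lambda + \lambda = (p(\kappa) + 1)\lambda = \kappa\lambda = \kappa\mu = p(\kappa)\mu$, and then $\lambda = \mu$ by induction (on $\kappa$). So let us assume $\mu < \aleph_0$. Then $\kappa p(\lambda) + \kappa = \kappa(p(\lambda) + 1) = \kappa\lambda = \kappa\mu = \kappa p(\mu) + \kappa$, and $\kappa p(\lambda) = \kappa p(\mu)$ by (13). Thus $p(\lambda) = p(\mu)$ by induction (on $\lambda$), and hence $\lambda = \mu$.

    15. Let $\mu = |\lambda \setminus \kappa|$ and note that $\kappa + \mu = \lambda$ and $\mu \leq \lambda$. Uniqueness in the case $\lambda < \aleph_0$ follows from (13). On the other hand, if $\kappa < \lambda$ and $\aleph_0 \leq \lambda$, then $\lambda = \max(\kappa, \mu)$, and hence $\mu = \lambda$.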
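The bijection in the proof of property (10) is the familiar "currying" map, and for small finite cardinals it can be enumerated completely. The sketch below is an illustration only: the sizes 2, 3, 2 and the names `functions` and `curry` are assumptions of the example.

```python
from itertools import product

def functions(domain, codomain):
    # All functions domain -> codomain, each represented as a dict.
    domain = sorted(domain)
    return [dict(zip(domain, values))
            for values in product(sorted(codomain), repeat=len(domain))]

def curry(f, lam, mu):
    # phi(f)(x)(y) = f(y, x): for each x in mu, record the function y -> f(y, x)
    # as a tuple of its values, and collect these tuples over all x.
    return tuple(tuple(f[(y, x)] for y in lam) for x in mu)

kappa, lam, mu = range(2), range(3), range(2)
uncurried = functions(list(product(lam, mu)), kappa)   # M(lam x mu, kappa)
images = {curry(f, lam, mu) for f in uncurried}        # subset of M(mu, M(lam, kappa))
```

Since distinct functions produce distinct images and the target set has $(2^3)^2$ elements, comparing the counts confirms that the map is a bijection in this small case.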

    In the last section, we rigorously constructed $\mathbb{N}$, and the above theorem demonstrates most of its important arithmetical properties. We will delay the official construction of the other number systems $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ for the moment, because it will be useful to learn some algebra before commencing this project. For now, we will be content with our informal/intuitive understanding of these number systems so that we can use them to make examples.

    Theorem 24 (Cantor's Theorem). $|X| < |\mathcal{P}(X)| = 2^{|X|}$ for any set $X$.

    Proof. The second equality follows from observing that there is a natural one-to-one correspondence between $\mathcal{P}(X)$ and $M(X, \{0, 1\})$, where $A \subseteq X$ corresponds to the function that takes elements in $A$ to 1 and elements not in $A$ to 0.

    Now suppose that $|\mathcal{P}(X)| \leq |X|$. Since $\mathcal{P}(X) \neq \emptyset$, this means there is a surjection $f : X \to \mathcal{P}(X)$, and there is a $y \in X$ with $f(y) = \{x \in X \mid x \notin f(x)\}$. Hence $y \in f(y) \Leftrightarrow y \notin f(y)$, a contradiction.
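The diagonal set $\{x \in X \mid x \notin f(x)\}$ from the proof can be computed explicitly when $X$ is small. The sketch below runs through every function from a three-element set to its power set and checks that the diagonal set is never among the values; the helper names are made up for this example.

```python
from itertools import combinations, product

def power_set(X):
    X = sorted(X)
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def diagonal(f, X):
    # The set {x in X : x not in f(x)} used in the proof.
    return frozenset(x for x in X if x not in f[x])

X = {0, 1, 2}
subsets = power_set(X)
always_missed = True
for values in product(subsets, repeat=len(X)):
    f = dict(zip(sorted(X), values))        # one function X -> P(X)
    if diagonal(f, X) in f.values():
        always_missed = False
```

This exhaustive check covers only one finite set, of course; the proof above is what handles arbitrary $X$.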

    Since CN is well-ordered, Cantor's Theorem shows that each cardinal number has a cardinal successor, i.e., a least cardinal number greater than it. It follows that there is no largest cardinal number, and that CN is infinite. In fact, an exercise will show that the class of (infinite) cardinal numbers is a proper class, and, since it is contained in ON, it is order isomorphic to ON (past exercise). Explicitly, the order isomorphism is as follows. For $\alpha > 0$, we define $\aleph_\alpha$ to be the smallest cardinal number greater than $\aleph_\beta$ for all ordinals $\beta < \alpha$. Then $\alpha \mapsto \aleph_\alpha$ is an order isomorphism from ON onto the class of infinite cardinal numbers.

    In an exercise, you will see that $|\mathbb{N}| = |\mathbb{Z}| = |\mathbb{Q}| = \aleph_0$.

    Theorem 25. $|\mathbb{R}| = 2^{\aleph_0}$.

    Proof. We first show that $|[0, 1)| = 2^{\aleph_0}$. Consider the function $f : \mathcal{P}(\mathbb{Z}^+) \to [0, 1]$ that takes a set to the number $0.a_1a_2a_3\ldots_2$ whose binary expansion has zeros in the digits in the subset and 1s elsewhere. For example, $f(\mathbb{Z}^+) = 0$, $f(\{2\}) = 0.101111\ldots_2$, $f(\{3, 4, 5, \ldots\}) = 0.11_2$, and $\{1, 3, 5, 7, \ldots\} \mapsto 0.010101\ldots_2$. Each number in $[0, 1)$ has a binary expansion, and the expansion is unique, if we agree to rewrite expressions of the form $0.a_1 \ldots a_n 01111\ldots_2$ as $0.a_1 \ldots a_n 1_2$. So, if we restrict $f$ to the set $\mathcal{P}_\infty(\mathbb{Z}^+)$ of infinite subsets of $\mathbb{Z}^+$, we get a bijection $\mathcal{P}_\infty(\mathbb{Z}^+) \to [0, 1)$. By an exercise, we thus have $|[0, 1)| = |\mathcal{P}_\infty(\mathbb{Z}^+)| = 2^{\aleph_0}$.

    Finally, the map $(n, x) \mapsto n + x$ is a one-to-one correspondence between $\mathbb{Z} \times [0, 1)$ and $\mathbb{R}$, so $|\mathbb{R}| = \aleph_0 \cdot 2^{\aleph_0} = 2^{\aleph_0}$.

    Remark. The famous Continuum Hypothesis is that $\aleph_1 = 2^{\aleph_0}$, or, in other words, that there are no cardinal numbers between $|\mathbb{N}|$ and $|\mathbb{R}|$. The Generalized Continuum Hypothesis is that $\aleph_n = 2^{\aleph_{n-1}}$ for all $n \in \mathbb{Z}^+$. Logicians have proven that both hypotheses are impossible to either prove or disprove from the standard axioms of mathematics.
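The map $f$ from the proof of Theorem 25 can be evaluated exactly for sets that are finite or have finite complement, which is enough to reproduce the examples given there. This is a small sketch using exact fractions; the two function names are assumptions of the illustration.

```python
from fractions import Fraction

def f_cofinite(complement):
    # f(A) for A = Z+ minus `complement`: the 1s sit exactly at the digits in `complement`.
    return sum((Fraction(1, 2 ** k) for k in complement), Fraction(0))

def f_finite(A):
    # f(A) for a finite set A: every digit is 1 (total value 1) except zeros at the digits in A.
    return 1 - sum((Fraction(1, 2 ** k) for k in A), Fraction(0))

value_of_all = f_cofinite(set())      # f(Z+)
value_of_two = f_finite({2})          # f({2}) = 0.101111..._2
value_of_tail = f_cofinite({1, 2})    # f({3, 4, 5, ...}) = 0.11_2
```

The last two values coincide, which shows why $f$ is not injective on all of $\mathcal{P}(\mathbb{Z}^+)$ and why the proof restricts to infinite subsets. Likewise `f_finite(set())` equals 1, which is why the unrestricted codomain is $[0, 1]$ rather than $[0, 1)$.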

    Definition. A set is countable if its cardinality is at most $\aleph_0$; otherwise, it is uncountable. (The name comes from the fact that a set is countable if and only if there is a way to well-order it so that you could count to any given element in a finite number of steps.) A set that is of cardinality equal to $\aleph_0$ is called countably infinite.

    Exercises.

    1. Let $X$ be an infinite set. Show that adding or subtracting a single point does not change its cardinality. (Do not use any results occurring after Theorem 20 in your proof, because most of their proofs rely on this exercise. Hint: First show that it does not matter which point is removed, then use the fact that $X$ is in one-to-one correspondence with a proper subset.)

    2. (This exercise will show that the class of (infinite) cardinals is a proper class.) Prove that every class with members of arbitrarily large cardinality is a proper class. (Hint: Suppose that such a class is a set, then arrive at a contradiction by constructing a set of greater cardinality than all the sets in it.)

    3. Show that the cardinal arithmetic definitions are well-defined.

    4. (a) Prove that $|\mathbb{Q}| = |\mathbb{Z}| = |\mathbb{Z}^+| = \aleph_0$. (Hint: Use cardinal addition and multiplication to reduce this to finding an injection $\mathbb{Q}^+ \to \mathbb{Z}^+ \times \mathbb{Z}^+$.)

    (b) Prove that $|\mathbb{C}| = 2^{\aleph_0}$. (Hint: Find a bijection $\mathbb{C} \to \mathbb{R} \times \mathbb{R}$.)

    5. (a) Let $\{X_\lambda\}_{\lambda \in \Lambda}$ be a family of disjoint sets of the same cardinality $\kappa$. Show that $\left|\bigcup_{\lambda \in \Lambda} X_\lambda\right| = \kappa|\Lambda|$.

    (b) Let $X$ be a nonempty set. Show that the set of finite sequences with elements from $X$ has cardinality $\aleph_0$ if $X$ is finite and cardinality $|X|$ if $X$ is infinite. (Hint: This set can be written as $\bigcup_{n=1}^{\infty} \prod_{k=1}^{n} X$.)

    (c) Let $X$ be an infinite set and $\mathcal{P}_f(X)$ (resp., $\mathcal{P}_\infty(X)$) be the set of its finite (resp., infinite) subsets. Prove that $|\mathcal{P}_f(X)| = |X|$ and $|\mathcal{P}_\infty(X)| = 2^{|X|}$. (Hint: Use (b) to prove the first equation, and then use cardinal addition to derive the second from the first.)

    6. (This exercise will prove the earlier comments about how every infinite set can be made order-isomorphic to infinitely many different ordinals, depending on how we choose to well-order it.) Let $X$ be a set and $C$ be the class of ordinal numbers in one-to-one correspondence with $X$. Show that (a) $C$ is a set, (b) $|C| = 1$ if $X$ is finite, and (c) $|C|$ is the successor cardinal of $|X|$ if $X$ is infinite. (Hint: Let $\kappa$ be the successor cardinal of $|X|$, and show that $C = \{\alpha \in \mathrm{ON} \mid |X| \leq \alpha < \kappa\}$ and $|X| + |C| = \kappa$.)

  • Chapter 2

    Group Theory

    In this chapter, we will begin our study of algebraic structures. We will go through the most elementary parts of group theory, covering topics such as quotient groups, direct products, and isomorphism, and then consider some important special kinds of groups, mainly the cyclic groups and the permutation groups. Group theory is kind of a peculiar topic in that the basic, most important parts are fairly simple, but studying it any further beyond that becomes incredibly intricate. So we will be able to do a thorough study of the key things fairly quickly (in about three weeks), and then move on to studying ring theory. As I mentioned before, my research area is commutative rings, so you can expect some bias in that direction later in the course.

    2.1 Semigroups, Monoids, and Groups

    [Durbin: Sections 3-5, 14]

    Definition.

    1. An operation on a set $S$ is a function $* : S \times S \to S$. We abbreviate $*(a, b) = a * b$.

    2. An operation $*$ on a set $S$ is:
       (a) associative if $a * (b * c) = (a * b) * c$ for all $a, b, c \in S$; and
       (b) commutative if $a * b = b * a$ for all $a, b \in S$.

    3. A semigroup is a pair $(S, *)$, where $S$ is a set and $*$ is an associative operation on $S$. (For simplicity, we will often just refer to $S$ as the semigroup when it is understood what the operation is.) A semigroup is commutative if its operation is. Sometimes we refer to the cardinality of a semigroup as its order.
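On a finite set, both properties in the definition can be tested by brute force over all pairs and triples. The following is a minimal sketch; the function names and the two sample operations on $\{0, \ldots, 4\}$ are choices made for this illustration.

```python
from itertools import product

def is_associative(S, op):
    # Check a * (b * c) == (a * b) * c for every triple.
    return all(op(a, op(b, c)) == op(op(a, b), c) for a, b, c in product(S, repeat=3))

def is_commutative(S, op):
    # Check a * b == b * a for every pair.
    return all(op(a, b) == op(b, a) for a, b in product(S, repeat=2))

S = range(5)
add_mod5 = lambda a, b: (a + b) % 5
sub_mod5 = lambda a, b: (a - b) % 5
```

With these definitions, `(S, add_mod5)` passes both checks and so is a commutative semigroup, while `sub_mod5` fails both, so it is an operation on `S` that does not give a semigroup.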

    Remark.


    1. In cases where there is no danger of confusion with some sort of standard multiplication operation, we will usually name our operation $\cdot$ instead of $*$. (This is standard in actual mathematical practice; Durbin prefers to use $*$ in all cases, as a pedagogical tool to emphasize that these operations do not necessarily correspond to any sort of familiar multiplication.) When we are using this multiplicative notation, we will make use of the standard abbreviations $a \cdot b = ab$ and $a^n = \underbrace{a \cdots a}_{n \text{ copies}}$. (We will see below that it is unambiguous to write products $a_1 \cdots a_n$ in a semigroup without parentheses, and that in a commutative semigroup the order of the factors does not matter.) It turns out that the familiar exponentiation properties hold in semigroups, i.e., if $S$ is a semigroup, then for $x \in S$ and $m, n \in \mathbb{Z}^+$, we have $(x^m)^n = x^{mn}$ and $x^m x^n = x^{m+n}$. These equations are really just a special case of the fact that it does not matter how one groups parentheses with an associative operation. For similar reasons, if $S$ is a commutative semigroup, $x, y \in S$, and $n \in \mathbb{Z}^+$, then $(xy)^n = x^n y^n$.

    2. The next most common name for an operation is $+$ (especially when the operation is commutative). When we are using this additive notation, we will make use of the abbreviation $na = \underbrace{a + \cdots + a}_{n \text{ copies}}$.

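    3. Any operation on a finite set $\{x_1, \ldots, x_n\}$ can be represented in table form as follows.

    $$\begin{array}{c|cccc}
    \cdot & x_1 & x_2 & \cdots & x_n \\ \hline
    x_1 & x_1^2 & x_1x_2 & \cdots & x_1x_n \\
    x_2 & x_2x_1 & x_2^2 & \cdots & x_2x_n \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    x_n & x_nx_1 & x_nx_2 & \cdots & x_n^2
    \end{array}$$

    This is called a Cayley table for the operation. Note the convention regarding order of multiplication: the $(i, j)$ entry is $x_ix_j$.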

    4. The class of all semigroups is a proper class. In fact, the corresponding classes for all the major algebraic structures we will discuss in this course (semigroups, monoids, groups, rings, integral domains, and fields) are all proper, because it will turn out that each of these classes has members of arbitrarily large cardinality. (We will delay the proof of this for quite a while.)

    Example.

    1. Addition and multiplication are commutative and associative operations on $\mathbb{C}$, but the subtraction operation is neither.

    2. Division and exponentiation are operations on $\mathbb{R}^+$ that are neither commutative nor associative.

    3. Addition modulo 4 is an operation on $\{0, 1, 2, 3\}$, with the following Cayley table.

    $$\begin{array}{c|cccc}
    +_4 & 0 & 1 & 2 & 3 \\ \hline
    0 & 0 & 1 & 2 & 3 \\
    1 & 1 & 2 & 3 & 0 \\
    2 & 2 & 3 & 0 & 1 \\
    3 & 3 & 0 & 1 & 2
    \end{array}$$

    It is commutative and associative.
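A Cayley table like the one for addition modulo 4 is easy to generate mechanically. The sketch below builds it as a list of rows; `cayley_table` is a name invented for this illustration.

```python
def cayley_table(elements, op):
    # Row i, column j holds op(elements[i], elements[j]).
    return [[op(a, b) for b in elements] for a in elements]

Z4 = [0, 1, 2, 3]
table = cayley_table(Z4, lambda a, b: (a + b) % 4)

# An operation is commutative exactly when its table is symmetric about the diagonal.
is_symmetric = all(table[i][j] == table[j][i] for i in range(4) for j in range(4))
```

Commutativity can thus be read off the table at a glance, whereas associativity generally cannot and has to be checked on triples.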

    Theorem 26 (Generalized Commutative and Associative Properties).

    1. In a semigroup, the values of expressions are unaffected by how one groups parentheses.

    2. In a commutative semigroup, the values of expressions are unaffected by order of factors.

    Proof. Let $S$ be a semigroup and $a_1, \ldots, a_n \in S$.

    1. We need to show that any product of $a_1, \ldots, a_n$, with the terms written in that order and parentheses inserted in any legal way, is equal to $(\cdots(((a_1a_2)a_3)a_4)\cdots)a_n$. The case $n \leq 3$ is already covered by the associative property, so assume $n \geq 4$. This product is an expression of the form $bc$, where $b$ (resp., $c$) is some sort of product of $a_1, \ldots, a_k$ (resp., $a_{k+1}, \ldots, a_n$), with the terms written in that order, for some $k \in \{1, \ldots, n-1\}$. By induction, we have $b = (\cdots(((a_1a_2)a_3)a_4)\cdots)a_k$ and $c = (\cdots(((a_{k+1}a_{k+2})a_{k+3})a_{k+4})\cdots)a_n = a_{k+1}(a_{k+2}(a_{k+3}(\cdots a_{n-2}(a_{n-1}a_n)\cdots)))$. If $n = k+1$, then we are done, so let us assume $n \geq k+2$. Then by the associative property $bc = (ba_{k+1})(a_{k+2}(a_{k+3}(\cdots a_{n-2}(a_{n-1}a_n)\cdots)))$, which by induction (viewing $ba_{k+1}$ as one factor) equals $(\cdots(((ba_{k+1})a_{k+2})a_{k+3})\cdots)a_n$, as desired.

    2. Assume that $S$ is commutative. We need to show that any product of $a_1, \ldots, a_n$, with the terms written in any order, is equal to $a_1 \cdots a_n$. The case $n = 1$ is trivial, so let us assume $n > 1$. This product is an expression of the form $ba_k$, where $b$ is a product whose factors consist of the $a_i$'s for $i \neq k$. By induction, we have $b = a_1 \cdots a_{k-1}a_{k+1} \cdots a_n$. If $k = n$, then we are done. Otherwise, by the commutative and associative properties we have $ba_k = (a_1 \cdots a_{k-1}a_{k+1} \cdots a_{n-1}a_k)a_n$, which by induction equals $(a_1 \cdots a_{n-1})a_n$, as desired.
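Part (1) of Theorem 26 can be explored experimentally by enumerating every legal way of inserting parentheses, using the same "split into $bc$" recursion as the proof. This is a sketch for illustration; `all_groupings` and the two sample operations are assumptions of the example.

```python
def all_groupings(factors, op):
    # Values of every legal parenthesization of the factors, kept in the given order.
    if len(factors) == 1:
        return [factors[0]]
    values = []
    for k in range(1, len(factors)):
        # Split as b * c, with b built from the first k factors and c from the rest.
        for b in all_groupings(factors[:k], op):
            for c in all_groupings(factors[k:], op):
                values.append(op(b, c))
    return values

concat = lambda a, b: a + b        # string concatenation: associative, not commutative
subtract = lambda a, b: a - b      # integer subtraction: not associative
words = all_groupings(["a", "b", "c", "d", "e"], concat)
diffs = all_groupings([1, 2, 3, 4], subtract)
```

For the associative operation all groupings collapse to a single value, while for subtraction different groupings of `1, 2, 3, 4` give different results.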

    Definition.

    1. An element 1 (resp., 0) of a semigroup $S$ is an identity (resp., absorbing) element if $1 \cdot a = a \cdot 1 = a$ (resp., $0 \cdot a = a \cdot 0 = 0$) for each $a \in S$. (It is also extremely common to use the symbol $e$ for an identity element, as Durbin does. If we are using additive notation, then we denote identity and absorbing elements with 0 and $\infty$, respectively. Then their defining properties would be written as $0 + a = a + 0 = a$ and $\infty + a = a + \infty = \infty$.) A semigroup with an identity element is called a monoid. Note that a semigroup has at most one identity (resp., absorbing) element, because if $x$ and $y$ are identity (resp., absorbing) elements, then $x = x \cdot y = y$. When necessary, we will add subscripts to indicate for which semigroup an element is an identity or absorbing element, e.g., $1_S$ or $0_S$.

    2. Let $S$ be a monoid. If $a, b \in S$ and $ab = 1$, then we say $a$ is a left inverse of $b$ and $b$ is a right inverse of $a$. If $ab = ba = 1$, then $a$ and $b$ are inverses. An element with an inverse (resp., right inverse, left inverse) is called invertible (resp., right invertible, left invertible). The invertible elements are also called units, and the set of units of $S$ is denoted $S^\times$. (The notation $U(S)$ is also common.) We say $S$ is a group if $S = S^\times$; a commutative group is called abelian. In an exercise you will show that if an element of $S$ has a left inverse $a$ and a right inverse $b$, then $a = b$. This shows that inverses are unique when they exist, so we may denote the inverse of $a \in S^\times$ by $a^{-1}$. (In additive notation, we use $-a$ for the inverse of $a$, and we abbreviate $x - a = x + (-a)$.) Also, in order to determine the inverse of an element of a group, it suffices to find a left or right inverse. We note that $1 \in S^\times$, that $(a^{-1})^{-1} = a$ for each $a \in S^\times$, and that the inverse of a product of units is given by $(u_1 \cdots u_n)^{-1} = u_n^{-1} \cdots u_1^{-1}$. (This can be easily proven with induction. The order is important if $S$ is not commutative.) Thus $S^\times$ is a group, and is for that reason often called the group of units of $S$.

    3. If $x$ is a member of a monoid $S$, we define $x^0 = 1$. (In additive notation, this would be written $0 \cdot x = 0$.) Observe that with this definition the previously noted exponentiation rules now apply for all natural number exponents. If additionally $x$ is a unit, then for each $n \in \mathbb{Z}^+$ we have $(x^n)^{-1} = (x^{-1})^n$, and we define $x^{-n}$ to be this element. (In additive notation, this definition would be written $(-n)x = -(nx) = n(-x)$.)
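In a finite monoid, the identity element, the absorbing element, and the units can all be found by exhaustive search. The sketch below does this for multiplication modulo 6; the function names are invented for this illustration.

```python
def identity(S, op):
    # The element e with e*a == a == a*e for all a, or None if there is none.
    return next((e for e in S if all(op(e, a) == a == op(a, e) for a in S)), None)

def absorbing(S, op):
    # The element z with z*a == z == a*z for all a, or None if there is none.
    return next((z for z in S if all(op(z, a) == z == op(a, z) for a in S)), None)

def units(S, op):
    # Elements having a two-sided inverse with respect to the identity.
    e = identity(S, op)
    return [a for a in S if any(op(a, b) == e == op(b, a) for b in S)]

Z6 = list(range(6))
mul = lambda a, b: (a * b) % 6
```

Here the identity is 1, the absorbing element is 0, and the units are 1 and 5, each of which is its own inverse. The units are closed under multiplication, in line with the remark that they form a group.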

    Remark.

    1. Note that the symbols 1 and 0 now have two possibly different meanings: the natural numbers 1 and 0, or identity/absorbing elements for the semigroup we are discussing. (Occasionally, like in $(\mathbb{N}, \cdot)$, the two meanings coincide.) We have to determine from context which interpretation of these symbols is appropriate.

    2. An absorbing element of a monoid is a unit if and only if it is the only element. (Exercise.) Thus groups with more than one element (called nontrivial groups) do not have an absorbing element.

    Proposition 27 (Exponentiation Rules). Let $G$ be a group, $g \in G$, and $m, n \in \mathbb{Z}$.

    1. $(g^m)^n = g^{mn}$. In particular, $(g^n)^{-1} = g^{-n}$.

    2. $g^m g^n = g^{m+n}$.

    We note that the exponentiation rule $(xy)^n = x^n y^n$ is valid only in abelian groups. (Exercise.)

    Proof.

    1. We have already noted that this is true if $m, n > 0$ (and that $(x^n)^{-1} = (x^{-1})^n = x^{-n}$ for $n > 0$), and if $m = 0$ or $n = 0$, then both sides equal 1. If $m > 0$ and $n < 0$, then $(g^m)^n = ((g^m)^{-n})^{-1} = (g^{-mn})^{-1} = g^{mn}$. If $m < 0$ and $n > 0$, then $(g^m)^n = ((g^{-1})^{-m})^n = (g^{-1})^{-mn} = g^{mn}$. Finally, if $m, n < 0$, then $(g^m)^n = (((g^{-m})^{-1})^{-1})^{-n} = (g^{-m})^{-n} = g^{mn}$.

    2. We have already noted that this is true if $m, n > 0$, and if $m = 0$ (resp., $n = 0$), then both sides equal $g^n$ (resp., $g^m$). If $m > 0$, $n < 0$, and $m + n \geq 0$, then $-n > 0$ and $g^m g^n = (g^{m+n} g^{-n}) g^n = g^{m+n}$. If $m > 0$, $n < 0$, and $m + n < 0$, then $-(m + n) > 0$ and $g^{-n} g^{-m} = (g^{-(m+n)} g^m) g^{-m} = g^{-(m+n)}$, and taking inverses yields $g^{m+n} = g^m g^n$. We have now established all cases where $m \geq 0$. If $m < 0$, then $-m > 0$ and $g^m g^n = (g^{-1})^{-m} (g^{-1})^{-n} = (g^{-1})^{-(m+n)} = g^{m+n}$.
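Both exponentiation rules can be spot-checked in a concrete group, including negative exponents. The sketch below uses the nonzero residues modulo 7 under multiplication, which form a group because 7 is prime; that choice of group, and the helper names `inverse` and `power`, are assumptions of the example.

```python
def inverse(g, n):
    # Brute-force inverse in the group of units modulo n.
    return next(h for h in range(1, n) if (g * h) % n == 1)

def power(g, k, n):
    # g^k modulo n for any integer k, following the definition g^(-k) = (g^(-1))^k.
    base = g if k >= 0 else inverse(g, n)
    result = 1
    for _ in range(abs(k)):
        result = (result * base) % n
    return result

n = 7
exponents = range(-5, 6)
rule1 = all(power(power(g, a, n), b, n) == power(g, a * b, n)
            for g in range(1, n) for a in exponents for b in exponents)
rule2 = all((power(g, a, n) * power(g, b, n)) % n == power(g, a + b, n)
            for g in range(1, n) for a in exponents for b in exponents)
```

Both checks succeed over the tested range of exponents, covering every sign combination treated in the case analysis of the proof.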

    Example. In the following example, I will list several examples of groups and discuss the various concepts we have defined in this section in relation to these groups. I will list several facts without proof, and when you are reading through them you should be thinking about why they are true, in order to develop a feeling for these concepts.

    1. $\emptyset$ is a commutative semigroup whose operation is the empty function. However, a monoid cannot be empty.

    2. The set $2\mathbb{Z}$ of even integers is a commutative semigroup under $\cdot$, but it is not a monoid.

    3. Let $R \in \{\mathbb{Z}, \mathbb{Q}, \mathbb{R}, \mathbb{C}\}$. Then both $(R, \cdot)$ and $(R, +)$ are commutative monoids. The former has identity 1, absorbing element 0, and its group of units is $\{1, -1\}$ if $R = \mathbb{Z}$ and $R \setminus \{0\}$ otherwise. The latter is an abelian group with identity 0; the inverses are the familiar additive inverses.

    4. Let $n \in \mathbb{Z}^+$ and $R \in \{\mathbb{Z}, \mathbb{Q}, \mathbb{R}, \mathbb{C}\}$. The set $M_n(R)$ of $n \times n$ matrices over $R$ forms a monoid under either the usual matrix multiplication or the usual matrix addition. The latter is an abelian group whose identity is the zero matrix. The former is commutative if and only if $n = 1$, its identity is the identity matrix (hence the name), its absorbing element is the zero matrix, and its group of units is the general linear group of degree $n$ over $R$, which is denoted $GL_n(R)$ and consists of the matrices whose determinant is a unit in $R$.


    5. If $X$ is a set, then $(\mathcal{P}(X), \cap)$ and $(\mathcal{P}(X), \cup)$ are commutative monoids. The former has identity $X$ and absorbing element $\emptyset$, and in the latter those elements' roles are reversed. These monoids are groups if and only if $X = \emptyset$.

    6. If $X$ is a set, then $(M(X), \circ)$ is a monoid. The identity is the identity map (hence the name), and there is an absorbing element $\Leftrightarrow$ $|X| \leq 1$ $\Leftrightarrow$ $M(X)$ is commutative. The units are the bijections, which are called permutations of $X$; thus $M(X)$ is a group if and only if $|X| \leq 1$. The group of units is denoted $\mathrm{Sym}(X)$ and called the symmetric group on $X$; it is abelian $\Leftrightarrow$ $|X| \leq 2$. We will be studying the symmetric groups later.

    7. If $X$ is a nonempty set and $(S, \cdot)$ is a semigroup (resp., monoid, group), then so is $(M(X, S), \cdot)$, where $\cdot$ is defined on $M(X, S)$ in the obvious way: $(fg)(x) = f(x)g(x)$. If $S$ is a monoid, then $1_{M(X,S)} : X \to S : x \mapsto 1_S$, and $M(X, S)^\times = M(X, S^\times)$, where the inverse of $f \in M(X, S^\times)$ is the map $x \mapsto f(x)^{-1}$. Note that $M(X, S)$ is commutative if and only if $S$ is.

    8. If $\{S_\lambda\}_{\lambda \in \Lambda}$ is a family of semigroups (resp., monoids, groups), then so is the direct product $\prod_{\lambda \in \Lambda} S_\lambda$ with the operation $(fg)(\lambda) = f(\lambda)g(\lambda)$, i.e., multiplication is done coordinate-wise. The direct product is commutative if and only if each $S_\lambda$ is. If these are monoids, then the identity of the direct product is the map $\lambda \mapsto 1_{S_\lambda}$, and $\left(\prod_{\lambda \in \Lambda} S_\lambda\right)^\times = \prod_{\lambda \in \Lambda} S_\lambda^\times$. In other words, the units are the elements with units in each coordinate, and the inverse of a unit $\lambda \mapsto u_\lambda$ is the map $\lambda \mapsto u_\lambda^{-1}$.

    9. Let $n \geq 1$ and $\mathbb{Z}_n = \{0, 1, \ldots, n-1\}$. Let $+_n$ and $\cdot_n$ represent addition and multiplication modulo $n$. Then $(\mathbb{Z}_n, +)$ is an abelian gro