Introduction - Kansas State Universitycochrane/m506/m506s13lec.pdf · Squaring numbers that end in...

MATH 506: INTRODUCTION TO NUMBER THEORY,

SPRING 2013

TODD COCHRANE

1. Introduction

N = Natural Numbers: 1, 2, 3, 4, 5, 6, . . . .

Kronecker: “God created the natural numbers. Everything else is man’s handiwork.”

Gauss: “Mathematics is the queen of sciences, and number theory is the queen of mathematics.”

Number Theory: The study of the natural numbers.

Questions: Let P = {2, 3, 5, 7, 11, 13, . . . } = Primes.

Q1. Are there infinitely many primes?

Q2. How many primes are there up to a given value x?

Q3. Are there infinitely many twin primes? (3,5), (5,7), (11,13), (17,19), etc.

Q4. Which primes can be expressed as a sum of two squares? $5 = 1^2 + 2^2$, $13 = 2^2 + 3^2$, etc.

Which problems are easy and which are hard?

Q1: We can answer affirmatively at the beginning of this semester. The proof goes back to Euclid.

Q2: A formula was conjectured by Gauss, but not proven until 1896, by J. Hadamard and C. de la Vallée Poussin. Let π(x) = # primes ≤ x. Gauss at age 15 used a table of primes up to 100000 to make a table for π(N) and compared it with li(x).

Q3: This is still an open problem.

Q4: Make a table with primes up to 43 and test them. What conjecture do you make?

1. Theory: Axioms, Properties, Theorems, Beauty, Art form, Depth.

2. Puzzles, Patterns and Games: Amateur mathematicians of all ages enjoy such problems. This is important in early school education to get children interested in mathematics and in thinking. People enjoy mathematical puzzles more than is generally believed. Chess, Checkers, Tic-Tac-Toe, Card Games, Crossword Puzzles, etc. all involve elements of mathematical reasoning and build valuable skills.

3. Applications: Communications: cryptography and error-correcting codes. Physics and chemistry: atomic theory, quantum mechanics. Music: musical scales, acoustics in music halls. Radar and sonar camouflage.

Example 1.1. Theory. Which primes can be expressed as a sum of two squares? Formulate a conjecture based on the pattern observed.

Date: May 6, 2013.


Example 1.2. Theory. Triangular numbers: 1, 3, 6, 10, 15, 21, 28, 36, 45, . . . , $n(n+1)/2$. Squares: 1, 4, 9, 16, 25, 36, . . . , $n^2$. Pentagonal numbers: 1, 5, 12, 22, 35, . . . , $n(3n-1)/2$. Fermat (1640), polygonal number conjecture: Every whole number is a sum of at most 3 triangular numbers, at most 4 squares, at most 5 pentagonal numbers, at most 6 hexagonal numbers, etc. Lagrange proved the case of squares, Gauss proved the case of triangular numbers, and Cauchy proved the general case.

The next few examples are patterns and puzzles.

Example 1.3. Squaring numbers that end in 5. Make a conjecture and prove it.

Example 1.4. $816^2 + 357^2 + 492^2 = 618^2 + 753^2 + 294^2$.

Example 1.5. Euler conjectured that a sum of three fourth powers could never be a fourth power. Elkies (1988) proved there are infinitely many counterexamples. $422481^4 = 95800^4 + 217519^4 + 414560^4$.

Example 1.6. Collatz conjecture. Open problem today. Start with any positive integer. If it is even, divide it by 2. If it is odd, multiply by 3 and add 1. Repeat. The conjecture is that after a finite number of steps one always ends up at 1.
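The iteration in Example 1.6 is easy to experiment with. The following is a minimal sketch; the function name `collatz_steps` and the safety cap `max_steps` are illustrative choices, not part of the notes.

```python
def collatz_steps(n: int, max_steps: int = 10_000) -> int:
    """Count the steps the Collatz iteration takes to reach 1 from n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            # The conjecture is open, so guard against a runaway loop.
            raise RuntimeError("step limit reached")
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


for start in (6, 7, 27):
    print(start, collatz_steps(start))
```

Running it for many starting values gives numerical evidence only; it does not settle the conjecture.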

Example 1.7. N. Elkies and I. Kaplansky. Every integer n can be expressed as a sum of a cube and two squares. Note that n may be negative, as also may be the cube. For example, if n is odd, say n = 2k + 1, then

$$n = 2k + 1 = (2k - k^2)^3 + (k^3 - 3k^2 + k)^2 + (k^2 - k - 1)^2.$$

Example 1.8. There are just five numbers which are the sums of the cubes of their digits: $1 = 1^3$, $153 = 1^3 + 5^3 + 3^3$, $370 = 3^3 + 7^3 + 0^3$, $371 = 3^3 + 7^3 + 1^3$, $407 = 4^3 + 0^3 + 7^3$. This is an amusing fact, although challenging to prove (extra credit).
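The list in Example 1.8 can be checked by a short search. This sketch assumes the (easily verified) bound that a number with five or more digits always exceeds the sum of the cubes of its digits, so only numbers up to 4 · 9^3 = 2916 need testing; the names `digit_cube_sum` and `SEARCH_LIMIT` are illustrative.

```python
def digit_cube_sum(n: int) -> int:
    """Return the sum of the cubes of the decimal digits of n."""
    return sum(int(d) ** 3 for d in str(n))


# A d-digit number has digit-cube sum at most 729 * d. For d >= 5 this is
# smaller than 10**(d - 1), so any solution has at most 4 digits and is
# therefore at most 4 * 729 = 2916.
SEARCH_LIMIT = 4 * 9 ** 3

fixed_points = [n for n in range(1, SEARCH_LIMIT + 1) if digit_cube_sum(n) == n]
print(fixed_points)
```

The search confirms the list but the bound in the comment is the part that needs a written argument.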

Example 1.9. Start with any four-digit number, say 2512 (with not all the same digits). Rearrange the digits and subtract the smaller from the larger. Repeat. What happens?
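One way to explore Example 1.9 by machine is sketched below. It assumes a specific reading of "rearrange the digits": form the largest and the smallest numbers from the four digits (keeping leading zeros) and subtract. The function names are illustrative.

```python
def kaprekar_step(n: int) -> int:
    """Largest digit arrangement minus smallest, for a four-digit n."""
    digits = sorted(f"{n:04d}")          # keep leading zeros
    smallest = int("".join(digits))
    largest = int("".join(reversed(digits)))
    return largest - smallest


def kaprekar_orbit(n: int) -> list[int]:
    """Iterate kaprekar_step until a value repeats; return the values seen."""
    seen = []
    while n not in seen:
        seen.append(n)
        n = kaprekar_step(n)
    return seen


print(kaprekar_orbit(2512))
```

Trying several starting values is a good way to form the conjecture the example asks for.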

Example 1.10. Consider the six-digit number x = 142857. Note that 2x = 285714, 3x = 428571, 4x = 571428, 5x = 714285, 6x = 857142. Is this just a coincidence? Are there any other six-digit numbers with such a cyclical property? Have you ever seen the digits 142857 before? (There’s a little bit of theory going on in this problem. The result can be generalized.)

Note that the last three examples depend on the base 10 representation of natural numbers. The important properties of the natural numbers are those that are intrinsic, that is, that do not depend on the manner in which the number is represented.

Example 1.11. $1^3 + 2^3 = 9 = (1 + 2)^2$. $1^3 + 2^3 + 3^3 = 36 = (1 + 2 + 3)^2$. $1^3 + 2^3 + 3^3 + 4^3 = 100 = (1 + 2 + 3 + 4)^2$. Maybe we’ve discovered a general formula. Let’s see, is it always true that $x^3 + y^3 = (x + y)^2$? No. But we suspect that $1^3 + \cdots + n^3 = (1 + 2 + \cdots + n)^2$. We shall use induction to prove results of this nature.


Example 1.12. Prime Numbers. 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, . . . The prime numbers are the building blocks of the whole numbers, in the sense that every whole number is a product of primes. Do they just pop up at random? How many primes are there?

Theorem 1.1. There are infinitely many primes.

The proof goes back to Euclid (300 B.C.). Proof by contradiction. Suppose there are at most finitely many primes, say $p_1, p_2, \ldots, p_k$. Consider the integer $N = p_1 p_2 \cdots p_k + 1$. Now N must be divisible by some prime, say $p_i$. It follows that 1 is a multiple of $p_i$, which is absurd. Thus our assumption must be wrong.

Depth: Are there infinitely many twin primes? How many primes are there up to N? Gauss, using a table of primes up to 100000, at the age of 15 made a table comparing the number of primes up to N with the function

$$\mathrm{li}(x) = \int_2^x \frac{dt}{\log t}.$$

N         π(N)          li(N)
10^3      168           178
10^4      1229          1246
10^5      9592          9630
10^6      78498         78628
10^7      664579        664918
10^8      5761455       5762209
10^9      50847534      50849235
10^10     455052512     455055614

The ratio of these two quantities approaches 1 as N goes to infinity. This can be proved, although Gauss wasn’t able to prove it. (He was, however, instrumental in the development of the complex numbers, which are an essential tool for proving this result.) It is called the Prime Number Theorem, one of the jewels of mathematics. It was proven by J. Hadamard and C. de la Vallée Poussin (1896). Look briefly at the axiom sheet, in particular the associative law.

1.1. A brief look at the Axioms sheet. Look at the axiom sheet. The axioms on the first page are shared by the real number system. What distinguishes the integers is their discreteness property. There are three equivalent ways of expressing this property.

Well-Ordering Axiom of the Integers. Any nonempty subset S of positive integers contains a minimum element. That is, there is a minimal element m in S having the property that m ≤ x for all x ∈ S.

Note 1.1. (i) The rationals and reals do not have such a property. Consider for example the set of real numbers on the interval (0,1).

(ii) It is this property that assures us that there is no integer hiding somewhere between 0 and 1, in other words that 1 is the smallest positive integer. For if such an integer a < 1 existed we could construct an infinite descending chain of positive integers $a, a^2, a^3, \ldots$, with no minimal element.

Axiom of Induction. If S is a nonempty subset of N containing 1 and having the property that if n ∈ S then n + 1 ∈ S, then S = N.

Note 1.2. It is from this axiom that we obtain the Principle of Induction, which is the basis for induction proofs.


Note: We say that a set of integers S is bounded above if there is some number L, say, such that x < L for all x ∈ S.

Maximum Element Property of the integers. Any nonempty set S of integers bounded above contains a maximum element, that is, there is an element M ∈ S such that x ≤ M for all x ∈ S.

Note the use of the word “has”: If we say a set S has an upper bound, this does not mean the upper bound is in S. If we say a set S has a maximum element, this does mean the maximum element is in S.

2. Divisibility Properties

Definition 2.1. Let a, b ∈ Z, with a ≠ 0. We say that a divides b, written a|b, if there is an integer x such that ax = b.

Equivalently: a|b iff b/a is an integer. This formulation assumes that we have already constructed the set of rational numbers. Our textbook uses this as a definition. In this class I want you to be able to write proofs about integers just using the axioms for the integers (so avoid using the rationals).

Terminology: The definition above gives the mathematical meaning of “a divides b”. This is not a common usage of the word divides; it sounds like the number a is doing something to b. Note the difference between 3|6 and 3/6.

Other variations of 3|6: We can say 3 divides 6, 3 is a divisor of 6, 3 is a factor of 6, 6 is divisible by 3, 6 is a multiple of 3.

Example 2.1. 3|15, 5|15, but 7 ∤ 15. What is wrong with saying 7|15 because 7 × 15/7 = 15?

Example 2.2. List all divisors of 12. What numbers divide 0? What numbers are divisible by 0?

Example 2.3. Find all positive n such that 5|n and n|60. Since 5|n, n = 5k for some integer k. Since 5k|60, 5kx = 60 for some integer x. Thus kx = 12, so k is a divisor of 12, and we can let k = 1, 2, 3, 4, 6, 12, giving n = 5, 10, 15, 20, 30, 60.

Theorem 2.1. Transitive property of divisibility. If a|b and b|c then a|c.

Proof. You should be able to write a rigorous proof starting from the definition of divisibility. Note the use of the associative law. □

Example 2.4. 7|42, 42|420 therefore, 7|420.

Theorem 2.2. Additive property of divisibility. Let a, b, c be integers such that c|a and c|b. Then

(i) c|(a + b),

(ii) c|(a − b), and

(iii) for any integers x, y, c|(ax + by).

Proof. Again, this is a basic proof. □

Example 2.5. Note that the additive property of divisibility can be reworded: If a and b are multiples of c then so is a + b. Thus, a sum of two evens is even, or a sum of two multiples of 5 is a multiple of 5.

Example 2.6. 3|21, 3|15. Therefore 3|(21−15), i.e. 3|6, and 3|(2 ·21+15), i.e. 3|57.


Definition 2.2. Let a, b be integers, not both 0.

1) An integer d is called the greatest common divisor (gcd) of a and b, denoted gcd(a, b) or (a, b), if (i) d is a divisor of both a and b, and (ii) d is the greatest common divisor, that is, if e|a and e|b then d ≥ e.

2) An integer m is called the least common multiple (lcm) of a, b, denoted lcm[a, b] or [a, b], if (i) m > 0, (ii) m is a common multiple of a and b, and (iii) m is the least such common multiple.

Note: (i) If a, b are not both 0, then (a, b) exists and is unique. Proof. Let S be the set of common divisors of a and b. Note, S is nonempty since 1 ∈ S. Also, S is bounded above by max(|a|, |b|), that is, if x ∈ S then x ≤ max(|a|, |b|). Thus by the Maximum Element Property S contains a maximum element.

(ii) For any a, b, not both zero, gcd(a, b) ≥ 1. Why? 1 is always a common divisor, and 1 is the smallest positive integer.

(iii) (0, 0) is not defined.

(iv) gcd(0, a) = |a| for any nonzero a.

(v) lcm[a, 0] does not exist.

Example 2.7. (6,−2) = 2, (0, 17) = 17, [6,−2] = 6, [6, 10] = 30.

There are three ways of computing GCD’s: (i) Brute force. (ii) Factoring method. (iii) Euclidean Algorithm. For large numbers, the Euclidean algorithm is much faster. A PC can handle GCD’s of numbers with hundreds of digits using the Euclidean algorithm in no time. By contrast, no known algorithm can factor a general number with many hundreds of digits in any practical amount of time.

Example 2.8. Factoring Method. Find gcd(240, 108) given the factorizations $240 = 2^4 \cdot 3 \cdot 5$, $108 = 2^2 \cdot 3^3$. Find lcm[240, 108].

Example 2.9. Find gcd(1127, 1129).

The Euclidean algorithm is based on the following

Lemma 2.1. gcd subtraction lemma. Let a, b be integers, not both 0. Then for any integer k, (a, b) = (a − kb, b).

Proof. Let S be the set of common divisors of a, b and T the set of common divisors of a − kb, b. Claim S = T: if e|a and e|b then e|(a − kb), and conversely if e|(a − kb) and e|b then e|((a − kb) + kb), that is, e|a. So S and T have the same maximal element. □

Example 2.10. (Euclidean Algorithm.) Show gcd(234, 182) = 26

Example 2.11. (108, 48) = (108 − 96, 48) = (12, 48) = 12. What we are actually doing is computing 108/48 = 2 + 12/48.

In order to implement the Euclidean algorithm we use the Division algorithm.

Theorem 2.3. Division Algorithm. Let a, b be any integers with b > 0. Then there exist unique integers q, r such that a = qb + r and 0 ≤ r < b. q is called the quotient, and r the remainder. Equivalently, we can write $\frac{a}{b} = q + \frac{r}{b}$.

Proof. Existence: Let S = {x ∈ Z : xb ≤ a}. Then S is bounded above and so it contains a maximum element, say q. Define r = a − qb. Then a = qb + r. By maximality of q we have qb ≤ a < (q + 1)b, and so 0 ≤ r < b.

Uniqueness: Suppose that a = qb + r = q′b + r′, with 0 ≤ r, r′ < b. Then b|q − q′| = |r′ − r| < b. Since the LHS is a multiple of b this is only possible if q = q′. It follows that r = r′. □


Example 2.12. Find q, r when −392 is divided by 15. We first observe that 392/15 = 26 + 2/15, so that 392 = 15 · 26 + 2 and so −392 = (−27) · 15 + 13. Thus q = −27 and r = 13.
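The quotient and remainder of the Division Algorithm can be computed directly. The sketch below relies on the fact that Python's built-in `divmod` uses floor division, so for b > 0 the remainder satisfies 0 ≤ r < b even when a is negative; the wrapper name `division_algorithm` is illustrative.

```python
def division_algorithm(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a = q*b + r and 0 <= r < b, for b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    q, r = divmod(a, b)  # floor division keeps r in the range 0 <= r < b
    return q, r


for a in (392, -392):
    q, r = division_algorithm(a, 15)
    print(f"{a} = ({q}) * 15 + {r}")
```

Many other languages truncate toward zero instead, which would give a negative remainder for a = −392, so this behaviour should not be assumed elsewhere.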

2.1. Euclidean Algorithm. A procedure for calculating gcd’s by using successive applications of the division algorithm.

I) Traditional Euclidean Algorithm: A positive remainder is always chosen. Let a ≥ b > 0 be positive integers. Then, by the division algorithm and gcd subtraction lemma, we have

$$a = b q_1 + r_1, \quad 0 \le r_1 < b, \quad (a, b) = (r_1, b) \tag{2.1}$$

$$b = r_1 q_2 + r_2, \quad 0 \le r_2 < r_1, \quad (a, b) = (r_1, r_2) \tag{2.2}$$

$$\cdots \tag{2.3}$$

$$r_{k-2} = r_{k-1} q_k, \quad (a, b) = r_{k-1}. \tag{2.4}$$

Since $r_1 > r_2 > \cdots > r_{k-1}$ we are guaranteed that this process will stop in a finite number of steps.
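The chain of divisions in the traditional algorithm can be written as a short loop. This is a minimal sketch; the function name `euclid_gcd` and the choice to return the list of division steps alongside the gcd are illustrative.

```python
def euclid_gcd(a: int, b: int) -> tuple[int, list[str]]:
    """Return gcd(a, b) and the division steps, for integers a >= b > 0."""
    steps = []
    while b != 0:
        q, r = divmod(a, b)                 # a = q*b + r with 0 <= r < b
        steps.append(f"{a} = {q}*{b} + {r}")
        a, b = b, r                         # (a, b) = (b, r) by the subtraction lemma
    return a, steps


d, steps = euclid_gcd(234, 182)
print("gcd =", d)
for line in steps:
    print(line)
```

The last nonzero remainder is the gcd, matching the final line of the scheme above.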

II) Fast Euclidean Algorithm: In your homework you prove the following version of the division algorithm. Given integers a, b with b > 0, there exist integers q and r such that a = qb + r with |r| ≤ b/2. Thus if we allow ourselves to work with negative remainders we can assume that the remainder in absolute value is always cut by a factor of 2. Thus $|r_1| \le b/2$, $|r_2| \le |r_1|/2 \le b/4$, . . . , $|r_i| \le b/2^i$. Thus the algorithm terminates in at most about $\log_2 b$ steps.

Example 2.13. Find gcd(150, 51) both ways.

2.2. Linear Combinations and the GCDLC theorem.

Definition 2.3. A linear combination of two integers a, b is an integer of the form ax + by, with x, y ∈ Z. Thus, we say that an integer d is a linear combination of a and b if there exist integers x, y such that d = ax + by.

Example 2.14. Find all linear combinations of 9 and 15. Try to get the smallest possible.

x     y     9x + 15y
1     0     9
0     1     15
1     1     24
2     −1    3

Note that every linear combination is a multiple of 3, the greatest common divisor of 9, 15.

Recall: We saw earlier that if d is a common divisor of a, b then d|(ax + by) for any x, y ∈ Z. In particular this holds for the greatest common divisor of a, b.

Claim: If d = gcd(a, b) then d can be expressed as a linear combination of a and b.

Example 2.15. gcd(20, 8) = 4. By trial and error, 4 = 1 · 20 + (−2) · 8. gcd(21, 15) = 3. By trial and error, 3 = 3 · 21 − 4 · 15.

To prove the claim in general we again use the Euclidean Algorithm, together with the method of back substitution.


Example 2.16. Find d = gcd(126, 49).

(1) 126 = 2 · 49 + 28, d = gcd(28, 49)

(2) 49 = 28 + 21, d = gcd(28, 21)

(3) 28 = 21 + 7, d = gcd(7, 21)

(4) 21 = 3 · 7, d = gcd(7, 0) = 7, STOP

Back Substitution: A method of solving the equation d = ax + by (with d = gcd(a, b)) by working backwards through the steps of the Euclidean algorithm.

Example 2.17. Use the example above for gcd(126, 49) to express 7 as a LC of 126 and 49. Use the method of back substitution. Start with equation (3): 7 = 28 − 21. By (2) we have 21 = 49 − 28. Substituting this into the previous equation yields 7 = 28 − (49 − 28) = 2 · 28 − 49. By (1) we have 28 = 126 − 2 · 49. Substituting this into the previous equation yields 7 = 2 · (126 − 2 · 49) − 49 = 2 · 126 − 5 · 49, QED.

Theorem 2.4. GCDLC Theorem.

(i) The gcd of two integers a, b can be expressed as a linear combination of a, b.

(ii) Every LC of a, b is a multiple of (a, b) and conversely every multiple of (a, b) is a LC of a, b.

(iii) In particular, (a, b) is the smallest positive LC of a, b.

Example 2.18. Suppose I tell you that a, b are whole numbers such that 45a + 37b = 1. What is (a, b)?

Proof. The discussion above indicates how to prove (i), although we just did it with one example. The first part of (ii) is just a special case of the additive property of divisibility. For the second part of (ii) let d = (a, b). Then we can write d = ax + by for some integers x, y. Suppose that kd is an arbitrary multiple of d. Then kd = (kx)a + (ky)b and so kd is a LC of a, b. (iii) is obvious from (ii) since every positive multiple of d is ≥ d. □

Array Method. A more efficient method of expressing the gcd as a linear combination.

Example 2.19. Redo the example above using the array method. Perform the Euclidean algorithm on the numbers in the top row, but do column operations on the whole array. Let $C_1$ be the column with top entry 126, $C_2$ the column with top entry 49, etc. Then $C_3 = C_1 - 2C_2$, $C_4 = C_2 - C_3$, $C_5 = C_3 - C_4$.

126x + 49y    126    49    28    21    7
x             1      0     1     −1    2
y             0      1     −2    3     −5

Thus, 7 = 2 · 126 − 5 · 49.
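The array method translates directly into code if each column is stored as a triple (top entry, x, y). The sketch below is one possible rendering; the function name `array_method` is illustrative, and it assumes a, b > 0.

```python
def array_method(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = gcd(a, b) = a*x + b*y, for a, b > 0.

    Each column of the array is a triple (value, x, y) with value = a*x + b*y.
    """
    prev = (a, 1, 0)   # column C1
    curr = (b, 0, 1)   # column C2
    while curr[0] != 0:
        q = prev[0] // curr[0]
        # New column = previous column - q * current column, entry by entry.
        nxt = tuple(p - q * c for p, c in zip(prev, curr))
        prev, curr = curr, nxt
    return prev


d, x, y = array_method(126, 49)
print(f"{d} = ({x})*126 + ({y})*49")
```

The loop stops when the top entry reaches 0, and the column just before it holds the gcd together with its coefficients.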

Example 2.20. Find gcd(83, 17) and express it as a LC of 83 and 17.

83x + 17y    83    17    15    2     1
x            1     0     1     −1    8
y            0     1     −4    5     −39

Thus gcd = 1 and 1 = 8 · 83 − 39 · 17.


Example 2.21. Solve the equation 15x + 21y + 35z = 1, that is, express 1 as a LC of 15, 21 and 35, using the array method.

15x + 21y + 35z    15    21    35    6     14    1
x                  1     0     0     −1    0     1
y                  0     1     0     1     −1    1
z                  0     0     1     0     1     −1

Thus 15 + 21 − 35 = 1.

Solving Linear Equations in integers: Solve ax + by = c. Put d = (a, b). The GCDLC theorem tells us that this equation can be solved iff c is a multiple of d, that is, d|c.

Theorem 2.5. Solvability of a Linear Equation. Let a, b, c ∈ Z with d = (a, b). The linear equation ax + by = c has a solution in integers x, y iff d|c.

Definition 2.4. We say two integers a, b are relatively prime if gcd(a, b) = 1, that is, a, b have no common factor other than ±1.

Theorem 2.6. Let a, b ∈ Z with d = (a, b). Then $\left(\frac{a}{d}, \frac{b}{d}\right) = 1$. (Note the two fractions are integers.)

Proof. If k is a common positive divisor of $\frac{a}{d}$, $\frac{b}{d}$ then kd is a common divisor of a, b, so k = 1, by maximality of d. □

Theorem 2.7. Euclid’s Lemma. If a, b, c are integers with a|bc and gcd(a, b) = 1, then a|c.

Proof. Use GCDLC. □

Note 2.1. In general, if a|bc can we conclude that a|b or a|c? No.

Note 2.2. Further applications of the GCDLC theorem, in homework.

i) Every common divisor of a and b is a divisor of gcd(a, b).

ii) Every common multiple of a and b is a multiple of lcm[a, b].

2.3. Linear Equations in two variables. Solve

ax+ by = c (NH) ax+ by = 0 (H)

in integers. Geometrically, we are looking for integer points on a line in the plane. We start by using a principle that you are familiar with from differential equations, namely that the general solution to (NH) is obtained by finding a particular solution of (NH) and adding to it any solution of (H). This works because the equation is linear.

Suppose that $(x_0, y_0)$ is a particular solution of (NH). Let (x, y) be any solution of (H). Then $(x_0 + x, y_0 + y)$ is also a solution of (NH). Conversely, if $(x_1, y_1)$ is any solution of (NH) then we can write $(x_1, y_1) = (x_0, y_0) + (x_1 - x_0, y_1 - y_0)$ where the latter is a solution of (H).

Focus on solving (H). Let d = (a, b). Then $ax = -by$, so $\frac{a}{d}x = -\frac{b}{d}y$. Since $\frac{a}{d}$ and $\frac{b}{d}$ are relatively prime, Euclid’s Lemma gives $\frac{a}{d} \mid y$, say $y = \frac{a}{d}t$ with t ∈ Z. Then $x = -\frac{b}{d}t$. Conversely, for any integer t, these values of x, y yield a solution.

Thus we have

Theorem 2.8. Let d = (a, b). Then the equation (NH) above has a solution iff d|c. Suppose d|c and that $(x_0, y_0)$ is a particular solution. Then the general solution is given by $x = x_0 - \frac{b}{d}t$, $y = y_0 + \frac{a}{d}t$, with t any integer. (Draw picture).


In applications we may wish to restrict the variables to positive values.

Example 2.22. A person has a collection of 17 cent and 25 cent stamps, but fewer than 30 of the 25 cent stamps. How can he mail a parcel costing $8.00?
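The stamp problem amounts to solving 17x + 25y = 800 in nonnegative integers with y < 30. The sketch below applies the general solution x = x0 − (b/d)t, y = y0 + (a/d)t of Theorem 2.8; the helper names `extended_gcd` and `nonnegative_solutions` are illustrative, and the code assumes a, b > 0.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = gcd(a, b) = a*x + b*y, for a, b > 0."""
    prev, curr = (a, 1, 0), (b, 0, 1)
    while curr[0] != 0:
        q = prev[0] // curr[0]
        prev, curr = curr, tuple(p - q * c for p, c in zip(prev, curr))
    return prev


def nonnegative_solutions(a: int, b: int, c: int) -> list[tuple[int, int]]:
    """All (x, y) with a*x + b*y = c and x, y >= 0, for a, b > 0."""
    d, x, y = extended_gcd(a, b)
    if c % d != 0:
        return []                      # no integer solutions unless d | c
    x0, y0 = x * (c // d), y * (c // d)
    step_x, step_y = b // d, a // d    # x = x0 - step_x*t, y = y0 + step_y*t
    t_min = -(y0 // step_y)            # smallest t with y >= 0
    t_max = x0 // step_x               # largest t with x >= 0
    return [(x0 - step_x * t, y0 + step_y * t) for t in range(t_min, t_max + 1)]


# x = number of 17 cent stamps, y = number of 25 cent stamps, total 800 cents.
stamps = [(x, y) for x, y in nonnegative_solutions(17, 25, 800) if y < 30]
print(stamps)
```

Restricting t to the range where both coordinates are nonnegative is what turns the infinite family of integer solutions into a finite list.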

Example 2.23. In baseball a few years ago the American League had 2 divisions with 7 teams each. Say that teams play x games against each team in their own division and y games against each team in the other division. Find possible solutions for x, y assuming there are 162 games in a season. Which solution do you think was used?

3. Introduction to Congruences

Let m be a fixed positive integer, referred to as the “modulus”.

Definition 3.1. We say that two integers a, b are congruent (mod m) and write

a ≡ b (mod m)

if m|a− b. Equivalently a ≡ b (mod m) iff a = b+ km for some integer k.

Example 3.1. Clock Arithmetic. m = 12. The set of integers congruent to 3 (mod 12) is

{3 + 12k : k ∈ Z}.

Example 3.2. 23 ≡ 18 ≡ 13 ≡ 8 ≡ 3 ≡ −2 (mod 5). The values 18, 13, etc. are called residues of 23 (mod 5), and the number 3 is called the least residue of 23 (mod 5).

Definition 3.2. The least residue (lr) of a (mod m) is the smallest nonnegative integer that a is congruent to (mod m). It is a value between 0 and m − 1 (inclusive).

Lemma 3.1. Let a ∈ Z. The least residue of a (mod m) is the remainder in dividing a by m.

Proof. Use the division algorithm. □

Example 3.3. What is the least residue of 800 (mod 7)?

Theorem 3.1. Congruence is an Equivalence Relation, that is, it satisfies the following three properties for any integers a, b, c.

(i) Reflexive: a ≡ a (mod m).

(ii) Symmetric: If a ≡ b (mod m) then b ≡ a (mod m).

(iii) Transitive: If a ≡ b (mod m) and b ≡ c (mod m), then a ≡ c (mod m).

Thus, congruence (mod m) partitions Z into equivalence classes of the form

$[a]_m$ = {x ∈ Z : x ≡ a (mod m)},

called congruence classes or residue classes. That is,

$\mathbb{Z} = [0]_m \cup [1]_m \cup \cdots \cup [m-1]_m$.

Theorem 3.2. Substitution Properties of Congruences. Let a, b, c, d be integers with a ≡ b (mod m) and c ≡ d (mod m). Then

(i) a + c ≡ b + d (mod m).

(ii) ac ≡ bd (mod m).

(iii) For any positive integer n, $a^n \equiv b^n$ (mod m).


Example 3.4. Find 2004 · 123 · 77 (mod 20). What is the remainder on dividing 3298 by 7? Find 7994 (mod 8).
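The substitution properties say that each factor may be replaced by its least residue before multiplying. The sketch below illustrates this for the first product in Example 3.4; the function name `product_mod` is illustrative, and the pair 23 ≡ 3 (mod 5) used for the power check is taken from Example 3.2.

```python
def product_mod(factors: list[int], m: int) -> int:
    """Multiply the factors modulo m, reducing after every step."""
    result = 1
    for f in factors:
        result = (result * (f % m)) % m   # substitute the least residue of f
    return result


# Least residue of 2004 * 123 * 77 (mod 20), computed with small numbers only.
print(product_mod([2004, 123, 77], 20))

# Property (iii): since 23 ≡ 3 (mod 5), the powers 23^10 and 3^10 agree mod 5.
print(pow(23, 10, 5) == pow(3, 10, 5))
```

The three-argument form of the built-in `pow` performs modular exponentiation without ever forming the full power.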

Theorem 3.3. Standard Algebraic properties of congruences. For any integers a, b, c we have

(i) a + b ≡ b + a (mod m) (commutative law)

(ii) ab ≡ ba (mod m) (commutative law)

(iii) a + (b + c) ≡ (a + b) + c (mod m) (associative law)

(iv) (ab)c ≡ a(bc) (mod m) (associative law)

(v) a(b + c) ≡ ab + ac (mod m) (distributive law)

Example 3.5. What day of the week will it be 10 years from today?

Note: a is divisible by d iff a ≡ 0 (mod d).

Example 3.6. Prove that a number is divisible by 9 iff the sum of its digits (base 10) is divisible by 9. Similarly for 11.

Example 3.7. Can 2013 be expressed as a sum of two squares? Suppose that a ≡ 3 (mod 4). Can a be expressed as a sum of two squares of integers? Try 3, 7, 11, 15, etc.

4. Induction

Example 4.1. Notice the pattern for the sum of the first k odd numbers. Now prove a formula by induction.

Example 4.2. Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, 21, . . . Find a formula for $f_1 + f_2 + f_3 + f_4 + \cdots + f_k$.

To prove formulas that hold for positive integers, induction is a very powerful technique. Recall,

Axiom of Induction: Suppose that S is a subset of the natural numbers such that (i) 1 ∈ S and (ii) if n ∈ S then n + 1 ∈ S. Then S = N.

Principle of Induction: Let P(n) be a statement involving the natural number n. Suppose that (i) P(1) is true and (ii) if P(n) is true for a given n then P(n + 1) is true. Then P(n) is true for all natural numbers n.

The connection of course is just to let S be the set of all natural numbers n for which the statement P(n) is true.

Example 4.3. On the first HW you conjectured: $\sum_{k=1}^{n} k^3 = (1 + 2 + \cdots + n)^2 = [n(n+1)/2]^2$. Prove.

Note the two ways to conclude an induction proof. 1) “Therefore, by the principle of induction, the statement is true for all natural numbers.” 2) “QED” = Quod Erat Demonstrandum. Thus we have established what we wished to demonstrate.

One might object to this method by saying that we are assuming what we wish to prove. Is this a valid objection?

Example 4.4. Prove that $16^n \equiv 1 - 10n \pmod{25}$ for any n ∈ N.

Example 4.5. Show that $16^n \mid (6n)!$ for all n ∈ N.
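Before attempting an induction proof it can help to test a statement numerically. The sketch below checks Examples 4.4 and 4.5 for n = 1, . . . , 50; the function names and the range of n are illustrative choices, and passing the check is evidence only, not a proof.

```python
from math import factorial


def congruence_holds(n: int) -> bool:
    """Check 16^n ≡ 1 - 10n (mod 25) for one value of n."""
    return (16 ** n - (1 - 10 * n)) % 25 == 0


def divisibility_holds(n: int) -> bool:
    """Check that 16^n divides (6n)! for one value of n."""
    return factorial(6 * n) % 16 ** n == 0


checked = range(1, 51)
print(all(congruence_holds(n) for n in checked))
print(all(divisibility_holds(n) for n in checked))
```

A failed check would show the statement is false; a passed check only suggests the induction is worth writing out.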


Example 4.6. Prove that everyone has the same name. Let P(n) be the statement that in any set of n people, everyone has the same name. P(1) is trivially true.

Strong Form of Induction. Let P(n) be a statement involving n. Suppose (i) P(1) is true and (ii) if P(1), P(2), . . . , P(n) are all true for a given n, then so is P(n + 1). Then P(n) is true for all natural numbers n.

The induction assumption is stronger, and so this allows us to prove more.

5. Primes and Unique Factorization

There are three types of natural numbers:

1) 1, the multiplicative identity or unity element.

2) Primes. P = {2, 3, 5, 7, . . . }.

3) Composites.

Definition 5.1. A natural number n > 1 is called a prime if its only positive divisors are 1 and itself. Otherwise it is called a composite. Thus n is composite if n = ab for some natural numbers a, b with 1 < a < n, 1 < b < n.

Note 5.1. 1 is not called a prime for a couple of reasons.

Example 5.1. Everyone factor 120 using a factor tree. Compare.

Theorem 5.1. Fundamental Theorem of Arithmetic. Any natural numbern > 1 can be expressed uniquely as a product of primes.

Note 5.2. It is understood that if n is a prime then it trivially is a product of primes.

Proof. Existence. Strong form of induction. For uniqueness we need the following lemma. □

Lemma 5.1. (i) If p is a prime and p|ab, then p|a or p|b.

(ii) More generally, if $p \mid a_1 \cdots a_k$ then $p \mid a_i$ for some i.

Proof. Use Euclid’s Lemma to prove (i) and induction to prove (ii). □

Proof. Uniqueness of FTA. □

Example 5.2. Let E = {2n : n ∈ Z}. Note E is closed under + and ·, and enjoys all the usual axioms of Z (with one exception). Factor 60.

Note 5.3. Every positive integer n has a unique prime power factorization of the form $n = p_1^{e_1} \cdots p_k^{e_k}$, with the $p_i$ distinct primes and the $e_i$ positive integers.

Definition 5.2. Let p be a prime and n ∈ Z. We write $p^e \| n$ if $p^e \mid n$ but $p^{e+1} \nmid n$. e is called the multiplicity of p dividing n.

Example 5.3. $2000 = 2^4 5^3$, so $2^4 \| 2000$ and $5^3 \| 2000$.

Example 5.4. Find the multiplicity of 2 dividing $2^{15} 5^3 - 2^3 5^7$. Answer = 3.

Theorem 5.2. Let n > 1 have prime power factorization $n = p_1^{e_1} \cdots p_k^{e_k}$, and let d ∈ N. Then d|n iff $d = p_1^{f_1} \cdots p_k^{f_k}$ for some nonnegative integers $f_i \le e_i$, 1 ≤ i ≤ k.


5.1. The factoring method for finding GCDs and LCMs. Note, given any two integers, we can always express their factorizations using the same set of primes, if we allow 0’s in the exponents. This is a useful trick.

Theorem 5.3. Formula for GCD and LCM. Let a, b be positive integers with prime power factorizations $a = p_1^{e_1} \cdots p_k^{e_k}$, $b = p_1^{f_1} \cdots p_k^{f_k}$, where the $e_i$, $f_i$ are nonnegative integers. Then

i) $(a, b) = p_1^{\min(e_1, f_1)} \cdots p_k^{\min(e_k, f_k)}$.

ii) $[a, b] = p_1^{\max(e_1, f_1)} \cdots p_k^{\max(e_k, f_k)}$.

Proof. Just use FTA and the preceding theorem. □

Example 5.5. Let $a = 2^5 3^7 7^{15}$, $b = 2^2 3^6 5^4$. Find gcd and lcm.
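The min/max formula for the gcd and lcm can be tried out on small numbers. The sketch below factors by trial division, which is only practical for small inputs; the function names are illustrative, and the test values 240 and 108 are those of Example 2.8.

```python
from collections import Counter


def factorize(n: int) -> Counter:
    """Prime factorization of n >= 1 by trial division, as {prime: exponent}."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        factors[n] += 1          # whatever is left is a prime
    return factors


def gcd_lcm_by_factoring(a: int, b: int) -> tuple[int, int]:
    """Return (gcd, lcm) of positive integers using min/max of exponents."""
    fa, fb = factorize(a), factorize(b)
    g = l = 1
    for p in set(fa) | set(fb):  # a missing prime counts as exponent 0
        g *= p ** min(fa[p], fb[p])
        l *= p ** max(fa[p], fb[p])
    return g, l


g, l = gcd_lcm_by_factoring(240, 108)
print(g, l, g * l == 240 * 108)
```

A `Counter` returns 0 for a prime that does not occur, which is exactly the "allow 0's in the exponents" trick.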

Corollary 5.1. For any nonzero integers a, b we have (a, b)[a, b] = |ab|.

Proof. Just use the preceding theorem and one simple idea: for any integers e, f,

max(e, f) + min(e, f) = e + f.

An elementary proof of the corollary can be given based on just the definitions of gcd and lcm, but it is not as transparent. □

5.2. Gaussian Integers.

Definition 5.3. The Gaussian integers are the elements of the set Z[i] = {a + bi : a, b ∈ Z}. Note that Z[i] satisfies the ring axioms.

Definition 5.4. The absolute value or modulus of a complex number z = a + bi is given by $|z| = \sqrt{a^2 + b^2}$.

Recall the properties |zw| = |z||w|, |z/w| = |z|/|w|.

Definition 5.5. Let z, w be Gaussian integers, z ≠ 0. We say that z divides w, written z|w, if zu = w for some u ∈ Z[i].

Example 5.6. (1 + 2i)|5 because (1 + 2i)(1 − 2i) = 5.

Definition 5.6. A Gaussian integer z = a + bi is called a unit if it has a multiplicative inverse in Z[i].

Note 5.4. i) The units in Z[i] are {±1, ±i}. Why? Suppose z is a unit, say zw = 1. Then |z||w| = 1. Since $|z|^2$ and $|w|^2$ are positive integers, |z| ≥ 1 and |w| ≥ 1, so |z| = 1, that is, $a^2 + b^2 = 1$.

ii) Units are divisors of every Gaussian integer.

Definition 5.7. i) A nonzero Gaussian integer z is called composite if z = uv for some non-unit Gaussian integers u, v.

ii) A nonzero Gaussian integer z is called a prime if z is not composite and not a unit.

Definition 5.8. The gcd of two Gaussian integers z, w is the Gaussian integer u of largest modulus dividing both z and w. It is unique up to unit multiples. Our convention is to choose the representative in the first quadrant (including the positive real axis but not the imaginary axis).

Theorem 5.4. Division Algorithm. Let z, w ∈ Z[i], w ≠ 0. Then there exist Gaussian integers q, r such that

z = qw + r and 0 ≤ |r| < |w|.

Proof. We want z/w = q + r/w with |r/w| < 1. Define q to be the Gaussian integer closest to z/w. Its real and imaginary parts each differ from those of z/w by at most 1/2, so $|z/w - q| \le \sqrt{2}/2 < 1$. Now put r = z − qw. □

Example 5.7. a) Find the quotient and remainder for (12 + 5i) ÷ (1 + 2i). q = 4 − 4i, r = i.

b) Find gcd(12 + 5i, 1 + 2i) = (12 + 5i − q(1 + 2i), 1 + 2i) = (i, 1 + 2i) = 1, using the convention of choosing the representative in the first quadrant.
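The division algorithm in Z[i] can be carried out in exact integer arithmetic. The sketch below stores a + bi as the pair (a, b) and rounds each coordinate of z/w to a nearest integer (ties rounded up, an arbitrary choice); the type alias `Gauss` and all function names are illustrative.

```python
Gauss = tuple[int, int]  # the pair (a, b) stands for a + bi


def nearest_int_quotient(p: int, n: int) -> int:
    """Integer nearest to p / n for n > 0 (ties round up)."""
    return (2 * p + n) // (2 * n)


def gauss_divmod(z: Gauss, w: Gauss) -> tuple[Gauss, Gauss]:
    """Return (q, r) with z = q*w + r and |r| < |w|, for w != 0."""
    a, b = z
    c, d = w
    norm = c * c + d * d
    if norm == 0:
        raise ZeroDivisionError("division by the Gaussian integer 0")
    # z / w = z * conj(w) / |w|^2 = ((ac + bd) + (bc - ad)i) / norm
    q = (nearest_int_quotient(a * c + b * d, norm),
         nearest_int_quotient(b * c - a * d, norm))
    qw = (q[0] * c - q[1] * d, q[0] * d + q[1] * c)
    return q, (a - qw[0], b - qw[1])


def gauss_gcd(z: Gauss, w: Gauss) -> Gauss:
    """Euclidean algorithm in Z[i]; result normalized to the first quadrant."""
    while w != (0, 0):
        _, r = gauss_divmod(z, w)
        z, w = w, r
    a, b = z
    if (a, b) == (0, 0):
        return z                       # gcd(0, 0) is left undefined
    while not (a > 0 and b >= 0):      # multiply by the unit i until normalized
        a, b = -b, a
    return a, b


print(gauss_divmod((12, 5), (1, 2)))
print(gauss_gcd((12, 5), (1, 2)))
```

Working with the pair of integers avoids the rounding error that Python's floating-point `complex` type would introduce for large inputs.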

What are the primes in Z[i]? There are three types:

(i) Odd integer primes p with p ≡ 3 (mod 4): 3, 7, 11, 19, . . .

(ii) The factors of integer primes p with p ≡ 1 (mod 4): 5 = (1 + 2i)(1 − 2i), 13 = (2 + 3i)(2 − 3i), . . .

(iii) 1 + i, 1 − i, the factors of 2.

Division algorithm ⇒ Euclidean algorithm ⇒ GCDLC ⇒ Euclid’s Lemma ⇒ Unique Factorization.

Theorem 5.5. Every nonzero, non-unit Gaussian integer can be expressed as a product of primes, uniquely up to the order of the factors and unit multiples.

5.3. Infinitude of primes.

Theorem 5.6. There are infinitely many primes in N.

Proof. Euclid. Proof by contradiction. Suppose that there are finitely many primes, say $p_1, \ldots, p_k$. Let $N = p_1 \cdots p_k + 1$. Then, by FTA, N has a prime factor, say $p_i$. Then we have $p_i \mid N$ and $p_i \mid (p_1 \cdots p_k)$. Thus $p_i \mid (N - p_1 \cdots p_k)$, that is, $p_i \mid 1$, a contradiction. Therefore, there must be infinitely many primes. □

Theorem 5.7. There exist arbitrarily large gaps between consecutive primes.

Proof. Let n ∈ N. Consider the sequence of consecutive integers n! + 2, n! + 3, . . . , n! + n. For 2 ≤ k ≤ n we have k|n! and k|k and so k|(n! + k), and moreover it is a proper divisor. Thus n! + k is composite. Therefore we have a sequence of n − 1 consecutive composite numbers, and so if we let p be the largest prime less than n! + 2, the gap between p and the next prime must be at least n. □

Open: 1) Are there infinitely many twin primes?

2) Given any even number n, is there a pair of consecutive primes with gap n between them? Are there infinitely many pairs with gap n between them?

3) Goldbach: Given any even number n > 2, can we express n as a sum of two primes?

5.4. Sieve of Eratosthenes. An elementary algorithm for finding the set of primes on an interval by sieving out multiples of small primes.

Example 5.8. Find all primes between 200 and 220.
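Sieving an interval only requires the primes up to the square root of its right endpoint. The sketch below is one way to organize the computation; the function names `primes_up_to` and `primes_in_interval` are illustrative.

```python
from math import isqrt


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes on [2, n]."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(n) + 1):
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def primes_in_interval(lo: int, hi: int) -> list[int]:
    """Primes in [lo, hi], found by crossing out multiples of primes <= sqrt(hi)."""
    survives = [True] * (hi - lo + 1)
    for p in primes_up_to(isqrt(hi)):
        first = max(p * p, ((lo + p - 1) // p) * p)  # first multiple to cross out
        for m in range(first, hi + 1, p):
            survives[m - lo] = False
    return [lo + i for i, flag in enumerate(survives) if flag and lo + i >= 2]


print(primes_in_interval(200, 220))
```

For the interval in Example 5.8 only the primes 2, 3, 5, 7, 11, 13 are needed, since 13 is the largest prime not exceeding the square root of 220.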

Theorem 5.8. Basic primality test. If n > 1 is an integer having no prime divisor $p \le \sqrt{n}$, then n is a prime.

Proof. Proof by contradiction. Suppose that n is composite, say n = ab with 1 < a < n, 1 < b < n. We claim that either $a \le \sqrt{n}$ or $b \le \sqrt{n}$, else $ab > \sqrt{n}\sqrt{n} = n = ab$, a contradiction. Say $a \le \sqrt{n}$. Let p be any prime divisor of a. Then $p \le a \le \sqrt{n}$, and, since p|a and a|n we have p|n. But this contradicts the assumption that n has no prime divisor $p \le \sqrt{n}$. Therefore n is a prime. □


5.5. Estimating π(x). Pick a positive integer n at random from 1 to x. What is the probability that it is a prime? Let $P_q$ be the probability that n is not divisible by q: $P_q = 1 - \frac{1}{q}$. Let $p_1, \ldots, p_k$ be the primes up to x. Thus the probability that n is a prime roughly equals the probability that n is not divisible by $2, 3, \ldots, p_k$, which (assuming the events are independent) is given by

$$P = \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right) = \prod_{p < x} \left(1 - \frac{1}{p}\right).$$

Now

$$P^{-1} = \prod_{p < x} \left(1 - \frac{1}{p}\right)^{-1} = {\sum_{n}}' \frac{1}{n} \approx \sum_{n \le x} \frac{1}{n} \approx \ln x,$$

where ${\sum_n}'$ is a sum over all n such that all prime factors of n are ≤ x. Thus $P \approx \frac{1}{\ln x}$, and $\pi(x) \approx \frac{x}{\ln x}$.

Theorem 5.9. Prime Number Theorem. $\lim_{x \to \infty} \frac{\pi(x)}{x/\ln(x)} = 1$.

Conjectured by Gauss and proved by J. Hadamard and C. de la Vallée Poussin (1896).

6. Multiplicative functions

Definition 6.1. Let f : N → N be a function defined on N. Such functions are called arithmetic.

1) We say that f is multiplicative if for any two natural numbers a, b with gcd(a, b) = 1 we have f(ab) = f(a)f(b).

2) We say that f is totally multiplicative if for any two natural numbers a, b, f(ab) = f(a)f(b).

Example 6.1. $f(n) = n$, $f(n) = n^k$, $f(n) \equiv 1$ are all multiplicative; in fact, they are totally multiplicative.

Note 6.1. If f is a multiplicative function that is not identically 0, then f(1) = 1.

Example 6.2. Suppose f is a multiplicative function such that $f(p) = 2p$ for odd prime p, $f(p^j) = 3$ for odd p and j > 1, $f(2) = 4$, $f(4) = 5$, $f(8) = 6$, $f(2^k) = 0$ for k > 3. Evaluate f(13), f(100), f(80).

Theorem 6.1. a) If f is a multiplicative function and n is a positive integer with prime factorization $n = p_1^{e_1} \cdots p_k^{e_k}$ then

$$f(n) = f(p_1^{e_1}) \cdots f(p_k^{e_k}). \tag{6.1}$$

b) Conversely, if f is an arithmetic function satisfying (6.1), then f is multiplicative.

Proof. a) The proof is by induction on k. The case k = 1 is trivial. Suppose the statement is true for a given k, and now consider k + 1. Let $n = p_1^{e_1} \cdots p_{k+1}^{e_{k+1}}$. Then

$$f(n) = f\big((p_1^{e_1} \cdots p_k^{e_k})\, p_{k+1}^{e_{k+1}}\big) = f(p_1^{e_1} \cdots p_k^{e_k})\, f(p_{k+1}^{e_{k+1}}),$$

since f is multiplicative. Then, by the induction assumption, we conclude that

$$f(n) = f(p_1^{e_1}) \cdots f(p_k^{e_k})\, f(p_{k+1}^{e_{k+1}}),$$

QED.

b) Let a, b be positive relatively prime integers, with factorizations $a = p_1^{e_1} \cdots p_k^{e_k}$, $b = q_1^{f_1} \cdots q_l^{f_l}$, where the $p_i$, $q_j$ are all distinct primes. Then

$$f(ab) = f(p_1^{e_1} \cdots p_k^{e_k} q_1^{f_1} \cdots q_l^{f_l}) = \prod_{i=1}^{k} f(p_i^{e_i}) \prod_{j=1}^{l} f(q_j^{f_j}) = f(a)f(b).$$

□

Thus, multiplicative functions are determined by their values at prime powers.

Theorem 6.2. If f and g are multiplicative functions then so are fg, f/g and $f^n$ for any n ∈ N.

Proof. Immediate from the definition. For example, to show fg is multiplicative, let a, b be positive integers with (a, b) = 1; then fg(ab) = f(ab)g(ab) = f(a)f(b)g(a)g(b) = f(a)g(a)f(b)g(b) = fg(a)fg(b). □

Definition 6.2. For any positive integer n we let τ(n) (or d(n)) denote the number of positive divisors of n, and σ(n) denote the sum of the positive divisors of n.

Example 6.3. τ(1) = 1. τ(2) = 2. For prime p, τ(p) = 2, $\tau(p^k) = k + 1$. For distinct primes p, q, τ(pq) = 4. σ(1) = 1, σ(2) = 3, σ(p) = p + 1, $\sigma(p^k) = 1 + p + \cdots + p^k$.

We claim that τ(n) and σ(n) are multiplicative. To prove this we need the following.

Theorem 6.3. Correspondence Theorem for divisors. Let a, b be relatively prime positive integers. Then every divisor of ab can be uniquely expressed in the form de, where d|a and e|b. Moreover, any number of the form de where d|a and e|b is a divisor of ab.

Proof. Let $a = p_1^{e_1} \cdots p_k^{e_k}$, $b = q_1^{f_1} \cdots q_l^{f_l}$ with the $p_i$, $q_j$ all distinct primes. Then $ab = p_1^{e_1} \cdots q_l^{f_l}$. By an earlier theorem, any divisor s of ab is of the form $s = p_1^{g_1} \cdots p_k^{g_k} q_1^{h_1} \cdots q_l^{h_l}$ for some integers $g_i$, $h_j$ with $0 \le g_i \le e_i$ and $0 \le h_j \le f_j$. Let $d = p_1^{g_1} \cdots p_k^{g_k}$, $e = q_1^{h_1} \cdots q_l^{h_l}$. Then s = de, d|a and e|b. Conversely, if we start with d and e as defined above, plainly de is a divisor of ab. The expression is unique by FTA. □

Equivalent Statement: Let $D_a$, $D_b$, $D_{ab}$ denote the sets of positive divisors of a, b, ab respectively and let $D_a \times D_b$ denote the set of all ordered pairs. Then there is a 1-to-1 correspondence between $D_{ab}$ and $D_a \times D_b$ given by the mapping $D_a \times D_b \to D_{ab}$, $(d, e) \mapsto de$.

Proof. (i) Note the mapping goes into $D_{ab}$. (ii) The mapping is one-to-one: $d_1 e_1 = d_2 e_2$ implies $d_1 \mid d_2 e_2$, which implies $d_1 \mid d_2$ since $(d_1, e_2) = 1$. Similarly $d_2 \mid d_1$ and so $d_1 = d_2$, $e_1 = e_2$. (iii) The mapping is onto. Let f|ab. Put d = (f, a), e = (f, b). Then f = (f, ab) = (f, a)(f, b) = de.

One can also use prime power decomposition to prove the result: A typical divisor of a is of the form $d = \prod p_i^{g_i}$, $g_i \le e_i$. A typical divisor of b is of the form $e = \prod q_j^{h_j}$, $h_j \le f_j$. Then $de = \prod p_i^{g_i} \prod q_j^{h_j}$, which is a typical divisor of ab. □

Example 6.4. Make an array to illustrate the correspondence for a = 28, b = 15.
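One way to build the array of Example 6.4 is to list the divisors of 28 down the side, the divisors of 15 across the top, and fill in the products. The following is a small illustrative sketch; the helper `divisors` is a brute-force routine written just for this example.

```python
from math import gcd

def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

a, b = 28, 15
assert gcd(a, b) == 1  # the correspondence theorem needs (a, b) = 1

rows, cols = divisors(a), divisors(b)
# The entry in row d and column e is the product d*e.
table = [[d * e for e in cols] for d in rows]

print("d \\ e" + "".join(f"{e:>6}" for e in cols))
for d, row in zip(rows, table):
    print(f"{d:>5}" + "".join(f"{entry:>6}" for entry in row))

# Every divisor of ab = 420 should appear exactly once in the array.
products = sorted(entry for row in table for entry in row)
print(products == divisors(a * b))  # True
```

The array has 6 rows and 4 columns, matching τ(28) = 6 and τ(15) = 4.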

Theorem 6.4. τ(n) and σ(n) are multiplicative functions.


Proof. Suppose (a, b) = 1. Let $D_a = \{d_1, \dots, d_k\}$, $D_b = \{e_1, \dots, e_l\}$. Then, by the correspondence theorem, $D_{ab} = \{d_i e_j : 1 \le i \le k,\ 1 \le j \le l\}$. In particular $\tau(ab) = |D_{ab}| = kl = \tau(a)\tau(b)$. Also

$$\sigma(ab) = \sum_{i=1}^{k} \sum_{j=1}^{l} d_i e_j = (d_1 + d_2 + \cdots + d_k)(e_1 + e_2 + \cdots + e_l) = \sigma(a)\sigma(b),$$

by the distributive law. □

Now for prime powers we easily see:

$$\tau(p^e) = e + 1,$$

$$\sigma(p^e) = 1 + p + p^2 + \cdots + p^e = \frac{p^{e+1} - 1}{p - 1}.$$

Thus, since τ(n) and σ(n) are multiplicative, we obtain

Theorem 6.5. Formulas for τ(n) and σ(n). Let $n = p_1^{e_1} \cdots p_k^{e_k}$. Then

i) $\tau(n) = \prod_{i=1}^{k} (e_i + 1)$.

ii) $\sigma(n) = \prod_{i=1}^{k} \dfrac{p_i^{e_i+1} - 1}{p_i - 1}$.
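The two formulas translate directly into code once n has been factored. The sketch below assumes a simple trial-division helper `factorize` (a name chosen for this example, adequate only for small n) and uses the product formulas of Theorem 6.5.

```python
from math import prod

def factorize(n):
    """Prime factorization of n as a dict {prime: exponent}, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def tau(n):
    """Number of positive divisors: product of (e_i + 1)."""
    return prod(e + 1 for e in factorize(n).values())

def sigma(n):
    """Sum of positive divisors: product of (p_i^(e_i+1) - 1) / (p_i - 1)."""
    return prod((p ** (e + 1) - 1) // (p - 1) for p, e in factorize(n).items())

print(tau(28), sigma(28))    # 6 56
print(tau(120), sigma(120))  # 16 360
```

For n = 1 the factorization is empty and both products equal 1, in agreement with τ(1) = σ(1) = 1.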

6.1. Perfect, Deficient and Abundant Numbers.

Definition 6.3. We say that a positive integer n is
i) Deficient if σ(n) < 2n,
ii) Abundant if σ(n) > 2n, and
iii) Perfect if σ(n) = 2n.

Another way to think about it: a number is perfect if it equals the sum of its proper divisors.

Example 6.5. i) 6, 28 are perfect.
ii) Any prime power is deficient. Any product of two odd prime powers is deficient.
iii) Any multiple of a perfect number (greater than the perfect number) is abundant.

Example 6.6. The following numbers were known to be perfect to the ancients. Let’s look at their factorizations to see if a pattern can be discerned.

$6 = 2 \cdot 3$,

$28 = 4 \cdot 7 = 2^2 \cdot 7$,

$496 = 16 \cdot 31 = 2^4 \cdot 31$,

$8128 = 64 \cdot 127 = 2^6 \cdot 127$.

What about $8 \cdot 15$? This is abundant.

Conjecture: n is perfect if $n = 2^k(2^{k+1} - 1)$ and $2^{k+1} - 1$ is a prime.

Proof. We have $\sigma(n) = \sigma(2^k(2^{k+1} - 1)) = \sigma(2^k)\,\sigma(2^{k+1} - 1)$. Now $\sigma(2^k) = 2^{k+1} - 1$ and, since $2^{k+1} - 1$ is a prime, $\sigma(2^{k+1} - 1) = 2^{k+1}$. Thus $\sigma(n) = (2^{k+1} - 1)\,2^{k+1} = 2n$. □
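The pattern can be checked numerically for small k. The sketch below is illustrative only: `is_prime` and `sigma` are brute-force helpers written for this example, and the range of k is an arbitrary choice.

```python
from math import isqrt

def is_prime(n):
    """Trial division up to sqrt(n)."""
    if n < 2:
        return False
    return all(n % p for p in range(2, isqrt(n) + 1))

def sigma(n):
    """Sum of the positive divisors of n, pairing d with n // d."""
    total = 0
    for d in range(1, isqrt(n) + 1):
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
    return total

perfect = []
for k in range(1, 13):
    m = 2 ** (k + 1) - 1
    if is_prime(m):
        n = 2 ** k * m
        assert sigma(n) == 2 * n  # the conjecture, checked directly
        perfect.append(n)

print(perfect)  # [6, 28, 496, 8128, 33550336]
```

The case $8 \cdot 15 = 120$ is skipped by the loop because 15 is not prime, and indeed σ(120) = 360 > 240.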


Questions: Are these the only perfect numbers? Are there any odd perfect numbers? When is $2^{k+1} - 1$ a prime?

These are all open questions. It is known that if n is an odd perfect number, then $n > 10^{300}$, n must have a prime divisor > 100000 and it must have at least 11 distinct prime factors. However, for even perfect numbers we have

Theorem 6.6. Euler’s characterization of even perfect numbers. An even number is perfect if and only if it is of the form $2^k(2^{k+1} - 1)$ with $2^{k+1} - 1$ a prime.

Proof. Already shown one way. Suppose now that n is an even perfect number, say $n = 2^k a$ with a odd. Then σ(n) = 2n iff $(2^{k+1} - 1)\sigma(a) = 2^{k+1} a$. Let σ(a) = a + b. The above holds iff $a = b(2^{k+1} - 1)$. In particular b|a and b < a. If b ≠ 1 then a, b, 1 are distinct divisors of a and so σ(a) ≥ a + b + 1, a contradiction. Thus b = 1 and $a = 2^{k+1} - 1$ and σ(a) = a + 1. It follows that a must be an odd prime of the desired form. □

6.2. Mersenne Primes.

Definition 6.4. Any prime of the form $M_k = 2^k - 1$ is called a Mersenne prime. (In general, numbers of the form $M_k$ are called Mersenne numbers.)

Theorem 6.7. (i) If d|k then $2^d - 1 \mid 2^k - 1$.

(ii) Thus if $2^k - 1$ is a prime then k must be a prime.

Proof. (i) Immediate from the factoring formula $X^n - 1 = (X - 1)(X^{n-1} + \cdots + 1)$. Say k = dn for some n ∈ N, and put $X = 2^d$. (ii) Immediate from part (i). □

Example 6.7. k = 2, 3, 5, 7 yield Mersenne primes 3, 7, 31, 127. However, k = 11 gives $2047 = 23 \cdot 89$, a composite. Thus we do not always get a Mersenne prime when k is prime.

Example 6.8. Factors of Mersenne numbers with composite k. If k = 9, then $2^3 - 1 = 7$ is a factor and we see $M_9 = 511 = 7 \cdot 73$. If k = 10, then $2^2 - 1 = 3$ and $2^5 - 1 = 31$ are factors and we see $M_{10} = 1023 = 3 \cdot 11 \cdot 31$.
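The examples above are easy to reproduce by machine. The following sketch (with a brute-force `is_prime` helper assumed for illustration) lists which small Mersenne numbers are prime and checks part (i) of Theorem 6.7 for small k.

```python
from math import isqrt

def is_prime(n):
    """Trial division up to sqrt(n); fine for the small values used here."""
    if n < 2:
        return False
    return all(n % p for p in range(2, isqrt(n) + 1))

def mersenne(k):
    return 2 ** k - 1

# Exponents k < 20 for which M_k is prime.
prime_exponents = [k for k in range(2, 20) if is_prime(mersenne(k))]
print(prime_exponents)  # [2, 3, 5, 7, 13, 17, 19]

# Theorem 6.7 (i): if d | k then 2^d - 1 divides 2^k - 1.
for k in range(1, 31):
    for d in range(1, k + 1):
        if k % d == 0:
            assert mersenne(k) % mersenne(d) == 0

print(mersenne(11), 23 * 89)          # 2047 2047
print(mersenne(10), 3 * 11 * 31)      # 1023 1023
```

Every exponent in the printed list is prime, as part (ii) requires, while the prime exponent 11 is missing from it.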

6.3. GIMPS, Great Internet Mersenne Prime Search. It was popular for many years to test the speed of new computers by seeing if they can find the largest known prime number using standard algorithms. All of the largest known primes are Mersenne primes. In 1876 Lucas had the record k = 127. He discovered a clever algorithm in order to deal with numbers of this size by hand. In 1985 a Cray X-MP obtained k = 216,091, a 65,000 digit number, in 3 hours. In 2008 the 45th Mersenne prime was discovered at UCLA, 2^{43,112,609} − 1, a number with 12,978,189 digits, earning the finders $100,000. There are currently 47 known Mersenne primes, and this is still the largest. Check GIMPS on the internet if you wish to participate in this search.

Open: Are there infinitely many Mersenne primes?

6.4. Fermat primes.

Definition 6.5. Any prime of the form $F_k = 2^k + 1$ is called a Fermat prime. (In general, numbers of the form $F_k$ are called Fermat numbers.)

Make a table with $2^k + 1$, k = 1 to 8. Discover that it is prime iff k is a power of 2. Fermat conjectured that every number of the form $2^{2^k} + 1$ is a prime. This is true for k = 0, 1, 2, 3, 4 (3, 5, 17, 257, 65537), but false for $2^{2^5} + 1$. Euler was able to factor the latter. Here’s one way.

$$2^{32} + 1 = (2^9 + 2^7 + 1)(2^{23} - 2^{21} + 2^{19} - 2^{17} + 2^{14} - 2^9 - 2^7 + 1),$$

$$4{,}294{,}967{,}297 = 641 \cdot 6700417.$$

How might one discover that 641 is a divisor of $2^{2^5} + 1$ without a lot of trial and error? Using properties of congruences (Fermat’s Little Theorem, orders of elements (mod p)) one can prove that if p is an odd prime divisor of a Fermat number $2^{2^k} + 1$ with k ≥ 2, then $p \equiv 1 \pmod{2^{k+2}}$. Thus in the case k = 5 we must have $p \equiv 1 \pmod{128}$, and so one very quickly sees that $641 = 1 + 5 \cdot 128$ is a good candidate to test.
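The search for a factor of $2^{32} + 1$ among the numbers 1 + 128t is short enough to carry out directly. The sketch below is only an illustration of that search; the variable names and the brute-force `is_prime` helper are assumptions of this example.

```python
from itertools import count
from math import isqrt

def is_prime(n):
    """Trial division up to sqrt(n)."""
    if n < 2:
        return False
    return all(n % p for p in range(2, isqrt(n) + 1))

F5 = 2 ** 32 + 1

tested = []
for t in count(1):
    p = 1 + 128 * t          # candidates p with p = 1 (mod 128)
    if not is_prime(p):
        continue
    tested.append(p)
    if F5 % p == 0:
        first_factor = p
        break

print(tested)                         # [257, 641]
print(first_factor, F5 // first_factor)  # 641 6700417
```

Only two prime candidates need to be tried before 641 turns up.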

Theorem 6.8. (i) If k = ab with a odd and b arbitrary, then $2^b + 1 \mid 2^k + 1$.

(ii) Thus if $2^k + 1$ is a Fermat prime, then k is a power of 2.

Proof. This follows from the factorization formula $y^a - 1 = (y - 1)(y^{a-1} + \cdots + 1)$. Set $y = -2^b$. Since a is odd, $y^a - 1 = -(2^k + 1)$ and $y - 1 = -(2^b + 1)$. □

Open problem: Are there any other Fermat primes besides the 5 listed above? In particular it is unknown whether there are infinitely many. Gauss made a connection between Fermat primes and construction of regular n-gons.

Theorem 6.9. (i) A regular n-gon with n a prime can be constructed with straightedge and compass if and only if n is a Fermat prime.

(ii) More generally, a regular n-gon can be constructed iff n is of the form $n = 2^k p_1 p_2 \cdots p_l$ for some k ≥ 0 and distinct Fermat primes $p_1, \dots, p_l$.

6.5. Properties of multiplicative functions. Recall the definition of a multiplicative function.

Let f(n) be a given multiplicative function. Define

$$F(n) := \sum_{d \mid n} f(d),$$

the sum being over all positive divisors of n. We claim that F is multiplicative.

Example 6.9. Let f ≡ 1. Then F(n) = τ(n). Let f(n) = n; then F(n) = σ(n). Let $f(n) = n^2$; then $F(n) = \sigma_2(n)$, the sum of the squares of the divisors of n, etc.

Theorem 6.10. Suppose that f is a multiplicative function. Then so is the function F defined by $F(n) = \sum_{d \mid n} f(d)$.

Proof. Same as for σ. Suppose that (a, b) = 1. By the correspondence theorem, any divisor d of ab can be expressed uniquely in the manner d = ce, where c|a and e|b. Thus we have

$$F(ab) = \sum_{d \mid ab} f(d) = \sum_{c \mid a} \sum_{e \mid b} f(ce) = \sum_{c \mid a} \sum_{e \mid b} f(c)f(e) = \sum_{c \mid a} f(c) \sum_{e \mid b} f(e) = F(a)F(b).$$

□


Example 6.10. Let $F(n) = \sum_{d \mid n} \tau(d)$. Find a formula for F(n) and evaluate F(8000). First we evaluate $F(p^e)$ for any prime power $p^e$.

$$F(p^e) = \tau(1) + \tau(p) + \tau(p^2) + \cdots + \tau(p^e) = 1 + 2 + 3 + \cdots + e + (e + 1) = \frac{(e + 1)(e + 2)}{2}.$$

Next we observe that since τ is multiplicative, so is F by the preceding theorem. Thus, if $n = p_1^{e_1} \cdots p_k^{e_k}$, then

$$F(n) = \prod_{i=1}^{k} F(p_i^{e_i}) = \prod_{i=1}^{k} \frac{(e_i + 1)(e_i + 2)}{2}.$$

If $n = 8000 = 2^3 \cdot 1000 = 2^6 5^3$, then $F(n) = \frac{(6+1)(6+2)}{2} \cdot \frac{(3+1)(3+2)}{2} = 28 \cdot 10 = 280$.
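The value F(8000) = 280 can be confirmed by summing τ(d) over the divisors of 8000 directly. This is a brute-force sketch; the names `F_brute` and `F_formula` are chosen for this example.

```python
from math import prod

def divisors(n):
    """All positive divisors of n, by brute force."""
    return [d for d in range(1, n + 1) if n % d == 0]

def tau(n):
    return len(divisors(n))

def F_brute(n):
    """F(n) = sum of tau(d) over all divisors d of n."""
    return sum(tau(d) for d in divisors(n))

def F_formula(exponents):
    """Product of (e+1)(e+2)/2 over the exponents e in the factorization of n."""
    return prod((e + 1) * (e + 2) // 2 for e in exponents)

# 8000 = 2^6 * 5^3
print(F_brute(8000), F_formula([6, 3]))  # 280 280
```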

Example 6.11. Let $\sigma_{-1}(n) = \sum_{d \mid n} \frac{1}{d}$. Show that $\sigma_{-1}(n) = \frac{\sigma(n)}{n}$.

6.6. The Euler Phi Function.

Definition 6.6. For any positive integer n we define φ(n) to be the number of positive integers less than or equal to n that are relatively prime to n.

Example 6.12. Find φ(10).

Example 6.13. φ(p) = p − 1 for prime p. $\phi(p^k) = p^k - p^{k-1}$ for prime power $p^k$.

Suppose n is a positive integer with factorization $p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$. How do we find φ(n)? For any divisor d of n, let

$$S_d = \{j : 1 \le j \le n,\ d \mid j\}.$$

Note $|S_d| = n/d$. To find the number of values from 1 to n relatively prime to n, we need to count how many points are not in $S_{p_1}$, $S_{p_2}$, ..., or $S_{p_k}$. By the inclusion-exclusion principle,

$$\phi(n) = n - |S_{p_1}| - |S_{p_2}| - \cdots - |S_{p_k}| + |S_{p_1 p_2}| + |S_{p_1 p_3}| + \cdots + (-1)^k |S_{p_1 \cdots p_k}|$$

$$= n\left(1 - \frac{1}{p_1} - \cdots - \frac{1}{p_k} + \cdots + (-1)^k \frac{1}{p_1 \cdots p_k}\right) = n \prod_{i=1}^{k}\left(1 - \frac{1}{p_i}\right).$$

Thus we have established the following theorem.

Theorem 6.11. Let n have prime power factorization $n = p_1^{e_1} \cdots p_k^{e_k}$. Then $\phi(n) = \prod_{i=1}^{k} \phi(p_i^{e_i}) = \prod_{i=1}^{k} p_i^{e_i - 1}(p_i - 1)$.

Corollary 6.1. The Euler phi-function is multiplicative.

Proof. Follows from Theorem 6.1 (b). �
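As a sanity check on the formula, one can compare it with a direct count from Definition 6.6. The sketch below uses two hypothetical helper names, `phi_by_definition` and `phi_by_formula`; the second applies the factor (1 − 1/p) once for each prime p dividing n, using integer arithmetic only.

```python
from math import gcd

def phi_by_definition(n):
    """Count the integers 1 <= j <= n with gcd(j, n) = 1."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)

def phi_by_formula(n):
    """n times the product of (1 - 1/p) over the primes p dividing n."""
    result, p = n, 2
    while p * p <= n:
        if n % p == 0:
            result -= result // p      # multiply result by (1 - 1/p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:                          # one prime factor larger than sqrt remains
        result -= result // n
    return result

print(phi_by_definition(10), phi_by_formula(10))  # 4 4
print(all(phi_by_definition(n) == phi_by_formula(n) for n in range(1, 501)))  # True
```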

We will see two more ways of showing that the Euler phi-function is multiplicative, one making use of the identity $\sum_{d \mid n} \phi(d) = n$, and the other involving the Chinese Remainder Theorem.

Here is another interesting property of the Euler phi-function.

Theorem 6.12. For any natural number n we have $\sum_{d \mid n} \phi(d) = n$.


Proof. Let $F(n) = \sum_{d \mid n} \phi(d)$. First note that since φ is multiplicative, so is F. Next, for any prime power $p^e$ we have

$$F(p^e) = \phi(1) + \phi(p) + \phi(p^2) + \cdots + \phi(p^e) = 1 + (p - 1) + (p^2 - p) + \cdots + (p^e - p^{e-1}) = p^e,$$

since the last sum is telescoping. Thus if n is any integer, with prime factorization $n = p_1^{e_1} \cdots p_k^{e_k}$, then we have

$$F(n) = F(p_1^{e_1}) \cdots F(p_k^{e_k}) = p_1^{e_1} \cdots p_k^{e_k} = n.$$

□

A direct proof of this theorem that does not appeal to the multiplicative property of φ can be given as follows. A complex number w is called a primitive n-th root of unity if $w^n = 1$ but $w^d \ne 1$ for all positive d < n. There are φ(n) primitive n-th roots of unity. Now every n-th root of unity is a primitive d-th root of unity for some (unique) d|n. Thus since there are n n-th roots of unity, and φ(d) primitive d-th roots of unity for each d|n, we see that $n = \sum_{d \mid n} \phi(d)$.

6.7. The Mobius Function.

Definition 6.7. The Mobius function µ is defined by

$$\mu(n) = \begin{cases} 1, & \text{if } n = 1; \\ (-1)^k, & \text{if } n = p_1 p_2 \cdots p_k, \text{ a product of distinct primes}; \\ 0, & \text{if } p^2 \mid n \text{ for some prime } p. \end{cases}$$

Make a table illustrating the random behavior of µ(n). Is it statistically random in some sense? If so then $\sum_{n \le x} \mu(n) \ll \sqrt{x}$, but this is an open problem.

Theorem 6.13. µ(n) is a multiplicative function.

Proof. Let a, b be positive integers with (a, b) = 1. If a or b equals 1, say wlog a = 1, then µ(ab) = µ(b) while µ(a)µ(b) = µ(1)µ(b) = 1 · µ(b) = µ(b), and so µ(ab) = µ(a)µ(b). Next, suppose that either a or b is divisible by $p^2$ for some prime p, say wlog a. Then so is ab, and so µ(ab) = 0, while µ(a)µ(b) = 0 · µ(b) = 0, so again µ(ab) = µ(a)µ(b). Finally, suppose that a, b are products of distinct primes, say $a = p_1 \cdots p_k$, $b = q_1 \cdots q_l$. The $p_i$, $q_j$ must all be distinct since (a, b) = 1. Thus ab is a product of k + l distinct primes, and we have $\mu(ab) = (-1)^{k+l} = (-1)^k(-1)^l = \mu(a)\mu(b)$. □

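Theorem 6.14. For any natural number n,

$$\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases}$$

Proof. Let $F(n) = \sum_{d \mid n} \mu(d)$. Since µ is multiplicative, so is F by Theorem 6.10. For any prime power $p^e$ with e ≥ 1 we have

$$F(p^e) = \mu(1) + \mu(p) + \mu(p^2) + \cdots + \mu(p^e) = 1 - 1 + 0 + \cdots + 0 = 0.$$

Thus if n > 1 with prime factorization $n = p_1^{e_1} \cdots p_k^{e_k}$ (k ≥ 1), then $F(n) = F(p_1^{e_1}) \cdots F(p_k^{e_k}) = 0 \cdots 0 = 0$. Trivially, F(1) = 1. □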

Example 6.14. Calculate sum for n = 10.
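For n = 10 the sum is µ(1) + µ(2) + µ(5) + µ(10) = 1 − 1 − 1 + 1 = 0. A short sketch of the same computation in Python follows; the function `mobius` is a trial-division implementation written for this example, not an efficient one.

```python
def mobius(n):
    """Mobius function by trial division."""
    if n == 1:
        return 1
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:     # p^2 divided the original n
                return 0
            result = -result   # one more distinct prime factor
        p += 1
    if n > 1:                  # a last prime factor remains
        result = -result
    return result

def mobius_divisor_sum(n):
    """Sum of mu(d) over all positive divisors d of n."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)

print([mobius(n) for n in range(1, 11)])  # [1, -1, -1, 0, -1, 1, -1, 0, 0, 1]
print(mobius_divisor_sum(10))             # 0
```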


Definition 6.8. The indicator function (or characteristic function) for a singleton point set {n} is defined by

$$\delta_n(x) = \begin{cases} 1 & \text{if } x = n, \\ 0 & \text{if } x \ne n. \end{cases}$$

Thus the previous theorem can be restated: $\sum_{d \mid n} \mu(d) = \delta_1(n)$.

Corollary 6.2. Let n ∈ N. For any divisor d of n we have $\delta_n(d) = \sum_{e \mid \frac{n}{d}} \mu(e)$.

Proof. Since d|n we have $\delta_n(d) = \delta_1(n/d) = \sum_{e \mid \frac{n}{d}} \mu(e)$. □

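Suppose that f is an arithmetic function and we define F by $F(n) = \sum_{d \mid n} f(d)$. How can we invert this equation, and solve for f(n) in terms of F(n)? Letting $\delta_n$ be the indicator function for the point set {n}, we have

$$f(n) = \sum_{d \mid n} f(d)\,\delta_n(d) = \sum_{d \mid n} f(d) \sum_{e \mid \frac{n}{d}} \mu(e) = \sum_{e \mid n} \mu(e) \sum_{d \mid \frac{n}{e}} f(d) = \sum_{e \mid n} \mu(e)\, F\!\left(\frac{n}{e}\right).$$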

Theorem 6.15. Mobius inversion formula. Let f be any arithmetic function and $F(n) = \sum_{d \mid n} f(d)$. Then for any n ∈ N we have

$$f(n) = \sum_{d \mid n} F(d)\,\mu(n/d) = \sum_{d \mid n} F\!\left(\frac{n}{d}\right)\mu(d).$$

Think of it as a sum over the divisor pairs d, $\frac{n}{d}$ of n.

Proof. A proof was given above that actually derives the formula. Let’s give a second proof (that assumes such a formula has already been conjectured). By the definition of F, we have

$$\sum_{d \mid n} F\!\left(\frac{n}{d}\right)\mu(d) = \sum_{d \mid n} \mu(d) \sum_{e \mid \frac{n}{d}} f(e) = \sum_{e \mid n} f(e) \sum_{d \mid \frac{n}{e}} \mu(d) = \sum_{e \mid n} f(e)\,\delta_1(n/e) = f(n).$$

□

Example 6.15. $\sigma(n) = \sum_{d \mid n} d$ and so $n = \sum_{d \mid n} \sigma(d)\,\mu(n/d)$.
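The identity in Example 6.15 can be checked numerically. The sketch below is illustrative only: `divisors`, `mobius`, `sigma` and `mobius_invert` are names made up for this example, and `mobius_invert` simply evaluates the sum $\sum_{d \mid n} F(d)\mu(n/d)$ for a given function F.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """Mobius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def sigma(n):
    return sum(divisors(n))

def mobius_invert(F, n):
    """Evaluate sum over d | n of F(d) * mu(n / d)."""
    return sum(F(d) * mobius(n // d) for d in divisors(n))

# sigma is the divisor sum of f(n) = n, so inversion should return n itself.
print([mobius_invert(sigma, n) for n in range(1, 13)])
print(all(mobius_invert(sigma, n) == n for n in range(1, 201)))  # True
```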

Theorem 6.16. Let f, g be multiplicative functions and define $F(n) = \sum_{d \mid n} f(d)\,g(n/d)$. Then F is multiplicative.


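Proof. Follows from the correspondence theorem. Let (a, b) = 1. Then

$$F(ab) = \sum_{l \mid ab} f(l)\, g\!\left(\frac{ab}{l}\right) = \sum_{d \mid a} \sum_{e \mid b} f(de)\, g\!\left(\frac{a}{d}\cdot\frac{b}{e}\right)$$

$$= \sum_{d \mid a} \sum_{e \mid b} f(d) f(e)\, g\!\left(\frac{a}{d}\right) g\!\left(\frac{b}{e}\right) = \sum_{d \mid a} f(d)\, g\!\left(\frac{a}{d}\right) \sum_{e \mid b} f(e)\, g\!\left(\frac{b}{e}\right) = F(a)F(b).$$

□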

Corollary 6.3. Let f be any arithmetic function and F be defined by $F(n) = \sum_{d \mid n} f(d)$. Then F is multiplicative if and only if f is multiplicative.

Proof. One direction is just Theorem 6.10. For the converse, suppose that F is multiplicative. Then by the Mobius inversion formula we have $f(n) = \sum_{d \mid n} \mu(d)\,F(n/d)$, which is multiplicative by the preceding theorem. □

Example 6.16. Suppose we start with the formula $n = \sum_{d \mid n} \phi(d)$ in Theorem 6.12. By the preceding corollary we deduce that φ is multiplicative. In fact, by the Mobius inversion formula we obtain

$$\phi(n) = \sum_{d \mid n} \mu(d)\,\frac{n}{d} = n \sum_{d \mid n} \frac{\mu(d)}{d}.$$

7. More on Congruences

Definition 7.1. A complete residue system (mod m) is a set of m integers $\{x_1, \dots, x_m\}$ that are distinct (mod m). Thus, every integer is congruent to exactly one of the $x_i$ (mod m).

Example 7.1. For m = 5, the following are all examples of complete residue systems (mod 5): {0, 1, 2, 3, 4}, {5, 6, 7, 8, 9}, {5, 1, 22, −27, 94}.

7.1. Counting Solutions of Congruences. Let f(x) be a polynomial with integer coefficients and m be a positive integer. We wish to solve the congruence

(7.1) f(x) ≡ 0 (mod m).

Example 7.2. Solve $x^2 \equiv 1 \pmod 8$. By testing values from 0 to 7, we see that the solution set is all x with x ≡ 1, 3, 5 or 7 (mod 8). Thus {1, 3, 5, 7} is called a complete set of solutions of the congruence $x^2 \equiv 1 \pmod 8$, and this congruence has 4 distinct solutions (mod 8).

Definition 7.2. (i) A set of integers $\{x_1, \dots, x_k\}$ is called a complete set of solutions of the congruence (7.1) if the values $x_1, \dots, x_k$ are distinct residues (mod m), and every solution of (7.1) is congruent to one of these values (mod m).

(ii) A complete set of solutions is called the “least” complete set of solutions if the $x_i$ are least residues (that is, $0 \le x_i \le m - 1$).

(iii) If $\{x_1, \dots, x_k\}$ is a complete set of solutions of (7.1), then we say (7.1) has k distinct solutions (mod m).


7.2. Linear Congruences. Consider the linear congruence

(7.2) ax ≡ b (mod m),

where a, b ∈ Z. Note, this is equivalent to solving the linear equation ax = b + my, that is ax − my = b, and we did this earlier. Putting d = (a, m), we saw that this was solvable iff d|b, in which case the general solution was given by $x = x_0 + \frac{m}{d}t$, $y = y_0 + \frac{a}{d}t$, with t any integer and $(x_0, y_0)$ any particular solution.

Theorem 7.1. Let d = (a, m). The linear congruence (7.2) has a solution if and only if d|b, in which case a complete set of solutions is given by

$$x = x_0 + t\,\frac{m}{d}, \quad 0 \le t \le d - 1,$$

where $x_0$ is any particular solution of (7.2). Thus, if a solution exists, then there are d distinct solutions (mod m).

Note that we stop at t = d − 1 in order to avoid repetition of solutions.

Example 7.3. Solve 7x ≡ 2 (mod 11). d = (7, 11) = 1 and 1|2 so a unique solution exists. To solve, we solve the linear equation 7x − 11y = 2 using the array method. x ≡ 5 (mod 11).

A useful trick for solving a linear congruence: If you notice that a, b, m all have a common factor d, then it can be divided out. That is,

$$ax \equiv b \pmod{m} \quad \text{iff} \quad \frac{a}{d}x \equiv \frac{b}{d} \pmod{\frac{m}{d}}.$$

Example 7.4. Solve 3x ≡ 6 (mod 18). Dividing out the common factor 3 gives x ≡ 2 (mod 6). Thus x ≡ 2, 8, 14 (mod 18).
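Theorem 7.1 can be turned into a short routine. The sketch below is one possible implementation, not the array method used in these notes: it uses the extended Euclidean algorithm (the helper `extended_gcd` is written for this example) to find a particular solution, and then lists the d solutions $x_0 + t\,m/d$.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

def solve_linear_congruence(a, b, m):
    """Least complete set of solutions of a*x = b (mod m); empty list if none."""
    d, x, _ = extended_gcd(a % m, m)
    if b % d != 0:
        return []                      # solvable iff d | b
    step = m // d
    x0 = (x * (b // d)) % step         # a particular solution, reduced mod m/d
    return [x0 + t * step for t in range(d)]

print(solve_linear_congruence(7, 2, 11))   # [5]
print(solve_linear_congruence(3, 6, 18))   # [2, 8, 14]
print(solve_linear_congruence(2, 1, 10))   # []
```

The last call returns no solutions because (2, 10) = 2 does not divide 1.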

7.3. Multiplicative inverses.

Definition 7.3. Let a ∈ Z. An integer x is called a multiplicative inverse of a (mod m) if ax ≡ 1 (mod m). In this case we write $x \equiv a^{-1} \pmod m$.

Example 7.5. Find a mult inverse of 3 (mod 10). Find a mult inverse of 2 (mod 10). Can’t do the latter because (2, 10) > 1.

Theorem 7.2. An integer a has a multiplicative inverse (mod m) if and only if (a, m) = 1. In this case, the mult inverse is unique (mod m).

Example 7.6. Use the mult inverse of 3 (mod 10) to solve the congruence 3x ≡ 7 (mod 10).

Note 7.1. By definition, there are φ(n) integers between 1 and n that are relatively prime to n. These are the values that have multiplicative inverses.

Definition 7.4. A reduced residue system (mod m) is a set of integers $\{a_1, \dots, a_{\phi(m)}\}$ that are distinct (mod m), and relatively prime to m.

Note: The values in a reduced residue system (mod m) are all invertible (mod m).

Example 7.7. m = 10. {1, 3, 7, 9} is a reduced residue system (mod 10). So is {11, 33, 17, 9}.

Theorem 7.3. Cancellation Law. If (a, m) = 1 and ax ≡ ay (mod m), then x ≡ y (mod m).


Proof. Since (a, m) = 1, a has a mult inverse (mod m) and so we can multiply both sides of the congruence ax ≡ ay (mod m) by $a^{-1}$ (mod m), to get x ≡ y (mod m). □

Theorem 7.4. Wilson’s Theorem. For any prime p, (p − 1)! ≡ −1 (mod p).

Proof. The statement is trivial for p = 2, so assume that p is odd. Note that the only solutions of the congruence $x^2 \equiv 1 \pmod p$ are x ≡ ±1 (mod p). Thus if $x \not\equiv \pm 1 \pmod p$ then $x^{-1} \not\equiv x \pmod p$, and so we can form pairs $(x, x^{-1})$, and obtain

$$\{1, 2, \dots, p - 1\} \equiv \{1, -1, x_1, x_1^{-1}, x_2, x_2^{-1}, \dots, x_k, x_k^{-1}\} \pmod p.$$

Thus taking the product of all of the elements in each set we see,

$$(p - 1)! \equiv 1(-1)\,x_1 x_1^{-1} \cdots x_k x_k^{-1} \equiv -1 \pmod p.$$

□

7.4. Chinese Remainder Theorem.

Example 7.8. Find a whole number n such that the remainder is 3 when n is divided by 7, and 5 when n is divided by 11. This is equivalent to the system x ≡ 3 (mod 7), x ≡ 5 (mod 11). Set x = 3 + 7t; then 3 + 7t ≡ 5 (mod 11), so t ≡ 5 (mod 11). Thus x ≡ 38 (mod 77).

Theorem 7.5. Chinese Remainder Theorem. Let a, b be positive integers with (a, b) = 1. Let h, k be any integers. Then the system

x ≡ h (mod a)

x ≡ k (mod b)

has a unique solution (mod ab).

Proof. Set x = h + at and substitute to get at ≡ k − h (mod b). By Theorem 7.1, since (a, b) = 1, this congruence has solution $t = t_0 + bs$, s ∈ Z. Substituting gives $x \equiv h + at_0 \pmod{ab}$ as the unique solution. □

Example 7.9. Historical example used by the ancient Chinese. Suppose we wish to determine the exact number of people in a large crowd of about 500 people. Have the crowd break into groups of 7, 8 and 9 people, with 2, 4, 6 people left over in the three cases. Thus we must solve x ≡ 2 (mod 7), x ≡ 4 (mod 8), x ≡ 6 (mod 9). To solve, start with the biggest modulus, that is, set x = 6 + 9t, t ∈ Z. Substitute to get t ≡ 6 (mod 8) and consequently x ≡ 60 (mod 72), say x = 60 + 72s. Substitute again to get s ≡ 6 (mod 7) and x ≡ 492 (mod 504). Thus there are 492 people.

Definition 7.5. We say a set of integers $\{a_1, a_2, \dots, a_k\}$ are pairwise relatively prime if $(a_i, a_j) = 1$ for all i, j with 1 ≤ i < j ≤ k.

Example 7.10. The integers 6, 11, 15 are not pairwise relatively prime, even though gcd(6, 11, 15) = 1.

Theorem 7.6. CRT with more than 2 congruences. Let $m_1, \dots, m_n$ be pairwise relatively prime positive integers, and $h_1, \dots, h_n$ be any integers. Then the system

$$x \equiv h_i \pmod{m_i}, \quad 1 \le i \le n,$$

has a unique solution (mod $m_1 m_2 \cdots m_n$).
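The substitution method used in Examples 7.8 and 7.9 extends to any number of congruences by absorbing one modulus at a time. Here is a minimal sketch of that idea; the function name `crt` is an assumption of this example, and the three-argument `pow` with exponent −1 (modular inverse) requires Python 3.8 or later.

```python
from math import gcd

def crt(residues, moduli):
    """Solve x = h_i (mod m_i) for pairwise relatively prime m_i.

    Returns (x, M) with 0 <= x < M = m_1 * ... * m_n.
    """
    x, M = 0, 1
    for h, m in zip(residues, moduli):
        if gcd(M, m) != 1:
            raise ValueError("moduli must be pairwise relatively prime")
        # Write the new solution as x + M*t and solve M*t = h - x (mod m) for t.
        t = ((h - x) * pow(M, -1, m)) % m
        x += M * t
        M *= m
    return x % M, M

print(crt([3, 5], [7, 11]))        # (38, 77)
print(crt([2, 4, 6], [7, 8, 9]))   # (492, 504)
```

The order in which the congruences are absorbed does not affect the final answer, only the intermediate values.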


7.5. Fermat’s Little Theorem and Euler’s Theorem.

Theorem 7.7. Fermat’s Little Theorem (FLT). Let p be a prime and a be an integer with $p \nmid a$. Then $a^{p-1} \equiv 1 \pmod p$.

Proof. Special case of Euler’s Theorem, coming next. Just set m = p and note φ(p) = p − 1. □

An equivalent version of Fermat’s Little Theorem is the following: For any prime p and integer a we have $a^p \equiv a \pmod p$. Note, if p|a then this statement is trivially true (both sides are 0), while if $p \nmid a$ then we can divide both sides by a to obtain the original statement.

Theorem 7.8. Euler’s Theorem. Let m ∈ N and a ∈ Z with (a, m) = 1. Then $a^{\phi(m)} \equiv 1 \pmod m$.

Note: Euler’s theorem fails if (a, m) > 1.

Example 7.11. Find the value of $17^{1802} \pmod{27}$. Note φ(27) = 18 and (17, 27) = 1, so $17^{18} \equiv 1 \pmod{27}$. Thus $17^{1802} = (17^{18})^{100}\,17^2 \equiv 289 \equiv 19 \pmod{27}$.

Lemma 7.1. Permutation Lemma. Let m ∈ N and a be an integer with (a, m) = 1 and k = φ(m). Let $\{x_1, x_2, \dots, x_k\}$ be a reduced residue system (mod m). Then the set $\{ax_1, ax_2, \dots, ax_k\}$ is also a reduced residue system (mod m).

Example 7.12. m = 10, {1, 3, 7, 9} is a reduced residue system. Let a = 3, 7, 9 to obtain new reduced residue systems, and note that they are just permutations of the original.

Proof of Permutation Lemma. By the cancellation law, the values $ax_1, \dots, ax_k$ are all distinct (mod m), and each is relatively prime to m since a and $x_i$ are. Since there are k = φ(m) distinct values, this must be a reduced residue system. □

Proof of Euler’s Theorem. Let a ∈ Z with (a, m) = 1 and $\{x_1, \dots, x_k\}$ be a reduced residue system (mod m), where k = φ(m). By the permutation lemma, $\{[ax_1]_m, \dots, [ax_k]_m\} = \{[x_1]_m, \dots, [x_k]_m\}$. Thus the product of all of the elements in each of these sets must be equal (mod m), that is,

$$(ax_1)(ax_2) \cdots (ax_k) \equiv x_1 x_2 \cdots x_k \pmod m.$$

By the cancellation law we obtain $a^k \equiv 1 \pmod m$, which is the statement of the theorem. □

Example 7.13. Find the last 3 digits of $17^{801}$. That is, find the least residue of $17^{801} \pmod{1000}$. Note φ(1000) = 400, so by Euler, $17^{400} \equiv 1 \pmod{1000}$. Thus $17^{801} = (17^{400})^2 \cdot 17 \equiv 17 \pmod{1000}$. So the last three digits are 017.
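The exponent reduction used in Examples 7.11 and 7.13 can be sketched as follows. The function names `phi` and `power_mod_via_euler` are assumptions of this example, and the exponent 801 is taken as the intended one in Example 7.13; Python's built-in three-argument `pow` is used as an independent check.

```python
from math import gcd

def phi(n):
    """Euler phi function via n * prod(1 - 1/p)."""
    result, p = n, 2
    while p * p <= n:
        if n % p == 0:
            result -= result // p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        result -= result // n
    return result

def power_mod_via_euler(a, n, m):
    """a^n mod m for (a, m) = 1, reducing the exponent mod phi(m) first."""
    if gcd(a, m) != 1:
        raise ValueError("Euler's theorem needs (a, m) = 1")
    return pow(a, n % phi(m), m)

print(power_mod_via_euler(17, 1802, 27), pow(17, 1802, 27))    # 19 19
print(power_mod_via_euler(17, 801, 1000), pow(17, 801, 1000))  # 17 17
```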

7.6. Applications of Euler’s Theorem and Fermat’s Little Theorem. We will see five applications. (i) Computing powers of integers (mod m). (Already done.) (ii) Finding orders of elements (mod m). (iii) Finding the length of the repeating pattern in the decimal expansion of a rational number. (iv) Primality testing. (v) RSA cryptography. In this section we look at the first two applications.


7.7. Orders of elements (mod m).

Definition 7.6. Let m be a positive integer and a be any integer with (a, m) = 1. The order of a (mod m), written $\mathrm{ord}_m(a)$, is the smallest positive integer k such that $a^k \equiv 1 \pmod m$.

Example 7.14. $\mathrm{ord}_7(2) = 3$, $\mathrm{ord}_5(2) = 4$.

Note 7.2. i) If (a, m) = 1 then $\mathrm{ord}_m(a)$ exists. Why? Consider the values $a, a^2, \dots$ (mod m). Eventually there must be repetition, that is, $a^i \equiv a^j \pmod m$ for some i > j. But then $a^{i-j} \equiv 1 \pmod m$. Thus there exists some k such that $a^k \equiv 1 \pmod m$, and therefore a minimal such k must exist by well ordering.

ii) If (a, m) > 1 then there is no k with $a^k \equiv 1 \pmod m$ and so $\mathrm{ord}_m(a)$ doesn’t exist.

iii) If $k = \mathrm{ord}_m(a)$ then $a^{-1} \equiv a^{k-1} \pmod m$.

Theorem 7.9. Powers of a (mod m). Let (a, m) = 1 and $k = \mathrm{ord}_m(a)$. Then
i) The values $1, a, a^2, \dots, a^{k-1}$ are distinct (mod m).
ii) Every power of a is congruent to exactly one of these values. To be precise, if n ∈ Z and r is the remainder in dividing n by k then $a^n \equiv a^r \pmod m$.
iii) $a^n \equiv 1 \pmod m$ if and only if k|n.

Proof. i) Proof by contradiction. Suppose that $a^i \equiv a^j \pmod m$ for some 0 ≤ i < j < k. Then by the cancellation law $a^{j-i} \equiv 1 \pmod m$, but since 0 < j − i < k this contradicts the minimality of k.

ii) By the division algorithm, n = qk + r, with 0 ≤ r < k. Thus $a^n = (a^k)^q a^r \equiv a^r \pmod m$.

iii) $a^n \equiv 1 \pmod m$ iff $a^r \equiv 1 \pmod m$ iff r = 0 (by minimality of k) iff k|n. □

Theorem 7.10. Orders of elements. Let m ∈ N, a ∈ Z with (a, m) = 1. Then $\mathrm{ord}_m(a) \mid \phi(m)$.

Proof. Let $k = \mathrm{ord}_m(a)$. By Theorem 7.9, $a^n \equiv 1 \pmod m$ iff k|n. Since $a^{\phi(m)} \equiv 1 \pmod m$ (by Euler’s Theorem), we must have k|φ(m). □

Example 7.15. a) Find $k = \mathrm{ord}_{18}(7)$. φ(18) = 6. Thus k|6, that is, k = 1, 2, 3 or 6. Plainly k ≠ 1 (1 is the only element of order 1 for any modulus). $7^2 \equiv 13 \pmod{18}$ and $7^3 \equiv 13 \cdot 7 = 91 \equiv 1 \pmod{18}$, so k = 3.

b) Next let’s find $k = \mathrm{ord}_{18}(5)$. Note $5^2 \equiv 7 \pmod{18}$, $5^3 \equiv -1 \pmod{18}$. Thus k = 6.

For composite moduli, the next theorem is convenient for calculating orders.

Theorem 7.11. Suppose $m = m_1 m_2$ with $(m_1, m_2) = 1$ and (a, m) = 1. Then $\mathrm{ord}_{m_1 m_2}(a) = [\mathrm{ord}_{m_1}(a), \mathrm{ord}_{m_2}(a)]$.

Proof. Just note that $a^k \equiv 1 \pmod{m_1 m_2}$ is equivalent to the system $a^k \equiv 1 \pmod{m_1}$ and $a^k \equiv 1 \pmod{m_2}$. Any k satisfying the first is a multiple of $\mathrm{ord}_{m_1}(a)$ while any k satisfying the second is a multiple of $\mathrm{ord}_{m_2}(a)$. Thus the minimal such k is the least common multiple of these two orders. □

Example 7.16. Find $\mathrm{ord}_{21}(10)$. We must find the minimal k such that $10^k \equiv 1 \pmod{21}$, that is, $10^k \equiv 1 \pmod 3$ and $10^k \equiv 1 \pmod 7$. The first congruence is $1^k \equiv 1 \pmod 3$, which holds for any k, while the second, $3^k \equiv 1 \pmod 7$, requires 6|k. Thus k = 6 is the minimal value.
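The orders in Examples 7.14 to 7.16 can be found by stepping through successive powers. The following is a direct, unoptimized sketch; the function name `order` is chosen here for illustration, and the least common multiple is computed by hand to avoid depending on newer library functions.

```python
from math import gcd

def order(a, m):
    """Smallest k >= 1 with a^k = 1 (mod m); requires (a, m) = 1."""
    if gcd(a, m) != 1:
        raise ValueError("the order exists only when (a, m) = 1")
    target = 1 % m            # equals 0 in the degenerate case m = 1
    k, power = 1, a % m
    while power != target:
        power = (power * a) % m
        k += 1
    return k

def lcm(x, y):
    return x * y // gcd(x, y)

print(order(2, 7), order(2, 5))        # 3 4
print(order(7, 18), order(5, 18))      # 3 6
print(order(10, 21), lcm(order(10, 3), order(10, 7)))  # 6 6
```

The last line illustrates Theorem 7.11 with $m_1 = 3$, $m_2 = 7$.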


7.8. Decimal Expansions. $\frac{1}{7} = 0.\overline{142857}$. In your first homework you discovered that the length of the repeating cycle in the decimal expansion of 1/p, where p is a prime, is a divisor of p − 1. Let’s see how we can predict the length of the cycle without ever finding the decimal expansion.

Theorem 7.12 (Decimal Expansions). Let $\frac{a}{b}$ be a fraction with 0 < a < b and (a, b) = 1. Say $b = 2^e 5^f m$ with (m, 10) = 1, and that $\frac{a}{b}$ has a decimal expansion of the form

$$\frac{a}{b} = 0.a_1 a_2 \dots a_i \overline{c_1 c_2 \dots c_k},$$

with i, k minimal, that is, k is the (minimal) length of the repeating cycle, and the repeating cycle does not start earlier. Then i = max(e, f), and $k = \mathrm{ord}_m(10)$.

Note: The decimal expansion is called purely periodic if i = 0. By the theorem, this will occur iff (b, 10) = 1.

Corollary 7.1. If a/b is a fraction as given in Theorem 7.12 then k|φ(m).

Proof. Let $k = \mathrm{ord}_m(10)$. By Theorem 7.10 we have k|φ(m). □

Example 7.17. Consider $\frac{1}{7}$. Let $k = \mathrm{ord}_7(10) = \mathrm{ord}_7(3)$. k|6, so k = 2, 3 or 6, and one easily finds k = 6. Thus 1/7 is purely periodic with cycle of length 6.

Example 7.18. Consider $\frac{125}{336}$. Note $125 = 5^3$, $336 = 2^4 \cdot 3 \cdot 7$. Thus m = 21, $k = \mathrm{ord}_{21}(10) = [\mathrm{ord}_3(10), \mathrm{ord}_7(10)] = [\mathrm{ord}_3(1), \mathrm{ord}_7(3)] = [1, 6] = 6$. Thus i = max(e, f) = 4, k = 6. One finds on a calculator $\frac{125}{336} = 0.3720\overline{238095}$. Having done the calculation of i and k ahead of time we are assured that the answer on the calculator is exact.
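Theorem 7.12 can be checked against long division. The sketch below is an illustration under the theorem's hypotheses; the names `decimal_structure` and `decimal_digits` are made up for this example. For a terminating decimal (m = 1) the helper reports a cycle of length 1, namely the repeating digit 0.

```python
from math import gcd

def order(a, m):
    """Smallest k >= 1 with a^k = 1 (mod m), assuming (a, m) = 1."""
    target, k, power = 1 % m, 1, a % m
    while power != target:
        power = (power * a) % m
        k += 1
    return k

def decimal_structure(a, b):
    """Return (i, k): number of non-repeating digits and cycle length of a/b."""
    b //= gcd(a, b)
    e = f = 0
    while b % 2 == 0:
        b //= 2
        e += 1
    while b % 5 == 0:
        b //= 5
        f += 1
    return max(e, f), order(10, b)   # b is now the part m with (m, 10) = 1

def decimal_digits(a, b, count):
    """First `count` digits after the decimal point of a/b, by long division."""
    digits, remainder = [], a % b
    for _ in range(count):
        remainder *= 10
        digits.append(str(remainder // b))
        remainder %= b
    return "".join(digits)

print(decimal_structure(1, 7), decimal_digits(1, 7, 12))        # (0, 6) 142857142857
print(decimal_structure(125, 336), decimal_digits(125, 336, 16))  # (4, 6) 3720238095238095
```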

Proof of Theorem 7.12. Let $\frac{a}{b}$ have a decimal expansion as given in the theorem with i, k minimal. Then

$$10^i\,\frac{a}{b} = a_1 \dots a_i.\overline{c_1 \dots c_k}$$

and

$$10^{i+k}\,\frac{a}{b} = a_1 \dots a_i c_1 \cdots c_k.\overline{c_1 \dots c_k}.$$

Subtracting, we get $10^i\,\frac{a}{b}\,(10^k - 1) \in \mathbb{Z}$. Thus $b \mid 10^i a (10^k - 1)$. Since (b, a) = 1 this is equivalent to $b \mid 10^i(10^k - 1)$, that is, $2^e 5^f m \mid 10^i(10^k - 1)$. Thus, by Euclid’s lemma, $2^e 5^f \mid 10^i$ and, since (m, 10) = 1, $m \mid (10^k - 1)$. Moreover any i, k satisfying the last two conditions gives rise to such a decimal expansion. Thus, i is the minimal integer satisfying $2^e 5^f \mid 10^i$ and k is the minimal integer satisfying $m \mid 10^k - 1$. Plainly, i = max(e, f) and $k = \mathrm{ord}_m(10)$. □

7.9. Primality Testing. How can we efficiently test whether a given 100 digit number is a prime? We need the following facts:

i) Fermat’s Little Theorem: If p is a prime and $p \nmid a$ then $a^{p-1} \equiv 1 \pmod p$.

ii) If p is a prime and $x^2 \equiv 1 \pmod p$ then x ≡ ±1 (mod p).

iii) If p is an odd prime and $p \nmid a$ then $a^{\frac{p-1}{2}} \equiv \pm 1 \pmod p$.

Proof. (i) Done. (ii) $p \mid (x^2 - 1)$ iff $p \mid (x - 1)(x + 1)$ iff $p \mid (x - 1)$ or $p \mid (x + 1)$ iff x ≡ ±1 (mod p). (iii) Let $x \equiv a^{\frac{p-1}{2}} \pmod p$. Then by FLT $x^2 \equiv 1 \pmod p$ and so by (ii), x ≡ ±1 (mod p). □


Theorem 7.13 (Composite Number Test). Let m be a positive integer (we wish to test for primality) and b be any integer with (b, m) = 1. If $b^{m-1} \not\equiv 1 \pmod m$, then m is composite.

Proof. Proof by contradiction. Suppose that m is a prime. Since (b, m) = 1 we would then have by FLT that $b^{m-1} \equiv 1 \pmod m$, a contradiction. □

Definition 7.7. (i) A composite number m is called a pseudoprime to the base b if $b^{m-1} \equiv 1 \pmod m$ (that is, b satisfies the criterion in FLT).

(ii) A composite number m is called a Carmichael number if m is a pseudoprime to every base b relatively prime to m.

Note 7.3. (i) Carmichael numbers exist. You show in homework that $561 = 3 \cdot 11 \cdot 17$ is a Carmichael number.

(ii) There exist infinitely many Carmichael numbers.

The Strong Pseudoprime Test for Primality. Let m be a given odd number we wish to test for primality. Start with base 2, and calculate $x \equiv 2^{\frac{m-1}{2}} \pmod m$. There are four options: If $x \not\equiv \pm 1 \pmod m$ then m is composite. If x ≡ −1 (mod m), stop and change base. If x ≡ 1 (mod m) and $4 \nmid m - 1$ then change base. If x ≡ 1 (mod m) and 4|(m − 1) then calculate $y \equiv 2^{\frac{m-1}{4}} \pmod m$. Note $y^2 \equiv 1 \pmod m$, so if m is a prime we should have y ≡ ±1 (mod m), and we repeat the four options with y in place of x. The number of possible repetitions for a given base is at most the multiplicity of 2 dividing m − 1. The next base we test is 3, then 5, 7, 11, 13, ..., running through the primes.

By just using bases 2 and 3 we can test any number up to one million and arrive at a definitive conclusion as to whether it is a prime or not. Using bases 2, 3, 5, 7, 11 we can test any number up to $2 \cdot 10^{12}$. This algorithm runs extremely fast (on the order of microseconds) on a computer.
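The four options described above can be written as a loop that keeps halving the exponent. The sketch below follows that description; the function names `passes_base` and `is_probable_prime` are assumptions of this example, and the extra divisibility check at the start is added here so that every base used is relatively prime to m.

```python
def passes_base(m, b):
    """Run the strong pseudoprime test on odd m > 2 for one base b with (b, m) = 1.

    Returns False if m is shown to be composite, True if we should change base.
    """
    exponent = (m - 1) // 2
    while True:
        x = pow(b, exponent, m)
        if x == m - 1:           # x = -1 (mod m): stop and change base
            return True
        if x != 1:               # x is neither 1 nor -1: m is composite
            return False
        if exponent % 2 == 1:    # x = 1 and the exponent cannot be halved again
            return True
        exponent //= 2

def is_probable_prime(m, bases=(2, 3)):
    """With bases (2, 3) the notes state the answer is definitive up to one million."""
    if m < 2:
        return False
    for b in bases:
        if m % b == 0:           # handles even m, and m equal to one of the bases
            return m == b
    return all(passes_base(m, b) for b in bases)

print([n for n in range(1, 60) if is_probable_prime(n)])
print(passes_base(2047, 2), passes_base(2047, 3))  # True False
```

The number $2047 = 23 \cdot 89$ survives base 2 but is exposed as composite by base 3, which is why more than one base is used.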

7.10. Public-Key Cryptography. RSA-Method: Rivest, Shamir, Adleman (1978).

1. First we need a way of changing words into numbers. This can be public and as simple as A = 01, B = 02, . . . , Z = 26, space = 00, etc. Thus the word “Hello” would become 0805121215, which we think of as the nine-digit number 805,121,215. Sentences are broken into pieces such that each piece becomes a number less than the modulus we are working with.

2. Each person selects two distinct primes p and q, each with say 200 digits, and multiplies them to create their public modulus m = pq (with about 400 digits). Each person also selects an encoding exponent e relatively prime to the value L calculated in step 3. A public phone book is made listing name, modulus m and encoding exponent e. The individual primes p and q are kept secret.

3. Each person P also calculates two secret values: (i) $L := [p-1, q-1]$, the least common multiple of p − 1 and q − 1. This value can be calculated since P knows the individual values p, q. (ii) The decoding exponent d is chosen so that $de \equiv 1 \pmod L$, that is, $d \equiv e^{-1} \pmod L$. Such a d exists since e was selected relatively prime to L.

4. Encoding the message: Suppose that person P wishes to send a message to person Q. Person P looks in the phone book for person Q's m and e, and then chops his/her message into pieces smaller than m and relatively prime to m. Let M be one such piece. Person P then calculates the least residue $M_e$ of $M^e \pmod m$, that is,

$$M_e \equiv M^e \pmod m, \quad 0 < M_e < m.$$

$M_e$ is called the encoded message.

5. The encoded message is delivered to person Q in a public manner. Anyone is free to look at $M_e$, but it is undecipherable to anyone not having the decoding exponent d.

6. Person Q receives the message and calculates the least residue $M_d$ of $M_e^d \pmod m$, that is,

$$M_d \equiv M_e^d \pmod m, \quad 0 < M_d < m.$$

We claim that $M_d = M$, that is, person Q has recovered the original message!

Proof. Since $M_d$ and M are less than m it suffices to show that $M_d \equiv M \pmod m$. Claim: $M^L \equiv 1 \pmod m$. This is equivalent to $M^L \equiv 1 \pmod p$ and $M^L \equiv 1 \pmod q$. Since L is a multiple of p − 1 we have, by FLT, $M^L \equiv 1 \pmod p$, and since L is a multiple of q − 1 we have $M^L \equiv 1 \pmod q$, completing the proof of the claim.

Now, since $ed \equiv 1 \pmod L$, we have ed = 1 + kL for some integer k. Thus

$$M_d \equiv M_e^d \equiv (M^e)^d \equiv M^{ed} = M^{1+kL} = M \cdot (M^L)^k \equiv M \pmod m,$$

by the claim. □

Example 7.19. Let p = 31, q = 37, m = pq = 1147. Then L = [p − 1, q − 1] = [30, 36] = 180. Let e = 7. Note that (e, L) = 1. Select $d \equiv e^{-1} \pmod L$, so d = 103. Let's send the message M = 805. Then $M_e \equiv 805^7 \equiv 650 \pmod m$ and $M_d \equiv 650^{103} \equiv 805 \pmod m$.
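
The numbers in this example can be reproduced with a few lines of Python. This is a toy sketch with tiny primes, not a usable cryptosystem; the helper names `encode` and `decode` are hypothetical, and `pow(e, -1, L)` (the modular inverse) assumes Python 3.8 or later.

```python
from math import gcd

p, q = 31, 37
m = p * q                                     # public modulus
L = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # L = [p - 1, q - 1], the lcm
e = 7                                         # public encoding exponent
assert gcd(e, L) == 1
d = pow(e, -1, L)                             # secret decoding exponent, d*e ≡ 1 (mod L)


def encode(M):
    """Least residue of M^e (mod m)."""
    return pow(M, e, m)


def decode(Me):
    """Least residue of Me^d (mod m)."""
    return pow(Me, d, m)


M = 805
Me = encode(M)
Md = decode(Me)
print(m, L, d, Me, Md)   # 1147 180 103 650 805
```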

7.11. Computing powers (mod m). An efficient way to compute powers (mod m) is to use the binary expansion of the power. Most computing software that handles modular arithmetic uses this method. Let's illustrate the method with an example.

Example 7.20. Find $2^{149} \pmod m$. The binary expansion of 149 is given by

$$149 = 128 + 0 \cdot 64 + 0 \cdot 32 + 16 + 0 \cdot 8 + 4 + 0 \cdot 2 + 1 = 10010101_{\text{two}}.$$

By successive squaring we calculate $2^2, 2^4, 2^8, 2^{16}, \dots, 2^{128} \pmod m$. Start with x = 1. If a digit 1 appears in the binary expansion then we replace x with x times the corresponding power of 2, as we go along. Thus, altogether we will have computed

$$1 \cdot 2^1 \cdot 2^4 \cdot 2^{16} \cdot 2^{128} \equiv 2^{1+4+16+128} = 2^{149} \pmod m.$$
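
A minimal Python sketch of the successive-squaring method follows. The function name `power_mod` is hypothetical; Python's built-in `pow(base, exponent, m)` computes the same value and is used here only as a cross-check.

```python
def power_mod(base, exponent, m):
    """Compute base**exponent (mod m) by reading the binary digits of the exponent."""
    x = 1
    square = base % m              # holds base^(2^i) mod m at step i
    while exponent > 0:
        if exponent & 1:           # binary digit 1: multiply this power into x
            x = (x * square) % m
        square = (square * square) % m
        exponent >>= 1             # move on to the next binary digit
    return x % m


print(bin(149))                    # 0b10010101
print(power_mod(2, 149, 1147) == pow(2, 149, 1147))   # True
```

The loop runs once per binary digit, so computing $2^{149}$ takes 8 squarings instead of 148 multiplications.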

7.12. Polynomial Congruences. We wish to solve the congruence

(7.3) f(x) ≡ 0 (mod m),

where f(x) is a polynomial with integer coefficients.

The three step process: Let $m = p_1^{e_1} \cdots p_k^{e_k}$.

(i) Solve the congruence $f(x) \equiv 0 \pmod{p_i}$ for each prime $p_i$.
(ii) Lift the solutions in (i) from $\pmod{p_i}$ to solutions $\pmod{p_i^{e_i}}$.
(iii) Use CRT to find all possible solutions (mod m) using the information from (ii).

Example 7.21. Solve $26x^3 + x^2 - 13x + 5 \equiv 0 \pmod{35}$. Note this is equivalent to solving the congruence (mod 7) and (mod 5).

(i) Solve (mod 5).

$$26x^3 + x^2 - 13x + 5 \equiv x^3 + x^2 + 2x = x(x^2 + x + 2) \pmod 5.$$


The quadratic has no zero (mod 5) (as seen by testing 0, 1, 2, 3, 4). Thus the only solution (mod 5) is $x \equiv 0 \pmod 5$.

(ii) Next solve (mod 7). First note that

$$26x^3 + x^2 - 13x + 5 \equiv 5x^3 + x^2 + x + 5 = 5(x^3 + 1) + x(x + 1) = (x+1)\bigl(5(x^2 - x + 1) + x\bigr) = (x+1)(5x^2 - 4x + 5) \pmod 7.$$

Testing 0, 1, . . . , 6 shows that the only zero of the quadratic (mod 7) is $x \equiv -1$ (indeed $5 + 4 + 5 = 14 \equiv 0$), which we have already found, and so the unique solution is $x \equiv -1 \pmod 7$. By CRT we then find that the unique solution to the original congruence is $x \equiv 20 \pmod{35}$.
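
The computation in this example can be checked by brute force. The sketch below is illustrative only: the helper names `roots_mod` and `crt_pair` are hypothetical, coefficients are listed from the highest degree down, and `pow(m, -1, n)` assumes Python 3.8 or later.

```python
from itertools import product


def roots_mod(coeffs, n):
    """All x in 0..n-1 with f(x) ≡ 0 (mod n), found by trial (Horner evaluation)."""
    def f(x):
        r = 0
        for c in coeffs:
            r = (r * x + c) % n
        return r
    return [x for x in range(n) if f(x) == 0]


def crt_pair(a, m, b, n):
    """The unique x in 0..m*n-1 with x ≡ a (mod m) and x ≡ b (mod n), for (m, n) = 1."""
    t = ((b - a) * pow(m, -1, n)) % n
    return a + m * t


coeffs = [26, 1, -13, 5]            # 26x^3 + x^2 - 13x + 5
r5 = roots_mod(coeffs, 5)
r7 = roots_mod(coeffs, 7)
sols = sorted(crt_pair(a, 5, b, 7) for a, b in product(r5, r7))
print(r5, r7, sols)                 # [0] [6] [20]
```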

7.13. Lifting solutions from (mod p) to (mod $p^e$). Let p be a prime and f(x) a polynomial with integer coefficients. Suppose that we wish to solve $f(x) \equiv 0 \pmod{p^2}$. Any solution must already be a solution (mod p). Let $x_1$ be an integer solution of the congruence

$$f(x) \equiv 0 \pmod p.$$

We shall attempt to lift the solution $x_1$ to a solution $\pmod{p^2}$, that is, find a point $x_2$ such that

(7.4) $x_2 \equiv x_1 \pmod p$ and $f(x_2) \equiv 0 \pmod{p^2}$.

Say $x_2 = x_1 + tp$ for some $t \in \mathbb{Z}$. Can we choose t so that this is a solution $\pmod{p^2}$?

Recall from Calc II the Taylor expansion,

$$f(a+y) = f(a) + f'(a)y + \frac{f''(a)}{2} y^2 + \cdots + \frac{f^{(n)}(a)}{n!} y^n,$$

for any $a, y \in \mathbb{Z}$, where n is the degree of f. Note that since the coefficients on the left are all integers, so are the coefficients on the right. Inserting $a = x_1$, $y = pt$ we obtain

$$f(x_1 + tp) = f(x_1) + f'(x_1)tp + \frac{f''(x_1)}{2}(tp)^2 + \cdots \equiv f(x_1) + f'(x_1)tp \pmod{p^2},$$

since all of the other terms are divisible by $p^2$. Thus we need to solve the congruence

(7.5) Lifting Congruence: $f'(x_1)\,t \equiv -\dfrac{f(x_1)}{p} \pmod p$.

The three possibilities:

(i) If $f'(x_1) \not\equiv 0 \pmod p$, then there is a unique solution t of (7.5) and hence a unique solution $x_2$ of (7.4) $\pmod{p^2}$.

(ii) If $f'(x_1) \equiv 0 \pmod p$ and $f(x_1) \not\equiv 0 \pmod{p^2}$, then there is no solution of (7.5) and hence no solution of (7.4).

(iii) If $f'(x_1) \equiv 0 \pmod p$ and $f(x_1) \equiv 0 \pmod{p^2}$, then any value of t is a solution of (7.5), and hence there are p distinct solutions of (7.4) $\pmod{p^2}$.

Suppose that we have constructed by induction a sequence of integers $x_1, x_2, \dots, x_n$ such that

$$x_{i+1} \equiv x_i \pmod{p^i} \quad \text{and} \quad f(x_i) \equiv 0 \pmod{p^i},$$

for i = 1, 2, . . . , n. To continue we wish to find an $x_{n+1} = x_n + p^n t$ such that $f(x_n + p^n t) \equiv 0 \pmod{p^{n+1}}$. This amounts to solving

$$f(x_n) + f'(x_n)p^n t \equiv 0 \pmod{p^{n+1}},$$

or equivalently (noting that $f'(x_1) \equiv f'(x_n) \pmod p$)

$$f'(x_1)\,t \equiv -\frac{f(x_n)}{p^n} \pmod p,$$

and so again we have three possibilities.

Definition 7.8. A solution $x_1$ of the congruence $f(x) \equiv 0 \pmod p$ is called nonsingular if $f'(x_1) \not\equiv 0 \pmod p$ and singular if $f'(x_1) \equiv 0 \pmod p$.

Theorem 7.14. If $x_1$ is a nonsingular solution of the congruence $f(x) \equiv 0 \pmod p$ then for any positive integer n there is a unique solution $x_n$ $\pmod{p^n}$ of the congruence $f(x) \equiv 0 \pmod{p^n}$ such that $x_n \equiv x_1 \pmod p$.

Example 7.22. Solve the congruence $x^2 \equiv -1 \pmod{125}$. Start with $x^2 \equiv -1 \pmod 5$, which has solutions ±2. First let's lift 2. Set x = 2 + 5t. Here $f(x) = x^2 + 1$, f(2) = 5, f′(2) = 4, and so the Lifting Congruence is $4t \equiv -1 \pmod 5$, which gives $t \equiv 1 \pmod 5$, $x \equiv 7 \pmod{25}$. Next lift 7. Set x = 7 + 25t. f(7) = 50. The Lifting Congruence is $4t \equiv -50/25 \pmod 5$, so $t \equiv 2 \pmod 5$ and $x \equiv 57 \pmod{125}$. Clearly, the second solution (obtained by lifting −2) is $x \equiv -57 \pmod{125}$.
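
The repeated use of the Lifting Congruence for a nonsingular solution can be sketched in Python. This is an illustrative sketch only: the names `poly_eval`, `poly_deriv` and `lift_nonsingular` are hypothetical, coefficients are listed from the highest degree down, and `pow(..., -1, p)` assumes Python 3.8 or later.

```python
def poly_eval(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    r = 0
    for c in coeffs:
        r = r * x + c
    return r


def poly_deriv(coeffs):
    """Coefficients of the derivative f'."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def lift_nonsingular(coeffs, x1, p, n):
    """Lift a nonsingular solution x1 (mod p) to the solution x_n (mod p^n)."""
    inv = pow(poly_eval(poly_deriv(coeffs), x1), -1, p)   # needs f'(x1) not ≡ 0 (mod p)
    x, pk = x1 % p, p
    for _ in range(n - 1):
        # Lifting Congruence: f'(x1) t ≡ -f(x)/p^k (mod p)
        t = (-(poly_eval(coeffs, x) // pk) * inv) % p
        x += t * pk
        pk *= p
    return x


print(lift_nonsingular([1, 0, 1], 2, 5, 2))   # 7
print(lift_nonsingular([1, 0, 1], 2, 5, 3))   # 57
```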

Example 7.23. Solve $x^3 + x^2 + 23 \equiv 0 \pmod{5^3}$. Start with the same congruence (mod 5). By trial and error we see that $x \equiv 1$ or 2 (mod 5).

(i) Take $x_1 = 1$. Put x = 1 + 5t. Note that $f'(1) = 5 \equiv 0 \pmod 5$, that is, 1 is a singular solution, while $f(1)/5 = 5 \equiv 0 \pmod 5$. Thus we have option (iii), that is, the lifting congruence is $0t \equiv 0 \pmod 5$, so t is arbitrary and we get $x_2 = 1 + 5t = 1, 6, 11, 16, 21$. Now $f(1+5t)/25 \equiv 4t^2 + t + 1 \pmod 5$, and we see $f(1+5t)/25 \equiv 0 \pmod 5$ iff $t \equiv 3 \pmod 5$. Thus for $x_2 = 16$ we have option (iii) and get five liftings to solutions (mod 125), namely $x \equiv 16, 41, 66, 91, 116 \pmod{125}$, while the other four values of $x_2$ fall under option (ii) and do not lift.

If one continues this to $\pmod{5^4}$ one discovers that all of the solutions $\pmod{5^3}$ lift. Thus there are 25 solutions (mod 625) all living above $x_1 = 1$.

(ii) Since $x_1 = 2$ is a nonsingular solution, there is a unique lifting each time. We obtain $x_2 \equiv 17 \pmod{25}$ and $x_3 \equiv 42 \pmod{125}$, and (if we continue one more level) $x_4 \equiv 417 \pmod{625}$.

This information can be displayed in a tree graph with vertices 1 and 2 at the top and branches below for the (mod 25), (mod 125), (mod 625) liftings.
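
The whole tree, singular branches included, can be generated level by level by trying every t in 0, . . . , p − 1 above each solution. The sketch below does this by direct testing rather than through the Lifting Congruence; the name `lift_all` is hypothetical.

```python
def lift_all(f, p, n):
    """All solutions of f(x) ≡ 0 (mod p^n), built up one power of p at a time."""
    sols = [x for x in range(p) if f(x) % p == 0]
    pk = p
    for _ in range(n - 1):
        # every solution mod p^(k+1) has the form x + t*p^k with x a solution mod p^k
        sols = [x + t * pk for x in sols for t in range(p)
                if f(x + t * pk) % (pk * p) == 0]
        pk *= p
    return sorted(sols)


f = lambda x: x**3 + x**2 + 23
print(lift_all(f, 5, 1))        # [1, 2]
print(lift_all(f, 5, 2))        # [1, 6, 11, 16, 17, 21]
print(lift_all(f, 5, 3))        # [16, 41, 42, 66, 91, 116]
print(len(lift_all(f, 5, 4)))   # 26
```

The count 26 at the last level is the 25 solutions above $x_1 = 1$ together with the single solution 417 above $x_1 = 2$.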

Example 7.24. Solve the congruence $f(x) = x^3 + 7x^2 + x \equiv x(x-1)^2 \equiv 0 \pmod{3^2}$.

7.14. Counting Solutions of congruences.

Example 7.25. Suppose that we wish to count the number of solutions of $f(x) \equiv 0 \pmod{35}$, where f(x) is a polynomial over Z. We start by solving the congruences $f(x) \equiv 0 \pmod 5$ and $f(x) \equiv 0 \pmod 7$. Say $a_1, a_2, \dots, a_r$ are the solutions of the former, and $b_1, \dots, b_s$ the solutions of the latter. By CRT, for any choice of i, j there is a unique x (mod 35) with

$$x \equiv a_i \pmod 5, \qquad x \equiv b_j \pmod 7.$$

By the substitution principle, $f(x) \equiv f(a_i) \equiv 0 \pmod 5$ and $f(x) \equiv f(b_j) \equiv 0 \pmod 7$, and so $f(x) \equiv 0 \pmod{35}$. Thus, altogether, we obtain rs solutions (mod 35).


The content of this example is given in the following theorem.

Theorem 7.15. Let f(x) be a polynomial with integer coefficients and m a positive integer with factorization $m = p_1^{e_1} \cdots p_k^{e_k}$. Then

(i) x is a solution of the congruence

(7.6) $f(x) \equiv 0 \pmod m$

if and only if x satisfies the system of congruences

(7.7) $f(x) \equiv 0 \pmod{p_i^{e_i}}$, $1 \le i \le k$.

(ii) Letting N(m) denote the number of solutions (mod m) of (7.6) and $N(p_i^{e_i})$ denote the number of solutions $\pmod{p_i^{e_i}}$ of the i-th congruence in (7.7), we have $N(m) = \prod_{i=1}^{k} N(p_i^{e_i})$.

Proof. (i) $m \mid f(x) \Leftrightarrow p_i^{e_i} \mid f(x)$, $1 \le i \le k$.

(ii) We claim that the CRT gives us a one-to-one correspondence between the k-tuples $(x_1, \dots, x_k) \in \mathbb{Z}/(p_1^{e_1}) \times \cdots \times \mathbb{Z}/(p_k^{e_k})$ with $x_i$ a solution of (7.7) for $1 \le i \le k$ and the solutions x of (7.6). Indeed, suppose that $x_i$ is a solution of (7.7) for $1 \le i \le k$, and let x (mod m) be the unique value with $x \equiv x_i \pmod{p_i^{e_i}}$, $1 \le i \le k$. Such an x satisfies $f(x) \equiv f(x_i) \equiv 0 \pmod{p_i^{e_i}}$ for all i, and so $f(x) \equiv 0 \pmod m$. □
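
The product formula in part (ii) is easy to test numerically for small moduli. The following brute-force check is only a sanity test of the statement on two sample polynomials; the name `count_roots` is hypothetical.

```python
def count_roots(f, n):
    """N(n): the number of x in 0..n-1 with f(x) ≡ 0 (mod n)."""
    return sum(1 for x in range(n) if f(x) % n == 0)


f = lambda x: x * x - 1
# 105 = 3 * 5 * 7, and x^2 ≡ 1 has two solutions modulo each odd prime
print(count_roots(f, 3), count_roots(f, 5), count_roots(f, 7), count_roots(f, 105))   # 2 2 2 8

g = lambda x: x**3 + x**2 + 23
# 1000 = 2^3 * 5^3
print(count_roots(g, 1000) == count_roots(g, 8) * count_roots(g, 125))   # True
```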

7.15. Solving congruences (mod p). As we saw above, in order to solve a polynomial congruence (mod m), one starts by solving congruences (mod p) where p is a prime. For small p this is generally done by trial and error. Another tool that can be useful is the factor theorem for congruences.

Theorem 7.16. Factor Theorem. Suppose that p is a prime, f is a polynomial of degree d over Z, and that a is a solution of the polynomial congruence $f(x) \equiv 0 \pmod p$. Then $f(x) \equiv (x-a)g(x) \pmod p$, for some polynomial g(x) over Z of degree d − 1.

Note: To say two polynomials are congruent (mod p) means that all of the corresponding coefficients are congruent (mod p).

Proof. We are given that $f(a) \equiv 0 \pmod p$. Thus f(a) = kp for some $k \in \mathbb{Z}$. Let h(x) = f(x) − kp. Then a is a zero of h(x) and so by the factor theorem for Z, (x − a) is a factor of h(x), that is, h(x) = (x − a)g(x) for some polynomial g(x) over Z. Clearly deg(g) = d − 1 and f(x) = (x − a)g(x) + pk, that is, $f(x) \equiv (x-a)g(x) \pmod p$. □

Definition 7.9. (i) We say that a is a zero of a polynomial f(x) (mod p) if $f(a) \equiv 0 \pmod p$. In this case (x − a) is a factor of f(x) (mod p).

(ii) We say that a is a zero of f(x) (mod p) of multiplicity k if $(x-a)^k$ is a factor of f(x) (mod p), that is, $f(x) \equiv (x-a)^k g(x) \pmod p$ for some polynomial g(x).

Example 7.26. (i) Let $f(x) = x^p - 1$. Since $p \mid \binom{p}{k}$ for $1 \le k \le p-1$ we have $x^p - 1 \equiv (x-1)^p \pmod p$, and so 1 is a zero of f(x) (mod p) of multiplicity p.

(ii) Let $f(x) = x^{p-1} - 1$. By FLT, 1, 2, . . . , p − 1 are all zeros of f(x) (mod p), and so

$$x^{p-1} - 1 \equiv (x-1)(x-2) \cdots (x-(p-1)) \pmod p.$$

In particular, matching the constant terms on the RHS and LHS, we obtain Wilson's Theorem, $(p-1)! \equiv -1 \pmod p$.
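
The polynomial congruence in (ii) can be checked for small primes by multiplying out the linear factors with coefficients reduced (mod p). This is a small verification sketch; the names `poly_mul_mod` and `product_of_linear_factors` are hypothetical, and coefficients are listed from the highest degree down.

```python
def poly_mul_mod(a, b, p):
    """Product of two polynomials with coefficients reduced (mod p)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def product_of_linear_factors(p):
    """Coefficients of (x - 1)(x - 2)...(x - (p - 1)) reduced (mod p)."""
    prod = [1]
    for a in range(1, p):
        prod = poly_mul_mod(prod, [1, -a], p)
    return prod


# For p = 7 this is x^6 - 1, i.e. x^6 + 6 (mod 7)
print(product_of_linear_factors(7))   # [1, 0, 0, 0, 0, 0, 6]
```

The constant term of the product is $(p-1)!$ reduced (mod p) (p − 1 is even for odd p), so the final entry p − 1 is Wilson's Theorem.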


Theorem 7.17. Lagrange's Theorem. Let p be a prime and f(x) a polynomial of degree d over Z whose leading coefficient is not divisible by p. Then the congruence $f(x) \equiv 0 \pmod p$ has at most d solutions (counted with multiplicity).

Proof. The proof is by induction on d. When d = 1 the statement is trivial. Indeed, a linear congruence has either no solution or 1 solution (mod p). Suppose the statement is true for d and now let f be a polynomial of degree d + 1. If f has no zero (mod p) we are done. Otherwise, let a be a zero of f (mod p). Then, by the factor theorem $f(x) \equiv (x-a)g(x) \pmod p$ for some polynomial g(x) of degree d. By the induction assumption g(x) has at most d zeros (mod p). Thus f(x) has at most d + 1 zeros, since if $f(x) \equiv 0 \pmod p$ then either $x - a \equiv 0 \pmod p$ or $g(x) \equiv 0 \pmod p$ (since p is a prime). Thus either $x \equiv a \pmod p$, or x is one of the zeros of g(x) (mod p). □

Note 7.4. Lagrange's Theorem fails for composite moduli. For example if $f(x) = x^2 - 1$ and $m = p_1 \cdots p_k$, a product of k distinct odd primes, then $f(x) \equiv 0 \pmod m$ has $2^k$ distinct solutions (mod m), even though f(x) is just of degree 2.

Example 7.27. Solve $x^3 + x + 1 \equiv 0 \pmod{11}$. Plainly x = 2 is a solution, and so (x − 2) is a factor. By long division we obtain $x^3 + x + 1 \equiv (x-2)(x^2 + 2x + 5) \pmod{11}$. By trial and error one can check that the quadratic has no solution. Thus $x \equiv 2 \pmod{11}$ is the only solution.
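
The long division by (x − a) can be carried out (mod p) with synthetic division. The sketch below is one way to do it; the name `divide_by_linear` is hypothetical and coefficients are listed from the highest degree down.

```python
def divide_by_linear(coeffs, a, p):
    """Divide f(x) by (x - a) modulo p using synthetic division.

    Returns (quotient coefficients, remainder); the remainder equals f(a) mod p.
    """
    acc = []
    carry = 0
    for c in coeffs:
        carry = (carry * a + c) % p
        acc.append(carry)
    return acc[:-1], acc[-1]


quotient, remainder = divide_by_linear([1, 0, 1, 1], 2, 11)   # x^3 + x + 1 by (x - 2)
print(quotient, remainder)                                     # [1, 2, 5] 0

# Trial and error on the quadratic factor x^2 + 2x + 5 (mod 11)
print([x for x in range(11) if (x * x + 2 * x + 5) % 11 == 0])   # []
```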

Example 7.28. Solve the congruence $x^3 + x + 1 \equiv 0 \pmod{31^2}$. Hint: Note that 3 is a solution (mod 31). Use the factor theorem and the quadratic formula to obtain others.

Example 7.29. Solve the congruence $x^{495} - x^{24} + 3 \equiv 0 \pmod 7$. Hint: Use Fermat's Little Theorem to make life easier.

8. Quadratic Residues and the Legendre Symbol

Definition 8.1. Let p be a prime, $a \in \mathbb{Z}$, with $p \nmid a$. a is called a quadratic residue (mod p) if $a \equiv x^2 \pmod p$ for some integer x. Otherwise a is called a quadratic non-residue (mod p).

Example 8.1. 5 is a quadratic residue (mod 11) since $7^2 = 49 \equiv 5 \pmod{11}$.

Theorem 8.1. Let p be an odd prime. Exactly $\frac{p-1}{2}$ of the nonzero values (mod p) are quadratic residues and $\frac{p-1}{2}$ are not.

Proof. The quadratic residues are $1^2, 2^2, \dots, (p-1)^2 \pmod p$. Note $x^2 \equiv y^2 \pmod p$ iff $x \equiv \pm y \pmod p$. Thus the values $1^2, 2^2, \dots, \left(\frac{p-1}{2}\right)^2 \pmod p$ are the distinct quadratic residues. □

Example 8.2. Find all quadratic residues (mod 11). $1^2 \equiv 1$, $2^2 \equiv 4$, $3^2 \equiv 9$, $4^2 \equiv 5$, $5^2 \equiv 3 \pmod{11}$.
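
Following the proof of Theorem 8.1, it is enough to square 1, 2, . . . , (p − 1)/2. A short Python sketch (the function name `quadratic_residues` is hypothetical):

```python
def quadratic_residues(p):
    """Sorted list of the quadratic residues modulo an odd prime p."""
    return sorted({(x * x) % p for x in range(1, (p - 1) // 2 + 1)})


print(quadratic_residues(11))   # [1, 3, 4, 5, 9]
```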

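Definition 8.2. Let p be a prime, $a \in \mathbb{Z}$, $p \nmid a$. The Legendre symbol is defined by

$$\left(\frac{a}{p}\right) = \begin{cases} 1 & \text{if $a$ is a quadratic residue (mod $p$);} \\ -1 & \text{if $a$ is a quadratic non-residue (mod $p$).} \end{cases}$$

Example 8.3. $\left(\frac{5}{11}\right) = 1$, $\left(\frac{2}{3}\right) = -1$.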


Theorem 8.2. Euler's Criterion. Let p be an odd prime and $a \in \mathbb{Z}$ with $p \nmid a$. Then

(8.1) $\left(\dfrac{a}{p}\right) \equiv a^{(p-1)/2} \pmod p$.

Proof. We've already seen that the RHS is congruent to ±1 (mod p) (by FLT). Suppose that a is a quadratic residue, so that $a \equiv x^2 \pmod p$ for some integer x. Then $a^{(p-1)/2} \equiv x^{p-1} \equiv 1 \pmod p$, and so both sides of (8.1) are 1. Thus all $\frac{p-1}{2}$ quadratic residues are solutions of the congruence $x^{(p-1)/2} \equiv 1 \pmod p$. Since this is a polynomial congruence of degree $\frac{p-1}{2}$ it cannot have any other solutions by Lagrange's theorem. Thus for any quadratic non-residue (mod p) the RHS must be −1, agreeing with the LHS. □

Example 8.4. $\left(\frac{3}{13}\right) \equiv 3^6 \equiv (3^3)^2 \equiv 1 \pmod{13}$, so 3 is a quadratic residue. Indeed $4^2 \equiv 3 \pmod{13}$.
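
Euler's Criterion gives a direct way to compute the Legendre symbol with one modular power. A minimal sketch, assuming p is an odd prime not dividing a (the function name `legendre` is hypothetical):

```python
def legendre(a, p):
    """Legendre symbol (a/p) via Euler's Criterion: a^((p-1)/2) mod p."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r   # the residue p - 1 represents -1


print(legendre(3, 13), legendre(5, 11), legendre(2, 3))   # 1 1 -1
```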

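Theorem 8.3. Multiplicative property of Legendre symbol. Suppose that p is a prime and that $a, b \in \mathbb{Z}$ with $p \nmid ab$. Then

$$\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right).$$

Proof. Trivial for p = 2. Suppose p is odd. By Euler's criterion we have

$$\left(\frac{ab}{p}\right) \equiv (ab)^{(p-1)/2} \equiv a^{(p-1)/2}\, b^{(p-1)/2} \equiv \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) \pmod p.$$

Since the LHS and RHS are both ±1 we see that equality (as integers) follows. □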

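Theorem 8.4. Trivial Properties of Legendre symbol. Suppose that p is a prime.

(i) For any integer a with $p \nmid a$, we have $\left(\frac{a^2}{p}\right) = 1$.

(ii) For any integers a, b with $a \equiv b \pmod p$, we have $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$.

Theorem 8.5. Legendre symbol for −1 and 2. a) For any odd prime p we have

$$\left(\frac{-1}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \pmod 4; \\ -1 & \text{if } p \equiv 3 \pmod 4. \end{cases}$$

b) For any odd prime p we have

$$\left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod 8; \\ -1 & \text{if } p \equiv \pm 3 \pmod 8. \end{cases}$$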

Proof. a) Suppose $p \equiv 1 \pmod 4$, say p = 1 + 4k, $k \in \mathbb{N}$. Then $(-1)^{(p-1)/2} = (-1)^{2k} = 1$ and so by Euler's criterion, −1 is a quadratic residue (mod p). If $p \equiv 3 \pmod 4$, say p = 3 + 4k, then $(-1)^{(p-1)/2} = (-1)^{2k+1} = -1$, so −1 is a quadratic non-residue.

b) Suppose that $p \equiv 1 \pmod 4$, and so $p \equiv 1$ or 5 (mod 8). We calculate $2^{(p-1)/2} \pmod p$ two different ways. Set

$$Q = 2 \cdot 4 \cdot 6 \cdots (p-1).$$

First note that $Q = 2^{(p-1)/2} \left(\frac{p-1}{2}\right)!$. Also, noting that $\frac{p-1}{2}$ is even, we have

$$\begin{aligned} Q &= \left(2 \cdot 4 \cdots \frac{p-1}{2}\right)\left(\frac{p+3}{2} \cdots (p-1)\right) \\ &\equiv \left(2 \cdot 4 \cdots \frac{p-1}{2}\right)\left(\frac{-(p-3)}{2} \cdots (-5)(-3)(-1)\right) \pmod p \\ &\equiv (-1)^{(p-1)/4}\, 1 \cdot 2 \cdot 3 \cdot 4 \cdots \left(\frac{p-3}{2}\right)\left(\frac{p-1}{2}\right) \pmod p. \end{aligned}$$

Equating the two expressions for Q and canceling the $\left(\frac{p-1}{2}\right)!$ we obtain

$$2^{(p-1)/2} \equiv (-1)^{(p-1)/4} \pmod p.$$

If $p \equiv 1 \pmod 8$ then the RHS = 1, while if $p \equiv 5 \pmod 8$ then RHS = −1. The formula for $\left(\frac{2}{p}\right)$ then follows from Euler's criterion.

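Next, suppose that $p \equiv 3 \pmod 4$, so that $\frac{p-3}{2}$ is even. Then

$$\begin{aligned} Q &= \left(2 \cdot 4 \cdots \frac{p-3}{2}\right)\left(\frac{p+1}{2} \cdots (p-1)\right) \\ &\equiv \left(2 \cdot 4 \cdots \frac{p-3}{2}\right)\left(\frac{-(p-1)}{2} \cdots (-5)(-3)(-1)\right) \pmod p \\ &\equiv (-1)^{(p+1)/4} \left(\frac{p-1}{2}\right)! \pmod p, \end{aligned}$$

and so

$$2^{(p-1)/2} \equiv (-1)^{(p+1)/4} \pmod p.$$

If $p \equiv 3 \pmod 8$ then RHS = −1, while if $p \equiv 7 \pmod 8$ then RHS = 1, completing the proof. □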
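
Both parts of Theorem 8.5 can be spot-checked numerically against Euler's Criterion. The following is only a sanity check over small primes, not a proof; the helper names `legendre`, `is_prime` and `check_theorem_8_5` are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) via Euler's Criterion (p an odd prime, p not dividing a)."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def is_prime(n):
    """Trial division; adequate for the small range used here."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def check_theorem_8_5(limit):
    """True if both formulas hold for every odd prime below limit."""
    for p in range(3, limit):
        if not is_prime(p):
            continue
        if legendre(-1, p) != (1 if p % 4 == 1 else -1):
            return False
        if legendre(2, p) != (1 if p % 8 in (1, 7) else -1):
            return False
    return True


print(check_theorem_8_5(500))   # True
```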

8.1. Quadratic Reciprocity. Consider solving the two congruences,

$$x^2 \equiv 7 \pmod{1009},$$

$$x^2 \equiv 1009 \pmod 7.$$

Which one is easier? Is there any connection between these two congruences? Note $\left(\frac{1009}{7}\right) = \left(\frac{1}{7}\right) = 1$.

Theorem 8.6. Quadratic Reciprocity. Let p, q be distinct odd primes. Then $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ unless $p \equiv q \equiv 3 \pmod 4$, in which case $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$.

We will not do the proof here.

Example 8.5. Find $\left(\frac{7}{1009}\right)$. Since $1009 \equiv 1 \pmod 4$ we have $\left(\frac{7}{1009}\right) = \left(\frac{1009}{7}\right) = 1$.

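Example 8.6. Find $\left(\frac{227}{137}\right)$, noting that 137 is a prime and that $137 \equiv 1 \pmod 8$:

$$\left(\frac{227}{137}\right) = \left(\frac{90}{137}\right) = \left(\frac{9}{137}\right)\left(\frac{10}{137}\right) = \left(\frac{10}{137}\right) = \left(\frac{2}{137}\right)\left(\frac{5}{137}\right) = \left(\frac{5}{137}\right) = \left(\frac{137}{5}\right) = \left(\frac{2}{5}\right) = -1.$$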
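
The values in Examples 8.5 and 8.6, and the reciprocity law itself for small primes, can be confirmed independently with Euler's Criterion. This sketch checks the answers by a different route than the reductions used above; the names `legendre` and `reciprocity_holds` are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) via Euler's Criterion (p an odd prime, p not dividing a)."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def reciprocity_holds(p, q):
    """Check Theorem 8.6 for one pair of distinct odd primes p, q."""
    sign = -1 if (p % 4 == 3 and q % 4 == 3) else 1
    return legendre(p, q) == sign * legendre(q, p)


print(legendre(7, 1009), legendre(1009, 7))   # 1 1
print(legendre(227, 137))                     # -1

small_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
print(all(reciprocity_holds(p, q)
          for p in small_primes for q in small_primes if p != q))   # True
```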

Corollary 8.1. The Legendre symbol for 3. For any prime p > 3 we have

$$\left(\frac{3}{p}\right) = \begin{cases} 1, & \text{if } p \equiv \pm 1 \pmod{12}; \\ -1, & \text{if } p \equiv \pm 5 \pmod{12}. \end{cases}$$


Proof. Suppose that $p \equiv 1 \pmod 4$. Then by quadratic reciprocity $\left(\frac{3}{p}\right) = \left(\frac{p}{3}\right)$. Since 1 is the unique quadratic residue (mod 3), we see that if $p \equiv 1 \pmod 3$ then $\left(\frac{3}{p}\right) = 1$, and if $p \equiv 2 \pmod 3$ then $\left(\frac{3}{p}\right) = -1$. Now by CRT, if $p \equiv 1 \pmod 4$ and $p \equiv 1 \pmod 3$ then $p \equiv 1 \pmod{12}$, while if $p \equiv 1 \pmod 4$ and $p \equiv 2 \pmod 3$ then $p \equiv 5 \pmod{12}$.

Next, suppose that $p \equiv 3 \pmod 4$. Then by quadratic reciprocity $\left(\frac{3}{p}\right) = -\left(\frac{p}{3}\right)$, which equals −1 if $p \equiv 1 \pmod 3$, and 1 if $p \equiv 2 \pmod 3$. Now, by CRT if $p \equiv 3 \pmod 4$ and $p \equiv 1 \pmod 3$ then $p \equiv 7 \pmod{12}$, while if $p \equiv 3 \pmod 4$ and $p \equiv 2 \pmod 3$ then $p \equiv 11 \pmod{12}$. □

8.2. Sums of two squares. When can a prime p be expressed as a sum of two squares? It is easy to see that if p is odd and $p = a^2 + b^2$, then since $a^2 \equiv 0$ or 1 (mod 4), we must have $p \equiv 1 \pmod 4$. Test such p: $5 = 1^2 + 2^2$, $13 = 2^2 + 3^2$, $17 = 1^2 + 4^2$, $29 = 5^2 + 2^2$, . . . It is reasonable to conjecture the following result.

Theorem 8.7. Let p be an odd prime. Then p is a sum of two squares if and only if $p \equiv 1 \pmod 4$.

Proof. Suppose that $p \equiv 1 \pmod 4$. Then $\left(\frac{-1}{p}\right) = 1$, so there exists a $u \in \mathbb{Z}$ with $u^2 \equiv -1 \pmod p$. Consider the set of integers of the form x + uy with $x, y \in [0, \sqrt{p}\,] \cap \mathbb{Z}$. Since there are $([\sqrt{p}\,] + 1)^2 > p$ choices for (x, y), there must exist, by the pigeonhole principle, distinct pairs $(x_1, y_1) \ne (x_2, y_2)$ with

$$x_1 + uy_1 \equiv x_2 + uy_2 \pmod p, \quad \text{that is,} \quad (x_1 - x_2) \equiv u(y_2 - y_1) \pmod p.$$

Set $a = x_1 - x_2$, $b = y_2 - y_1$. Then $|a| < \sqrt{p}$, $|b| < \sqrt{p}$ and $a \equiv ub \pmod p$. Therefore $a^2 + b^2 \equiv (1 + u^2)b^2 \equiv 0 \pmod p$ and $a^2 + b^2 < 2p$. Furthermore, since $(x_1, y_1) \ne (x_2, y_2)$, $a^2 + b^2 > 0$. Thus $a^2 + b^2 = p$. □
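
The proof suggests a simple search: find u with $u^2 \equiv -1 \pmod p$, then look for $0 < b < \sqrt{p}$ such that the least absolute residue a of ub satisfies $a^2 + b^2 = p$. The Python sketch below follows that idea by direct search rather than by the pigeonhole argument itself; the name `two_squares` is hypothetical, u is found by trial, and `math.isqrt` assumes Python 3.8 or later.

```python
from math import isqrt


def two_squares(p):
    """For a prime p ≡ 1 (mod 4), return (a, b) with a*a + b*b == p."""
    # u with u^2 ≡ -1 (mod p); found by trial, which is fine for small p
    u = next((x for x in range(1, p) if (x * x + 1) % p == 0), None)
    if u is None:
        raise ValueError("-1 is not a quadratic residue mod p")
    for b in range(1, isqrt(p) + 1):
        a = (u * b) % p
        a = min(a, p - a)            # absolute value of the least residue of u*b
        if a * a + b * b == p:
            return a, b
    raise ValueError("no representation found; is p a prime with p ≡ 1 (mod 4)?")


print(two_squares(13))   # (3, 2)
for p in (5, 17, 29, 1009):
    a, b = two_squares(p)
    print(p, "=", a, "^2 +", b, "^2")
```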

Department of Mathematics, Kansas State University, Manhattan, KS 66506

E-mail address: [email protected]
