
Algebraic Number Theory: summary of notes

Robin Chapman

3 May 2000, revised 28 March 2004, corrected 4 January 2005

This is a summary of the 1999–2000 course on algebraic number theory. Proofs will generally be sketched rather than presented in detail. Also, examples will be very thin on the ground.

I first learnt algebraic number theory from Stewart and Tall’s textbook Algebraic Number Theory (Chapman & Hall, 1979) (latest edition retitled Algebraic Number Theory and Fermat’s Last Theorem (A. K. Peters, 2002)) and these notes owe much to this book.

I am indebted to Artur Costa Steiner for pointing out an error in an earlier version.

1 Algebraic numbers and integers

We say that α ∈ C is an algebraic number if f(α) = 0 for some monic polynomial f ∈ Q[X]. We say that β ∈ C is an algebraic integer if g(β) = 0 for some monic polynomial g ∈ Z[X]. We let A and B denote the sets of algebraic numbers and algebraic integers respectively. Clearly B ⊆ A, Z ⊆ B and Q ⊆ A.

Lemma 1.1 Let α ∈ A. Then there is β ∈ B and a nonzero m ∈ Z with α = β/m.

Proof There is a monic polynomial f ∈ Q[X] with f(α) = 0. Let m be the product of the denominators of the coefficients of f. Then g = mf ∈ Z[X]. Write g = ∑_{j=0}^{n} a_j X^j. Then a_n = m. Now

\[ h(X) = m^{n-1} g(X/m) = \sum_{j=0}^{n} m^{n-1-j} a_j X^j \]

is monic with integer coefficients (the only slightly problematical coefficient is that of X^n, which equals m^{−1}a_n = 1). Also h(mα) = m^{n−1}g(α) = 0. Hence β = mα ∈ B and α = β/m. □
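The construction in this proof is easy to carry out by machine. The following is a minimal sketch, assuming a polynomial is stored as a list of coefficients with the lowest degree first; the function name `clear_denominators` is hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def clear_denominators(f):
    """Given the coefficients [c_0, ..., c_{n-1}, 1] of a monic f in Q[X]
    (lowest degree first), return (m, h) as in the proof of Lemma 1.1:
    m is the product of the denominators, and h(X) = m^(n-1) g(X/m),
    with g = m f, is monic with integer coefficients and h(m*alpha) = 0."""
    coeffs = [Fraction(c) for c in f]
    n = len(coeffs) - 1
    m = 1
    for c in coeffs:
        m *= c.denominator
    g = [m * c for c in coeffs]               # g = m f has integer coefficients
    # coefficient of X^j in h is m^(n-1-j) * a_j
    h = [Fraction(m) ** (n - 1 - j) * g[j] for j in range(n + 1)]
    assert all(c.denominator == 1 for c in h)
    return m, [int(c) for c in h]

# alpha = 1/sqrt(2) is a root of X^2 - 1/2; then beta = 2*alpha is a root of X^2 - 2.
print(clear_denominators([Fraction(-1, 2), 0, 1]))
```

Here m = 2 and h(X) = X² − 2, so β = 2α = √2 is an algebraic integer, as the lemma predicts.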

Let α ∈ A. Then there is a monic polynomial f ∈ Q[X] of least degree such that f(α) = 0. This polynomial is uniquely determined.

Proposition 1.1 Let α ∈ A. Then there is precisely one monic polynomial f ∈ Q[X] of minimum degree with f(α) = 0. This polynomial f has the property that if g ∈ Q[X] and g(α) = 0 then f | g.

Proof Note first that if h ∈ Q[X] is a nonzero polynomial with deg(h) < deg(f), then h(α) ≠ 0, since otherwise h_1 = a^{−1}h, where a is the leading coefficient of h, is a monic polynomial with deg(h_1) < deg(f) and h_1(α) = 0. That would contradict the definition of f. Now f is unique, since if f_1 had the same degree as f and also satisfied the same conditions then h = f − f_1, if nonzero, has h ∈ Q[X], h(α) = 0 and deg(h) < deg(f), which is impossible.

Now let g ∈ Q[X] and suppose that g(α) = 0. By the division algorithm (Proposition A.1), g = qf + h where q, h ∈ Q[X], and either h = 0 (which means that f | g as we want) or h ≠ 0 and deg(h) < deg(f). But as h(α) = g(α) − f(α)q(α) = 0 this is impossible. □

We call this f the minimum polynomial of α and call its degree the degree of α. Minimum polynomials are always irreducible.

Lemma 1.2 Let f be the minimum polynomial of α ∈ A. Then f is irreducible over Q.

Proof If f is not irreducible then f = gh where g, h ∈ Q[X] are monic polynomials of degree less than that of f. Then 0 = f(α) = g(α)h(α) and so either g(α) = 0 or h(α) = 0. We may assume g(α) = 0. Then f | g, which is impossible since deg(g) < deg(f). □

Suppose the minimum polynomial f of α lies in Z[X]. Then, since f is monic and f(α) = 0, α is an algebraic integer. In fact the converse holds: if α ∈ B then its minimum polynomial lies in Z[X]. We need to study integer polynomials in more detail to prove this.

A nonzero polynomial f ∈ Z[X] is primitive if the greatest common divisor of its coefficients is 1. Equivalently f is primitive if there is no prime number dividing all its coefficients.

Lemma 1.3 (Gauss’s Lemma) Let f, g ∈ Z[X] be primitive polynomials. Then fg is also a primitive polynomial.

Proof Write

\[ f(X) = a_0 + a_1 X + a_2 X^2 + \cdots + a_m X^m \]

and

\[ g(X) = b_0 + b_1 X + b_2 X^2 + \cdots + b_n X^n. \]

We show that there is no prime p dividing all the coefficients of fg. Take a prime p. As f is primitive there is a coefficient of f not divisible by p; let a_r be the first such. Similarly let b_s be the first coefficient of g not divisible by p. Then p | a_i for i < r and p | b_j for j < s. The coefficient of X^{r+s} in fg is

\[ c_{r+s} = \sum_{i+j=r+s} a_i b_j. \]

This sum contains the term a_r b_s, which is not divisible by p. Its other terms a_i b_j are all divisible by p, since they either have i < r or j < s. Hence c_{r+s} is not divisible by p.

As there is no prime dividing all the coefficients of fg, the polynomial fg is primitive. □

If f ∈ Z[X] is nonzero, let a be the greatest common divisor of the coefficients of f. Then f = af_1 where f_1 is primitive. We call a the content of f and denote it by c(f). More generally, let g be a nonzero element of Q[X]. Then bg ∈ Z[X] where the positive integer b is the product of the denominators of the coefficients of g. Then bg = cg_1 where c is the content of bg and g_1 is primitive. Hence g = (c/b)g_1 where g_1 is a primitive polynomial in Z[X] and c/b is a positive rational. So we can write any nonzero g ∈ Q[X] as g = rg_1 with r ∈ Q, r > 0 and g_1 ∈ Z[X] being primitive. It’s an instructive exercise to show that this r is uniquely determined; hence it makes sense to call r the content of g. Putting s = 1/r we see that there is a positive rational s with sg a primitive element of Z[X].
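The content of a rational polynomial can be computed exactly as described above. The sketch below assumes polynomials are lists of coefficients (lowest degree first); the names `content` and `poly_mul` are illustrative only. The final check uses the standard consequence of Gauss’s lemma that the content is multiplicative, which is not stated in these notes but follows directly from Lemma 1.3.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def content(g):
    """Content r = c/b of a nonzero polynomial g over Q, where b is the
    product of the denominators and c is the gcd of the coefficients of b*g."""
    coeffs = [Fraction(x) for x in g]
    b = 1
    for x in coeffs:
        b *= x.denominator
    c = reduce(gcd, (int(b * x) for x in coeffs), 0)   # gcd(0, x) = |x|
    return Fraction(c, b)

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

f = [2, 4]                               # 2 + 4X, content 2
g = [Fraction(1, 2), Fraction(3, 2)]     # 1/2 + (3/2)X, content 1/2
print(content(f), content(g), content(poly_mul(f, g)))
```

In this instance fg = 1 + 5X + 6X² is primitive, in agreement with c(f)c(g) = 2 · 1/2 = 1.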

We now show that if a monic polynomial in Z[X] factors over the rationals then it factors over the integers.

Proposition 1.2 Let f and g be monic polynomials with f ∈ Z[X] and g ∈ Q[X]. If g | f then g ∈ Z[X].

Proof Suppose that g | f. Then f = gh where h ∈ Q[X]. Then h is monic, as both f and g are. There are positive rationals r and s with rg and sh primitive elements of Z[X]. The leading coefficients of rg and sh are r and s respectively, so that r, s ∈ Z. By Gauss’s lemma, (rg)(sh) = (rs)f is primitive. But since f ∈ Z[X] all coefficients of (rs)f are divisible by rs. Hence rs = 1 (as rs is a positive integer) and so r = s = 1 (as r and s are positive integers). Thus g = rg ∈ Z[X] as required. □

An immediate corollary is this important characterization of algebraic integers.

Theorem 1.1 Let α ∈ A have minimum polynomial f. Then α ∈ B if and only if f ∈ Z[X].

Proof Suppose f ∈ Z[X]. Since f is monic and f(α) = 0, α ∈ B. Conversely suppose that α ∈ B. There is a monic g ∈ Z[X] with g(α) = 0. Then f | g, since f is the minimum polynomial of α. By Proposition 1.2, f ∈ Z[X]. □

Another corollary is this useful criterion for irreducibility.

Proposition 1.3 (Eisenstein’s criterion) Let p be a prime number and let

\[ f(X) = X^n + \sum_{j=0}^{n-1} a_j X^j \in \mathbf{Z}[X]. \]

If

• p | a_j when 0 ≤ j < n, and

• p² ∤ a_0

then f is irreducible over Q.

Proof Suppose that f is reducible over Q. Then f = gh where g, h ∈ Q[X], g and h are monic, and also 0 < r = deg(g) < n and deg(h) = s = n − r. By Proposition 1.2, g, h ∈ Z[X]. Write

\[ g(X) = \sum_{i=0}^{r} b_i X^i \quad\text{and}\quad h(X) = \sum_{j=0}^{s} c_j X^j. \]

Note that b_r = 1 = c_s. Certainly p ∤ b_r and p ∤ c_s. Let u and v be the least nonnegative integers with p ∤ b_u and p ∤ c_v. Then u ≤ r and v ≤ s. I claim that u = r and v = s. Otherwise u + v < r + s = n and

\[ a_{u+v} = \sum_{i+j=u+v} b_i c_j. \]

This sum contains the term b_u c_v, which is not divisible by p. The remaining terms have the form b_i c_j with either i < u or j < v. In each case one of b_i and c_j is divisible by p. Hence a_{u+v} is the sum of a nonmultiple of p with a collection of multiples of p and so p ∤ a_{u+v}, contrary to hypothesis. Hence u = r and v = s. As r, s > 0 both b_0 and c_0 are divisible by p so that a_0 = b_0c_0 is divisible by p², again contrary to hypothesis. This contradiction shows that f is irreducible over Q. □

Example Let p be a prime number and let

\[ f(X) = 1 + X + X^2 + \cdots + X^{p-1} = \sum_{j=0}^{p-1} X^j = \frac{X^p - 1}{X - 1}. \]

We cannot apply Eisenstein to f directly, but if we set f_1(X) = f(X + 1) we get

\[ f_1(X) = \frac{(X+1)^p - 1}{(X+1) - 1} = \frac{(X+1)^p - 1}{X} = \sum_{j=0}^{p-1} \binom{p}{j} X^{p-j-1}. \]

This is a monic polynomial, but its remaining coefficients have the form $\binom{p}{j}$ for 0 < j < p and so are divisible by p. The final coefficient is $\binom{p}{p-1}$ = p, which is not divisible by p². By Eisenstein’s criterion, f_1 is irreducible over Q. It follows that f is irreducible over Q, for if f(X) = g(X)h(X) were a nontrivial factorization of f, then f_1(X) = g(X + 1)h(X + 1) would be a nontrivial factorization of f_1.
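The hypotheses of Eisenstein’s criterion are simple divisibility conditions, so the example can be checked mechanically for small primes. This is only a sketch: the helper names `eisenstein_applies` and `shifted_cyclotomic` are made up for illustration, and polynomials of degree at least 1 are stored as coefficient lists with the lowest degree first.

```python
from math import comb

def eisenstein_applies(coeffs, p):
    """Check the hypotheses of Proposition 1.3 for the monic integer
    polynomial with coefficients [a_0, ..., a_{n-1}, 1] and the prime p."""
    *lower, lead = coeffs
    return (lead == 1
            and all(a % p == 0 for a in lower)
            and lower[0] % (p * p) != 0)

def shifted_cyclotomic(p):
    """Coefficients of f_1(X) = ((X+1)^p - 1)/X, lowest degree first:
    the coefficient of X^k is the binomial coefficient C(p, k+1)."""
    return [comb(p, k + 1) for k in range(p)]

print(shifted_cyclotomic(5))                         # coefficients of f_1 for p = 5
print(eisenstein_applies(shifted_cyclotomic(5), 5))  # hypotheses hold for f_1
print(eisenstein_applies([1, 1, 1, 1, 1], 5))        # but not for f itself
```

A `False` result only means that the criterion does not apply to that polynomial and prime; it says nothing about reducibility.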

We now show that A is a subfield of C and B is a subring of C.

Theorem 1.2 (i) Let α, β ∈ A. Then α + β, α − β, αβ ∈ A, and if α ≠ 0 then α^{−1} ∈ A.

(ii) Let α, β ∈ B. Then α + β, α − β, αβ ∈ B.

Proof We first prove (ii) in detail, since the bulk of the proof of (i) follows mutatis mutandis.

Let α and β have minimum polynomials f and g of degrees m and n respectively. Write

\[ f(X) = X^m + \sum_{i=0}^{m-1} a_i X^i \quad\text{and}\quad g(X) = X^n + \sum_{j=0}^{n-1} b_j X^j. \]

Then the a_i and b_j are integers and

\[ \alpha^m = -\sum_{i=0}^{m-1} a_i \alpha^i \quad\text{and}\quad \beta^n = -\sum_{j=0}^{n-1} b_j \beta^j. \tag{$*$} \]

Let v be the column vector of height mn given by

\[ v^{T} = (1 \;\; \alpha \;\; \alpha^2 \;\cdots\; \alpha^{m-1} \;\; \beta \;\; \alpha\beta \;\; \alpha^2\beta \;\cdots\; \alpha^{m-1}\beta \;\; \beta^2 \;\cdots\; \alpha^{m-1}\beta^{n-1}). \]

In other words the entries of v are the numbers α^iβ^j for 0 ≤ i < m and 0 ≤ j < n. I claim that there are (mn-by-mn) matrices A and B with entries in Z such that Av = αv and Bv = βv. The typical entry in αv has the form α^iβ^j where 1 ≤ i ≤ m and 0 ≤ j < n. If i < m this already is an entry of v while if i = m, (∗) gives

\[ \alpha^m\beta^j = -\sum_{k=0}^{m-1} a_k \alpha^k \beta^j. \]

In any case this entry α^iβ^j of αv is a linear combination, with integer coefficients, of the entries of v. Putting these coefficients into a matrix A we get αv = Av. Similarly there is a matrix B with integer entries with βv = Bv.

Now (A + B)v = (α + β)v, (A − B)v = (α − β)v and (AB)v = (αβ)v. As v ≠ 0 the numbers α + β, α − β and αβ are eigenvalues of the matrices A + B, A − B and AB, each of which has integer entries. But if the matrix C has integer entries, its eigenvalues are algebraic integers, since the characteristic polynomial of C is a monic polynomial with integer coefficients. It follows that α + β, α − β and αβ are all algebraic integers.

If we assume instead that α, β ∈ A, the above argument shows (when we replace ‘integer’ by ‘rational’ etc.) that α + β, α − β, αβ are all algebraic numbers.

Finally suppose that α is a nonzero algebraic number with minimum polynomial

\[ f(X) = X^n + \sum_{i=0}^{n-1} a_i X^i. \]

Then a_0 ≠ 0 (why?) and dividing the equation f(α) = 0 by a_0α^n gives

\[ \alpha^{-n} + \sum_{i=1}^{n-1} \frac{a_{n-i}}{a_0}\,\alpha^{-i} + \frac{1}{a_0} = 0 \]

so that α^{−1} ∈ A. □

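Example Let us see what the matrices A and B are for, say, α = √2 and β = ½(1 + √5). The minimum polynomials of α and β are X² − 2 and X² − X − 1 respectively so that α² = 2 and β² = β + 1. Let v^T = (1 α β αβ). Then

\[ \alpha v = \begin{pmatrix} \alpha \\ \alpha^2 \\ \alpha\beta \\ \alpha^2\beta \end{pmatrix} = \begin{pmatrix} \alpha \\ 2 \\ \alpha\beta \\ 2\beta \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ \alpha \\ \beta \\ \alpha\beta \end{pmatrix} \]

and

\[ \beta v = \begin{pmatrix} \beta \\ \alpha\beta \\ \beta^2 \\ \alpha\beta^2 \end{pmatrix} = \begin{pmatrix} \beta \\ \alpha\beta \\ 1 + \beta \\ \alpha + \alpha\beta \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ \alpha \\ \beta \\ \alpha\beta \end{pmatrix}. \]

We can take

\[ A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}. \]

Then, for instance, αβ is an eigenvalue of

\[ AB = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 2 & 0 \\ 0 & 1 & 0 & 1 \\ 2 & 0 & 2 & 0 \end{pmatrix} \]

so that h(αβ) = 0 where h is the characteristic polynomial of AB.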
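The characteristic polynomial h in this example can be computed exactly. The sketch below uses the Faddeev–LeVerrier recurrence, which is a choice made here for convenience (the notes do not prescribe a method); the function names are illustrative.

```python
def mat_mul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(M):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(xI - M), computed with the
    Faddeev-LeVerrier recurrence; all divisions are exact for integer M."""
    n = len(M)
    coeffs = [1]
    N = [[int(i == j) for j in range(n)] for i in range(n)]   # start from I
    for k in range(1, n + 1):
        MN = mat_mul(M, N)
        c = -sum(MN[i][i] for i in range(n)) // k             # -trace / k
        coeffs.append(c)
        N = [[MN[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs

A = [[0, 1, 0, 0], [2, 0, 0, 0], [0, 0, 0, 1], [0, 0, 2, 0]]
B = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
print(char_poly(mat_mul(A, B)))
```

The result corresponds to h(X) = X⁴ − 6X² + 4, which agrees with a direct check: (αβ)² = 3 + √5, so ((αβ)² − 3)² = 5.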

2 Number fields

The set A of algebraic numbers is too large to handle all at once. We restrict our consideration to looking at smaller subfields of A which contain all the algebraic numbers “generated” from a given one. For instance consider

\[ K_1 = \mathbf{Q}(i) = \{a + bi : a, b \in \mathbf{Q}\} \]

and

\[ K_2 = \mathbf{Q}(\sqrt[3]{2}) = \{a + b\sqrt[3]{2} + c\sqrt[3]{4} : a, b, c \in \mathbf{Q}\}. \]

It is apparent that both K_1 and K_2 are rings, being closed under addition, subtraction and multiplication. It’s not hard to see that K_1 is a field since if a + bi is a nonzero element of K_1 then

\[ \frac{1}{a + bi} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\, i \in \mathbf{Q}(i). \]

But it is not so obvious that 1/(a + b∛2 + c∛4) is an element of K_2. But this is in fact so, and is an example of a general phenomenon.

Let α be an algebraic number of degree n. Define

\[ \mathbf{Q}(\alpha) = \{a_0 + a_1\alpha + a_2\alpha^2 + \cdots + a_{n-1}\alpha^{n-1} : a_0, a_1, \ldots, a_{n-1} \in \mathbf{Q}\}. \]

Proposition 2.1 For each α ∈ A, Q(α) is a subfield of A.

Proof Since A is closed under addition and multiplication, and α ∈ A and Q ⊆ A, it is apparent that Q(α) ⊆ A.

Let α have degree n and minimum polynomial f. Then by definition

\[ \mathbf{Q}(\alpha) = \{g(\alpha) : g \in \mathbf{Q}[X], \text{ and either } g = 0 \text{ or } \deg(g) < n\}. \]

I claim that in fact

\[ \mathbf{Q}(\alpha) = \{g(\alpha) : g \in \mathbf{Q}[X]\}. \]

Certainly Q(α) ⊆ {g(α) : g ∈ Q[X]} so that to prove equality we need to show that g(α) ∈ Q(α) whenever g ∈ Q[X]. By the division algorithm (Proposition A.1) there is q ∈ Q[X] such that h = g − qf either vanishes or has deg(h) < n. Then h(α) ∈ Q(α) but h(α) = g(α) − f(α)q(α) = g(α) as f(α) = 0. This proves that Q(α) = {g(α) : g ∈ Q[X]}.

It is now clear that, since Q[X] is closed under addition, subtraction and multiplication, so is Q(α). Hence Q(α) is a subring of A. (Alternatively, one sees that the map g ↦ g(α) from Q[X] to A is a ring homomorphism with image Q(α), which must therefore be a subring of A.)

To complete the proof that Q(α) is a field, we must show that 1/β ∈ Q(α) whenever β is a nonzero element of Q(α). Write β = g(α) with g ∈ Q[X] and note that f ∤ g since otherwise g(α) = 0. Let h be the greatest common divisor of f and g. By Proposition A.2, there exist u, v ∈ Q[X] with h = uf + vg. But f is irreducible, and so either h = 1 or h = f. But this latter is impossible since h | g while f ∤ g. Hence 1 = uf + vg and so 1 = u(α)f(α) + v(α)g(α) = v(α)β. It follows that 1/β = v(α) ∈ Q(α) and so Q(α) is a field. □
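The last step of this proof is constructive: the polynomial v produced by the Euclidean algorithm gives the inverse of β = g(α). The following is a rough sketch over Q using exact fractions, with polynomials stored as coefficient lists (lowest degree first); all function names are hypothetical and it assumes f is irreducible and f ∤ g.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(g, f):
    """Division algorithm: g = q*f + r with r = 0 or deg(r) < deg(f)."""
    q, r = [], trim(g)
    while len(r) >= len(f):
        term = [Fraction(0)] * (len(r) - len(f)) + [Fraction(r[-1]) / Fraction(f[-1])]
        q = add(q, term)
        r = add(r, [-x for x in mul(term, f)])
    return q, r

def inverse_mod(g, f):
    """Return v with v*g = 1 (mod f), so that 1/g(alpha) = v(alpha)."""
    f = [Fraction(x) for x in f]
    g = [Fraction(x) for x in g]
    r0, r1 = f, divmod_poly(g, f)[1]
    v0, v1 = [], [Fraction(1)]          # invariant: r_i = v_i * g (mod f)
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        v0, v1 = v1, add(v0, [-x for x in mul(q, v1)])
    assert len(r0) == 1                 # gcd is a nonzero constant
    return [x / r0[0] for x in v0]

# Inverse of 1 + t + t^2 in Q(t) with t^3 = 2.
print(inverse_mod([1, 1, 1], [-2, 0, 0, 1]))
```

With t = ∛2 this returns the coefficients of t − 1, consistent with (1 + t + t²)(t − 1) = t³ − 1 = 1.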

The numbers 1, α, α², …, α^{n−1}, where n is the degree of α, form a basis of Q(α) as a vector space over Q. Thus the degree n is also the dimension of Q(α) as a vector space over Q, and so we call n the degree of Q(α). In general when we speak of a basis for a number field K = Q(α) we mean a basis for K as a vector space over Q.

Given α ∈ A of degree n, its minimum polynomial f factors over C as

\[ f(X) = \prod_{j=1}^{n} (X - \alpha_j) \]

where α = α_1 say. The numbers α_1, …, α_n are the conjugates of α. They are all algebraic numbers with minimum polynomial f. It is important to note that the conjugates of α are all distinct. This follows from the following lemma.

Lemma 2.1 Let f ∈ Q[X] be a monic polynomial of degree n and suppose that f is irreducible over Q. Then f(X) = 0 has n distinct roots in C.

Proof Suppose that α is a repeated root of f(X) = 0. Then f(X) = (X − α)²g(X) where g(X) ∈ C[X]. Consequently f′(X) = (X − α)²g′(X) + 2(X − α)g(X) and so f(α) = f′(α) = 0. Let h be the greatest common divisor of f and f′. Then h = uf + vf′ for some u, v ∈ Q[X]. Thus h(α) = u(α)f(α) + v(α)f′(α) = 0. But as h | f and f is irreducible, h = 1 or h = f. Since h(α) = 0, h = f. But then f | f′ and as f′ has leading term nX^{n−1} this is impossible. □

The field Q(α) forms the set of numbers which can be expressed in terms of rational numbers and α using the standard arithmetic operations. We might instead consider what happens when we take two algebraic numbers α and β and consider which numbers can be expressed in terms of both. Suppose α and β have degrees m and n respectively, and define

\[ \mathbf{Q}(\alpha, \beta) = \left\{ \sum_{j=0}^{m-1} \sum_{k=0}^{n-1} c_{jk}\,\alpha^j\beta^k : c_{jk} \in \mathbf{Q} \right\}. \]

It is readily apparent that Q(α, β) is a ring; less apparent but nonetheless true that it is a field. However this field can be expressed in terms of one generator.

Theorem 2.1 (Primitive element) Let α, β ∈ A. Then there is γ ∈ A with Q(α, β) = Q(γ).

Proof We show that for a suitable rational number c, γ = α + cβ suffices. Let α and β have degrees m and n respectively, and let their minimum polynomials be

\[ f(X) = \prod_{j=1}^{m} (X - \alpha_j) \quad\text{and}\quad g(X) = \prod_{k=1}^{n} (X - \beta_k) \]

respectively, with α = α_1 and β = β_1. Suppose that 1 ≤ j ≤ m and 2 ≤ k ≤ n. The equation

\[ \alpha + x\beta = \alpha_j + x\beta_k \]

can be rewritten as

\[ (\beta_1 - \beta_k)\,x = \alpha_j - \alpha_1 \]

and so has exactly one solution x = x_{jk}, as β_1 ≠ β_k by Lemma 2.1. Choose c to be a nonzero rational which is not equal to any of the x_{jk}. This is possible as Q is an infinite set. Then

\[ \alpha + c\beta \ne \alpha_j + c\beta_k \]

whenever k ≠ 1, by the choice of c. Let γ = α + cβ. For convenience put K = Q(γ). I claim that Q(α, β) = K.

Certainly γ ∈ Q(α, β) and as Q(α, β) is a ring, K ⊆ Q(α, β). To prove that K ⊇ Q(α, β) it suffices to show that α ∈ K and β ∈ K. Let h(X) = f(γ − cX). Then h has degree m, as c ≠ 0, and h ∈ K[X]. Also h(β) = f(γ − cβ) = f(α) = 0. But of course g(β) = 0 so that g and h have β as a common zero. Suppose that they had another one, so that g(δ) = h(δ) = 0. Then δ = β_k for some k ≥ 2 as g(δ) = 0. But then 0 = h(β_k) = f(γ − cβ_k) so that γ − cβ_k = α_j for some j. Thus γ = α_j + cβ_k, which is false by the choice of c.

The greatest common divisor of g(X) and h(X) must be X − β. As g ∈ Q[X] ⊆ K[X] and h ∈ K[X] there exist u, v ∈ K[X] with u(X)g(X) + h(X)v(X) = X − β. Thus β = −(u(0)g(0) + h(0)v(0)) ∈ K, and it follows that α = γ − cβ ∈ K. This completes the proof. □

More generally we can consider fields Q(β_1, …, β_n) generated by any finite number of algebraic numbers. But by using the primitive element theorem and induction we see that each such field still has the form Q(γ). We call a field of the form Q(α) for α ∈ A an algebraic number field or simply a number field.

Let α_1, …, α_n be the conjugates of α. The fields Q(α_j) are very similar to Q(α), each being generated by an element with minimum polynomial f. In fact they are all isomorphic. We define an isomorphism σ_j : Q(α) → Q(α_j) by setting σ_j(g(α)) = g(α_j) when g ∈ Q[X]. It is perhaps not immediately evident that σ_j is well-defined. But this follows since if g_1, g_2 ∈ Q[X] and g_1(α) = g_2(α) then g_1(α_j) = g_2(α_j). This is a consequence of α and α_j having the same minimum polynomial. Once σ_j is seen to be well-defined, it is straightforward to prove it is an isomorphism. As α_1 = α, σ_1 is the identity map on Q(α).

Let β ∈ Q(α). We define the norm N(β) and trace T(β) of β as follows. Let

\[ N(\beta) = \prod_{j=1}^{n} \sigma_j(\beta) \]

and

\[ T(\beta) = \sum_{j=1}^{n} \sigma_j(\beta). \]

Since the σ_j preserve addition and multiplication, the following properties are almost immediate:

• N(βγ) = N(β)N(γ) for all β, γ ∈ Q(α),

• N(cβ) = c^n N(β) for all c ∈ Q and β ∈ Q(α),

• T(β + γ) = T(β) + T(γ) for all β, γ ∈ Q(α), and

• T(cβ) = cT(β) for all c ∈ Q and β ∈ Q(α).

Clearly N(0) = 0 and N(1) = 1. If β ≠ 0 then 1 = N(1) = N(β)N(1/β) so that N(β) ≠ 0.

A word of warning: the norm N(β) and trace T(β) depend on the field Q(α) as well as the number β. If we wish to be strict we should use the notation N_{Q(α)/Q}(β) and T_{Q(α)/Q}(β) instead.

The crucial property of the norm and trace is that they are both rational.

Theorem 2.2 Let β ∈ Q(α). Then N(β) ∈ Q and T (β) ∈ Q.

Proof Write β = ∑_{k=0}^{n−1} b_k α^k where the b_k ∈ Q. Then

\[ N(\beta) = \prod_{j=1}^{n} \sum_{k=0}^{n-1} b_k \alpha_j^k \quad\text{and}\quad T(\beta) = \sum_{j=1}^{n} \sum_{k=0}^{n-1} b_k \alpha_j^k. \]

Both N(β) and T(β) are symmetric polynomials with rational coefficients in the variables α_1, …, α_n. By Newton’s theorem on symmetric polynomials (Theorem A.2), N(β) = g_1(e_1, e_2, …, e_n) and T(β) = g_2(e_1, e_2, …, e_n) where g_1 and g_2 are polynomials in n variables with rational coefficients and e_1, …, e_n are the elementary symmetric polynomials in the variables α_1, …, α_n. But

\[ X^n + \sum_{j=1}^{n} (-1)^j e_j X^{n-j} = \prod_{j=1}^{n} (X - \alpha_j) = f(X) \]

which is the minimum polynomial of α. Hence e_j ∈ Q and so N(β), T(β) ∈ Q. □

More generally we can consider the field polynomial

\[ \prod_{j=1}^{n} (X - \sigma_j(\beta)) = X^n - T(\beta)X^{n-1} + \cdots + (-1)^n N(\beta) \]

of β. Using the same argument as Theorem 2.2 one shows that all its coefficients are rational.

One gets similar results on replacing A by B and Q by Z. Before proving them it’s convenient to prove, in essence, that σ_j(β) is always a conjugate of β.

Lemma 2.2 Let α ∈ A and β ∈ Q(α). For each j, β and σ_j(β) have the same minimum polynomial.

Proof Let g be the minimum polynomial of β, of degree m say. Then

\[ g(X) = X^m + \sum_{k=0}^{m-1} b_k X^k \]

where each b_k ∈ Q. For any γ ∈ Q(α) we have, since σ_j is a ring homomorphism and σ_j(b) = b whenever b ∈ Q,

\[ \sigma_j(g(\gamma)) = \sigma_j(\gamma^m) + \sum_{k=0}^{m-1} \sigma_j(b_k\gamma^k) = \sigma_j(\gamma)^m + \sum_{k=0}^{m-1} b_k\,\sigma_j(\gamma)^k = g(\sigma_j(\gamma)). \]

In particular g(σ_j(β)) = σ_j(g(β)) = σ_j(0) = 0. As g is monic and irreducible over Q, g is the minimum polynomial of σ_j(β). □

If β ∈ Q(α) is an algebraic integer then its minimum polynomial has integer coefficients. As σ_j(β) shares this minimum polynomial, σ_j(β) is also an algebraic integer.

Proposition 2.2 Let α ∈ A and β ∈ Q(α) ∩ B. Then T(β), N(β) ∈ Z.

Proof We already know that T(β), N(β) ∈ Q. But T(β) is the sum, and N(β) is the product, of the σ_j(β). As β ∈ B all σ_j(β) ∈ B and so T(β), N(β) ∈ B. Thus T(β), N(β) ∈ Q ∩ B. But Q ∩ B = Z since if a ∈ Q its minimum polynomial is X − a and this has integer coefficients if and only if a ∈ Z. Hence T(β), N(β) ∈ Z. □

More generally the same argument shows that the field polynomial of β ∈ Q(α) ∩ B has integer coefficients.

Given a number field K = Q(α) we define its ring of integers as O_K = K ∩ B, that is, the set of algebraic integers in K. In the proof of Proposition 2.2 we see that if K = Q then O_K = Z. We aim to develop the concepts of number theory (primes, congruences, factorizations) in the rings O_K, just as standard number theory does for Z.

Example A quadratic field is a number field of the form Q(√m) where m ∈ Q but √m ∉ Q. Since it is easy to see that Q(√(r²m)) = Q(√m) for any nonzero rational r, each quadratic field has the form K = Q(√m) where m is a squarefree integer, that is, m is not divisible by the square of any prime number. We shall always assume this is the case when we discuss quadratic fields.

When m > 0, Q(√m) is a real quadratic field since Q(√m) ⊆ R, and when m < 0, Q(√m) is an imaginary quadratic field since Q(√m) ⊄ R.

We shall compute O_K whenever K = Q(√m) is a quadratic field. Let β = a + b√m ∈ K with a, b ∈ Q. For β ∈ O_K it is necessary that T(β), N(β) ∈ Z and this is sufficient too, since β² − T(β)β + N(β) = 0. Suppose that T(β), N(β) ∈ Z. Then 2a = T(β) ∈ Z and a² − mb² = N(β) ∈ Z. It follows that m(2b)² = T(β)² − 4N(β) ∈ Z. Since m is squarefree, 2b ∈ Z, for otherwise 2b would have a power of a prime p dividing its denominator. But then, since p² ∤ m, so would m(2b)². We can write β = ½(c + d√m) with c, d ∈ Z. Finally c² − md² = 4N(β) ≡ 0 (mod 4). Since m is squarefree, m ≢ 0 (mod 4). As odd squares are congruent to 1 modulo 4, and even squares are divisible by 4, c² − md² ≡ 0 (mod 4) is only possible if c and d are both even, or if they are both odd and m ≡ 1 (mod 4).

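To conclude, when m ≢ 1 (mod 4) then

\[ \mathcal{O}_K = \{a + b\sqrt{m} : a, b \in \mathbf{Z}\} = \mathbf{Z}[\sqrt{m}] \]

and when m ≡ 1 (mod 4) then

\[ \mathcal{O}_K = \left\{ \frac{c + d\sqrt{m}}{2} : c, d \in \mathbf{Z},\ c \equiv d \pmod 2 \right\} = \left\{ a + b\left(\frac{1 + \sqrt{m}}{2}\right) : a, b \in \mathbf{Z} \right\} = \mathbf{Z}\left[\frac{1 + \sqrt{m}}{2}\right]. \]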
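The criterion used in this example (β = a + b√m is an algebraic integer exactly when its trace 2a and norm a² − mb² are integers) is easy to test numerically. The following is a minimal sketch with exact fractions; the function name `is_algebraic_integer` is hypothetical, and m is assumed to be a squarefree integer other than 1.

```python
from fractions import Fraction

def is_algebraic_integer(a, b, m):
    """Decide whether a + b*sqrt(m) lies in O_K for K = Q(sqrt(m)),
    using T(beta) = 2a and N(beta) = a^2 - m*b^2."""
    a, b = Fraction(a), Fraction(b)
    trace = 2 * a
    norm = a * a - m * b * b
    return trace.denominator == 1 and norm.denominator == 1

half = Fraction(1, 2)
print(is_algebraic_integer(half, half, 5))    # (1 + sqrt(5))/2, with 5 = 1 (mod 4)
print(is_algebraic_integer(half, half, 3))    # (1 + sqrt(3))/2, with 3 = 3 (mod 4)
print(is_algebraic_integer(half, half, -3))   # (1 + sqrt(-3))/2, with -3 = 1 (mod 4)
```

The first and third elements are algebraic integers and the second is not, matching the case distinction on m modulo 4.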

For quadratic fields K we have shown that there exist β_1 and β_2 such that O_K = {a_1β_1 + a_2β_2 : a_1, a_2 ∈ Z}. (We have β_1 = 1 and β_2 = √m or ½(1 + √m) as appropriate.) Our aim will be to show that the corresponding property holds for every number field. Indeed if K is a number field of degree n, then there exist β_1, …, β_n ∈ O_K with the property that each element of O_K can be uniquely expressed in the form ∑_{j=1}^{n} a_jβ_j where the a_j ∈ Z. To this end we need to introduce the concept of discriminant.

Let K = Q(α) be a number field of degree n. Let β_1, …, β_n ∈ K. We define M(β_1, …, β_n) as the matrix whose (j, k)-entry is T(β_jβ_k). We then define the discriminant of the sequence β_1, …, β_n as ∆(β_1, …, β_n) = det(M(β_1, …, β_n)). Then as each T(β_jβ_k) ∈ Q, ∆(β_1, …, β_n) ∈ Q.

Lemma 2.3 Let K be a number field of degree n and let β_1, …, β_n ∈ K. Then

\[ \Delta(\beta_1, \ldots, \beta_n) = \det(N(\beta_1, \ldots, \beta_n))^2 \]

where N(β_1, …, β_n) is the matrix whose (j, k)-entry is σ_k(β_j).

Proof Let M = M(β_1, …, β_n) and N = N(β_1, …, β_n). The (j, k)-entry of NN^T is

\[ \sum_{i=1}^{n} \sigma_i(\beta_j)\sigma_i(\beta_k) = \sum_{i=1}^{n} \sigma_i(\beta_j\beta_k) = T(\beta_j\beta_k) \]

so that NN^T = M. Hence det(M) = det(N) det(N^T) = det(N)². □

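Example Suppose that K = Q(α) has degree n. We shall compute ∆(1, α, α², …, α^{n−1}). By the Lemma,

\[ \Delta(1, \alpha, \alpha^2, \ldots, \alpha^{n-1}) = \begin{vmatrix} 1 & 1 & 1 & \cdots & 1 \\ \alpha_1 & \alpha_2 & \alpha_3 & \cdots & \alpha_n \\ \alpha_1^2 & \alpha_2^2 & \alpha_3^2 & \cdots & \alpha_n^2 \\ \alpha_1^3 & \alpha_2^3 & \alpha_3^3 & \cdots & \alpha_n^3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \alpha_1^{n-1} & \alpha_2^{n-1} & \alpha_3^{n-1} & \cdots & \alpha_n^{n-1} \end{vmatrix}^2 \]

where α_1, α_2, …, α_n are the conjugates of α. But this is a Vandermonde determinant, and so by Proposition A.5

\[ \Delta(1, \alpha, \alpha^2, \ldots, \alpha^{n-1}) = \prod_{1 \le j < k \le n} (\alpha_k - \alpha_j)^2. \]

Let f(X) = ∏_{k=1}^{n} (X − α_k) be the minimum polynomial of α. Then by the product rule for differentiation

\[ f'(X) = \sum_{j=1}^{n} \prod_{\substack{k=1 \\ k \ne j}}^{n} (X - \alpha_k). \]

When X = α_j only the j-th summand is nonzero, so

\[ f'(\alpha_j) = \prod_{\substack{k=1 \\ k \ne j}}^{n} (\alpha_j - \alpha_k). \]

Hence

\[ \Delta(1, \alpha, \alpha^2, \ldots, \alpha^{n-1}) = (-1)^{n(n-1)/2} \prod_{j=1}^{n} f'(\alpha_j) = (-1)^{n(n-1)/2} N(f'(\alpha)). \]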
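The discriminant of a power basis can also be computed exactly from the definition ∆ = det(T(β_jβ_k)), without finding the conjugates. The sketch below relies on two facts not spelled out in these notes: T(α^k) is the power sum p_k of the conjugates, and the p_k satisfy Newton’s identities in terms of the coefficients of f. The function names are illustrative only.

```python
from fractions import Fraction

def power_sums(c, count):
    """For monic f = X^n + c[n-1] X^(n-1) + ... + c[0], return
    [p_0, ..., p_{count-1}], where p_k is the sum of the k-th powers
    of the roots of f (Newton's identities)."""
    n = len(c)
    p = [Fraction(n)]
    for k in range(1, count):
        s = -sum(c[n - i] * p[k - i] for i in range(1, min(k - 1, n) + 1))
        if k <= n:
            s -= k * c[n - k]
        p.append(Fraction(s))
    return p

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, d = len(M), Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if M[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            M[i], M[pivot] = M[pivot], M[i]
            d = -d
        d *= M[i][i]
        for r in range(i + 1, n):
            factor = M[r][i] / M[i][i]
            M[r] = [x - factor * y for x, y in zip(M[r], M[i])]
    return d

def power_basis_discriminant(c):
    """Discriminant of 1, alpha, ..., alpha^(n-1): the determinant of the
    matrix with (j, k)-entry T(alpha^(j+k)) = p_{j+k}."""
    n = len(c)
    p = power_sums(c, 2 * n - 1)
    return det([[p[j + k] for k in range(n)] for j in range(n)])

print(power_basis_discriminant([-2, 0, 0]))   # f = X^3 - 2
```

For f = X³ − 2 this gives −108, which agrees with the formula above: f′(α) = 3α², N(3α²) = 27 · N(α)² = 108 and (−1)^{3·2/2} = −1.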

Since the conjugates of α are distinct, ∆(1, α, α², …, α^{n−1}) ≠ 0. Thus ∆(β_1, …, β_n) ≠ 0 for at least one choice of the β_j, and we shall see that the same holds in many other cases. The following lemma enables us to relate the discriminants of different sequences.

Lemma 2.4 Let K be a number field of degree n and let β_1, …, β_n ∈ K. If B = (b_{jk}) is an n-by-n matrix over Q and γ_j = ∑_{k=1}^{n} b_{jk}β_k then

\[ \Delta(\gamma_1, \ldots, \gamma_n) = \det(B)^2\, \Delta(\beta_1, \ldots, \beta_n). \]

Proof We have ∆(β_1, …, β_n) = det(N(β_1, …, β_n))² where the (j, k)-entry of N(β_1, …, β_n) is σ_k(β_j). Now σ_k(γ_j) = ∑_{i=1}^{n} b_{ji}σ_k(β_i) so that

\[ N(\gamma_1, \ldots, \gamma_n) = B\,N(\beta_1, \ldots, \beta_n). \]

Taking determinants and squaring completes the proof. □

We can now show how the discriminant discriminates between bases and nonbases.

Proposition 2.3 Let K be a number field of degree n and let β_1, …, β_n ∈ K. Then β_1, …, β_n form a basis of K as a vector space over Q if and only if ∆(β_1, …, β_n) ≠ 0.

Proof Certainly we can write β_j = ∑_{k=1}^{n} b_{jk}α^{k−1} with b_{jk} ∈ Q. Let B be the matrix with the b_{jk} as entries. Then by Lemma 2.4

\[ \Delta(\beta_1, \ldots, \beta_n) = \det(B)^2\, \Delta(1, \alpha, \alpha^2, \ldots, \alpha^{n-1}) \]

and so, as ∆(1, α, α², …, α^{n−1}) ≠ 0, we have ∆(β_1, …, β_n) ≠ 0 if and only if det(B) ≠ 0. But det(B) ≠ 0 if and only if the β_j form a basis of K as a vector space over Q. □

Let K be a number field of degree n. Certainly O_K is a subgroup of K under the operation of addition. We aim to show that there are β_1, …, β_n ∈ O_K with each element of O_K uniquely expressible in the form ∑_{j=1}^{n} b_jβ_j where the b_j ∈ Z. A sequence β_1, …, β_n satisfying this is called an integral basis of O_K. More generally if G is an abelian group under the operation of addition, then an integral basis of G is a sequence γ_1, …, γ_m of elements of G such that each element of G is uniquely expressible as ∑_{j=1}^{m} c_jγ_j with the c_j ∈ Z. If G has an integral basis with m elements then we say that G is a free abelian group of rank m. The basic theory of free abelian groups is outlined in Appendix A.3.

Theorem 2.3 Let K be a number field of degree n. Then O_K is a free abelian group of rank n.

Proof Let β_1, …, β_n be a basis of K. Then for positive integers r_1, …, r_n the sequence r_1β_1, …, r_nβ_n is also a basis of K. By Lemma 1.1 we may choose the r_j such that r_jβ_j ∈ O_K for each j. Replacing β_j by r_jβ_j we see that there is a basis β_1, …, β_n of K with each β_j ∈ O_K.

Suppose that γ = ∑_{k=1}^{n} c_kβ_k ∈ O_K where the c_k ∈ Q. Then for each j, β_jγ ∈ O_K and so T(β_jγ) ∈ Z. Thus d_j = ∑_{k=1}^{n} c_kT(β_jβ_k) ∈ Z for all j. Let M be the matrix with (j, k)-entry T(β_jβ_k). Then d = Mc where c and d are the column vectors with j-th entries c_j and d_j respectively. Now det(M) = ∆(β_1, …, β_n) ≠ 0 as the β_j form a basis. Hence c = M^{−1}d. The matrix M has integer entries, and M^{−1} = (det(M))^{−1}adj(M). Let ∆ = det(M). Then adj(M) and d have integer entries and so ∆c has integer entries. Hence ∆c_j ∈ Z for all j.

Let A = {a_1β_1 + ··· + a_nβ_n : a_j ∈ Z} and B = {∆^{−1}(a_1β_1 + ··· + a_nβ_n) : a_j ∈ Z}. Since the β_j form a basis of K, the β_j form an integral basis of A and the β_j/∆ form an integral basis of B. We have shown that A ⊆ O_K ⊆ B. Since B is free abelian of rank n, O_K is free abelian of rank m where m ≤ n by Proposition A.3. Again by this proposition, since A ⊆ O_K, n ≤ m. Hence m = n. □

The choice of integral basis for the ring of integers of a number field K is not unique. However, the discriminant of each integral basis is the same.

Proposition 2.4 Let K be a number field of degree n, and let β_1, …, β_n and γ_1, …, γ_n be integral bases of O_K. Then ∆(β_1, …, β_n) = ∆(γ_1, …, γ_n).

Proof We can write γ_j = ∑_{k=1}^{n} b_{jk}β_k and β_j = ∑_{k=1}^{n} c_{jk}γ_k where all the b_{jk} and c_{jk} are integers. Let B and C be the matrices with (j, k)-entries b_{jk} and c_{jk} respectively. Now

\[ \beta_j = \sum_{k=1}^{n} c_{jk}\gamma_k = \sum_{k=1}^{n} \sum_{i=1}^{n} c_{jk} b_{ki} \beta_i = \sum_{i=1}^{n} d_{ji}\beta_i \]

where d_{ji} = ∑_{k=1}^{n} c_{jk}b_{ki} ∈ Z. From the uniqueness of representations of elements of O_K in terms of the β_i we must have d_{jj} = 1 and d_{ji} = 0 whenever j ≠ i. But d_{ji} is the (j, i)-entry of the matrix CB. Hence CB = I, the identity matrix. Thus det(C) det(B) = det(I) = 1 and as det(B) and det(C) are integers det(B) = det(C) = ±1. But by Lemma 2.4,

\[ \Delta(\gamma_1, \ldots, \gamma_n) = \det(B)^2\, \Delta(\beta_1, \ldots, \beta_n) = \Delta(\beta_1, \ldots, \beta_n). \]

□

The nonzero integer

\[ \Delta_K = \Delta(\beta_1, \ldots, \beta_n), \]

where β_1, …, β_n form an integral basis of O_K, only depends (as the notation suggests) on the field K. We call ∆_K the discriminant of K. After the degree, it is the most important numerical invariant of the field.

Let β_1, …, β_n be an integral basis for O_K and suppose that γ_1, …, γ_n ∈ O_K and that γ_1, …, γ_n are linearly independent over Q. Then γ_1, …, γ_n form an integral basis of the subgroup H = {∑_{j=1}^{n} a_jγ_j : a_j ∈ Z} of O_K. However, it may happen that H ≠ O_K. We have ∆(γ_1, …, γ_n) = det(B)²∆_K by Lemma 2.4, where the matrix B has integer entries b_{jk} and γ_j = ∑_{k=1}^{n} b_{jk}β_k. But by Proposition A.4, |det(B)| = |O_K : H|. Hence

\[ \Delta(\gamma_1, \ldots, \gamma_n) = |\mathcal{O}_K : H|^2\, \Delta_K. \]

Hence the index |O_K : H| is a number whose square divides ∆(γ_1, …, γ_n). In particular if ∆(γ_1, …, γ_n) is squarefree, then |O_K : H| = 1 and γ_1, …, γ_n form an integral basis of O_K.
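As a small illustration of the index formula, take K = Q(√5), where the quadratic field example gives the integral basis 1, ½(1 + √5). The sketch below computes discriminants directly from the trace matrix; it assumes elements a + b√m are stored as pairs (a, b), and the function name is hypothetical.

```python
from fractions import Fraction

def trace_form_discriminant(basis, m):
    """Discriminant det(T(beta_j beta_k)) of two elements of Q(sqrt(m)),
    each given as a pair (a, b) standing for a + b*sqrt(m)."""
    def trace_of_product(x, y):
        (a, b), (c, d) = x, y
        # (a + b√m)(c + d√m) = (ac + bdm) + (ad + bc)√m, and T(u + v√m) = 2u
        return 2 * (Fraction(a) * c + Fraction(b) * d * m)
    x, y = basis
    return (trace_of_product(x, x) * trace_of_product(y, y)
            - trace_of_product(x, y) ** 2)

half = Fraction(1, 2)
d_H = trace_form_discriminant([(1, 0), (0, 1)], 5)        # basis 1, sqrt(5) of H
d_K = trace_form_discriminant([(1, 0), (half, half)], 5)  # integral basis of O_K
print(d_H, d_K, d_H / d_K)
```

Here ∆(1, √5) = 20 and ∆_K = 5, so |O_K : H|² = 4 and the subgroup H = Z[√5] has index 2 in O_K.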

3 Factorization

In ordinary number theory we study the integers Z, in particular the positive integers. In algebraic number theory we study rings of integers O_K. Each positive integer is a product of prime numbers p which have the two properties

• if p = ab with a, b ∈ Z then a = ±1 or b = ±1,

• if p | cd with c, d ∈ Z then p | c or p | d.

These two properties are equivalent, but while it is easy to prove that the second implies the first, the converse requires the Euclidean algorithm. However the corresponding properties in O_K are not equivalent.

We need some definitions. A unit in O_K is an element β ∈ O_K such that 1/β ∈ O_K. The set of units of O_K is denoted by U(O_K) and it is apparent that it forms a group under multiplication. There is a nice characterization of units.

Lemma 3.1 Suppose that K is a number field of degree n and let β ∈ O_K. Then β ∈ U(O_K) if and only if N(β) = ±1.

Proof If β ∈ U(O_K), then 1/β ∈ O_K and so N(β), N(1/β) ∈ Z. But N(β)N(1/β) = N(1) = 1 so that N(β) = N(1/β) = ±1.

Conversely suppose that N(β) = ±1. Then

\[ \pm 1 = \prod_{j=1}^{n} \sigma_j(\beta) = \beta \prod_{j=2}^{n} \sigma_j(\beta). \]

Thus 1/β = ±∏_{j=2}^{n} σ_j(β), which is an algebraic integer because each σ_j(β) ∈ B. Hence 1/β ∈ K ∩ B = O_K and so β ∈ U(O_K). □

For Z = O_Q, the only units are ±1. But the unit group of O_K can be infinite. For example β = 1 + √2 ∈ K = Q(√2) is a unit as N(β) = −1. But β > 1 so that β^m → ∞ as m → ∞. But when m is a positive integer, β^m ∈ U(O_K) so that U(O_K) is infinite.
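The first few powers of this unit can be listed with exact integer arithmetic. This is only an illustrative sketch: elements a + b√2 of Z[√2] are stored as pairs (a, b), and the function names are made up.

```python
def mul(x, y):
    """Product in Z[sqrt(2)]: (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2."""
    (a, b), (c, d) = x, y
    return (a * c + 2 * b * d, a * d + b * c)

def norm(x):
    """N(a + b√2) = a^2 - 2 b^2."""
    a, b = x
    return a * a - 2 * b * b

def unit_powers(count):
    """The powers beta, beta^2, ..., beta^count of beta = 1 + sqrt(2)."""
    beta, power, out = (1, 1), (1, 1), []
    for _ in range(count):
        out.append(power)
        power = mul(power, beta)
    return out

for u in unit_powers(4):
    print(u, norm(u))
```

Each power has norm ±1, as Lemma 3.1 requires, and the coefficients grow without bound, so the powers are pairwise distinct.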

As for the integers we define a divisibility relation on O_K. For β, γ ∈ O_K with β ≠ 0 we say that β | γ (β divides γ or γ is divisible by β) if γ/β ∈ O_K, and β ∤ γ otherwise. Similarly we write γ ≡ δ (mod β) (γ is congruent to δ modulo β) if β | (γ − δ). Divisibility and congruences have all the formal properties familiar from Z so we shall not repeat them. Note that β is a unit if and only if β | 1. One useful new property of divisibility is the following.

Lemma 3.2 Let β, γ ∈ O_K. If β | γ then N(β) | N(γ) as integers.

Proof If β | γ then δ = γ/β ∈ O_K and N(γ) = N(β)N(δ). As N(β), N(δ) ∈ Z, N(β) | N(γ). □

We can now define what turns out to be our first analogue of prime numbers. Let β ∈ O_K. We say that β is irreducible if

• β ≠ 0,

• β is not a unit, and

• if β = γδ with γ, δ ∈ O_K then either γ or δ is a unit.

In Z, the irreducible elements have the form ±p where p is a prime number. From Lemma 3.2 it follows that if N(β) is a prime number then β is irreducible. The converse is not true; take the example K = Q(i) so that O_K = Z[i]. Then 3 is irreducible in O_K but N(3) = 9 is not prime.

In each O_K we can achieve factorization into irreducibles.

Lemma 3.3 Let K be a number field. Suppose that β ∈ O_K and that β ≠ 0 and β ∉ U(O_K). Then there are irreducible elements γ_1, …, γ_k ∈ O_K with β = γ_1 ··· γ_k.

Proof By induction on |N(β)|, which is a positive integer. Since β ≠ 0 and β is not a unit, |N(β)| ≥ 2. If β is irreducible then take k = 1 and γ_1 = β. If β is reducible then β = β_1β_2 where β_1, β_2 ∈ O_K and β_1, β_2 ∉ U(O_K). Then |N(β_1)|, |N(β_2)| > 1 and as |N(β)| = |N(β_1)||N(β_2)| it follows that |N(β_1)|, |N(β_2)| < |N(β)|. By the induction hypothesis both β_1 and β_2 are products of irreducible elements, and by combining these factorizations we see that β is also a product of irreducible elements. □

We turn to the question of uniqueness. It is easy to see that if β is irreducible and ξ ∈ U(O_K) then ξβ is also irreducible. Hence we can adjust factorizations by multiplying factors by units. For instance if β = γ_1γ_2γ_3 is a factorization into irreducibles and ξ ∈ U(O_K) then β = (ξγ_1)γ_2(ξ^{−1}γ_3) is also a factorization into irreducibles. If we can go from one factorization to another by introducing unit factors and/or permuting the order of factors then we say that the factorizations are equivalent. From standard number theory we know that in Z all factorizations of a given number into irreducibles are equivalent. However this does not hold in all O_K.

Example Let K = Q(√−6). Then O_K = Z[√−6]. Now 6 = 2 × 3 = √−6(−√−6). I claim that these are inequivalent factorizations of 6 into irreducibles. Now N(2) = 4, N(3) = 9 and N(±√−6) = 6. If any of 2, 3 or ±√−6 were reducible, their nontrivial factors would have norms 2 or 3. But suppose that β ∈ Z[√−6] has N(β) = 2 or 3. Then β = a + b√−6 with a, b ∈ Z and a² + 6b² = 2 or 3 which is impossible. Hence 2, 3 and ±√−6 are irreducible, and as the norms of the factors on both sides of 2 × 3 = √−6(−√−6) are different then the two factorizations are inequivalent.
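The impossibility of a² + 6b² = 2 or 3 is clear by hand (any b ≠ 0 already gives a norm of at least 6, and a² is never 2 or 3), but it can also be confirmed by a small search. This is an illustrative sketch only; the search box and the name `small` are assumptions of the example.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-6) in Z[sqrt(-6)]."""
    return a * a + 6 * b * b

# Any solution of a^2 + 6b^2 <= 3 must have |a| <= 1 and b = 0,
# so a box of radius 3 is more than enough.
small = [(a, b) for a in range(-3, 4) for b in range(-3, 4)
         if norm(a, b) in (2, 3)]

print(small)                               # no element has norm 2 or 3
print(norm(2, 0), norm(3, 0), norm(0, 1))  # norms of 2, 3 and sqrt(-6)
```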

In this example √−6 is irreducible, but √−6 | (2 × 3), √−6 ∤ 2 and √−6 ∤ 3. If we study the proof of uniqueness of prime factorization in Z we see that it relies on the fact that if a prime p divides a product of integers ab then it divides (at least) one of the integers a and b. This property is not shared by the irreducible element √−6 of Z[√−6]. We thus make a definition. Let β ∈ O_K. We say that β is prime if

• β ≠ 0,

• β is not a unit, and

• if β | γδ with γ, δ ∈ O_K then either β | γ or β | δ.

From standard number theory an integer is irreducible in Z if and only if it is prime. However √−6 is irreducible but not prime in Z[√−6]. In the other direction, primes are always irreducible.

Lemma 3.4 Let K be a number field. If β is a prime element of O_K then β is irreducible in O_K.

Proof Let β be prime and suppose that β = γδ with γ, δ ∈ O_K. Then β | γδ so that β | γ or β | δ by primality. Without loss of generality suppose that β | γ. Then as γ | β, δ = β/γ is a unit. Hence β is irreducible. □

If in a given O_K every irreducible is prime then we achieve unique factorization, by an argument similar to that of unique factorization in Z.

Proposition 3.1 Let K be a number field and suppose that every irreducible element of O_K is prime. Then O_K has unique factorization: any two factorizations of an element into irreducibles are equivalent.

Proof Let

β = ∏_{j=1}^{r} γ_j = ∏_{k=1}^{s} δ_k

be two factorizations of β into irreducibles. We argue that these are equivalent by induction on |N(β)|. Since γ_1 is prime and γ_1 | δ_1 ⋯ δ_s then γ_1 | δ_k for some k. By shuffling the δs we may assume that γ_1 | δ_1 and as δ_1 is irreducible then δ_1 = ξγ_1 where ξ is a unit. Hence

β/γ_1 = ∏_{j=2}^{r} γ_j = (ξδ_2) ∏_{k=3}^{s} δ_k

is a factorization into irreducibles and |N(β/γ_1)| < |N(β)|. By the inductive hypothesis these factorizations of β/γ_1 are equivalent, and so the given factorizations of β are equivalent. □


For some fields K, for instance K = Q, every irreducible in O_K is prime and for these fields O_K has the unique factorization property. The reason Z has unique factorization is because of the Euclidean algorithm. This works because if a, b ∈ Z, a ≠ 0 and a ∤ b then there is c ∈ Z with |b − ac| < |a|. Some other number fields have the analogous property. We say that K is norm-Euclidean if when β, γ ∈ O_K with β ≠ 0, then there exists δ ∈ O_K with |N(γ − δβ)| < |N(β)|. In other words we get the “remainder” γ − δβ when dividing γ by β and this remainder is “smaller” than β. There is a useful alternative characterization.

Lemma 3.5 The number field K is norm-Euclidean if and only if for all ξ ∈ K there is δ ∈ O_K with |N(ξ − δ)| < 1.

Proof Suppose that K is norm-Euclidean. Let ξ ∈ K. Then ξ = γ/β for some β, γ ∈ O_K with β ≠ 0. By the norm-Euclidean property, there is δ ∈ O_K with |N(γ − δβ)| < |N(β)|. Thus

1 > |N(γ − δβ)/N(β)| = |N(γ/β − δ)| = |N(ξ − δ)|.

Conversely, suppose that for all ξ ∈ K there is δ ∈ O_K with |N(ξ − δ)| < 1. Let β, γ ∈ O_K with β ≠ 0. Put ξ = γ/β. Then there is δ ∈ O_K with |N(ξ − δ)| < 1. Hence

|N(γ − βδ)| = |N(β)||N(ξ − δ)| < |N(β)|

and so K is norm-Euclidean. □

Example We show that Q is norm-Euclidean. Let x ∈ Q. Then n ≤ x ≤ n + 1 for some n ∈ Z. Now |N(x − n)| = |x − n| = x − n and |N(x − (n + 1))| = |x − (n + 1)| = n + 1 − x. As (x − n) + (n + 1 − x) = 1 then one of these numbers is ≤ 1/2. So for a = n or a = n + 1, |N(x − a)| ≤ 1/2 < 1.

Example Consider K = Q(√−d) where d = 1 or d = 2. For ξ ∈ K, |N(ξ)| = |ξ|² where |ξ| denotes the absolute value of the complex number ξ. We have O_K = Z[√−d] = {a + b√−d : a, b ∈ Z}. Let ξ = x + y√−d with x, y ∈ Q. There are integers a, b with |x − a|, |y − b| ≤ 1/2. Let δ = a + b√−d ∈ O_K. Then

|N(ξ − δ)| = (x − a)² + d(y − b)² ≤ (1 + d)/4 < 1.

Hence Q(i) and Q(√−2) are norm-Euclidean.
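The rounding argument above is effectively an algorithm. The sketch below illustrates it for d = 1, that is for the Gaussian integers Z[i]; it is not taken from the notes, and the function names `divmod_gauss` and `gcd_gauss`, as well as the representation of a + bi as the pair (a, b), are assumptions of the example.

```python
from fractions import Fraction

def divmod_gauss(gamma, beta):
    """Return (delta, rho) in Z[i] with gamma = delta*beta + rho and N(rho) < N(beta).

    Elements a + b*i are integer pairs (a, b); beta must be nonzero.
    """
    c, d = gamma
    a, b = beta
    n = a * a + b * b                      # N(beta)
    # gamma/beta = gamma * conj(beta) / N(beta) = x + y*i with x, y rational
    x = Fraction(c * a + d * b, n)
    y = Fraction(d * a - c * b, n)
    q = (round(x), round(y))               # nearest integers: error at most 1/2 each
    rho = (c - (q[0] * a - q[1] * b), d - (q[0] * b + q[1] * a))
    return q, rho

def gcd_gauss(u, v):
    """Euclidean algorithm in Z[i]; the result is determined up to a unit factor."""
    while v != (0, 0):
        _, r = divmod_gauss(u, v)
        u, v = v, r
    return u

print(divmod_gauss((7, 3), (2, 1)))
print(gcd_gauss((5, 0), (2, 1)))
```

The remainder satisfies N(ρ) ≤ N(β)/2, which is the bound (1 + d)/4 with d = 1, so the loop in `gcd_gauss` terminates.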


Example Consider K = Q(√−d) where d = 3, 7 or 11. We have O_K = Z[α] = {a + bα : a, b ∈ Z} where α = ½(1 + √−d). Let ξ = x + y√−d with x, y ∈ Q. There is an integer b with |2y − b| ≤ 1/2. Then ξ − bα = ½(2x − b) + ½(2y − b)√−d. There is an integer a with |½(2x − b) − a| ≤ 1/2. Let δ = a + bα. Then

|N(ξ − δ)| = ((2x − 2a − b)² + d(2y − b)²)/4 ≤ (1 + d/4)/4 < 1.

Hence Q(√−3), Q(√−7) and Q(√−11) are all norm-Euclidean.
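The two-step rounding used in this example (first b, then a) can be written out and tested with exact rational arithmetic. This is a hedged sketch rather than anything from the notes; `approx` and `norm_diff` are hypothetical helper names, and x and y are assumed to be `Fraction` values.

```python
from fractions import Fraction

def approx(x, y):
    """For xi = x + y*sqrt(-d), choose integers (a, b) defining delta = a + b*alpha,
    where alpha = (1 + sqrt(-d))/2: round 2y to get b, then round x - b/2 to get a."""
    b = round(2 * y)                  # |2y - b| <= 1/2
    a = round(x - Fraction(b, 2))     # |x - b/2 - a| <= 1/2
    return a, b

def norm_diff(x, y, a, b, d):
    """N(xi - delta), using xi - delta = (x - a - b/2) + (y - b/2)*sqrt(-d)."""
    u = x - a - Fraction(b, 2)
    v = y - Fraction(b, 2)
    return u * u + d * v * v

x, y = Fraction(3, 7), Fraction(5, 11)
for d in (3, 7, 11):
    a, b = approx(x, y)
    print(d, (a, b), norm_diff(x, y, a, b, d), Fraction(4 + d, 16))
```

The last column printed is the bound (1 + d/4)/4 = (4 + d)/16 from the text, which is below 1 exactly when d < 12.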

Example Let K = Q(√−6). Then O_K = Z[√−6]. Let ξ = ½(1 + √−6). If δ = a + b√−6 ∈ O_K with a, b ∈ Z, then

|N(ξ − δ)| = (1/2 − a)² + 6(1/2 − b)² ≥ (1 + 6)/4 > 1

since |c − 1/2| ≥ 1/2 for all c ∈ Z. Hence Q(√−6) is not norm-Euclidean.
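As a quick sanity check of this lower bound, one can minimise the norm over a box of integer pairs with exact arithmetic. The box size is an arbitrary choice made for this illustration.

```python
from fractions import Fraction

half = Fraction(1, 2)

# N(xi - delta) for xi = (1 + sqrt(-6))/2 and delta = a + b*sqrt(-6)
best = min((half - a) ** 2 + 6 * (half - b) ** 2
           for a in range(-5, 6) for b in range(-5, 6))

print(best)  # the smallest value found; it is attained at a, b in {0, 1}
```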

Example Consider K = Q(√m) where m = 2 or m = 3. Then O_K = Z[√m] = {a + b√m : a, b ∈ Z}. Let ξ = x + y√m with x, y ∈ Q. There are integers a, b with |x − a|, |y − b| ≤ 1/2. Let δ = a + b√m ∈ O_K. Then

N(ξ − δ) = (x − a)² − m(y − b)².

Thus

−m/4 ≤ N(ξ − δ) ≤ 1/4

and consequently

|N(ξ − δ)| ≤ max(1/4, m/4) < 1.

Hence Q(√2) and Q(√3) are norm-Euclidean.

It is apparent that if K is norm-Euclidean then O_K is a Euclidean domain with respect to the Euclidean function φ(β) = |N(β)|. Every ideal in a Euclidean domain is principal (Proposition A.6). If all ideals of O_K are principal then every irreducible in O_K is prime.

Proposition 3.2 Suppose that the number field K has the property that each ideal of O_K is principal. Then every irreducible element of O_K is prime.

Proof Suppose that each ideal of O_K is principal. Suppose that β ∈ O_K is irreducible and that β | γδ. We must show that if β ∤ γ then β | δ. Let I = {ξβ + ηγ : ξ, η ∈ O_K} be the ideal generated by β and γ. Then I is principal: I = 〈λ〉 say. As β ∈ I then λ | β. But as β is irreducible either λ = εβ or λ = ε where ε is a unit. If λ = εβ then β | λ, but also λ | γ as γ ∈ I. Hence β | γ which is false. Hence λ = ε and so I = O_K. Therefore 1 ∈ I so that 1 = ξβ + ηγ for some ξ, η ∈ O_K. Hence δ = ξβδ + ηγδ from which it follows that β | δ as β | γδ. Hence β is prime. □

When K is norm-Euclidean we get the following chain of implications. First O_K is a Euclidean domain. Then every ideal of O_K is principal. Then every irreducible in O_K is prime. Finally O_K has the unique factorization property. However not all these implications are reversible. When K = Q(√−19), O_K has unique factorization but is not a Euclidean domain. Clark proved in 1993 that when K = Q(√69) then O_K is a Euclidean domain despite the fact that K is not norm-Euclidean.

As an application we look at an equation where to find all integer solutions it is useful to work in a number field.

Example We wish to find all solutions of

x³ = y² + 2 (∗)

with x, y ∈ Z.

The presence of the 2 in (∗) suggests that we see whether we can restrict the parity of x and y. If y is even, then 4 | y² and so y² + 2 ≡ 2 (mod 4). But x³ ≢ 2 (mod 4) so y must be odd. This forces x³ to be odd and so x is odd.

Next we factor (∗) as

x³ = (y + √−2)(y − √−2). (†)

This is a factorization in K = Q(√−2). Note that O_K = Z[√−2] and that O_K has unique factorization as we have seen that K is norm-Euclidean. It is easy to see that the only units of O_K are ±1. Let us write out the factorization of y + √−2 into primes, putting together repeated occurrences and also putting together occurrences of π and −π as −π². We get

y + √−2 = ±π_1^{a_1} π_2^{a_2} ⋯ π_m^{a_m} (‡)

where if j ≠ k then π_j ≠ ±π_k. Writing π̄ for the complex conjugate of π, the conjugate of (‡) is a factorization of y − √−2, and so (†) implies that

x³ = π_1^{a_1} π_2^{a_2} ⋯ π_m^{a_m} π̄_1^{a_1} π̄_2^{a_2} ⋯ π̄_m^{a_m}.

I claim that no π_j equals ±π̄_k. If this happened then π_j | (y + √−2) and π_j | (y − √−2). Thus π_j would be a factor of (y + √−2) − (y − √−2) = 2√−2. But as π_j | (y + √−2) then N(π_j) | N(y + √−2) = y² + 2, which is odd, and as π_j | 2√−2 then N(π_j) | N(2√−2) = 8. Hence N(π_j) = 1, which means that π_j is a unit, and not an irreducible.


Since x³ is a cube, when we write it as a product of powers of irreducibles, the exponent of each is a multiple of 3. From unique factorization then each a_j is divisible by 3. Consequently from (‡), y + √−2 = ±β³ = (±β)³ where β ∈ O_K. Write ±β = a + b√−2 with a, b ∈ Z. Then

y + √−2 = (a³ − 6ab²) + (3a²b − 2b³)√−2

and so y = a(a² − 6b²) and 1 = b(3a² − 2b²). Hence b = ±1 and ±1 = 3a² − 2b² = 3a² − 2. This can only happen when 3a² = 3 and a = ±1. Then y = ±(−5) = ±5. Thus x³ = 27 and x = 3. We conclude that the only integer solutions of (∗) are (x, y) = (3, 5) and (x, y) = (3, −5).
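The conclusion can be cross-checked by a direct search over a finite range. The sketch below is only a sanity check and proves nothing beyond the chosen bound; the function name `solutions` and the bound 1000 are assumptions of the example.

```python
from math import isqrt

def solutions(bound):
    """All integer pairs (x, y) with x^3 = y^2 + 2 and |x| <= bound."""
    sols = []
    for x in range(-bound, bound + 1):
        t = x ** 3 - 2            # we need t = y^2, so t must be a nonnegative square
        if t < 0:
            continue
        y = isqrt(t)
        if y * y == t:
            sols.append((x, y))
            if y != 0:
                sols.append((x, -y))
    return sols

print(solutions(1000))
```

The search finds only x = 3 with y = ±5, in agreement with the argument in Z[√−2].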

4 Ideals

Recall that an ideal of a (commutative) ring R is a subset I of R such that

• I is a subgroup of R (under the operation of addition),

• if a ∈ I and x ∈ R then xa ∈ I.

A principal ideal is one of the form

〈a〉 = {xa : x ∈ R}

for some a ∈ R. The trivial cases are 〈0〉 = {0} and 〈1〉 = R. All other ideals of R are called nontrivial.

Let I and J be ideals of R. It is easy to see that their sum I + J = {a + b : a ∈ I, b ∈ J} is an ideal of R. The sum of I and J is the smallest ideal containing both I and J. It is even easier to see that their intersection I ∩ J is an ideal of R. Ideals can be multiplied, but this is more difficult. If I and J are ideals of R then the set {ab : a ∈ I, b ∈ J} is not in general an ideal of R (although if one of I and J is principal it is). The problem is that the sum a_1b_1 + a_2b_2 where a_1, a_2 ∈ I and b_1, b_2 ∈ J may not be expressible as ab for a ∈ I and b ∈ J. However the additive group generated by the elements ab for a ∈ I, b ∈ J is an ideal of R, and we call this ideal the product IJ of I and J. Symbolically

IJ = {a_1b_1 + a_2b_2 + ⋯ + a_nb_n : a_1, …, a_n ∈ I, b_1, …, b_n ∈ J}.

The product IJ is the smallest ideal containing all ab with a ∈ I and b ∈ J. The sum and product satisfy a number of formal properties:

• I + I = I when I is an ideal of R,


• I + J = J + I when I and J are ideals of R,

• I1 + (I2 + I3) = (I1 + I2) + I3 when I1, I2 and I3 are ideals of R,

• IJ ⊆ I ∩ J when I and J are ideals of R,

• IJ = JI when I and J are ideals of R,

• I1(I2I3) = (I1I2)I3 when I1, I2 and I3 are ideals of R, and

• I(J1 + J2) = IJ1 + IJ2 when I, J1 and J2 are ideals of R.

We abbreviate the sum of a number of principal ideals as follows:

〈a_1, a_2, …, a_r〉 = 〈a_1〉 + 〈a_2〉 + ⋯ + 〈a_r〉.

Then 〈a_1, a_2, …, a_r〉 is the smallest ideal containing each of the a_j. An ideal of this form is called finitely generated. By using the above properties of the ideal sum and product we find that

〈a_1, a_2, …, a_r〉 + 〈b_1, b_2, …, b_s〉 = 〈a_1, a_2, …, a_r, b_1, b_2, …, b_s〉

and

〈a_1, a_2, …, a_r〉〈b_1, b_2, …, b_s〉 = 〈a_1b_1, a_1b_2, …, a_1b_s, a_2b_1, a_2b_2, …, a_rb_s〉.

We now turn to the ideal theory of O_K for number fields K. Two elements of O_K generate the same principal ideal when they differ by a unit factor.

Lemma 4.1 Let β and γ be nonzero elements of O_K where K is a number field. Then 〈β〉 = 〈γ〉 if and only if γ/β is a unit in O_K.

Proof If 〈β〉 = 〈γ〉 then β ∈ 〈γ〉 and γ ∈ 〈β〉. Hence γ/β ∈ O_K and β/γ ∈ O_K and so γ/β ∈ U(O_K).

Conversely if γ/β ∈ U(O_K) then β/γ, γ/β ∈ O_K and so β | γ and γ | β. Hence 〈β〉 ⊆ 〈γ〉 ⊆ 〈β〉 so that 〈β〉 = 〈γ〉. □

The concepts of divisibility and primality in O_K can be expressed in terms of ideals. For instance β | γ if and only if γ ∈ 〈β〉 which occurs if and only if 〈γ〉 ⊆ 〈β〉. Similarly γ ≡ δ (mod β) if and only if γ − δ ∈ 〈β〉. We can generalize the notion of congruences modulo an element to congruences modulo an ideal; if I is an ideal then we write γ ≡ δ (mod I) whenever γ − δ ∈ I. Hence γ ≡ δ (mod 〈β〉) means the same as γ ≡ δ (mod β). The relation of congruence modulo an ideal has the same formal properties as congruence modulo an element, which I shall not list.

The condition for β to be a prime element of O_K becomes the following:

• 〈β〉 ≠ 〈0〉,

• 〈β〉 ≠ 〈1〉, and

• if γ, δ ∈ O_K and γδ ∈ 〈β〉 then either γ ∈ 〈β〉 or δ ∈ 〈β〉.

Note that here β only enters through the ideal 〈β〉. We say that an ideal P of O_K is prime if

• P ≠ 〈0〉,

• P ≠ 〈1〉, and

• if γ, δ ∈ O_K and γδ ∈ P then either γ ∈ P or δ ∈ P.

Thus the principal prime ideals are those of the form 〈β〉 with β prime. When every ideal of O_K is principal then every irreducible element of O_K is prime by Proposition 3.2. But then factorizations into irreducibles are always unique up to equivalence by Proposition 3.1. We can put these results together and rephrase in the language of ideals.

Proposition 4.1 Let K be a number field, and suppose that each ideal of O_K is principal. Each nontrivial ideal of O_K is a product of prime ideals and all such expressions are unique up to the order of the factors.

Proof Each nontrivial ideal I has the form I = 〈β〉 where β ≠ 0 and β ∉ U(O_K). Then by Lemma 3.3, β = γ_1γ_2 ⋯ γ_r where the γ_j are irreducible. Then I = 〈γ_1〉〈γ_2〉 ⋯ 〈γ_r〉 and by Proposition 3.2 the γ_j are primes. But then the 〈γ_j〉 are prime ideals so that I is a product of prime ideals.

Let

I = P_1P_2 ⋯ P_r = Q_1Q_2 ⋯ Q_s (∗)

be two factorizations of I into prime ideals. Write P_i = 〈γ_i〉 and Q_j = 〈δ_j〉. Then β, γ_1γ_2 ⋯ γ_r and δ_1δ_2 ⋯ δ_s differ only by unit factors. By absorbing these into γ_1 and δ_1 we may assume that

β = γ_1γ_2 ⋯ γ_r = δ_1δ_2 ⋯ δ_s.

By Proposition 3.1, these factorizations are equivalent which means that in (∗), r = s and the P_i and Q_j are the same up to order. □

But not every K has the property that each ideal of O_K is principal. Remarkably, Proposition 4.1 is still valid for these fields, although the proof is harder. The unique factorization property for prime ideals compensates in part for the failure of unique factorization into irreducible elements. It is time to see some examples of nonprincipal ideals.


Example Let K = Q(√−6). Then O_K = {a + b√−6 : a, b ∈ Z}. We define two subsets of O_K which will turn out to be nonprincipal ideals. Let

I = {a + b√−6 : a, b ∈ Z, a is even} = {2c + b√−6 : b, c ∈ Z}

and

J = {a + b√−6 : a, b ∈ Z, 3 | a} = {3c + b√−6 : b, c ∈ Z}.

It is easy to see that I and J are subgroups of O_K under addition. Suppose β = 2c + b√−6 ∈ I and γ = r + s√−6 ∈ O_K. Then

γβ = (r + s√−6)(2c + b√−6) = 2(rc − 3sb) + (rb + 2sc)√−6 ∈ I

and so I is an ideal of O_K. A similar argument shows that J is an ideal of O_K. In fact I claim that I = 〈2, √−6〉. Certainly 2 ∈ I and √−6 ∈ I so that 〈2, √−6〉 ⊆ I. On the other hand each element of I has the form 2c + b√−6 for b, c ∈ Z. A fortiori each element of I has the form 2γ + δ√−6 with γ, δ ∈ O_K and so I ⊆ 〈2, √−6〉. Indeed then, I = 〈2, √−6〉. Similarly J = 〈3, √−6〉.

We now show that I and J are nonprincipal. Suppose that I were principal. Then I = 〈β〉 for some β ∈ O_K. Then as 2 ∈ I and √−6 ∈ I, β | 2 and β | √−6. Hence N(β) | N(2) = 4 and N(β) | N(√−6) = 6. It follows that N(β) = ±1 or ±2. But N(β) = a² + 6b² where β = a + b√−6 and a, b ∈ Z. The only possibility is a = ±1 and b = 0. But then β = ±1 and ±1 ∉ I so this is false. Hence I is nonprincipal. A similar argument shows that J is also nonprincipal.

We shall compute the products of I and J. First of all consider I². We have

I² = 〈2, √−6〉〈2, √−6〉 = 〈4, 2√−6, 2√−6, −6〉.

By inspection we see that 4, −6 and 2√−6 are all elements of 〈2〉 so that I² ⊆ 〈2〉. But 2 = (−1)·4 + (−1)·(−6) ∈ I². Hence 〈2〉 ⊆ I² and we conclude that I² = 〈2〉. Similarly J² = 〈3〉. Now consider IJ. We have

IJ = 〈2, √−6〉〈3, √−6〉 = 〈6, 2√−6, 3√−6, −6〉.

As √−6 | ±6 in O_K we see that IJ ⊆ 〈√−6〉. But √−6 = 3√−6 + (−1)·2√−6 ∈ IJ and so 〈√−6〉 ⊆ IJ. Hence IJ = 〈√−6〉.

We now show that I and J are prime ideals. Let β = a + b√−6, γ = c + d√−6 ∈ O_K and suppose that β ∉ I and γ ∉ I. Then a and c are odd. Thus βγ = (ac − 6bd) + (ad + bc)√−6. But ac − 6bd is odd so βγ ∉ I. Hence I is prime. Now suppose that β ∉ J and γ ∉ J. Then 3 ∤ a and 3 ∤ c. But then 3 ∤ (ac − 6bd) so that βγ ∉ J. Hence J is prime.
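The ideal and primality properties of I and J can be spot-checked over a finite box of elements. This is an illustration under stated assumptions, not a proof: a + b√−6 is stored as the pair (a, b), the box radius 4 is arbitrary, and the helper names are hypothetical.

```python
def mul(u, v):
    """Multiply (a + b*sqrt(-6)) by (c + d*sqrt(-6))."""
    a, b = u
    c, d = v
    return (a * c - 6 * b * d, a * d + b * c)

def in_I(u):
    """Membership in I: the rational part is even."""
    return u[0] % 2 == 0

def in_J(u):
    """Membership in J: the rational part is divisible by 3."""
    return u[0] % 3 == 0

box = [(a, b) for a in range(-4, 5) for b in range(-4, 5)]

def absorbs(member):
    """Products of ring elements with members of the set stay in the set."""
    return all(member(mul(g, u)) for g in box for u in box if member(u))

def prime_like(member):
    """A product of two non-members is again a non-member."""
    return all(not member(mul(u, v)) for u in box for v in box
               if not member(u) and not member(v))

print(absorbs(in_I), prime_like(in_I))
print(absorbs(in_J), prime_like(in_J))
```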


We have already seen the example

6 = 2 × 3 = (√−6)(−√−6)

of nonunique factorization into irreducibles in O_K. This gives the ideal factorization

〈6〉 = 〈2〉〈3〉 = 〈√−6〉². (∗)

But none of 〈2〉, 〈3〉 and 〈√−6〉 is “irreducible” as an ideal. The factorization (∗) can be rewritten as

(I²)(J²) = (IJ)(IJ)

and is now seen to exhibit two ways of regrouping the nonprincipal prime ideals in the factorization 〈6〉 = I²J² into pairs multiplying to principal ideals.

We need a technical result about ideals in O_K.

Lemma 4.2 Let K be a number field of degree n. Each nonzero ideal of O_K is a free abelian group of rank n under the operation of addition.

Proof Let I be a nonzero ideal of O_K. Let β_1, β_2, …, β_n form an integral basis of O_K and let γ be a nonzero element of I. Then it is plain that γβ_1, γβ_2, …, γβ_n form an integral basis of 〈γ〉. Hence 〈γ〉 is free abelian of rank n. Since I is a subgroup of O_K then by Proposition A.3, I is free abelian of rank m where m ≤ n. But 〈γ〉 is a subgroup of I and so the rank of 〈γ〉, that is n, does not exceed m. Hence n ≤ m ≤ n so that m = n. □

Let K be a number field. Each nonzero ideal of O_K has the same rank as an abelian group as O_K. By Proposition A.4 each ideal I has finite index as a subgroup of O_K. We call this index the norm of I, and denote it as N(I). That is, N(I) = |O_K : I|. What this means is that if N(I) = m, then there are γ_1, …, γ_m ∈ O_K which form a system of coset representatives for I in O_K. That is, each β ∈ O_K is congruent to exactly one γ_j modulo I.

Example Let K = Q(√−6) so that O_K = Z[√−6]. Consider the principal ideal 〈1 + √−6〉. Let β = 1 + √−6. Then γ ∈ 〈β〉 if and only if γ/β ∈ Z[√−6]. If γ = a + b√−6 then

γ/β = (a + b√−6)/(1 + √−6) = (a + b√−6)(1 − √−6)/((1 + √−6)(1 − √−6)) = (a + 6b)/7 + ((b − a)/7)√−6.

Thus γ ∈ 〈β〉 if and only if a + 6b ≡ 0 (mod 7) and b − a ≡ 0 (mod 7). Both conditions are equivalent to a ≡ b (mod 7). Consequently a + b√−6 ≡ c + d√−6 (mod 〈β〉) if and only if b − a ≡ d − c (mod 7). Hence 0, 1, 2, 3, 4, 5, 6 form a system of coset representatives for 〈β〉 in Z[√−6] and so N(〈1 + √−6〉) = 7.
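The count of cosets can be reproduced mechanically from the membership criterion just derived. The following sketch greedily collects pairwise incongruent elements from a small box; the box size and the names `in_ideal` and `reps` are assumptions made for the illustration.

```python
def in_ideal(a, b):
    """Is a + b*sqrt(-6) in <1 + sqrt(-6)>?  Uses the criterion derived in the text:
    both (a + 6b)/7 and (b - a)/7 must be integers."""
    return (a + 6 * b) % 7 == 0 and (b - a) % 7 == 0

# Keep an element only if it is incongruent to every representative found so far.
reps = []
for a in range(-5, 6):
    for b in range(-5, 6):
        if not any(in_ideal(a - c, b - d) for c, d in reps):
            reps.append((a, b))

print(len(reps))   # number of cosets met by the box
print(reps)
```

Because a − b takes every residue modulo 7 within the box, the number of representatives found equals the index of the ideal.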

Example Again let K = Q(√−6), and consider the nonprincipal ideal I = 〈2, √−6〉. We have seen that a + b√−6 ∈ I if and only if a is even. Hence a + b√−6 ≡ c + d√−6 (mod I) if and only if a ≡ c (mod 2). Hence 0 and 1 form a system of coset representatives for I in Z[√−6] and so N(I) = 2. A similar argument gives N(J) = 3 when J = 〈3, √−6〉.

We list some formal properties of the norm. Let I and J be nonzero ideals of O_K.

• N(I) is a positive integer, and N(I) = 1 only when I = 〈1〉 = O_K,

• if I ⊆ J then N(J) | N(I) with equality only when I = J; a fortiori N(J) ≤ N(I) with equality only when I = J.

The latter of these is because |O_K : I| = |O_K : J||J : I|.

So far we have two notions of norm: the norm of an element of K, and the norm of a nonzero ideal of O_K. As one might expect these notions are linked.

Theorem 4.1 Let K be a number field. If γ is a nonzero element of O_K then

N(〈γ〉) = |N(γ)|. (∗)

(Note that on the left of (∗) we have the norm of an ideal, and on the right we have the norm of an element.)

Proof Let β_1, …, β_n form an integral basis of O_K. Then γβ_1, …, γβ_n form an integral basis of 〈γ〉. We can write γβ_j = ∑_{k=1}^{n} a_{jk}β_k where the a_{jk} ∈ Z. By Proposition A.4, N(〈γ〉) = |O_K : 〈γ〉| = |det(A)| where A is the n-by-n matrix with (j, k)-entry a_{jk}. It suffices to show that det(A) = N(γ).

We have the matrix equation γv = Av where v is the column vector (β_1 β_2 ⋯ β_n)^T. Applying the homomorphism σ_k to this equation gives σ_k(γ)v_k = Av_k where v_k = (σ_k(β_1) σ_k(β_2) ⋯ σ_k(β_n))^T. Thus the v_k are eigenvectors of A with eigenvalues σ_k(γ). The n-by-n matrix B with columns the v_k has (j, k)-entry σ_k(β_j). Then BB^T has (j, k)-entry ∑_{i=1}^{n} σ_i(β_j)σ_i(β_k) = T(β_jβ_k) and so det(BB^T) = ∆(β_1, …, β_n) ≠ 0. Hence B is nonsingular. As the columns of B are eigenvectors of A, B^{−1}AB is a diagonal matrix with entries σ_j(γ) and so det(A) = det(B^{−1}AB) = ∏_{j=1}^{n} σ_j(γ) = N(γ). □
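For an imaginary quadratic ring Z[√−d] the matrix A of the proof is 2-by-2 and can be written down by hand. The sketch below is illustrative and makes its own assumptions: the integral basis is taken to be 1, √−d (valid for instance for d = 6), and the function names `mult_matrix` and `det2` are hypothetical.

```python
def mult_matrix(a, b, d=6):
    """Matrix A of multiplication by gamma = a + b*w on the basis (1, w), w = sqrt(-d).

    Row j holds the coordinates of gamma * (j-th basis element):
      gamma * 1 = a + b*w
      gamma * w = -d*b + a*w
    """
    return [[a, b], [-d * b, a]]

def det2(m):
    """Determinant of a 2-by-2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = mult_matrix(1, 1)     # gamma = 1 + sqrt(-6)
print(A, det2(A))         # the determinant equals a^2 + 6*b^2 = N(gamma)
```

For γ = 1 + √−6 the determinant is 7, matching the coset count N(〈1 + √−6〉) = 7 from the earlier example.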

A similar argument shows that N(〈γ〉I) = |N(γ)|N(I). Later we shall show that N(IJ) = N(I)N(J) in general, but our proof will be very indirect.


Our aim is to show that each nontrivial ideal of O_K can be uniquely represented as a product of prime ideals. We need many preliminary results alas. By definition if P is a prime ideal, β, γ ∈ O_K and βγ ∈ P, then either β ∈ P or γ ∈ P. Equivalently if 〈β〉〈γ〉 = 〈βγ〉 ⊆ P then either 〈β〉 ⊆ P or 〈γ〉 ⊆ P. This can be extended to nonprincipal ideals.

Lemma 4.3 Let K be a number field and let P be a prime ideal of O_K. If I and J are ideals of O_K and IJ ⊆ P then either I ⊆ P or J ⊆ P.

More generally if I_1, …, I_m are ideals and I_1 ⋯ I_m ⊆ P then I_k ⊆ P for some k.

Proof Suppose, for a contradiction, that IJ ⊆ P but I ⊄ P and J ⊄ P. Then there exist β ∈ I, γ ∈ J with β ∉ P and γ ∉ P. Then βγ ∈ IJ, but βγ ∉ P since P is prime, contradicting the hypothesis IJ ⊆ P.

The case of an m-term product I_1 ⋯ I_m now follows by induction. □

Primality is also equivalent to maximality. An ideal I of O_K is maximal if I is nontrivial but the only ideals J of O_K with I ⊆ J are J = I and J = O_K.

Lemma 4.4 Let K be a number field. An ideal I of O_K is prime if and only if it is maximal.

Proof First suppose that I is maximal. Let β, γ ∈ O_K with βγ ∈ I and β ∉ I. To show that I is prime it suffices to show that γ ∈ I. Let J = I + 〈β〉. Then J is an ideal and I ⊆ J, but I ≠ J since β ∈ J and β ∉ I. By maximality of I, J = O_K. Hence 1 ∈ J so 1 = η + δβ where η ∈ I and δ ∈ O_K. Then 1 ≡ δβ (mod I). Consequently, γ = 1γ ≡ δβγ ≡ 0 (mod I), as βγ ∈ I. We conclude that γ ∈ I and that I is prime.

Conversely suppose that I is prime. Suppose that J is an ideal of O_K with I ⊆ J and I ≠ J. We need to show that J = O_K, or equivalently, that 1 ∈ J. Let β ∈ J and β ∉ I. Then J ⊇ I + 〈β〉 so all we need to do is to show that 1 ∈ I + 〈β〉. The ideal I has finite index, m say, in O_K. Let γ_1, …, γ_m be coset representatives for I in O_K. That is to say that each element of O_K is congruent modulo I to exactly one γ_j. In particular γ_j ≡ γ_k (mod I) if and only if j = k. If βγ_j ≡ βγ_k (mod I) then β(γ_j − γ_k) ∈ I and as I is prime and β ∉ I then γ_j − γ_k ∈ I and so j = k. The numbers βγ_1, …, βγ_m lie in distinct cosets of I, and so they represent all cosets. In particular 1 ≡ βγ_j (mod I) for some j, and so 1 = η + γ_jβ for some η ∈ I. Thus 1 ∈ I + 〈β〉 and I is maximal. □

As an immediate consequence, if P and Q are prime ideals of O_K and P ⊆ Q then P = Q due to the maximality of P.

Due to maximality being the same as primality, every nontrivial ideal is contained in a prime ideal.

Lemma 4.5 Let K be a number field, and let I be a nontrivial ideal of O_K. Then there is a prime ideal P of O_K with I ⊆ P.

Proof Consider the nontrivial ideals J of O_K with I ⊆ J. There is certainly at least one, namely I itself. Take one, P, with least possible norm. Then P is maximal, for if P ⊆ J_1 with J_1 ≠ P an ideal of O_K, then N(J_1) < N(P) and so J_1 = O_K. By Lemma 4.4, P is prime. □

We would like to show that each nontrivial ideal is a product of prime ideals. We cannot do so yet, but we can prove a first approximation to this result.

Lemma 4.6 Let K be a number field, and let I be a nontrivial ideal of O_K. Then I ⊇ P_1P_2 ⋯ P_m where the P_j are prime ideals of O_K.

Proof We use induction on N(I). If I is prime, then we can take m = 1 and P_1 = I. If I is not prime, then there exist β, γ ∈ O_K with β ∉ I, γ ∉ I but βγ ∈ I. Let J_1 = 〈β〉 + I and J_2 = 〈γ〉 + I. Then I ⊆ J_1 and I ⊆ J_2, but I ≠ J_1 and I ≠ J_2. Hence N(J_1) < N(I) and N(J_2) < N(I). But J_1J_2 = 〈βγ〉 + βI + γI + I² ⊆ I as βγ ∈ I. By the inductive hypothesis, J_1 ⊇ P_1 ⋯ P_r and J_2 ⊇ Q_1 ⋯ Q_s where the P_j and Q_k are prime. Hence I ⊇ J_1J_2 ⊇ P_1 ⋯ P_rQ_1 ⋯ Q_s as required. □

As a technical convenience we extend the notion of ideals in O_K to that of fractional ideal. It will turn out that the set of fractional ideals forms a group under multiplication, which the set of ideals clearly does not.

A fractional ideal of K is a set of the form βI where β is a nonzero element of K and I is a nonzero ideal of O_K. Note that we do not assume that β ∈ O_K. In particular βO_K = 〈β〉 is a fractional ideal of K for all nonzero β ∈ K. We call such a fractional ideal principal. If all ideals of O_K are principal, for instance if K = Q, then so are all fractional ideals, for the fractional ideal βI = 〈βγ〉 if I = 〈γ〉. We define the sum and product of fractional ideals in the same way as for ideals. In particular if β ∈ K, β ≠ 0 then 〈β〉〈1/β〉 = 〈1〉 = O_K, so that principal fractional ideals are invertible. We shall show that all fractional ideals are invertible.

We start with an alternative characterization of fractional ideals.

Lemma 4.7 Let K be a number field. Then I is a fractional ideal of K if and only if

• I is a nonzero subgroup of K under addition,

• if β ∈ I and γ ∈ O_K then γβ ∈ I, and

• there is a nonzero η ∈ K such that β/η ∈ O_K for each β ∈ I.

Proof If I = ηJ is a fractional ideal of K, with η ∈ K and J an ideal of O_K, then the three properties follow with the same value of η.

Conversely suppose the three properties hold. Then J = η^{−1}I = {β/η : β ∈ I} is a nonzero ideal of O_K and so I = ηJ is a fractional ideal. □

We shall show that all fractional ideals are invertible, that is, given a fractional ideal I, there is a fractional ideal J with IJ = 〈1〉. It is easy to write down a candidate for the inverse of a fractional ideal I; define I* = {β ∈ K : βI ⊆ O_K}. It is clear that I* is an additive subgroup of K and is nonzero since it contains 1/η whenever I = ηJ with J an ideal of O_K. Also it is clear that if β ∈ I* and γ ∈ O_K then βγ ∈ I*. If δ is a nonzero element of I then δI* ⊆ O_K, so the third condition of Lemma 4.7 holds with η = 1/δ. Thus I* is a fractional ideal of K. Also II* ⊆ O_K so that II* is an ideal of O_K. The hard part is to show that II* = O_K.

We first prove the invertibility for prime ideals. Let P be a prime ideal of O_K. Then P* ⊇ O_K since βP ⊆ P for all β ∈ O_K. Thus PP* ⊇ PO_K = P. But PP* ⊆ O_K. By the maximality of P, either PP* = O_K (as we want) or PP* = P. We dispose of the possibility that PP* = P in two stages: first we show that PP* = P implies that P* = O_K, then we show that P* ≠ O_K.

Lemma 4.8 Let K be a number field and let I be a nonzero ideal of O_K. If γI ⊆ I for some γ ∈ K, then γ ∈ O_K.

Proof By Lemma 4.2, I is a free abelian group, so let β_1, …, β_n form an integral basis of I. Then γβ_j ∈ I for all j, so γβ_j = ∑_{k=1}^{n} a_{jk}β_k where the a_{jk} ∈ Z. Thus γv = Av where v is the column vector with entries the β_j and A is the matrix with entries the a_{jk}. Thus γ is an eigenvalue of A which is a matrix with integer entries. Thus γ is an algebraic integer and so γ ∈ K ∩ B = O_K. □

We now show that prime ideals are invertible.

Proposition 4.2 Let K be a number field and let P be a prime ideal of O_K. Then there is a fractional ideal J of K with PJ = 〈1〉.

Proof We let P* = {β ∈ K : βP ⊆ O_K}. Then P* is a fractional ideal of K, O_K ⊆ P* and P ⊆ PP* ⊆ O_K. By the maximality of the prime ideal P, either PP* = P or PP* = O_K. We show that the latter is true, so to obtain a contradiction, suppose that PP* = P.

Then γ ∈ P* implies that γP ⊆ P and so γ ∈ O_K by Lemma 4.8. Hence P* ⊆ O_K and we conclude that P* = O_K. To obtain the desired contradiction, it suffices to find an element in P* but not in O_K.

Let β be a nonzero element of P. Then by Lemma 4.6, 〈β〉 contains a product P_1P_2 ⋯ P_r of prime ideals. Choose such a product with fewest possible factors. Then P ⊇ 〈β〉 ⊇ P_1P_2 ⋯ P_r and so P ⊇ P_j for some j by Lemma 4.3. We shall assume, without loss of generality, that P ⊇ P_1. Then by maximality of the prime ideal P_1, P = P_1. Thus 〈β〉 ⊇ PI where I = P_2 ⋯ P_r. As r was chosen to be minimal then 〈β〉 ⊉ I. Thus there exists γ ∈ I but γ ∉ 〈β〉. Let δ = γ/β. Then γP ⊆ PI ⊆ 〈β〉 and so δP = β^{−1}γP ⊆ O_K. Hence δ ∈ P*. But as γ ∉ 〈β〉 then δ ∉ O_K.

The assumption that PP* = P, which forces P* = O_K, has led to a contradiction. We cannot then have PP* = P and we conclude that PP* = O_K = 〈1〉 as required. □

The inverse of a prime ideal P is uniquely determined, for if PJ = O_K then J = JPP* = P*. We can now show that every nontrivial ideal is a product of prime ideals.

Theorem 4.2 Let K be a number field, and suppose that I is a nontrivial ideal of O_K. Then I = P_1P_2 ⋯ P_m where the P_j are prime ideals.

Proof We use induction on N(I). There is nothing to prove when I is prime so assume that it is not. By Lemma 4.5 I ⊆ P for some prime ideal P. Let P^{−1} be the inverse of P as a fractional ideal. As P ⊆ O_K then O_K = PP^{−1} ⊆ O_KP^{−1} = P^{−1}. Let J = IP^{−1}. As I ⊆ P then J ⊆ PP^{−1} = O_K and J is an ideal of O_K. Also I = PJ. Thus J is a proper ideal of O_K.

We know P^{−1} ⊇ O_K but P^{−1} ⊄ O_K for otherwise P^{−1} would be O_K which is not an inverse of P. Thus J = IP^{−1} ⊇ IO_K = I. If we had J = I then γI ⊆ I for all γ ∈ P^{−1} and so P^{−1} ⊆ O_K by Lemma 4.8. This contradicts P^{−1} ⊄ O_K. Hence I ⊆ J but I ≠ J and so N(J) < N(I). By the inductive hypothesis J is a product of primes, and so I = PJ is also. □

We can conclude that every fractional ideal is invertible.

Proposition 4.3 Let K be a number field, and let I be a fractional ideal of K. Then I is invertible.

Proof Write I = βJ where β ≠ 0 and J is an ideal of O_K. By Theorem 4.2 J = P_1P_2 ⋯ P_r where the P_j are prime, and by Proposition 4.2, each P_j is invertible with inverse P_j^{−1} say. Then the fractional ideal β^{−1}P_1^{−1}P_2^{−1} ⋯ P_r^{−1} is an inverse of I. □


It is now plain that the fractional ideals of K form a group under multiplication and the principal fractional ideals form a subgroup. One useful consequence is that ideals satisfy the maxim, “to contain is to divide”.

Proposition 4.4 Let K be a number field, and let I_1, I_2 be nonzero ideals of O_K. Then I_1 ⊇ I_2 if and only if there is an ideal J of O_K with I_2 = I_1J.

Proof If I_2 = I_1J for an ideal J then I_2 ⊆ I_1. Conversely suppose that I_1 ⊇ I_2. Then O_K = I_1I_1^{−1} ⊇ I_2I_1^{−1} = J say. Then J is an ideal of O_K and I_1J = I_1I_2I_1^{−1} = I_2. □

After all this hard work it is now easy to conclude that factorizations into prime ideals are unique.

Theorem 4.3 Let K be a number field, and suppose that I is a nontrivial ideal of O_K. If

I = P_1P_2 ⋯ P_r = Q_1Q_2 ⋯ Q_s (∗)

where the P_j and Q_k are prime ideals of O_K, then r = s and the Q_k can be reordered so that P_j = Q_j for each j.

Proof We use induction on r. Certainly P_1 ⊇ I = Q_1Q_2 ⋯ Q_s. By Lemma 4.3, P_1 ⊇ Q_k for some k. We reorder the Q_j so that P_1 ⊇ Q_1. By maximality of Q_1 then P_1 = Q_1. Multiplying (∗) by P_1^{−1} gives

P_2 ⋯ P_r = Q_2 ⋯ Q_s

and the result now follows from the inductive hypothesis. □

We can also repair an earlier omission; we can show that the ideal normis multiplicative.

Proposition 4.5 Let K be a number field, and let I and J be nonzero idealsof OK. Then N(IJ) = N(I)N(J).

Proof If I = OK there is nothing to prove, so we can write I as a productof prime ideals. Using induction on the number of prime ideals it plainlysuffices to prove the result in the special case where I = P is a prime ideal.

By definition N(P ) = |OK : P |, N(J) = |OK : J | and N(PJ) = |OK :PJ |. From the transitivity of index we have

N(PJ) = |OK : PJ | = |OK : J ||J : PJ | = |J : PJ |N(J).

It suffices to show that |J : PJ | = N(P ). Certainly J ⊇ PJ and J 6= PJ(why?). Let β ∈ J with β /∈ PJ . Then J ⊇ 〈β〉 + PJ ⊇ PJ and so

34

J = 〈β〉 + PJ (why?). Let γ1, . . . , γm be a system of coset representativesfor P in OK , so that m = N(P ). We shall show that βγ1, . . . , βγm form asystem of coset representatives for PJ in J .

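If $\delta \in J$ then $\delta = \xi\beta + \eta$ with $\xi \in \mathcal{O}_K$ and $\eta \in PJ$ as $J = \langle\beta\rangle + PJ$. Now $\xi - \gamma_k \in P$ for some $k$, and so $\beta(\xi - \gamma_k) \in PJ$. Hence $\delta \equiv \beta\gamma_k \pmod{PJ}$ so that each coset of $PJ$ in $J$ is represented by some $\beta\gamma_k$.

We need to show that the $\beta\gamma_k$ represent distinct cosets of $PJ$. Suppose that $\beta\gamma_i \equiv \beta\gamma_k \pmod{PJ}$ with $i \ne k$. Then $\beta(\gamma_i - \gamma_k) \in PJ$. Let $\delta = \gamma_i - \gamma_k$. Then $\delta \notin P$, and as $P$ is maximal $\langle\delta\rangle + P = \mathcal{O}_K$. There exists $\lambda \in \mathcal{O}_K$ with $\lambda\delta \equiv 1 \pmod P$ and so $(\lambda\delta - 1)\beta \in PJ$. But $\delta\beta \in PJ$ and so $\beta \in PJ$ which is a contradiction.

Since $\beta\gamma_1, \ldots, \beta\gamma_m$ form a system of coset representatives for $PJ$ in $J$ with $m = N(P)$ then the index $|J : PJ| = m = N(P)$ and this completes the proof. □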

It is important to know how to find prime ideals in $\mathcal{O}_K$. The following lemma shows they are all obtained from factorization of ordinary prime numbers.

Lemma 4.9 Let $K$ be a number field, and let $P$ be a prime ideal of $\mathcal{O}_K$. Then $P$ occurs in the ideal factorization of $\langle p\rangle$ for a unique prime number $p$. Also $N(P)$ is a power of $p$.

Proof By Proposition 4.4, $P$ occurs in the prime factorization of $\langle p\rangle$ if and only if $P \supseteq \langle p\rangle$ which occurs if and only if $p \in P$. Let $P$ have norm $m$. Then the index of $P$ as a subgroup of $\mathcal{O}_K$ is $m$. Consequently $m\beta \in P$ for all $\beta \in \mathcal{O}_K$; in particular $m \in P$. Write $m = p_1 p_2 \cdots p_r$ with the $p_j$ prime. Then $p_j \in P$ for some $j$ since $P$ is a prime ideal. Write $p_j = p$.

Since $P \supseteq \langle p\rangle$, $N(P)$ is a factor of $N(\langle p\rangle) = |N(p)| = p^n$ where $n$ is the degree of $K$. Hence $N(P) = p^k$ for some $k$ with $1 \le k \le n$. This also shows that the prime $p$ is uniquely determined. □

One corollary of this is the fact that there are only a finite number of ideals of a given norm (one can also prove this directly from the definition of norm).

Lemma 4.10 Let $K$ be a number field and $m$ a positive integer. There are only finitely many ideals $I$ of $\mathcal{O}_K$ with $N(I) = m$.

Proof Write $m = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}$ as a product of primes. Then $I$ is a product of at most $a_1$ prime ideals dividing $\langle p_1\rangle$, at most $a_2$ prime ideals dividing $\langle p_2\rangle$, and so on. There are only finitely many ways of choosing these prime ideals, and consequently only finitely many possibilities for $I$. □


Hence we can determine all the prime ideals of $\mathcal{O}_K$ by resolving each $\langle p\rangle$ into its prime ideal factors.

When $\mathcal{O}_K = \mathbb{Z}[\alpha]$ for some $\alpha$, that is when $1, \alpha, \alpha^2, \ldots, \alpha^{n-1}$ forms an integral basis of $\mathcal{O}_K$, then the prime ideal factorization of $\langle p\rangle$ can be computed from the minimum polynomial of $\alpha$. In general we cannot always find an integral basis of this form for a given $K$. But in special cases, for instance the important case of quadratic fields, we can.

We will need to factorize polynomials modulo $p$. Recall that for a prime number $p$, the set $\mathbb{F}_p = \{\bar 0, \bar 1, \bar 2, \ldots, \overline{p-1}\}$ with addition and multiplication modulo $p$ forms a field. Here we have written elements of $\mathbb{F}_p$ in the form $\bar a$ to distinguish them from integers, but in practice we are usually sloppy and write $a$ both for an integer and the corresponding element of $\mathbb{F}_p$. If $f \in \mathbb{Z}[X]$ is a polynomial with integer coefficients then $\bar f \in \mathbb{F}_p[X]$ denotes its reduction modulo $p$, that is

$$\overline{a_0 + a_1 X + a_2 X^2 + \cdots + a_m X^m} = \bar a_0 + \bar a_1 X + \bar a_2 X^2 + \cdots + \bar a_m X^m.$$

When $\mathcal{O}_K = \mathbb{Z}[\alpha]$ we can classify all ideals containing $\langle p\rangle$.

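Proposition 4.6 Let $K$ be a number field of degree $n$, and suppose that there is $\alpha \in \mathcal{O}_K$ with $\mathcal{O}_K = \mathbb{Z}[\alpha]$. Let $f$ be the minimum polynomial of $\alpha$, and let $p$ be a prime number. Then each ideal of $\mathcal{O}_K$ which contains $\langle p\rangle$ has the form

$$I_g = \langle p, g(\alpha)\rangle$$

where $g \in \mathbb{Z}[X]$ is monic and $\bar g \mid \bar f$ in $\mathbb{F}_p[X]$. Also $I_{g_1} = I_{g_2}$ if and only if $\bar g_1 = \bar g_2$ in $\mathbb{F}_p[X]$.

The norm of $I_g$ is $p^d$ where $d = \deg(g)$, and $I_{g_1} \supseteq I_{g_2}$ if and only if $\bar g_1 \mid \bar g_2$ in $\mathbb{F}_p[X]$.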

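Proof Suppose that $I \supseteq \langle p\rangle$. Since $f(\alpha) = 0 \in I$, there are certainly monic polynomials $g \in \mathbb{Z}[X]$ with $g(\alpha) \in I$. Fix such a polynomial $g$ of least possible degree. Certainly $I \supseteq I_g = \langle p, g(\alpha)\rangle$. I claim that, for $h \in \mathbb{Z}[X]$, $h(\alpha) \in I$ if and only if $\bar g \mid \bar h$ in $\mathbb{F}_p[X]$. Let $d = \deg(g) = \deg(\bar g)$. Suppose first that $h(\alpha) \in I$ where $\bar h$ is nonzero with $\deg(\bar h) < d$. Then for some integer $a$ we have $a\bar h$ monic, so $ah = h_1 + ph_2$ where $h_1 \in \mathbb{Z}[X]$ is monic of degree less than $d$ and $h_2 \in \mathbb{Z}[X]$. Thus $h_1(\alpha) = ah(\alpha) - ph_2(\alpha) \in I$. This contradicts the definition of $g$. In general suppose $\bar g \nmid \bar h$ and $h(\alpha) \in I$. Then $\bar h - \bar u\bar g$ is nonzero and has degree less than $d$ for some $u \in \mathbb{Z}[X]$. Then $v(\alpha) \in I$ where $v = h - ug$, and $\bar v$ is nonzero and has degree less than $d$, which is false. Conversely, if $\bar g \mid \bar h$ then $h = ug + pv$ with $u, v \in \mathbb{Z}[X]$, and so $h(\alpha) \in I_g \subseteq I$. This proves the claim; in particular $\bar g \mid \bar f$ as $f(\alpha) = 0 \in I$. Finally, as $\mathcal{O}_K = \mathbb{Z}[\alpha]$, each element of $I$ has the form $h(\alpha)$ with $h \in \mathbb{Z}[X]$; then $\bar g \mid \bar h$ and so $h(\alpha) \in I_g$. Hence $I = I_g$.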

The $p^d$ numbers $a_0 + a_1\alpha + a_2\alpha^2 + \cdots + a_{d-1}\alpha^{d-1}$ where $0 \le a_j < p$ form a system of coset representatives for $I_g$ in $\mathcal{O}_K$, since each $\bar h \in \mathbb{F}_p[X]$ is congruent to a unique polynomial of degree less than $d$ modulo $\bar g$. Hence $N(I_g) = p^d$. Since $g_2(\alpha) \in I_{g_1}$ if and only if $\bar g_1 \mid \bar g_2$, it follows that $I_{g_1} \supseteq I_{g_2}$ if and only if $\bar g_1 \mid \bar g_2$. In particular, $I_{g_1} = I_{g_2}$ if and only if $\bar g_1 \mid \bar g_2$ and $\bar g_2 \mid \bar g_1$, that is if and only if $\bar g_1 = \bar g_2$. □

Using the above notation we get that when $g = f$, $I_f = \langle p, 0\rangle = \langle p\rangle$ and when $g = 1$ then $I_1 = \langle p, 1\rangle = \mathcal{O}_K$. Clearly the maximal (prime) ideals containing $p$ correspond to the irreducible factors of $\bar f$.

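Theorem 4.4 Let $K$ be a number field of degree $n$, and suppose that there is $\alpha \in \mathcal{O}_K$ with $\mathcal{O}_K = \mathbb{Z}[\alpha]$. Let $f$ be the minimum polynomial of $\alpha$, and let $p$ be a prime number. Write

$$\bar f = \bar g_1^{a_1} \bar g_2^{a_2} \cdots \bar g_r^{a_r}$$

where the $\bar g_j$ are the distinct monic irreducible factors of $\bar f$ in $\mathbb{F}_p[X]$. Then the prime ideal factorization of $\langle p\rangle$ in $\mathcal{O}_K$ is

$$\langle p\rangle = P_1^{a_1} P_2^{a_2} \cdots P_r^{a_r}$$

where $P_j = \langle p, g_j(\alpha)\rangle$.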

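Proof By Proposition 4.6, if $Q$ is an ideal of $\mathcal{O}_K$ and $Q \supseteq P_j$, then $Q = \langle p, h(\alpha)\rangle$ where $\bar h \mid \bar g_j$. Thus $\bar h = 1$ or $\bar h = \bar g_j$ so that $Q = \mathcal{O}_K$ or $Q = P_j$. Hence $P_j$ is maximal, and so prime.

The norm of the ideal $P_1^{a_1} P_2^{a_2} \cdots P_r^{a_r}$ is

$$N(P_1)^{a_1} N(P_2)^{a_2} \cdots N(P_r)^{a_r} = p^{d_1 a_1 + d_2 a_2 + \cdots + d_r a_r}$$

where $d_j = \deg(g_j)$. Hence

$$N(P_1^{a_1} P_2^{a_2} \cdots P_r^{a_r}) = p^{\deg(f)} = p^n = N(\langle p\rangle).$$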

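For any polynomials $h_1, h_2 \in \mathbb{Z}[X]$ we have

$$(\langle p\rangle + \langle h_1(\alpha)\rangle)(\langle p\rangle + \langle h_2(\alpha)\rangle) \subseteq \langle p\rangle + \langle h_1 h_2(\alpha)\rangle.$$

Iterating this by induction we get

$$P_1^{a_1} P_2^{a_2} \cdots P_r^{a_r} \subseteq \langle p\rangle + \langle g_1^{a_1} g_2^{a_2} \cdots g_r^{a_r}(\alpha)\rangle = \langle p\rangle + \langle f_1(\alpha)\rangle$$

where $f_1 = g_1^{a_1} g_2^{a_2} \cdots g_r^{a_r}$. Then $f_1 - f = pf_2$ where $f_2 \in \mathbb{Z}[X]$. But $f_1(\alpha) = pf_2(\alpha) \in \langle p\rangle$ so that $P_1^{a_1} P_2^{a_2} \cdots P_r^{a_r} \subseteq \langle p\rangle$. But we have seen these ideals have the same norm, so they are equal. □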
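The recipe of Theorem 4.4 can be tried out on small cases. The following is only a sketch: it assumes $\mathcal{O}_K = \mathbb{Z}[\alpha]$, stores a polynomial as its list of coefficients with the lowest degree first, and factorizes modulo $p$ by naive trial division, which is adequate only for small $p$ and small degree. The function names (`divmod_poly`, `factor_mod_p`, `primes_above`) are illustrative and do not come from the notes.

```python
from itertools import product

def trim(a):
    """Drop trailing zeros; polynomials are coefficient lists, lowest degree first."""
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def divmod_poly(a, b, p):
    """Quotient and remainder of a by a monic polynomial b in F_p[X]."""
    a = [c % p for c in a]
    q = [0] * max(len(a) - len(b) + 1, 1)
    for i in range(len(a) - len(b), -1, -1):
        c = a[i + len(b) - 1]          # current leading coefficient
        q[i] = c
        for j, bj in enumerate(b):     # subtract c * X^i * b
            a[i + j] = (a[i + j] - c * bj) % p
    return trim(q), trim(a)

def factor_mod_p(f, p):
    """Factor a monic f over F_p by trial division: list of (g, exponent)."""
    f = trim([c % p for c in f])
    factors, d = [], 1
    while len(f) - 1 >= 2 * d:
        # A divisor of degree d found here is irreducible, because all
        # factors of smaller degree have already been divided out.
        for tail in product(range(p), repeat=d):
            g = list(tail) + [1]
            q, r = divmod_poly(f, g, p)
            e = 0
            while not r:
                f, e = q, e + 1
                q, r = divmod_poly(f, g, p)
            if e:
                factors.append((g, e))
        d += 1
    if len(f) > 1:                     # whatever is left is irreducible
        factors.append((f, 1))
    return factors

def primes_above(f, p):
    """Assuming O_K = Z[alpha]: (g_j, a_j, N(P_j)) for each P_j = <p, g_j(alpha)>."""
    return [(g, e, p ** (len(g) - 1)) for g, e in factor_mod_p(f, p)]

# K = Q(i), f = X^2 + 1: 5 splits, 2 ramifies, 3 is inert.
print(primes_above([1, 0, 1], 5))
print(primes_above([1, 0, 1], 2))
print(primes_above([1, 0, 1], 3))
```

Each returned triple lists the coefficients of $g_j$, the exponent $a_j$ and the norm $p^{\deg g_j}$ of $P_j = \langle p, g_j(\alpha)\rangle$.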


In fact this result is true even when $\mathcal{O}_K \ne \mathbb{Z}[\alpha]$ as long as $p \nmid |\mathcal{O}_K : \mathbb{Z}[\alpha]|$. We shall not prove this generalization.

Example Let $K = \mathbb{Q}(\sqrt{m})$ be a quadratic field where $m$ is a squarefree integer with $m \not\equiv 1 \pmod 4$. Then $\mathcal{O}_K = \mathbb{Z}[\alpha]$ where $\alpha = \sqrt{m}$ has minimum polynomial $X^2 - m$. To determine the prime ideal factorization of $\langle p\rangle$, where $p$ is a prime number, in $\mathcal{O}_K$, we must factorize $X^2 - m$ in $\mathbb{F}_p[X]$. There are three possibilities:

1. $X^2 - m$ is irreducible over $\mathbb{F}_p$. Then $\langle p\rangle$ is a prime ideal of $\mathcal{O}_K$. This occurs when the congruence $x^2 \equiv m \pmod p$ is insoluble. This can only happen when $p$ is odd. This condition is described by saying that $m$ is a quadratic nonresidue of $p$. In this case we say that $p$ is inert in $K$.

2. $X^2 - m$ splits into two distinct factors over $\mathbb{F}_p$. Then $X^2 - m \equiv (X - a)(X + a) \pmod p$ with $a \not\equiv -a \pmod p$. This means that $p$ is odd, $a^2 \equiv m \pmod p$ and $p \nmid m$. This condition is described by saying that $m$ is a quadratic residue of $p$. In this case $\langle p\rangle = P_1 P_2$ where $P_1 = \langle p, \sqrt{m} + a\rangle$ and $P_2 = \langle p, \sqrt{m} - a\rangle$. Here $P_1 \ne P_2$ and both $P_1$ and $P_2$ have norm $p$. In this case we say that $p$ splits in $K$.

3. $X^2 - m$ splits into two equal factors over $\mathbb{F}_p$. This happens only when $p = 2$ or when $p \mid m$. When $p = 2$ then $X^2 - m \equiv X^2$ or $(X + 1)^2 \pmod 2$ according to the parity of $m$. When $p \mid m$ then $X^2 - m \equiv X^2 \pmod p$. Then $\langle p\rangle = P^2$ where $P = \langle p, \sqrt{m}\rangle$, unless $p = 2$ and $m$ is odd when $P = \langle 2, \sqrt{m} + 1\rangle$. In any case $P$ has norm $p$. In this case we say that $p$ ramifies in $K$.

Note that $p$ ramifies if and only if $p$ divides the discriminant of $K$. This is true for all number fields $K$, but is too difficult to prove in this course.

It is a good exercise to perform the same calculations when $K = \mathbb{Q}(\sqrt{m})$ has $m \equiv 1 \pmod 4$.
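The three cases of the example for $m \not\equiv 1 \pmod 4$ are easy to test by machine. The sketch below assumes that $m$ is squarefree (it does not check this) and decides solubility of $x^2 \equiv m \pmod p$ by brute force rather than by the Legendre symbol; the name `classify_prime` is made up for illustration.

```python
def classify_prime(m, p):
    """Behaviour of the prime p in Q(sqrt(m)); m squarefree, m not 1 mod 4."""
    if m % 4 == 1:
        raise ValueError("this sketch only covers m not congruent to 1 mod 4")
    # Case 3: X^2 - m is a square modulo p exactly when p = 2 or p divides m.
    if p == 2 or m % p == 0:
        return "ramifies"
    # Cases 1 and 2: look for a solution of a^2 = m (mod p) by brute force.
    roots = [a for a in range(1, p) if (a * a - m) % p == 0]
    return "splits" if roots else "inert"

for p in (2, 3, 5, 7, 11, 13):
    print(p, classify_prime(-1, p))
```

For $m = -1$ the odd primes reported as splitting should be exactly those congruent to 1 modulo 4.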

The theory of ideal factorization allows us to prove various results in elementary number theory.

Example Let $K = \mathbb{Q}(i)$. We know that $K$ is norm-Euclidean so that $\mathcal{O}_K = \mathbb{Z}[i]$ is a Euclidean domain. Then each ideal of $\mathbb{Z}[i]$ is principal.

Let $p$ be a prime number. Then $p$ splits in $K$ whenever $p$ is odd and the congruence $a^2 \equiv -1 \pmod p$ is soluble. By elementary number theory this occurs if and only if $p \equiv 1 \pmod 4$. When $p \equiv 1 \pmod 4$ then $\langle p\rangle = P_1 P_2$ where $P_1 = \langle p, a + i\rangle$ and $P_2 = \langle p, a - i\rangle$. Here $a^2 \equiv -1 \pmod p$. These ideals are principal: $P_1 = \langle\beta\rangle$ and as $N(P_1) = p$ then $N(\beta) = p$. Hence $p = b^2 + c^2$ where $\beta = b + ci$. We have recovered the two-square theorem of elementary number theory: if $p$ is a prime congruent to 1 modulo 4, then $p$ is the sum of two squares of integers.

If for a given $p$ we can find an $a$ with $a^2 \equiv -1 \pmod p$ then by applying the Euclidean algorithm for $\mathbb{Z}[i]$ to $p$ and $a + i$ we can obtain $\beta = \gcd(p, a + i)$ and so integers $b$ and $c$ with $p = b^2 + c^2$.
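This procedure can be sketched in a few lines. The code below assumes $p \equiv 1 \pmod 4$, finds $a$ by brute force (fine for small $p$ only), and represents a Gaussian integer $x + yi$ as the pair `(x, y)`; the division step rounds the exact quotient to the nearest Gaussian integer, which is one standard way of realizing the Euclidean property. The helper names are illustrative.

```python
def sqrt_minus_one(p):
    """Find a with a*a = -1 (mod p) by brute force; assumes p = 1 (mod 4)."""
    for a in range(2, p):
        if (a * a + 1) % p == 0:
            return a
    raise ValueError("no square root of -1 modulo p")

def gauss_divmod(x, y):
    """Nearest-integer division in Z[i]; x and y are (re, im) pairs, y nonzero."""
    (a, b), (c, d) = x, y
    n = c * c + d * d                      # norm of y
    re, im = a * c + b * d, b * c - a * d  # x * conj(y), so that x / y = (re + im*i) / n
    q = ((2 * re + n) // (2 * n), (2 * im + n) // (2 * n))  # round to nearest
    r = (a - (q[0] * c - q[1] * d), b - (q[0] * d + q[1] * c))
    return q, r

def two_squares(p):
    """Return (b, c) with b*b + c*c == p via gcd(p, a + i) in Z[i]."""
    a = sqrt_minus_one(p)
    x, y = (p, 0), (a, 1)
    while y != (0, 0):                     # Euclidean algorithm
        _, r = gauss_divmod(x, y)
        x, y = y, r
    return abs(x[0]), abs(x[1])

for p in (5, 13, 17, 29, 101):
    b, c = two_squares(p)
    print(p, "=", b, "^2 +", c, "^2")
```

The gcd is only determined up to a unit of $\mathbb{Z}[i]$, which is why the sketch returns absolute values.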

We shall briefly consider ideals in $K = \mathbb{Q}(\zeta)$ where $\zeta = \exp(2\pi i/p)$ and $p$ is an odd prime number. We have seen that the minimum polynomial of $\zeta$ is $f(X) = X^{p-1} + X^{p-2} + \cdots + X + 1$ so that $K$ has degree $p - 1$. Certainly $\zeta \in \mathcal{O}_K$. Let $\lambda = \zeta - 1$, so that $\mathbb{Z}[\zeta] = \mathbb{Z}[\lambda]$. Then $N(\zeta) = (-1)^{p-1} N(-\zeta) = f(0) = 1$ and $N(\lambda) = (-1)^{p-1} N(1 - \zeta) = f(1) = p$. Also $f(\lambda + 1) = 0$ so that the minimum polynomial of $\lambda$ is

$$g(X) = f(X + 1) = \frac{(X + 1)^p - 1}{(X + 1) - 1} = X^{p-1} + \sum_{j=1}^{p-1} \binom{p}{j} X^{p-1-j}.$$

Thus

$$\lambda^{p-1} = -\sum_{j=1}^{p-1} \binom{p}{j} \lambda^{p-1-j} = -\sum_{k=0}^{p-2} \binom{p}{k+1} \lambda^{k}.$$

In particular $\lambda^{p-1} = p\beta$ where $\beta \in \mathcal{O}_K$. Comparing norms gives

$$p^{p-1} = N(\lambda^{p-1}) = p^{p-1} N(\beta).$$

Consequently $N(\beta) = 1$ and $\beta = \lambda^{p-1}/p$ is a unit.
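As a sanity check on the formula for $g(X) = f(X + 1)$, one can expand $\sum_{i=0}^{p-1} (X + 1)^i$ directly and compare with the binomial coefficients $\binom{p}{k+1}$. The sketch below does this for a few small primes and also confirms that every coefficient of $g$ other than the leading one is divisible by $p$, with constant term exactly $p$, which is what makes $\lambda^{p-1}/p$ lie in $\mathbb{Z}[\lambda]$. The function names are illustrative.

```python
from math import comb

def g_coefficients(p):
    """Coefficients of g(X) = sum_{i<p} (X+1)^i, lowest degree first."""
    # The coefficient of X^k in (X+1)^i is comb(i, k).
    return [sum(comb(i, k) for i in range(k, p)) for k in range(p)]

def check(p):
    g = g_coefficients(p)
    binomials = [comb(p, k + 1) for k in range(p)]   # claimed closed form
    divisible = all(c % p == 0 for c in g[:-1])      # non-leading coefficients
    return g == binomials and divisible and g[0] == p and g[-1] == 1

print(g_coefficients(5))            # coefficients of X^0, ..., X^4
print([check(p) for p in (3, 5, 7, 11, 13)])
```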

Proposition 4.7 Let $K = \mathbb{Q}(\zeta)$ where $p$ is an odd prime number and $\zeta = \exp(2\pi i/p)$. Then $\Delta(1, \zeta, \zeta^2, \ldots, \zeta^{p-2}) = (-1)^{(p-1)/2} p^{p-2}$.

Proof Call this discriminant $\Delta$. Then $\Delta = (-1)^{(p-1)(p-2)/2} N(f'(\zeta))$ where $f$ is the minimum polynomial of $\zeta$, which has degree $p - 1$. Then $f(X) = (X^p - 1)/(X - 1)$, so

$$f'(X) = \frac{p(X - 1)X^{p-1} - (X^p - 1)}{(X - 1)^2}$$

and so $f'(\zeta) = p\zeta^{p-1}/(\zeta - 1)$. Hence $N(f'(\zeta)) = N(p)N(\zeta)^{p-1}/N(\zeta - 1) = p^{p-1}/p = p^{p-2}$. As $p$ is odd, $(-1)^{(p-1)(p-2)/2} = (-1)^{(p-1)/2}$ and the result follows. □
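Proposition 4.7 can be checked numerically for small $p$. The sketch below assumes the usual description of $\Delta(1, \zeta, \ldots, \zeta^{p-2})$ as the square of the Vandermonde determinant in the conjugates $\zeta, \zeta^2, \ldots, \zeta^{p-1}$, that is, the product of $(\zeta^i - \zeta^j)^2$ over $i < j$. It works in floating point and rounds at the end, so it is an illustration rather than a proof, and is reliable only for small $p$.

```python
import cmath
from itertools import combinations

def cyclotomic_discriminant(p):
    """Numerical value of the discriminant of 1, zeta, ..., zeta^(p-2)."""
    conjugates = [cmath.exp(2j * cmath.pi * k / p) for k in range(1, p)]
    d = 1
    for x, y in combinations(conjugates, 2):
        d *= (x - y) ** 2          # squared Vandermonde product
    return round(d.real)

for p in (3, 5, 7):
    predicted = (-1) ** ((p - 1) // 2) * p ** (p - 2)
    print(p, cyclotomic_discriminant(p), predicted)
```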

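We can now show that the ring of integers of $\mathbb{Q}(\zeta)$ is $\mathbb{Z}[\zeta]$. We have seen $N(\lambda) = p$ and $\lambda^{p-1}/p$ is a unit. Thus $\langle\lambda\rangle$ must be a prime ideal of norm $p$ and $\langle\lambda\rangle^{p-1} = \langle p\rangle$. We thus know the prime factorization of $\langle q\rangle$ when $q = p$.

Theorem 4.5 Let $K = \mathbb{Q}(\zeta)$ where $p$ is an odd prime number and $\zeta = \exp(2\pi i/p)$. Then $\mathcal{O}_K = \mathbb{Z}[\zeta]$.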


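Proof Since the discriminant of $1, \zeta, \zeta^2, \ldots, \zeta^{p-2}$ is, up to sign, a power of $p$, the index $|\mathcal{O}_K : \mathbb{Z}[\zeta]|$ is a power of $p$. If $\mathcal{O}_K \ne \mathbb{Z}[\zeta]$ then there is $\beta \in \mathcal{O}_K$ with $\beta \notin \mathbb{Z}[\zeta]$ but $p\beta \in \mathbb{Z}[\zeta]$. Since $\mathbb{Z}[\zeta] = \mathbb{Z}[\lambda]$, where $\lambda = \zeta - 1$, we can write

$$\beta = \frac{1}{p} \sum_{j=0}^{p-2} b_j \lambda^j$$

where the $b_j$ are integers, not all divisible by $p$. Choose $j$ to be the smallest number such that $p \nmid b_j$. Then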

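$$\frac{1}{p} \sum_{k=0}^{j-1} b_k \lambda^k \in \mathcal{O}_K$$

and so

$$\gamma = \beta - \frac{1}{p} \sum_{k=0}^{j-1} b_k \lambda^k = \frac{1}{p} \sum_{k=j}^{p-2} b_k \lambda^k \in \mathcal{O}_K.$$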

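We infer that

$$\lambda^{p-2-j}\gamma = \frac{1}{p} \sum_{k=j}^{p-2} b_k \lambda^{p-2-j+k} \in \mathcal{O}_K.$$

But we have seen that $\lambda^{p-1}/p \in \mathcal{O}_K$. Then for $k \ge j + 1$ we have $\lambda^{p-2-j+k}/p = (\lambda^{p-1}/p)\lambda^{k-j-1} \in \mathcal{O}_K$. Hence

$$\frac{b_j \lambda^{p-2}}{p} = \lambda^{p-2-j}\gamma - \frac{1}{p} \sum_{k=j+1}^{p-2} b_k \lambda^{p-2-j+k} \in \mathcal{O}_K.$$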

We now consider norms. The norm of $b_j\lambda^{p-2}/p$ is $b_j^{p-1} p^{p-2}/p^{p-1} = b_j^{p-1}/p$. This must be an integer, yet it cannot be as $p \nmid b_j$. This contradiction shows that $\mathcal{O}_K = \mathbb{Z}[\zeta]$. □

Example Let $K = \mathbb{Q}(\zeta)$ where $\zeta = \exp(2\pi i/5)$. Then $\mathcal{O}_K = \mathbb{Z}[\zeta]$ and $\zeta$ has minimum polynomial $f(X) = X^4 + X^3 + X^2 + X + 1$. For each prime number $q$ we aim to factorize the ideal $\langle q\rangle$ by factorizing the polynomial $f$ modulo $q$.

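Consider the case $q = 5$. Then

$$(X - 1)^4 = X^4 - 4X^3 + 6X^2 - 4X + 1 \equiv X^4 + X^3 + X^2 + X + 1 \pmod 5.$$

It follows that $\langle 5\rangle = P_5^4$ where $P_5 = \langle 5, \zeta - 1\rangle$. For $\lambda = \zeta - 1$ we have seen that $\lambda^4 \mid 5$ so that $P_5 = \langle\lambda\rangle$. Hence $\langle 5\rangle$ is the fourth power of the prime ideal $\langle\zeta - 1\rangle$.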


Now suppose that $q \ne 5$. If $f$ is reducible modulo $q$, then either it has a linear factor or a quadratic factor. Let us suppose that $f$ has the linear factor $X - a$ modulo $q$. Then $f(a) \equiv 0 \pmod q$ and so as $f(X)(X - 1) = X^5 - 1$ then $a^5 \equiv 1 \pmod q$. But $a \not\equiv 1 \pmod q$ for $f(1) = 5 \not\equiv 0 \pmod q$. Thus $a$ has order 5 modulo $q$. The powers $a^2$, $a^3$, $a^4$ must also be solutions of $f(x) \equiv 0 \pmod q$, so we have

$$f(X) \equiv (X - a)(X - a^2)(X - a^3)(X - a^4) \pmod q.$$

Such an $a$ exists if and only if $q \equiv 1 \pmod 5$, and we conclude for these primes that $\langle q\rangle$ is a product of four distinct prime ideals, each of norm $q$. For instance, take $q = 11$. Then $a = 3$ satisfies $a^5 \equiv 1 \pmod{11}$ and $a^2 \equiv 9$, $a^3 \equiv 5$, and $a^4 \equiv 4 \pmod{11}$. Thus

$$\langle 11\rangle = \langle 11, \zeta - 3\rangle \langle 11, \zeta - 4\rangle \langle 11, \zeta - 5\rangle \langle 11, \zeta - 9\rangle.$$

If $f$ has no linear factor modulo $q$ then either $f$ is irreducible or $f$ is the product of two quadratics, each irreducible modulo $q$. In the latter case then in fact

$$f(X) \equiv (X^2 + aX + 1)(X^2 + bX + 1) \pmod q \qquad (*)$$

for some $a$ and $b$. I shan't prove this; it is easy if one knows some theory of finite fields, otherwise it's a rather messy calculation. Then $(*)$ holds if and only if $a + b \equiv 1$ and $ab \equiv -1 \pmod q$, that is that $a$ and $b$ are the roots of $Y^2 - Y - 1 \equiv 0 \pmod q$. This equation is soluble modulo $q$ if and only if 5 is a square modulo $q$; then $a$ and $b$ are congruent to $\frac12(1 \pm s)$ modulo $q$, where $s^2 \equiv 5 \pmod q$. By quadratic reciprocity 5 is a square modulo $q$ if and only if $q \equiv \pm 1 \pmod 5$. We have seen that $q \equiv 1 \pmod 5$ if and only if $\langle q\rangle$ is the product of four prime ideals of norm $q$. Thus $\langle q\rangle$ is the product of two prime ideals of norm $q^2$ if and only if $q \equiv -1 \pmod 5$. For example, let $q = 19$. Then $9^2 \equiv 5 \pmod{19}$ and so we can take $a = \frac12(1 + 9) = 5$ and $b = \frac12(1 - 9) = -4$. That is

$$f(X) \equiv (X^2 + 5X + 1)(X^2 - 4X + 1) \pmod{19}$$

and both $X^2 + 5X + 1$ and $X^2 - 4X + 1$ are irreducible modulo 19. Thus

$$\langle 19\rangle = \langle 19, \zeta^2 + 5\zeta + 1\rangle \langle 19, \zeta^2 - 4\zeta + 1\rangle$$

is the factorization of $\langle 19\rangle$ into prime ideals.

In all other cases, that is when $q \equiv \pm 2 \pmod 5$, then $f$ is irreducible modulo $q$ and $\langle q\rangle$ is prime.
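The case analysis of this example can be replayed by brute force. The sketch below looks for roots of $f$ modulo $q$, and failing that for a pair $a, b$ with $a + b \equiv 1$ and $ab \equiv -1 \pmod q$; it relies on the unproved claim above that any factorization into two quadratics has the shape $(*)$. The function name and the returned labels are illustrative.

```python
def factor_type(q):
    """How <q> factorizes in Z[zeta], zeta = exp(2*pi*i/5), for a prime q."""
    if q == 5:
        return ("fourth power of <zeta - 1>", None)
    roots = [a for a in range(q)
             if (a**4 + a**3 + a**2 + a + 1) % q == 0]
    if roots:                       # f splits into four linear factors
        return ("four primes of norm q", roots)
    pairs = [(a, b) for a in range(q) for b in range(a, q)
             if (a + b - 1) % q == 0 and (a * b + 1) % q == 0]
    if pairs:                       # f = (X^2 + aX + 1)(X^2 + bX + 1) mod q
        return ("two primes of norm q^2", pairs[0])
    return ("inert", None)

for q in (2, 3, 5, 7, 11, 19, 29, 31):
    print(q, q % 5, factor_type(q))
```

For $q = 19$ the pair found is $(5, 15)$, which agrees with $a = 5$, $b = -4$ in the text since $15 \equiv -4 \pmod{19}$.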


5 Ideal classes

The set of fractional ideals of a number field $K$ forms an abelian group under multiplication which we shall call $I_K$. The set of princip
