i Nigel Boston University of Wisconsin - Madison THE PROOF OF FERMAT’S LAST THEOREM Spring 2003


  • i

    Nigel Boston

    University of Wisconsin - Madison

    THE PROOF OF

    FERMAT’S LAST

    THEOREM

    Spring 2003

    INTRODUCTION.

    This book will describe the recent proof of Fermat’s Last Theorem by Andrew Wiles, aided by Richard Taylor, for graduate students and faculty with a reasonably broad background in algebra. It is hard to give precise prerequisites, but a first course in graduate algebra, covering basic groups, rings, and fields, together with a passing acquaintance with number rings and varieties should suffice. Algebraic number theory (or arithmetical geometry, as the subject is more commonly called these days) has the habit of taking last year’s major result and making it background taken for granted in this year’s work. Peeling back the layers can lead to a maze of results stretching back over the decades.

    I attended Wiles’ three groundbreaking lectures, in June 1993, at the Isaac Newton Institute in Cambridge, UK. After returning to the US, I attempted to give a seminar on the proof to interested students and faculty at the University of Illinois, Urbana-Champaign. Endeavoring to be complete required several lectures early on regarding the existence of a model over $\mathbb{Q}$ for the modular curve $X_0(N)$ with good reduction at primes not dividing $N$. This work hinged on earlier work of Zariski from the 1950’s. The audience, keen to learn new material, did not appreciate lingering over such details and dwindled rapidly in numbers.

    Since then, I have taught the proof in two courses at UIUC, a two-week summer workshop at UIUC (with the help of Chris Skinner of the University of Michigan), and most recently a course in spring 2003 at the University of Wisconsin - Madison. To avoid getting bogged down as in the above seminar, it is necessary to assume some background. In these cases, references will be provided so that the interested students can fill in details for themselves. The aim of this work is to convey the strong and simple line of logic on which the proof rests. It is certainly well within the ability of most graduate students to appreciate the way the building blocks of the proof go together to give the result, even though those blocks may themselves be hard to penetrate. If anything, this book should serve as an inspiration for students to see why the tools of modern arithmetical geometry are valuable and to seek to learn more about them.

    An interested reader wanting a simple overview of the proof should consult Gouvêa [13], Ribet [25], Rubin and Silverberg [26], or my article [1]. A much more detailed overview of the proof is the one given by Darmon, Diamond, and Taylor [6], and the Boston conference volume [5] contains much useful elaboration on ideas used in the proof. The Séminaire Bourbaki article by Oesterlé and Serre [22] is also very enlightening. Of course, one should not overlook the original proof itself [38], [34].


    CONTENTS.

    Introduction.

    Contents.

    Chapter 1: History and overview.

    Chapter 2: Profinite groups, complete local rings.

    Chapter 3: Infinite Galois groups, internal structure.

    Chapter 4: Galois representations from elliptic curves, modular forms, group schemes.

    Chapter 5: Invariants of Galois representations, semistable representations.

    Chapter 6: Deformations of Galois representations.

    Chapter 7: Introduction to Galois cohomology.

    Chapter 8: Criteria for ring isomorphisms.

    Chapter 9: The universal modular lift.

    Chapter 10: The minimal case.

    Chapter 11: The general case.

    Chapter 12: Putting it together, the final trick.

    1 History and Overview

    It is well known that there are many solutions in integers to $x^2 + y^2 = z^2$, for instance $(3, 4, 5)$ and $(5, 12, 13)$. The Babylonians were aware of the solution $(4961, 6480, 8161)$ as early as around 1500 B.C. Around 1637, Pierre de Fermat wrote a note in the margin of his copy of Diophantus’ Arithmetica stating that $x^n + y^n = z^n$ has no solutions in positive integers if $n > 2$. We will denote this statement, for a given $n$, by $(FLT)_n$. He claimed to have a remarkable proof. There is some doubt about this for various reasons. First, this remark was published without his consent, in fact by his son after his death. Second, in his later correspondence, Fermat discusses the cases $n = 3, 4$ with no reference to this purported proof. It seems likely then that this was an off-the-cuff comment that Fermat simply omitted to erase. Of course $(FLT)_n$ implies $(FLT)_{\alpha n}$, for $\alpha$ any positive integer, and so it suffices to prove $(FLT)_4$ and $(FLT)_\ell$ for each prime number $\ell > 2$.

    1.1 Proof of $(FLT)_4$ by Fermat

    First, we must deal with the equation $x^2 + y^2 = z^2$. We may assume $x$, $y$, and $z$ are positive and relatively prime (since otherwise we may divide out any common factors because the equation is homogeneous), and we see that one of $x$ or $y$ is even (since otherwise $z^2 \equiv 2 \pmod 4$, which is a contradiction). Suppose that $x$ is even. Then

    $$\left(\frac{z-y}{2}\right)\left(\frac{z+y}{2}\right) = \left(\frac{x}{2}\right)^2$$

    with relatively prime factors on the left hand side and a square on the right hand side. Hence

    $$\frac{z-y}{2} = b^2, \qquad \frac{z+y}{2} = a^2,$$

    with $a, b \in \mathbb{Z}^+$. Then $y = a^2 - b^2$, $z = a^2 + b^2$, and $x = 2ab$.

    [Alternatively, if $x^2 + y^2 = z^2$, then $(x + iy)/z$ has norm 1, and so by Hilbert’s Theorem 90,

    $$\frac{x+iy}{z} = \frac{a+ib}{a-ib} = \frac{(a^2-b^2) + i\,2ab}{a^2+b^2},$$

    which yields the same result.]
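The parametrization above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function name `primitive_triples` and the search bound are illustrative choices, and the filter "coprime and of opposite parity" is the standard condition under which $(2ab, a^2-b^2, a^2+b^2)$ is primitive.

```python
from math import gcd

def primitive_triples(bound):
    """Yield primitive triples (x, y, z) = (2ab, a^2 - b^2, a^2 + b^2) with z <= bound."""
    a = 2
    while a * a + 1 <= bound:
        for b in range(1, a):
            # a, b coprime and of opposite parity give the primitive triples with x even.
            if (a - b) % 2 == 1 and gcd(a, b) == 1:
                z = a * a + b * b
                if z <= bound:
                    yield (2 * a * b, a * a - b * b, z)
        a += 1

triples = list(primitive_triples(8200))
```

With this bound the list contains the Babylonian triple in the form $(6480, 4961, 8161)$, which arises from $a = 81$, $b = 40$.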

    Theorem 1.1 $x^4 + y^4 = z^2$ has no solutions with $x, y, z$ all nonzero, relatively prime integers.

    This implies $(FLT)_4$.

    Proof: Suppose there is a solution. As above, we may take $x$ even and $z > 0$, and applying the parametrization of Pythagorean triples to $(x^2)^2 + (y^2)^2 = z^2$ gives

    $$x^2 = 2ab, \qquad y^2 = a^2 - b^2, \qquad z = a^2 + b^2$$

    with $a$ and $b$ relatively prime. Clearly, $b$ is even ($y$ is odd, since $x$ is even), and from $a^2 = y^2 + b^2$ we get

    $$b = 2cd, \qquad y = c^2 - d^2, \qquad a = c^2 + d^2$$

    with $c$ and $d$ relatively prime. Hence

    $$x^2 = 2ab = 4cd(c^2 + d^2)$$

    with $c$, $d$, and $c^2 + d^2$ relatively prime. Then

    $$c = e^2, \qquad d = f^2, \qquad c^2 + d^2 = g^2,$$

    whence $e^4 + f^4 = g^2$.

    Note, however, that $z > a^2 = (g^2)^2 > g$, and so we are done by infinite descent (repeated application produces an infinite sequence of solutions with ever smaller positive integer $z$, a contradiction). QED
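As a sanity check on Theorem 1.1 (not a substitute for the descent argument), one can search a small box for solutions. This is an illustrative Python sketch; the function name and the bound 200 are arbitrary assumptions.

```python
from math import isqrt

def fourth_power_solutions(bound):
    """Return all (x, y, z) with 1 <= x <= y <= bound and x^4 + y^4 = z^2."""
    found = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            s = x**4 + y**4
            z = isqrt(s)  # exact integer square root
            if z * z == s:
                found.append((x, y, z))
    return found

solutions = fourth_power_solutions(200)
```

The search comes back empty, as the theorem predicts for every bound.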

    1.2 Proof of $(FLT)_3$

    The first complete proof of this case was given by Karl Gauss. Leonhard Euler’s proof from 1753 was quite different and at one stage depends on a fact that Euler did not justify (though it would have been within his knowledge to do so). We outline the proof; details may be found in [16], p. 285, or [23], p. 43.

    Gauss’s proof leads to a strategy that succeeds for certain other values of $n$ too. We work in the ring $A = \mathbb{Z}[\zeta] = \{a + b\zeta : a, b \in \mathbb{Z}\}$, where $\zeta$ is a primitive cube root of unity. The key fact here is that $A$ is a PID and hence a UFD. We also repeatedly use the fact that the units of $A$ are precisely $\pm\zeta^i$ ($i = 0, 1, 2$).

    Theorem 1.2 $x^3 + y^3 = uz^3$ has no solutions with $x, y, z \in A$, $u$ a unit in $A$, $xyz \neq 0$.

    This certainly implies $(FLT)_3$.

    Proof: By homogeneity, we may assume that $x, y, z$ are relatively prime. Factoring $x^3 + y^3 = uz^3$ gives

    $$(x + y)(x + \zeta y)(x + \zeta^2 y) = uz^3,$$

    where the gcd of any 2 factors on the left divides $\lambda := 1 - \zeta$. If each gcd is 1, then each factor is a cube up to a unit. In any case, $\lambda$ is “small” in that $|A/(\lambda)| = 3$. In particular, each element of $A$ is either $0, \pm 1$ mod $\lambda$.

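    Lemma 1.3 There are no solutions when $\lambda \nmid xyz$.

    Proof: If $x \equiv 1 \pmod{\lambda}$, say $x = 1 + \lambda\alpha$, then

    $$x^3 - 1 = \lambda^3\alpha(1 + \alpha)(\alpha - \zeta^2) \equiv \lambda^3\alpha(1 + \alpha)(\alpha - 1) \equiv 0 \pmod{\lambda^4},$$

    since $\zeta^2 \equiv 1 \pmod{\lambda}$ and one of $\alpha$, $\alpha + 1$, $\alpha - 1$ is divisible by $\lambda$. Likewise, $x \equiv -1 \pmod{\lambda}$ gives $x^3 \equiv -1 \pmod{\lambda^4}$. Plugging back into the equation, $(\pm 1) + (\pm 1) \equiv \pm u \pmod{\lambda^4}$, which is impossible since none of the 6 units $u$ in $A$ is $0$ or $\pm 2$ mod $\lambda^4$. QED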
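The congruence $x^3 \equiv \pm 1 \pmod{\lambda^4}$ used in this proof can be tested by direct computation in $\mathbb{Z}[\zeta]$. Below is a small Python sketch under stated assumptions: an element $a + b\zeta$ is stored as the pair `(a, b)`, all function names are illustrative, and the search box is arbitrary. It uses $\zeta^2 = -1 - \zeta$ and the identity $\lambda^2 = -3\zeta$, so that $(\lambda^4) = (9)$.

```python
def mul(x, y):
    """Multiply a + b*zeta and c + d*zeta in Z[zeta], using zeta^2 = -1 - zeta."""
    a, b = x
    c, d = y
    return (a * c - b * d, a * d + b * c - b * d)

def cube(x):
    return mul(mul(x, x), x)

def residue_mod_lambda(x):
    """Return x mod lambda as 0, 1 or -1; zeta = 1 mod lambda, so a + b*zeta = a + b."""
    r = (x[0] + x[1]) % 3
    return -1 if r == 2 else r

def cube_is_pm1_mod_lambda4(x):
    """Check that x^3 is congruent to (x mod lambda) modulo lambda^4."""
    s = residue_mod_lambda(x)
    a, b = cube(x)
    # lambda^2 = -3*zeta, so lambda^4 = 9*zeta^2 generates the ideal (9):
    # divisibility by lambda^4 means both coordinates are divisible by 9.
    return (a - s) % 9 == 0 and b % 9 == 0

checked = [
    (a, b)
    for a in range(-10, 11)
    for b in range(-10, 11)
    if residue_mod_lambda((a, b)) != 0
]
all_ok = all(cube_is_pm1_mod_lambda4(x) for x in checked)
```

Every element of the box that is prime to $\lambda$ passes the check, in line with the computation in the proof.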

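    Lemma 1.4 Suppose $\lambda \nmid xy$, $\lambda \mid z$. Then $\lambda^2 \mid z$.

    Proof: Consider again $\pm 1 \pm 1 \equiv uz^3 \pmod{\lambda^4}$. If the left side is 0, then $\lambda^4 \mid z^3$, so $\lambda^2 \mid z$. If the left side is $\pm 2$, then $\lambda \mid 2$, contradicting $|A/(\lambda)| = 3$. QED

    Lemma 1.5 Suppose that $\lambda \nmid xy$, $\lambda^k \,\|\, z$, $k \ge 2$. Then there exists a solution with $\lambda \nmid xy$, $\lambda^{k-1} \,\|\, z$.

    Proof: In this case, the gcd of any 2 factors on the left is $\lambda$. Hence we can assume that

    $$\begin{aligned}
    (1)\quad x + y &= u_1\alpha^3\lambda^t \\
    (2)\quad x + \zeta y &= u_2\beta^3\lambda \\
    (3)\quad x + \zeta^2 y &= u_3\gamma^3\lambda,
    \end{aligned}$$

    where $u_1$, $u_2$, and $u_3$ are units, $t = 3k - 2$, and $\lambda \nmid \alpha, \beta, \gamma$. Since $1 + \zeta + \zeta^2 = 0$, the combination $(1) + \zeta(2) + \zeta^2(3)$ has left side 0, and dividing by $\zeta u_2\lambda$ yields (setting $x_1 = \beta$, $y_1 = \gamma$, and $z_1 = \alpha\lambda^{k-1}$)

    $$x_1^3 + \epsilon_1 y_1^3 = \epsilon_2 z_1^3$$

    with $\epsilon_1, \epsilon_2$ units. Reducing mod $\lambda^2$, we get

    $$\pm 1 \pm \epsilon_1 \equiv 0,$$

    which implies that $\epsilon_1 = \pm 1$. Replacing $y_1$ by $-y_1$ if necessary, we get

    $$x_1^3 + y_1^3 = \epsilon_2 z_1^3.$$

    QED

    Finally, to prove the theorem, if $\lambda \nmid xyz$, we use Lemma 1.3. If $\lambda \nmid xy$ but $\lambda \mid z$, we use Lemmas 1.4 and 1.5. If $\lambda \mid x$ then $\lambda \nmid yz$, hence mod $\lambda^3$

    $$0 \pm 1 \equiv \pm u,$$

    which implies that $u \equiv \pm 1 \pmod{\lambda^3}$, and hence $u = \pm 1$. Rearranging yields

    $$(\pm z)^3 + (-y)^3 = x^3,$$

    a case which has already been treated. QED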

    1.3 Further Efforts at Proof

    Peter Dirichlet and Adrien Legendre proved $(FLT)_5$ around 1825, and Gabriel Lamé proved $(FLT)_7$ around 1839. If we set $\zeta = e^{2\pi i/\ell}$ ($\ell$ prime), and

    $$\mathbb{Z}[\zeta] = \{a_0 + a_1\zeta + \ldots + a_{\ell-2}\zeta^{\ell-2} : a_i \in \mathbb{Z}\},$$

    then there are cases when $\mathbb{Z}[\zeta]$ is not a UFD and the factorization method used above fails. (In fact, $\mathbb{Z}[\zeta]$ is a UFD if and only if $\ell \le 19$.)

    It turns out that the method can be resuscitated under weaker conditions. In 1844 Ernst Kummer began studying the ideal class group of $\mathbb{Q}(\zeta)$, which is a finite group that measures how far $\mathbb{Z}[\zeta]$ is from being a UFD [33]. Between 1847 and 1853, he published some masterful papers, which established almost the best possible result along these lines and were only really bettered by the recent approach detailed below, which began over 100 years later. In these papers, Kummer defined regular primes and proved the following theorem, where $h(\mathbb{Q}(\zeta))$ denotes the order of the ideal class group.

    Definition 1.6 Call a prime $\ell$ regular if $\ell \nmid h(\mathbb{Q}(\zeta))$ (where $\zeta = e^{2\pi i/\ell}$). Otherwise, $\ell$ is called irregular.

    Remark 1.7 The first irregular prime is 37 and there are infinitely many irregular primes. It is not known if there are infinitely many regular primes, but conjecturally this is so.

    Theorem 1.8 (Kummer) (i) $(FLT)_\ell$ holds if $\ell$ is regular.

    (ii) $\ell$ is regular if and only if $\ell$ does not divide the numerator of $B_i$ for any even $2 \le i \le \ell - 3$.

    Here $B_n$ are the Bernoulli numbers defined by

    $$\frac{x}{e^x - 1} = \sum_{n \ge 0} \frac{B_n}{n!}\,x^n.$$

    For instance, the fact that $B_{12} = -\frac{691}{2730}$ shows that 691 is irregular. We shall see the number 691 appearing in many different places.
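Kummer's criterion in Theorem 1.8 (ii) is easy to try out for small primes. The sketch below is illustrative only: it computes Bernoulli numbers with exact rational arithmetic via the standard recurrence $\sum_{k=0}^{m}\binom{m+1}{k}B_k = 0$ for $m \ge 1$ (which matches the generating function above, with $B_1 = -\tfrac12$), and the function names are assumptions made for this example.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] for x/(e^x - 1) = sum B_n x^n / n!."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        # Recurrence: sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1, solved for B_m.
        acc = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-acc / (m + 1))
    return B

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

def irregular_primes(bound):
    """Odd primes l < bound dividing the numerator of some B_i, i even, 2 <= i <= l - 3."""
    B = bernoulli_numbers(bound)
    return [
        l for l in range(3, bound)
        if is_prime(l)
        and any(B[i].numerator % l == 0 for i in range(2, l - 2, 2))
    ]
```

For example, `bernoulli_numbers(12)[12]` is `Fraction(-691, 2730)`, and `irregular_primes(100)` lists the irregular primes below 100, starting with 37.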

    Here the study of FLT is divided into two cases. The first case involves showing that there is no solution with $\ell \nmid xyz$. The idea is to factor $x^\ell + y^\ell = z^\ell$ as

    $$(x + y)(x + \zeta y) \cdots (x + \zeta^{\ell-1} y) = z^\ell,$$

    where $\zeta = e^{2\pi i/\ell}$. The ideals generated by the factors on the left side are pairwise relatively prime by the assumption that $\ell \nmid xyz$ (since $\lambda := 1 - \zeta$ has norm $\ell$; compare the proof of $(FLT)_3$), whence the ideal generated by each factor is the $\ell$th power of an ideal. The class of that ideal has order dividing $\ell$ in the ideal class group of $\mathbb{Q}(\zeta)$, and the regularity assumption then shows that it is principal. We also use that for any unit $u$ in $\mathbb{Z}[\zeta]$, $\zeta^s u$ is real for some $s \in \mathbb{Z}$. See [33] or [16] for more details.

    The second case involves showing that there is no solution to FLT for $\ell \mid xyz$.

    In 1823, Sophie Germain found a simple proof that if $\ell$ is a prime with $2\ell + 1$ a prime then the first case of $(FLT)_\ell$ holds. Arthur Wieferich proved in 1909 that if $\ell$ is a prime with $2^{\ell-1} \not\equiv 1 \pmod{\ell^2}$ then the first case of $(FLT)_\ell$ holds. Examples of $\ell$ that fail this are rare; the only known examples are 1093 and 3511. Moreover, similar criteria are known if $p^{\ell-1} \not\equiv 1 \pmod{\ell^2}$ and $p$ is any prime $\le 89$ [15]. This allows one to prove the first case of $(FLT)_\ell$ for many $\ell$.
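Wieferich's condition is a single modular exponentiation, so the rarity of exceptions is easy to observe. The following Python sketch is illustrative (the helper names and the bound 4000 are arbitrary choices); it lists the primes $\ell$ below the bound with $2^{\ell-1} \equiv 1 \pmod{\ell^2}$.

```python
def primes_below(bound):
    """Sieve of Eratosthenes: all primes p < bound."""
    sieve = [True] * bound
    sieve[0:2] = [False, False]
    for i in range(2, int(bound**0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(range(i * i, bound, i))
    return [i for i, flag in enumerate(sieve) if flag]

def fails_wieferich_criterion(l, base=2):
    """True if base^(l-1) = 1 mod l^2, i.e. the criterion gives no information for l."""
    return pow(base, l - 1, l * l) == 1

exceptions = [l for l in primes_below(4000) if l > 2 and fails_wieferich_criterion(l)]
```

Within this range the only exceptions found are the two primes named in the text. Passing `base=3`, say, tests the analogous condition for another small prime $p$.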

    Before Andrew Wiles, $(FLT)_\ell$ was known for all primes $2 < \ell < 4 \times 10^6$ [3]; the method was to check that the conjecture of Vandiver (actually originating with Kummer and a refinement of his method) that $\ell \nmid h(\mathbb{Q}(\zeta + \zeta^{-1}))$ holds for these primes. See [36]. The first case of $(FLT)_\ell$ was known for all primes $2 < \ell < 8.7 \times 10^{20}$.

    1.4 Modern Methods of Proof

    In 1916, Srinivasa Ramanujan proved the following. Let

    $$\Delta = q\prod_{n=1}^{\infty}(1 - q^n)^{24} = \sum_{n=1}^{\infty}\tau(n)q^n.$$

    Then

    $$\tau(n) \equiv \sigma_{11}(n) \pmod{691},$$

    where $\sigma_k(n) = \sum_{d \mid n} d^k$.

    $\Delta$ is a modular form; this means that, if we set $q = e^{2\pi i z}$, $\Delta$ satisfies (among other conditions)

    $$\Delta\!\left(\frac{az+b}{cz+d}\right) = (cz+d)^k\,\Delta(z)$$

    for all $z$ in the upper half-plane $\mathrm{Im}(z) > 0$ and all $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$ with, in this case, (“weight”) $k = 12$ and $\Gamma = SL_2(\mathbb{Z})$ (in general, we define a “level” $N$ by having $\Gamma$ defined as the group of matrices in $SL_2(\mathbb{Z})$ such that $N \mid c$; here $N = 1$). For instance, setting $a, b, d = 1$, $c = 0$, $\Delta(z + 1) = \Delta(z)$, and this is why $\Delta$ can be written as a Fourier series in $q = e^{2\pi i z}$.
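Ramanujan's congruence can be observed directly from the product expansion. The Python sketch below is an illustration, not the method of the notes: it expands $\prod(1-q^n)^{24}$ as a truncated power series with integer coefficients, and the names `tau_values` and `sigma` and the cutoff 50 are assumptions for this example.

```python
def tau_values(limit):
    """Return a list t with t[n] = tau(n) for 1 <= n <= limit (t[0] is unused)."""
    # Coefficients of prod_{n>=1} (1 - q^n)^24, truncated at degree limit - 1.
    coeffs = [1] + [0] * (limit - 1)
    for n in range(1, limit):
        for _ in range(24):
            # Multiply in place by (1 - q^n), highest degree first.
            for i in range(limit - 1, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    # The extra factor q shifts every exponent up by one.
    return [0] + coeffs

def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)

tau = tau_values(50)
congruence_holds = all((tau[n] - sigma(11, n)) % 691 == 0 for n in range(1, 51))
```

The first values come out as $\tau(1) = 1$, $\tau(2) = -24$, $\tau(3) = 252$, and the congruence modulo 691 holds for every $n$ in the computed range.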

    Due to work of André Weil in the 1940’s and John Tate in the 1950’s, the study of elliptic curves, that is curves of the form $y^2 = g(x)$, where $g$ is a cubic with distinct roots, led to the study of Galois representations, i.e. continuous homomorphisms $\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q}) \to GL_2(R)$, where $R$ is a complete local ring such as the finite field $\mathbb{F}_\ell$ or the ring of $\ell$-adic integers $\mathbb{Z}_\ell$. In particular, given an elliptic curve $E$ defined over $\mathbb{Q}$ (meaning the coefficients of $g$ are in $\mathbb{Q}$), and any rational prime $\ell$, there exist associated Galois representations $\rho_{\ell,E} : \mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q}) \to GL_2(\mathbb{Z}_\ell)$ and (by reduction mod $\ell$) $\bar{\rho}_{\ell,E} : \mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q}) \to GL_2(\mathbb{F}_\ell)$. These encode much information about the curve.

    A conjecture of Jean-Pierre Serre associates to a certain kind of modular form $f$ (cuspidal eigenforms) and to a rational prime $\ell$ a Galois representation, $\rho_{\ell,f}$. All known congruences for $\tau$ follow from a systematic study of the representations associated to $\Delta$. This conjecture was proved by Pierre Deligne [7] in 1969 for weights $k > 2$ (but note that he really only wrote the details for $\Delta$; extensive notes of Brian Conrad, http://www.math.lsa.umich.edu/~bdconrad/bc.ps, can be used to fill in details here). For $k = 2$ it follows from earlier work of Martin Eichler and Goro Shimura [31]. For $k = 1$ it was later established by Deligne and Serre [8].

    These representations $\rho_{\ell,f}$ share many similarities with the representations $\rho_{\ell,E}$. Formalizing this, a conjecture of Yutaka Taniyama of 1955, later put on a solid footing by Shimura, would attach a modular form of this kind to each elliptic curve over $\mathbb{Q}$. Thus, we have the following picture

    $$\begin{array}{ccc}
     & & \{\text{Repns from elliptic curves}\} \\
     & & \rotatebox[origin=c]{-90}{$\subseteq$} \\
    \{\text{Repns from certain modular forms}\} & \subseteq & \{\text{Admissible Galois representations}\}
    \end{array}$$

    In 1985, Gerhard Frey presented a link with FLT. If we assume that $a, b, c$ are positive integers with $a^\ell + b^\ell = c^\ell$, and consider the elliptic curve $y^2 = x(x - a^\ell)(x + b^\ell)$ (called a Frey curve), this curve is unlikely to be modular, in the sense that $\rho_{\ell,E}$ turns out to have properties that a representation associated to a modular form should not.

    The Shimura-Taniyama conjecture, however, states that any given elliptic curve is modular. That is, given $E$, defined over $\mathbb{Q}$, we consider its L-function $L(E, s) = \sum a_n/n^s$. This conjecture states that $\sum a_n q^n$ is a modular form. Equivalently, every $\rho_{\ell,E}$ is a $\rho_{\ell,f}$ for some modular form $f$.

    In 1986, Kenneth Ribet (building on ideas of Barry Mazur) showed that these Frey curves are definitely not modular. His strategy was to show that if the Frey curve is associated to a modular form, then it is associated to one of weight 2 and level 2. No cuspidal eigenforms of this kind exist, giving the desired contradiction. Ribet’s approach (completed by Fred Diamond and others) establishes in fact that the weak conjecture below implies the strong conjecture (the implication being the so-called $\epsilon$-conjecture). The strong conjecture would imply many results; unfortunately, no way of tackling this is known.

    Serre’s weak conjecture [30] says that all Galois representations $\rho : \mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q}) \to GL_2(k)$ with $k$ a finite field, and such that $\det(\rho(\tau)) = -1$, where $\tau$ denotes a complex conjugation (this condition is the definition of $\rho$ being odd), come from modular forms.

    Serre’s strong conjecture [30] states that $\rho$ comes from a modular form of a particular type $(k, N, \epsilon)$ with $k, N$ positive integers (the weight and level, met earlier) and $\epsilon : (\mathbb{Z}/N\mathbb{Z})^\times \to \mathbb{C}^\times$ (the Nebentypus). In the situations above, $\epsilon$ is trivial.

    In 1986, Mazur found a way to parameterize certain collections of Galois representations by rings. Frey curves are semistable, meaning that they have certain mild singularities modulo primes. Wiles with Richard Taylor proved in 1994 that every semistable elliptic curve is modular.

    In a picture we have (restricting to certain subsets to be defined later):

    $$\begin{array}{ccc}
     & & \{\text{Certain semistable elliptic curves}\} \\
     & & \rotatebox[origin=c]{-90}{$\subseteq$} \\
    \{\text{Certain modular forms}\} & \subseteq & \{\text{Certain semistable Galois representations}\}
    \end{array}$$

    Wiles’ idea is, first, following Mazur, to parametrize the sets on the bottom line by local rings $T$ and $R$. The inclusion translates into a surjection $R \to T$. Using some clever commutative algebra, Wiles obtains conditions for such a map to be an isomorphism. Using Galois cohomology and the theory of modular curves, it is checked that these conditions generally hold. The isomorphism of $R$ and $T$ translates back into the two sets on the bottom line being equal. It then follows that every semistable elliptic curve is modular.

    In particular, our Frey curves are modular, contradicting the conclusion of Ribet’s work and establishing that counterexamples to Fermat’s Last Theorem do not exist.

    The Big Picture. An outline of the strategy of the proof has been given. A counterexample to Fermat’s Last Theorem would yield an elliptic curve (Frey’s curve) with remarkable properties. This curve is shown not to exist, as follows. Associated to elliptic curves and to certain modular forms are Galois representations. These representations share some features, which might be used to define admissible representations. The aim is to show that all such admissible representations come from modular forms (and so in particular the elliptic curve ones do, implying that Frey’s curves are modular, enough for a contradiction). We shall parametrize special subsets of Galois representations by complete Noetherian local rings and our aim will amount to showing that a given map between such rings is an isomorphism. This is achieved by some commutative algebra, which reduces the problem to computing some invariants, accomplished via Galois cohomology. The first step is to define (abstractly) Galois representations.

    2 Profinite Groups and Complete Local Rings

    2.1 Profinite Groups

    Definition 2.1 A directed set is a partially ordered set $I$ such that for all $i, j \in I$ there is a $k \in I$ with $i \le k$ and $j \le k$.

    Example: Let $G$ be a group. Index the normal subgroups of finite index by $I$. Say $i \ge j$ if $N_i \subseteq N_j$. If $k$ corresponds to $N_k = N_i \cap N_j$ then $i, j \le k$, so we have a directed set.

    Definition 2.2 An inverse system of groups is a collection of groups indexed by a directed set $I$, together with group homomorphisms $\pi_{ij} : G_i \to G_j$ whenever $i \ge j$. We insist that $\pi_{ii} = \mathrm{Id}$, and that $\pi_{jk}\pi_{ij} = \pi_{ik}$ whenever $i \ge j \ge k$.

    Example: Index the normal subgroups of finite index by $I$ as above. Setting $G_i = G/N_i$, and $\pi_{ij} : G_i \to G_j$ to be the natural quotient map whenever $i \ge j$, we get an inverse system of groups.

    We now form a new category, whose objects are pairs $(H, \{\phi_i : i \in I\})$, where $H$ is a group and each $\phi_i : H \to G_i$ is a group homomorphism, with the property that the triangle formed by $\phi_i : H \to G_i$, $\phi_j : H \to G_j$, and $\pi_{ij} : G_i \to G_j$ commutes whenever $i \ge j$, that is,

    $$\pi_{ij} \circ \phi_i = \phi_j.$$

    Given two objects $(H, \{\phi_i\})$ and $(J, \{\psi_i\})$, we define a morphism between them to be a group homomorphism $\theta : H \to J$ such that the triangle formed by $\theta : H \to J$, $\phi_i : H \to G_i$, and $\psi_i : J \to G_i$ commutes for all $i \in I$, that is,

    $$\psi_i \circ \theta = \phi_i.$$

    Example: Continuing our earlier example, $(G, \{\phi_i\})$ is an object of the new category, where $\phi_i : G \to G/N_i$ is the natural quotient map.

    Definition 2.3 $\varprojlim_{i \in I} G_i$ is the terminal object in the new category, called the inverse limit of the $G_i$. That is, $\varprojlim G_i$ is the unique object $(X, \{\chi_i\})$ such that given any object $(H, \{\phi_i\})$ there is a unique morphism

    $$(H, \{\phi_i\}) \to (X, \{\chi_i\}).$$

    The existence of a terminal object in this category will be proved below, after the next example.

    Example: Continuing our earlier example, the group above, $X$, is the profinite completion $\hat{G}$ of $G$. Since $\hat{G}$ is terminal, there is a unique group homomorphism $G \to \hat{G}$. If this is an isomorphism then we say that $G$ is profinite (or complete). For instance, it will be shown below that $\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$ is a profinite group.

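    We have the following commutative diagram for every object $(H, \{\phi_i\})$ of the new category and every $N_i \subseteq N_j$:

    $$H \xrightarrow{\ \exists!\ } \hat{G} \xleftarrow{\ \exists!\ } G, \qquad \hat{G} \longrightarrow G/N_i \longrightarrow G/N_j.$$

    Here the maps $\phi_i : H \to G/N_i$ and $\phi_j : H \to G/N_j$ are the composites of the unique map $H \to \hat{G}$ with the projections of $\hat{G}$, and likewise the quotient maps $G \to G/N_i$ and $G \to G/N_j$ are the composites of the unique map $G \to \hat{G}$ with those projections.

    $\hat{G}$ contains all relevant information on finite quotients of $G$. $G \to \hat{G}$ is called the profinite completion of $G$. If we only use those finite quotients which are $C$-groups, then we obtain the pro-$C$ completion of $G$ instead.

    Exercise: Prove that $\hat{G} \to \hat{\hat{G}}$ is an isomorphism, so that $\hat{G}$ is profinite/complete.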

    We return to the general case and we now need to prove the existence of $\varprojlim_{i \in I} G_i$. To do this, let

    $$C = \prod_{i \in I} G_i,$$

    and $\pi_i : C \to G_i$ be the $i$th projection. Let $X = \{c \in C \mid \pi_{ij}(\pi_i(c)) = \pi_j(c)\ \forall i \ge j\}$. We claim that

    $$\varprojlim_{i \in I} G_i = (X, \{\pi_i|_X\}).$$

    Proof: (i) $(X, \{\pi_i|_X\})$ is an object in the new category, since $X$ is a group (check!) and $\pi_{ij} \circ \pi_i|_X = \pi_j|_X$ for all $i \ge j$ (by construction).

    (ii) Given any $(H, \{\phi_i\})$ in the new category, define $\phi(h) = (\phi_i(h))_{i \in I}$, and check that this is a group homomorphism $\phi : H \to X$ such that $\pi_i|_X \circ \phi = \phi_i$ for all $i$, and that $\phi$ is forced to be the unique such map. QED

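    Example: Let $G = \mathbb{Z}$ and let us describe $\hat{G} = \hat{\mathbb{Z}}$. The finite quotients of $G$ are $G_i = \mathbb{Z}/i$, and $i \ge j$ means that $j \mid i$. Hence

    $$\hat{\mathbb{Z}} = \{(a_1, a_2, a_3, \ldots) \mid a_i \in \mathbb{Z}/i \text{ and } a_i \equiv a_j \pmod{j} \text{ whenever } j \mid i\}.$$

    Then for $a \in \mathbb{Z}$, the map $a \mapsto (a, a, a, \ldots) \in \hat{\mathbb{Z}}$ is a homomorphism of $\mathbb{Z}$ into $\hat{\mathbb{Z}}$.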
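A finite truncation of this description of $\hat{\mathbb{Z}}$ can be modelled concretely. The Python sketch below is illustrative only: it keeps just the coordinates $a_1, \ldots, a_{\text{depth}}$, and the names `embed` and `is_compatible` are assumptions made for this example.

```python
def embed(a, depth):
    """Image of the integer a in prod_{i=1..depth} Z/i, as (a mod 1, ..., a mod depth)."""
    return tuple(a % i for i in range(1, depth + 1))

def is_compatible(seq):
    """Check a_i = a_j (mod j) whenever j divides i (entries are 1-indexed)."""
    depth = len(seq)
    return all(
        (seq[i - 1] - seq[j - 1]) % j == 0
        for i in range(1, depth + 1)
        for j in range(1, i + 1)
        if i % j == 0
    )
```

Every `embed(a, depth)` passes `is_compatible`, while a tuple such as `(0, 1, 0, 0)` does not, since its fourth entry is even and its second is odd.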

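    Now consider $\bar{\mathbb{F}}_p = \bigcup_n \mathbb{F}_{p^n}$. Then, if $m \mid n$, the restriction maps

    $$\phi_n : \mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p) \to \mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p) \cong \mathbb{Z}/n, \qquad \phi_m : \mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p) \to \mathrm{Gal}(\mathbb{F}_{p^m}/\mathbb{F}_p) \cong \mathbb{Z}/m$$

    satisfy $\pi_{nm} \circ \phi_n = \phi_m$, where $\pi_{nm} : \mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p) \to \mathrm{Gal}(\mathbb{F}_{p^m}/\mathbb{F}_p)$.

    Note that $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p)$ is generated by the Frobenius automorphism $Fr : x \mapsto x^p$. We see that $(\mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p), \{\phi_n\})$ is an object in the new category corresponding to the inverse system. Thus there is a map $\mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p) \to \hat{\mathbb{Z}}$.

    We claim that this map is an isomorphism, so that $\mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p)$ is profinite. This follows from our next result.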

    Theorem 2.4 Let $L/K$ be a (possibly infinite) separable, algebraic Galois extension. Then $\mathrm{Gal}(L/K) \cong \varprojlim \mathrm{Gal}(L_i/K)$, where the limit runs over all finite Galois subextensions $L_i/K$.

    Proof: We have restriction maps $\phi_i : \mathrm{Gal}(L/K) \to \mathrm{Gal}(L_i/K)$ and $\phi_j : \mathrm{Gal}(L/K) \to \mathrm{Gal}(L_j/K)$, compatible with the projection $\mathrm{Gal}(L_i/K) \to \mathrm{Gal}(L_j/K)$, whenever $L_j \subseteq L_i$, i.e. $i \ge j$. We use the projection maps to form an inverse system, so, as before, $(\mathrm{Gal}(L/K), \{\phi_i\})$ is an object of the new category and we get a group homomorphism

    $$\mathrm{Gal}(L/K) \xrightarrow{\ \phi\ } \varprojlim \mathrm{Gal}(L_i/K).$$

    We claim that $\phi$ is an isomorphism.

    (i) Suppose $1 \neq g \in \mathrm{Gal}(L/K)$. Then there is some $x \in L$ such that $g(x) \neq x$. Let $L_i$ be the Galois (normal) closure of $K(x)$. This is a finite Galois extension of $K$, and $1 \neq g|_{L_i} = \phi_i(g)$, which yields that $1 \neq \phi(g)$. Hence $\phi$ is injective.

    (ii) Take $(g_i) \in \varprojlim \mathrm{Gal}(L_i/K)$; this means that $L_j \subseteq L_i \Rightarrow g_i|_{L_j} = g_j$. Then define $g \in \mathrm{Gal}(L/K)$ by $g(x) = g_i(x)$ whenever $x \in L_i$. This is a well-defined field automorphism and $\phi(g) = (g_i)$. Thus $\phi$ is surjective. QED

    For the rest of this section, we assume that the groups $G_i$ are all finite (as, for example, in our running example). Endow the finite $G_i$ in our inverse system with the discrete topology. $G_i$ is certainly a totally disconnected Hausdorff space. Since these properties are preserved under taking products and subspaces, $\varprojlim G_i \subseteq \prod G_i$ is Hausdorff and totally disconnected as well. Furthermore $\prod G_i$ is compact by Tychonoff’s theorem.

    Exercise: If $f, g : A \to B$ are continuous ($A$, $B$ topological spaces) with $A$, $B$ Hausdorff, then $\{x \mid f(x) = g(x)\}$ is closed. Deduce that

    $$\varprojlim G_i = \bigcap_{i \ge j}\left\{c \in \prod G_i : \pi_{ij}(\pi_i(c)) = \pi_j(c)\right\}$$

    is closed in $\prod G_i$, therefore is compact. In summary, $\varprojlim G_i$ is a compact, Hausdorff, totally disconnected topological space.

    Exercise: The natural inclusion $\mathbb{Z} \to \hat{\mathbb{Z}}$ maps $\mathbb{Z}$ onto a dense subgroup. In fact, for any group $G$, its image in $\hat{G}$ is dense, but the kernel of $G \to \hat{G}$ need not be trivial. The kernel is trivial if and only if $G$ is residually finite (meaning that the intersection of all its subgroups of finite index is trivial).

    If we denote by $Fr$ the element of $\mathrm{Gal}(\bar{\mathbb{F}}_q/\mathbb{F}_q)$ given by $Fr(x) = x^q$, i.e. the Frobenius automorphism, then $Fr$ does not generate the Galois group, but the group which it does generate is dense (by the last exercise), and so we say that $\mathrm{Gal}(\bar{\mathbb{F}}_q/\mathbb{F}_q)$ is topologically finitely generated by one element $Fr$ (and so is procyclic).

    2.2 Complete Local Rings

    We now carry out the same procedure with rings rather than groups and so define certain completions of them. Let $R$ be a commutative ring with identity 1, $I$ any ideal of $R$. For $i \ge j$ we have a natural quotient map

    $$\pi_{ij} : R/I^i \to R/I^j.$$

    These rings and maps form an inverse system (now of rings). Proceeding as in the previous section, we can form a new category. Then the same proof gives that there is a unique terminal object, $R_I = \varprojlim_i R/I^i$, which is now a ring, together with a unique ring homomorphism $R \to R_I$, such that the quotient maps $\pi_i : R \to R/I^i$ and $\pi_j : R \to R/I^j$ factor through $R \to R_I$, compatibly with the projections $R_I \to R/I^i$, $R_I \to R/I^j$ and with $\pi_{ij} : R/I^i \to R/I^j$.

    Note that $R_I$ depends on the ideal chosen. It is called the $I$-adic completion of $R$ (do not confuse it with the localization of $R$ at $I$). We call $R$ ($I$-adically) complete if the map $R \to R_I$ is an isomorphism. Then $R_I$ is complete. If $I$ is a maximal ideal $\mathfrak{m}$, then we check that $R_I$ is local, i.e. has a unique maximal ideal, namely $\mathfrak{m}\hat{R}$ (writing $\hat{R}$ for $R_I$). This will be proven for the most important example, $\mathbb{Z}_p$ (see below), at the start of the next chapter.

    Example: If $R = \mathbb{Z}$, $\mathfrak{m} = p\mathbb{Z}$, $p$ prime, then $R_I \cong \mathbb{Z}_p$, the $p$-adic integers. The additive group of this ring is exactly the pro-$p$ completion of $\mathbb{Z}$ as a group. In fact,

    $$\mathbb{Z}_p = \{(a_1, a_2, \ldots) : a_i \in \mathbb{Z}/p^i,\ a_i \equiv a_j \pmod{p^j} \text{ if } i \ge j\}.$$

    Note that $\hat{\mathbb{Z}}$ will always be used to mean the profinite (rather than any $I$-adic) completion of $\mathbb{Z}$.
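To see a compatible sequence that does not come from an ordinary integer, one can lift a square root of 2 from $\mathbb{Z}/7$ to $\mathbb{Z}/7^i$ step by step. This uses Hensel (Newton) lifting, a technique not discussed in the text and brought in here purely as an illustration; the function name and the choice $p = 7$, $c = 2$ are assumptions, and the modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.

```python
def hensel_sqrt(c, p, root, depth):
    """Lift a square root of c mod p to a compatible sequence (a_1, ..., a_depth), a_i in Z/p^i."""
    assert (root * root - c) % p == 0 and (2 * root) % p != 0
    a = root % p
    seq = [a]
    for i in range(2, depth + 1):
        modulus = p**i
        # Newton step a -> a - f(a)/f'(a) for f(t) = t^2 - c, carried out mod p^i.
        a = (a - (a * a - c) * pow(2 * a, -1, modulus)) % modulus
        seq.append(a)
    return seq

seq = hensel_sqrt(2, 7, 3, 6)
```

The resulting entries satisfy $a_i^2 \equiv 2 \pmod{7^i}$ and $a_i \equiv a_j \pmod{7^j}$ for $i \ge j$, so they are the first six coordinates of an element of $\mathbb{Z}_7$ whose square is 2.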

    Exercise: Let $R = \mathbb{Z}$ and $I = 6\mathbb{Z}$. Show that the $I$-adic completion of $\mathbb{Z}$ is isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}_3$ and so is not local (it’s called semilocal).

    Exercise: Show that $\hat{\mathbb{Z}} = \prod_\ell \mathbb{Z}_\ell$, the product being over all rational primes $\ell$.

    Exercise: Show that the ideals of $\mathbb{Z}_p$ are precisely $\{0\}$ and $p^i\mathbb{Z}_p$ ($i \ge 0$). (This also follows from the theory developed in chapter 3.)

    We will be interested in the category $\mathcal{C}$ whose objects are the complete local Noetherian rings with a given finite residue field (that is, the ring modulo its maximal ideal) $k$. In this category a morphism is required to make the following diagram commute:

    $$\begin{array}{ccc}
    R & \xrightarrow{\ \phi\ } & S \\
    \downarrow & & \downarrow \\
    k & \xrightarrow{\ \mathrm{Id}\ } & k
    \end{array}$$

    where the vertical maps are the natural projections. This is equivalent to requiring that $\phi(\mathfrak{m}_R) \subseteq \mathfrak{m}_S$, where $\mathfrak{m}_R$ (respectively $\mathfrak{m}_S$) is the maximal ideal of $R$ (respectively $S$).

    As an example, if $k = \mathbb{F}_\ell$, then $\mathbb{Z}_\ell$ is an object of $\mathcal{C}$. By a theorem of Cohen [2], the objects of $\mathcal{C}$ are of the form

    $$W(k)[[T_1, \ldots, T_m]]/(\text{ideal}),$$

    where $W(k)$ is a ring called the ring of infinite Witt vectors over $k$ (see chapter 3 for an explicit description of it). Thus $W(k)$ is the initial object of $\mathcal{C}$. In the case of $\mathbb{F}_p$, $W(\mathbb{F}_p) = \mathbb{Z}_p$.

    Exercise: If $R$ is a ring that is $I$-adically complete, then $GL_n(R) \cong \varprojlim_i GL_n(R/I^i)$ (the maps $GL_n(R/I^i) \to GL_n(R/I^j)$ being the natural ones).

    Note that the topology on $R$ induces the product topology on $M_n(R)$ and thence the subspace topology on $GL_n(R)$.

    The Big Picture. We shall seek to use continuous group homomorphisms (Galois representations)

    $$\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q}) \to GL_n(R),$$

    where $R$ is in some $\mathcal{C}$, to parametrize the homomorphisms that elliptic curves and modular forms naturally produce. In this chapter we have constructed these groups and rings and explained their topologies. Next, we study the internal structure of both sides, notably certain important subgroups of the left side. This will give us the means to characterize Galois representations in terms of their effect on these subgroups.

    3 Infinite Galois Groups: Internal Structure

    We begin with a short investigation of $\mathbb{Z}_p$. A good reference for this chapter is [28].

    We first check that $p^n\mathbb{Z}_p$ is the kernel of the map $\mathbb{Z}_p \to \mathbb{Z}/p^n$. Hence, we have

    $$\mathbb{Z}_p \supset p\mathbb{Z}_p \supset p^2\mathbb{Z}_p \supset \ldots$$

    If $x \in p^n\mathbb{Z}_p - p^{n+1}\mathbb{Z}_p$, then we say that the valuation of $x$ is $v(x) = n$. Set $v(0) = \infty$.

    Exercise: $x$ is a unit in $\mathbb{Z}_p$ if and only if $v(x) = 0$.

    Corollary 3.1 Every $x \in \mathbb{Z}_p - \{0\}$ can be uniquely written as $p^{v(x)}u$ where $u$ is a unit.

    In fact

    $$(\ast):\quad (1)\ v(xy) = v(x) + v(y), \qquad (2)\ v(x + y) \ge \min(v(x), v(y)).$$

    We define a metric on $\mathbb{Z}_p$ as follows: set $d(x, y) = c^{v(x-y)}$ for a fixed $0 < c < 1$, $x \neq y \in \mathbb{Z}_p$ ($d(x, x) = 0$). We have something stronger than the triangle inequality, namely $d(x, z) \le \max(d(x, y), d(y, z))$ for all $x, y, z \in \mathbb{Z}_p$. This has unusual consequences such as that every triangle is isosceles and every point in an open unit disc is its center.

    The metric and profinite topologies then agree, since the $p^n\mathbb{Z}_p$ form a base of open neighborhoods of 0 characterized by the property $v(x) \ge n \iff d(0, x) \le c^n$.
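The valuation and the metric are easy to experiment with on ordinary integers, which sit inside $\mathbb{Z}_p$. The following Python sketch is illustrative: the names `v` and `d` mirror the notation above, and the constant $c = 1/2$ is an arbitrary choice of $0 < c < 1$.

```python
def v(p, x):
    """p-adic valuation of a nonzero integer x."""
    if x == 0:
        raise ValueError("v(0) is infinite")
    n = 0
    while x % p == 0:
        x //= p
        n += 1
    return n

def d(p, x, y, c=0.5):
    """Distance c^v(x - y) between integers x and y (0 if x == y)."""
    return 0.0 if x == y else c ** v(p, x - y)
```

For instance, `v(3, 54)` is 3, and `d(3, 1, 28)` is `0.125` because $28 - 1 = 27 = 3^3$. Checking $d(x, z) \le \max(d(x, y), d(y, z))$ over a small range of integers gives no counterexample, as property $(\ast)(2)$ guarantees.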

    By $(\ast)(1)$, $\mathbb{Z}_p$ is an integral domain. Its quotient field is called $\mathbb{Q}_p$, the field of $p$-adic numbers. We have the following diagram of inclusions.

    $$\begin{array}{ccc}
    \bar{\mathbb{Q}} & \hookrightarrow & \bar{\mathbb{Q}}_p \\
    \uparrow & & \uparrow \\
    \mathbb{Q} & \hookrightarrow & \mathbb{Q}_p \\
    \uparrow & & \uparrow \\
    \mathbb{Z} & \hookrightarrow & \mathbb{Z}_p
    \end{array}$$

    This then produces restriction maps

    $$(\ast\ast)\quad \mathrm{Gal}(\bar{\mathbb{Q}}_p/\mathbb{Q}_p) \to \mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q}).$$

    We can check that this is a continuous group homomorphism (defined up to conjugation only). Denote $\mathrm{Gal}(\bar{K}/K)$ by $G_K$.

    Definition 3.2 Given a continuous group homomorphism $\rho : G_{\mathbb{Q}} \to GL_n(R)$, $(\ast\ast)$ yields by composition a continuous group homomorphism $\rho_p : G_{\mathbb{Q}_p} \to GL_n(R)$. The collection of homomorphisms $\rho_p$, one for each rational prime $p$, is called the local data attached to $\rho$.

    The point is that $G_{\mathbb{Q}_p}$ is much better understood than $G_{\mathbb{Q}}$; in fact even presentations of $G_{\mathbb{Q}_p}$ are known, at least for $p \neq 2$ [18]. We next need some of the structure of $G_{\mathbb{Q}_p}$ and obtain this by investigating finite Galois extensions $K$ of $\mathbb{Q}_p$ and how $\mathrm{Gal}(K/\mathbb{Q}_p)$ acts.

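    Now,

    $$\mathbb{Z}_p - \{0\} = \{p^n u \mid n \ge 0,\ u \text{ is a unit}\},$$

    whence

    $$\mathbb{Q}_p - \{0\} = \{p^n u \mid n \in \mathbb{Z},\ u \text{ is a unit in } \mathbb{Z}_p\}.$$

    We can thus extend $v : \mathbb{Z}_p - \{0\} \to \mathbb{N}$ to a map $v : \mathbb{Q}_p^\times \to \mathbb{Z}$ which is a homomorphism of groups.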

    Definition 3.3 A discrete valuation $w$ on a field $K$ is a surjective homomorphism $w : K^\times \to \mathbb{Z}$ such that

    $$w(x+y) \ge \min(w(x), w(y))$$

    for all $x, y \in K$ (we take $w(0) = \infty$).

    Example: The map $v : \mathbb{Q}_p^\times \to \mathbb{Z}$ is a discrete valuation, since (∗) extends to $\mathbb{Q}_p$.

    Exercise: If $K$ has a discrete valuation $w$ then

    $$A = \{x \in K \mid w(x) \ge 0\}$$

    is a ring, and

    $$\mathfrak{m} = \{x \in K \mid w(x) > 0\}$$

    is its unique maximal ideal. Choose $\pi$ so that $w(\pi) = 1$. Every element $x \in K^\times$ can be uniquely written as $x = \pi^{w(x)}u$, where $u$ is a unit in $A$.

    Remark: $A$, $A/\mathfrak{m}$, and $\pi$ are called respectively the valuation ring, the residue field, and a uniformizer of $w$. As with the valuation on $\mathbb{Q}_p$, $w$ yields a metric on $K$. An alternative way of describing the elements of $A$ is by the power series $\{c_0 + c_1\pi + c_2\pi^2 + \ldots \mid c_i \in S\}$, where $S \subset A$ is chosen to contain exactly one element of each coset of $A/\mathfrak{m}$. The connection with our approach is by mapping the typical element to

    $$(c_0 \ (\mathrm{mod}\ \pi),\ c_0 + c_1\pi \ (\mathrm{mod}\ \pi^2),\ c_0 + c_1\pi + c_2\pi^2 \ (\mathrm{mod}\ \pi^3),\ \ldots) \in \prod A/\mathfrak{m}^i = A.$$

    The description extends to $K$ by having $K = \{\sum_{i=N}^{\infty} c_i\pi^i\}$, i.e. Laurent series in $\pi$ with coefficients in $S$.
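For $A = \mathbb{Z}_p$, $\pi = p$ and $S = \{0, 1, \ldots, p-1\}$, the power series description can be made explicit for rational numbers whose denominator is prime to $p$. The sketch below is an illustration only; the function name `padic_digits` is hypothetical, and the modular inverse `pow(den, -1, p)` assumes Python 3.8 or later.

```python
from fractions import Fraction

def padic_digits(x, p, n):
    """First n coefficients c_0, ..., c_{n-1} (each in {0, ..., p-1}) of the
    expansion x = c_0 + c_1 p + c_2 p^2 + ... for a rational x in Z_p."""
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("x does not lie in Z_p")
    digits = []
    for _ in range(n):
        # c is the unique element of S with x congruent to c modulo p
        c = (x.numerator * pow(x.denominator, -1, p)) % p
        digits.append(c)
        x = (x - c) / p      # x - c is divisible by p, so this stays in Z_p
    return digits

def truncations(digits, p):
    """The tuple (c_0 mod p, c_0 + c_1 p mod p^2, ...) from the Remark."""
    out, total = [], 0
    for i, c in enumerate(digits):
        total += c * p ** i
        out.append(total)
    return out

print(padic_digits(-1, 5, 4))              # every digit of -1 is p - 1
print(padic_digits(Fraction(1, 3), 5, 4))  # expansion of 1/3 in Z_5
```

Multiplying a truncation of the expansion of $1/3$ by 3 gives 1 modulo the corresponding power of 5, which is the sense in which the series represents $1/3$.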

    Corollary 3.4 The ideal $\mathfrak{m}$ is equal to the principal ideal $(\pi)$, and all ideals of $A$ are of the form $(\pi^n)$, so $A$ is a PID.

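    Let $K/\mathbb{Q}_p$ be a finite Galois extension, and define a norm $N : K^\times \to \mathbb{Q}_p^\times$ by

    $$x \mapsto \prod_{\sigma \in G} \sigma(x)$$

    where $G = \mathrm{Gal}(K/\mathbb{Q}_p)$. The composition of homomorphisms

    $$K^\times \xrightarrow{\ N\ } \mathbb{Q}_p^\times \xrightarrow{\ v\ } \mathbb{Z}$$

    is nonzero with some image $f\mathbb{Z}$. We then define $w : K^\times \to \mathbb{Z}$ by $w = \frac{1}{f}\, v \circ N$. Then $w$ is a discrete valuation on $K$. $f$ is called the residue degree of $K$, and we say that $w$ extends $v$ with ramification index $e$ if $w|_{\mathbb{Q}_p} = ev$. For any $x \in \mathbb{Q}_p$, we have

    $$ev(x) = w(x) = \frac{1}{f}\, v(N(x)) = \frac{1}{f}\, v(x^n) = \frac{n}{f}\, v(x),$$

    so that $ef = n = [K : \mathbb{Q}_p]$.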
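To see the two extreme cases of $ef = n$ for $n = 2$, one can sample the values of $v \circ N$ on elements $a + b\sqrt{d}$ of a quadratic extension of $\mathbb{Q}_3$, where $N(a + b\sqrt{d}) = a^2 - db^2$. The sketch below is an illustration under assumptions of my own: the fields $\mathbb{Q}_3(\sqrt{3})$ and $\mathbb{Q}_3(\sqrt{2})$, the restriction to small integer $a, b$, and the helper names are not from the text. A finite sample can only suggest the image $f\mathbb{Z}$; it does not prove it.

```python
from functools import reduce
from math import gcd

def v3(n):
    """3-adic valuation of a nonzero integer."""
    k = 0
    while n % 3 == 0:
        n //= 3
        k += 1
    return k

def norm_valuations(d, bound=20):
    """Values of v(N(a + b*sqrt(d))) = v(a^2 - d*b^2) for small integers a, b."""
    vals = set()
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            n = a * a - d * b * b
            if n != 0:
                vals.add(v3(n))
    return vals

ram = norm_valuations(3)     # K = Q_3(sqrt 3): v(N(sqrt 3)) = v(-3) = 1
unram = norm_valuations(2)   # K = Q_3(sqrt 2): 2 is not a square mod 3

f_ram = reduce(gcd, ram)       # sampled generator of the image: suggests f = 1, so e = 2
f_unram = reduce(gcd, unram)   # suggests f = 2, so e = 1
print(sorted(ram), f_ram)
print(sorted(unram), f_unram)
```

In the first case $w = v \circ N$ and $w(3) = v(9) = 2$, so $e = 2$; in the second $w = \frac{1}{2} v \circ N$ and $w(3) = 1$, so $e = 1$.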

    Proposition 3.5 The discrete valuation $w$ is the unique discrete valuation on $K$ which extends $v$.

    Proof: A generalization of the proof that any two norms on a finite dimensional vector space over $\mathbb{C}$ are equivalent ([4], [29] Chap. II). QED

    Exercise: Let $A$ be the valuation ring of $K$, and $\mathfrak{m}$ the maximal ideal of $A$. Prove that the order of $A/\mathfrak{m}$ is $p^f$.

    We have a collection of embeddings as follows:

    $$\begin{array}{ccc}
    A & \hookrightarrow & K \\
    \uparrow & & \uparrow \\
    \mathbb{Z}_p & \hookrightarrow & \mathbb{Q}_p
    \end{array}$$

    We note that the action of $G$ on $K$ satisfies $w(\sigma(x)) = w(x)$ for all $\sigma \in G$, $x \in K$, since $w$ and $w \circ \sigma$ both extend $v$ (or by using the explicit definition of $w$).

    This property is crucial in establishing certain useful subgroups of $G$.

    Definition 3.6 The $i$th ramification subgroup of $G = \mathrm{Gal}(K/\mathbb{Q}_p)$ is

    $$G_i = \{\sigma \in G \mid w(\sigma(x) - x) \ge i+1 \text{ for all } x \in A\},$$

    for $i = -1, 0, \ldots$. (See [29], Ch. IV.)

    $G_i$ is a group, since $1 \in G_i$, and if $\sigma, \tau \in G_i$, then

    $$\begin{aligned}
    w(\sigma\tau(x) - x) &= w(\sigma(\tau(x) - x) + (\sigma(x) - x)) \\
    &\ge \min(w(\sigma(\tau(x) - x)),\ w(\sigma(x) - x)) \\
    &\ge i + 1,
    \end{aligned}$$

    so that $\sigma\tau \in G_i$. This is sufficient since $G$ is finite. Moreover,

    $$\sigma\tau\sigma^{-1}(x) - x = \sigma(\tau\sigma^{-1}(x) - \sigma^{-1}(x)) = \sigma(\tau(\sigma^{-1}(x)) - \sigma^{-1}(x))$$

    shows that $G_i \trianglelefteq G$. We have that

    $$G = G_{-1} \supseteq G_0 \supseteq G_1 \supseteq \cdots,$$

    where we call $G_0$ the inertia subgroup of $G$, and $G_1$ the wild inertia subgroup of $G$.

    Exercise: Let $\pi$ be a uniformizer of $K$. Show that in fact

    $$G_i = \{\sigma \in G \mid w(\sigma(\pi) - \pi) \ge i+1\}.$$

    These normal subgroups determine a filtration of $G$, and we now study the factor groups in this filtration.

    Theorem 3.7 (a) The quotient $G/G_0$ is canonically isomorphic to $\mathrm{Gal}(k/\mathbb{F}_p)$, with $k = A/\mathfrak{m}$, hence it is cyclic of order $f$.

    (b) Let $U_0$ be the group of units of $A$. Then $U_i = 1 + (\pi^i)$ $(i \ge 1)$ is a subgroup of $U_0$. For all $\sigma \in G$, the map $\sigma \mapsto \sigma(\pi)/\pi$ induces an injective group homomorphism $G_i/G_{i+1} \hookrightarrow U_i/U_{i+1}$.

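    Proof: (a) Let $\sigma \in G$. Then $\sigma$ acts on $A$, and sends $\mathfrak{m}$ to $\mathfrak{m}$. Hence it acts on $A/\mathfrak{m} = k$. This defines a map $\phi : G \to \mathrm{Gal}(k/\mathbb{F}_p)$ by sending $\sigma$ to the map $x + \mathfrak{m} \mapsto \sigma(x) + \mathfrak{m}$. We now examine the kernel of this map.

    $$\begin{aligned}
    \ker\phi &= \{\sigma \in G \mid \sigma(x) - x \in \mathfrak{m} \text{ for all } x \in A\} \\
    &= \{\sigma \in G \mid w(\sigma(x) - x) \ge 1 \text{ for all } x \in A\} = G_0.
    \end{aligned}$$

    This shows that $\phi$ induces an injective homomorphism from $G/G_0$ to $\mathrm{Gal}(k/\mathbb{F}_p)$.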

    As for surjectivity, choose $a \in A$ such that the image $\bar{a}$ of $a$ in $k$ has $k = \mathbb{F}_p(\bar{a})$. Let

    $$p(x) = \prod_{\sigma \in G} (x - \sigma(a)).$$

    Then $p(x)$ is a monic polynomial with coefficients in $A$, and one root is $a$. Then its reduction

    $$\bar{p}(x) = \prod_{\sigma \in G} (x - \overline{\sigma(a)}) \in k[x]$$

    yields that all conjugates of $\bar{a}$ are of the form $\overline{\sigma(a)}$. For $\tau \in \mathrm{Gal}(k/\mathbb{F}_p)$, $\tau(\bar{a})$ is such a conjugate, whence it is equal to some $\overline{\sigma(a)}$. Then the image of $\sigma$ is $\tau$.

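    (b) $\sigma \in G_i \iff w(\sigma(\pi) - \pi) \ge i+1 \iff \sigma(\pi)/\pi \in U_i$.

    This map is independent of choice of uniformizer. Suppose $\pi' = \pi u$ is another uniformizer, where $u$ is a unit. Then

    $$\frac{\sigma(\pi')}{\pi'} = \frac{\sigma(\pi)}{\pi} \cdot \frac{\sigma(u)}{u},$$

    but for $\sigma \in G_i$, we have that $i+1 \le w(\sigma(u) - u) = w(\sigma(u)/u - 1) + w(u)$, so that $\sigma(u)/u \in U_{i+1}$, i.e. $\sigma(\pi')/\pi'$ differs from $\sigma(\pi)/\pi$ by an element of $U_{i+1}$.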

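    Write $\theta_i : G_i/G_{i+1} \to U_i/U_{i+1}$ for the map induced by $\sigma \mapsto \sigma(\pi)/\pi$. The map $\theta_i$ is a homomorphism, since

    $$\frac{\sigma\tau(\pi)}{\pi} = \frac{\sigma(\pi)}{\pi} \cdot \frac{\tau(\pi)}{\pi} \cdot \frac{\sigma(u)}{u},$$

    where $u = \tau(\pi)/\pi$, and, as above, $\sigma(u)/u \in U_{i+1}$.

    Finally the map $\theta_i$ is injective. To see this assume that $\sigma(\pi)/\pi \in U_{i+1}$. Then $\sigma(\pi) = \pi(1 + y)$ with $y \in (\pi^{i+1})$. Hence $\sigma(\pi) - \pi = \pi y$ has valuation at least $i+2$, that is, $\sigma \in G_{i+1}$. QED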

    Proposition 3.8 We have that $U_0/U_1$ is canonically isomorphic to $k^\times$, and is thus cyclic of order $p^f - 1$, and for $i \ge 1$, $U_i/U_{i+1}$ embeds canonically into $(\pi^i)/(\pi^{i+1}) \cong k^+$, and so is elementary p-abelian.

    Proof: The map $x \mapsto x \ (\mathrm{mod}\ \mathfrak{m})$ takes $U_0$ to $k^\times$, and is surjective. Its kernel is $\{x \mid x \equiv 1 \ (\mathrm{mod}\ \mathfrak{m})\} = U_1$, whence $U_0/U_1 \cong k^\times$.

    The map $1 + x \mapsto x \ (\mathrm{mod}\ \pi^{i+1})$ takes $U_i$ to $(\pi^i)/(\pi^{i+1})$ and is surjective with kernel $U_{i+1}$. Moreover, since $\pi$ acts trivially on $(\pi^i)/(\pi^{i+1})$, this is an $(A/\mathfrak{m})$-module (i.e. $k$-vector space). Its dimension is 1, since otherwise there would be an ideal of $A$ strictly between $(\pi^i)$ and $(\pi^{i+1})$. QED

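    We note in particular that $G$ is solvable, since the factors in its filtration are all abelian by the above. Thus, $G_{\mathbb{Q}_p}$ is prosolvable. Specifically, we have the following inclusions:

    $$\{1\} \ \underset{p\text{-group}}{\subseteq}\ G_1 \ \underset{\text{cyclic, order prime to } p}{\subseteq}\ G_0 \ \underset{\text{cyclic}}{\subseteq}\ G$$

    where each label describes the quotient of the larger group by the smaller one.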

    Corollary 3.9 The group $G = \mathrm{Gal}(K/\mathbb{Q}_p)$ is solvable. Moreover, its inertia subgroup $G_0$ has a normal Sylow p-subgroup (namely $G_1$) with cyclic quotient.

    To study continuous homomorphisms $\rho_p : G_{\mathbb{Q}_p} \to GL_2(R)$, we are looking at finite quotients $\mathrm{Gal}(K/\mathbb{Q}_p)$ of $G_{\mathbb{Q}_p}$. If we can next define ramification subgroups for the infinite Galois group $G_{\mathbb{Q}_p}$, then we can describe properties of $\rho_p$ in terms of its effect on these subgroups.


    3.1 Infinite extensions

    Let $\mathbb{Q}_p \subseteq M \subseteq L$ be finite Galois extensions, with respective valuation rings $A_M$, $A_L$. We have a restriction map

    $$\begin{array}{ccc}
    \mathrm{Gal}(L/\mathbb{Q}_p) & \longrightarrow & \mathrm{Gal}(M/\mathbb{Q}_p) \\
    \cup & & \cup \\
    G_i^{(L)} & & G_i^{(M)},
    \end{array}$$

    where the horizontal map is restriction. In general the image of $G_i^{(L)}$ under this restriction map does not equal $G_i^{(M)}$; in order to have this happen, we need the upper numbering of ramification groups (see Chapter IV of [29]).

    It is, however, true that the image of $G_i^{(L)}$ equals $G_i^{(M)}$ for $i = 0, 1$. This follows for $i = 0$ since $\sigma \in G_0^{(L)} \iff$ the valuation of $\sigma(x) - x$ is $> 0$ for all $x \in A_L$. It follows that the valuation of $\sigma|_M(x) - x$ is $> 0$ for all $x \in A_M$. The case $i = 1$ now follows since the Sylow p-subgroups of the image are the image of the Sylow p-subgroups.

    Hence, we obtain, for $i = 0$ or $1$, an inverse system consisting of the groups $G_i^{(L)}$ as $L$ runs through all finite Galois extensions of $\mathbb{Q}_p$, together with homomorphisms $G_i^{(L)} \twoheadrightarrow G_i^{(M)}$ whenever $M \subseteq L$. Inside $\prod \mathrm{Gal}(L/\mathbb{Q}_p)$, by definition of inverse limit, we have

    $$G_{\mathbb{Q}_p} = \varprojlim \mathrm{Gal}(L/\mathbb{Q}_p) \ \supseteq\ G_0 = \varprojlim G_0^{(L)} \ \supseteq\ G_1 = \varprojlim G_1^{(L)}.$$

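    A second inverse system consists of the groups $\mathrm{Gal}(L/\mathbb{Q}_p)/G_0^{(L)}$. Homomorphisms for $M \subseteq L$ are obtained from the restriction map and the fact that the image under restriction of $G_0^{(L)}$ is equal to $G_0^{(M)}$.

    $$\begin{array}{ccc}
    \mathrm{Gal}(L/\mathbb{Q}_p)/G_0^{(L)} & \longrightarrow & \mathrm{Gal}(M/\mathbb{Q}_p)/G_0^{(M)} \\
    \| & & \| \\
    \mathrm{Gal}(k_L/\mathbb{F}_p) & \longrightarrow & \mathrm{Gal}(k_M/\mathbb{F}_p)
    \end{array}$$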

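    We use this to show that

    $$G_{\mathbb{Q}_p}/G_0 \cong \mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p) \cong \hat{\mathbb{Z}} \cong \prod_q \mathbb{Z}_q.$$

    We shall see later that

    $$G_0/G_1 \cong \prod_{q \ne p} \mathbb{Z}_q.$$

    The following will come in useful: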

    Theorem 3.10 If $H$ is a closed subgroup of $G_{\mathbb{Q}_p}$, then $H = \mathrm{Gal}(\bar{\mathbb{Q}}_p/L)$, where $L$ is the fixed field of $H$.

    Proof: Let $H$ be a closed subgroup of $G_{\mathbb{Q}_p}$, and let $L$ be its fixed field. Then for any finite Galois extension $M/L$, we obtain a commutative diagram

    $$\begin{array}{ccc}
    H & \hookrightarrow & \mathrm{Gal}(\bar{\mathbb{Q}}_p/L) \\
    & \searrow & \downarrow \\
    & & \mathrm{Gal}(M/L)
    \end{array}$$

    in which the diagonal map is surjective. Since $\mathrm{Gal}(\bar{\mathbb{Q}}_p/L) = \varprojlim \mathrm{Gal}(M/L)$, this implies that $H$ is dense in $\mathrm{Gal}(\bar{\mathbb{Q}}_p/L)$. Since $H$ is closed, $H = \mathrm{Gal}(\bar{\mathbb{Q}}_p/L)$. QED

    In fact infinite Galois theory provides an order-reversing bijection between intermediate fields and the closed subgroups of the Galois group.

    3.2 Structure of GQp/G0 and GQp/G1

    We now have normal subgroups $G_1 \subseteq G_0 \subseteq G_{\mathbb{Q}_p}$. Suppose that $\rho : G_{\mathbb{Q}} \to GL_2(R)$ is one of the naturally occurring continuous homomorphisms in which we are interested, e.g. associated to an elliptic curve or modular form. We get continuous homomorphisms $\rho_p : G_{\mathbb{Q}_p} \to GL_2(R)$, such that $\rho_p(G_0) = \{1\}$ for all but finitely many $p$ (in which case we say that $\rho$ is unramified at $p$). For a $p$ at which $\rho$ is unramified, $\rho_p$ induces a homomorphism $G_{\mathbb{Q}_p}/G_0 \to GL_2(R)$. Often it will be the case that $\rho_p(G_1) = \{1\}$ for a ramified $p$ (in which case we say that $\rho$ is tamely ramified at $p$). In this case $\rho_p$ induces a homomorphism $G_{\mathbb{Q}_p}/G_1 \to GL_2(R)$. Thus, we are interested in the structure of the two groups $G_{\mathbb{Q}_p}/G_0$ and $G_{\mathbb{Q}_p}/G_1$.

    Here is a rough outline of how we establish this. In the finite extension case considered already, $G/G_0$ is isomorphic to $\mathrm{Gal}(k/\mathbb{F}_p)$ (and so is cyclic), and $G_0/G_1$ embeds in $k^\times$ (and so is cyclic of order prime to $p$). In the limit, $G_{\mathbb{Q}_p}/G_0 = \varprojlim G/G_0 = \varprojlim \mathrm{Gal}(k/\mathbb{F}_p) = \mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p)$ (and so is procyclic), and $G_0/G_1$ embeds in $\varprojlim k^\times$. A few things need to be checked along the way. The first statement amounts to there being one and only one homomorphism onto each $\mathrm{Gal}(k/\mathbb{F}_p)$. For the second statement, we check that the restriction maps between the $G_0/G_1$ translate into the norm map between the $k^\times$. This gives an injective map $G_0/G_1 \to \varprojlim k^\times$, shown surjective by exhibiting explicit extensions of $\mathbb{Q}_p$.

    As for $G_{\mathbb{Q}_p}/G_1$, consider the extension of groups at the finite level:

    $$1 \to G_0/G_1 \to G/G_1 \to G/G_0 \to 1.$$

    Thus, $G/G_1$ is metacyclic, i.e. has a cyclic normal subgroup with cyclic quotient. The group $G/G_1$ acts by conjugation on the normal subgroup $G_0/G_1$. Since $G_0/G_1$ is abelian, it acts trivially on itself, and thus the conjugation action factors through $G/G_0$. So $G/G_0$ acts on $G_0/G_1$, but since it is canonically isomorphic to $\mathrm{Gal}(k/\mathbb{F}_p)$, it also acts on $k^\times$. These actions commute with the map $G_0/G_1 \to k^\times$. In the limit, $G_{\mathbb{Q}_p}/G_1$ is prometacyclic, and the extension of groups

    $$1 \to G_0/G_1 \to G_{\mathbb{Q}_p}/G_1 \to G_{\mathbb{Q}_p}/G_0 \to 1$$

    is a semidirect product since $G_{\mathbb{Q}_p}/G_0$ is free, with the action given by that of $\mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p)$ on $\varprojlim k^\times$. In particular, the Frobenius map is a (topological) generator of $\mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p)$ and acts by mapping elements to their $p$th power.

    Theorem 3.11 Let $G_0$, $G_1$ denote the 0th and 1st ramification subgroups of $G_{\mathbb{Q}_p}$. Then $G_{\mathbb{Q}_p}/G_0 \cong \varprojlim \mathrm{Gal}(k/\mathbb{F}_p) \cong \hat{\mathbb{Z}}$ and $G_0/G_1 \cong \varprojlim k^\times \cong \prod_{q \ne p} \mathbb{Z}_q$, where the maps in the inverse system are norm maps. Moreover, $G_{\mathbb{Q}_p}/G_1$ is (topologically) generated by two elements $x, y$ where $y$ generates $G_0/G_1$, $x$ maps onto the Frobenius element in $G_{\mathbb{Q}_p}/G_0 = \mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p)$, and $x^{-1}yx = y^p$.

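    For any finite Galois extension $L/\mathbb{Q}_p$, let $G^{(L)} = \mathrm{Gal}(L/\mathbb{Q}_p)$, let $A_L$ be the valuation ring of $L$, and let $k_L$ be the residue field of $L$. Let $M \subseteq L$ be finite Galois extensions of $\mathbb{Q}_p$. We obtain the following diagram, in which $\phi_L$ and $\phi_M$ denote the composite maps from $G_{\mathbb{Q}_p}$ to $\mathrm{Gal}(k_L/\mathbb{F}_p)$ and $\mathrm{Gal}(k_M/\mathbb{F}_p)$:

    $$\begin{array}{ccc}
    & G_{\mathbb{Q}_p} & \\
    \swarrow & & \searrow \\
    G^{(L)}/G_0^{(L)} & \longrightarrow & G^{(M)}/G_0^{(M)} \\
    \downarrow \cong & & \downarrow \cong \\
    \mathrm{Gal}(k_L/\mathbb{F}_p) & \longrightarrow & \mathrm{Gal}(k_M/\mathbb{F}_p)
    \end{array}$$

    Note that $G_0$ lies in the kernel of each $\phi_L$. From this diagram we get maps $\phi_L : G_{\mathbb{Q}_p}/G_0 \to \mathrm{Gal}(k_L/\mathbb{F}_p)$. We note the following facts.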

    1) Given $k$, there is only one such map, $\phi_k$. The reason here is that if $k_L = k_M = k$, then there is a unique map

    $$\mathrm{Gal}(LM/\mathbb{Q}_p) \to \mathrm{Gal}(k_{LM}/\mathbb{F}_p) \to \mathrm{Gal}(k/\mathbb{F}_p)$$

    since $\mathrm{Gal}(k_{LM}/\mathbb{F}_p)$ is cyclic.

    Definition 3.12 The valuation ring of the fixed field of $\phi_k$ will be denoted $W(k)$, the ring of infinite Witt vectors of $k$. An alternative explicit description is given in [17]. Note that $W(\mathbb{F}_p)$ is simply $\mathbb{Z}_p$.

    2) Given $k$, there is some $L$ with $k_L = k$. (We may take $L = \mathbb{Q}_p(\zeta)$, where $\zeta$ is a primitive $(|k| - 1)$th root of 1.)

    We consider the inverse system consisting of groups $\mathrm{Gal}(k/\mathbb{F}_p)$, where $k/\mathbb{F}_p$ runs through finite Galois extensions, together with the usual restriction maps. Then $(G_{\mathbb{Q}_p}/G_0, \{\phi_k\})$ is an object of the new category associated to this inverse system. We obtain a homomorphism

    $$G_{\mathbb{Q}_p}/G_0 \to \varprojlim \mathrm{Gal}(k/\mathbb{F}_p) \cong \hat{\mathbb{Z}},$$

    and this is an isomorphism because of (1) and (2) above.

    Next we study the structure of $G_0/G_1$. We have the following diagram:

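    $$\begin{array}{ccc}
    & G_0 & \\
    \swarrow & & \searrow \\
    G_0^{(L)}/G_1^{(L)} & \longrightarrow & G_0^{(M)}/G_1^{(M)} \\
    \downarrow & & \downarrow \\
    k_L^\times & \longrightarrow & k_M^\times
    \end{array}$$

    Here $\psi_L$ and $\psi_M$ denote the composite maps from $G_0$ to $k_L^\times$ and $k_M^\times$. Note that $G_1$ lies in the kernel of each $\psi_L$. It is a simple exercise to show that the norm map $k_L^\times \to k_M^\times$ makes the bottom square commutative.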

    Theorem 3.13 There is a canonical isomorphism $G_0/G_1 \cong \varprojlim k^\times$.

    Proof: Let $L/\mathbb{Q}_p$ be a finite Galois extension, and recall that

    $$G_0^{(L)}/G_1^{(L)} \hookrightarrow k_L^\times$$

    naturally. Then

    $$G_0 \to G_0^{(L)} \to k_L^\times$$

    factors through $G_0/G_1$. In this way we get maps $G_0/G_1 \to k_L^\times$ for each $L$, and finally a map $G_0/G_1 \to \varprojlim k^\times$, where the inverse limit is taken over all finite Galois extensions $k/\mathbb{F}_p$, and for $k_1 \subseteq k_2$, the map $k_2^\times \to k_1^\times$ is the norm map. The fact that $G_0/G_1 \to \varprojlim k^\times$ is an isomorphism follows by exhibiting fields $L/\mathbb{Q}_p$ with $G_0^{(L)}/G_1^{(L)} \cong k^\times$ for any finite field $k/\mathbb{F}_p$, namely $L = \mathbb{Q}_p(\zeta, p^{1/d})$ where $d = |k| - 1$ and $\zeta$ is a primitive $d$th root of 1. QED

    Note: the fields exhibited above yield (surjective) fundamental characters $G_0/G_1 \to k^\times$. Let $\mu_n$ be the group of $n$th roots of 1 in $\bar{\mathbb{F}}_p$, where $(p, n) = 1$. For $m \mid n$, we have a group homomorphism $\mu_n \to \mu_m$ given by $x \mapsto x^{n/m}$, which forms an inverse system. Let $\mu = \varprojlim \mu_n$.

    Lemma 3.14 We have a canonical isomorphism $\varprojlim k^\times \cong \mu$.

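    Proof: Since the groups $k^\times$ form a subset of the groups $\mu_n$, we obtain a surjection from $\varprojlim \mu_n \to \varprojlim k^\times$. To obtain a map in the other direction, note that the numbers $p^f - 1$ are cofinal (with respect to divisibility) in the set of integers prime to $p$. In fact, if $d$ is such an integer, there is some $f \ge 1$ with $p^f \equiv 1 \ (\mathrm{mod}\ d)$, e.g. $f = \varphi(d)$. QED

    $\mu$ is noncanonically isomorphic to $\prod_{q \ne p} \mathbb{Z}_q$, since $\mu_n$ is noncanonically isomorphic to $\mathbb{Z}/n\mathbb{Z}$, and

    $$\varprojlim_{(n,p)=1} \mathbb{Z}/n\mathbb{Z} \cong \prod_{q \ne p} \mathbb{Z}_q,$$

    since $\mathbb{Z}/n\mathbb{Z} \cong \prod \mathbb{Z}/p_i^{r_i}\mathbb{Z}$ for $n = \prod p_i^{r_i}$.

    We now examine the map

    $$G_0^{(L)}/G_1^{(L)} \hookrightarrow k_L^\times.$$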

    One can check that it is $G^{(L)}$-equivariant (with the conjugation action on the left, and the natural action on the right). Now $G_0^{(L)}$ acts trivially on the left ($G_0^{(L)}/G_1^{(L)}$ is abelian), and also trivially on the right by definition. Hence we end up with a $G^{(L)}/G_0^{(L)}$ $(\cong \mathrm{Gal}(k_L/\mathbb{F}_p))$-equivariant map. The canonical isomorphism $G_0/G_1 \to \varprojlim k^\times$ is $G_{\mathbb{Q}_p}/G_0$-equivariant.
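The cofinality step in the proof of Lemma 3.14 says that every integer $d$ prime to $p$ divides some $p^f - 1$. The smallest such $f$ is the multiplicative order of $p$ modulo $d$, which always divides $\varphi(d)$. The following sketch computes it; the function name `order_of_p_mod` is hypothetical and the sample values are arbitrary.

```python
from math import gcd

def order_of_p_mod(p, d):
    """Smallest f >= 1 with p**f congruent to 1 modulo d, for d >= 1 prime to p."""
    if gcd(p, d) != 1:
        raise ValueError("d must be prime to p")
    f, power = 1, p % d
    while power != 1 % d:          # 1 % d handles the trivial case d = 1
        power = (power * p) % d
        f += 1
    return f

p = 5
for d in (2, 3, 7, 12, 31):
    f = order_of_p_mod(p, d)
    # d divides p^f - 1, so mu_d sits inside k^x for the field k with p^f elements
    assert (p ** f - 1) % d == 0
    print(d, f)
```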

    The upshot is that $G_{\mathbb{Q}_p}/G_1$ is topologically generated by 2 elements $x$ and $y$, where $x$ generates a copy of $\hat{\mathbb{Z}}$, $y$ generates a copy of $\prod_{q \ne p} \mathbb{Z}_q$, with one relation $x^{-1}yx = y^p$. This is seen from the short exact sequence of groups:

    $$1 \to G_0/G_1 \to G_{\mathbb{Q}_p}/G_1 \to G_{\mathbb{Q}_p}/G_0 \to 1,$$

    in which $G_0/G_1 \cong \prod_{q \ne p} \mathbb{Z}_q$ and $G_{\mathbb{Q}_p}/G_0 \cong \hat{\mathbb{Z}}$ are both procyclic (i.e. topologically generated by one element). Since $\hat{\mathbb{Z}}$ is free, the sequence is split, i.e. defines a semidirect product with the action given as above.
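The relation $x^{-1}yx = y^p$ can be explored in a finite metacyclic group with the same shape. The sketch below is an illustrative model only, not a computation of any particular Galois group: it assumes generators with $x^f = 1$ and $y^d = 1$ for $d = p^f - 1$, stores $x^a y^b$ as the pair $(a, b)$, and the class name `Metacyclic` is hypothetical. The model is consistent because $p^f \equiv 1 \ (\mathrm{mod}\ d)$.

```python
class Metacyclic:
    """Finite model with x^f = 1, y^d = 1, x^{-1} y x = y^p, where d = p^f - 1.
    The element x^a y^b is represented by the pair (a, b)."""

    def __init__(self, p, f):
        self.p, self.f, self.d = p, f, p ** f - 1

    def mul(self, g, h):
        (a, b), (c, e) = g, h
        # y^b x^c = x^c y^(b p^c), hence (x^a y^b)(x^c y^e) = x^(a+c) y^(b p^c + e)
        return ((a + c) % self.f, (b * pow(self.p, c, self.d) + e) % self.d)

    def inv(self, g):
        a, b = g
        c = (-a) % self.f
        # (x^a y^b)^(-1) = y^(-b) x^(-a) = x^(-a) y^(-b p^(-a))
        return (c, (-b * pow(self.p, c, self.d)) % self.d)

    def power(self, g, n):
        result = (0, 0)
        for _ in range(n):
            result = self.mul(result, g)
        return result

G = Metacyclic(p=3, f=2)          # order f * d = 2 * 8 = 16
x, y = (1, 0), (0, 1)
conjugate = G.mul(G.mul(G.inv(x), y), x)
print(conjugate, G.power(y, 3))   # both equal y^p
```

In this model the cyclic subgroup generated by $y$ plays the role of $G_0/G_1$ and the quotient generated by the image of $x$ plays the role of $G/G_0$.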

    Next, let $\rho : G_{\mathbb{Q}} \to GL_n(k)$ be given, where $k$ is a finite field of characteristic $\ell$. The image of $\rho$ is finite, say $\mathrm{Gal}(K/\mathbb{Q})$, where $K$ is a number field. Letting the finite set of rational primes ramified in $K/\mathbb{Q}$ be $S$, we see that $\rho$ is unramified at every $p \notin S$, i.e. $\rho_p(G_0) = \{1\}$ and so $\rho_p$ factors through $G_{\mathbb{Q}_p}/G_0$ for all $p \notin S$. Consider next $\rho_\ell : G_{\mathbb{Q}_\ell} \to GL_n(k)$.

    Call $\rho_\ell$ semisimple if $V := k^n$, viewed as a $k[G_{\mathbb{Q}_\ell}]$-module, is semisimple, i.e. a direct sum of irreducible modules.

    Theorem 3.15 If $\rho_\ell$ is semisimple, then $\rho_\ell(G_1) = \{1\}$, and so $\rho_\ell$ factors through $G_{\mathbb{Q}_\ell}/G_1$.

    Proof: Assume $V$ is irreducible, i.e. has no proper $k[G_{\mathbb{Q}_\ell}]$-submodules. Let $V' = \{v \in V \mid g(v) = v \text{ for all } g \in \rho_\ell(G_1)\}$. Since $G_1$ is a pro-$\ell$ group, $\rho_\ell(G_1)$ is a finite $\ell$-group and so its orbits on $V$ are of length 1 or a power of $\ell$. The orbits of length 1 comprise $V'$, so that $|V'| \equiv |V| \equiv 0 \ (\mathrm{mod}\ \ell)$. In particular, $V' \ne \{0\}$. Since $G_1$ is normal in $G_{\mathbb{Q}_\ell}$, $V'$ is stable under $G_{\mathbb{Q}_\ell}$, implying by irreducibility of $V$ that $V' = V$. Thus $\rho_\ell(G_1)$ acts trivially on the whole of $V$. For a semisimple $V$, we apply the above to each summand of $V$. QED
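The orbit-counting step in the proof can be seen in a small case. The sketch below is an illustration under assumptions of my own: $\ell = 3$, $V = \mathbb{F}_3^2$, and the cyclic group of order 3 generated by one unipotent matrix standing in for $\rho_\ell(G_1)$. It lists orbit lengths and fixed vectors and confirms $|V'| \equiv |V| \equiv 0 \ (\mathrm{mod}\ \ell)$ in this example.

```python
from itertools import product

ELL, N = 3, 2
g = ((1, 1), (0, 1))   # unipotent, so it generates a group of order 3 in GL_2(F_3)

def act(m, v):
    """Apply the matrix m to the column vector v over F_ELL."""
    return tuple(sum(m[i][j] * v[j] for j in range(N)) % ELL for i in range(N))

V = list(product(range(ELL), repeat=N))

def orbit(v):
    """Orbit of v under the cyclic group generated by g."""
    seen, w = [v], act(g, v)
    while w != v:
        seen.append(w)
        w = act(g, w)
    return seen

orbit_sizes = sorted(len(orbit(v)) for v in V)
fixed = [v for v in V if act(g, v) == v]   # the subspace V' of the proof

print(orbit_sizes)   # only lengths 1 and 3 occur
print(fixed)         # a nonzero subspace, as the congruence predicts
```

Here $V$ is not irreducible for the full Galois group in any meaningful sense; the example only illustrates why an $\ell$-group acting on an $\mathbb{F}_\ell$-vector space must fix a nonzero vector.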

    The Big Picture. We have defined subgroups $G_{\mathbb{Q}_p}$ (one for each prime $p$) of $G_{\mathbb{Q}}$ with much simpler structure than $G_{\mathbb{Q}}$ itself. This will allow us to describe representations $\rho : G_{\mathbb{Q}} \to GL_n(R)$ in terms of their restrictions $\rho_p$ to these subgroups. Each $\rho_p$ can be described in turn by its effect on ramification subgroups of $G_{\mathbb{Q}_p}$, ultimately enabling us to define useful numerical invariants associated to $\rho$ (see chapter 5). First, however, we need some natural sources of Galois representations $\rho$ and these will be provided by elliptic curves, modular forms, and more generally group schemes.


    4 Galois Representations from Elliptic Curves, Modular Forms, and Group Schemes

    Having introduced Galois representations, we next describe natural sources for them, that will lead to the link with Fermat's Last Theorem.

    4.1 Elliptic curves

    An elliptic curve over $\mathbb{Q}$ is given by an equation $y^2 = f(x)$, where $f \in \mathbb{Z}[x]$ is a cubic polynomial with no repeated roots in $\bar{\mathbb{Q}}$. A good reference for this theory is [32].

    Example: $y^2 = x(x-1)(x+1)$

    If $K$ is a field, set $E(K) := \{(x,y) \in K \times K \mid y^2 = f(x)\} \cup \{\infty\}$. We begin by studying $E(\mathbb{C})$.

    Let $\Lambda$ be a lattice inside $\mathbb{C}$. Define the Weierstrass $\wp$-function by

    $$\wp(z; \Lambda) = \frac{1}{z^2} + \sum_{0 \ne \omega \in \Lambda} \left( \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right)$$

    and the $2k$th Eisenstein series by

    $$G_{2k}(\Lambda) = \sum_{0 \ne \omega \in \Lambda} \omega^{-2k}.$$

    These arise as the coefficients in the Laurent series of $\wp$ about $z = 0$, namely:

    $$\wp(z) = \frac{1}{z^2} + 3G_4 z^2 + 5G_6 z^4 + 7G_8 z^6 + \ldots,$$

    which is established by rearranging the infinite sum, allowed by the following.

    Proposition 4.1 The function $\wp$ is absolutely and locally uniformly convergent on $\mathbb{C} - \Lambda$. It has poles exactly at the points of $\Lambda$ and all the residues are 0. $G_{2k}(\Lambda)$ is absolutely convergent if $k > 1$.

    Definition 4.2 A function $f$ is called elliptic with respect to $\Lambda$ (or doubly periodic) if $f(z) = f(z + \omega)$ for all $\omega \in \Lambda$. Note that to check ellipticity it suffices to show this for $\omega = \omega_1, \omega_2$, the fundamental periods of $\Lambda$. An elliptic function can be regarded as a function on $\mathbb{C}/\Lambda$.

    Proposition 4.3 $\wp$ is an even function and elliptic with respect to $\Lambda$.

    Proof: Clearly, $\wp$ is even, i.e. $\wp(z) = \wp(-z)$. By local uniform convergence, we can differentiate term-by-term to get $\wp'(z) = -2\sum_{\omega \in \Lambda} \frac{1}{(z-\omega)^3}$, and so $\wp'$ is elliptic with respect to $\Lambda$. Integrating, for each $\omega \in \Lambda$, $\wp(z + \omega) = \wp(z) + C(\omega)$, where $C(\omega)$ does not depend on $z$. Setting $z = -\omega_i/2$ and $\omega = \omega_i$ ($i = 1$ or $2$) gives

    $$\wp(\omega_i/2) = \wp(-\omega_i/2) + C(\omega_i) = \wp(\omega_i/2) + C(\omega_i),$$

    using the evenness of $\wp$. Thus, $C(\omega_i) = 0$ $(i = 1, 2)$, so $\wp$ is elliptic. QED

    Let $g_2 = 60G_4(\Lambda)$ and $g_3 = 140G_6(\Lambda)$.

    Proposition 4.4 $(\wp'(z))^2 = 4\wp(z)^3 - g_2\wp(z) - g_3$ $(z \notin \Lambda)$. (∗)

    Proof: Let $f(z) = (\wp'(z))^2 - (a\wp(z)^3 - b\wp(z)^2 - c\wp(z) - d)$. Considering its explicit Laurent expansion around $z = 0$, since $f$ is even, there are terms in $z^{-6}, z^{-4}, z^{-2}, z^0$ whose coefficients are linear in $a, b, c, d$. Choosing $a = 4$, $b = 0$, $c = g_2$, $d = g_3$ makes these coefficients zero, and so $f$ is holomorphic at $z = 0$ and even vanishes there. We already knew that $f$ is holomorphic at points not in $\Lambda$. Since $f$ is elliptic, it is bounded. By Liouville's theorem, bounded holomorphic functions are constant. $f(0) = 0$ implies this constant is 0. QED
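The differential equation (∗) can be checked numerically by truncating the lattice sums. The sketch below is a rough illustration, not a proof: the square lattice $\mathbb{Z} + \mathbb{Z}i$, the truncation to $|a|, |b| \le 40$, the sample point $z$, and all function names are assumptions of mine. Truncation introduces a small error, so only approximate agreement should be expected. For this particular lattice, multiplication by $i$ preserves $\Lambda$ and forces $G_6 = 0$, which the code also exhibits.

```python
def lattice(w1, w2, n):
    """Nonzero lattice points a*w1 + b*w2 with |a|, |b| <= n."""
    return [a * w1 + b * w2
            for a in range(-n, n + 1) for b in range(-n, n + 1)
            if (a, b) != (0, 0)]

def wp(z, pts):
    """Truncated Weierstrass function."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in pts)

def wp_prime(z, pts):
    """Truncated derivative, -2 times the sum of (z - w)^(-3) over all w (including 0)."""
    return -2 / z**3 - 2 * sum(1 / (z - w)**3 for w in pts)

def eisenstein(two_k, pts):
    """Truncated Eisenstein series: sum of w^(-2k) over the sampled points."""
    return sum(w ** (-two_k) for w in pts)

pts = lattice(1, 1j, 40)           # truncated square lattice Z + Zi
g2 = 60 * eisenstein(4, pts)
g3 = 140 * eisenstein(6, pts)

z = 0.3 + 0.2j
lhs = wp_prime(z, pts) ** 2
rhs = 4 * wp(z, pts) ** 3 - g2 * wp(z, pts) - g3
rel_err = abs(lhs - rhs) / abs(lhs)
print(rel_err, abs(g3))
```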

    Theorem 4.5 The discriminant $\Delta(\Lambda) := g_2^3 - 27g_3^2 \ne 0$, and so $4x^3 - g_2x - g_3$ has distinct roots, whence $E_\Lambda$ defined by $y^2 = 4x^3 - g_2x - g_3$ is an elliptic curve (over $\mathbb{Q}$ if $g_2, g_3 \in \mathbb{Q}$).

    Proof: Setting $\omega_3 = \omega_1 + \omega_2$, $\wp'(\omega_i/2) = -\wp'(-\omega_i/2) = -\wp'(\omega_i/2)$ since $\wp'$ is odd and elliptic. Hence $\wp'(\omega_i/2) = 0$. Thus $4\wp(\omega_i/2)^3 - g_2\wp(\omega_i/2) - g_3 = 0$. We just need to show that these three roots of $4x^3 - g_2x - g_3 = 0$ are distinct.

    Consider $\wp(z) - \wp(\omega_i/2)$. This has exactly one pole, of order 2, and so by the next lemma has either two zeros of order 1 (which cannot be the case since this function is even and elliptic) or 1 zero of order 2, namely $\omega_i/2$. Hence $\wp(\omega_i/2) - \wp(\omega_j/2) \ne 0$ if $i \ne j$. QED

    Lemma 4.6 If $f$ is elliptic and $v_w(f)$ is the order of vanishing of $f$ at $w$, then $\sum_{w \in \mathbb{C}/\Lambda} v_w(f) = 0$. Moreover, the sum of all the zeros minus the poles is $0 \ (\mathrm{mod}\ \Lambda)$.

    Proof: By Cauchy's residue theorem, $\sum_{w \in \mathbb{C}/\Lambda} v_w(f) = \frac{1}{2\pi i} \int \frac{f'}{f}\, dz$, where the integral is over the boundary of the fundamental parallelogram with vertices $0, \omega_1, \omega_2, \omega_3$ (if a pole or zero happens to land on the boundary, then translate the whole parallelogram to avoid it). By ellipticity, the contributions from parallel sides cancel, so the integral is 0. The last statement is proved similarly, using $\frac{1}{2\pi i} \int z \frac{f'}{f}\, dz$. QED

    Theorem 4.7 There is a bijection (in fact a homeomorphism of Riemann surfaces) $\phi : \mathbb{C}/\Lambda \to E_\Lambda(\mathbb{C})$ given by

    $$z \mapsto (\wp(z), \wp'(z)) \ (z \notin \Lambda), \qquad z \mapsto \infty \ (z \in \Lambda).$$

    Proof: Ellipticity of $\wp$ and $\wp'$ implies that $\phi$ is well-defined and (∗) shows that the image is in $E_\Lambda(\mathbb{C})$. To show surjectivity, given $(x, y) \in E_\Lambda(\mathbb{C}) - \{\infty\}$, we consider $\wp(z) - x$, a nonconstant elliptic function with a pole (at 0) and so a zero, say at $z = a$. By (∗), $\wp'(a)^2 = y^2$. By oddness of $\wp'$ and evenness of $\wp$, we see that $\phi(a)$ or $\phi(-a)$ is $(x, y)$.

    To show injectivity, if $\phi(z_1) = \phi(z_2)$ with $2z_1 \notin \Lambda$, then consider $\wp(z) - \wp(z_1)$, which has a pole of order 2 and zeros at $z_1, -z_1, z_2$, so $z_2 \equiv \pm z_1 \ (\mathrm{mod}\ \Lambda)$. The equality $\wp'(z_1) = \wp'(z_2)$ of second coordinates then fixes the sign. QED

    Note that this bijection from $\mathbb{C}/\Lambda$, which is a group (in fact a torus), puts a group structure on $E_\Lambda(\mathbb{C})$, so that it is isomorphic to $\mathbb{R}/\mathbb{Z} \times \mathbb{R}/\mathbb{Z}$. Our desired Galois representations will come from Galois actions on certain finite subgroups of $E_\Lambda(\mathbb{C})$. Later, we shall see that every elliptic curve over $\mathbb{C}$ is of the form $E_\Lambda$.

    Theorem 4.8 The group law on $E_\Lambda(\mathbb{C})$ is given by saying that three points $P_1, P_2, P_3$ add up to the identity, $\infty$, if and only if $P_1, P_2, P_3$ are collinear. If two of the points coincide, this means the tangent at that point.

    Proof: Fixing $z_1, z_2$, let $y = mx + b$ be the line through $P_1 = (\wp(z_1), \wp'(z_1))$ and $P_2 = (\wp(z_2), \wp'(z_2))$.

    Consider $f(z) = \wp'(z) - m\wp(z) - b$, which has one pole of order 3 at 0, whence three zeros. Two of these are $z_1, z_2$; let the third be $z_3$. Since the zeros minus poles equals 0, we get $z_1 + z_2 + z_3 = 0$. Thus, if $P_3 = (\wp(z_3), \wp'(z_3))$, then $P_1 + P_2 + P_3 = \infty$. QED

    The importance of this is that it means that the coordinates of $(x_1, y_1) + (x_2, y_2)$ are rational functions in $x_1, x_2, y_1, y_2, g_2, g_3$. This yields a group structure on $E(K)$ whenever $g_2, g_3 \in K$, since e.g. the associative law is a formal identity in these rational functions.
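The rational functions can be written down explicitly. The sketch below uses the normalization $y^2 = x^3 + ax + b$ rather than $y^2 = 4x^3 - g_2x - g_3$, which is an assumption made to match the earlier example $y^2 = x(x-1)(x+1)$; the function name `add` and the second test curve $y^2 = x^3 - x + 1$ are likewise my own choices. Exact rational arithmetic is used so that identities such as associativity can be checked without rounding.

```python
from fractions import Fraction as F

INF = None  # the point at infinity, the identity element

def add(P, Q, a):
    """Chord-and-tangent sum of P and Q on y^2 = x^3 + a*x + b
    (the coefficient b does not enter the formulas)."""
    if P is INF:
        return Q
    if Q is INF:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return INF                          # vertical line through P and Q
    if P == Q:
        m = (3 * x1 * x1 + a) / (2 * y1)    # slope of the tangent at P
    else:
        m = (y2 - y1) / (x2 - x1)           # slope of the chord through P and Q
    x3 = m * m - x1 - x2                    # third intersection of the line
    return (x3, m * (x1 - x3) - y1)         # reflected in the x-axis

# On y^2 = x(x-1)(x+1) = x^3 - x the points (0,0), (1,0), (-1,0) lie on the line y = 0.
P1, P2, P3 = (F(0), F(0)), (F(1), F(0)), (F(-1), F(0))
print(add(add(P1, P2, -1), P3, -1))         # collinear points sum to infinity (None)

# On y^2 = x^3 - x + 1 the points (0,1), (1,1), (-1,1) lie on the line y = 1.
P, Q, R = (F(0), F(1)), (F(1), F(1)), (F(-1), F(1))
print(add(add(P, Q, -1), R, -1))            # again infinity
print(add(add(P, P, -1), Q, -1), add(P, add(P, Q, -1), -1))  # associativity sample
```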

    Definition 4.9 Given an elliptic curve $E$ over $\mathbb{Q}$ and positive integer $n$, the $n$-division points of $E$ are given by $E[n] := \{P \in E(\mathbb{C}) \mid nP = \infty\}$.

    Example: If $E$ is $y^2 = f(x)$ with $f \in \mathbb{Z}[x]$, let $\alpha_i$ ($i = 1, 2, 3$) be the roots of $f(x) = 0$ and $P_i = (\alpha_i, 0)$. The tangent at $P_i$ is vertical and so $P_i, P_i, \infty$ are collinear, whence $2P_i = \infty$. Thus $E[2] = \{\infty, P_1, P_2, P_3\}$. (No other point has order 2 by the following.)

    Lemma 4.10 Let $E$ be an elliptic curve over $\mathbb{Q}$. Since $\mathbb{C}/\Lambda \cong \mathbb{R}/\mathbb{Z} \times \mathbb{R}/\mathbb{Z}$, $E[n] \cong \mathbb{Z}/n \times \mathbb{Z}/n$. Moreover, there are polynomials $f_n \in \mathbb{Q}[x]$ such that $E[n] = \{(x, y) \mid f_n(x) = 0\} \cup \{\infty\}$.

    Proof: The elements of $\mathbb{R}/\mathbb{Z}$ of order dividing $n$ form a cyclic group of order $n$. The $f_n$ come from iterating the rational function that describes addition of two points on the curve. QED

    Example: Continuing the case $n = 2$, we see that $f_2 = f$.

    By the last lemma, $E[n] \subseteq E(\bar{\mathbb{Q}})$ and if $P = (x, y) \in E[n]$ and $\sigma \in G_{\mathbb{Q}}$, then $\sigma(P) = (\sigma(x), \sigma(y)) \in E[n]$. This action of $G_{\mathbb{Q}}$ on $E[n] \cong \mathbb{Z}/n \times \mathbb{Z}/n$ produces a homomorphism $\rho_{E,n} : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}/n)$. These are the Galois representations associated to $E$.

    Let $\ell$ be a prime. Consider the inverse system consisting of groups $GL_2(\mathbb{Z}/\ell^n)$ together with the natural maps between them. An object of the corresponding new category is given by $(G_{\mathbb{Q}}, \{\rho_{E,\ell^n}\})$, yielding a homomorphism $\rho_{E,\ell^\infty} : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}_\ell)$, called the $\ell$-adic representation associated to $E$. Another way of viewing this is as the Galois action on the $\ell$-adic Tate module $T_\ell(E) = \varprojlim E[\ell^n]$, where the maps in the inverse system are $E[\ell^n] \to E[\ell^m]$ for $n > m$ defined by $P \mapsto \ell^{n-m}P$.

    Example: Continuing the case $n = 2$, note that $GL_2(\mathbb{Z}/2) \cong S_3$, the symmetric group on 3 letters. The action of $G_{\mathbb{Q}}$ on $E[2] = \{\infty, (\alpha_1, 0), (\alpha_2, 0), (\alpha_3, 0)\}$ amounts to permuting the roots $\alpha_i$ of $f$, and so the image of $\rho_{E,2}$ is $\mathrm{Gal}(K/\mathbb{Q}) \le S_3$ where $K$ is the splitting field of $f$.
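The isomorphism $GL_2(\mathbb{Z}/2) \cong S_3$ used in this example can be verified by brute force: each invertible matrix permutes the three nonzero vectors of $(\mathbb{Z}/2)^2$, which play the role of $P_1, P_2, P_3$. The sketch below is an illustration; the helper names are hypothetical.

```python
from itertools import permutations, product

def mat_mul(a, b):
    """Product of two 2x2 matrices over Z/2."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))

# All invertible 2x2 matrices over Z/2 (determinant 1 modulo 2)
GL2 = [((a, b), (c, d)) for a, b, c, d in product(range(2), repeat=4)
       if (a * d - b * c) % 2 == 1]

# The three nonzero vectors of (Z/2)^2, standing in for the points of order 2
points = [(1, 0), (0, 1), (1, 1)]

def as_permutation(m):
    """The permutation of {0, 1, 2} induced by m on the nonzero vectors."""
    image = [tuple(sum(m[i][j] * v[j] for j in range(2)) % 2 for i in range(2))
             for v in points]
    return tuple(points.index(w) for w in image)

perms = {as_permutation(m) for m in GL2}
print(len(GL2), len(perms))   # six matrices, six distinct permutations
```

Since the six matrices give six distinct permutations and the assignment respects multiplication, it is an isomorphism onto $S_3$.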

    Given a typical $E$ (meaning one with no complex multiplications), $\rho_{E,\ell^\infty}$ is surjective for all but finitely many primes $\ell$. In fact it is surjective for all $\ell$ for a set of elliptic curves of density 1, for example if $E$ is the curve $y^2 + y = x^3 - x$ (which is elliptic, as one sees by completing the square).

    These $\ell$-adic Galois representations encode much information about elliptic curves. For example, for a fixed $\ell$, $\rho_{E,\ell^\infty}$ and $\rho_{E',\ell^\infty}$ are equivalent (conjugate) if and only if $E$ and $E'$ are isogenous. Later on, we shall study in great detail the $\ell$-adic representations associated to semistable elliptic curves.


    4.2 Group schemes

    An elliptic curve $E$ defined over $\mathbb{Q}$ yields groups $E(A)$ for any $\mathbb{Q}$-algebra $A$. This can be usefully generalized as follows. Ultimately we shall define finite, flat group schemes and see that they provide a quite general source of Galois representations. A good resource for this section is [37].

    Definition 4.11 Let $R$ be a commutative ring with 1. An affine group scheme over $R$ is a representable functor from the category of $R$-algebras (i.e. rings $A$ together with a homomorphism $R \to A$, with morphisms the ring homomorphisms that make a commutative triangle over $R$) to the category of groups.

    Recall that a functor F is representable by the R-algebra < ifand only if F (A) = homR−alg(


    (iv) The functor $\mu_n$ defined by $\mu_n(A) := \{x \in A \mid x^n = 1\}$ is representable since $\mu_n(A) = \hom_{R\text{-alg}}(R[T]/(T^n - 1), A)$. $\mu_n$ is a subgroup scheme of $\mathbb{G}_m$.
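To make the functor $\mu_n$ concrete, one can evaluate it on the rings $A = \mathbb{Z}/m$, viewed as algebras over $R = \mathbb{Z}$ (this choice of $R$ and of test rings is an assumption made for illustration, and the function name `mu` is hypothetical). The sketch lists $\mu_n(\mathbb{Z}/m)$ and checks on one example that a ring homomorphism $\mathbb{Z}/15 \to \mathbb{Z}/5$ carries $\mu_n(\mathbb{Z}/15)$ into $\mu_n(\mathbb{Z}/5)$, as functoriality requires.

```python
def mu(n, m):
    """mu_n(Z/m): the elements x of the ring Z/m with x^n = 1."""
    return {x for x in range(m) if pow(x, n, m) == 1 % m}

print(sorted(mu(3, 7)))    # cube roots of 1 in Z/7
print(sorted(mu(2, 8)))    # square roots of 1 in Z/8: more than two of them
print(sorted(mu(4, 15)))   # fourth roots of 1 in Z/15

# Functoriality: reduction mod 5 is a ring map Z/15 -> Z/5 and maps mu_4 into mu_4.
image = {x % 5 for x in mu(4, 15)}
print(image <= mu(4, 5))
```

Each $\mu_n(\mathbb{Z}/m)$ is a subgroup of the unit group $(\mathbb{Z}/m)^\times = \mathbb{G}_m(\mathbb{Z}/m)$, in line with $\mu_n$ being a subgroup scheme of $\mathbb{G}_m$.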

    The following example shows how certain group schemes can give rise to Galois representations. It will be generalized below.

    Definition 4.12 Let $K$ be a field, $\ell \ne \mathrm{char}\, K$. Then $G_K$ acts on $\mu_{\ell^n}(\bar{K}) \cong \mathbb{Z}/\ell^n$. This yields a representation $\chi_{\ell^n} : G_K \to GL_1(\mathbb{Z}/\ell^n) = (\mathbb{Z}/\ell^n)^\times$. Putting these together yields $\chi_{\ell^\infty} : G_K \to GL_1(\mathbb{Z}_\ell) = \mathbb{Z}_\ell^\times$, called the $\ell$-adic cyclotomic character. As with elliptic curves, we can provide an alternative definition by doing the inverse limit before the Galois action; namely letting $T_\ell(\mu) := \varprojlim \mu_{\ell^n}$, then $\chi_{\ell^\infty}$ gives the action of $G_K$ on the Tate module $T_\ell(\mu)$.

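    Exercise: Let $K = \mathbb{Q}$. Show that the $\ell$-adic cyclotomic character $\chi$ is unramified at all primes $p \ne \ell$, i.e. $\chi_p : G_{\mathbb{Q}_p} \to \mathbb{Z}_\ell^\times$ is trivial on the inertia subgroup $G_0$. This then induces a map $\mathrm{Gal}(\bar{\mathbb{F}}_p/\mathbb{F}_p) \cong G_{\mathbb{Q}_p}/G_0 \to \mathbb{Z}_\ell^\times$. Show that the image of the $p$th Frobenius element is $p$. (Hint: note that $\sigma(\zeta) = \zeta^{\chi(\sigma)}$ for any $\ell$-power root of 1, $\zeta$, in $\bar{\mathbb{Q}}$.)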

    Exercise: Some universal ring constructions.

    (i) Given a ring homomorphism $R \to S$ and an $R$-algebra $A$, consider the collection of rings $B$ and homomorphisms that make the following diagram commute:

    $$\begin{array}{ccc}
    A & \longrightarrow & B \\
    \uparrow & & \uparrow \\
    R & \longrightarrow & S
    \end{array}$$

    Show that these form the objects in a category of $S$-algebras, with an initial object. This initial object is the tensor product $A \otimes_R S$. In particular, $R[T_1, \ldots, T_n] \otimes_R S = S[T_1, \ldots, T_n]$.

    (ii) Given $x \in R$, consider the collection of $R$-algebras $A$ such that under $R \to A$, $x$ maps to a unit in $A$. Show that these form the objects in a category, with an initial object. This initial object is the localization of $R$ at $x$.

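    The case of (i) we shall be most interested in is where $R = \mathbb{Z}_\ell$. If $A \cong \mathbb{Z}_\ell^n \times T$ ($T$ torsion) as a $\mathbb{Z}_\ell$-module, then $A \otimes_R \mathbb{Q}_\ell \cong \mathbb{Q}_\ell^n$ as a $\mathbb{Q}_\ell$-module and $A \otimes_R \mathbb{F}_\ell \cong \mathbb{F}_\ell^n \times T/\ell T$ as an $\mathbb{F}_\ell$-module.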

    Let φ : R → S be a ring homomorphism. Given a functorF on R-algebras, we get a functor F ′ on S-algebras since viacomposition with φ every S-algebra S → A is an R-algebra. If< is an R-algebra, then homS−alg(


    Exercise: For each $R$-algebra $A$, define $F(A) = \ker(G(A) \to H(A))$. Show that $F$ is a group scheme over $R$.

    For example, the determinant map gives a homomorphism from $GL_n$ to $\mathbb{G}_m$ with kernel $SL_n$, and the map $x \mapsto x^n$ gives a homomorphism from $\mathbb{G}_m$ to $\mathbb{G}_m$ with kernel $\mu_n$. It requires a lot more work to give the cokernel a functorial description.

    We now have the objects and morphisms of the category ofaffine group schemes over a fixed ring R. An elliptic curveE : y2 = f(x) over Q gives for each Q-algebra A a groupE(A) and for each Q-algebra map A → B a group homomor-phism E(A) → E(B). Is it an affine group scheme? Actuallynot - the obvious try is < = Q[x, y]/(y2 − f(x)), in which casehomQ−alg(


    rank $n^2$) affine group scheme over $\mathbb{Q}_p$. (It can be shown directly to be affine by finding a polynomial $f$ whose zero set $V((f))$, see below, does not meet $E[n]$, whence $E[n]$ lies in an affine chart $U_f$. See Conrad's article in [5] for details. A nice explicit description of $E[n]$ can be found in [9].) This will be discussed in the section on reduction of elliptic curves in the next chapter.

    Next, we make explicit the kind of group schemes that yield Galois representations useful to us.

    Definition 4.14 Let G be a finite group scheme over a fieldK, e.g. represented by K-algebra

  • 4.3 General schemes liv

    of < all lie in some finite Galois extension of K and so theaction is continuous.

    Conversely, given a finite groupH with continuousGK-action,consider first the case of transitive action, say H = GKh. Bycontinuity, choose finite extension L of K such that the actionof GK factors through Gal(L/K). Let S be the subgroup fixingh and < ⊆ L its fixed field. By Galois theory, all maps < → K̄map to L and are conjugate, yielding a GK-isomorphism H →homK−alg(


    Let R be a commutative ring with 1 and SpecR denote the setof prime ideals of R. For example, SpecZp = {(0), pZp}. ThenSpecR comes with a topology, the Zariski topology, defined byhaving the closed sets be the sets V (I) as I runs through allideals of R, where

    V (I) := {℘ ∈ SpecR|I ⊆ ℘}.

    Exercise: Show that this does indeed define a topology on SpecR,i.e. that ∅, R are closed sets and that arbitrary intersections andfinite unions of closed sets are closed. Show that SpecZp is notHausdorff. Show that SpecR is always compact.

    If $f : R \to S$ is a ring homomorphism and $\wp$ is a prime ideal of $S$, then $R/f^{-1}(\wp) \to S/\wp$ is an injective homomorphism into an integral domain, and so $f^{-1}(\wp)$ is a prime ideal of $R$. Thus $f$ induces a map $\mathrm{Spec}\,S \to \mathrm{Spec}\,R$, which can be checked to be continuous with respect to the Zariski topologies. For example, if $I$ is an ideal, then since the prime ideals of $R$ containing $I$ are in bijection with those of $R/I$, $\mathrm{Spec}(R/I) \to \mathrm{Spec}\,R$ is an injection with image $V(I)$. If $x \in R$, then likewise $\mathrm{Spec}\,R_x \to \mathrm{Spec}\,R$ (from localization) is injective with image $\mathrm{Spec}\,R - V((x))$. We thus think of $\mathrm{Spec}\,R$ as a ringed space by having $R_x$ be the ring of functions on this basic open set.

    We shall identify $\mathrm{Spec}\,\Re$ with the affine scheme represented by

  • 4.3 General schemes lvi

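    yields a commutative diagram:

    $$\begin{array}{ccc} \mathrm{Spec}\,B & \xrightarrow{\ \phi\ } & \mathrm{Spec}\,A \\ & {\scriptstyle \phi_i}\searrow \quad \swarrow{\scriptstyle \pi_i} & \\ & \mathrm{Spec}\,R & \end{array}$$

    i.e. a morphism of affine schemes over $\mathrm{Spec}\,R$. The category of affine schemes over $\mathrm{Spec}\,R$ is hereby anti-equivalent to the category of $R$-algebras.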

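    Exercise: Let $\wp \in \mathrm{Spec}\,R$ and let $\kappa(\wp) = \mathrm{Frac}(R/\wp)$. Show that $\kappa(\wp)$ is an $R$-algebra, so that $\mathrm{Spec}\,\kappa(\wp)$ embeds in $\mathrm{Spec}\,R$. Let $A$ be an $R$-algebra, so that $\mathrm{Spec}\,A$ is a cover of $\mathrm{Spec}\,R$. Show that its fibre over $\wp$ can be identified with $\mathrm{Spec}(A \otimes_R \kappa(\wp))$.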

    A scheme then is a ringed space admitting a covering by open sets that are affine schemes. Morphisms of schemes are defined locally, i.e. $f : S' \to S$ is a morphism if there is a covering of $S$ by open, affine subsets $\mathrm{Spec}\,R_i$ such that $f^{-1}(\mathrm{Spec}\,R_i)$ is an affine scheme $\mathrm{Spec}\,R'_i$ and the restriction map $\mathrm{Spec}\,R'_i \to \mathrm{Spec}\,R_i$ is a morphism of affine schemes. We say that $S'$ is a scheme over $S$. In Grothendieck's approach, this relative notion is important rather than absolute questions about a scheme. Questions about $f$ turn into questions about the ring maps $f_i : R_i \to R'_i$. In particular, we say that $f$ has property $(*)$ (for example, is finite or flat), if there is a covering of $S$ such that each of the ring maps $f_i$ has this property.

    If $S$ is a scheme, then a group scheme over $S$ is a representable functor $F$ from the category of schemes over $S$ to the category of groups, i.e. there exists some scheme $\mathcal{S}$ over $S$ such that $F(X) = \hom_{S\text{-schemes}}(X, \mathcal{S})$. For example, an elliptic curve over $\mathbf{Q}$ is a (non-affine) group scheme over $\mathrm{Spec}\,\mathbf{Q}$.

  • 4.4 Modular forms lvii

    4.4 Modular forms

    Another source of Galois representations (in fact, one which turns out to produce all that we are interested in) is modular forms. For this section, [28], [19], and [11] are recommended. Note that elliptic curves will ultimately correspond to modular forms of weight 2 and trivial Nebentypus, and so those forms will be highlighted.

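    Fix positive integers $k, N$ and a homomorphism $\varepsilon : (\mathbf{Z}/N)^\times \to \mathbf{C}^\times$. Let $H = \{z \in \mathbf{C} \mid \mathrm{Im}(z) > 0\}$. For $\sigma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbf{Z})$, set $\sigma z = \frac{az+b}{cz+d}$ and $f|[\sigma]_k(z) = (cz+d)^{-k} f(\sigma z)$. Set

    $$\Gamma_0(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbf{Z}) \ \middle|\ c \equiv 0 \pmod{N} \right\}$$

    and

    $$\Gamma_1(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N) \ \middle|\ d \equiv 1 \pmod{N} \right\}.$$

    Definition 4.17 A modular function of weight $k$, level $N$, and Nebentypus $\varepsilon$ is a function $f$ that
    (i) is meromorphic on $H$,
    (ii) satisfies $f|[\sigma]_k = \varepsilon(d) f$ for all $\sigma \in \Gamma_0(N)$,
    (iii) is meromorphic at all cusps.
    If the function is holomorphic on $H$ and at the cusps, then it is called a modular form. If, in addition, it vanishes at the cusps, then it is called a cusp form.

    We now explain (iii). Note that by (ii), applied to $\sigma = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $f(z + 1) = f(z)$. Thus, $f$ has a Fourier expansion in terms of $q = e^{2\pi i z}$, say $f = \sum a_n q^n$. We say that $f$ is meromorphic (respectively, holomorphic, vanishes) at $\infty$ if $a_n = 0$ for all $n <$ some $n_0$ (respectively $n < 0$, $n \le 0$). We say that $f$ is meromorphic (respectively, holomorphic, vanishes) at the cusps if $f|[\sigma]_k$ is meromorphic (respectively, holomorphic, vanishes) at $\infty$ for all $\sigma \in SL_2(\mathbf{Z})$.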

    For example, if $N = 1$, then $\Gamma_0(N) = \Gamma_1(N) = SL_2(\mathbf{Z})$. If $f$ is a modular form of this level, then its Nebentypus must be trivial. Moreover, $f|[\sigma]_k = f$ for all $\sigma \in SL_2(\mathbf{Z})$ and so the cusp condition only need be checked at $\infty$. In general, one has finitely many conditions to check, taking $\sigma$ running through the finitely many cosets of $\Gamma_0(N)$ in $SL_2(\mathbf{Z})$.
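    The definitions above rest on three elementary facts: $\Gamma_0(N)$ is closed under multiplication, $\sigma$ maps $H$ to itself because $\mathrm{Im}(\sigma z) = \mathrm{Im}(z)/|cz+d|^2$, and the factor $cz + d$ satisfies a cocycle rule that makes $f \mapsto f|[\sigma]_k$ compatible with matrix products. The following sketch checks them numerically for one arbitrary choice of matrices and of $z$. The helper names (`mat_mul`, `in_gamma0`, `act`, `factor`) and the sample values are illustrative assumptions, not notation from the text.

```python
def mat_mul(s, t):
    """Product of 2x2 integer matrices stored as tuples (a, b, c, d)."""
    a, b, c, d = s
    e, f, g, h = t
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def in_gamma0(s, N):
    """True if s has determinant 1 and lower-left entry divisible by N."""
    a, b, c, d = s
    return a * d - b * c == 1 and c % N == 0

def act(s, z):
    """Moebius action z -> (az + b) / (cz + d)."""
    a, b, c, d = s
    return (a * z + b) / (c * z + d)

def factor(s, z):
    """Automorphy factor cz + d appearing in f|[s]_k."""
    _, _, c, d = s
    return c * z + d

sigma = (1, 0, 6, 1)   # sample element of Gamma_0(6)
tau = (7, 1, 6, 1)     # another sample element of Gamma_0(6)
z = 0.3 + 0.8j         # sample point of the upper half plane
product = mat_mul(sigma, tau)

print(in_gamma0(product, 6))
print(abs(act(sigma, z).imag - z.imag / abs(factor(sigma, z)) ** 2))
print(abs(factor(product, z) - factor(sigma, act(tau, z)) * factor(tau, z)))
```

    The last two printed values are floating-point residuals and should be of the order of rounding error.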

    The set of modular forms (respectively cusp forms) of weight $k$, level $N$, and Nebentypus $\varepsilon$ will be denoted $M_k(N, \varepsilon)$ (respectively $S_k(N, \varepsilon)$). As will be shown later, these are finite-dimensional $\mathbf{C}$-vector spaces.

    Exercise: Show that if $f, f'$ are modular functions of weights $k, k'$ respectively and level 1, then $ff'$, $f/f'$ are modular functions of weights $k + k'$, $k - k'$ respectively and level 1. Show that for $\lambda \in \mathbf{C}$, $\lambda f$ and, if $k = k'$, $f + f'$ are modular functions of weight $k$ and level 1.

    Example: Let $\Lambda$ be the lattice in $\mathbf{C}$ generated by fundamental periods $1, \tau$, where $\tau \in H$. Then

    $$G_{2k}(\tau) = G_{2k}(\Lambda) = \sum_{(m,n) \neq (0,0)} \frac{1}{(m\tau + n)^{2k}}$$

    (the Eisenstein series) is a modular form of weight $2k$ and level 1. (ii) follows since $G_{2k}(\lambda\Lambda) = \lambda^{-2k} G_{2k}(\Lambda)$ and (iii) follows since uniform convergence allows passage to the limit term by term, the $m \neq 0$ terms giving 0, the $m = 0$ terms giving $\sum_{n \neq 0} n^{-2k} = 2\zeta(2k)$.

    For this same $\Lambda$, $\Delta = g_2^3 - 27 g_3^2$ is therefore a modular form of weight 12 and level 1. Using the known values of $\zeta(4), \zeta(6)$, we get its constant coefficient 0, and so it is a cusp form. In fact, looking deeper shows that $\Delta = (2\pi)^{12} q \prod_{n=1}^{\infty} (1 - q^n)^{24}$ [28].
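    The vanishing of the constant coefficient can be written out as a short computation. The worked step below assumes the usual normalizations $g_2 = 60\,G_4$ and $g_3 = 140\,G_6$ (presumably those fixed earlier in the text) together with the classical values $\zeta(4) = \pi^4/90$ and $\zeta(6) = \pi^6/945$.

```latex
% Constant terms (values as Im(tau) -> infinity), assuming g_2 = 60 G_4 and g_3 = 140 G_6.
\begin{align*}
  G_4 &\to 2\zeta(4) = \frac{\pi^4}{45}, &
  G_6 &\to 2\zeta(6) = \frac{2\pi^6}{945}, \\
  g_2 &\to 60 \cdot \frac{\pi^4}{45} = \frac{4\pi^4}{3}, &
  g_3 &\to 140 \cdot \frac{2\pi^6}{945} = \frac{8\pi^6}{27}, \\
  g_2^3 &\to \frac{64\pi^{12}}{27}, &
  27 g_3^2 &\to 27 \cdot \frac{64\pi^{12}}{729} = \frac{64\pi^{12}}{27},
\end{align*}
so the constant term of $\Delta = g_2^3 - 27 g_3^2$ is $0$.
```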


    Henceforth, we shall normalize $\Delta$ so that its first coefficient is 1.
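    With this normalization, $\Delta = q \prod_{n \ge 1} (1 - q^n)^{24}$, and its first few Fourier coefficients (the Ramanujan $\tau$-function) can be read off by expanding the product. The following is a minimal sketch using exact integer arithmetic; the function name `delta_coefficients` is an illustrative choice.

```python
def delta_coefficients(num_terms):
    """Coefficients a_0, ..., a_{num_terms-1} of q * prod_{n>=1} (1 - q^n)^24."""
    series = [0] * num_terms
    series[0] = 1
    for n in range(1, num_terms):
        for _ in range(24):
            # Multiply in place by (1 - q^n); going downward keeps series[i - n] unmodified.
            for i in range(num_terms - 1, n - 1, -1):
                series[i] -= series[i - n]
    # Multiplying by q shifts every coefficient up by one place.
    return [0] + series[:num_terms - 1]

print(delta_coefficients(7))
```

    The list starts with $a_0 = 0$ (cusp form) and $a_1 = 1$ (the normalization).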

    Another important example is $j(\tau) := 1728\, g_2^3/\Delta$ (here with $\Delta = g_2^3 - 27 g_3^2$, before normalization), which is a modular function of weight 0 and level 1, and thus defines a map $j : SL_2(\mathbf{Z})\backslash H \to \mathbf{C}$. Its Fourier expansion $\frac{1}{q} + 744 + 196884 q + \ldots$ has fascinating connections with the Monster finite simple group.
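    The expansion of $j$ can be reproduced from $q$-series. The sketch below assumes the standard identities $g_2 = \frac{4\pi^4}{3} E_4$ with $E_4 = 1 + 240 \sum_{n \ge 1} \sigma_3(n) q^n$, and $g_2^3 - 27 g_3^2 = (2\pi)^{12} q \prod (1 - q^n)^{24}$, which together give $q \cdot j = E_4^3 / \prod_{n \ge 1} (1 - q^n)^{24}$. The names `sigma3`, `mul` and `j_coefficients` are illustrative.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def mul(a, b, size):
    """Product of two power series (coefficient lists), truncated to `size` terms."""
    out = [0] * size
    for i, x in enumerate(a[:size]):
        if x:
            for k, y in enumerate(b[:size - i]):
                out[i + k] += x * y
    return out

def j_coefficients(num_terms):
    """Coefficients of q^-1, q^0, q^1, ... in the Fourier expansion of j."""
    e4 = [1] + [240 * sigma3(n) for n in range(1, num_terms)]
    series = mul(mul(e4, e4, num_terms), e4, num_terms)
    for n in range(1, num_terms):
        for _ in range(24):
            # Multiply in place by 1 / (1 - q^n) = 1 + q^n + q^(2n) + ...
            for i in range(n, num_terms):
                series[i] += series[i - n]
    return series

print(j_coefficients(4))
```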

    The best way to think of modular forms is in terms of associated Riemann surfaces, called modular curves.

    4.4.1 Riemann surfaces

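    A surface is a topological space $S$ which is Hausdorff and connected such that there is an open cover $\{U_\alpha \mid \alpha \in A\}$ and homeomorphisms $\phi_\alpha$ of $U_\alpha$ to open sets $V_\alpha \subseteq \mathbf{C}$. Then $(U_\alpha, \phi_\alpha)$ is called a chart and the set of charts an atlas. If $U_\alpha \cap U_\beta \neq \emptyset$, then the transition function is $t_{\alpha\beta} = \phi_\beta \phi_\alpha^{-1} : \phi_\alpha(U_\alpha \cap U_\beta) \to \phi_\beta(U_\alpha \cap U_\beta)$. Call $S$ a Riemann surface if all the $t_{\alpha\beta}$, where defined, are analytic.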

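    Example: (i) Let $\mathbf{C}_\infty = \mathbf{C} \cup \{\infty\}$. Take the topology with open sets those in $\mathbf{C}$ together with $\{\infty\} \cup (\mathbf{C} - K)$ ($K$ compact in $\mathbf{C}$). Let $U_\alpha = \mathbf{C}$, $\phi_\alpha(z) = z$ and $U_\beta = \mathbf{C}_\infty - \{0\}$, $\phi_\beta(z) = 1/z$ ($z \in \mathbf{C}$, $z \neq 0$), $\phi_\beta(\infty) = 0$. These two charts make $\mathbf{C}_\infty$ a compact Riemann surface, identifiable with the sphere.

    (ii) If $\Lambda$ is a lattice in $\mathbf{C}$, then $\mathbf{C}/\Lambda$ is a compact Riemann surface.

    We now introduce another useful class of compact Riemann surfaces: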

    Definition 4.18 Let $H^* = H \cup \mathbf{Q} \cup \{\infty\}$, and put a topology on $H^*$ by taking the following as basic open sets:
    (i) about a point in $H$, any open disk entirely inside $H$;
    (ii) about $\infty$, $\{\mathrm{Im}\,\tau > r\} \cup \{\infty\}$ for any $r > 0$;
    (iii) about $x \in \mathbf{Q}$, $D \cup \{x\}$, where $D$ is any open disk in $H$ of radius $y > 0$ and center $x + iy$.

    Extend the action of $\sigma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ on $H$ to act on $x = [x, 1] \in \mathbf{Q}$ and $\infty = [1, 0]$ by $\sigma[x, y] = [ax + by, cx + dy] \in \mathbf{P}^1(\mathbf{Q})$ (i.e. homogenized). Let $Y_i(N) = \Gamma_i(N)\backslash H \subset \Gamma_i(N)\backslash H^* = X_i(N)$. The $X_i(N)$ are compact Riemann surfaces, called modular curves. See [19] p.311 or [11] p.76 for more details.

    Riemann surfaces form a category where the morphisms are analytic maps defined thus.

    Definition 4.19 Call a continuous map $f : R \to S$ of Riemann surfaces analytic if the maps $\psi_\beta f \phi_\alpha^{-1}$ are analytic maps on domains in $\mathbf{C}$, wherever defined. An analytic map $f : R \to \mathbf{C}_\infty$ is a meromorphic function on $R$ (this matches the usual definition, setting $f(p) = \infty$ if $f$ has a pole at $p$). The collection of all such functions is a field, the function field $K(R)$.

    Example: (i) The meromorphic functions on the torus $\mathbf{C}/\Lambda$ are just the elliptic functions with respect to $\Lambda$ (in fact $= \mathbf{C}(\wp, \wp')$).

    (ii) The meromorphic functions on the modular curve $X_0(N)$ are just the modular functions of weight 0, level $N$, and trivial Nebentypus.

    The most important result here is the Open Mapping Theorem, stating that any analytic nonconstant map of Riemann surfaces maps open sets to open sets.

    Exercise: Using this, show that if $f : R \to S$ is one such and $R$ is compact, then $f(R) = S$, and so $S$ is compact. Deduce Liouville's theorem. Moreover, show that $f$ is a $k$-to-1 map for some $k$ (hint: let $S_m \subseteq S$ be those points with precisely $m$ preimages, counting multiplicity. Show that $S_m$ is open and then use compactness and connectedness of $S$).

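    Lemma 4.20 The function $j$ is surjective.

    Proof: Consider $j$ as a map $X_0(1) \to \mathbf{C}_\infty$. Since $\Delta$ has a simple zero at $\infty$ and no others (as we showed in establishing $E_\Lambda$ is an elliptic curve), $j$ has a simple pole at $\infty$ and no others. Thus, invoking the exercise above, $j$ is 1-to-1 and onto $\mathbf{C}_\infty$; in particular it maps $H$ onto $\mathbf{C}$. QED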

    Corollary 4.21 Every elliptic curve over $\mathbf{C}$ is of the form $E_\Lambda$, for some lattice $\Lambda$.

    Proof: Given $y^2 = f(x)$ with $f \in \mathbf{C}[x]$ a cubic with distinct roots, we can, by change of variables, get $y^2 = 4x^3 - Ax - B$ for some $A, B \in \mathbf{C}$. The claim is that there exists $\Lambda$ such that $g_2(\Lambda) = A$, $g_3(\Lambda) = B$. From the definition of $j$ above, we can find $\tau$ such that $g_3^2/g_2^3$ takes any value other than $\frac{1}{27}$. Pick $\tau$ such that this equals $B^2/A^3$ ($\neq \frac{1}{27}$ since $f$ has distinct roots). Choose $\lambda$ such that $g_2(\tau) = \lambda^4 A$. Then $g_3(\tau)^2 = \lambda^{12} B^2$, so $g_3(\tau) = \pm\lambda^6 B$. If we have the negative sign, then replace $\lambda$ by $i\lambda$. Noting that Eisenstein series satisfy by definition $g_2(\lambda L) = \lambda^{-4} g_2(L)$, $g_3(\lambda L) = \lambda^{-6} g_3(L)$, we are done if we take $\Lambda = \lambda L$ where $L$ has basis $1, \tau$. QED

    Note that by the last theorem, $j$ identifies $X_0(1)$ with $\mathbf{C} \cup \{\infty\}$. In fact:

    Theorem 4.22 The modular functions of weight 0 and level 1 are precisely the rational functions of $j$, i.e. the function field $K(X_0(1)) = \mathbf{C}(j)$.

    Proof: An earlier exercise showed that rational functions in $j$ are modular functions of that weight and level. Conversely, if $f$ is such a function, say with poles $\tau_i$ in $H$, counted with multiplicity, then $g = f \prod_i (j(\tau) - j(\tau_i))$ is a modular function of weight 0 and level 1 with no poles in $H$. If $g$ has a pole of order $n$ at $\infty$, then there exists $c$ such that $g - c j^n$ has a pole of order at most $n - 1$ at $\infty$ (and no others). By induction, $g$ minus some polynomial in $j$ has no pole in $H^*$, and so is constant. Thus $g$, and so $f$, lies in $\mathbf{C}(j)$. QED

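    Letting $\alpha_i$ run over all integer matrices in

    $$\Delta_N := \left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \ \text{with}\ ad = N,\ d > 0,\ 0 \le b < d,\ \gcd(a, b, d) = 1 \right\}$$

    (of which there are $\mu(N) := N \prod_{p \mid N} \left(1 + \frac{1}{p}\right)$), the modular polynomial of order $N$ is $\Phi_N(x) = \prod_{i=1}^{\mu(N)} (x - j \circ \alpha_i)$. One root is $j_N := j \circ \alpha$, where $\alpha = \begin{pmatrix} N & 0 \\ 0 & 1 \end{pmatrix}$ (i.e. $j_N(z) = j(Nz)$).

    In fact, $\mu(N) = [SL_2(\mathbf{Z}) : \Gamma_0(N)]$ and so is the degree of the cover $X_0(N) \to X_0(1)$. The corresponding extension of function fields $K(X_0(1)) \subseteq K(X_0(N))$ is then also of degree $\mu(N)$. $K(X_0(N))$ turns out to be $K(X_0(1))(j_N)$: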

    Theorem 4.23 (See [19], p. 336 on.) $\Phi_N(x)$ has coefficients in $\mathbf{Z}[j]$ and is irreducible over $\mathbf{C}(j)$ (and so is the minimal polynomial of $j_N$ over $\mathbf{C}(j)$). The function field $K(X_0(N)) = \mathbf{C}(j, j_N)$.
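    The count $\mu(N)$ of matrices in $\Delta_N$, and hence the degree of $\Phi_N$, can be confirmed by brute force for small $N$. The following sketch enumerates the triples $(a, b, d)$ directly and compares with the product formula; the function names `delta_matrices` and `mu` are illustrative.

```python
from math import gcd

def delta_matrices(N):
    """All triples (a, b, d) with ad = N, d > 0, 0 <= b < d and gcd(a, b, d) = 1."""
    return [(N // d, b, d)
            for d in range(1, N + 1) if N % d == 0
            for b in range(d)
            if gcd(gcd(N // d, b), d) == 1]

def mu(N):
    """N * prod_{p | N} (1 + 1/p), computed with integer arithmetic."""
    result, n, p = N, N, 2
    while p * p <= n:
        if n % p == 0:
            result = result // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        result = result // n * (n + 1)
    return result

for N in (1, 2, 4, 9, 11, 12):
    print(N, len(delta_matrices(N)), mu(N))
```

    For prime $N = p$ the list consists of $\begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}$ and the $p$ matrices $\begin{pmatrix} 1 & b \\ 0 & p \end{pmatrix}$, giving $p + 1$ in total.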

    This enables us to define $X_0(N)$, a priori a curve over $\mathbf{C}$, over $\mathbf{Q}$. This means that it can be given by equations over $\mathbf{Q}$. If $N > 3$, then one can further define a scheme over $\mathbf{Z}[1/N]$, so that base change via $\mathbf{Z}[1/N] \to \mathbf{Q}$ yields this curve (in other words, we have a model for $X_0(N)$ over $\mathbf{Q}$ with good reduction at primes not dividing $N$).
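    To make the theorem concrete for $N = 2$: the classical modular polynomial, quoted here from the standard literature rather than from this text, is $\Phi_2(X, Y) = X^3 + Y^3 - X^2Y^2 + 1488(X^2Y + XY^2) - 162000(X^2 + Y^2) + 40773375\,XY + 8748000000(X + Y) - 157464000000000$. The sketch below checks with exact integer $q$-series that $\Phi_2(j(\tau), j(2\tau)) = 0$ up to a chosen order. It assumes $q \cdot j = E_4^3/\prod (1 - q^n)^{24}$ and clears denominators by multiplying through by $q^6$; all function names are illustrative.

```python
def mul(a, b, size):
    """Product of two power series (coefficient lists), truncated to `size` terms."""
    out = [0] * size
    for i, x in enumerate(a[:size]):
        if x:
            for k, y in enumerate(b[:size - i]):
                out[i + k] += x * y
    return out

def qj_series(size):
    """Coefficients of q * j = E4^3 / prod_{n>=1} (1 - q^n)^24, truncated."""
    sigma3 = lambda n: sum(d ** 3 for d in range(1, n + 1) if n % d == 0)
    e4 = [1] + [240 * sigma3(n) for n in range(1, size)]
    series = mul(mul(e4, e4, size), e4, size)
    for n in range(1, size):
        for _ in range(24):
            for i in range(n, size):  # multiply by 1 / (1 - q^n)
                series[i] += series[i - n]
    return series

def phi2_times_q6(size):
    """Coefficients of q^6 * Phi_2(j(tau), j(2 tau)) up to q^(size - 1)."""
    A = qj_series(size)                  # A = q * j(tau), so X = A / q
    B = [0] * size                       # B = q^2 * j(2 tau), so Y = B / q^2
    for i, c in enumerate(A):
        if 2 * i < size:
            B[2 * i] = c
    one = [1] + [0] * (size - 1)
    A2, B2 = mul(A, A, size), mul(B, B, size)
    # Each entry: (integer coefficient, series, power of q left after clearing q^6).
    terms = [
        (1, mul(A2, A, size), 3), (1, mul(B2, B, size), 0),
        (-1, mul(A2, B2, size), 0),
        (1488, mul(A2, B, size), 2), (1488, mul(A, B2, size), 1),
        (-162000, A2, 4), (-162000, B2, 2),
        (40773375, mul(A, B, size), 3),
        (8748000000, A, 5), (8748000000, B, 4),
        (-157464000000000, one, 6),
    ]
    total = [0] * size
    for coeff, series, shift in terms:
        for i, c in enumerate(series):
            if i + shift < size:
                total[i + shift] += coeff * c
    return total

print(phi2_times_q6(12))
```

    Every coefficient cancels exactly, even though individual terms are very large, which illustrates the remark that these polynomials have huge coefficients.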

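    Proof: First, we note that $j, j_N$ are indeed in $K(X_0(N))$, i.e. satisfy $f(z) = f(\sigma z)$ for all $\sigma \in \Gamma_0(N)$. This clearly holds for $j$ since $j$ is a modular function of weight 0 on all of $SL_2(\mathbf{Z})$. Since $j_N(\sigma z) = j \circ \alpha \circ \sigma(z) = j \circ (\alpha\sigma\alpha^{-1})\alpha(z)$ and $\alpha\sigma\alpha^{-1} = \begin{pmatrix} a & Nb \\ N^{-1}c & d \end{pmatrix} \in SL_2(\mathbf{Z})$, $j_N(\sigma z) = j \circ \alpha(z) = j_N(z)$. The condition at the cusps is easily checked.

    Next, we note that $j \circ \alpha_i$ ($1 \le i \le \mu(N)$) are distinct functions on $H$. If $\alpha_i = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$, then $j \circ \alpha_i(z) = j\left(\frac{az+b}{d}\right) = \frac{1}{q_d^a \zeta_d^b} + \ldots$, where $q_d = e^{2\pi i z/d}$, $\zeta_d = e^{2\pi i/d}$. If $j \circ \alpha_i = j \circ \alpha_{i'}$, take the quotient and let $\mathrm{Im}\,z \to \infty$ to get $q_d^a \zeta_d^b = q_{d'}^{a'} \zeta_{d'}^{b'}$. So $a/d = a'/d'$, but since $ad = N = a'd'$ and all are positive, $a = a'$, $d = d'$, whence $b = b'$ too.

    Next, we show the properties of $\Phi_N$. If $\gamma \in SL_2(\mathbf{Z})$, then we check that $\alpha_i\gamma = \beta\alpha_k$ for some $k$ and $\beta \in SL_2(\mathbf{Z})$. Thus $j \circ \alpha_i \circ \gamma = j \circ \alpha_k$, whence $\gamma$ permutes the roots of $\Phi_N$, and so its coefficients are invariant under $SL_2(\mathbf{Z})$, hence $\in \mathbf{C}(j)$ (meromorphic since polynomials in the $j \circ \alpha_i$). In fact, one easily computes that $SL_2(\mathbf{Z})$ acts transitively on the roots of $\Phi_N$, whence $\Phi_N$ is irreducible over $\mathbf{C}(j)$. To show its coefficients lie in $\mathbf{Z}[j]$, we see that their $q$-expansion coefficients lie in $\mathbf{Z}[\zeta_N]$ since $d \mid N$. The automorphism $\zeta_N \mapsto \zeta_N^r$ (any $r$ coprime to $N$) permutes the $j \circ \alpha_i$, and so the coefficients lie in $\mathbf{Q} \cap \mathbf{Z}[\zeta_N] = \mathbf{Z}$.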

    QED

    These polynomials $\Phi_N$ tend to have huge coefficients, but at least they define $X_0(N)$ as a curve over $\mathbf{Q}$.

    If $X$ is a compact Riemann surface of genus $g$ and $W = \Omega_{hol}(X)$ its holomorphic differentials, then $W$ and so $V = \hom(W, \mathbf{C})$ are $g$-dimensional $\mathbf{C}$-vector spaces.

    Exercise: Show that there is an isomorphism of $\mathbf{C}$-vector spaces given by

    $$\Omega_{hol}(X_0(N)) \to S_2(N), \quad f(z)\,dz \mapsto f(z).$$

    (Hint: if $f(z)$ is such a cusp form and $\sigma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N)$, then $f(\sigma z)\,d(\sigma z) = (cz+d)^2 f(z)\,(cz+d)^{-2}\,dz = f(z)\,dz$. The holomorphicity of $f(z)\,dz$ corresponds to being a cusp form since $2\pi i\,dz = dq/q$.)

    This shows that the dimension of $S_2(N)$ is the genus of $X_0(N)$ (in particular finite), computable by Riemann-Roch (explicitly given in [19]). Likewise, the dimension of $S_2(N, \varepsilon)$ is bounded by the genus of $X_1(N)$, and so is finite. There are similar interpretations of $S_k(N)$ for higher $k$.
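    For concreteness, the genus can be computed from the classical formula $g = 1 + \frac{\mu}{12} - \frac{\nu_2}{4} - \frac{\nu_3}{3} - \frac{\nu_\infty}{2}$, where $\mu$ is the index of $\Gamma_0(N)$, $\nu_2, \nu_3$ count elliptic points and $\nu_\infty$ counts cusps. This formula is quoted from the standard literature as an assumption (it is presumably the explicit formula referred to in [19], but it is not stated in this text), and the function names are illustrative.

```python
from fractions import Fraction
from math import gcd

def prime_factors(n):
    """Distinct primes dividing n."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def euler_phi(n):
    result = n
    for p in prime_factors(n):
        result = result // p * (p - 1)
    return result

def genus_X0(N):
    """Genus of X_0(N), i.e. dim S_2(N), from the classical formula (assumed)."""
    primes = prime_factors(N)
    mu = N
    for p in primes:
        mu = mu // p * (p + 1)
    nu2 = 0 if N % 4 == 0 else 1
    nu3 = 0 if N % 9 == 0 else 1
    for p in primes:
        # Factors 1 + (-1/p) and 1 + (-3/p), with the symbol taken to be 0 at p = 2, 3 resp.
        nu2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
        nu3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    nu_inf = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    g = 1 + Fraction(mu, 12) - Fraction(nu2, 4) - Fraction(nu3, 3) - Fraction(nu_inf, 2)
    assert g.denominator == 1  # sanity check: the genus must be an integer
    return int(g)

print([genus_X0(N) for N in (1, 2, 11, 23, 37)])
```

    In particular the genus of $X_0(2)$ is 0, so $S_2(2) = 0$, while $N = 11$ is the smallest level with a nonzero weight 2 cusp form.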

    Let $C_1, \ldots, C_{2g}$ denote the usual $2g$ cycles on the $g$-handled $X$, generating the free abelian group $H_1(X, \mathbf{Z})$. Let $\Lambda = \mathrm{Im}(H_1(X, \mathbf{Z}) \to \hom(W, \mathbf{C}))$ where the map is $C \mapsto (\omega \mapsto \int_C \omega)$, a discrete subgroup of rank $2g$ called the period lattice of $X$. The Jacobian of $X$, $\mathrm{Jac}(X)$, is the $g$-dimensional complex torus $\mathbf{C}^g/\Lambda$, isomorphic as a group to $(\mathbf{R}/\mathbf{Z})^{2g}$. This is an example of an abelian variety over $\mathbf{C}$.

    The Abel map is $X \to \mathrm{Jac}(X)$, $x \mapsto \{\int_{x_0}^{x} \omega_j\}$ (for some fixed $x_0$ whose choice doesn't matter since the image is defined up to a period in $\Lambda$). In the case $g = 1$, this is bijective. Let $\mathrm{Div}(X)$ be the free abelian group on the points of $X$ and $\mathrm{Div}^0(X)$ its subgroup consisting of the elements whose coefficients sum to 0. The Abel map extends to a group homomorphism $\mathrm{Div}(X) \to \mathrm{Jac}(X)$, whose restriction to $\mathrm{Div}^0(X)$ is surjective with kernel the so-called principal divisors $P(X)$.

    Since $\mathrm{Div}^0(X_0(N))/P(X_0(N))$ makes sense over $\mathbf{Q}$, this defines $J_0(N) := \mathrm{Jac}(X_0(N))$ over $\mathbf{Q}$ (in fact it is a coarse moduli scheme over $\mathbf{Z}[1/N]$ if $N > 3$). Galois action on its division points then produces Galois representations $G_\mathbf{Q} \to GL_{2g}(\mathbf{Z}/n)$. From this we obtain 2-dimensional Galois representations associated to certain modular forms, defined below.

    Definition 4.24 If $f$ is a cusp form of weight $k$, level $N$, and Nebentypus $\varepsilon$, define the $m$th Hecke operator by

    $$T_m f = m^{(k/2)-1} \sum_j f|[\alpha_j]_k,$$

    where if $N = 1$, the $\alpha_j$ run through $\Delta_m$, and for general $N$ through the matrices in $\Delta_m$ with $\gcd(a, N) = 1$.

    Exercise: If $f$ has $q$-expansion at $\infty$, $\sum_{n=0}^{\infty} a_n q^n$, then $T_m f$ has $q$-expansion

    $$\sum_{n=0}^{\infty} b_n q^n, \quad \text{where } b_n = \sum_{d \mid (m,n)} \varepsilon(d)\, d^{k-1} a_{mn/d^2}$$

    (taking $\varepsilon(d) = 0$ if $(d, N) \neq 1$). Show that $T_m f$ is again a cusp form of weight $k$, level $N$, and Nebentypus $\varepsilon$, so that $T_m$ is a linear operator on $S_k(N, \varepsilon)$. Furthermore, check the Hecke operators commute.
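    The coefficient formula in the exercise is easy to apply mechanically. The sketch below implements it on truncated $q$-expansions and applies $T_2$ and $T_3$ to the normalized $\Delta = q \prod (1 - q^n)^{24}$ (weight 12, level 1, trivial Nebentypus). The function names and the default arguments `N=1`, `eps` are illustrative assumptions.

```python
from math import gcd

def delta_coefficients(num_terms):
    """Coefficients a_0, ..., a_{num_terms-1} of q * prod_{n>=1} (1 - q^n)^24."""
    series = [0] * num_terms
    series[0] = 1
    for n in range(1, num_terms):
        for _ in range(24):
            for i in range(num_terms - 1, n - 1, -1):  # multiply by (1 - q^n)
                series[i] -= series[i - n]
    return [0] + series[:num_terms - 1]

def hecke(m, coeffs, k, N=1, eps=lambda d: 1):
    """Apply T_m via b_n = sum_{d | (m, n)} eps(d) d^(k-1) a_{mn/d^2}.

    Only those b_n with m * n < len(coeffs) can be computed from a truncation.
    """
    out = []
    for n in range((len(coeffs) - 1) // m + 1):
        g = gcd(m, n)  # note gcd(m, 0) = m
        total = 0
        for d in range(1, g + 1):
            if g % d == 0 and gcd(d, N) == 1:  # eps(d) is taken to be 0 otherwise
                total += eps(d) * d ** (k - 1) * coeffs[m * n // (d * d)]
        out.append(total)
    return out

delta = delta_coefficients(30)
t2 = hecke(2, delta, 12)
t3 = hecke(3, delta, 12)
print(t2[:5], t3[:5])
```

    The outputs are $-24$ and $252$ times the coefficients of $\Delta$, respectively, consistent with $\Delta$ being an eigenform with eigenvalues $a_2$ and $a_3$.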

    Definition 4.25 If $T_m f = \lambda_m f$ (some $\lambda_m \in \mathbf{C}$) for all $m$, then $f$ is called a cuspidal eigenform. Actually, $a_m = \lambda_m a_1$ and we shall normalize eigenforms so that $a_1 = 1$ and so $\lambda_m = a_m$. The commutative ring the Hecke operators generate, $T$, is called the Hecke algebra.

    For example, since $S_{12}(1)$ (trivial Nebentypus) has dimension 1, with basis $\Delta$, $\Delta$ is a cuspidal eigenform. A very useful corollary to the definition follows:

    Corollary 4.26 If $f = \sum a_n q^n$ is a cuspidal eigenform, the map $T_m \mapsto a_m$ extends to a ring homomorphism (eigencharacter) $\theta : T \to \mathbf{C}$.

    Consider the (injective) map $S_k(N) \to \mathbf{C}[[q]]$ that sends a cusp form to its Fourier expansion at $\infty$. Then the inverse image of $\mathbf{Z}[[q]]$ is $S_k(N; \mathbf{Z})$ and $S_k(N; A) = S_k(N; \mathbf{Z}) \otimes_\mathbf{Z} A$. Note that using the explicit coefficients of $T_m$, $T$ acts on $S_k(N; \mathbf{Z})$. The $q$-expansion principle says that $S_k(N; \mathbf{C}) = S_k(N)$, i.e. $S_k(N)$ has a basis in $S_k(N; \mathbf{Z})$, and so $T$ embeds in $\mathrm{End}\,S_k(N; \mathbf{Z})$. This has various consequences:

    Theorem 4.27 $T$ is a finite free $\mathbf{Z}$-algebra. If $f$ is a cuspidal eigenform, then there exists an algebraic number ring $O_f$ containing all coefficients of $f$ and the image of $\varepsilon$. (Note that the image of $\theta$ above lies in $O_f$.)

    If $A$ is a ring, we let $S_k(N; A)$ denote the cusp forms of level $N$ and weight $k$ defined over $A$. Note that

    $$S_k(N; A) = \hom_{A\text{-alg}}(T, A). \quad (\dagger)$$

    This follows for $A = \mathbf{Z}$ by mapping $\phi : S_k(N; \mathbf{Z}) \to \hom(T, \mathbf{Z})$ by $f \mapsto (t \mapsto a_1(tf))$ and then noting that $\phi$ is injective and that $T$ is free of rank $\le$ that of $S_k(N; \mathbf{Z})$. The general case follows by tensoring with $A$.

    Note that the Hecke operators $T_m$ also act on $\mathrm{Div}^0(X_0(N))$ (and so $J_0(N)$) by extending linearly $T_m[z] = \sum [\alpha_i z]$, where $[z]$ is the orbit of $z \in H$ and $\alpha_i$ runs (again) through all matrices $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ with $ad = m$, $d > 0$, $\gcd(a, N) = 1$, $0 \le b < d$. Let $I_p$ denote the image of inertia under $G_{\mathbf{Q}_p} \to G_\mathbf{Q}$, $D_p$ the image of $G_{\mathbf{Q}_p}$, and $\mathrm{Fr}_p$ the element of $D_p/I_p$ mapping to the Frobenius map in $G_{\mathbf{F}_p}$. For $\lambda$ a prime ideal of $O_f$, denote the fraction field of $(O_f)_\lambda$ by $K_\lambda$.

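    Theorem 4.28 Let $f = \sum a_n q^n$ be a cuspidal eigenform of weight $k$, level $N$, and Nebentypus $\varepsilon$. For each prime $\lambda$ (lying above the rational prime $\ell$), there exists a unique semisimple continuous homomorphism $\rho : G_\mathbf{Q} \to GL_2(K_\lambda)$ such that if a prime $p \nmid \ell N$, then $\rho(I_p) = \{1\}$, $\mathrm{tr}\,\rho(\mathrm{Fr}_p) = a_p$, $\det\rho(\mathrm{Fr}_p) = \varepsilon(p)\,p^{k-1}$.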

    Proof: We explain the case $k = 2$ and trivial Nebentypus, since the elliptic curve representations we care about will correspond to this case.

    Let $\ell$ be the rational prime below $\lambda$. Let $J_0(N)[\ell^n] \cong (\mathbf{Z}/\ell^n)^{2g}$ denote the kernel of multiplication by $\ell^n$ on $J_0(N)$ and $T_\ell(J_0(N)) = \varprojlim J_0(N)[\ell^n] \cong \mathbf{Z}_\ell^{2g}$ the corresponding Tate module, a free $\mathbf{Z}_\ell$-module of rank $2g$ on which $G_\mathbf{Q}$ acts (the argument for elliptic curves carries over to $g$-dimensional tori), where $g$ is the genus of $X_0(N)$. The action of $T_m$ on $J_0(N)$ given above carries over to $T_\ell(J_0(N))$. Then $W := T_\ell(J_0(N)) \otimes_{\mathbf{Z}_\ell} \mathbf{Q}_\ell$ is a $T \otimes_\mathbf{Z} \mathbf{Q}_\ell$-module, in fact free of rank 2. (The proof of this comes from the Hodge decomposition $S_2(N) \oplus \bar{S}_2(N) = H^1(X, \mathbf{C})$, where $\bar{S}_2(N)$ gives the anti-holomorphic differentials, and the fact that if $A$ is a field of characteristic 0, then $S_k(N; A)$ is free of rank 1 over $T \otimes_\mathbf{Z} A$ by $(\dagger)$ above.) This yields a representation $G_\mathbf{Q} \to GL_2(T \otimes_\mathbf{Z} \mathbf{Q}_\ell)$. Note that the particular level $N$ eigenform has not been used yet.

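    The eigencharacter $T \to O_f$ now induces a homomorphism $T \otimes_\mathbf{Z} \mathbf{Q}_\ell \to O_f \otimes_\mathbf{Z} \mathbf{Q}_\ell$. Since $O_f$ is an order in a number field, $O_f \otimes_\mathbf{Z} \mathbf{Q}_\ell \cong \prod_{\lambda \mid \ell} K_\lambda$. Mapping onto the $\lambda$th component yields $\rho : G_\mathbf{Q} \to GL_2(K_\lambda)$. We shall see that $\rho$ has the desired properties in the next chapter. QED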

    Exercise: Show that if $O$ is the ring of integers in a number field, then the splitting of $\ell$ in $O$ determines the form of $O \otimes_\mathbf{Z} \mathbf{Q}_\ell$. (Hint: $\mathbf{Z}[x]/(f(x)) \otimes_\mathbf{Z} \mathbf{Q}_\ell \cong \mathbf{Q}_\ell[x]/(f(x))$.)

    For cases with $k > 2$, Deligne [7] used higher symmetric powers of the Tate module. For $k = 1$, Deligne and Serre [8] built the representation from congruent eigenforms of higher weight.

    Exercise: Let $\rho : G_\mathbf{Q} \to GL_n(K)$ be a continuous homomorphism where $K$ is a local field with valuation ring $O$. Considering $K^n$ thus as a $G_\mathbf{Q}$-module, show that there is a stable lattice $O^n$ under this action.

    Letting $O$ be the valuation ring of $K_\lambda$ and $F$ its residue field, the residual representation associated to a cuspidal eigenform $f$ will mean the composition $G_\mathbf{Q} \xrightarrow{\ \rho\ } GL_2(O) \to GL_2(F)$ produced above.

    The Big Picture. We have three sources of naturally occurring Galois representations, namely elliptic curves, modular forms, and étale group schemes. The most general is the last, and these will be used next to define the general class of semistable representations we shall focus on. We shall see that this class contains the representations coming from an elliptic curve associated by Frey to a putative counterexample to Fermat's Last Theorem. The first thing is to create links between the various kinds of representations defined.


    5 Invariants of Galois representations, semistable representations

    Let $F$ be a finite field of characteristic $\ell > 2$ and $V$ a 2-dimensional $F$-vector space on which $G_\mathbf{Q}$ acts continuously. Thus we have a representation $\bar\rho : G_\mathbf{Q} \to GL_2(F)$. By composing $\bar\rho$ with the homomorphism $G_{\mathbf{Q}_\ell} \to G_\mathbf{Q}$, $V$ has a $G_{\mathbf{Q}_\ell}$-action and so defines an étale group scheme over $\mathbf{Q}_\ell$.

    Definition 5.1 Call $\bar\rho$ good at $\ell$ if this group scheme comes via base change from a finite flat group scheme over $\mathbf{Z}_\ell$ a