Transcript of equi- to a use was...7. The monument at Borrowston Rig requires anchor points that are placed...

Page 1: equi- to a use was...7. The monument at Borrowston Rig requires anchor points that are placed farther apart on the circumference than the intersections made by the trisecting radii

References and Notes

1. S. P. Ó Ríordáin and G. Daniel, New Grange (Praeger, New York, 1964).

2. G. Hawkins, Stonehenge Decoded (Doubleday, Garden City, N.Y., 1965).

3. A. Thom, Megalithic Sites in Britain (Oxford, London, 1967).

4. G. Daniel, The Megalithic Builders of Western Europe (Hutchinson, London, 1959), pp. 111-113.

5. Actually this tumulus represents what Thom calls a type D ring, one in which the two pivot stakes are placed at one-third the radius from a, rather than at the midpoint. As Fig. 3 shows, the pivots can be kept at the midpoint of the radius if a is allowed to move inside the design.

6. L. B. Borst, Science 163, 567 (1969).

7. The monument at Borrowston Rig requires anchor points that are placed farther apart on the circumference than the intersections made by the trisecting radii demand. In this design the circle of the smaller end arc passes through the center of the larger end arc, and the wider placement of the anchor stakes may have been made on this account. The site at Maen Mawr had anchor stakes that were closer together on the circumference. If the radial line from P2 to either anchor is taken as a hypotenuse of a right triangle one side of which is half the anchor line, then the lengths of the sides of this triangle (in megalithic half-yards) are 14, 17, and 22, which is nearly Pythagorean (14² + 17² = 485 ≈ 484 = 22²).

8. M. Gardner, Sci. Amer. 221, 239 (1969).

9. To mention one such consequence, with the megalithic method concentric rings, such as those found at Woodhenge, can be drawn without using any special mensuration technique. That is, a rope can be lengthened by an unspecified amount and a concentric design can be drawn at once. With a flexible compass, on the other hand, the large arc in a type II egg, for example, can be drawn without special measurement but the other arcs must be changed by x amount to remain equidistant from the perimeter of the original figure.

10. Thom presents a good argument for use of a standard length of 2.72 feet (one megalithic yard) in the construction of these structures, as well as of others in both the Old World and the New.

11. The writing of this article was supported in part by the Oklahoma State University Research Foundation.

Mechanism of Antibody Diversity: Germ Line Basis for Variability

Analysis of amino termini of 64 light chains indicates that much antibody variability is present in the germ line.

Leroy Hood and David W. Talmage

The nature of the genetic control of antibody variability is one of the most fascinating and approachable problems in mammalian genetics. Vertebrate organisms appear to be capable of synthesizing thousands of different antibody sequences, each presumably encoded by a different antibody gene. How then do these genes arise? The somatic theory of antibody diversity postulates that antibody genes arise by hypermutation from a few germ line genes during somatic differentiation. In contrast, the germ line theory postulates that vertebrates have a separate germ line gene for each antibody polypeptide chain the creature is capable of elaborating.

We discuss here certain patterns that have emerged from amino acid sequence analysis of antibody polypeptide chains. These patterns indicate that much of the sequence diversity is present in the germ line. We also discuss why the germ line theory seems to be the simplest explanation for antibody diversity.

Dr. Hood is a senior investigator in the Immunology Branch of the National Cancer Institute, Bethesda, Maryland. Dr. Talmage is professor of microbiology and dean of the University of Colorado Medical School, Denver. Requests for reprints should be sent to Dr. Talmage.

17 APRIL 1970

General Immunoglobulin Structure

All five recognized classes of antibodies (immunoglobulins) in mammals contain two distinct polypeptide chains, called light and heavy chains (1). For example, there are two identical light and two identical heavy chains per molecule in the major serum immunoglobulins, the immunoglobulin G class; but the light and heavy chains differ chemically in different antibodies. Since normal antibodies produced against the simplest of antigenic determinants are generally heterogeneous, advantage has been taken of the homogeneous immunoglobulins produced in large quantities by plasmacytomas in humans and in the highly inbred BALB/c mouse (2). Light chains are frequently excreted in the urine of individuals with certain plasmacytomas; these light chains have been called Bence Jones proteins. This article deals only with light chains because relatively little information on comparative sequences is available for heavy chains.

Light chains from most mammalian species including man are of two types, lambda and kappa, which are readily distinguished by serological and chemical criteria (1, 3). Light polypeptide chains have two parts: a common region (approximately residues 108 to 215), which is essentially invariant for a given light chain type and species, and a variable region (approximately residues 1 to 107), which is different for each well-characterized protein (4-6). Presumably each variable region sequence directs the folding of a unique antibody light chain configuration.

Two patterns emerge when the amino acid sequences from many light chain variable regions are compared. (i) All kappa and lambda variable region sequences can be divided into subgroups on the basis of their similarity to one of eight prototype sequences (Fig. 1). (ii) Relatively minor deviation from the prototype sequences occurs among individual proteins of a given subgroup (Fig. 1) and is designated intrasubgroup variation. We first discuss the variable region subgroups and the reasons for concluding that each light chain subgroup must be encoded by at least one distinct germ line gene. We then discuss why we believe that the intrasubgroup variation may also be encoded by separate germ line genes.

Variable Region Subgroups of Kappa Chains

The nearly complete sequences of six variable regions of human kappa chains are known (Roy, Ag, Cum, Mil, Eu, and Ti), and partial sequences of more than 35 others have been determined. All variable regions of kappa chains can be assigned to one of three subgroups on the basis of their similarity to one of three sets of linked amino acid sequences (Figs. 1 and 2). These three sets of linked amino acids or three "prototype sequences" are derived by examining the proteins of

one subgroup and noting the major amino acid residue at each position (3, 7). For example, most VκI proteins have Asp (8) at position 1, Ile at position 2, Gln at position 3, and Met at position 4 (Fig. 1). Thus the prototype VκI sequence in this region would be Asp-Ile-Gln-Met. The three prototype sequences are given for the first 20 residues in Fig. 3, and it is evident that each prototype sequence differs from its counterparts by 5 to 6 residues in 20. At the amino terminus, most proteins within a subgroup deviate from their respective prototype sequence by approximately one residue in 20 (Fig. 1).

[Fig. 1 table: for each protein, the chain type (kappa or lambda), the source (BJ, IgG, IgM, or IgA), and the residues at positions 1 to 20, with position 191 for kappa chains and position 190 for lambda chains. Proteins are grouped under the prototype sequences of kappa subgroups VκI, VκII, and VκIII, the human pool, and lambda subgroups VλI to VλV. The per-protein alignment is not recoverable from this transcript.]
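The rule used above to derive a prototype, taking the major residue at each position within one subgroup, can be sketched in Python. This is a minimal illustration only: the function name is an assumption, and the four short sequences are hypothetical, chosen so that the result matches the Asp-Ile-Gln-Met example in the text. They are not data from Fig. 1.

```python
from collections import Counter

def prototype_sequence(aligned):
    """Return the most common residue at each position of aligned sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    prototype = []
    for column in zip(*aligned):
        # most_common(1) yields the single most frequent residue in this column
        residue, _count = Counter(column).most_common(1)[0]
        prototype.append(residue)
    return prototype

# Hypothetical residues at positions 1 to 4 of four proteins in one subgroup.
toy_subgroup = [
    ["Asp", "Ile", "Gln", "Met"],
    ["Asp", "Ile", "Gln", "Met"],
    ["Asp", "Ile", "Val", "Met"],
    ["Glu", "Ile", "Gln", "Met"],
]
print("-".join(prototype_sequence(toy_subgroup)))
```

Individual proteins that depart from the majority at a position, such as the third and fourth toy sequences here, are what the text calls intrasubgroup variation.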

Although the sequence data are more limited after position 20, these subgroup distinctions extend through the variable region at least to position 94 (beyond which subgroups cannot be distinguished in the present data). It is particularly important to note that in the carboxy-terminal portion of the Vκ region, the 12 sequences available fall into the same kappa subgroups; that is, individual proteins do not switch subgroups (Fig. 2). The separation between subgroups is indicated by the fact that kappa chains within a subgroup are identical for 73 to 87 percent of their sequences, whereas comparisons between subgroups show a 51 to 68 percent identity (Table 1).

Two regions in light chains seem to be more variable than the remainder of the Vκ region. Both "hypervariable" regions (residues 25 to 35 and 89 to 96) are just COOH-terminal to the cysteines (that is, Cys 23 and Cys 88) which form a disulfide bridge in the V region. Perhaps one or both of these sites is directly involved in the antigen-combining site. In spite of this variability, subgroup-specific residues are present even in these regions (Fig. 2).

In addition to the linked amino acid residues (Fig. 3), yet another criterion may distinguish kappa subgroups. Homologous sequence gaps (3) may separate the proteins of one subgroup from those of a second. For example, the

Fig. 1. Amino terminal sequences of human lambda and kappa chains divided into subgroups on the basis of associated amino acid residues at certain positions. The prototype sequence is given first with those residues unique to the subgroup underlined. In subsequent sequences, only those residues that differ from the prototype are included. The source of the light chain is given, with BJ indicating Bence Jones protein. Position 191 in kappa chains has a Leu-Val interchange (Inv marker). Position 190 in lambda chains has an Arg-Lys interchange (Oz group). Residues for positions 191 in kappa and 190 in lambda are in references listed below except for HBJ4, Hac, and Dob (63) and for the human pool (13). Most of this data is summarized in two papers (17, 64). Roy and Cum are taken from (65); Ag, Ha, Bo, and Sh (6); Ker and BJ (66); Cra, Pap, Lux, Mon, Con, Tra, Nig, Win, Gra, Cas, Smi, and human pool (9); Lay, Mar, Wag, Io, How, and Koh (67); Rad and Fr4 (7); Man, Bel, and B6 (64); Eu (68); Mil (28); Tew (69); Ti (70); HBJ1, HBJ4, HBJ10, HBJ5, and HS4 (3); HBJ2, HBJ7, HBJ8, HBJ11, and HBJ15 (25); HS92, HS78, HS94, HS68, HS77, HS70, and HS86 (17); Hac, Dob, and Pal (71); BJ98 (72); Hul (73); X (74); Kern and III (75); and Car, Dee, and Ale (7).

SCIENCE, VOL. 168


[Fig. 2 table: an alignment of the carboxy-terminal residues of kappa variable regions under two alternative position numberings, A (77 to 100) and B (83 to 108). The alignment is not recoverable from this transcript.]

Fig. 2. Carboxy-terminal portion of variable region. The alternative numberings of positions are (A) that used in most previous references and (B) that used in the computer comparison of various chains mentioned in this paper (see Tables 1 to 3). Subgroups cannot be discerned beyond positions 94 to 100, the same point at which a deletion occurs in Vκ regions compared with Vλ regions. References are given in Fig. 1.

two sequenced VκIII proteins have an insertion of four to six amino acids at or near position 30, as compared with the VκI proteins (the exact number of extra residues in the VκIII protein Mil is uncertain) (see Table 1). A single VκII protein has an insertion of one amino acid in this region. Thus, precisely located sequence gaps and insertions may also distinguish the three variable region subgroups of human kappa chains.

What do subgroups mean in relation to gene structure? It is difficult to imagine a genetic mechanism that could generate, from a single gene, the sequences of VκI, VκII, and VκIII proteins with linked amino acid residues and gaps. Indeed, it seems that each Vκ subgroup is encoded by a separate germ line gene (or genes).

Are these distinct kappa subgroup genes alleles? Niall and Edman (9) found that all three kappa subgroups are present in a pool of light chains; for example, Glu at position 1 is characteristic of VκII proteins, Gln at 3, of VκI proteins, and Leu at 9, of VκIII proteins (see Fig. 1, Human Pool). Two experiments suggest that every individual can produce VκI and VκII proteins. Light chains from 13 normal individuals have the same major alternatives as the normal pool at each of the first four positions (10). With three alleles the probability of selecting seven consecutive identical heterozygotes from the normal population is considerably less than one in a thousand [(2 × 1/9)⁷] if we assume an equal frequency. Furthermore, trypsin-digested peptides characteristic of VκI and VκII proteins have been isolated from the light chains of 17 normal individuals (7). The question of whether or not the minority group VκIII (11 percent of the pool of kappa myeloma chains according to Fig. 1) is found in all normal individuals has not yet been answered conclusively, but there is no reason to suspect that it is not. Thus, all individuals produce VκI and VκII

Table 1. Comparison of individual human variable regions within and between groups. Proteins were translated to nucleotide sequence and aligned to show maximum homology; the percentages of nucleotide and amino acid homology were then determined. The numbering was made according to longest chain combination (116 amino acid residues); deletions were placed in all Vκ chains at position 113, in all Vκ chains and VλIII chains at positions 101 and 102, in all Vλ chains at position 9, and in VλIII and VλV chains at position 1. Deletions around residue 30 were located as follows: VκI 30 to 35; VκII 32 to 36; VλI (New) 31 to 34; VλI (Ha) 31, 33 to 34; VλIII 31 to 36; VλIV 31, 33 to 34; and VλV 31 to 36. All chains were programmed for a computer to record comparisons as same nucleotide, different nucleotide, unknown, gap-gap, or gap-nucleotide. The percentage of nucleotide homology is the number of identical nucleotides times 100 divided by the sum of all known nucleotides. The single extra residue on Cum at the amino terminus was not counted. References are given in Fig. 1. Uncertainty regarding chain length of Mil and New and the location of a deletion in Ti gives only a small uncertainty in the percentage homology for these proteins.

Regions compared                 Nucleotide     Amino acid     Number of          Amino acid
                                 homology (%)   homology (%)   nucleotide-gap     chain length
                                                               positions          difference

Within subgroups
VκI (Roy) vs. VκI (Ag)           92             87             0                  0
VκI (Roy) vs. VκI (Eu)           80             73             0                  0
VκI (Ag) vs. VκI (Eu)            81             74             0                  0
VκIII (Mil) vs. VκIII (Cum)      85             82             0-6                0-2
VλI (New) vs. VλI (Ha)           84             75             0-3                0-1
VλIII (Kern) vs. VλIII (X)       79             73             0                  0

Between subgroups
VκI (Roy) vs. VκII (Ti)          74             61             3-9                1
VκI (Eu) vs. VκII (Ti)           76             68             3-9                1
VκI (Roy) vs. VκIII (Cum)        64             51             18                 6
VκIII (Cum) vs. VκII (Ti)        72             64             15                 5
VλI (Ha) vs. VλIII (Kern)        68             55             18                 6
VλI (Ha) vs. VλIV (Bo)           75             66             0                  0
VλI (Ha) vs. VλV (Sh)            68             56             12                 4
VλIII (Kern) vs. VλIV (Bo)       65             58             18                 6
VλIII (Kern) vs. VλV (Sh)        72             64             6                  2
VλIV (Bo) vs. VλV (Sh)           68             60             12                 4
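The homology measure defined in the legend to Table 1 can be sketched in Python as follows. This is a hedged illustration, not the authors' program: the symbols "-" for a gap and "?" for an unknown nucleotide are assumptions, "known nucleotides" is read here as positions where both sequences have a determined nucleotide, and the two aligned fragments are hypothetical. The final lines restate the amino acid homology values for the kappa comparisons as read from Table 1.

```python
from collections import Counter

GAP, UNKNOWN = "-", "?"  # assumed symbols for a gap and an undetermined nucleotide

def classify(a, b):
    """Assign one aligned position to one of the five categories in the legend."""
    if a == GAP and b == GAP:
        return "gap-gap"
    if a == GAP or b == GAP:
        return "gap-nucleotide"
    if a == UNKNOWN or b == UNKNOWN:
        return "unknown"
    return "same" if a == b else "different"

def percent_homology(seq_a, seq_b):
    """Identical nucleotides times 100, divided by positions known in both."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter(classify(a, b) for a, b in zip(seq_a, seq_b))
    known = counts["same"] + counts["different"]
    if known == 0:
        raise ValueError("no position has a known nucleotide in both sequences")
    return 100.0 * counts["same"] / known, counts

# Hypothetical aligned fragments, not taken from any protein in Table 1.
seq_a = "GATATCCAGATG---ACC"
seq_b = "GAAATCGTG?TGACCACC"
homology, counts = percent_homology(seq_a, seq_b)
print(f"{homology:.1f}% nucleotide homology", dict(counts))

# Amino acid homology (%) for the kappa comparisons, as read from Table 1.
within_kappa = [87, 73, 74, 82]
between_kappa = [61, 68, 51, 64]
print(min(within_kappa), max(within_kappa))    # range within subgroups
print(min(between_kappa), max(between_kappa))  # range between subgroups
```

The two ranges printed at the end correspond to the 73 to 87 percent and 51 to 68 percent figures quoted in the text for kappa chains.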


Kappa chains

Subgroup   Position
           1        3    4          9    10       12   13   14   15       17   18   19   20
VκI        Asp  -   Gln  Met  ----  Ser  Ser  -   Ser  Ala  Ser  Val  -   Asp  Arg  Val  Thr
VκII       Glu  -   Val  Leu  ----  Gly  Thr  -   Ser  Leu  Ser  Pro  -   Glu  Arg  Ala  Thr
VκIII      Asp  -   Val  Met  ----  Leu  Ser  -   Pro  Val  Thr  Pro  -   Glu  Pro  Ala  Ser

Fig. 3. Prototype sequences for three subgroups of human kappa variable regions. Hyphens represent residue identities in all prototype sequences.
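Assignment of a kappa chain to a subgroup by its similarity to the prototypes in Fig. 3 can be sketched as below. This is an illustrative reading, not the authors' procedure: the function name, the simple mismatch count, and the query sequence are assumptions. The query is hypothetical (the VκI prototype with Val substituted at position 3), and only the subgroup-distinguishing positions listed in Fig. 3 are compared.

```python
# Subgroup-distinguishing positions and residues as transcribed from Fig. 3.
POSITIONS = (1, 3, 4, 9, 10, 12, 13, 14, 15, 17, 18, 19, 20)
PROTOTYPES = {
    "VκI":   "Asp Gln Met Ser Ser Ser Ala Ser Val Asp Arg Val Thr".split(),
    "VκII":  "Glu Val Leu Gly Thr Ser Leu Ser Pro Glu Arg Ala Thr".split(),
    "VκIII": "Asp Val Met Leu Ser Pro Val Thr Pro Glu Pro Ala Ser".split(),
}

def assign_subgroup(residues):
    """Count mismatches against each prototype; positions absent from
    `residues` (a dict of position -> residue) are skipped."""
    mismatches = {}
    for name, prototype in PROTOTYPES.items():
        mismatches[name] = sum(
            1
            for position, expected in zip(POSITIONS, prototype)
            if position in residues and residues[position] != expected
        )
    best = min(mismatches, key=mismatches.get)
    return best, mismatches

# Hypothetical protein: VκI prototype with a single deviation at position 3.
query = dict(zip(POSITIONS, PROTOTYPES["VκI"]))
query[3] = "Val"
best, mismatches = assign_subgroup(query)
print(best, mismatches)
```

A protein that deviates from its own prototype at about one residue in 20 still lies much closer to that prototype than to the other two, which is the sense in which proteins "do not switch subgroups".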

proteins (and by inference VκIII proteins); hence these alternatives cannot be alleles. Therefore it appears that at least three germ line genes must encode Vκ regions in each individual. A paradox arises, however, in that the common region of the kappa chains appears to be encoded by a single gene (and not by three genes).
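The bracketed expression (2 × 1/9)⁷ in the allele argument above can be checked with exact arithmetic. The sketch below assumes one reading of that expression: three alleles at equal frequency 1/3, so that a particular heterozygote has probability 2 × (1/3) × (1/3) = 2/9, and seven individuals drawn independently. The variable names are illustrative.

```python
from fractions import Fraction

n_alleles = 3
allele_frequency = Fraction(1, n_alleles)  # equal frequency assumed, as in the text

# Probability that one individual carries one particular pair of different alleles.
p_specific_heterozygote = 2 * allele_frequency * allele_frequency  # 2/9

# Probability that seven independently chosen individuals are all that heterozygote.
p_seven = p_specific_heterozygote ** 7

print(p_seven, float(p_seven))
print(float(p_seven) < 1 / 1000)
```

The result is roughly 2.7 in 100,000, consistent with the statement that the probability is considerably less than one in a thousand.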

Common Region of Kappa Chains

The common regions of the kappa chains (residues 108 to 214) from light chains of different human myeloma proteins are identical except for a leucine-valine substitution at position 191 (4, 11). This subtle interchange can be detected in normal light chains by serological and chemical techniques. Using these techniques, one can do pedigree analyses which have shown that these two variants are inherited in Mendelian fashion, thus suggesting that the common region of human kappa chains is encoded by a single structural gene with two alleles (12, 13). The common region of the mouse kappa chains with a single amino acid sequence also appears to be encoded by a single gene (5, 14).

That the common region of the kappa chain is encoded by a single structural gene is also supported by evolutionary considerations. Genes for common regions of human and mouse light chains probably descended from a common ancestor (these common regions have the same number of residues and they are identical at 64 out of 107 positions, as indicated in Table 2). If multiple genes for the common region of the kappa chains existed in this ancestor and now exist in mouse and man, then it is difficult to explain how in the face of the normal mutation rate (40 percent of the residues differ in the chains of mouse and man) all copies within each species maintain the same amino acid sequence. Furthermore, if there are multiple genes for the common regions, it is difficult to see how the leucine and valine genetic markers evolved and why recombination has not randomized them. Thus the common region for kappa chains in man (and mice) apparently is encoded by a single gene (15).
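The two figures quoted above for the human and mouse common regions, identity at 64 of 107 positions and a difference of 40 percent, are two views of the same count. A short check, using only the numbers given in the text:

```python
identical_positions = 64
total_positions = 107

percent_identical = 100 * identical_positions / total_positions
percent_different = 100 - percent_identical

print(f"{percent_identical:.1f}% identical, {percent_different:.1f}% different")
```

Rounded to whole percentages, these values agree with the 60 percent amino acid homology listed for mCκ vs. hCκ in Table 2.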

Two Genes-One Polypeptide Chain

We have previously concluded that Vκ regions are encoded by at least three germ line genes. Since VκI, VκII, and VκIII sequences can be associated with a single variant for the common region (see kappa position 191 in Fig. 1), it seems that each kappa light chain is encoded by two genes (Vκ and Cκ), which are expressed as a single polypeptide chain. A similar picture seems to be true of human lambda chains.

Table 2. Comparison of regions in individual human and mouse kappa chains. M or m designates a mouse protein and h a human protein. M-41 gaps were placed at the same positions as VκI proteins; M-70 gaps were placed at positions 33 to 34, 101 to 102, and 113. Otherwise see legend to Table 1.

Regions compared           Nucleotide     Amino acid     Number of          Amino acid
                           homology (%)   homology (%)   nucleotide-gap     chain length
                                                         positions          difference

M-41 vs. M-70              64             57             12                 4
M-41 vs. hVκI (Ag)         77             64             0                  0
M-41 vs. hVκII (Ti)        71             58             3-9                1
M-41 vs. hVκIII (Cum)      59             49             18                 6
M-70 vs. hVκI (Ag)         71             60             12                 4
M-70 vs. hVκII (Ti)        72             62             9                  3
M-70 vs. hVκIII (Cum)      71             59             6                  2
mCκ vs. hCκ                70             60             0                  0
mCκ vs. hCλ                57             44             6                  2


The human lambda chain also has a common and a variable region (6). The common region is somewhat more complex in character than its kappa counterpart.

There are four or five subgroups in the variable region of the lambda chain, which again are defined by prototype sequences and gaps (Fig. 1 and Table 1) (16, 17). Some lambda subgroups appear less distinct than their kappa counterparts (compare VλI and VλIV); nevertheless, statistical arguments support these subgroup distinctions (17). The existence of at least four lambda subgroups does require four germ line variable genes. Again it appears that every individual can synthesize most, if not all, of the lambda subgroups (10).

The common region of the lambda chain is more complex than its kappa counterpart in that it appears to be encoded by two separate germ line genes. The common region in the lambda chain is essentially invariant in light chains from myeloma proteins except for an amino acid interchange, arginine-lysine, at position 190 (18). Since every individual can synthesize both the arginine and lysine forms, two genes for the common region of the lambda chain are probably present in the germ line (19).

Each of the four or five subgroups of the lambda variable region can be found attached to either form of the lambda common region (see lambda position 190 in Fig. 1). Thus, the lambda chain must also be encoded by two distinct germ line genes (a Vλ and a Cλ) which are expressed as a single polypeptide chain. Otherwise each lambda chromosome would need eight copies of the lambda common region, an arginine and a lysine form for each subgroup. The evolution of such a system would be difficult if not impossible to explain.

The existence of light chain subgroups permits us to determine the minimum number of germ line genes encoding light chains: four for human kappa chains (three Vκ and one Cκ), six for human lambda chains (four Vλ and two Cλ).
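The gene counts above follow from treating variable and common regions as separate genes that can be paired freely. The sketch below enumerates the pairings; the gene labels are illustrative names for the subgroups and common region forms described in the text, and the function name is an assumption.

```python
from itertools import product

V_KAPPA = ["VκI", "VκII", "VκIII"]
C_KAPPA = ["Cκ"]
V_LAMBDA = ["VλI", "VλII", "VλIII", "VλIV"]
C_LAMBDA = ["Cλ (Arg 190)", "Cλ (Lys 190)"]

def gene_summary(variable_genes, common_genes):
    """Compare separate V and C genes with one complete gene per chain."""
    chains = list(product(variable_genes, common_genes))
    return {
        "separate_genes": len(variable_genes) + len(common_genes),
        "distinct_chains": len(chains),
    }

kappa = gene_summary(V_KAPPA, C_KAPPA)
lam = gene_summary(V_LAMBDA, C_LAMBDA)
print("kappa:", kappa)
print("lambda:", lam)
```

For lambda chains, six separate genes account for eight distinct V-C combinations, which is the "eight copies of the lambda common region" that would otherwise be needed.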

Insertional Mechanism

For both types of human antibody light chains two separate light chain genes (variable and common region genes) or their products are joined during differentiation of the immunocyte by a special insertional mechanism since the light chain is a single polypeptide chain. Although joining could occur at any stage of protein synthesis (DNA, RNA, or protein), four lines of evidence suggest that joining does not occur at the protein level. (i) If the light chain is synthesized as two separate units and joined during protein synthesis, it must have two growing points. Experiments designed to resolve the problem of the number of growing points suggest that there is a single light chain growing point (20). (ii) The size of the light and heavy chain polysomes is consistent with the synthesis of polypeptides having a molecular weight of 25,000 and 50,000 respectively and not with the synthesis of half molecules (21). (iii) The immunoglobulin products of myeloma tumors are stable through hundreds of transplantation generations, suggesting irreversible union of common and variable regions at the DNA level (22). (iv) Finally, one heavy chain disease protein appears to have a large segment of the V and C regions deleted (residues 19 to 217) (23). In this case the V region must have been joined to the C region at the DNA level before the deletion occurred, otherwise one must postulate that two independent deletions occurred in separate V and C genes. Furthermore, unless joining occurred at the DNA level one would have to postulate that new V region and C region recognition sites were generated so joining could occur either at the RNA or polypeptide levels (see next paragraph). Thus, it seems very likely that the joining of V and C regions occurs at the level of structural genes.

Since kappa variable and common region genes only combine with one another and never with lambda variable or common genes (24, 25), the proposed insertional mechanism must have a means of recognizing and joining only appropriate gene pairs. The simplest model for this mechanism is the insertion of the lambda phage into the bacterial chromosome. The genome of this bacteriophage and the bacterial chromosome each must have a specific and complementary recognition site (26). As few as 12 nucleotides are sufficient for such a recognition site (27). These recognition sites in the immunoglobulin system must be in or adjacent to both common and variable genes. These sites could be expressed in the final polypeptide product at the NH2-

17 APRIL 1970

[Fig. 4 alignment: rows Ag, Roy, Eu, and HBJ4; position header 10 20 22 24 28 29 30 31 32 34 39 41 42 46 48 49 50; final row gives the number of base changes.]

Fig. 4. A comparison of the first 50 residues of four proteins formerly designated VκI indicating further subdivision of the VκI subgroup into two subgroups, VκI and VκIV. Hyphens represent residues identical in two or more of the sequences. Subgroup-distinguishing residues are in boxes and are numbered. Number of base changes indicates the number of changes that would be required to convert the VκI codon at a given position into a VκIV codon. Protein HBJ 4 is taken from (34) and the others from the references given in Fig. 1.

terminal portion of the common region, the COOH-terminal portion of the variable region, or under special circumstances in neither of these regions [see the model proposed in (28) and the discussion in (29)]. Perhaps our inability to recognize the COOH-terminal portion of the Vκ region (residues 95 to 107) is a reflection in part of the insertional mechanism.

One other mechanism which may play a role in preventing the attachment of Vκ regions to the Cλ region could be the physical separation in the mammalian genome of the Cλ and Cκ genes (and presumably the corresponding Vλ and Vκ genes). There is a preliminary indication that the genes for rabbit λ and κ light chains are indeed unlinked (30). In contrast the common genes of various classes and subclasses of heavy chains which may share VH regions seem to be closely linked to one another (31).

Intrasubgroup Variability

Each light chain subgroup is probably encoded by one or more distinct germ line genes. We shall now examine the genetic basis for the variation which occurs within a subgroup. We have analyzed the intrasubgroup variability for 41 kappa and 23 lambda proteins over the 20 residues of the NH2-terminal; thus more than 1200 residues were examined by comparing the amino acid sequence of each specific protein with its prototype subgroup sequence. There are 63 intrasubgroup amino acid interchanges which are indicated in Fig. 1 as breaks in the lines for individual proteins. Three generalizations emerge from these data; two of these generalizations suggest that much intrasubgroup variation is carried in the germ line.

1) The variation which occurs within the subgroups of antibody light chains is indistinguishable from that which occurs among evolutionarily related sets of proteins such as cytochromes, fibrinopeptides, and hemoglobins. This is demonstrated (i) by the predominance of single base substitutions (60 of 63 intrasubgroup interchanges); (ii) by the random nature of transversional and transitional base changes (26 transversions and 15 transitions) (32); and (iii) by the striking tendency of G to mutate more frequently than is expected (13 G's mutated, whereas only 7 A's, 5 U's and 5 C's were substituted). Each of these properties is demonstrated by other evolutionarily related sets of proteins, such as the cytochromes (33), which are obtained from different species and are obviously encoded by distinct germ line genes.

2) The substitutions which occur at a single position within a subgroup are highly restricted, suggesting the presence of additional germ line genes. Seven positions in the kappa chains and eight in the lambda chains show two or more substitutions within a single subgroup (see VκI positions 4, 10, and 13 in Fig. 1). Indeed, 47 of the 66 base substitutions are represented in these 15 positions. Furthermore, 29 of the 47 multiple substitutions are represented by identical replacements occurring at the same position within a single subgroup (see VκI positions 4, 10, and 13 in Fig. 1).

Do those subsets of proteins with identical replacements at a given position (for example, VκI proteins Eu, HBJ4, Mon, and Car) have additional linked substitutions farther down the polypeptide chain? Both Eu and HBJ4 are compared for 50 residues with myeloma proteins Ag and Roy in Fig. 4. Proteins Eu and HBJ4 share five residues which distinguish them from the Roy and Ag proteins (34). Clearly Eu and HBJ4 belong to a fourth kappa subgroup in that they have linked amino acid residues which seem to distinguish them from other VκI proteins. Preliminary data indicate that the VκII proteins with Ala at position 9 will belong to a fifth kappa subgroup (34). Thus as additional sequence data are gathered, it seems likely that many of the positions at which multiple identical substitutions are present will indicate additional subgroups in the light chains. As additional subgroups are delineated from the major subgroups already defined, the intersubgroup differences will be less extensive. Ultimately the definition of a subgroup must come to rest on arbitrary statistical criteria related to the number of shared differences between two sets of proteins. In any case, it is clear that much of the light chain diversity indicated in Fig. 1 is encoded by multiple germ line genes. If each of the positions at which identical replacements occur do represent additional subgroups (for example, VκI 4, 10, and 13; Vκx 4, 9, and 17; Vλll, 13, and 14; and Vλx 15), human light chains would be encoded by a minimum of 18 germ line genes. This number will certainly increase as more data are gathered.

3) There is no evidence to suggest that the intrasubgroup variation has been generated by a somatic recombinational mechanism. No recombination has been detected between V genes of different subgroups (Figs. 1 and 2). Furthermore, in more than half of the intrasubgroup substitutions (26 out of the 47 substitutions in which the base change can be identified), it can be determined that the substituted base is not present at the corresponding position in any of the prototype genes of the same light chain type. Thus, simple recombination among three prototype Vκ genes cannot explain more than half of the intrasubgroup variability (35).
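The 2:1 expectation behind point 1 (ii), spelled out in note 32, can be checked by enumerating the twelve possible single-base changes. The following is only a minimal sketch: the function and variable names are illustrative, and the observed counts (26 transversions, 15 transitions) are simply copied from the text.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def is_transition(old_base, new_base):
    """A transition keeps the base class (purine to purine, pyrimidine to pyrimidine)."""
    pair = {old_base, new_base}
    return pair <= PURINES or pair <= PYRIMIDINES


# All 12 ordered single-base changes among the four RNA bases.
changes = list(permutations("ACGU", 2))
n_transitions = sum(is_transition(a, b) for a, b in changes)
n_transversions = len(changes) - n_transitions
expected_ratio = n_transversions / n_transitions

# Observed intrasubgroup counts quoted in point 1 (ii) of the text.
observed_transversions, observed_transitions = 26, 15
observed_ratio = observed_transversions / observed_transitions

print(f"expected transversion:transition ratio = {expected_ratio:.2f}")
print(f"observed transversion:transition ratio = {observed_ratio:.2f}")
```

Under this random model eight of the twelve changes are transversions, so the observed 26:15 split lies close to, though slightly below, the expected 2:1.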

Genetic polymorphism probably does not account for most of the intrasubgroup variation. If much of the intrasubgroup variation is encoded in the germ line (rather than being the result of somatic differentiation), do these


Protein      Position
             1    2    3    4    5    6
MOPC 149     Asp  Ile  Gln  Met  Thr
MBJ 41       Asp  Ile  Gln  Met  Thr  Gln
MBJ 70       Asp  Ile  Val  Leu  Thr  Gln
MOPC 46      Asp  Ile  Val  Leu  Thr  Gln
Adj PC 9     Glp  Ile  Val  Leu
MOPC 157     Asp  Ile  Val
MBJ 6        Asp  Ile  Val  Val  Thr  Gln

Fig. 5. Amino terminal sequences of BALB/c (mouse) kappa chains. MOPC 149, MOPC 46, Adj PC 9, and MOPC 157 were taken from (37); MBJ 41, 70, and 6 were taken from (36).

variant genes all occur in the same individual or do they represent genetic polymorphism in the human population? Genetic polymorphism certainly must be responsible for some of the sequence variation, but the variation at the amino termini of kappa chains from the inbred BALB/c mouse is similar to that seen (Fig. 5) in the human population (36-38). Thus if one accepts the genetic identity of individual BALB/c mice, polymorphism does not appear to be responsible for a majority of the V region substitutions.

Since the variation which occurs within the subgroups of antibody light chains is indistinguishable from that which occurs among evolutionarily related sets of proteins, random single base mutation followed by selection seems to be the obvious mechanism for generating antibody diversity. The question is whether antibody genes are produced by mutation and selection during somatic differentiation or during the evolution of higher creatures. The sequence data cannot yet give us a direct answer to this question, but the germ line hypothesis appears to be compatible with the sequence data and somewhat simpler than its somatic counterparts.

Theories of Antibody Diversity

Somatic theories of antibody formation suggest that a limited number of germ line genes are responsible for antibody diversity (29, 39-43). This diversity is generated during differentiation of the immunocyte by an appropriate mutational mechanism (hypermutation, recombination, translational ambiguity). Although space does not permit a thorough discussion of the somatic theories individually, several general comments should be made. (i) It is generally agreed that somatic theories postulate that subgroup variability is encoded in the germ line, whereas an ad hoc mutational mechanism must generate the intrasubgroup variation. (ii) With a small number of genes this mechanism cannot be random in character because of the enormous amount of cell wastage that would ensue (44). (iii) The somatic mutational mechanism must be restricted to the variable gene (the common gene does not show similar variation). (iv) Although there is no direct evidence for a recombinational model, one cannot rule out the possibility that recombination occurs among a large number of genes (at least ten for each subgroup), which are themselves isolated from the genes of other subgroups. Hence the most attractive somatic theories, simple hypermutation and recombination, would need to be far more complex than was initially proposed (39, 40, 42, 43) [see (45) for a more thorough discussion of these points]. Thus we discuss below what appears to be a simpler solution to the problem of antibody diversity, the germ line theory.
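The cell wastage argument in comment (ii) rests on the counting given in note 44: six amino acid alternatives per codon under random single-base mutation, three of them assumed compatible with function, over the 107 positions of a variable region. A minimal sketch of that arithmetic follows; the constant and function names are illustrative, and logarithms are used only to keep the numbers readable.

```python
import math

POSITIONS = 107        # residues in one variable region (from the text)
VARIANTS_RANDOM = 6    # alternatives per codon under random mutation (note 44)
VARIANTS_USEFUL = 3    # alternatives assumed compatible with function (note 44)


def log10_sequence_count(variants_per_position, positions=POSITIONS):
    """log10 of variants_per_position ** positions."""
    return positions * math.log10(variants_per_position)


log_all = log10_sequence_count(VARIANTS_RANDOM)      # all sequences generated
log_useful = log10_sequence_count(VARIANTS_USEFUL)   # sequences assumed functional
log_waste_ratio = log_all - log_useful               # log10 of (6/3) ** 107

print(f"6^107 is about 10^{log_all:.0f}")
print(f"3^107 is about 10^{log_useful:.0f}")
print(f"generated per functional sequence: about 10^{log_waste_ratio:.0f}")
```

On these assumptions the ratio of generated to functional sequences is 2^107, which is why a purely random mechanism acting on a few genes looks untenable.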

Germ Line Theory

The germ line (multigene) hypothesis (28, 46) states that each organism has one variable gene in its germ line for each unique antibody polypeptide chain it can synthesize and that a separate gene encodes each type of common region. During somatic differentiation of the immunocyte, a single variable gene or its product is joined to the common gene or its product at some stage of protein synthesis, and the resulting differentiated immunocyte synthesizes one type of light chain and, presumably by a similar mechanism, one type of heavy chain. According to this theory, the variable genes arise by the normal process of chemical evolution, that is, gene duplication followed by mutation and selection. We first show that the sequence data are consistent with such a hypothesis and then discuss three questions concerned with the evolution of such a system.

Both the intra- and the intersubgroup variation is similar to that seen in evolutionarily related sets of proteins (see point 1 under intrasubgroup variation and remember that hemoglobin chains from different species can be divided by identical criteria into "subgroups" such as the α, β, and δ "subgroups"). Furthermore, a typical phylogenetic tree can be constructed from the 20 residues of the NH2-terminal of 41 kappa and 23 lambda chains by the method of Fitch and Margoliash (47) (Fig. 6). While this tree is only an approximation of the real variable gene tree (only one-fifth of the variable region sequence is used), it does correlate perfectly with a similar light chain tree constructed from 13 complete variable sequences (48) and with data in Table 1. The kappa subgroups are seen as major branches on the evolutionary tree, whereas the intrasubgroup variations represent the terminal twigs of each subgroup branch.

Additional branching (subdivisions) of the major subgroups would be expected with a germ line model of antibody diversity as more sequence data accumulate. Such subdivisions are evident even now. For example, the VκI branch designated 10 (enclosed by a dotted circle) in Fig. 6 is supported by a comparison of the first 50 residues of Eu, HBJ4, Ag, and Roy given in Fig. 4. As indicated earlier, five positions distinguish the Roy and Ag proteins from the HBJ4 and Eu proteins. This suggestion that Eu and HBJ4 belong to a fourth kappa subgroup (VκIV) is strengthened by the fact that three of the five substitutions are changes of two bases. Hence, as the intersubgroup subdivisions (branching) become more subtle, it will be impossible to distinguish them from the intrasubgroup variation. Indeed it seems reasonable to postulate that there is no real difference between the intra- and the intersubgroup variation; rather these merely represent events that have occurred at different points of time in the evolution of multiple V genes. As will be discussed subsequently, gene duplication could generate many daughter genes which all show an early successful mutation (intersubgroup variation) whereas later mutations (intrasubgroup variation) may be present in just a single gene.
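The "number of base changes" used here and in Fig. 4 is the minimum number of nucleotide differences between any codon for one residue and any codon for the other. The sketch below shows one way to compute it from the standard genetic code; the helper names are illustrative and are not taken from the paper, and the example pairs are chosen only to show zero, one, two, and three changes.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code with codons ordered UUU, UUC, UUA, UUG, UCU, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def codons_for(residue):
    """All codons that encode the residue given by its three-letter abbreviation."""
    letter = THREE_TO_ONE[residue]
    return [codon for codon, aa in CODON_TABLE.items() if aa == letter]


def min_base_changes(residue_a, residue_b):
    """Fewest nucleotide differences between any pair of codons for the two residues."""
    return min(
        sum(x != y for x, y in zip(codon_a, codon_b))
        for codon_a in codons_for(residue_a)
        for codon_b in codons_for(residue_b)
    )


for pair in [("Asp", "His"), ("Met", "Leu"), ("Gln", "Val"), ("Met", "Tyr")]:
    print(pair, min_base_changes(*pair))
```

An Asp to His interchange needs a single base change, whereas Gln to Val needs two; substitutions of the second kind are the ones the text treats as stronger evidence for a separate subgroup.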

Complete sequence data on two mouse kappa chains provide an indication of the time scale of the evolution of light chains. Mouse protein MBJ-41 has striking similarities to human VκI proteins over the first 60 residues, including the gap at residues 31 to 36. The remainder of protein MBJ-41 and most of protein MBJ-70 cannot be correlated with any of the kappa subgroups, an indication that some of the present subgroup distinctions had evolved before the human and mouse lines diverged. The remainder of the subgroup and all of the intrasubgroup variation are more recent events.

In human and mouse proteins, variable and common regions seem to be diverging at equal rates. For example, the mouse protein MBJ-41 differs from human VκI, VκII, and VκIII proteins by 36, 42, and 51 percent of its sequence respectively (Table 2). The mouse protein MBJ-70 differs from these same proteins by 40, 38, and 41 percent. The common regions of human and mouse proteins have diverged by 40 percent of their sequences. Thus mouse and

[Fig. 6: phylogenetic tree diagram.]

Fig. 6. A "phylogenetic" tree constructed from the amino terminal 20 residues of 41 kappa and 23 lambda proteins by the method of Fitch and Margoliash (47), which reduces the number of mutations to a minimum. The 64 proteins are indicated by closed rectangles. Deletions are indicated by triangles and mutations by numbers. The letters a, b, or c indicate which nucleotide is changed according to the genetic code. Closed circles indicate major classes and subclasses. Dotted circles indicate subdivisions of the subclasses, or highly improbable identical somatic mutations occurring in two different individuals.


[Fig. 7 diagram. Labels: Primordial Specificity Gene; genes 1 to 10; Human-Rabbit Ancestor; 75; Human; Rabbit.]

Fig. 7. Diagram of the evolution of a multigene system. This diagram does not imply that the common ancestor of human and rabbit had fewer germ line V genes than present-day man or rabbit. Rather, the intention is to show how a single gene in this ancestor could father a large number of contemporary genes and that different primordial genes could produce current rabbit and human V genes. This model can thereby explain species-specific residues and perhaps also provides an explanation for the apparent V region genetic markers (a1, a2, a3) in rabbit heavy chains (61).

human common and variable regions are equally distinct, and the corresponding genes seem to be diverging from one another at equal rates as would be expected in the evolution of independent but related genes. The same phenomenon is observed when a comparison of variable and common regions of human lambda and kappa chains is made (Table 3). Again both light chain regions are about equally different, and again common and variable genes seem to be diverging at equal rates. Similar comparisons of other lambda and kappa proteins show the same pattern.

Each of these observations is consistent with the hypothesis that separate germ line genes encode each distinct variable region and that the variable genes and the common genes are evolving independently but at similar rates. Furthermore, a germ line hypothesis is attractive because an ad hoc mutational mechanism is not required to explain intrasubgroup substitutions.

Table 3. Comparison of regions in individual human lambda and kappa chains. Common chains were numbered from 1 to 116 and aligned to maximize homology. An insertion of four amino acids was placed between positions 17 and 18, and deletions were placed in both Cκ and Cλ chains at positions 31 to 34, 71 to 75, 81, 90, and 96. In addition, deletions were placed in Cκ chains at position 116, and Cλ chains at 65 and 102 to 103. Otherwise see legend to Table 1.

Regions compared                  Homology (%)              Number of         Amino acid
                                  Nucleotide   Amino acid   nucleotide-gap    chain length
                                                            positions         difference
VλI (Ha) vs. VκI (Ag)             57           47           21                5
VλI (Ha) vs. VκII (Ti)            57           49           24                4
VλI (Ha) vs. VκIII (Cum)          57           50           21                1
VλIII (Kern) vs. VκI (Ag)         52           45           15                1
VλIII (Kern) vs. VκII (Ti)        57           43           12                2
VλIII (Kern) vs. VκIII (Cum)      58           45           27                7
VλIV (Bo) vs. VκI (Ag)            55           49           21                5
VλIV (Bo) vs. VκII (Ti)           58           47           24                4
VλIV (Bo) vs. VκIII (Cum)         57           46           21                1
VλV (Sh) vs. VκI (Ag)             55           44           21                1
VλV (Sh) vs. VκII (Ti)            58           45           18                0
VλV (Sh) vs. VκIII (Cum)          58           44           33                5
Cλ vs. Cκ                         55           39           6                 2
Cλ vs. VκI (Ag)                   31           15           54                2
Cλ vs. VλI (Ha)                   26           12           51                7
Cκ vs. VκIII (Cum)                37           18           60                6


However, we must consider three of the specific questions regarding the evolution of a multigene system.

1) How many antibody genes might be required by a germ line hypothesis? There is reason to believe that the number of antibody genes required is considerably less than the number of antibody specificities that an individual can generate. First, each antibody molecule may have a broad range of specificities, some of which may seem unrelated. For example, MOPC 315, a homogeneous mouse myeloma protein with antibody activity, shows a high binding affinity (10^7) for molecules as diverse as those of dinitrophenol (hapten) and Vitamin K (49, also see 50). Thus a single antibody molecule may have specificity for a large number of different antigenic determinants (sharing in part presumably a common tertiary structure) but exhibit different affinities for each. Second, 10,000 light chains and the same number of heavy chains with unrestricted combination of each light and heavy chain could generate 10^8 different antibody molecules (10^4 × 10^4). There is some evidence which suggests that not all light and heavy chains combine equally well (51); nevertheless if only one combination in ten is effective, 20,000 antibody genes could generate 10^7 different antibody molecules.

Twenty thousand genes, each the size of a variable gene, do not comprise an unreasonable percentage of the human genome as is shown by the following calculation (52):

DNA content of human sperm cell (haploid) = 2.3 × 10^-12 g
Molecular weight of base pair = 6.2 X 1021
DNA content (per cell) = 3.7 × 10^9 base pairs
One variable region = 107 amino acids
One variable gene = 321 pairs
20,000 variable genes = 6.4 × 10^6 pairs
20,000 variable genes = 0.2 percent of genetic material of human haploid cells

Thus each antibody chain type could have 1600 variable genes and use about 0.2 percent of the germ cell DNA (about 12 different common genes have been characterized). This is not an excessive demand, and it would be greatly reduced if all heavy chain classes share the same variable genes, as they may.

2) How might a multigene system evolve? Once a gene duplication has occurred, the genome can be expanded by nonhomologous pairing and recombination. An excellent example of this is the hemoglobin system where in some


animals there are two or three copies of the α gene (53), a myoglobin gene, and at least four different β-like chains (β, δ, γ, and ε). There are even additional hemoglobin genes in certain animals; for example, it appears that man has at least two and possibly four copies of the γ gene (54). Thus, man has at least seven hemoglobin genes, all of which are presumably derived from a single ancestral gene. In a similar fashion the immune system may have evolved from a single gene, about 300 nucleotides in length (55), which duplicated to generate the primordial V and C genes. The V gene could then in time have been expanded by nonhomologous crossing over, and with the appropriate selective pressures a large pool of V genes could have been produced. This large pool of similar genes undergoes normal mutation and has an increased tendency for recombination. Thus the V gene pool is constantly in flux, generating new variants and losing old ones. Hence the V gene pool gives the organism a flexible and rapid means of generating new antibody genes in an ever-changing environment.

3) What selective forces can account for both the diversity and similarity of V genes in an evolving germ line set (56)? The individual light chain V genes are subject to two contrasting forces: toward diversity to increase the range of antibody activity, and toward similarity because of the structural demand for interaction with heavy chains. Each different V gene will be selected if it forms a useful antibody in combination with any one of the many different heavy chains. A large number of antibody molecules encoded by the light and heavy chain V genes would be expected to have useful functions of some sort in every individual.
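The arithmetic under question 1 above can be reproduced in a few lines. This is a sketch under the text's own figures: 10,000 light and 10,000 heavy chains, one effective pairing in ten, 107 residues per variable region, and 3.7 × 10^9 base pairs per haploid genome. The one added assumption, stated here and in the code, is three base pairs per amino acid, which is what yields the 321 pairs quoted for one variable gene. All names are illustrative.

```python
LIGHT_CHAINS = 10_000
HEAVY_CHAINS = 10_000

# Unrestricted pairing of every light chain with every heavy chain.
all_pairs = LIGHT_CHAINS * HEAVY_CHAINS
# The text's conservative case: only one combination in ten is effective.
effective_pairs = all_pairs // 10
genes_needed = LIGHT_CHAINS + HEAVY_CHAINS

RESIDUES_PER_V_REGION = 107
BASE_PAIRS_PER_RESIDUE = 3            # assumption: one codon per amino acid
BASE_PAIRS_PER_V_GENE = RESIDUES_PER_V_REGION * BASE_PAIRS_PER_RESIDUE
HAPLOID_GENOME_BASE_PAIRS = 3.7e9     # figure quoted in the text


def genome_percent(n_genes):
    """Percent of the haploid genome occupied by n_genes variable genes."""
    return 100 * n_genes * BASE_PAIRS_PER_V_GENE / HAPLOID_GENOME_BASE_PAIRS


print(f"{genes_needed} genes -> {all_pairs:.0e} pairings, {effective_pairs:.0e} effective")
print(f"20,000 V genes occupy {genome_percent(20_000):.2f} percent of the genome")
print(f"100,000 V genes occupy {genome_percent(100_000):.2f} percent of the genome")
```

The second figure bears on the later remark that 10^5 variable genes would still require less than 1 percent of the germ line DNA.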

Apart from the more specific selective forces imposed on individual V genes, there is selection for general functions that are common to all V regions (for example, combination with heavy chains). Since selection for general functions places identical constraints on all variable regions, it is easy to understand how invariant residues can be present in the V region (see kappa residues 5 to 8 in Fig. 1). The fact that structural requirements for these general functions can change in different species suggests how species-specific residues might be selected in a multigene system; for example, rabbits have predominantly Ala at the NH2-terminus of kappa chains whereas man has Asp. Again, suppose a mutation occurred in the common gene of the heavy chain of rabbit protein which caused a more effective light-heavy chain fit with light chains having Ala at the NH2-terminus. In time, gene duplication would continue to generate new Ala genes, and natural selection would preserve these new copies. This process, which is illustrated in Fig. 7, suggests that significant shifts in gene frequency could have occurred during the 75 million years since the divergence of man and rabbit. Although we know very little about the nature of the selective forces operating in the immune system, the evolution of a multigene system does not seem an impossible task.

The germ line theory permits one to make predictions that can be tested in at least three ways independently.

1) Theoretically one should be able to hybridize messenger RNA from one myeloma tumor with the DNA from a germ cell of the same species and thus differentiate between a large and small number of similar genes. Admittedly, serious technical problems exist both in the isolation of mammalian messenger RNA and mammalian DNA-RNA hybridization. Nevertheless, this experiment should be possible in the near future.

2) One can search for identical VκI chains in the highly inbred BALB/c mouse (presumably having no genetic polymorphism). At the 95 percent confidence level about 35 randomly selected proteins must be analyzed if there is to be a repeat in a pool of 200 different proteins, about 55 in a pool of 500, and about 80 in a pool of 1000 (57). If two identical proteins are found in the first 80 examined, this would render very unlikely any somatic theory (which should generate huge numbers of light chain genes) (58). On the other hand, if no identity is found the question of a germ line mechanism as opposed to a somatic mechanism would still be unresolved (remember that 10^5 variable genes require less than 1 percent of the germ line DNA).

3) If an authentic genetic marker, similar to the allelic Leu-Val interchange in kappa chains at position 191, is found in one of the variable region subgroups, this would suggest that the entire subgroup is encoded by a single gene (59-61). For example, let us suppose an individual should be found in whom half of the VκI proteins begin with His instead of Asp. This would indicate a single mutation of the Asp codon of one of the two parental VκI genes and should distribute itself in a classic Mendelian fashion among the progeny of this individual.
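The sample sizes quoted in prediction 2 behave like a "birthday problem": how many proteins must be drawn before at least one repeat is expected with 95 percent confidence. The sketch below assumes, as a simplification not stated in the text, that every protein in the pool is equally likely to be sampled; the function names are illustrative.

```python
def prob_repeat(n_sampled, pool_size):
    """Probability that n_sampled random draws from pool_size equally likely
    proteins contain at least one repeated protein."""
    p_all_distinct = 1.0
    for already_seen in range(n_sampled):
        p_all_distinct *= 1 - already_seen / pool_size
    return 1 - p_all_distinct


def min_sample_for_repeat(pool_size, confidence=0.95):
    """Smallest sample size whose repeat probability reaches the confidence level."""
    n = 1
    while prob_repeat(n, pool_size) < confidence:
        n += 1
    return n


for pool in (200, 500, 1000):
    print(pool, min_sample_for_repeat(pool))
```

Under the equal-frequency assumption the exact thresholds are 35, 55, and 77 proteins, in line with the "about 35", "about 55", and "about 80" given in the text. Unequal frequencies would make repeats more likely, so these values are upper estimates of the sample needed.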

Obviously additional sequence data may reveal new patterns which could place additional constraints on one or more of the various theories. It will be interesting to see whether or not more extensive data on the inbred BALB/c mouse system will yield a picture similar to that of human light chains.

Thus a germ line theory is attractive in that: (i) it requires no ad hoc mutational mechanism to explain intrasubgroup variation (chemical evolution generates this diversity); (ii) the light chain sequence variations are similar to those seen in other sets of evolutionarily related proteins; and (iii) evolution in a multigene system provides, through frequent gene duplication, an extremely rapid and flexible response to the environment.

Summary

Immunoglobulin light chain sequences have given us our first glimpse into the complexity of the genetic mechanism responsible for antibody diversity (62). Although various somatic models cannot be excluded (hypermutation and recombination), the germ line theory is attractive in that it is consistent with the sequence data, which seem to indicate an ever increasing number of germ line genes for all theories. Furthermore, it seems unnecessary to postulate a new ad hoc mutational mechanism for the generation of antibody diversity when a well-documented mechanism, chemical evolution, seems applicable. In any case, experiments are suggested which may allow us to distinguish unequivocally among the various theories of antibody diversity.

References and Notes

1. F. W. Putnam, Science 163, 633 (1969); G. M. Edelman and W. E. Gall, Annu. Rev. Biochem. 38, 415 (1969); Symposium on Developmental Aspects of Antibody Formation and Structure, J. Sterzl and H. Rhia, Eds. (Academic Press, New York, in press); S. Cohen and C. Milstein, Advan. Immunol. 7, 1 (1967); E. Lennox and M. Cohn, Annu. Rev. Biochem. 36, 365 (1967).

2. M. Potter, Meth. Cancer Res. 2, 105 (1967).

3. L. Hood, W. R. Gray, B. G. Sanders, W. J. Dreyer, Cold Spring Harbor Symp. Quant. Biol. 32, 133 (1967).

4. N. Hilschmann and L. C. Craig, Proc. Nat. Acad. Sci. U.S. 53, 1403 (1965).

5. W. R. Gray, W. J. Dreyer, L. Hood, Science 155, 465 (1967).

6. F. W. Putnam, K. Titani, M. Wikler, T. Shinoda, Cold Spring Harbor Symp. Quant. Biol. 32, 9 (1967).

7. C. Milstein, Nature 216, 330 (1967); C. Milstein, C. P. Milstein, A. Feinstein, ibid. 221, 151 (1969). Only a subset of the VκI proteins with a particular sequence around the region of Cys at 23 could be detected in this study.

8. The abbreviations for amino acid residuesare: Ala, alanine; Arg, arginine; Asn, as-paragine; Asp, aspartic acid; Cys, cysteine;Gln, glutamine; Glp, pyrrolidonecarboxylicacid; Glu, glutamic acid; Gly, glycine; His,histidine; Ile, isoleucine; Leu, leucine; Lys, ly-sine; Met, methionine; Phe, phenylalanine;Pro, proline; Ser, serine; Thr, threonine; Trp,tryptophan; Tyr, tyrosine; and Val, valine.The subgroup designations for immuno-globulin V (variable) regions are those pro-posed by the Conference on Nomenclaturefor Animal Immunoglobulins, Prague, June1969. Abbreviation for bases are A, adenine;G, guanine; C, cytosine; and U, uracil.

9. H. Niall and P. Edman, Nature 216, 262(1967).

10. L. Hood, J. A. Grant, H. C. Sox, in Sym-posium on Developmental Aspects of Anti-body Formation and Structure, J. Sterzl andH. Rhia, Eds. (Academic Press, New York,in press); J. A. Grant and L. Hood, unpub-lished results.

11. C. Baglioni, L. A. Zonta, D. Cioli, A. Car-bonara, Science 152, 1517 (1967); C. Mil-stein, Biochem. J. 101, 338 (1966).

12. C. Ropartz and J. Lenoir, Nature 189, 586(1961); A. G. Steinberg, J. A. Wilson, S. Lan-set, Vox Sang. 7, 151 (1962).

13. W. D. Terry, L. Hood, A. G. Steinberg,Proc. Nat. Acad. Sci. U.S. 63, 71 (1969).

14. M. Potter, W. J. Dreyer, E. L. Kuff, K.R. McIntire, J. Mol. Biol. 8, 814 (1964).

15. If one evokes a master-slave model of chro-mosome structure where many germ linecopies (slaves) of a single structural geneare corrected at each DNA division againsta single copy (master) [see H. C. Callan, J.Cell Sci. 2, 1 (1967)], then multiple germline genes could evolve as a single gene.This might explain the identity of multiplecommon genes, but obviously it does notapply to the V genes.

16. Von B. Langer, M. Steinmetz-Kayne, N.Hilschmann, Hoppe-Seyler's Z. Physiol. Chem.349, 945 (1968).

17. L. Hood and D. Ein, Nature 220, 764 (1968).18. D. Ein and J. Fahey, Science 1S6, 947 (1967);

E. Appella and D. Ein, Proc. Nat. Acad.Sci. U.S. 57, 1449 (1967); L. Hood and D.Ein, Science 162, 679 (1968).

19. D. Ein, Proc. Nat. Acad. Sci. U.S. 60, 982(1968).

20. E. S. Lennox, P. M. Knopf, A. J. Munro,R. M. E. Parkhouse, Cold Spring HarborSymp. Quant. Biol. 32, 249 (1967); P. M.Knopf, personal communication.

21. M. D. Scharff, A. L. Shapiro, B. Ginsberg,Cold Spring Harbor Symp. Quant. Biol. 32,235 (1967).

22. M. Potter and R. Lieberman, ibid., p. 187.23. B. Frangione and C. Milstein, Nature 224, 597

(1969). These generalizations also seem tohold for a second heavy chain disease pro-tein (W. D. Terry, personal communication).

24. Although amino acid sequence homologysuggests that lambda and kappa variablegenes share a common ancestor (Fig. 1) (25,6), it is equally obvious that lambda andkappa chains are now encoded by distinctsets of genes. Certain features are sharedby all lambda (but not kappa) chains; forexample, all lambda variable regions havethree sets of gaps or insertions when com-pared with their kappa counterparts (residue9 in Fig. 1). In addition, unique residue orresidue alternatives exist at a number ofpositions (for example, lambda chains haveGlp or a deletion at position 1, whereaskappa chains have Asp or Glu).

25. L. Hood, W. R. Gray, W. J. Dreyer, J.Mol. Biol. 22, 179 (1966).

26. A. M. Campbell, Adv. Genet. 11, 101 (1962).27. C. A. Thomas, Jr., Progr. Nucl. Acid Res.

Mol. Biol. 5, 315 (1966).28. W. J. Dreyer, W. R. Gray, L. Hood, Cold

Spring Harbor Symp. Quant. Biol. 32, 353(1967).

29. M. Cohn, in Rutgers Symposium on NucleicAcids in Immunology, 0. J. Plescia and W.Braun, Eds. (Springer-Verlag, New York,1968), p. 671.

30. R. Mage, G. 0. Young, J. Reinek, R. A.Reisfeld, E. Appella, Protides Bfol. FluidsProc. Colloq. Bruges, in press.

31. L. Herzenberg, H. McDevitt, L. Herzenberg, Annu. Rev. Genet. 2, 209 (1968); S. B. Natvig, H. G. Kunkel, S. P. Litwin, Cold Spring Harbor Symp. Quant. Biol. 32, 173 (1967).

32. One would expect a 2:1 ratio of transversions to transitions if the tendency of a mutating base to change to a purine or a pyrimidine is random. For example, G can mutate to T or C (transversions) or A (transitions), hence twice as many transversions as transitions would be expected.
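The 2:1 expectation in note 32 can be checked by listing every single-base change among the four DNA bases. The following is a minimal sketch, not part of the original analysis; the function name `classify` and the assumption that all twelve changes are equally likely are introduced here for illustration only.

```python
from itertools import permutations

# Purines and pyrimidines among the four DNA bases.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify(src, dst):
    """Return 'transition' if both bases are in the same chemical class
    (purine to purine, or pyrimidine to pyrimidine), else 'transversion'."""
    same_class = {src, dst} <= PURINES or {src, dst} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

# Count all 12 ordered single-base changes (illustrative assumption:
# each change is equally likely).
counts = {"transition": 0, "transversion": 0}
for src, dst in permutations("ACGT", 2):
    counts[classify(src, dst)] += 1

print(counts)                                          # {'transition': 4, 'transversion': 8}
print(counts["transversion"] / counts["transition"])   # 2.0
```

Each base has one transition partner and two transversion partners, which is the source of the expected ratio.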

33. T. J. Jukes, Molecules and Evolution (Columbia University Press, New York, 1966); C. Nolan and E. Margoliash, Annu. Rev. Biochem. 37, 727 (1968); W. M. Fitch, J. Mol. Biol. 26, 499 (1967).

34. H. C. Sox and L. Hood, in preparation.

35. A 50 percent success here is not meaningful because of the small number of alternatives.

36. L. Hood, W. R. Gray, W. J. Dreyer, Proc. Nat. Acad. Sci. U.S. 55, 826 (1966).

37. R. Perham, E. Appella, M. Potter, Science 154, 391 (1966).

38. F. Melchers, Biochemistry 8, 938 (1969).

39. J. Lederberg, Science 129, 1649 (1959).

40. O. Smithies, ibid. 157, 267 (1967).

41. M. Potter, E. Appella, S. Geisser, J. Mol. Biol. 14, 361 (1965); J. Campbell, J. Theor. Biol. 16, 321 (1967); H. L. K. Whitehouse, Nature 215, 371 (1967).

42. G. M. Edelman and J. Gally, Proc. Nat. Acad. Sci. U.S. 57, 353 (1967).

43. S. Brenner and C. Milstein, Nature 211, 242 (1966).

44. With a random mutational mechanism (single base changes) operating throughout the entire variable gene, about six different amino acids on the average could be generated at each codon. With six variants at each of 107 positions, one could generate $6^{107}$ ($\approx 10^{83}$) variable region sequences (or cells if one sequence is allowed per cell). If three residue alternatives at each position are compatible with function (the sequence variation within a subgroup appears much more highly restricted) (1), then $3^{107}$ ($\approx 10^{51}$) sequences (cells) are useful to the organism. Hence the ratio of useless to functional cells (sequences) is $6^{107}/3^{107}$, or more than 100 useless cells would be produced for each functional one. Since the average man only generates 5 X 10's cells during his entire life, a random mutational mechanism poses an impossible cell wastage problem.
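The orders of magnitude in note 44 can be reproduced with exact integer arithmetic. This is a minimal sketch using only the figures stated in the note (107 positions, six possible and three functional variants per position); the variable names are illustrative.

```python
import math

POSITIONS = 107              # variable region positions, as stated in note 44
possible = 6 ** POSITIONS    # six variants per position
functional = 3 ** POSITIONS  # three functional alternatives per position

# The ratio of all sequences to functional ones is exactly 2**107.
ratio = possible // functional

print(round(math.log10(possible), 1))    # 83.3
print(round(math.log10(functional), 1))  # 51.1
print(f"{ratio:.2e}")                    # 1.62e+32
```

Under these assumptions the ratio $6^{107}/3^{107} = 2^{107}$ is about $1.6 \times 10^{32}$, consistent with the orders of magnitude $10^{83}$ and $10^{51}$ quoted in the note.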

45. L. Hood and D. Talmage, in Developmental Aspects of Antibody Formation and Structure, J. Sterzl and H. Rhia, Eds. (Academic Press, New York, in press).

46. W. J. Dreyer and J. C. Bennett, Proc. Nat. Acad. Sci. U.S. 54, 864 (1965).

47. W. M. Fitch and E. Margoliash, Science 155, 279 (1967).

48. M. Dayhoff, Atlas of Protein Sequence and Structure (National Biomedical Research Foundation, Silver Spring, Md., 1969), p. 29.

49. H. Eisen, Fed. Proc. 29, 78 (1970).

50. D. Schubert, A. Jobe, M. Cohn, Nature 220, 882 (1968).

51. H. Grey and M. Mannik, J. Exp. Med. 122, 619 (1965).

52. Calculation by W. R. Gray. The DNA value is taken from The Chemical Basis of Heredity, W. R. McElroy and B. Glass, Eds. (Johns Hopkins Univ. Press, Baltimore, 1957).

53. K. Hilse and R. A. Popp, Proc. Nat. Acad. Sci. U.S. 61, 930 (1968).

54. W. A. Schroeder, T. H. Huisman, J. R. Shelton, J. B. Shelton, E. F. Kleihauer, A. M. Dozy, B. Robberson, ibid. 60, 537 (1968).

55. R. L. Hill, R. Delaney, R. E. Fellows, Jr., H. E. Lebovitz, ibid. 56, 1762 (1966); S. J. Singer and R. F. Doolittle, in Structural and Molecular Biology, A. Rich and N. Davidson, Eds. (Freeman, San Francisco, 1968).

56. There are, of course, examples of multigene systems with many copies of similar genes. About 10 percent of the DNA in mouse cells is "satellite DNA" which appears to consist of about a million similar copies of a short nucleotide sequence [see R. J. Britten and D. E. Kohne, Science 161, 529 (1968)]. A second example is that of the ribosomal genes in Xenopus laevis. There appear to be about 450 copies of both the 28S and the 18S ribosomal cistrons which are linked in tandem along the chromosome with alternating 28S and 18S genes [D. Brown and C. S. Weber, J. Mol. Biol. 34, 681 (1968)]. Furthermore, DNA-RNA hybridization experiments suggest that individual members of each ribosomal set are very similar in sequence to one another [M. Birnstiel, M. Grunstein, J. Speirs, W. Hennig, Nature 223, 1265 (1969)].

57. S. Cohen and C. Milstein, Advan. Immunol. 7, 1 (1967).

58. O. Smithies (personal communication) has pointed out that if two identical sequences are also identical to a prototype subgroup sequence, one could argue that the corresponding subgroup gene may have just failed to undergo the hypermutation process. In this special case, two identical light chains would not, therefore, argue against the somatic theories.

59. It is beyond the scope of this paper to discuss the implications of the limited sequence information available on human and rabbit heavy chains. The data on the human heavy chain are consistent with the light chain analysis presented here (60). Interpretation of the data on the rabbit heavy chain is difficult in that heterogeneous mixtures of proteins have been used for sequence analysis. Wilkinson (61) has reported multiple sequence differences in the NH2-terminal region of rabbit heavy chains which correlate with differing allotypes. These multiple amino acid substitutions do not constitute a simple genetic marker similar to the single residue interchange in the common region of human kappa chains (Inv marker). Consequently the meaning of Wilkinson's data is subject to a number of different interpretations at the genetic level. If, however, these differences do constitute a genetic marker in the variable region, it would render a germ line model of heavy chain diversity unlikely.

60. See G. M. Edelman and E. Press, in Developmental Aspects of Antibody Formation and Structure, J. Sterzl and H. Rhia, Eds. (Academic Press, New York, in press).

61. J. M. Wilkinson, Biochem. J. 112, 173 (1969).

62. Sequence analysis of immunoglobulin light chains has given us a glimpse of differentiation in a complex mammalian system and has certainly raised a question as to whether other differentiating systems have followed a similar course.

63. L. Hood, unpublished results.

64. C. Milstein, Fed. Eur. Biochem. Soc. Symp. 15, in press.

65. N. Hilschmann, Hoppe-Seyler's Z. Physiol. Chem. 348, 1077 (1967).

66. C. Milstein, Biochem. J. 101, 352 (1966).

67. A. Kaplan and H. Metzgar, Biochemistry 8, 3944 (1969).

68. B. A. Cunningham, P. D. Gottlieb, W. H. Konigsberg, G. M. Edelman, ibid. 7, 1983 (1968); G. M. Edelman, B. A. Cunningham, W. E. Gall, P. D. Gottlieb, U. Rutishauser, M. J. Waxdal, Proc. Nat. Acad. Sci. U.S. 63, 78 (1968).

69. F. W. Putnam, Science 163, 633 (1969).

70. N. Hilschmann, Fed. Eur. Biochem. Soc. Symp. 15, in press.

71. A. Grant and L. Hood, unpublished results.

72. C. Baglioni, Biochem. Biophys. Res. Commun. 26, 82 (1967).

73. A. B. Edmundson, F. A. Sheber, K. R. Ely, N. B. Simonds, N. K. Hutson, J. L. Rossiter, Arch. Biochem. Biophys. 127, 725 (1968).

74. C. Milstein, Biochem. J. 110, 631 (1968).

75. H. Ponstingl, M. Hess, N. Hilschmann, Hoppe-Seyler's Z. Physiol. Chem. 349, 867 (1968).

76. W. M. Fitch and E. Margoliash, Science 155, 279 (1967).

77. Supported in part by NIH grants 5 R01 AI 03047-11 and FR 00404-02. We thank S. Levine for assistance in preparing the amino acid sequence data for computer programming.

SCIENCE, VOL. 168, p. 334