THE JOURNAL OF BIOLOGICAL CHEMISTRY

Printed in U.S.A. Vol. 257, No. 3. Issue of February 10, pp. 1516-1522, 1982

Evolution of Human Immunoglobulin κ J Region Genes* (Received for publication, September 21, 1981)

Philip A. Hieter, Jacob V. Maizel, Jr., and Philip Leder From the Laboratory of Molecular Genetics, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20205

Immunoglobulin κ chain variable region genes are assembled from two discontinuous DNA segments, a V and a J gene. The J region genes, in addition to encoding amino acid positions 96-108 of the κ polypeptide chain, also provide sequences required for both DNA and RNA splicing reactions. For purposes of evolutionary comparison and to establish the complexity of the κ J region locus in man, we have determined an approximately 3000 basepair nucleotide sequence in a cloned human DNA fragment that encodes the germline κ J region genes. The region sequenced includes five distinct J region segments. Significant blocks of homology have been tightly maintained between this region and an analogous segment of the mouse genome. In particular, the short sequences, GGTTTTTGT and CACTGTG, thought to be involved in V-J recombination, are the most highly conserved regions (97% homology). In addition, from heteroduplex data and computer analysis of the nucleotide sequences, it is clear that the mouse J3 sequence, a pseudogene, is not present in the human cluster. This can be explained by a duplication event in the mouse J region gene cluster that may have been the result of unequal crossing over between homologous chromosomes.

The variable regions of κ immunoglobulin light chain polypeptides are encoded in discontinuous segments that are distantly separated in germline DNA (1-4). In the κ light chain system of the mouse, the germline consists of several hundred distinct variable region genes that encode the first 95 amino acids of the 108 amino acids conventionally associated with the variable region of the light chain (5, 6). The remaining 13 amino acids of the variable segment are separately encoded in each of five distinct segments, called joining or J segments, that are regularly spaced at approximately 300 basepair intervals and situated a few thousand basepairs 5′ to the single κ constant region gene (7, 8). Four of these J region genes appear to be active in that they specify amino acids that are found in mouse κ light chains that have been sequenced. The middle J region, J3, is never found expressed in a κ polypeptide, and this is probably due to a point mutation in the RNA splice donor sequence immediately to its 3′ side (7, 8).

During the development of an antibody-producing cell, activation of a κ light chain gene is accomplished by a site-specific DNA recombination event between one of the several hundred germline variable genes and one of the four functional J segments. In this way, a completed variable region gene (encoding amino acids through position 108), separated from the constant region gene by an intervening sequence, forms a functional transcription unit and is the product of a developmental system in which gene activation is accomplished by a DNA rearrangement. Transcription of the assembled gene yields a large RNA precursor that is processed into the final immunoglobulin mRNA by RNA splicing (9-11). Apparently, the splice signal immediately to the 3′ side of the recombined J region segment is activated in some way by V/J recombination, since the inactive J regions are not used as splice signals (11).

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

A variety of functions are associated with the J region genes. In particular, these relatively short genetic elements contribute significantly to the generation of variable region diversity. The existence of multiple J regions offers alternative amino acid combinations with the coding capacities of the germline variable gene segments. Furthermore, the crossover point of DNA recombination can vary, and in so doing can generate additional diversity around the site of recombination (i.e. at amino acid position 96) (7, 8, 12). The fact that amino acid 96 is in a position critical to the antigen-antibody combining site (13) imparts particular physiological significance to such a process. In addition, the J regions are of extreme interest because each must carry on its 5′ side a DNA sequence involved in DNA recombination and on its 3′ side a sequence involved in RNA splicing. Therefore, conserved homologies that surround these genetic elements should provide information for developing functional assays for both DNA and RNA recombination.

In order to determine precisely the multiplicity of J region genes in the human κ system and to establish to what extent sequences that encode and flank the J regions have been conserved during evolution, we have determined approximately 3000 basepairs of DNA sequence that surround and include the human J region genes. Computer analysis has revealed that in addition to four human J regions that are directly related to the four functional mouse J region genes, a fifth functional human J region is present immediately 3′ to this cluster. The most tightly maintained sequences within the entire region are the nonamer GGTTTTTGT and heptamer CACTGTG, thought to be involved in V-J recombination (7, 8), immediately 5′ to each of the J regions. In addition, a comparison of the mouse and human data allows us to formulate a model for the evolutionary divergence of the J region cluster. The model involves gene duplication as a result of unequal crossing over between homologous mouse chromosomes.
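The arrangement of the two conserved signals 5′ to a J segment can be inspected with a few lines of code. The following is a minimal sketch, not the program used in this work: the helper names (`hamming`, `best_match`, `signal_layout`) are hypothetical, and the test string is the opening portion of the mouse J1 line as printed in Fig. 6. It locates the window closest to the nonamer, then the window closest to the heptamer downstream of it, and reports the number of bases between them.

```python
NONAMER = "GGTTTTTGT"
HEPTAMER = "CACTGTG"

# Opening portion of the mouse J1 line as printed in Fig. 6.
MOUSE_J1_FLANK = "AGGAGGGTTTTTGTACAGCCAGACAGTGGAGTACTACCACTGTGGTGGACGTTCGG"


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_match(seq: str, motif: str, start: int = 0) -> tuple[int, int]:
    """Return (position, mismatches) of the window at or after `start`
    that is closest to `motif`; the leftmost window wins ties."""
    size = len(motif)
    positions = range(start, len(seq) - size + 1)
    pos = min(positions, key=lambda i: hamming(seq[i:i + size], motif))
    return pos, hamming(seq[pos:pos + size], motif)


def signal_layout(flank: str) -> dict:
    """Locate the nonamer, then the heptamer downstream of it,
    and measure the spacer between the two."""
    n_pos, n_mis = best_match(flank, NONAMER)
    after_nonamer = n_pos + len(NONAMER)
    h_pos, h_mis = best_match(flank, HEPTAMER, start=after_nonamer)
    return {
        "nonamer": (n_pos, n_mis),
        "heptamer": (h_pos, h_mis),
        "spacer": h_pos - after_nonamer,
    }


if __name__ == "__main__":
    print(signal_layout(MOUSE_J1_FLANK))
```

Allowing mismatches rather than requiring exact matches is a design choice made here so that the same function can be run on flanks where a signal differs from the consensus by a base or two.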

EXPERIMENTAL PROCEDURES

Materials-Restriction endonucleases and polynucleotide kinase were purchased from New England Biolabs (Beverly, MA). Bacterial alkaline phosphatase was from Worthington and was dialyzed against 0.1 M Tris-HCl, pH 8.0, prior to use. The cloned 12 kb embryonic human κ J/C Bam HI fragment and 8 kb RPMI 6410 human κ VJ/C Bam HI fragment have been previously described (14).

[Figure 1. Mouse/human J region restriction fragment heteroduplexes. Panel A: mouse germline fragment (J1 through J5) paired with the human recombinant fragment. Panel B: mouse germline fragment paired with the human germline fragment.]

FIG. 1. Electron micrographs of heteroduplexes formed between DNA restriction fragments containing mouse and human J region genes. A, the 1.6 kb Hpa II restriction fragment from a functional human recombinant gene was identified and isolated as described (14). A physical map of this fragment, as deduced from measurements of eight heteroduplex molecules, is shown on the bottom line. Portions of its nucleotide sequence were determined (14) as indicated by the bars under this map. The 1.7 kb Xba-HindIII mouse J region fragment was prepared from a Bam HI-HindIII subclone of the mouse κ constant/J region locus (3). The entire nucleotide sequence of this fragment is known (19) and, therefore, the precise location of each of the five mouse J regions is known with respect to the ends of this DNA fragment. The map of this fragment is presented. The experimental details for heteroduplex formation are described under "Experimental Procedures." Of ten molecules examined, eight were identical with the structure shown above. The remaining two were consistent with this structure but lacked one of the four hybrid structures. B, the 1.8 kb Cfo I restriction fragment that contains the four previously identified human J region homologies (J1-J4) (14) was isolated and heteroduplexed to the 1.7 kb Xba-HindIII mouse J region fragment. The results suggest that a sequence directly related to mouse J3 is missing in the human J cluster.

DNA Restriction Fragment Heteroduplex Analysis-Heteroduplexes were prepared by alkali treatment of restriction fragments and renaturation in the presence of 50% formamide, 0.1 M Tris-HCl, pH 8.0, 0.01 M EDTA at 25 °C. DNA fragments were prepared on 1% agarose gels, electroeluted, and recovered essentially as described (15). 0.25 µg of each fragment was mixed together and denatured in 0.2 ml of 0.1 N NaOH, 0.02 M EDTA, for 30 min at 37 °C. The solution was adjusted to pH 8.0 with HCl, and a 1/10th volume of 2 M Tris, pH 8.0, was added. Deionized formamide was then added to a final concentration of 50% and renaturation at 25 °C allowed to proceed for 120 min. The sample was spread onto a hypophase of H2O, the DNA picked onto Parlodion-coated grids, stained with uranyl acetate, and shadowed with platinum/palladium and carbon. Preparation of grids was performed by Barbara Norman. The grids were viewed on a Philips 300 electron microscope.
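The volumes and final concentrations implied by the renaturation steps above can be checked with simple arithmetic. The sketch below is illustrative only: the function name `renaturation_mix` and its parameter names are hypothetical, the volume of HCl used for neutralization is assumed to be negligible, and 50% formamide is read as a volume fraction. Under those assumptions the computed Tris and EDTA concentrations come out close to the stated 0.1 M and 0.01 M.

```python
def renaturation_mix(
    denat_ml: float = 0.2,          # volume of the NaOH/EDTA denaturation mix
    edta_m: float = 0.02,           # EDTA concentration in the denaturation mix
    tris_stock_m: float = 2.0,      # Tris stock concentration
    tris_fraction: float = 0.1,     # "1/10th volume" of Tris stock
    formamide_final: float = 0.5,   # target formamide volume fraction
    dna_ug: float = 0.5,            # 0.25 ug of each of two fragments
) -> dict:
    """Work out volumes and final concentrations for the renaturation step.

    Assumption: the HCl added to reach pH 8.0 contributes no volume.
    """
    tris_ml = denat_ml * tris_fraction
    aqueous_ml = denat_ml + tris_ml
    # Formamide volume needed so that it makes up the target fraction of the total.
    formamide_ml = aqueous_ml * formamide_final / (1.0 - formamide_final)
    total_ml = aqueous_ml + formamide_ml
    return {
        "tris_ml": tris_ml,
        "formamide_ml": formamide_ml,
        "total_ml": total_ml,
        "tris_final_M": tris_stock_m * tris_ml / total_ml,
        "edta_final_M": edta_m * denat_ml / total_ml,
        "dna_ug_per_ml": dna_ug / total_ml,
    }


if __name__ == "__main__":
    for name, value in renaturation_mix().items():
        print(f"{name}: {value:.4f}")
```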

DNA Sequence Determination and Computer Analysis-DNA fragments for sequencing were prepared on polyacrylamide gels, electroeluted, and recovered essentially as described (15). An additional phenol extraction step was performed after DEAE-column elution of the fragments. This step was followed by precipitation in ethanol and a wash in ethanol before resuspending in H2O. The detailed procedures for DNA sequencing are those described by Maxam and Gilbert (16, 17). The program of Korn et al. (18) was used to analyze the sequences throughout the course of the work. A program, PUBTRANS, written by J. V. Maizel was used to generate the publication figures. The dot matrix comparison program is that devised by Maizel¹ and is described in the legend to Fig. 4.

¹ J. V. Maizel, Jr. and R. P. Lenk (1982) Proc. Natl. Acad. Sci. U. S. A., in press.

RESULTS AND DISCUSSION

Heteroduplex Analysis Reveals Mouse and Human J Region Homologies-From partial nucleotide sequence data and the appearance of mouse/human heteroduplex structures, we previously suggested that a sequence directly related to the mouse J3, which is apparently inactive, was missing in the human J region cluster (14). This can be demonstrated in a more convincing way by heteroduplexing DNA restriction fragments containing the mouse and human J region genes (Fig. 1). For example, when a DNA restriction fragment known to contain the recombination site of the κ-expressing line RPMI 6410 is heteroduplexed with a DNA restriction fragment known to contain the five mouse J region genes (Fig. 1A), it is clear that a sequence related to the mouse J3 sequence is not present in the human DNA. The same conclusion is reached upon examination of the heteroduplex between a human embryonic DNA restriction fragment containing the four previously identified J region homologies (14) and the DNA restriction fragment containing the five mouse J regions (Fig. 1B). The appearance of these heteroduplex structures clearly indicates that the human J region cluster does not contain a sequence that corresponds to the mouse J3 sequence. Such a situation could have arisen subsequent to the divergence of the mouse and human species as a result of a deletion in the human J cluster, or by a duplication in the mouse J cluster (see below).

[Figure 2. Panel A: map of the human κ constant/J region locus showing J regions 1 through 5 and the constant region. Panel B: extended map of the sequenced region with the restriction sites and sequencing arrows.]

FIG. 2. Physical map of the human κ constant/J region locus and strategy for sequencing the human κ J region genes and flanking regions. A, the human κ constant locus consists of five active J region genes located approximately 2.5 kb to the 5′ side of a single κ constant region gene. Restriction endonuclease sites are indicated for Bam HI, Eco RI, Sac I, and HindIII. Coding regions are indicated as solid boxes. The sequence determined (approximately 3000 basepairs) is indicated. B, an extended map of the sequence determined is shown. The coding and flanking regions of the human κ J region genes were sequenced by the method of Maxam and Gilbert (16, 17) using the restriction sites indicated. Only those restriction sites used for sequencing are shown. 32P-labeled ends are represented by short vertical lines and the extent and direction of sequencing are indicated by arrows. A computer search of this sequence for regions that would encode amino acids 96-108 of the κ polypeptide revealed the existence of five human J coding segments. These coding regions are indicated by solid boxes and numbered from 5′ to 3′, J1 through J5. This numbering order represents a reversal of that used previously (7, 14). The portion of the nucleotide sequence that includes the coding regions is presented in Fig. 3 as indicated. The scale is shown in nucleotides.
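A computer search for segments that would encode amino acids 96-108 can be sketched as a sliding 39-nucleotide window that is translated and tested against a peptide pattern. What follows is an illustration under stated assumptions, not the program of Korn et al. (18): the function names are hypothetical, and the pattern `J_PATTERN` is an assumption written here to fit the five human translations printed in Fig. 6. The test string is taken from the human J2 line of Fig. 6.

```python
import re

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

J_CODING_LENGTH = 39  # 13 codons, amino acid positions 96-108

# Assumed illustrative pattern (one-letter code); not the authors' criterion.
J_PATTERN = re.compile(r".TFG.GT[KR][LV][ED]IKR")


def translate(dna: str) -> str:
    """Translate complete codons from the first base; unknown codons become 'X'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def find_j_like(seq: str) -> list[tuple[int, str]]:
    """Return (start, peptide) for every 39-nt window matching J_PATTERN."""
    hits = []
    for start in range(len(seq) - J_CODING_LENGTH + 1):
        peptide = translate(seq[start:start + J_CODING_LENGTH])
        if J_PATTERN.fullmatch(peptide):
            hits.append((start, peptide))
    return hits


# Portion of the human J2 line as printed in Fig. 6.
HUMAN_J2_REGION = "CATTGTGTGTACACTTTTGGCCAGGGGACCAAGCTGGAGATCAAACGTAAGTAC"

if __name__ == "__main__":
    print(find_j_like(HUMAN_J2_REGION))
```

Sliding the window one base at a time covers all three forward reading frames without treating them separately.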

Nucleotide Sequence Analysis Reveals a Fifth Human κ J Region Gene-In order to compare more precisely the structural organization of the human J region genes to that of the mouse, and to determine the extent to which elements flanking the J region genes that are thought to be involved in DNA recombination have been conserved in humans, approximately 3000 basepairs of DNA sequence surrounding the human J segments have been determined. This spans the region of DNA that extends from a Cfo I restriction endonuclease site 0.7 kb 5′ to J1 to a HindIII restriction endonuclease site 1.3 kb 3′ to J4 (Fig. 2). The strategy used in sequencing this 3 kb DNA segment and a physical map of the human κ J/C locus are shown in Fig. 2, B and A, respectively. The sequence itself is shown in Fig. 3.

Computer analysis of this sequence revealed the existence of five distinct human J region coding segments that would encode amino acid positions 96-108 of a K polypeptide chain. The four J regions identified previously by heteroduplex anal- ysis (Ref. 14 and Fig. 1) correspond to 51-54; the fifth J region is located immediately 3' to this cluster. These J segment genes are clustered and regularly spaced, repeating approxi- mately every 300 basepairs along the DNA. We have num- bered these regions J1 through J5,5' to 3', in accordance with the nomenclature used for mouse K J region genes (7, 19) and mouse heavy chain J region genes (20). Although it is possible that additional human J region genes exist outside this 3 kb region, it seems unlikely in view of the regular spacing of these genetic elements in the mouse K and heavy chain systems. Furthermore, we have not been able to detect any additional J-like sequences on the cloned gennline 12 kb Bam HI DNA fragment using in situ hybridization to a J. probe (data not

by guest on January 8, 2021http://w

ww

.jbc.org/D

ownloaded from

Page 4: THE JOURNAL OF CHEMISTRY Vol. 257, No. 3. Issue of February … · 2001. 8. 24. · THE JOURNAL OF BIOLOGICAL CHEMISTRY Prrnted in U.S.A. Vol. 257, No. 3.Issue of February 10, pp.

0

100

z o o

300

*on 5 0 0

600

700

800

9 0 0

1000

1100

1200

0 0 0

1400

1500

1600

1700

1800

I I I I I I I I I AAAGATAAAGITAAGTClGTAGTCAAAClCGAGAATTGATTGCACATTTTCTTTGAAGAGCAAGCAAGATTCAGTCATTGGGTGAGAATAACTTGTCTAA

GTAATAGCTTCAGAAATGTCCTAGGGAACATAACATGTTCTGGACAGAGCCTTGGTCAATTGTCAGAAAGGGAGTTTTTGTATAGGAGGGAAGTTAAGAG I I I I I I I I I

I I

GAACCATTGTGTGTACACTiTTGGCCAGG~GACCAAGCTGGAGATCAAACGTAAGTACTTTTTTCCACTGATTCTTCACTGTTGCTAAT~AGTTTACTTT 1 I I I

T y r T h r P h ~ G 1 y G 1 n G l y T h r L y ~ L ~ u G l u I l ~ L ~ a A r ~ I I I I

O T G T T C C ~ T ~ G T G T G G A T T ~ T C A T T A G T C ~ G A T G C C A G G ~ A C T C T A A C I ~ A C T T C A T T C ~ C A G G T T A G G ~ A C A G A G G A G ~ G G A A A T T G T ~ C C A C A G G A C G

CTAGCTT5TGGCTAATTTTTAAGATTTCTAAATCAAAATAACTTCATTGGGGGAAAGAGGCTTGCTGAGCTTTCAGGGAGGlTTTTGTAAAGGGAAAAGl

TAAGACGAATCACTGTGATlCACTTTCGGCCCTGGGACCAAAGTGGATATCAAACGTAAGTACATCTGTCTCAATlATTCGTGAGATTTTAGlGCCATTG

I I I I I I I I I I I I I I

T A T C A T T T G ! G C A A G T T T T C T G C A T A T T T ~ G G T T G A A T A ~ A C C T G G T C A ~ C C A G A A G T A ? A T A G C A G G A ~ A C C A G A A A A ~ T C G A A A C T T ~ A A A A A G C T G A I PholhrPh~G1yProG1yThrl~~ValA~pIl~LyrArg

1 I I I I I

G C A A G T A G A ~ C G A C T T C T T ~ G G G T T T G A G ~ G G A G A A T A G ~ T T C C T T G G G ~ G A A A T G G G G ~ A G A A A T A G C ~ A G A T T T T T C i C T G A A C A A G ~ A G C C T A T C T C

ATATGATTGGCTTCAAGAGAGGTTTTTGTTGAGGGGAAAGGGTGAGATCCCTCACTGTGGCTCACTTTCGGCGGAGGGACCAAGGTGGAGATCAAACGTA I I I I I I

L~uThrPheGlyGlyGlyThrLy~V~lGluIlaly~Arg I I I I I I I I I

AGTGCACTTTCCTIATGCTTTTTCTTATA~GTTTAAATTTGAGCGTTTTTGTGTTTGAGATATTAGCTCAGGTCAATTCCAAAGAGTACCAGATTCTTTC

AAAAAGTCAGATGAGTAAGGGACAGAAAATTAGTTCATCTTAAGGAACAGCCAAGCGCTAGCCAGTTAAGTGAGGCATCTCAATTGCAAGATTTTCTCTG I I I I I I I I I

1519

J I

J 2

J3

J4

J5

FIG. 3. Nucleotide sequence of the human K J region genes. The sequence presented corresponds to the region of DNA indicated in Fig. 2B. The amino acid translation is shown below the coding regions (amino acids 96-108 of the K polypeptide chain). The human J regions have been numbered from the J region that is most 5’ in the cluster.

w c . . aHmul~R.gmb

FIG. 4. Homology between the individual human J region genes as shown by a self-comparison on a dot matrix computer program. A visualization of the homology exhibited among the five human J region gene sequences shown in Fig. 3 is presented. The human sequence is represented along both the horizontal and the vertical axes. In this way a self-comparison of the nucleotide sequence is obtained. The program compares each base of the human sequence along the horizontal axis to each base of the human sequence on the vertical axis by placing a dot when bases are identical and by leaving a blank space when bases are different. Perfect homology appears as a solid line of dots at a -45‘ angle; partial homology appears as a broken line. Insertions and deletions appear as a shift in the homology line by the number of bases present in one sequence but not in the other. Duplications appear as a second line parallel to the fust. The

shown). This result, coupled with the fact that there are no additional Bum HI fragments in the genome that hybridize to a J, probe (Zl), suggests that there are no more than five J region genes in the K system in man.

Can the five J region gene sequences account for all of the known amino acid sequences in the J region of human K

polypeptides? Even allowing variability at position 96 due to the flexibility of the V-J recombination site, there are more than 20 different human protein sequences in the K J region (22). These discrepancies could be accounted for by invoking somatic mutation within the J region genes during immuno- cyte development. From what is known about the K J regions in the BALB/C mouse, Le. that the gennline coding capacities of the four active J region genes can account for all of the amino acid sequences determined to date (7, 8), somatic mutation is unlikely to account for all of the divergent amino acid sequences. Rather, some of the differences are likely to represent extensive polymorphism in the J region gene se- quences in a highly outbred human population.

As discussed elsewhere (7 , 8), the close homology between the individual J region genes suggests that they arose as a result of duplication of a common ancestral gene. This can be dramatically visualized by performing a dot matrix compari- son of the human J region gene sequence to itself (Fig. 4). Each of the 39 basepair coding regions show striking homology to one another, while the DNA sequences between the various J region genes exhibit very little or no homology. The strong homology of the nucleotide sequence within the five J region genes is most probably due to strong selection for conservation at the amino acid level, although there may also be functional constraints on the nucleotide sequence. In addition, sequences that lie immediately 5’ to each of the J region genes appear to

grid orients the matrix with respect to the structural organization of the five human J region genes. The matrix here is a filtered version in which a match of four scores a dot.

by guest on January 8, 2021http://w

ww

.jbc.org/D

ownloaded from

Page 5: THE JOURNAL OF CHEMISTRY Vol. 257, No. 3. Issue of February … · 2001. 8. 24. · THE JOURNAL OF BIOLOGICAL CHEMISTRY Prrnted in U.S.A. Vol. 257, No. 3.Issue of February 10, pp.

1520 Evolution of Human Immunoglobulin K J Region Genes

MOUJ”

J1 J2 J3 J4 J6 r I

.. 11oObp/dv.; mW-41

FIG. 5. Homology between human and mouse J region gene sequences as shown by dot matrix comparison. The mouse sequence is represented dong the horizontal axis; the human se- quence along the vertical axis. The computer program used is that described in Fig. 4. The mouse sequence includes the five mouse J region genes and approximately 500 basepairs 3’ to mouse 55 (19). The human sequence is that presented in Fig. 3. The grid orients the matrix with respect to the structural organization of the genes. The extent of the shift in the -45” homology line in the region centered below human 53 represents the absence of a 300 basepair segment in human DNA that is present in the mouse. A region centered below position 1820 of the mouse sequence shows faint homology to human 55.

FIG. 7. Self-comparison of the mouse K J region genes by dot matrix sequence comparison. The mouse sequence presented in Fig. 5 is represented dong both the horizontal and vertical axes. The grid orients the matrix with respect to the structural organization of the J region genes. The homology line displaced but parallel to the main homology line (centered below the 54/55 IVS) indicates an imperfect direct repeat. This can be explained by postulating an unequal crossover between 54 and 53 on homologous chromosomes resulting in a duplication of the 53/54 intervening sequence.

COMPARISON OF HUMAN AND nousE KAPPA J REGION GENES

AGAAGGGTTTCTGTTCAGCAAGACAATGGAGAGCTCTCACTGTGGTGGACGTTCGGCCAAGGGACCAAGGTGGAAATCAAACGTGAGTAGAATTTAAACTTTGCTTCCTCAGTT

AGGAGGGTTTTTGTACAGCCAGACAGTGGAGTACTACCACTGTGGTGGACGTTCGGTGGAGGCACCAAGCTGGAAATCAAACGTAAGTAGAATCCAAAGTCTCTTTCTTCCGTT

- TrpThrPhaGlyGlnGlyThrLysValGluIleLysArg II IIIIIII I l l IIII I I I I I IIIII I1 I I I I I I I I I I I I I I I I I I I I l l I I I I I I I I I I I I I I I I I I I I IIIIIIII I l l I I I l l I I I l l

TrpThrPheGlyGlyGlyThrLysLeuGluIleLysArg

human J2
AGGGAGTTTTTGTATAGGAGGGAAGTTAAGAGGAACCATTGTGTGTACACTTTTGGCCAGGGGACCAAGCTGGAGATCAAACGTAAGTACTTTTTTCCACTGATTCTTCACTGT
TyrThrPheGlyGlnGlyThrLysLeuGluIleLysArg

mouse J2
GCTCAGTTTTTGTATGGGGGTTGAGTGAAGGGACACCAGTGTGTGTACACGTTCGGAGGGGGGACCAAGCTGGAAATAAAACGTAAGTAGTCTTCTCAACTC TTGTTCACTAA
TyrThrPheGlyGlyGlyThrLysLeuGluIleLysArg

mouse J3
CTAGGGAGGGTTTTGTGGAGGTAAAGTTAAAATAAATCACTGTAAATCACATTCAGTGATGGGACCAGACTGGAAATAAAACCTAAGTACATTTTTGCTCAACTGCTTGTGAAG
IleThrPheSerAspGlyThrArgLeuGluIleLysPro

human J3
GGGAGGTTTTTGTAAAGGGAAAAGTTAAGAC GAATCACTGTGATTCACTTTCGGCCCTGGGACCAAAGTGGATATCAAACGTAAGTACATCTGT CTCAATTATTCGTGAGA
PheThrPheGlyProGlyThrLysValAspIleLysArg

mouse J4
GGCAGGTTTTTGTAAAGGGGGGCGCAGTGATATGAATCACTGTGATTCACGTTCGGCTCGGGGACAAAGTTGGAAATAAAACGTAAGTAGACTTTTGCTCATTTACTTGTGACG
PheThrPheGlySerGlyThrLysLeuGluIleLysArg

human J4
AGAGAGGTTTTTGTTGAGGGGAAAGGG~GAGATCCCTCACTGTGGCTCACTTTCGGCGGAGGGACCAAGGTGGAGATCAAACGTAAGTGCACTTTCCTAATGCTTTTTCTTATA
LeuThrPheGlyGlyGlyThrLysValGluIleLysArg

mouse J5
AGGCAGGTTTTTGTAGAGAGGGGCATGTCATAGTCCTCACTGTGGCTCACGTTCGGTGCTGGGACCAAGCTGG~GCTGAAACGTAAGTACACTTTTCTCATCTTTTTTTATGTG
LeuThrPheGlyAlaGlyThrLysLeuGluLeuLysArg

human J5
AAAGAGATTTTTGTTAAGGGGAAAGTAATTAAGTTAACACTGTGGATCACCTTCGGCCAAGGGACACGACTGGAGATTAAACGTAAGCATTTTTCACCATTGTCCGAAATTTGC
IleThrPheGlyGlnGlyThrArgLeuGluIleLysArg

mouse remnant
AAAGAGGCTTTAGTTGAGAGGAAAGTAATTAA TACTATGG TCACCAT CCAAGAGATTGGATCGGAGAATAAGCATGAGTAGTTAT TGAGATCTGT

Sequence labels as printed, top to bottom: human J1, mouse J1, human J2, mouse J2, mouse J3, human J3, mouse J4, human J4, mouse J5, human J5, mouse remnant.

FIG. 6. Direct comparison of the human and mouse J region coding segments and their immediate flanking sequences. Portions of the sequences presented in Fig. 3 are compared directly to portions of the mouse J region sequences (7, 8). The particular cross-species pairs of J region genes that were chosen for comparison are those which exhibited the strongest homology to one another. Regions flanking the coding blocks are aligned to maximize homology by introducing small gaps where necessary. The amino acid translations of the coding segments are shown. The sequences GGTTTTTGT and CACTGTG, thought to be involved in DNA recombination (7, 8), are highlighted. Also shown is a comparison of human J5 with the region of DNA situated 3' to mouse J5 that showed homology in the mouse/human dot matrix comparison (Fig. 5). This sequence appears to be an evolutionary remnant of a J region that was once present in the mouse.


be strongly conserved (e.g. the 5' flank of J2 compared to the 5' flank of J3). These sequences are candidates for signal sequences involved in V-J recombination.

The J Region Gene Sequences Have Been Tightly Maintained in Evolution - Heteroduplex analysis indicated that four of five human J region genes were closely related to four of five mouse J region genes. The measurements of restriction fragment heteroduplexes (Fig. 1) indicated the closest evolutionary relationship between J regions 1, 2, 3, and 4 of man and J regions 1, 2, 4, and 5 of mouse, respectively. That is, a sequence related to mouse J3 was not present in the human J region cluster. In addition, the human J5 does not appear (by heteroduplex analysis (14)) to have a counterpart in the mouse J cluster. Insight into the basis of these differences was obtained by comparing the human and mouse nucleotide sequences using dot matrix analysis. The evolutionary relationship of the coding regions is confirmed by this analysis (Fig. 5). For example, the homology line seen by comparing human J2 to mouse J2 indicates that the human J2 is more closely related to the mouse J2 than to the other mouse J regions. Supporting this notion is the weak homology seen flanking these coding segments. This represents the degree of genetic "drift" in noncoding regions since the divergence of the human and mouse species. Also worth noting is the presence of a faint homology line corresponding to human J5 and a sequence centered at position 1820 of the mouse sequence (see below). This may represent an evolutionary remnant of a J region once present in the mouse J region cluster. In this respect, the mouse J3 pseudogene may also represent an analogous evolutionary remnant (7), released from selection, that has been accumulating point mutations for a much shorter time period.

A direct cross-species comparison of "related" J region genes is shown in Fig. 6. It is clear that the nucleotide sequences of these coding segments have been much more highly conserved than the constant region coding segments. Of the 156 nucleotides that comprise the four related sets of J region coding segments, nucleotide substitutions have taken place eight times in the first position, three times in the second position, and seventeen times in the third position of amino acid codons. Thus, the overall homology in the coding regions

[Figure 8 diagram: EVOLUTION OF HUMAN AND MOUSE J REGION GENES. Gene orders shown, with intervening segments lettered a-f: J1 J2 J3 J4 rem 5 (a b c d e f); after an unequal crossover duplicates the J3/J4 IVS: J1 J2 J3 J3 J4 rem 5 (a b c d d' e f); after a point mutation inactivates J3: J1 J2 rem 3 J3 J4 rem 5 (a b c d d' e f).]

FIG. 8. A model to account for the evolution of human and mouse J region genes after divergence of the species. A, an evolutionary tree of mouse and human J region genes is presented. Three hypothetical evolutionary events are indicated (1-3) which are thought to have occurred in the mouse after divergence of the species. B, a diagrammatic representation of the evolutionary events indicated above. J region coding segments are represented as solid boxes. Hatched boxes represent J regions which have been inactivated by a mutational event and subjected to subsequent mutation in the absence of selection. The degree of hatching represents the extent of nucleotide sequence homology to active J region genes. The resultant J region locus in the mouse would contain four active J regions and two J regions which are evolutionary remnants or pseudogenes. The middle J structure, J3 (rem 3), would be absent in the human J region cluster.


is a striking 82%. The strongest homology is seen in the two short sequences immediately 5' to the J regions that are thought to be involved in a recombinational stem intermediate for V-J joining (7, 8). These sequences, GGTTTTTGT and CACTGTG, have been conserved to the extent of 97%. Only two nucleotide differences have occurred in the 64 nucleotides that comprise the four sets of these putative recombination signals. This extreme conservation of sequence lends additional support to the idea that these sequences are functionally significant and act in some way as signal sequences in V-J recombination. Also shown is a comparison of human J5 to the region of DNA in the mouse sequence recognized in the matrix as homologous. This homologous region (positions 1775-1870 of the mouse sequence) is found approximately 200 basepairs 3' to the mouse J5 rather than at the 300-basepair interval that would be expected of a J region regularly spaced along the DNA. This is due to an apparent deletion of approximately 100 basepairs in the mouse sequence (between J5 and the remnant J), as evidenced by a shift in the -45° line in the matrix at approximately position 1690 of the mouse sequence (Fig. 5). It is likely that the sequence in the mouse that is homologous to human J5 represents an evolutionary remnant of a J region gene that has been released from selective pressure and thereafter been subjected to extensive mutation.
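The homology figures quoted above follow from simple counting. As an illustrative sketch only (the function names are hypothetical, and the two 39-nucleotide coding segments are transcribed from the human J2/mouse J2 pair in Fig. 6, so any transcription error would carry over), substitutions can be tallied by codon position for one pair, and the two percentages can be recomputed from the totals given in the text:

```python
# Coding segments (13 codons each) transcribed from Fig. 6; illustrative only.
HUMAN_J2 = "TACACTTTTGGCCAGGGGACCAAGCTGGAGATCAAACGT"
MOUSE_J2 = "TACACGTTCGGAGGGGGGACCAAGCTGGAAATAAAACGT"


def substitutions_by_codon_position(seq_a, seq_b):
    """Count mismatches at codon positions 1, 2 and 3 of two aligned, in-frame sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must have equal length, a multiple of 3")
    counts = {1: 0, 2: 0, 3: 0}
    for index, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a != b:
            counts[index % 3 + 1] += 1
    return counts


def percent_identity(total, differences):
    """Percentage of positions that are unchanged."""
    return 100.0 * (total - differences) / total


j2_counts = substitutions_by_codon_position(HUMAN_J2, MOUSE_J2)
print(j2_counts)                                  # {1: 1, 2: 1, 3: 5}

# Totals stated in the text: 8 + 3 + 17 substitutions in 156 coding nucleotides,
# and 2 differences in the 64 nucleotides of the putative recombination signals.
print(round(percent_identity(156, 8 + 3 + 17)))   # 82
print(round(percent_identity(64, 2)))             # 97
```

For this one pair, most substitutions fall in the third codon position, in line with the overall tally reported for the four pairs.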

Gene Duplication by Unequal Crossing Over Seems To Have Occurred in the κ J Region Locus - That the mouse pseudo-J sequence, J3, arose from a gene duplication event that followed human/mouse divergence can be deduced by the following argument. In the mouse/human dot matrix comparison (Fig. 5), the intervening sequence between the human J3 and J4 exhibits homology not only to the "corresponding" intervening sequence between J4 and J5 of the mouse, but also to the intervening sequence between the mouse J3 and J4. This relationship can be explained by supposing that the mouse J3/J4 intervening sequence is related to the adjacent mouse J4/J5 intervening sequence. If this were the case, we can postulate a recent duplication in the mouse of the J3/J4 intervening sequence by an unequal crossing over event that occurred between J4 and J3 on homologous chromosomes. This event is schematically depicted in Fig. 8B, step 2. Whether a duplication event such as this has occurred can be tested directly by performing dot matrix sequence analysis of the mouse J region gene cluster against itself. That is, an unequal crossing over event in this region requires that there should be evidence for duplication (a second homology line displaced from but parallel to the main homology line) in the mouse cluster. The self-comparison of the mouse J region genes (Fig. 7) provides strong evidence for a recent duplication in the mouse. The J3/J4 intervening sequence is clearly homologous to the J4/J5 intervening sequence. The displaced homology line is indicative of a direct repeat precisely one J spacing unit in length; the degree of homology represents sequence drift subsequent to the duplication event. This type of unequal crossover has been postulated by others as a mechanism by which the number of copies of genes can vary in different organisms (23-26).
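The self-comparison test described above can be sketched in a few lines. This is not the program used in the paper: the window length, the stringency, and the synthetic sequence are all assumptions chosen for illustration. The sketch builds a sequence containing a direct repeat whose second copy has drifted by a few substitutions, scores every window pair, and reports the offset of the strongest displaced diagonal.

```python
import random


def dot_matrix_offsets(seq_a, seq_b, window=8, min_match=7):
    """Return {offset: hits}; a hit is a pair of windows (i in seq_a, j in seq_b)
    agreeing at min_match or more of `window` positions, with offset = j - i."""
    hits = {}
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= min_match:
                hits[j - i] = hits.get(j - i, 0) + 1
    return hits


def mutate(seq, positions):
    """Substitute a different base at each listed position (models sequence drift)."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    chars = list(seq)
    for p in positions:
        chars[p] = swap[chars[p]]
    return "".join(chars)


rng = random.Random(0)  # fixed seed so the synthetic example is reproducible


def rand_seq(n):
    return "".join(rng.choice("ACGT") for _ in range(n))


# Hypothetical locus: flank + 60-nt unit + drifted copy of the unit + flank.
unit = rand_seq(60)
seq = rand_seq(40) + unit + mutate(unit, [10, 30, 50]) + rand_seq(40)

hits = dot_matrix_offsets(seq, seq)
displaced = max((o for o in hits if o > 0), key=hits.get)
print("main diagonal hits:", hits[0])
print("strongest displaced diagonal at offset:", displaced)
```

In a self-comparison the main diagonal (offset 0) is always complete. A tandem duplication appears as a second diagonal displaced by the repeat length, here the 60-nucleotide unit, and drift within the copy is tolerated by allowing one mismatch per window.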

The Evolutionary History of Mouse and Human J Region Genes - Considering the observations noted above, we can construct a series of events to account for the relationships observed between the organization and sequences of the mouse and human J region genes (Fig. 8). In this proposal, it is suggested that the common ancestral species to mouse and man contained five functional J segment genes within the κ constant locus. Sometime relatively soon after the divergence of the species, the mouse J5 sequence was inactivated by a mutational event (step 1), subjecting it to random genetic drift and subsequent cumulative loss of homology to J region coding segments. In addition, an unequal crossing over event in the mouse between J4 and J3 on homologous chromosomes (step 2) gave rise to a chromosome that became fixed in the population with five functional J region genes and a remnant gene. The chronological order of these first two events is not known. Finally, a very recent mutational event, perhaps an RNA splice donor mutation as suggested elsewhere (7), inactivated mouse J3. The resultant unexpressed J3 gene then drifted to the limited extent observed at this point in evolutionary time. This model, although in no way proven, is the simplest in terms of postulating the fewest number of evolutionary events to account for our observations. Of course, more elaborate models are possible. The study of the organization and sequence of the J region genes in other species should provide further insight into these evolutionary processes.
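The three steps of this model can be traced with a toy representation of the locus. The sketch below is an assumption-laden illustration, not part of the original analysis: the locus is a list of labels taken from the Fig. 8 diagram (J segments interleaved with intervening segments a-f), the function names are hypothetical, and the crossover breakpoint is placed at the 5' boundary of the mispaired J segments, a choice the text does not specify.

```python
def unequal_crossover(chrom_a, chrom_b, gene_a, gene_b):
    """Recombine two homologs after gene_a (on chrom_a) mispairs with gene_b (on chrom_b).

    The breakpoint is assumed to lie just 5' of the mispaired genes.
    Returns (duplication_product, reciprocal_deletion_product)."""
    i = chrom_a.index(gene_a)
    j = chrom_b.index(gene_b)
    duplication = chrom_a[:i] + chrom_b[j:]
    deletion = chrom_b[:j] + chrom_a[i:]
    return duplication, deletion


def genes(chrom):
    """Keep only J segments and remnants, dropping intervening-sequence labels."""
    return [s for s in chrom if s.startswith(("J", "rem"))]


# State after step 1 (ancestral fifth J segment already inactivated: "rem5").
after_step1 = ["a", "J1", "b", "J2", "c", "J3", "d", "J4", "e", "rem5", "f"]

# Step 2: J4 on one homolog pairs with J3 on the other.
after_step2, reciprocal = unequal_crossover(after_step1, after_step1, "J4", "J3")

# Step 3: a point mutation inactivates the first copy of J3.
after_step3 = list(after_step2)
after_step3[after_step3.index("J3")] = "rem3"

print(genes(after_step2))  # ['J1', 'J2', 'J3', 'J3', 'J4', 'rem5']
print(genes(after_step3))  # ['J1', 'J2', 'rem3', 'J3', 'J4', 'rem5']
print(genes(reciprocal))   # ['J1', 'J2', 'J4', 'rem5']
```

Read in the present-day mouse nomenclature, the final order corresponds to J1, J2, the J3 pseudogene, J4, J5, and the remnant. The duplicated product also carries two copies of the intervening segment d, matching the d and d' labels in Fig. 8.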

Acknowledgments-We thank Terri Broderick for expert work in preparing this manuscript and Barbara Norman and Marion Nau for substantial contributions.

REFERENCES

1. Hozumi, N., and Tonegawa, S. (1976) Proc. Natl. Acad. Sci. U. S. A. 73, 3628-3632
2. Seidman, J. G., and Leder, P. (1978) Nature 276, 790-795
3. Lenhard-Schuller, R., Hohn, B., Brack, C., Hirama, M., and Tonegawa, S. (1978) Proc. Natl. Acad. Sci. U. S. A. 75, 4709-4713
4. Seidman, J. G., Max, E. E., and Leder, P. (1979) Nature 280, 370-375
5. Seidman, J. G., Leder, A., Nau, M., Norman, B., and Leder, P. (1978) Science 202, 11-17
6. Valbuena, O., Marcu, K. B., Weigert, M., and Perry, R. P. (1978) Nature 276, 780-784
7. Max, E. E., Seidman, J. G., and Leder, P. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 3450-3454
8. Sakano, H., Huppi, K., Heinrich, G., and Tonegawa, S. (1979) Nature 280, 288-294
9. Gilmore-Hebert, M., Hercules, K., Komaromy, M., and Wall, R. (1978) Proc. Natl. Acad. Sci. U. S. A. 75, 6044-6048
10. Rabbitts, T. H. (1978) Nature 275, 291-296
11. Perry, R. P., Kelley, D. E., and Schibler, U. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 3678-3682
12. Weigert, M., Perry, R., Kelley, D., Hunkapiller, T., Schilling, J., and Hood, L. (1980) Nature 283, 497-499
13. Padlan, E. (1977) Q. Rev. Biophys. 10, 35-65
14. Hieter, P. A., Max, E. E., Seidman, J. G., Maizel, J. V., Jr., and Leder, P. (1980) Cell 22, 197-207
15. Konkel, D. A., Maizel, J. V., Jr., and Leder, P. (1979) Cell 18, 865-873
16. Maxam, A. M., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 560-564
17. Maxam, A. M., and Gilbert, W. (1980) Methods Enzymol. 65, 499-560
18. Korn, L. J., Queen, C. L., and Wegman, M. N. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 4401-4405
19. Max, E. E., Maizel, J. V., Jr., and Leder, P. (1981) J. Biol. Chem. 256, 5116-5120
20. Sakano, H., Maki, R., Kurosawa, Y., Roeder, W., and Tonegawa, S. (1980) Nature 286, 676-683
21. Hieter, P. A., Korsmeyer, S. J., Waldmann, T., and Leder, P. (1981) Nature 290, 368-372
22. Kabat, E. A., Wu, T. T., and Bilofsky, H. (1979) NIH Publication No. 80-2008
23. Hood, L., Campbell, J. H., and Elgin, S. C. R. (1975) Annu. Rev. Genet. 9, 305-353
24. Tartof, K. D. (1975) Annu. Rev. Genet. 9, 355-385
25. Smith, G. P. (1976) Science 191, 528-535
26. Zimmer, E. A., Martin, S. L., Beverley, S. M., Kan, Y. W., and Wilson, A. C. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 2158-2162


P. A. Hieter, J. V. Maizel, Jr., and P. Leder. Evolution of human immunoglobulin kappa J region genes. J. Biol. Chem. 1982, 257:1516-1522.

Access the most updated version of this article at http://www.jbc.org/content/257/3/1516
