
Proc. Natl. Acad. Sci. USA
Vol. 82, pp. 63-67, January 1985
Biochemistry

Gene encoding human growth hormone-releasing factor precursor: Structure, sequence, and chromosomal assignment

(peptide hormone/hypothalamus/alternative RNA processing/chromosome sorting)

KELLY E. MAYO*, GAIL M. CERELLI*, ROGER V. LEBO†, BARRY D. BRUCE†, MICHAEL G. ROSENFELD‡, AND RONALD M. EVANS*

*Molecular Biology and Virology Laboratory, The Salk Institute for Biological Studies, P.O. Box 85800, San Diego, CA 92138; †Howard Hughes Medical Institute, Department of Medicine, University of California, San Francisco, CA 94143; and ‡Eukaryotic Regulatory Biology Program, School of Medicine, University of California, San Diego, CA 92093

Communicated by Helen M. Ranney, August 31, 1984

ABSTRACT We have isolated and characterized overlapping clones from phage λ and cosmid human genomic libraries that predict the entire structure of the gene encoding the precursor to human growth hormone-releasing factor. The gene includes five exons spanning 10 kilobase pairs of human genomic DNA. There appears to be a segregation of distinct functional regions of the GRF precursor and its mRNA into the five exons of the gene. The DNA sequences of all exons, intron/exon boundaries, and 5' and 3' flanking regions are presented. Dot-blot analysis of DNA from high-resolution dual-laser-sorted human chromosomes indicates that the single-copy growth hormone-releasing factor gene is located on human chromosome 20.

The regulation of growth hormone synthesis and secretion in the anterior pituitary is under complex neural and hormonal control (1). This regulation is mediated in part by two hypothalamic releasing factors. The inhibitory peptide, somatostatin, was characterized more than a decade ago (2) and the stimulatory peptide, growth hormone-releasing factor (GRF), was isolated and characterized only recently from two human pancreatic tumors possessing growth hormone-releasing activity (3, 4). Human GRF is a 44-amino acid peptide amidated at the COOH terminus. Nonamidated forms, consisting of 40 and 37 amino acids, that have full biological activity were also characterized (3, 4). Antibodies raised against the GRF-40 peptide specifically stain cell bodies and nerve fibers in the medial basal hypothalamus and median eminence, suggesting that the peptide isolated from pancreatic tumors is the same as that found in the hypothalamus (5, 6).

GRF and other hypothalamic releasing factors play a central role in directing endocrine responses to neural stimuli; analysis of their biosynthesis, therefore, provides a unique opportunity to study coordinate neural and hormonal control of gene expression. Toward this goal, we, and others, have isolated cDNA recombinant clones encoding human GRF (7, 8). Analysis of these cDNA clones indicated that mature GRF is proteolytically processed from a 108-amino acid precursor protein. To further studies of the regulation of this physiologically and clinically important protein, we have now isolated and structurally characterized genomic clones containing the entire human GRF gene. In addition, to further the analysis of the potential role of GRF in various heritable human growth disorders, we have mapped the GRF gene to human chromosome 20.

MATERIALS AND METHODS

Isolation and Mapping of Genomic Clones. The 350-base-pair (bp) EcoRI/BamHI fragment from a human GRF cDNA clone (7) was nick-translated using [α-³²P]dCTP (410 Ci/mmol; 1 Ci = 37 GBq) to a specific activity of >10⁸ cpm/μg and used as a hybridization probe to screen at high density human genomic libraries constructed in either λ Charon 28 (9) or cosmid pHC79 (10). Positive plaques or colonies were rescreened at lower density until pure. A combination of restriction enzyme mapping and Southern DNA blotting was used to locate regions of the clones that hybridized to the GRF cDNA probe. Exon 1 was localized using a kinase-labeled synthetic oligonucleotide predicted from the sequence of a larger human GRF cDNA (8) to be specific for the 5' nontranslated region of the GRF mRNA (the oligonucleotide corresponds to nucleotides 36-67 of Fig. 3). Details of all techniques used to isolate and analyze these genomic clones have been described (11). [α-³²P]dCTP was from New England Nuclear; all enzymes were from either New England Biolabs or Bethesda Research Laboratories.

DNA Sequencing. Regions of genomic clones that hybridized to the GRF cDNA probe were subcloned in plasmid vectors by standard techniques. Restriction-enzyme-digested DNA was 5' labeled using bacterial alkaline phosphatase, T4 polynucleotide kinase, and [γ-³²P]ATP (11). Pst I sites were also labeled by use of [α-³²P]dCTP and T4 DNA polymerase (11). DNA sequence determination was by the chemical method of Maxam and Gilbert (12).

RNA Analysis. Poly(A)+ RNA was prepared by standard techniques from a human thymic carcinoma ectopically producing GRF (kindly provided by M. Thorner, University of Virginia). RNA blot analysis was carried out as described (13), using denaturing formaldehyde/agarose gels. For primer extension of RNA, a synthetic oligonucleotide complementary to a 32-nucleotide sequence from the 5' nontranslated region of the GRF mRNA (see above) was synthesized. Hybridization and primer extension were done as described (14), using avian myeloblastosis virus reverse transcriptase (Bethesda Research Laboratories). For nuclease S1 mapping, the 450-bp Xba I/EcoRV fragment including most of the first exon (see Fig. 1B) was 5'-end-labeled at the EcoRV site by use of polynucleotide kinase, strand-separated on an acrylamide gel (11), and used according to established protocols (15).

Chromosome Mapping. Chromosome suspensions were prepared from a lymphocyte cell line and then stained with

Abbreviations: GRF, growth hormone-releasing factor; bp, base pair(s); kb, kilobase pairs; DIPI, 4',6-bis(2"-imidazolinyl-4H,5H)-2-phenylindole.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.


FIG. 1. Human GRF genomic clones and subclones. (A) Structure of three overlapping clones isolated from human genomic libraries. Dark boxes indicate regions of hybridization to GRF cDNA or oligonucleotide probes, as described in the text. Wavy lines indicate the phage or cosmid vector DNA. The scale is broken by the slashes at the 3' ends of the phage clones and at the 5' end of the cosmid clone. A restriction map of sites for seven different enzymes is shown; all sites indicated are present within the insert of one or more of the clones and do not represent sites within the vector DNA. (B) Subclones used in the sequence analysis of the GRF gene. The location of each subclone within the genomic clones can be determined from A. Subclones and their derivation are: exon 1, 3.5-kb BamHI/BamHI fragment from hGRFcos49; exons 2 and 3, 1.7-kb EcoRI/Bgl II fragment from hGRFcos49; exon 4, 4.5-kb Xma I/Xma I fragment from hGRFλ101; exon 5, 1.3-kb HindIII/HindIII fragment from hGRFλ101. The dark boxes indicate the exons and parallel slashes indicate a break in the scale. Regions sequenced are indicated below each subclone; filled circles at the origins of arrows indicate restriction sites that were radiolabeled, and the arrows represent the direction and extent of sequence analysis. The two sets of arrows from the unique Pst I site in exon 4 indicate sequence reactions carried out after labeling either 5' ends with T4 polynucleotide kinase or 3' ends with T4 DNA polymerase as described in Materials and Methods.

4',6-bis(2"-imidazolinyl-4H,5H)-2-phenylindole (DIPI)/chromomycin A3 stain-pair (16). Thirty thousand chromosomes of each type were sorted onto a single spot of a nitrocellulose filter using a dual-laser custom FACS IV chromosome-sorter (17). The filter-bound chromosomal DNA was denatured, neutralized, prehybridized, and then hybridized in 10% dextran sulfate to nick-translated GRF cDNA probe (see above). Filters were washed and autoradiographed using standard conditions.

RESULTS

Screening of Human Genomic Libraries. The insert from a previously isolated human GRF cDNA clone was used as a hybridization probe to screen human genomic libraries. This probe includes nearly all of the GRF precursor coding sequences and the 3' nontranslated region, but does not include the 5' nontranslated region. We initially screened 250,000 plaques from a human phage λ library constructed in Charon 4A (18) and 400,000 plaques from another human phage λ library constructed in Charon 28 (9); we isolated two phage clones that hybridized strongly to the human GRF cDNA probe. The structures of these clones (hGRFλ101 and hGRFλ111) are shown in Fig. 1A. These two overlapping phage clones were found to contain four distinct regions of hybridization to the GRF cDNA probe; however, subsequent analysis revealed that neither contained the entire GRF gene. Because our experience with the phage clones indicated that the GRF gene was substantially larger than expected, we next screened a human cosmid gene bank (10) and isolated a cosmid clone that overlapped substantially with the two phage clones; the structure of this cosmid clone (hGRFcos49) is shown in Fig. 1A. Because our human GRF cDNA clone did not include 5' nontranslated sequences, it was difficult to conclusively show that the cosmid clone included the 5' end of the GRF gene. To establish this, we used a synthetic oligonucleotide specific for the 5' nontranslated region, based on the published sequence of another, larger human GRF cDNA clone (ref. 8; see Materials and Methods). Use of the combined cDNA and oligonucleotide probes showed that the three overlapping genomic clones depicted in Fig. 1A contained the entire human GRF gene.

FIG. 2. Structure of the human GRF cDNA and gene. Top line shows a schematic of the GRF cDNA. The 5' nontranslated region (5' NT), the signal-peptide-encoding region, the GRF-encoding region, the 3' nontranslated region (3' NT), and the poly(A) tract are indicated. Shaded regions encode the NH2-terminal and COOH-terminal flanking peptides (7, 8). Bottom line shows a schematic of the GRF gene. Open boxes indicate noncoding exon regions; dark boxes indicate coding exon regions. Connecting lines indicate the relationship between structural regions of the cDNA and exons of the gene. Notice that the two scales are different.

Structure of the GRF Gene. The three genomic clones isolated (Fig. 1A) were analyzed in detail by restriction enzyme mapping, Southern DNA blotting, and partial DNA sequence analysis. The complete structure of the human GRF gene as predicted from these overlapping genomic clones is shown schematically in Fig. 2. The gene includes five exons separated by 4 introns and spans 10 kilobase pairs (kb) of genomic DNA. Fig. 2 also compares the structure of the gene with that of the cDNA, and shows that the five exons of the gene appear to encode functionally discrete domains of the GRF precursor and its mRNA. Exon 1 includes the 5' nontranslated sequences, exon 2 encodes the signal peptide and small NH2-terminal connecting peptide, exon 3 encodes most of the mature GRF peptide (including all of the biologically active portion), exon 4 encodes the COOH-terminal peptide (of unknown function), and exon 5 contains the 3' nontranslated sequences.

DNA Sequence of the GRF Gene. To determine more precisely the structure of the GRF gene and to define sequences important for the expression of this gene, we have sequenced all exons, intron/exon boundaries, and 5' and 3' flanking regions of the human GRF gene. For sequencing, regions of GRF genomic clones were first subcloned in plasmid vectors. The strategy used for DNA sequence analysis is shown in Fig. 1B. The sequence determined is presented in Fig. 3. The sequence is numbered from the putative mRNA 5' cap site, determined as described in the following section. The sequences of all exons agree precisely with those previously determined from human GRF cDNA clones (7, 8). All of the 4 introns begin and end with the consensus sequences G-T and A-G, respectively (19). Sequences related to consensus "TATA" and "CAT" boxes, found in the 5' flanking region of many eukaryotic genes (19, 20), are located 30 and 75 nucleotides, respectively, 5' of the putative mRNA cap site. The sequence A-A-T-A-A-A, thought to be important in polyadenylylation (21), is located 31 nucleotides 5' of the poly(A) addition site.
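The G-T ... A-G observation can be checked mechanically. The following is a minimal Python sketch, not part of the original analysis: the 10-nucleotide intron ends are transcribed from Fig. 3 as well as they can be read (an assumption, since the printed figure is the authority), and the names `INTRON_ENDS`, `follows_gt_ag_rule`, and `check_introns` are purely illustrative.

```python
# Intron ends as transcribed from Fig. 3 (first 10 and last 10 nucleotides).
# The transcription is an assumption and may contain reading errors.
INTRON_ENDS = {
    "A": ("GTGAGAGGGG", "CACTCTGCAG"),
    "B": ("GTAAGCAGAC", "CCTCTCACAG"),
    "C": ("GTAAGCAGGG", "AACCACACAG"),
    "D": ("GTATGGGTGT", "CTTTCTGCAG"),
}


def follows_gt_ag_rule(five_prime: str, three_prime: str) -> bool:
    """Return True if an intron starts with GT and ends with AG."""
    return five_prime.upper().startswith("GT") and three_prime.upper().endswith("AG")


def check_introns(intron_ends: dict) -> dict:
    """Map each intron name to whether it obeys the GT...AG rule."""
    return {
        name: follows_gt_ag_rule(start, end)
        for name, (start, end) in intron_ends.items()
    }


if __name__ == "__main__":
    for name, ok in check_introns(INTRON_ENDS).items():
        print(f"Intron {name}: {'GT...AG' if ok else 'non-consensus'}")
```

Only the two terminal dinucleotides are tested here; a fuller check would score the extended splice-donor and splice-acceptor consensus sequences cited in ref. 19.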

It was previously reported (8) that two distinct types of GRF cDNAs exist. These forms differ by the inclusion or exclusion of 3 nucleotides from the coding region and encode either a 107- or a 108-amino acid GRF precursor protein. Our sequence analysis suggests that these two forms result from alternative RNA processing of a single GRF gene primary transcript. As indicated in Fig. 3, differential utilization of two splice-acceptor sites at the beginning of exon 5 would result in inclusion or exclusion of serine-103 and generate either the 107- or the 108-amino acid forms of the precursor.
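The effect of the two splice-acceptor sites on the reading frame can be illustrated with a short translation. This is a hedged sketch rather than the authors' procedure: the exon 4 and exon 5 fragments are transcribed from Fig. 3 (an assumption), the standard genetic code is used, and `translate` and `splice` are hypothetical helper names.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Fragments transcribed from Fig. 3 (assumed reading of the printed figure).
EXON4_TAIL = "CAGAAGCACAG"          # in-frame end of exon 4: Gln Lys His + "AG"
EXON5_HEAD = "CAGGAACTCCCAGGGATGA"  # start of exon 5 at the upstream acceptor


def translate(dna: str) -> str:
    """Translate in frame 0 until the first stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def splice(exon4_tail: str, exon5_head: str, skip: int = 0) -> str:
    """Join exon 4 to exon 5, skipping `skip` nucleotides at the acceptor site."""
    return exon4_tail + exon5_head[skip:]


long_form = translate(splice(EXON4_TAIL, EXON5_HEAD, skip=0))   # upstream acceptor
short_form = translate(splice(EXON4_TAIL, EXON5_HEAD, skip=3))  # downstream acceptor

if __name__ == "__main__":
    print(long_form)
    print(short_form)
```

With these fragments the two translations differ only by a single serine immediately after the histidine, and the reading frame downstream is unchanged because exactly three nucleotides are skipped.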

Analysis of the GRF genomic clones revealed a sequence upstream from exon 2 that was highly repeated in the human genome. This repetitive element has been completely sequenced (see Fig. 1B for location and sequencing strategy) and was found to be highly homologous to the consensus sequence for the human Alu family of repeats (22).

RNA Analysis. Because it was not obvious from sequence analysis of cDNA and genomic clones where the mRNA 5' cap site was located, we attempted to directly determine this by an analysis of GRF mRNA. To do this, we utilized RNA isolated from a human thymic carcinoma ectopically producing GRF (GRF-tumor). Blot hybridization analysis of this RNA, shown in Fig. 4A, revealed that the GRF mRNA is approximately 750 nucleotides long. Assuming that the poly(A) tract contributes ~200 residues (23), this indicates that the cap site should be about 550 nucleotides upstream from the poly(A) addition site. To more precisely define the 5' end of the GRF mRNA, we performed primer extension of GRF-tumor poly(A)+ RNA that had been hybridized to a synthetic oligonucleotide corresponding to a cDNA sequence thought to be near the 5' end of the mRNA. This analysis, shown in Fig. 4B, indicated that the extended product ended near the A residue at position 3 in Fig. 3; a minor extension product of slightly larger size was also identified (Fig. 4B). Finally, we performed a nuclease S1 mapping experiment using GRF-tumor RNA (Fig. 4C). A single band is present in the GRF-tumor RNA sample that is not present in the control RNA sample (Fig. 4C). This protected fragment maps to the C residue at position -1 in Fig. 3. The resolution of these techniques localizes the cap site to within a few nucleotides; however, since most eukaryotic mRNAs begin at an adenosine within the consensus sequence Py-C-A-Py, where Py is a pyrimidine (24), we presume that the A at position 1 in Fig. 3 is most likely to correspond to the 5' end of

-350
TCTAGACAGGGTCTCATTATGTTGCCAGGTTGGTCTCAAACTTCTGAGCTCAGGCAATCC
-300
ACCCGCCTCAGCCTCCCAAAGTGCTAGGATGGCAGGTGTGAGCCACCGCGCCCAGCCGAG
-250
TTCTCCAATCACTATTATAGCAGTATATATTCTCTATATCCTCTTGGAATAATGTTACAC
-200
CTTTGTACTATGTCCACTGTGCCAAAGATAAAAGGAGACTTTACTAGGAGTCTAAGTCTG
-150 -100
CAAGGGGCCAAACCTCTTTCACCAACAGGGTTTGTCAGTGTGATATGATGCTAAAAACAG
-50
TCCTTTGGTTGACTTGT~ ffjTATTCTCTGACGCTGACAACGCTTAGGAAAATGAA
1 Exon 1
GA ~ATGGGA ACGCCAGGCGGCTGCCAGAGCAAACACCCAGCCCAGGGCCCCTG
GATTTGAGCAGTGCCTCGGAGCAGAGGGATATCTGCCGCATCAGGTGAGAGGGG......
Exon 2
Intron A: 3.7 kb......CACTCTGCAGGTGCCACCCCGGGTGAAGGATGCCACTC
MetProLeu
100 150
TGGGTGTTCTTCTTTGTGATCCTCACCCTCAGCAACAGCTCCCACTGCTCCCCACCTCCC
TrpValPhePhePheValIleLeuThrLeuSerAsnSerSerHisCysSerProProPro
Exon 3 200
CCTTTGACCCTCAGGTAAGCAGAC ... Intron B: 230 bp.....CCTCTCACAGG
ProLeuThrLeuAr g
ATGCGGCGGTATGCAGATGCCATCTTCACCAACAGCTACCGGAAGGTGCTGGGCCAGCTG
MetArgArgTyrAlaAspAlaIlePheThrAsnSerTyrArgLysValLeuGlyGlnLeu
250
TCCGCCCGCAAGCTGCTCCAGGACATCATGAGCAGGCAGCAGGGGTAAGCAGGG......
SerAlaArgLysLeuLeuGlnAspIleMetSerArgGlnGlnGl
Exon 4 300
Intron C: 2.4 kb......AACCACACAGAGAGAGCAACCAAGAGCGAGGAGCAAGG
yGluSerAsnGlnGluArgGlyAlaArg
350
GCACGGCTTGGTCGTCAGGTAGACAGCATGTGGGCAGAACAAAAGCAAATGGAATTGGAG
AlaArgLeuGlyArgGlnValAspSerMetTrpAlaGluGlnLysGlnMetGluLeuGlu
AGCATCCTGGTGGCCCTGCTGCAGAAGCACAGGTATGGGTGT .....Intron D: 3.0 kb..
SerIleLeuValAlaLeuLeuGlnLysHisSe
400 Exon 5
CTTTCTGCAGCAGGAACTCCCAGGGATGAAGATTCCTCCTGTGACCCGGGCTACCT
rArgAsnSerGlnGly
450 500
GTAGCCAAAATGCAACTGGATCCAGTTAATCCTCTCATTTCTGACCCACTTTTTCCTTTG
550 Poly A
AAAATAC A ATTCCCCCATACCGGTGTGCATTTAAATGTTCTTTCTCT TCAGCCTAATTTTTTATGTGTTTTACATAGTTACATTTCCCAAAGTGGTATTACGAGGTGTAATTTTTTTTAAAGCTAACATTTACCAAGCACGCAGAGTGCAACAGGCAGCTCATGCATAGGCATC

FIG. 3. DNA sequence of the human GRF gene. Nucleotide numbering is from the proposed mRNA cap site; the intron nucleotides are not numbered. Boxed regions indicate consensus sequences as described in the text; they are the CAT box, TATA box, and A-A-T-A-A-A sequence, in that order. Arrows indicate the beginning of exons or the poly(A) addition site (last arrow). The two-tailed arrow at the beginning of exon 5 indicates two alternative splice-acceptor sites, both of which appear to be utilized (see text). The larger letters represent nucleotides found in the mature mRNA. The first and last 10 nucleotides of each intron are shown and the approximate length of each intron is indicated. Amino acids indicated in italic print are those of the mature 44-amino acid GRF peptide.


FIG. 4. Analysis of human GRF mRNA. (A) RNA blot analysis. Lane 1, HeLa cell poly(A)+ RNA (5 μg); lane 2, poly(A)+ RNA (5 μg) from a human tumor ectopically producing GRF. The molecular weight markers are HindIII-digested and denatured simian virus 40 DNA fragments; sizes indicated are in nucleotides. (B) Primer-extension analysis. Lane 1, a synthetic oligonucleotide was used to prime synthesis with reverse transcriptase from human GRF-producing tumor poly(A)+ RNA (10 μg). G, A, T, and C indicate a set of DNA sequence reactions carried out using a fragment labeled at the EcoRV site in exon 1 (see Fig. 1). The sequence ladder cannot be used to directly read the position of the primer-extended product, but rather serves as a complete set of accurate size markers. The size of the band indicated is in nucleotides. The material at the bottom of lane 1 is from labeled primer that has not been extended. (C) Nuclease S1 protection analysis. RNA samples were hybridized to a single-stranded probe labeled at the EcoRV site in exon 1 (see Fig. 1), and nuclease-resistant products were analyzed. Lane 1, no RNA; lane 2, HeLa cell poly(A)+ RNA (5 μg); lane 3, human GRF-producing tumor poly(A)+ RNA (5 μg). G, A, T, and C indicate a set of DNA sequence reactions carried out on the labeled fragment used as probe; the position of the nuclease-resistant band can therefore be read directly from the sequence ladder (which is of the anti-sense strand). Several faint bands were present in the higher molecular weight region of the gel (not shown); however, all of these bands were also present in the control lanes (1 and 2) and are presumed to be nonspecific.

the GRF mRNA. This would place the transcription initiation site 30 nucleotides downstream from the TATA box, in agreement with the consensus distance (27-33 nucleotides) between these sequences (19, 24). Interestingly, one cDNA clone that contains sequences 5' of this putative cap site has been identified (8). It is thus possible that a small percentage of the GRF transcripts are initiated from an upstream site.

Chromosomal Location of the GRF Gene. Previous analysis of human genomic DNA by Southern blotting using GRF cDNA probes had indicated that the GRF gene was most likely single-copy in the human genome (7). To determine the chromosomal location of this gene, we used a dual-laser custom fluorescence-activated cell-sorter to sort mitotic chromosome suspensions stained with DIPI/chromomycin in conjunction with Hoechst 33258/chromomycin. This technique allows separation of the 24 human chromosome types into 22 fractions (16). After the chromosomes were sorted directly onto nitrocellulose, the chromosomal DNA was denatured and hybridized to a human GRF cDNA probe (7) (Fig. 5). In three independent experiments the GRF gene-specific probe hybridized to DNA from chromosome 20, but not to DNA from any other chromosomal fraction. We therefore assign the human GRF gene to chromosome 20.

DISCUSSION

We have determined the complete structure of the human GRF gene. This small peptide is encoded by a gene that contains 5 exons and spans 10 kb of human genomic DNA. We have noticed that there appears to be a segregation of distinct structural and/or functional regions of the GRF precursor and its mRNA into the 5 exons of the gene. It is interesting that, although the complete GRF peptide is encoded by two exons (exon 3 and part of exon 4), all of the biologically active portion of the peptide (3, 4) is encoded in exon 3. This exon therefore seems to encode a distinct functional domain. Several other examples of genes in which exons encode discrete structural or functional domains of the protein product have been observed, although the significance of this observation is unclear (25).

DNA sequence analysis provided the complete structure of the GRF gene and revealed that this gene contains all recognized consensus sequences thought to be involved in gene transcription and RNA processing. The 5' flanking region contains two unusual variations of the consensus TATA and CAT sequences (24). In the TATA box, believed to be important for transcription initiation, a G substitutes for the T normally found at position 1, generating the sequence G-A-T-A-A-A-T. Further evidence that the assignment of the CAT and TATA boxes is correct comes from an analysis of the rat GRF gene. The rat GRF gene is highly homologous to the human gene in this region, and contains consensus CAT and TATA box sequences in positions exactly analogous to those described here for the human gene (unpublished results). All intron/exon boundaries within the human GRF gene conform to previously described consensus splice-donor and splice-acceptor sequences (19).

Two forms of the GRF precursor protein, 107 and 108 amino acids long, have been identified by sequence analysis of

FIG. 5. Chromosome mapping of the human GRF gene. A dual-laser custom fluorescence-activated cell-sorter was used to sort DIPI/chromomycin-stained human mitotic chromosome suspensions into 22 fractions representing the 24 human chromosome types (chromosomes 10-12 are pooled). Thirty thousand chromosomes of each type were sorted and the DNA was hybridized to a human GRF cDNA probe (see Materials and Methods). Regions of a nitrocellulose filter onto which DNAs from four of the chromosomal fractions were sorted are shown. Chromosome 20 was positive in each of three independent experiments. All other chromosome fractions were negative (data not shown).


cDNA clones from a human pancreatic tumor (8). Southern blot analysis of human genomic DNA reveals that the GRF gene is single-copy (7), suggesting that the multiple protein precursors are derived from a single gene. Analysis of the GRF gene sequence identifies two consensus splice-acceptor sites (19), spaced by three nucleotides, at the 5' end of exon 5. Differential usage of these two possible splice-acceptor sites would result in either inclusion or exclusion of serine-103 and thus generate the 107- or 108-amino acid forms of the precursor. After proteolytic processing of the GRF precursor protein, this difference would be expected to result in either a 30- or a 31-amino acid COOH-terminal peptide of unknown function. Preliminary analysis of the rat GRF gene (unpublished results) indicates that two consensus splice-acceptor sites spaced by three nucleotides also occur at the 5' end of exon 5 in this gene. A similar type of alternative RNA processing occurs in the human growth hormone gene, where differential usage of two splice-acceptor sites generates growth hormone proteins that differ by inclusion or exclusion of 15 amino acids (26).

Because GRF is believed to be an important physiological regulator of growth hormone synthesis and secretion (27), it might potentially be involved in some of the described clinical syndromes in which growth hormone production is impaired: for example, familial isolated growth hormone deficiency (IGHD) and pituitary dwarfism (28). We have now determined by a combined chromosome-sorting/dot-blotting technique that the GRF gene is located on human chromosome 20. This observation indicates that the GRF gene is chromosomally linked to the loci for adenosine deaminase, the src proto-oncogene, inosine triphosphatase, and S-adenosylhomocysteine hydrolase (29). It further suggests that, if GRF is involved in any of the described human growth disorders, it is likely to be involved in those that demonstrate an autosomal inheritance (such as IGHD type IB or pituitary dwarfism type I) rather than in those that are X-linked (such as IGHD type III or pituitary dwarfism type II) (28).

The availability of human GRF gene probes will allow us to begin to analyze the manner in which expression of this gene is regulated by hormonal and neural factors. In addition to studying regulation in the hypothalamus, it will now be possible to introduce the cloned gene into animal cells or animals in an attempt to define the types of factors that regulate this gene as well as the genomic sequences that mediate this regulation.
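The splice-acceptor argument made earlier in this discussion (two acceptor sites spaced by three nucleotides, giving precursors that differ by one serine) can be illustrated with a minimal sketch. The sequences below are invented for illustration and are not the GRF gene sequence; the names `UPSTREAM_EXON`, `DOWNSTREAM_REGION`, `spliced_mrna`, and `translate` are hypothetical, and the codon table covers only the codons used in this toy example.

```python
# Toy model of two splice-acceptor sites spaced by three nucleotides.
# All sequences are invented; they are NOT taken from the GRF gene.

UPSTREAM_EXON = "ATGGCTAG"               # ends two bases into a codon
DOWNSTREAM_REGION = "ttttcagCAGGCTTTAA"  # lowercase = intron tail, uppercase = exon

# Partial codon table: only the codons that occur in this example.
CODONS = {"ATG": "M", "GCT": "A", "AGC": "S", "AGG": "R", "CTT": "L", "TAA": "*"}

# Index of the first exon base after each candidate acceptor AG.
FIRST_ACCEPTOR = 7                    # just after the intron's terminal "ag"
SECOND_ACCEPTOR = FIRST_ACCEPTOR + 3  # just after the "AG" three bases downstream


def spliced_mrna(acceptor_end: int) -> str:
    """Join the upstream exon to the sequence that follows the chosen acceptor."""
    # Each candidate acceptor must end in the AG dinucleotide.
    assert DOWNSTREAM_REGION[acceptor_end - 2:acceptor_end].upper() == "AG"
    return UPSTREAM_EXON + DOWNSTREAM_REGION[acceptor_end:].upper()


def translate(mrna: str) -> str:
    """Translate codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODONS[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


long_form = translate(spliced_mrna(FIRST_ACCEPTOR))    # serine codon retained
short_form = translate(spliced_mrna(SECOND_ACCEPTOR))  # three bases skipped

print(long_form, short_form)
```

In this toy case the two products differ by a single serine and the downstream reading frame is unchanged, because the skipped segment is exactly three nucleotides long. This is the same reasoning that relates the 107- and 108-amino acid precursor forms in the text.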

We thank Michael Thorner for providing the tumor tissue, Phil Leder and John Collins for making human genomic libraries available, Michael Harpold for providing the synthetic oligonucleotide, and Estelita Ong for help in growing phage. We appreciate comments on the manuscript by our colleagues and the secretarial assistance of Marijke terHorst and Connie Meloan. This work was supported by grants from the National Institutes of Health and National Institute of Arthritis, Diabetes, and Digestive and Kidney Diseases to Ronald M. Evans and by a Damon Runyon-Walter Winchell Cancer Fund Postdoctoral Fellowship to K.E.M. (DRG-648).

1. Martin, J. B. (1979) Physiologist 22, 23-29.
2. Brazeau, P., Vale, W., Burgus, R., Ling, N., Butcher, M., Rivier, J. & Guillemin, R. (1973) Science 179, 77-79.
3. Rivier, J., Spiess, J., Thorner, M. & Vale, W. (1982) Nature (London) 300, 276-279.
4. Guillemin, R., Brazeau, P., Bohlen, P., Esch, F., Ling, N. & Wehrenberg, W. (1982) Science 218, 585-587.
5. Bloch, B., Brazeau, P., Ling, N., Bohlen, P., Esch, F., Wehrenberg, W., Benoit, R., Bloom, F. & Guillemin, R. (1983) Nature (London) 301, 607-608.
6. Merchenthaler, I., Vigh, S., Schally, A. V. & Petrusz, P. (1984) Endocrinology 114, 1082-1085.
7. Mayo, K. E., Vale, W., Rivier, J., Rosenfeld, M. G. & Evans, R. M. (1983) Nature (London) 306, 86-88.
8. Gubler, U., Monahan, J. J., Lomedico, P. T., Bhatt, R. S., Collier, K. J., Hoffman, B. J., Bohlen, P., Esch, F., Ling, N., Zeytin, F., Brazeau, P., Poonian, M. S. & Gage, L. P. (1983) Proc. Natl. Acad. Sci. USA 80, 4311-4314.
9. Ravetch, J. V., Siebenlist, U., Korsmeyer, S., Waldmann, T. & Leder, P. (1981) Cell 27, 583-591.
10. Hohn, B. & Collins, J. (1980) Gene 11, 291-298.
11. Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY).
12. Maxam, A. M. & Gilbert, W. (1980) Methods Enzymol. 65, 499-560.
13. Meinkoth, J. & Wahl, G. (1984) Anal. Biochem. 138, 267-284.
14. Agarwal, K. L., Brunstedt, J. & Noyes, B. E. (1981) J. Biol. Chem. 256, 1023-1028.
15. Berk, A. J. & Sharp, P. A. (1977) Cell 12, 721-732.
16. Lebo, R. V., Gorin, F., Fletterick, R. J., Kao, F-T., Cheung, M-C., Bruce, B. D. & Kan, Y. W. (1984) Science 225, 57-59.
17. Lebo, R. V. & Bastian, A. M. (1982) Cytometry 3, 213-219.
18. Lawn, R. M., Fritsch, E. F., Parker, R. C., Blake, G. & Maniatis, T. (1978) Cell 15, 1157-1174.
19. Breathnach, R. & Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383.
20. Benoist, C., O'Hare, K., Breathnach, R. & Chambon, P. (1980) Nucleic Acids Res. 8, 127-142.
21. Proudfoot, N. J. & Brownlee, G. G. (1976) Nature (London) 263, 211-214.
22. Houck, C. M., Rinehart, F. P. & Schmid, C. W. (1979) J. Mol. Biol. 132, 289-306.
23. Sawicki, S. G., Jelinek, W. & Darnell, J. E. (1977) J. Mol. Biol. 113, 219-235.
24. Corden, J., Wasylyk, A., Buchwalder, A., Sassone-Corsi, P., Kedinger, C. & Chambon, P. (1980) Science 209, 1406-1414.
25. Blake, C. (1983) Nature (London) 306, 535-537.
26. DeNoto, F. M., Moore, D. D. & Goodman, H. M. (1981) Nucleic Acids Res. 9, 3719-3730.
27. Barinaga, M., Yamamoto, G., Rivier, C., Vale, W., Evans, R. & Rosenfeld, M. G. (1983) Nature (London) 306, 84-85.
28. Phillips, J. A., III (1983) in Recombinant DNA Applications to Human Disease, eds. Caskey, C. T. & White, R. L. (Cold Spring Harbor Laboratory, Cold Spring Harbor, NY), pp. 305-315.
29. McKusick, V. A. (1984) Clin. Genet. 25, 88-123.
