Communication Vol. 267, No. 33, Issue of November 25, pp. 23471-23474, 1992 © 1992 by The American Society for Biochemistry and Molecular Biology, Inc.

Printed in U.S.A.

THE JOURNAL OF BIOLOGICAL CHEMISTRY

A Human Polyadenylation Factor Is a G Protein β-Subunit Homologue*

(Received for publication, August 7, 1992)

Yoshio Takagaki and James L. Manley

From the Department of Biological Sciences, Columbia University, New York, New York 10027

Cleavage stimulation factor (CstF) is one of the multiple factors required for polyadenylation of mammalian pre-mRNAs in vitro. We have shown previously that this factor is composed of three distinct subunits of 77, 64, and 50 kDa, and that the 64-kDa subunit can be UV-cross-linked to RNA in a polyadenylation signal (AAUAAA)-dependent manner. By molecular cloning, the 64-kDa subunit was shown to contain a ribonucleoprotein-type RNA binding domain and a novel repeat structure. To study the functions of the other subunits, we have now isolated cDNAs encoding the 50-kDa subunit of human CstF. This subunit shares extensive homology with mammalian G protein β-subunits and has a characteristic repeat structure (transducin repeat), in which an ~44-amino acid-long sequence is repeated seven times. To our knowledge, the 50-kDa subunit is the first example of a functional β-subunit-like protein in vertebrates. Possible roles of the transducin repeat, both in CstF function specifically and in other β-subunit homologues more generally, are discussed.

Polyadenylation of an RNA polymerase II transcript occurs in a two-step reaction (for reviews, see Refs. 1 and 2). A pre-mRNA is first endonucleolytically cleaved at the polyadenylation site, which is located 10-30 nucleotides (nt)¹ downstream of the polyadenylation signal sequence AAUAAA (for review, see Ref. 3), and a poly(A) stretch of 200-300 nt is then added to the 3'-end of the upstream cleavage product. It has been shown that four separable factors are necessary to cleave an SV40 late pre-mRNA (4). Only one of these, the multisubunit cleavage-polyadenylation specificity factor (CPSF; Refs. 4-8), which interacts directly with the AAUAAA sequence (9, 10), is also required for the poly(A) addition reaction. Cleavage factors I (CFI) and II (CFII) and cleavage stimulation factor (CstF) are necessary only for cleavage (4-6). Poly(A) polymerase (11-13) functions with CPSF to add poly(A) stretches to the 3'-ends of the upstream cleavage products, and is also required for cleavage of several other pre-mRNAs (4-8, 14, 15).

CstF has been purified to homogeneity from HeLa cell nuclear extracts (16, 17). CstF is composed of three subunits with estimated sizes of 77, 64, and 50 kDa. By immunoprecipitation with a monoclonal antibody (mAb) against the 64-kDa subunit, it was shown (16) that this polypeptide can be UV-cross-linked to pre-mRNAs in an AAUAAA sequence-dependent manner (18, 19). Since both CPSF and CstF are required for this specific UV-cross-linking and for the formation of a stable complex on the pre-mRNA (6, 8, 17, 20, 21), CstF must interact with both the pre-mRNA and CPSF, thereby stabilizing the interaction between the pre-mRNA and CPSF. To understand fully how CstF participates in 3'-end formation, it is essential to study the structure and the function of each of the three subunits. By molecular cloning, we have shown that the 64-kDa subunit contains a ribonucleoprotein (RNP)-type RNA binding domain in the N-terminal region and a novel repeat structure predicted to form a long stable α-helix in the C-terminal region (22). In this report, we describe the isolation of cDNAs encoding the 50-kDa subunit and the determination of its primary structure. We also discuss possible functions of the 50-kDa subunit in the cleavage reaction based on its characteristic structural features.

* This work was supported by National Institutes of Health Grant GM-28983. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) L02547.

¹ The abbreviations used are: nt, nucleotide(s); SV40, simian virus 40; CPSF, cleavage-polyadenylation specificity factor; CFI, cleavage factor I; CFII, cleavage factor II; CstF, cleavage stimulation factor; mAb, monoclonal antibody; RNP, ribonucleoprotein; kb, kilobase(s); MHC, major histocompatibility complex; snRNP, small nuclear ribonucleoprotein particle.

EXPERIMENTAL PROCEDURES

Determination of Amino Acid Sequences-Internal amino acid sequences of the 50-kDa subunit of CstF were determined as described previously (23). In brief, ~100 μg of CstF purified through Mono S chromatography (16) was fractionated on an SDS-polyacrylamide (10%) gel. After staining with Coomassie Brilliant Blue, the 50-kDa subunit protein was cut out and partially digested in situ with 26 ng of Staphylococcus aureus V8 protease (Sigma). The digestion products were fractionated on an SDS-polyacrylamide (13.5%) gel, transferred to an Immobilon-P membrane (Millipore), and stained with Coomassie Brilliant Blue as described (24). Protein bands with estimated molecular masses of 35 kDa (~3.0 μg) and 40 kDa (~2.0 μg) were cut out and subjected to peptide sequencing using a model 470A gas phase sequenator (Applied Biosystems). Amino acid sequences VI and VII were obtained for the 35- and 40-kDa polypeptides, respectively (Fig. 2, boxes): sequence VI, TAQQNME(N)(H/N)PVI(R/G)XLY; sequence VII, TXYVTSHKGPX(R)VATYSX, where uncertain residues are indicated by X or in parentheses.

Cloning of the 50-kDa Subunit cDNAs-Oligonucleotides I and II, corresponding to parts of the amino acid sequences VI (TAQQNME) and VII (YVTSHKG), respectively, were synthesized using Gene Assembler Plus (Pharmacia LKB Biotechnology Inc.).

5'-ACNGCNCAA/GCAA/GAAC/TATGGA-3' OLIGONUCLEOTIDE I

5'-TAC/TGTNACNA/TC/GNCAC/TAAA/GGG-3' OLIGONUCLEOTIDE II

To obtain cDNAs encoding the 50-kDa subunit of CstF, 1 × 10‘ plaques from a HeLa cell cDNA library in the λZAPII vector (Stratagene) were screened with a 20-mer oligonucleotide of 128-fold degeneracy (Oligonucleotide I) as described (25). Hybridization was carried out at 50 °C for 36 h, and the filters were washed twice in 6 × SSC at 50 °C for 30 min each. Twenty-nine positive clones were plaque-purified (26), and cDNA inserts were excised in vivo and subjected to Southern blot analysis with Oligonucleotide II. The structures of cDNAs from 16 positive clones were analyzed by restriction digestion and partial nucleotide sequencing. pZ50-19, one of the positive clones containing the longest cDNA insert, was subjected to complete sequence analysis on both strands with Sequenase 2.0 (U. S. Biochemical Corp.) (27).
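The length and degeneracy quoted for the probes can be checked by counting the alternatives at each position. The following is only an illustrative sketch: it assumes that `X/Y` in the printed sequences marks a two-way choice at a single position and that `N` stands for any of the four bases, and the function names are hypothetical.

```python
from math import prod

# Oligonucleotide sequences as printed in the text (5' to 3').
OLIGO_I = "ACNGCNCAA/GCAA/GAAC/TATGGA"
OLIGO_II = "TAC/TGTNACNA/TC/GNCAC/TAAA/GGG"


def parse_positions(oligo):
    """Return one set of allowed bases per position of a degenerate oligo."""
    positions = []
    i = 0
    while i < len(oligo):
        choices = {oligo[i]}
        i += 1
        # Assumed notation: "A/G" means A or G at this single position.
        while i < len(oligo) and oligo[i] == "/":
            choices.add(oligo[i + 1])
            i += 2
        if "N" in choices:
            choices = set("ACGT")  # N = any of the four bases
        positions.append(choices)
    return positions


def degeneracy(oligo):
    """Number of distinct sequences present in the oligonucleotide mixture."""
    return prod(len(choices) for choices in parse_positions(oligo))


for name, oligo in (("I", OLIGO_I), ("II", OLIGO_II)):
    print(name, len(parse_positions(oligo)), degeneracy(oligo))
```

Under these assumptions Oligonucleotide I is a 20-mer with 4 × 4 × 2 × 2 × 2 = 128 combinations, which agrees with the 128-fold degeneracy stated in the text; the same count for Oligonucleotide II gives a 20-mer of 2,048-fold degeneracy.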

Analysis of an in Vitro Translation Product-Capped mRNA synthesized in vitro from pZ50-19 was translated in vitro with [³⁵S]Met (Amersham Corp.) using a rabbit reticulocyte lysate (Promega) according to the manufacturer's instructions. Immunoprecipitation of an in vitro translation product was performed as described (16).

Northern Blot Analysis-Five micrograms of poly(A)+ RNA isolated from HeLa cells was treated with glyoxal and fractionated on a 1% agarose gel in 10 mM sodium phosphate (pH 7.0) (26). After transfer to a nitrocellulose membrane, RNA was hybridized with a random primer-labeled (28), 1,399-nt-long FspI-KpnI fragment (nt -126 to +1,273) derived from pZ50-19 (Fig. 2) in a solution containing 50% formamide at 42 °C for 18 h (26). The membrane was washed twice in 0.1 × SSPE and 0.1% SDS at 60 °C for 30 min each.

Southern Blot Analysis-Ten micrograms of genomic DNA isolated from HeLa cells (26) was digested with the restriction enzymes indicated in Fig. 3B and fractionated on a 0.7% agarose gel. After transfer to a nitrocellulose membrane, DNA was hybridized with a random primer-labeled, 861-nt-long FspI-EcoRV fragment (nt -126 to +735) in the solution described above but without formamide at 55 °C for 18 h. The membrane was washed twice in 1 × SSC and 0.1% SDS at 50 °C for 30 min each.
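The probe lengths quoted for the two blots follow from their end coordinates once the numbering convention is taken into account. The sketch below assumes, as the Fig. 2 legend indicates, that numbering starts at +1 on the translation initiation site and that nucleotide -1 immediately precedes +1 (no position 0); `span_length` is a hypothetical helper name.

```python
def span_length(start, end):
    """Length of the inclusive span start..end when there is no position 0."""
    if start == 0 or end == 0 or start > end:
        raise ValueError("coordinates must be non-zero and ordered")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the span crosses the -1/+1 boundary; position 0 does not exist
    return length


# Probe coordinates as given in the text.
print(span_length(-126, 1273))  # FspI-KpnI fragment (Northern probe)
print(span_length(-126, 735))   # FspI-EcoRV fragment (Southern probe)
```

With this convention the two fragments come out at 1,399 and 861 nt, matching the lengths given for the Northern and Southern probes.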

RESULTS AND DISCUSSION

We previously described the purification of CstF to near homogeneity from HeLa cell nuclear extracts (16). As a first step toward obtaining a cDNA encoding the 50-kDa subunit, we set out to obtain partial amino acid sequence. Since the N terminus of the protein was blocked, we determined internal amino acid sequences after partial digestion of the 50-kDa subunit with S. aureus V8 protease (23). Mixtures of 20-mer oligonucleotides based on these sequences were then synthesized and used to screen a HeLa cell cDNA library (25). Positive clones were analyzed by restriction digestion, and the one containing the longest cDNA insert (pZ50-19) was subjected to further analysis.

When the RNA transcript produced from pZ50-19 was translated in vitro, only a single major product with an estimated molecular mass of ~50 kDa was detected (data not shown). This protein was immunoprecipitated with a specific anti-50-kDa mAb (Ref. 16; Fig. 1, lane 1), but not with a nonspecific anti-polyoma large T antigen mAb (lane 2), strongly suggesting that pZ50-19 indeed encodes the 50-kDa subunit protein. Nucleotide sequence analysis of pZ50-19 revealed that the cDNA is composed of 1,801 base pairs excluding a poly(A) stretch, which is located 18 nt downstream of the putative polyadenylation signal ATTAAA (nt 1,598-1,603) (Fig. 2). The size of the cDNA corresponds well to that of the mRNA determined by Northern blot analysis (~2.1 kb; see below) assuming that the mRNA has a poly(A) stretch of 200-300 nt. Only a single long open reading frame starting with the putative translation initiation codon ATG (nt 1-3) was found. The 181-nt-long 5'-untranslated region contains a translation termination codon (TAG) in the same reading frame (nt -102 to -100). The two amino acid sequences obtained by peptide sequencing were also found in the same reading frame (Fig. 2, boxes), confirming that this cDNA encodes the 50-kDa subunit of CstF. The 50-kDa subunit is composed of 431 amino acids, and the predicted molecular weight is 48,326.

[Fig. 1 gel image not reproduced: lanes 1 and 2, with size markers at 92.5, 69, 46, and 30 kDa.]

FIG. 1. Characterization of the protein encoded by a 50-kDa subunit cDNA. An RNA transcript from pZ50-19 was translated in vitro and the reaction product was immunoprecipitated with an anti-50-kDa subunit mAb (lane 1) or with an anti-polyoma large T antigen mAb KF4 (lane 2). Immunoprecipitated protein was fractionated on an SDS-polyacrylamide (10%) gel. Positions of the protein size markers are indicated in kilodaltons on the left. Arrowhead indicates the position of the 50-kDa protein.

[Fig. 2 sequence listing not reproduced; the nucleotide sequence was submitted to the GenBank/EMBL Data Bank under accession number L02547.]

FIG. 2. Sequences of the 50-kDa subunit cDNA and the predicted protein. Nucleotides (lower lines) and amino acids (upper lines) are numbered on the right, starting with the translation initiation site. Translation initiation codon (ATG), termination codon (TGA), and polyadenylation signal (ATTAAA) are underlined. The seven homologous segments that constitute the transducin repeats (Fig. 4) are indicated by brackets and numbered on the top (TR1-TR7). The hydrophobic amino acid-rich domain is underlined. The amino acid sequences corresponding to those obtained by peptide sequencing are boxed.
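The coordinates quoted for the cDNA can be cross-checked against one another with simple arithmetic. The sketch below is a consistency check, not part of the original analysis. It assumes that numbering starts at +1 on the A of the ATG with no position 0, that the stop codon immediately follows codon 431, and that "18 nt downstream" means the poly(A) stretch begins 18 positions after the last nucleotide of ATTAAA; the function and key names are hypothetical.

```python
# Values quoted in the text.
CDNA_LENGTH = 1801        # bp, excluding the poly(A) stretch
FIVE_PRIME_UTR = 181      # nt upstream of the ATG
PROTEIN_LENGTH = 431      # amino acids
POLYA_SIGNAL_END = 1603   # last nucleotide of ATTAAA (nt 1,598-1,603)
POLYA_TAIL = (200, 300)   # assumed poly(A) length on the mature mRNA, nt


def cdna_layout():
    """Derive the remaining coordinates from the values quoted in the text."""
    coding_nt = 3 * PROTEIN_LENGTH
    stop_codon = (coding_nt + 1, coding_nt + 3)  # derived, not quoted
    last_nt = CDNA_LENGTH - FIVE_PRIME_UTR       # last numbered position
    return {
        "stop_codon": stop_codon,
        "last_nt": last_nt,
        "three_prime_utr": last_nt - stop_codon[1],
        "signal_to_polya": (last_nt + 1) - POLYA_SIGNAL_END,
        "mrna_size": (CDNA_LENGTH + POLYA_TAIL[0], CDNA_LENGTH + POLYA_TAIL[1]),
    }


for key, value in cdna_layout().items():
    print(key, value)
```

Under these assumptions the poly(A) stretch would begin 18 nt past the end of ATTAAA, and an mRNA carrying a 200-300-nt tail would measure about 2.0-2.1 kb, in line with the ~2.1-kb band reported for the Northern blot.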

When the expression of the 50-kDa subunit mRNA was examined by Northern blot analysis using HeLa cell poly(A)+ RNA, only a single band of ~2.1 kb was detected (Fig. 3A). In addition, Southern blot analysis of human genomic DNA shows that only single DNA fragments hybridized with a cDNA fragment after digesting with two different enzymes even under less stringent hybridization conditions (Fig. 3B). These results suggest that the 50-kDa subunit is encoded by a single gene.

A search for homologous proteins in the GenBank data base (June, 1992) using the FASTA program (29) revealed that the C-terminal two-thirds (residues 160-425) and a short N-terminal segment (93-136) of the 50-kDa subunit share significant homology with the β-subunits of human and bovine G protein/transducin. Altogether the 50-kDa subunit exhibits an identity of ~19.8% and a similarity of ~34.4% to the three types of human β-subunits (β1-β3; Refs. 30-33) in these regions. The 50-kDa subunit also shares a shorter extent of homology with STE4, a yeast G protein β-subunit (34), and other β-subunit-like proteins such as the chicken major histocompatibility complex (MHC)-linked protein (35), the Drosophila Enhancer of Split (36), yeast CDC4 (37), PRP4 (38, 39), MSI1 (40), and TUP1 (41).
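For readers unfamiliar with how identity and similarity percentages are derived from a pairwise alignment, a minimal sketch follows. It is not the FASTA procedure used here, which relies on a scoring matrix. The residue groups are an arbitrary illustrative choice, similarity is assumed to include identical positions, gapped columns are skipped, and the two aligned strings are made-up toy input rather than CstF or β-subunit sequence.

```python
# Hypothetical grouping of "similar" residues, for illustration only.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                  set("DE"), set("NQ"), set("ST"), set("AG")]


def identity_and_similarity(aligned_a, aligned_b):
    """Percent identity and similarity over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # assumption: identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * similar / compared


# Toy alignment (not real sequence data).
print(identity_and_similarity("ACDE-K", "ACEEGR"))
```

Different similarity groupings or scoring matrices give different similarity values for the same alignment, so percentages such as those quoted above are only comparable when computed with the same scheme.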

The β1-subunit of G proteins has a characteristic repeat structure (transducin repeat) composed of eight homologous segments (30). Similar structural motifs composed of differing numbers of repeat segments have been found in the C-terminal regions of all the proteins mentioned above (40, 42, 43). In the CstF 50-kDa polypeptide, such homologous segments are repeated seven times (Fig. 4A), although there is a 23-residue sequence between the first and second repeat segments (137-159) that does not conform to the repeat structure. Based on this alignment, we derived a consensus sequence (Fig. 4B) and compared it with a consensus compiled for several of the other proteins mentioned above, including the G protein β1-subunit, STE4, Enhancer of Split, CDC4, and PRP4 (Ref. 42; Fig. 4C). Although nine positions are specific to the 50-kDa subunit and three to these other proteins, the C-terminal third of the repeat structure is very well conserved and greater than 58% of the residues present at appropriate positions in the 50-kDa protein conform to the consensus sequence derived for the other transducin repeat proteins. These findings confirm that the 50-kDa subunit belongs to a group of proteins which show significant homology to G protein β-subunits. Besides the chicken MHC-linked gene (35), whose function has not been identified, the 50-kDa subunit is the first example of a functional β-subunit-like protein in vertebrates.

[Fig. 3 blot images not reproduced: panel A (Northern blot) and panel B (Southern blot, lanes 1-3), with size markers in kilobases.]

FIG. 3. Northern and Southern blot analysis. A, Northern blot analysis of HeLa cell RNA. Five micrograms of poly(A)+ RNA was fractionated on a 1% agarose gel, transferred to a nitrocellulose membrane, and hybridized with an FspI-KpnI fragment derived from pZ50-19. Positions of the DNA size markers are indicated in kilobases on the left. Arrowhead indicates the position of the 50-kDa subunit mRNA. B, Southern blot analysis of HeLa cell genomic DNA. Ten micrograms of genomic DNA was digested with BamHI (lane 1), EcoRI (lane 2), or HindIII (lane 3), fractionated on a 0.7% agarose gel, and transferred to a nitrocellulose membrane. DNA was hybridized with an FspI-EcoRV fragment.

FIG. 4. Transducin repeat in the 50-kDa subunit. A, seven homologous segments with an average length of ~44 amino acid residues are optimally aligned and numbered on the left. The number of additional residues present between the first and second repeat segments is shown in parentheses on the left. The first and the last residues of each repeat segment are numbered in parentheses on the right. Identical or similar residues that appear at the same positions four or more times are boxed and indicated in B as a consensus. C, a consensus sequence compiled for G protein β1-subunit, Enhancer of Split, STE4, CDC4, and PRP4 (42).
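The residue ranges given in the text are consistent with seven repeats of roughly 44 residues each, as the following sketch illustrates. It assumes that the first repeat coincides with the N-terminal homologous segment (residues 93-136) and that the remaining six repeats tile the C-terminal homologous region (residues 160-425); the exact repeat boundaries are given in Fig. 4, which is not reproduced here, so these end points are an approximation.

```python
# Residue ranges taken from the text (inclusive, 1-based).
FIRST_REPEAT = (93, 136)     # assumed to correspond to TR1
SPACER = (137, 159)          # non-conforming sequence between TR1 and TR2
LATER_REPEATS = (160, 425)   # assumed to contain TR2-TR7
N_REPEATS = 7
PROTEIN_LENGTH = 431


def n_residues(segment):
    """Number of residues in an inclusive residue range."""
    first, last = segment
    return last - first + 1


repeat_residues = n_residues(FIRST_REPEAT) + n_residues(LATER_REPEATS)
mean_repeat_length = repeat_residues / N_REPEATS
c_terminal_fraction = n_residues(LATER_REPEATS) / PROTEIN_LENGTH

print(n_residues(SPACER))             # length of the spacer
print(round(mean_repeat_length, 1))   # average repeat length
print(round(c_terminal_fraction, 2))  # share of the protein in residues 160-425
```

On these assumptions the spacer is 23 residues long, the repeats average about 44 residues, and residues 160-425 make up a little over 60% of the 431-residue protein, in keeping with the description of this region as the C-terminal two-thirds.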

Mammalian G protein β-subunits (44, 45) and probably STE4, a yeast G protein β-subunit (34), form heterotrimeric protein complexes with α (GPA1)- and γ (STE18)-subunits, and these complexes play important roles in signal transduction pathways (44, 45). The α-subunit bound with GTP dissociates from the βγ-complex in response to the binding of a ligand to the receptor, and affects the functions of target molecules. When GTP is hydrolyzed to GDP, the α-subunit and βγ-complex reassociate. Although CstF is also composed of three distinct subunits (16, 17), several lines of evidence indicate that it does not belong to the G protein family. First, while G proteins are associated with the plasma membrane (44, 45), the 50- and 64-kDa subunits of CstF were detected only in the nucleus by indirect immunofluorescence microscopy (16). Second, the sizes of the two other subunits of CstF (77 and 64 kDa) are quite different from those of the α (40-46 kDa)- and γ (8-10 kDa)-subunits of G proteins (44, 45). Third, no homology in amino acid sequences exists between these subunit proteins (22, 44-47).² Finally, GTP is required for activation of G proteins (44, 45) but has no effects on 3' cleavage activity, which is detectable in the presence of ATP and creatine phosphate (48, 49).²

Other non-G protein β-subunit-like proteins carry out various biological functions, probably in a different manner from that of G protein β-subunits. MSI1 (40) and TUP1 (41) regulate gene expression, and Enhancer of Split (36), CDC4 (37), and PRP4 (38, 39) are involved in neurogenesis, cell cycle control, and RNA splicing, respectively. While G protein β-subunits consist exclusively of transducin repeats (eight homologous segments), the β-subunit-like proteins contain smaller numbers of these repeats (four to seven), which are located in their C-terminal regions. It is intriguing to note that some of these proteins have additional structural motifs such as Pro-rich domains (Enhancer of Split; Ref. 36) and Gln-rich and Thr-rich domains (TUP1; Ref. 41) in their N-terminal regions, suggesting that these structures are responsible for additional functions specific to that particular protein. In this respect, part of the N-terminal region in the 50-kDa subunit (residues 14-35) displays a high content of hydrophobic residues (Fig. 2). However, this region shows no significant homology with any other proteins and its function is not known at present.

² Y. Takagaki and J. L. Manley, unpublished observation.

Perhaps most relevant to the 50-kDa protein is the yeast PRP4 protein (38, 39). PRP4 is a component of the U4/U6 small nuclear ribonucleoprotein particle (snRNP) that is required for pre-mRNA splicing. A number of protein factors and snRNPs are required for splicing, and a protein-RNA complex that catalyzes the splicing reaction (the spliceosome) is assembled and disassembled in an ordered manner (50, 51). It has been demonstrated that PRP4 is required for the association of U4/U6 snRNP with U5 snRNP (52) in an early step of spliceosome assembly. Similar to splicing, several factors are required for mRNA polyadenylation (4-6). As mentioned above, the multisubunit CstF and CPSF interact cooperatively to form a stable complex on the pre-mRNA (6, 8, 17, 20, 21). Additional factors (CFI, CFII, and poly(A) polymerase; Refs. 4-6) are required for the complete reaction to occur. Regarding the functions of CstF subunits, the 64-kDa subunit contains an RNP-type RNA binding domain (22) and binds RNA (16, 17), and the 77-kDa subunit appears necessary for the formation of an intact heterotrimeric CstF complex.² These results suggest that the 50-kDa subunit may be responsible for the interaction of CstF with other factors.

The structural similarity between the 50-kDa subunit and PRP4, together with what is known about their functions, suggests the intriguing possibility that the transducin repeats in these proteins may play important roles in protein-protein interactions that occur during these RNA processing reactions. These interactions must be reversible as the proteins need to dissociate when the cleavage or splicing reaction concludes. These phenomena are similar in important ways to those of G proteins, where association and dissociation of the α-subunit and βγ-complex occur in response to the phosphorylation status of a bound GTP molecule. These observations together suggest a general model for the function of transducin repeats, which is that they participate in specific protein-protein interactions that involve association and dissociation in response to the cleavage or formation of a phosphodiester bond, in RNA or a nucleotide cofactor. Further studies using mutant 50-kDa subunit proteins should be able to identify the target molecule(s) of the protein-protein interactions and to define the domains of the 50-kDa protein required for these interactions.

Acknowledgments-We thank C. Prives for supplying mAb KF4, T. E. Kennedy for discussions on peptide sequencing, L. Zhong for excellent technical assistance, and W. Weast for preparing the manuscript.

REFERENCES

1. Humphrey, T., and Proudfoot, N. J. (1988) Trends Genet. 4, 243-245
2. Manley, J. L. (1988) Biochim. Biophys. Acta 950, 1-12
3. Proudfoot, N. (1991) Cell 64, 671-674
4. Takagaki, Y., Ryner, L. C., and Manley, J. L. (1989) Genes & Dev. 3, 1711-1724
5. Christofori, G., and Keller, W. (1988) Cell 54, 875-889
6. Gilmartin, G. M., and Nevins, J. R. (1989) Genes & Dev. 3, 2180-2189
7. Bienroth, S., Wahle, E., Suter-Crazzolara, C., and Keller, W. (1991) J. Biol. Chem. 266, 19768-19776
8. Murthy, K., and Manley, J. L. (1992) J. Biol. Chem. 267, 14804-14811
9. Bardwell, V. J., Wickens, M., Bienroth, S., Keller, W., Sproat, B. S., and Lamond, A. I. (1991) Cell 65, 125-133
10. Keller, W., Bienroth, S., Lang, K. M., and Christofori, G. (1991) EMBO J. 10, 4241-4249
11. Wahle, E. (1991) J. Biol. Chem. 266, 3131-3139
12. Raabe, T., Bollum, F. J., and Manley, J. L. (1991) Nature 353, 229-234
13. Wahle, E., Martin, G., Schiltz, E., and Keller, W. (1991) EMBO J. 10, 4251-4257
14. Takagaki, Y., Ryner, L. C., and Manley, J. L. (1988) Cell 52, 731-742
15. Ryner, L. C., Takagaki, Y., and Manley, J. L. (1989) Mol. Cell. Biol. 9, 4229-4238
16. Takagaki, Y., Manley, J. L., MacDonald, C. C., Wilusz, J., and Shenk, T. (1990) Genes & Dev. 4, 2112-2120
17. Gilmartin, G. M., and Nevins, J. R. (1991) Mol. Cell. Biol. 11, 2432-2438
18. Wilusz, J., and Shenk, T. (1988) Cell 52, 221-228
19. Moore, C. L., Chen, J., and Whoriskey, J. (1988) EMBO J. 7, 3159-3169
20. Wilusz, J., Shenk, T., Takagaki, Y., and Manley, J. L. (1990) Mol. Cell. Biol. 10, 1244-1248
21. Weiss, E. A., Gilmartin, G. M., and Nevins, J. R. (1991) EMBO J. 10, 215-219
22. Takagaki, Y., MacDonald, C. C., Shenk, T., and Manley, J. L. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 1403-1407
23. Kennedy, T. E., Gawinowicz, M. A., Barzilai, A., Kandel, E. R., and Sweatt, J. D. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 7008-7012
24. Matsudaira, P. (1987) J. Biol. Chem. 262, 10035-10038
25. Wallace, R. B., and Miyada, C. G. (1987) Methods Enzymol. 152, 432-442
26. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
27. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
28. Feinberg, A. P., and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13
29. Pearson, W. R., and Lipman, D. J. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 2444-2448
30. Fong, H. K. W., Hurley, J. B., Hopkins, R. S., Miake-Lye, R., Johnson, M. S., Doolittle, R. F., and Simon, M. I. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 2162-2166
31. Fong, H. K. W., Amatruda, T. T., III, Birren, B. W., and Simon, M. I. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 3792-3796
32. Gao, B., Gilman, A. G., and Robishaw, J. D. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 6122-6125
33. Levine, M. A., Smallwood, P. M., Moen, P. T., Jr., Helman, L. J., and Ahn, T. G. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 2329-2333
34. Whiteway, M., Hougan, L., Dignard, D., Thomas, D. Y., Bell, L., Saari, G. C., Grant, F. J., O'Hara, P., and MacKay, V. L. (1989) Cell 56, 467-477
35. Guillemot, F., Billault, A., and Auffray, C. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 4594-4598
36. Hartley, D. A., Preiss, A., and Artavanis-Tsakonas, S. (1988) Cell 55, 785-795
37. Yochem, J., and Byers, B. (1987) J. Mol. Biol. 195, 233-245
38. Bjorn, S. P., Soltyk, A., Beggs, J. D., and Friesen, J. D. (1989) Mol. Cell. Biol. 9, 3698-3709
39. Banroques, J., and Abelson, J. N. (1989) Mol. Cell. Biol. 9, 3710-3719
40. Ruggieri, R., Tanaka, K., Nakafuku, M., Kaziro, Y., Toh-e, A., and Matsumoto, K. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 8778-8782
41. Fujita, A., Matsumoto, S., Kuhara, S., Misumi, Y., and Kobayashi, H. (1990) Gene (Amst.) 89, 93-99
42. Dalrymple, M. A., Petersen-Bjorn, S., Friesen, J. D., and Beggs, J. D. (1989) Cell 58, 811-812
43. Kearsey, S. (1991) Gene (Amst.) 98, 147-148
44. Simon, M. I., Strathman, M. P., and Gautam, N. (1991) Science 252, 802-808
45. Kaziro, Y., Itoh, H., Kozasa, T., Nakafuku, M., and Satoh, T. (1991) Annu. Rev. Biochem. 60, 349-400
46. Hurley, J. B., Fong, H. K. W., Teplow, D. B., Dreyer, W. J., and Simon, M. I. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 6948-6952
47. Gautam, N., Baetcher, M., Aebersold, R., and Simon, M. I. (1989) Science 244, 971-974
48. Moore, C. L., and Sharp, P. A. (1985) Cell 41, 845-855
49. Zarkower, D., Stephenson, P., Sheets, M., and Wickens, M. (1986) Mol. Cell. Biol. 6, 2317-2323
50. Green, M. R. (1991) Annu. Rev. Cell Biol. 7, 559-599
51. Ruby, S. W., and Abelson, J. (1991) Trends Genet. 7, 79-85
52. Bordonné, R., Banroques, J., Abelson, J., and Guthrie, C. (1990) Genes & Dev. 4, 1185-1196