The cyanelle str operon from Cyanophora paradoxa: Sequence analysis and phylogenetic implications

Plant Molecular Biology 15: 561-573, 1990. © 1990 Kluwer Academic Publishers. Printed in Belgium.

M. Kraus, M. Götz and W. Löffelhardt*
Institut für Allgemeine Biochemie und Ludwig Boltzmann Forschungsstelle für Biochemie, Währingerstraße 38, A-1090 Vienna, Austria (*author for correspondence)

Received 22 March 1990; accepted in revised form 11 July 1990

Key words: cyanelle, Cyanophora paradoxa, plastid evolution, str operon, phylogenetic analysis

Abstract

The str operon, containing the genes for the ribosomal proteins S12 (rps12) and S7 (rps7) and for the elongation factors G (fus) and Tu (tufA), has been characterized for some cyanobacteria and for chloroplasts from algae and higher plants. In the case of plastids a stepwise reduction by one and two genes, respectively, has been observed due to gene transfer to the nuclear genome.

The nucleotide sequence of the str operon on the cyanelle genome from Cyanophora paradoxa was determined as the first example of a chlorophyll b-less plastid. It comprises rps12, rps7 and tufA, which are closely linked and not interrupted by introns.

Transcript analysis revealed cotranscription of the two ribosomal protein genes whereas tufA gave rise to a monocistronic mRNA. Phylogenetic studies using these three different traits allowed an assessment of the position of Cyanophora paradoxa among oxygenic photoautotrophs.

Introduction

The endosymbiotic theory, now generally accepted, postulates symbiotic cyanobacteria as ancestors of photosynthetic organelles [46].

The photosynthetic apparatus of blue-green algae contains chlorophyll a protein complexes and phycobiliproteins for light harvesting. The ancestral position of cyanobacteria can be easily derived for organelles with a similar pigment composition, for example Rhodophyta (red algae). However, the situation is different for chloroplasts from green algae or higher plants, where the phycobilisomes are replaced by chlorophyll b-containing antenna systems [46].

The finding of photosynthetic prokaryotes containing chl a and b (Prochlorophyta) at first seemed to overcome this drawback [8]. However, phylogenetic analysis of 16S rRNA genes placed Prochlorothrix exactly within cyanobacteria [47]. At present no definitive decision can yet be made concerning the mono- or polyphyletic origin of the different chloroplast types [20].

The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the accession number X52497.

Cyanophora paradoxa, a flagellated protist, contains cyanelles, i.e. photosynthetic organelles distinct from chloroplasts [48]. Cyanelles are surrounded by a rudimentary cell wall containing peptidoglycan, a prokaryotic feature not encountered among organelles of eukaryotic cells other than in the small group of Glaucophyceae [1]. Murein biosynthesis is performed in a way analogous to that in E. coli. Seven penicillin-binding proteins have recently been demonstrated as components of the cyanelle envelope [3]. Other ultrastructural features such as concentric unstacked thylakoids, phycobilisomes and carboxysomes make cyanelles closely resemble cyanobacteria [16]. On the other hand, the cyanelle genome shows parallels to chloroplasts in size (127 kb) and organization [4, 5, 48].

The distribution of photosynthetic genes between nuclear and plastid genome is very much conserved among higher plants and algae. Even cyanelles show only a few deviations [48].

Components of the translation apparatus are better candidates for variations in that respect. In prokaryotes the genes for ribosomal proteins and translation factors are organized in several operons comprising 2 to 12 genes, a situation well documented for E. coli [28]. While chloroplast genomes encode only about 10% of the organellar polypeptides, ribosomal protein genes (about 20 from a total of 57) are overrepresented on plastid DNA.

They are organized as remnants of the operons mentioned above. The three completely sequenced chloroplast genomes, from the liverwort Marchantia polymorpha [41], the dicot Nicotiana tabacum [40] and the monocot Oryza sativa [23], showed nearly identical sets of genes for the translation apparatus. Marchantia, as compared with the other two, just lacks rps16 but contains rpl21 instead. A characteristic feature is a large cluster comprising the genes for 10 ribosomal proteins in exactly the same arrangement [40]. In E. coli, and presumably also in the cyanobacterial endosymbiont ancestral to plastids, these genes are distributed together with others among the S10, spc and alpha operons. The transfer of the major part of the genes to the nuclear genome left the remainder condensed into this single operon. Very recently, the composition of the large ribosomal protein gene cluster from cyanelles proved to be quite different from that of plant plastids, with 3 ribosomal protein genes missing and 6 genes preserved from the prokaryotic operons [7, 11, 12, 34]. Unfortunately, no data about the respective cyanobacterial genes are available up to now.

Here we report the elucidation of the nucleotide sequence of the str operon of the cyanelles from Cyanophora paradoxa. The counterparts in E. coli and cyanobacteria contain the genes for the ribosomal proteins S7 and S12 and for the elongation factors G and Tu [9, 28, 33].

Two points are of interest. First, this operon is well suited to demonstrate the gradual movement of genes from the genome of an endosymbiont to the host nucleus. The prokaryotic type of organization found in cyanobacteria such as Anacystis nidulans [33] or Spirulina platensis [9] (but also in E. coli) is the order 5'-rps12-rps7-fus-tufA-3'. In a first step the gene for elongation factor G was transferred to the nucleus, and thus is not found on any chloroplast genome investigated so far. In a second step, parallel with the development of land plants, the tufA gene was lost from the plastid genome [2], and for the remaining two genes a complex mode of mRNA maturation via trans- and cis-splicing has evolved [27]. Algal genomes show more variability in the organization of their str operon. In Chlamydomonas reinhardii rps12, rps7 and tufA are separated from each other [2, 29]; in Codium fragile tufA seems not to be linked to the ribosomal genes [32]. However, in Euglena gracilis the reduced str operon is preserved, with two introns within tufA [35, 36] (Fig. 1).

Now the question is: would the cyanelle str operon correspond to the cyanobacterial structure with fus retained, or rather to the situation found in Euglena or Chlamydomonas, respectively?
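The stepwise reduction described above can be made explicit with a small sketch. It encodes the plastid-encoded str genes of the organisms named in the text and counts how many genes of the prokaryotic operon are missing. The dictionary layout and the function names are illustrative assumptions, not part of the original analysis.

```python
# Gene content as described in the text and Fig. 1; the data layout and
# function names below are illustrative assumptions.
PROKARYOTIC_ORDER = ["rps12", "rps7", "fus", "tufA"]

PLASTID_ENCODED = {
    "Anacystis nidulans": ["rps12", "rps7", "fus", "tufA"],
    "Euglena gracilis": ["rps12", "rps7", "tufA"],
    # In Chlamydomonas the three genes are plastid-encoded but not linked.
    "Chlamydomonas reinhardii": ["rps12", "rps7", "tufA"],
    "Nicotiana tabacum": ["rps12", "rps7"],
}


def missing_genes(genes):
    """Return the genes of the prokaryotic str operon absent from `genes`."""
    return [gene for gene in PROKARYOTIC_ORDER if gene not in genes]


def reduction_step(genes):
    """Number of str genes lost relative to the prokaryotic operon (0, 1 or 2)."""
    return len(missing_genes(genes))


for organism, genes in PLASTID_ENCODED.items():
    print(organism, reduction_step(genes), missing_genes(genes))
```

A newly sequenced operon can be classified the same way by passing its gene list to `reduction_step`.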

The second point of interest was the availability of additional traits for phylogenetic sequence comparison, ranging from a gene with a very high degree of conservation (rps12) to one with a very low one (rps7). The properties of the tufA gene also render it a good candidate for comparing rRNA-derived phylogenies with those for a protein gene.

[Fig. 1, gene order shown: E. coli, Anacystis nidulans, Spirulina platensis: rps12, rps7, fus, tufA. Euglena gracilis: rps12, rps7, tufA. Nicotiana tabacum: rps12, rps7. Chlamydomonas reinhardii: rps12, rps7, tufA (unlinked).]

Fig. 1. Organization scheme of the str operon from cyanobacteria (Anacystis nidulans or Spirulina platensis), algae (Euglena gracilis and Chlamydomonas reinhardii) and plant chloroplasts (e.g. Nicotiana tabacum) visualizing the transfer of the elongation factor genes to the nucleus.

Material and methods

Cyanophora paradoxa (LB 555 UTEX) was grown and cyanelle DNA was isolated as described [26]. Fragment Bgl II-11 was obtained from a cyanelle DNA library constructed by cloning Bgl II fragments into the Bam HI site of pEMBL 8 in Escherichia coli (strain 71-18) as a host [19].

Subclones were produced in pUC19 and Bluescript KS+ and KS-, using different restriction endonucleases (Boehringer) or the 'erase-a-base system' (Promega) working with exonuclease III and S1 nuclease [21].

Plasmid DNA was isolated and sequenced using the supercoil sequencing technique [26] with sequencing and reverse primers (Boehringer), a set of special oligonucleotides (17- to 22-mers) designed for highly conserved regions of the tufA gene, and four additionally synthesized primers (18-mers) for other regions of the cluster.

Cyanelle RNA was isolated and used for northern analysis and primer extension experiments as described elsewhere [26].

DNA sequence analysis was performed using the 'Dave Mount Programm 5.07' from the University of Arizona for translation to protein sequences.

Sequence alignments for further comparisons were done using 'Multalin' with default penalties for gaps; minor adjustments were made by hand [10].

Phylogenetic trees were constructed using different programs from 'PHYLIP 3.2' (phylogeny inference package) running on an IBM PS2 or a Digital MicroVAX computer [13].

Results

The str gene cluster is located on fragment Bgl II-11 in the center of the large single-copy region of cyanelle DNA [5] and contains the genes for the ribosomal proteins S12 and S7 and for the elongation factor Tu in the order observed in E. gracilis. A fus gene probe hybridized to Cyanophora nuclear DNA only [30].

The nucleotide and amino acid sequences of the three genes are shown in Fig. 2; the protein coding regions comprise 124 (rps12), 156 (rps7) and 409 (tufA) amino acids, respectively.
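As a quick consistency check, the lengths of the three reading frames in nucleotides follow directly from the amino acid counts. The sketch below assumes one stop codon per gene and no introns (as stated for the cyanelle genes); the function name is illustrative.

```python
# Protein lengths as given in the text.
PROTEIN_LENGTHS = {"rps12": 124, "rps7": 156, "tufA": 409}


def coding_length_nt(n_amino_acids, include_stop=True):
    """Length of an uninterrupted reading frame in nucleotides."""
    n_codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * n_codons


for gene, n_aa in PROTEIN_LENGTHS.items():
    print(gene, coding_length_nt(n_aa), coding_length_nt(n_aa, include_stop=False))
```

For tufA the value without the stop codon, 1227, is the number of sites quoted for Cyanophora in the phylogenetic analysis.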

Upstream of rps12 typical prokaryotic regulatory elements are found. The promoter shows very high homology with the consensus sequence from E. coli, starting at -76 (-35 box) and at -52 (Pribnow box) counted back from the start codon. They are separated by 18 nucleotides, one more than the spacing considered optimal for E. coli, and both are at the correct distance from the transcription start at -38 proximal to the coding region (Fig. 2 and 3). A putative ribosome binding site [25] is located 19 nucleotides upstream of the initiator ATG.
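The spacing of 18 nucleotides can be reproduced from the quoted positions. The following sketch assumes that both promoter boxes are hexamers and that each quoted position marks the first nucleotide of its element; both points are assumptions made for illustration.

```python
BOX_LENGTH = 6  # assumption: the -35 box and the Pribnow box are hexamers


def nucleotides_between(first_start, second_start, first_length=BOX_LENGTH):
    """Count the nucleotides lying strictly between two upstream elements.

    Positions are negative offsets counted back from the start codon and
    are assumed to mark the first nucleotide of each element.
    """
    first_end = first_start + first_length - 1
    return second_start - first_end - 1


spacer = nucleotides_between(-76, -52)    # -35 box to Pribnow box
to_start = nucleotides_between(-52, -38)  # Pribnow box to transcription start
print(spacer, to_start)
```

Under these assumptions the spacer comes out as 18 nucleotides, and 8 nucleotides lie between the Pribnow box and the transcription start at -38.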

rps12 and rps7 are separated by 47 nucleotides only and are cotranscribed to a 1.5 kb mRNA (Fig. 4).

In the cell the concentration of EF-Tu is much higher than that of individual ribosomal proteins, which necessitates a specific transcriptional and/or translational regulation of the tufA gene. Indeed, we found a strong northern signal at 1.45 kb (Fig. 4). In the range of 3.4 and 4.2 kb two additional weak signals were observed that might correspond to polycistronic transcripts. Two possible '-10' promoter sequences (-58: TATTAT and -67: TATAAT) but no significant '-35' regulatory sequences occurred in the upstream part of the gene. The 5' end of the tufA message could not be defined with certainty and maps at a distance of approximately 60 bases upstream of the start codon (Fig. 2 and 3). The psbA promoter, which also gives rise to a major transcript, contains the classical '-35' sequence motif [25]. In contrast to the rps12-rps7 intergenic region, a good Shine-Dalgarno sequence is found 11 nucleotides upstream of the tufA start codon.

[Fig. 2: sequence listing of rps12, rps7 and tufA with flanking regions; data deposited under accession number X52497.]

Fig. 2. Nucleotide sequences and deduced amino acid sequences of the cyanelle rps12, rps7 and tufA genes. Putative ribosome binding sites are underlined, conserved promoter sequence motifs are shown in boxes. The 5' ends of the transcripts as revealed by primer extension analysis are marked by arrows. Oligonucleotides used as primers were complementary to the sequences marked with dotted lines.

Fig. 3. Primer extension analyses were done with the oligonucleotides described in Fig. 2 using 20 µg of cyanelle RNA. For rps12 (left) two possible transcription starts were obtained that would yield 5' untranslated regions of approximately 50 and 38 bases, respectively. The 5' end of the tufA message (right) appeared to be at a distance of 57 to 63 bases from the translation start.

Fig. 4. Northern analysis using 15 µg cyanelle RNA and a 0.7 kb Pst I fragment from within the tufA gene as probe (exposure time 1 day) is shown in the left lane. The analogous experiment using 30 µg cyanelle RNA and a 0.45 kb Hind III fragment comprising 290 bp of the rps12 gene and 120 bp of the rps7 gene as probe (exposure time 5 days) is shown in the right lane.

In Fig. 5 the deduced amino acid sequences of the three genes are aligned with their counterparts from cyanobacteria (Anacystis nidulans [33], Spirulina platensis [9]), algae (Chlamydomonas reinhardii [2, 29], Euglena gracilis [35, 36]) and plants (Nicotiana tabacum [44], Marchantia polymorpha [41], Arabidopsis thaliana [2]). The identity scores are shown in Table 1.
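How such an identity score follows from the dot notation of Fig. 5 can be sketched for S12 of Cyanophora and Anacystis. The sketch assumes that a dot denotes a residue identical to the cyanelle sequence, and the Anacystis row is transcribed from Fig. 5a as it appears in this text (built from run lengths to avoid miscounting dots). Function and variable names are illustrative.

```python
# Cyanelle S12 (124 residues), taken from the alignment in Fig. 5a.
CYANELLE_S12 = (
    "MPTIQQLIRSKRTKIEKKTKSPALKACPQRRGVCTRVYTTTPKKPNSALR"
    "KVARVRLTSGFEVTAYIPGIGHNLQEHSVVLVRGGRVKDLPGVRYHIVRG"
    "ALDAAGVKDRRQSRSKYGAKRPKA"
)

# Anacystis row in dot notation (assumption: "." means "same as cyanelle").
ANACYSTIS_S12_DOTS = (
    "M" + "." * 8 + "DE.E..T" + "." * 9 + "N" + "." * 24
    + "." * 30 + "MI" + "." * 15 + "I.."
    + "T..T" + "." * 20
)


def expand(reference, dotted):
    """Replace every dot by the residue of the reference at that position."""
    if len(reference) != len(dotted):
        raise ValueError("rows must have equal length")
    return "".join(r if d == "." else d for r, d in zip(reference, dotted))


def percent_identity(seq_a, seq_b):
    """Identical positions as a percentage of the (equal) sequence length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


anacystis_s12 = expand(CYANELLE_S12, ANACYSTIS_S12_DOTS)
print(round(percent_identity(CYANELLE_S12, anacystis_s12), 1))
```

With these rows 114 of 124 positions are identical, i.e. about 91.9%, in line with the value of 91 given in Table 1 if fractions are dropped. The counting convention actually used for Table 1 is not stated, so this remains an approximation. Gaps are not handled here because both sequences have the same length.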

Due to its position in the reaction center of the ribosome, where it controls the precision of codon reading, the ribosomal protein S12 is very highly conserved. Two prolines, at positions 42 and 91 within the two longest (11 amino acids) invariant regions (Fig. 5a), could be responsible for streptomycin binding, since mutation at either locus has led to antibiotic resistance in E. coli and Nicotiana tabacum [15] (bold type in Fig. 5a).

Table 1. Percent amino acid identity of the cyanelle str operon genes with the corresponding genes from:

                            S12    S7    EF-Tu
Anacystis nidulans           91    56     82
Euglena gracilis             77    48     78
Chlamydomonas reinhardii     85     -     75
Nicotiana tabacum            84    51      -
Arabidopsis thaliana*         -     -     74
Escherichia coli             76    44     71

* Nuclear gene.

A high degree of conservation is also observed for EF-Tu, which has to recognize specific loci on the 70S ribosome. Furthermore, binding sites are known for aminoacyl-tRNA, EF-Ts and GDP/GTP in accordance with its GTPase activity. Five stretches of invariant amino acids within the so-called G-domain [24] are responsible for the interaction with GDP and GTP (boxed regions in Fig. 5c). These show homology to the corresponding binding sites from various G-proteins and are completely identical to eubacterial consensus sequences derived from a recent analysis of all available EF-Tus [31]. Among the kirromycin-binding sites Gly-233 is invariant whereas Lys-374 is less conserved according to Fig. 5c and also in the context of the data mentioned above. A variable degree of conservation is also found for the tRNA binding sites His-67, Cys-82, Lys-219 and Lys-248. Little variation is seen among the amino acids surrounding Arg-59, assumed to be important for the contact with EF-Ts (Fig. 5c) [31].

The ribosomal protein S7 has a role only during

(a) rps12

cyanelle    1 : MPTIQQLIRSKRTKIEKKTKSPALKACPQRRGVCTRVYTTTPKKPNSALR
anacyst     1 : M........DE.E..T.........N........................
chlamy      1 : M.........A.K..T.........S......I.L....V..........
e.coli      1 : MA.VN..V.KP.ARKVA.SNV...E....K....................
euglena     1 : M..LEH.T..P.K..KR........G...K.AI.M...............
nicotia     1 : M...K....NT.QP.RNV......RG......P......I..........
marchan     1 : M........N..QP..NR.......G........................

cyanelle   51 : KVARVRLTSGFEVTAYIPGIGHNLQEHSVVLVRGGRVKDLPGVRYHIVRG
anacyst    51 : ..............................MI...............I..
chlamy     51 : ........T..........V.......A......................
e.coli     51 : ..C.....N.....S..G.E.........I.I..............T...
euglena    51 : ..T....S..L....................I...........K..VI..
nicotia    51 : ............I.....................................
marchan    51 : .I..........I..................................I..

cyanelle  101 : ALDAAGVKDRRQSRSKYGAKRPKA : 124
anacyst   101 : T..T.................... : 124
chlamy    101 : S..T....N.V.......V.MGSKTAAKTAGKK : 133
e.coli    101 : ...CS.....K.A.....V..... : 124
euglena   101 : C....S..N.KNA.....V.K..PK : 125
nicotia   101 : T...V.....Q.G.....V.K.. : 123
marchan   101 : T...V.....Q.G..R..V.KS. : 123

(b) rps7

cyanelle    1 : MSRRSTAKKRLILPDPIYNSRLVTLLINHMLKDGKKSIARSFIYEALKII
anacyst     1 : M...TS.Q..SVN...KF....ASMMVARLMDS....L.FRIL.S.FDL.
e.coli      1 : MP..RVIGQ.K.....KFG.E.LAKFV.ILMV.....T.E.IV.S..ETL
euglena     1 : M...RR....I.SQ......T.ASKV..KI.LN...TL.QYIF..TM.N.
nicotia     1 : M...G..E.KTAKS....RN...NM.V.RI..H....L.YQI..R.V.K.
marchan     1 : M..K.I.E.QVAK.....RN...NM.V.RI..N....L.YRIL.K.M.N.
soja        1 : M...G..EEKTAKS....RN...NM.V.RI..H....L.YQI..R.M.K.

cyanelle   51 : EEKKGSDPLEVLEQAVRNSTPLIEVKARRIGGSTYQVPMEVRVDRGITLA
anacyst    51 : Q.RT.N....LF......A...V..R...V..A.........SE..TAM.
e.coli     51 : AQRS.KSE..AF.V.LE.VR.TV...S..V........V..P.R.N-ALA
euglena    51 : Q.IYKK...DI.RK.IK.AS.QM.TRK.....TI....V..KE...TS..
nicotia    51 : QQ.TETN..S..R..I.GV..D.T.....V....H...I.IGSTQ.KA..
marchan    51 : KQ.TKKN..F..R....KV..NVT......D.......L.IKSTQ.KA..
soja       51 : QQ.TEIN..S..R..I.GV..D.A.....V....H...V.IGSTQ.KA..

cyanelle  101 : LRVVNSFSLQRPGKTIAVKLANELIDAANETGNTIKKREEMHRMAEANKA
anacyst   101 : ..WLVQY.R.....SM.I......M.......SSVR....TDK.......
e.coli    100 : MRWI.EAARK.GD.SM.LR..D..S...ENK.TAV....DV.A.......
euglena   101 : LKFIIEKARE.K.RG.ST..K..I...S.N..EAV..K..I.KT......
nicotia   101 : I.WLLAA.RK...RNM.F..SS..V...KGS.DA.R.K..T.......R.
marchan   101 : I.WLLGA.RK.S.QNM.F..SY......RDN.IA.R.K..T.K.....R.
soja      101 : I.WLLGA.RK...RNM.F..SS..V...KGS.DA.R.K..T.......R.

cyanelle  151 : FVHYRY : 156
anacyst   151 : .A.... : 156
e.coli    150 : .A...WLSLRSFSHQAGASSKQPALGYLN : 178
euglena   151 : .SNMKF : 156
nicotia   151 : .A.F. : 155
marchan   151 : .A.F. : 155
soja      151 : .A.F. : 155

Fig. 5. Comparison of the amino acid sequences of the cyanelle genes for the ribosomal proteins (a) S12 and (b) S7 and (c) for the elongation factor Tu with the corresponding sequences from cyanobacteria, algae and plants. Boxed regions correspond to presumed GDP/GTP binding sites. Bold-face letters are assigned to important amino acids.

(c) tufA

cyanelle:

anacyst :

chlamy :

euglena :

arabido :

i : M ....................

I : M ...................

I : M ....................

i : M .............

1 : MAISAPAACSSSSRILCSYSSPSPSLCPAISTSGKLKTLTLSSSFLPSYS

2 : ARQKFDGNKPHVNIGT~GHVDH4KTT~TAAIT

2 . .A..E~T...A .... + ....... "I" ..... 2 ........... S.A..E~K ........ + ....... + .....

2 ........... . .... ERT...I .... + ....... + .....

51 LTTTSASQSTRRSFTVRA..G..ERK ........ .~. ....... .~....L.

34 : TALASQGKGKARKYDEIDAAPEE~TAHVEYETEKRHYAHV~CP

34 .v..~.~...A.AD ......... l ..... l ......... G~ ..... +..

34 MT..~.GSVG~ ...... S ...... l ..... I ................ +'"

34 M .... T.NS..K~.ED..S ...... I ..... i ......... ~ ..... +'"

i01 M...SI.SSVAK ........... R. I ..... I...T ...... N ..... .|...

84 : ~ D Y V K N M I T G A A Q ~ S A A D G P M P Q T R E H I L L A K Q V G V P N M V V

84 • - I - . . . . . . - ' ' ' - " . . . . " I ' ' ' - ' ' I " " . . . . ' ' ' ' ' ' ' ' ' ' ' ' . . . . . . . . I . . o o o . . . . . . . . . . . . o o o o o o o o + . . . . . . . . . . . . . . . . . . . . . v . 8 4

++ I . . . . . . . . . . . . . . . . I × : : I: . . . . . . . . . . . . . . . . . . . . . . . +

I I I

151 ..I. . . . . . . . . . . . . . , , G ........ K ............ D...

134 : ~IDDADLLE~VELEWELLSKYDPPGDQ~P~VSGSA~ESLS

i34 : liT ]~7\?'" .E .............. s ...... D..I.A .... Q...AIQ 134 : , .V..KE ........... T.D..E .... E..V.P ........ A.I

I m l m m l

201 : I II III U~.l..l.~.V ...E .............. S.E.N..D..II ....... V.T.T

184 : SNPKLMRGEDKWVDKILALMDAVDEYIPTPERPIDKSFLMAIEDVFSITG

184 : GGASGQK.DNP ...... K..EE..A ....... EV.RP .... V .... T...

184 : E...TQ...N ...... YQ...N..S ..... Q.ET..P..L.V...L ....

183 : K...ITK..N ....... N...Q..S ..... T.DTE.D ........ L ....

251 : E...VK..DN.W .... YE ...... D...I.Q.QTELP..L.V ........

234 : RGTVATGRIERGAIKVGETVELVGLKDTKSTTVTGLEMFQKTLEEGMAGD

234 : ............ SV ..... I.I...R..R...Y..V ....... D..L...

234 : ........ V .... LRISDN..I...RP.QTAV ....... K...D.TL...

233 : ........ V...T ............... R...I ........ S.D.AL...

301 : ........ V...TV ...... D .... RE.R.YT...V ..... I.D.AL...

284 : NIGILLRGVQKTDIERGMVLAKPGSITPHTQFESEVYVLTKDEGGRHTPF

284 : °V.L .... I ..................... K ........ K.E ........

284 : .V.V ....... K ....... I .... T ..... K..AQ ...... E ..... SA.

283 : .V.V .... I..N.V ......... RT.N...K.D.Q..I...E ........

351 : .V.L .... I..A..Q ............... K..AII...K.E ..... S..

334 : FSGYRPQFYVRTTDVTGS ......... IDAFTADDGSNAEMVMPGDRIKM

334 : .P ............... A ......... ISD ....... A .... I .......

334 : MI..Q ............ KVVGFNHIQMRNPSSV.EEHSNK.A ...... S.

333 : .E ............... K ......... IES.RS.NDNP.Q ..........

401 : .A ....... M ....... K ......... VTKIMN.KDEESK ....... V.I

375 : TVSLVHPIAIEQGMRFRIREGGRTIGAGVVSKILK : 409

375 : ..E.IN .......... A ................. Q : 409

385 : ..E.IN ..... K .... A ....... V ..... TN.VQ : 419

374 : K.E.IQ ..... K .... A ....... V ..... LS.IQ : 408

442 : V.E.IV.V.C ...... A ..... K.V .... IGT..E : 476

ribosome assembly and is therefore less conserved.

In all three cases the cyanelle-derived amino acid sequences most closely resemble those of the corresponding cyanobacterial genes. The further order of homology may vary between algae and plants depending upon the respective trait.

In the course of a more detailed sequence analysis with the computer, we chose tufA because of its size and the high degree of conservation. Interestingly, it appeared that upon comparing different genes at the nucleotide level the succession mentioned above changed to algae > cyanobacteria > plants > E. coli (Table 2). Computing the relative substitution rates at the three codon positions gave us a first explanation for this change, e.g. the high value at the third codon position of the Anacystis gene as opposed to the low value for the Chlamydomonas gene.
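A minimal sketch of how relative substitution rates at the three codon positions can be tallied for a pair of aligned coding sequences is given below. It counts raw differences per codon position and scales them to the first position, giving ratios of the form 1 : x : y as in Table 2. This simple counting is an assumption for illustration and need not be the procedure actually used for Table 2. The two short sequences are made up.

```python
def codon_position_differences(seq_a, seq_b):
    """Count mismatches at codon positions 1, 2 and 3 of two aligned ORFs."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("need gap-free, in-frame sequences of equal length")
    counts = [0, 0, 0]
    for index, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a != b:
            counts[index % 3] += 1
    return counts


def relative_rates(counts):
    """Scale the three counts so that the first codon position equals 1."""
    if counts[0] == 0:
        raise ValueError("no differences at the first codon position")
    return [count / counts[0] for count in counts]


# Hypothetical six-codon example, not taken from any of the genes discussed.
seq_a = "ATGGCAAGACAGAAATTT"
seq_b = "ATGGCTCGTCAAAAGTTC"
counts = codon_position_differences(seq_a, seq_b)
print(counts, relative_rates(counts))
```

A high third-position value, as reported for the Anacystis gene, indicates that many of the nucleotide differences are concentrated at the third codon position, where changes are often silent at the protein level.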

We used the main parts of the aligned tufA genes without any gaps (1182 sites out of e.g. 1227 for Cyanophora) and the whole rps12 genes, in both cases with each codon position weighted with the average relative substitution rate in different categories, and the whole rps7 genes for phylogenetic analysis with the maximum likelihood algorithm of the program DNAML [13].
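Restricting an alignment to gap-free sites, as described for tufA, can be sketched as follows. The sketch assumes that gaps are written as "-" and that a site is dropped as soon as any sequence has a gap there. The toy alignment and the function name are hypothetical. Note that 1227 = 3 × 409 and 1182 = 3 × 394, so the retained sites correspond to whole codons.

```python
def gap_free_columns(aligned, gap_chars="-"):
    """Keep only alignment columns in which no sequence has a gap."""
    lengths = {len(sequence) for sequence in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    keep = [
        column
        for column in range(n_columns)
        if all(sequence[column] not in gap_chars for sequence in aligned.values())
    ]
    return {
        name: "".join(sequence[column] for column in keep)
        for name, sequence in aligned.items()
    }


# Hypothetical toy alignment with gaps in multiples of three.
toy = {
    "seq1": "ATGGCA---TTT",
    "seq2": "ATGGCAAAATTT",
    "seq3": "ATG---AAATTC",
}
trimmed = gap_free_columns(toy)
print(trimmed)
```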

The unrooted trees for the three genes are shown in Fig. 6. The branch lengths reflect the phylogenetic relations among the organisms de- rived in terms of expected numbers of substitu- tions and scaled according to the categories mentioned above. The three genes under investi- gation have evolved in different ways resulting in some deviations among the derived trees. The

Table 2. Percent nucleotide identity and relative substitution rate of the three codon positions of the cyanelle tufA gene with the corresponding genes from:

Organism                     Identity (%)    Relative substitution rate (codon positions 1 : 2 : 3)
Chlamydomonas reinhardii     75.9            1 : 0.51 : 2.16
Euglena gracilis             73.5            1 : 0.43 : 2.68
Spirulina platensis          72.6            1 : 0.48 : 3.12
Anacystis nidulans           71.9            1 : 0.54 : 4.37
Arabidopsis thaliana*        67.7            1 : 0.46 : 2.86
Escherichia coli             66.8            1 : 0.54 : 2.38

* Nuclear gene.


Fig. 6. Phylogenetic trees constructed with the DNAML program using the nucleotide sequences of the str operon genes from cyanobacteria, cyanelle and chloroplasts. The circled area indicates an uncertain branching order. Panels and taxa: rps7 (Marchantia, Spirulina, Cyanophora, Anacystis, Nicotiana, Euglena); rps12 (Marchantia, Spirulina, Nicotiana, Euglena, Chlamydomonas, E. coli, Cyanophora, Anacystis); tufA (E. coli, Spirulina, Euglena, Chlamydomonas, Cyanophora, Anacystis, Arabidopsis).

cyanelles of Cyanophora paradoxa appear as the closest relatives to cyanobacteria among plastids when the rps7-based tree is considered. This view still holds for rps12 though the order of branching is uncertain for Cyanophora and Chlamydomonas. In the tufA-derived tree the inclusion of a nuclear gene yielded the unexpected placement of Arabidopsis between cyanobacteria and Cyanophora.

Discussion

In E. coli an interesting mechanism of autoregulation is known to operate for ribosomal protein operons including str. Overproduced S7 unable to bind to its specific site on 16S rRNA interacts with a similar structure in the (S12-S7) intergenic space on the mRNA, thus inhibiting further translation [38, 39]. For a cyanobacterial str operon this has been shown to be possible in theory, though no experimental proof has been given yet [33]. Autoregulation could not be demonstrated in the ribosomal protein operons of Bacillus subtilis [22].

The S12-S7 intergenic space of the cyanelle operon is much smaller and shows no homology to the corresponding region on 16S rRNA [25]. This seems to rule out autoregulation in accordance with reports on respective plastid operons [18]. However, the complex assembly mechanism of the plastid ribosomes, where two thirds of the proteins are produced in the cytoplasm and imported, certainly needs regulatory features. Binding sites for S7 on regions of the mRNA other than the intergenic spacer could also be envisaged [6].

The independent expression of the tufA gene, leading to a considerable excess of EF-Tu compared to the ribosomal proteins (and elongation factor G), is secured in E. coli by an additional internal promoter [28]. In cyanelles, a specific tufA promoter would have to diverge considerably from the E. coli '-35' consensus sequence. The presence of a bicistronic S12-S7 transcript and a separate tufA transcript offers another explanation: the primary transcript covering all three genes (faintly visible in Fig. 4) is processed to a relatively stable tufA mRNA and a less stable S12-S7 mRNA. The data available for Euglena, where the plastid str operon is organized in an analogous way, support this view [35, 36]. A similar mechanism for regulating the expression of individual cistrons in a transcription unit by differential stability of processed mRNA fragments seems to operate in the plastid rps2-atpI,H,F,A operons (A.R. Subramanian, personal communication).

Many different methods for constructing phylogenetic trees are available [14]. We used the DNAML program for several reasons. First, we wanted to do the analysis at the nucleotide level, where the stochastic process of mutation takes place. The pitfalls of protein sequence comparisons become apparent upon looking at our percent identity tables. The transition from nucleotide to amino acid sequence data changes the relationship of our cyanelles to the other organisms (Tables 1, 2).

Secondly, we wanted to work with an algorithm with a statistical basis. This probability method, as opposed to parsimony methods (which seek trees that require the fewest character changes) and compatibility methods (which attempt to find the largest number of sites where the nucleotides could evolve uniquely on the same tree), estimates phylogenies by first constructing a tree using defined substitution probabilities and then optimizing it through rearrangements of branches according to the likelihood of the new trees. The iteration algorithm scans for the maximum on the likelihood surface in connection with our parameters and, in contrast to some widely used parsimony methods, is independent of the input order of the sequences. In comparing nucleotide sequences, the different evolutionary pressure on the three codon positions should be taken into consideration; the many programs that neglect it may produce incorrect results.
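The idea of scanning a likelihood surface for its maximum can be illustrated on the smallest possible case, a single branch connecting two sequences. The sketch below uses the Jukes-Cantor model with equal base frequencies and one substitution rate as a deliberately simplified stand-in; the substitution model and the tree search implemented in DNAML are more elaborate and are not reproduced here. The function names and the example numbers are hypothetical.

```python
import math


def jc69_log_likelihood(n_sites, n_diffs, d):
    """Log-likelihood of two aligned sequences separated by a branch of
    length d (expected substitutions per site, d > 0) under Jukes-Cantor."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability that a site is unchanged
    p_diff = 0.25 - 0.25 * decay   # probability of one specific change
    n_same = n_sites - n_diffs
    # 0.25 is the equilibrium frequency of the base at the first sequence.
    return (n_same * math.log(0.25 * p_same)
            + n_diffs * math.log(0.25 * p_diff))


def ml_distance_by_scan(n_sites, n_diffs, step=0.001, d_max=3.0):
    """Scan a grid of branch lengths and return the most likely one."""
    best_d = step
    best_ll = jc69_log_likelihood(n_sites, n_diffs, step)
    for i in range(2, int(round(d_max / step)) + 1):
        d = i * step
        ll = jc69_log_likelihood(n_sites, n_diffs, d)
        if ll > best_ll:
            best_d, best_ll = d, ll
    return best_d


def jc69_distance(n_sites, n_diffs):
    """Closed-form maximum likelihood estimate of the same branch length."""
    p = n_diffs / n_sites
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Hypothetical example: 250 differences over 1000 aligned sites.
    print(ml_distance_by_scan(1000, 250), jc69_distance(1000, 250))
```

For two sequences the grid scan and the closed-form estimate agree to within the grid spacing. For a full tree no closed form exists, which is why the branch lengths and the topology have to be optimized iteratively.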

Thus far this method has not been used in the context of chloroplast evolution. A phylogenetic analysis using 16S rRNA partial sequences in a parsimony algorithm placed chloroplasts as a single lineage among cyanobacteria, with first Euglena, then Chlamydomonas deviating from the closely linked group of plants [45]. In a comprehensive study by a distance matrix method, where numerous cyanobacterial 16S rRNA sequences have been included, the cyanelles from Cyanophora paradoxa were placed together with plastids within the cyanobacterial radiation [17].

Later, in an attempt to fit Prochlorothrix into the plastid phylogenies, a distance matrix method applied to this structural RNA sequence changed the divergence within the chloroplast tree: cyanelles branch earlier, and Chlamydomonas and Euglena separate together from the rest of the plastids [47]. An interesting aspect emerged from the comparison of psbA genes. In this case the percent identity scores were highest between cyanelles and plants (Marchantia) [26], although the gene size was typically cyanobacterial. This dilemma was also encountered upon the positioning of Prochlorothrix using the psbA character. When the retention or loss of seven carboxy-terminal codons is weighted strongly, cyanelles would group with cyanobacteria and Prochlorothrix with green plastids [37].

When moving to the components of the translation machinery, the trees diverge more. The analysis of five sequenced ribosomal protein genes, compared with different programs, suggested that cyanelles are more closely related to land plant chloroplasts than the Euglena chloroplasts [11]. Here the disadvantage is the lack of corresponding cyanobacterial sequences.

The inclusion of our cyanelle tufA sequence in an extensive phylogenetic analysis using a distance matrix method [31] and a parsimony method [2] places cyanelles again together with cyanobacteria aside from the chloroplast group. But all these data are somewhat difficult to compare due to the use of different phylogenetic methods.

Our trees reflect the biochemical (e.g. pigment composition) and morphological (e.g. thylakoid membrane structure) parallels [48] and show a distinct relationship of the cyanelles of Cyanophora paradoxa to cyanobacteria. The Euglenophyta and plants form a more compact family. But there are also branchings at unexpected positions, as in the case of the Arabidopsis tufA gene, which should be located near the 'plant branching' and not close to cyanobacteria as observed. This indicates that the inclusion of nuclear genes into the tree derived from prokaryotic and plastidic sequences is problematic.

Although the branching in one area (circled in Fig. 6) is somewhat uncertain, the overall phylogeny remains stable for the three different traits. In summary, our phylogenies constitute something of a mean between other analyses based on tufA [2, 31] and 16S rRNA [17] and result in a bridge position for the cyanelles of Cyanophora paradoxa between cyanobacteria and plastids.


Acknowledgements

We are grateful to Dr. H.J. Bohnert and C. Michalowski who donated the clone BglII-11 from their cyanelle DNA library. We wish to thank Drs. P.-E. Montandon and J.-D. Rochaix for providing gene probes and Dr. J. Felsenstein for sending us the PHYLIP program package. The work was supported by the 'Fonds zur Förderung der wiss. Forschung' (S29/06/WL). We are indebted to Drs. J.D. Palmer, S.L. Baldauf and M. Kuhsel for sequence information prior to publication and for generous gifts of primers, and to Dr. K.H. Schleifer for a preprint of ref. 31.

References

1. Aitken A, Stanier RY: Characterization of peptidoglycan from the cyanelles of Cyanophora paradoxa. J Gen Microbiol 112:219-223 (1979).

2. Baldauf SL, Palmer JD: Evolutionary transfer of the chloroplast tufA gene to the nucleus. Nature 344: 262-265 (1990).

3. Berenguer J, Rojo F, de Pedro MA, Pfanzagl B, Löffelhardt W: Penicillin-binding proteins in the cyanelles of Cyanophora paradoxa, a eukaryotic phototroph sensitive to β-lactam antibiotics. FEBS Lett 224: 401-405 (1987).

4. Bohnert HJ, Löffelhardt W: Cyanelle DNA from Cyanophora paradoxa exists in two forms due to intramolecular recombination. FEBS Lett 150:403-406 (1982).

5. Bohnert HJ, Michalowski C, Bevacqua S, Mucke H, Löffelhardt W: Cyanelle DNA from Cyanophora paradoxa: Physical mapping and location of protein coding regions. Mol Gen Genet 201:565-574 (1985).

6. Bonham-Smith PC, Bourque DP: The chloroplast genome and regulation of its expression. In: Adolph KW (ed) Chromosomes: Eukaryotic, Prokaryotic and Viral. Vol II, pp. 179-261. CRC Press, Boca Raton (1990).

7. Bryant DA, Stirewalt VL: The cyanelle genome of Cyanophora paradoxa encodes ribosomal proteins not encoded by the chloroplast genomes of higher plants. FEBS Lett 259:273-280 (1990).

8. Burger-Wiersma T, Veenhuis M, Korthals HJ, Van de Wiel CCM, Mur LR: A new prokaryote containing chlorophyll a and b. Nature 320:262-264 (1986).

9. Buttarelli FR, Calogero RA, Tiboni O, Gualerzi CO, Pon CL: Characterization of the str operon genes from Spirulina platensis and their evolutionary relationship to those of other prokaryotes. Mol Gen Genet 217:97-104 (1989).


10. Corpet F: Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res 16:10881-10890 (1988).

11. Evrard JL, Kuntz M, Weil JH: The nucleotide sequence of five ribosomal protein genes from the cyanelles of Cyanophora paradoxa. J Mol Evol 30:16-25 (1990).

12. Evrard JL, Johnson C, Janssen I, Löffelhardt W, Weil JH, Kuntz M: The cyanelle genome of Cyanophora paradoxa, unlike the chloroplast genome, codes for the L3 protein. Nucleic Acids Res 18:1115-1119 (1990).

13. Felsenstein J: Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17:368-376 (1981).

14. Fink WL: Microcomputers and phylogenetic analysis. Science 234:1135-1139 (1986).

15. Galili S, Fromm H, Aviv D, Edelman M, Galun E: Ribosomal protein S12 as a site for streptomycin resistance in Nicotiana chloroplasts. Mol Gen Genet 218: 289-292 (1989).

16. Giddings Jr. TH, Wasmann C, Staehelin LA: Structure of the thylakoids and envelope membranes of the cyanelles of Cyanophora paradoxa. Plant Physiol 71: 409-419 (1983).

17. Giovannoni SJ, Turner S, Olsen GJ, Barns S, Lane DJ, Pace NR: Evolutionary relationships among cyanobacteria and green chloroplasts. J Bact 170:3584-3592 (1988).

18. Giese K, Subramanian AR, Larrinua IM, Bogorad L: Nucleotide sequence, promoter analysis, and linkage mapping of the unusually organized operon encoding ribosomal protein S7 and S12 in maize chloroplast. J Biol Chem 262:15251-15255 (1987).

19. Götz M: Lokalisierung von Genen für Proteine des Translationsapparates auf dem Cyanellengenom von Cyanophora paradoxa LB555UTEX. Diplom at the University of Vienna (1988).

20. Gray MW: The evolutionary origins of organelles. Trends in Genetics 9:294-299 (1989).

21. Henikoff S: Unidirectional digestion with Exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28:351-359 (1984).

22. Henkin LM, Moon SH, Mattheakis LC, Nomura M: Cloning and analysis of the spc ribosomal protein operon of Bacillus subtilis: comparison with the spc operon of E. coli. Nucleic Acids Res 17:7469-7486 (1989).

23. Hiratsuka J, Shimada H, Whittier R, Ishibashi T, Sakamoto M, Mori M, Kondo C, Honji Y, Sun CR, Meng BY, Li YQ, Kanno A, Nishizawa Y, Hirai A, Shinozaki K, Sugiura M: The complete sequence of the rice (Oryza sativa) chloroplast genome: Intermolecular recombination between distinct tRNA genes accounts for a major plastid DNA inversion during the evolution of the cereals. Mol Gen Genet 217:185-194 (1989).

24. Jacquet E, Parmeggiani A: Structure-function relationship in the GTP binding domain of EF-Tu: mutation of Val 20, the residue homologous to position 12 in p21. EMBO J 9:2861-2867 (1988).

25. Janssen I, Mucke H, Löffelhardt W, Bohnert HJ: The central part of the cyanelle rDNA from Cyanophora paradoxa: comparison with cyanobacteria and plastids. Plant Mol Biol 9:479-484 (1987).

26. Janssen I, Jakowitsch J, Michalowski C, Bohnert H, Löffelhardt W: Evolutionary relationship of psbA genes from cyanobacteria, cyanelles and plastids. Curr Genet 15:335-340 (1989).

27. Koller B, Fromm H, Galun E, Edelman M: Evidence for in vivo trans splicing of pre-mRNA's in tobacco chloro- plasts. Cell 48:111-119 (1987).

28. Lindahl L, Zengel JM: Ribosomal genes in Escherichia coli. Ann Rev Genet 20:297-326 (1986).

29. Liu XQ, Gillham NW, Boynton JE: Chloroplast protein gene rps12 of Chlamydomonas reinhardii. J Biol Chem 27: 16100-16108 (1989).

30. Löffelhardt W, Kraus M, Pfanzagl B, Götz M, Brandtner M, Markmann-Mulisch U, Subramanian AR, Michalowski C, Bohnert HJ: Cyanelle genes for components of the translation apparatus. In: Nardon P (ed) Endocytobiology IV, pp 561-564, INRA, Paris (1990).

31. Ludwig W, Weizenegger M, Betzl D, Leidel E, Lenz T, Ludvigsen A, Möllenhoff D, Wenzig P, Schleifer KH: Complete nucleotide sequences of seven eubacterial genes coding for the elongation factor Tu: functional, structural and phylogenetic evaluations. Arch Microbiol 153:241-247 (1990).

32. Manhart JR, Kelly K, Dudock BS, Palmer JD: Unusual characteristics of Codium fragile chloroplast DNA revealed by physical and gene mapping. Mol Gen Genet 216:417-421 (1989).

33. Meng BY, Shinozaki K, Sugiura M: Genes for the ribosomal proteins S12 and S7 and elongation factors EF-G and EF-Tu of the cyanobacterium Anacystis nidulans: Structural homology between 16S rRNA and S7 mRNA. Mol Gen Genet 216:25-30 (1989).

34. Michalowski C, Pfanzagl B, Löffelhardt W, Bohnert H: The cyanelle S10-spc operons from Cyanophora paradoxa. Mol Gen Genet, submitted.

35. Montandon PE, Stutz E: Nucleotide sequence of a Euglena gracilis chloroplast genome region coding for the elongation factor Tu; evidence for a spliced mRNA. Nucleic Acids Res 11:5877-5892 (1983).

36. Montandon PE, Stutz E: The genes for the ribosomal proteins S12 and S7 are clustered with the gene for the EF-Tu protein on the chloroplast genome of Euglena gracilis. Nucleic Acids Res 12:2851-2859 (1984).

37. Morden CW, Golden S: psbA genes indicate common ancestry of prochlorophytes and chloroplasts. Nature 337:382-385 (1989) and Nature 339:400 (1989).

38. Nomura M, Gourse R, Baughman G: Regulation of the synthesis of ribosomes and ribosomal components. Ann Rev Biochem 53:75-117 (1984).

39. Nomura M, Yates JL, Dean D, Post LE: Feedback regulation of ribosomal protein gene expression in Escherichia coli: Structural homology of ribosomal RNA and ribosomal protein mRNA. Proc Natl Acad Sci USA 77:7084-7088 (1980).

40. Ohto C, Torazawa K, Tanaka M, Shinozaki K, Sugiura M: Transcription of ten ribosomal protein genes from tobacco chloroplasts: a compilation of ribosomal protein genes found in the tobacco chloroplast genome. Plant Mol Biol 11:589-607 (1988).

41. Ohyama K, Fukuzawa H, Kohchi T, Shirai H, Sano T, Sano S, Umesono K, Shiki Y, Takeuchi M, Chang Z, Aota S, Inokuchi H, Ozeki H: Complete nucleotide sequence of liverwort Marchantia polymorpha chloroplast DNA. Plant Mol Biol Rep 4:148-175 (1986).

42. Palmer J: Comparative organization of chloroplast genomes. Ann Rev Genet 19:325-354 (1985).

43. Schmidt RJ, Hosler JP, Gillham NW, Boynton JE: Biogenesis and evolution of chloroplast ribosomes: cooperation of nuclear and chloroplast genes. In: Molecular Biology of the Photosynthetic Apparatus, pp. 417-427. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1985).

44. Shinozaki K, Ohme M, Tanaka M, Wakasugi T, Hayshida N, Matsubayasha T, Zaita N, Chunwongse J, Obokata J, Yamaguchi-Shinozaki K, Ohto C, Torazawa K, Meng BY, Sugita M, Deno H, Kamogashira T, Yamada K, Kusuda J, Takaiwa F, Kata A, Tohdoh N, Shimada H, Sugiura M: The complete nucleotide sequence of the tobacco chloroplast genome. Plant Mol Biol Rep 4:110-147 (1986).

45. Sogin ML, Gunderson JH: Structural diversity of eukaryotic small subunits ribosomal RNA's. In: Lee JL, Frederick JF (eds) Endocytobiology III, pp. 125-139. Ann N Y Acad Sci (1986).

46. Taylor FJR: An overview of the status of evolutionary cell symbiosis theories. In: Lee JL, Frederick JF (eds) Endocytobiology III, pp. 1-16. Ann N Y Acad Sci (1986).

47. Turner S, Burger-Wiersma T, Giovannoni SJ, Mur LR, Pace NR: The relationship of a prochlorophyte Prochlorothrix hollandica to green chloroplasts. Nature 337: 380-382 (1989).

48. Wasmann CC, Löffelhardt W, Bohnert H: Cyanelles: organization and molecular biology. In: Fay P, Van Baalen C (eds) The Cyanobacteria, pp. 303-324. Elsevier, Amsterdam (1987).

49. Zurawski G, Clegg MT: Evolution of higher-plant chloroplast DNA-encoded genes: Implications for structure-function and phylogenetic studies. Ann Rev Plant Physiol 38:391-418 (1987).