Small RNAs in Rickettsia: are they functional?

Wagied Davids, Haleh Amiri and Siv G.E. Andersson

TRENDS in Genetics Vol.18 No.7 July 2002, p. 331, Research Update
http://tig.trends.com 0168-9525/02/$ – see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0168-9525(02)02685-9

Many obligate intracellular pathogens have small genomes with high fractions of pseudogenes. A recent analysis of gene expression patterns in Rickettsia conorii shows that short open reading frames inside deteriorating genes are occasionally transcribed into RNA. Here, we show that substitution frequencies at nonsynonymous sites are similar for expressed and unexpressed parts of the fragmented genes. We conclude that the observed expression is a temporary stage in the gene degradation process, suggesting that the expressed gene fragments are not functional.

Microbial genes with similar functions are often organized into co-transcribed operons, the longest and most highly conserved of which is the super-ribosomal protein gene cluster [1]. As a consequence, most genomes are densely packed with short spacer regions between the genes [2]. The expression of genes and operons in free-living bacteria is controlled by a broad spectrum of regulatory systems, allowing fast adaptation to changing growth environments. Regulatory RNAs seem to be more common than previously thought, and several encompass small internal open reading frames [3].

By contrast, obligate intracellular bacteria often contain disrupted operon structures [4], and have only a small set of genes involved in regulatory processes [5–8]. Furthermore, many previously active genes have been partially degraded by mutations and deletions to form high fractions of what seems to be junk DNA [9–12]. The process of gene degradation has been inferred from comparative sequence analyses [9,10], which provide no information about the possible role and regulation of the resulting junk DNA. Also, it is not known at what stage of the deterioration process the function of the gene is lost or when its expression is turned off. A recent analysis has shown that RNA is still produced from some of the degraded genes in Rickettsia [6], raising questions about a putative role for the junk DNA.

Gene degradation in Rickettsia

Rickettsia was the first bacterium for which gene degradation was described in any detail [5,6,9–11]. Members of the genus Rickettsia are obligate intracellular pathogens that infect vertebrate hosts with the help of bloodsucking arthropods, such as fleas, lice and ticks. Some species multiply exclusively in the host-cell cytoplasm, whereas others can also grow in the cell nucleus. A few are deadly human pathogens, but others cause no observable harm to their eukaryotic hosts. Here, we compare two Rickettsia species for which complete genome sequence data are available: Rickettsia prowazekii [5], the causative agent of epidemic typhus, and Rickettsia conorii [6], the causative agent of Mediterranean spotted fever.

The genomes of R. prowazekii and R. conorii are very small, only 1.1 and 1.3 Mb in size, respectively [5,6]. Before the genome sequences were obtained, it was estimated that the R. prowazekii genome contains a high fraction of noncoding DNA, as inferred from a simple calculation of the GC content of coding and noncoding segments of the genome [13]. This estimate turned out to hold remarkably well: the complete R. prowazekii genome [5] showed that genes comprise only 76% of the nucleotide sequence, which at the time was the lowest gene density described in any microbial system. The remaining spacer regions were suggested to be degraded remnants of ancestral genes that are no longer functional. But if so, why had they not been eliminated completely?

It has been suggested that high deletion rates have been selected in microbial genomes to prevent the accumulation of dangerous genetic parasites [14]. Indeed, numerous studies have shown that genes that confer no selectable functions are lost rapidly, most often by recombination between repetitive elements [15,16], resulting in compact genomes with little DNA in between genes. The mean spacer length in microbial genomes is estimated to be 140 bp, a value that is independent of genome size [2]. This suggests that most bacterial genomes, small and large, have very small spacers. At first sight, it might seem paradoxical that the bacterial genomes with the longest spacers are those of Rickettsia [5,6] and Mycobacterium leprae [7]; that is, obligate intracellular parasites subjected to reductive genome evolution.

Comparative sequence analyses show that pseudogenes and long spacers in Rickettsia are degraded genes in the process of being eliminated [6,10]. The patterns of changes in these neutrally evolving sequences reveal that there is a mutational bias for short deletions, which explains the observed sequence degradation [9,11]. Because influx of genetic material by horizontal gene transfer is prevented by the lack of exposure to bacteriophages and other bacteria in the eukaryotic cytoplasm, the result is a net loss of DNA [4,9–11,14]. The obligate intracellular parasite M. leprae, which has a genome size of 3.2 Mb, a coding content of only 50% and as many as 1116 pseudogenes, provides the best example of a microbial genome in which massive gene disintegration has occurred [7]. The effect of this degenerative process is a temporary accumulation of junk DNA.

Fragmented genes in R. conorii

The R. conorii genome contains 804 of the 834 genes previously identified in R. prowazekii, and another 552 genes are present uniquely in R. conorii [6]. An inspection of the spacer sequences in R. prowazekii that are located at the positions corresponding to the unique genes in R. conorii has identified short gene remnants for 229 of these 552 genes [6]. This suggests that more than 200 genes have been extensively degraded and that another 200 genes have been completely eliminated from the R. prowazekii genome since its divergence from R. conorii.

A smaller suite of genes appears to have been mutated more recently, as inferred from the identification of short, neighbouring open reading frames (ORFs) in R. conorii that are similar to full-length orthologues in other species. These include 37 genes that are split into 105 ORFs by internal stop codons and

frameshift mutations [6]. Fourteen of these have intact orthologues in R. prowazekii, and the remaining 23 are not present in that genome. Also, the R. prowazekii genome contains 11 genes that are split into 23 ORFs, all of which have intact orthologues in R. conorii [6]. Ogata et al. [6] use the term ‘split genes’ rather than ‘pseudogenes’ so as not to make any a priori assumption about the functional consequences of this type of gene disintegration.

Expression patterns of split genes in R. conorii

Studies of the transcription profiles of the split genes in R. conorii suggest that gene inactivation is a complex process that occurs in a step-wise manner (Fig. 1). The most intriguing finding is that transcription is sometimes re-initiated inside the fragmented genes in R. conorii [6]. This suggests that promoters can either be created by mutations, or recruited from existing sequences inside the fragmented genes. In general, transcription might be less well regulated in the small AT-rich genomes of obligate intracellular bacteria, and unwanted transcription, especially inside degrading gene sequences, could be difficult to prevent.

Indeed, bacterial promoters are AT-rich and potential promoter sequences are very frequent in the AT-rich genomes of R. prowazekii and R. conorii [5,6]. For example, the sequence TATAAT, one of several possible RNA polymerase binding sites, occurs seven times inside the expressed split genes shown in Fig. 1. The use of new promoters inside deteriorating genes could lead to a temporary retention of a partial gene function, which in principle could compensate for the accumulation of mutations in these small genomes. Alternatively, transcription of these short fragments might solely be an effect of the exposure of internal binding sites for RNA polymerase, with no functional consequences at the protein level.
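To illustrate how such hexamers can be tallied, the following minimal sketch counts overlapping occurrences of TATAAT on both strands of a sequence. The function names and the toy sequence are hypothetical and are not taken from the R. conorii genome; a real promoter search would also have to consider other binding-site variants and their spacing.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def count_motif(seq: str, motif: str = "TATAAT") -> int:
    """Count overlapping occurrences of motif on the strand as given."""
    seq = seq.upper()
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


# Toy AT-rich sequence (hypothetical, for illustration only).
toy = "GCTATAATGCATTATAGCTATAATTT"
forward_hits = count_motif(toy)                      # hits on the given strand
reverse_hits = count_motif(reverse_complement(toy))  # hits on the opposite strand
print(forward_hits, reverse_hits)
```

Counting both strands matters because a promoter can drive transcription in either direction; in the toy sequence the motif is found twice on the given strand and once on the opposite strand.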

Substitution frequencies of fragmented genes in R. conorii

To distinguish between these two alternatives, we have searched for functional constraints on the expressed gene fragments in R. conorii by comparing substitution frequencies (1) of split genes versus full-length genes, (2) of split genes with different expression characteristics and (3) of expressed versus unexpressed gene fragments.

Fig. 1. Gene inactivation in Rickettsia. The left panels (a,c,e,g,i,k) show a comparison of a selected subset of split genes in Rickettsia conorii (Rc) with their full-length orthologues in Rickettsia prowazekii (Rp). The right panels (b,d,f,h,j,l) show the inferred expression status of the split genes displayed in the corresponding left panel. We assume that most genes in the common ancestor of Rp and Rc were functional and produced full-length mRNA, translated into proteins by ribosomes (a,b). We speculate that the functional inactivation of genes in Rickettsia occurs by the following mechanism: the fixation of internal stop codons induces premature translation termination (c,d), followed by premature transcription termination (e,f) and, occasionally, initiation of transcription at internal start sites (g,h). Any of the promoter sequences can be lost (i,j) and the continued accumulation of deletion mutations results in the elimination of all or most parts of the ancestral gene (k,l). The first stage of this process (c,d) is here exemplified by three open reading frames with sequence similarity to a gene coding for alkaline phosphatase synthesis sensor protein in R. prowazekii; the second stage (e,f) by a gene coding for a protein with unknown function; the third stage (g,h) by a split gene with sequence similarity to acetate kinase in R. prowazekii; the fourth stage (i,j) by the gene for propyl-endopeptidase; and the final stage (k,l) by two open reading frames with sequence similarity to the 3′-terminal segment of a long gene in R. prowazekii putatively coding for lipopolysaccharide 1,2-glucosyltransferase. Yellow and blue boxes represent untranscribed and transcribed open reading frames (ORFs) in R. conorii, respectively. Aquamarine boxes represent full-length orthologous genes in R. prowazekii. Dotted lines indicate the borders of homology of genes in R. prowazekii and gene fragments in R. conorii. Green symbols represent transcription initiation sites and red hexamers translation termination sites. Green symbols combined with red hexamers indicate putative transcription termination sites. Red stars represent the sequence TATAAT, one of several possible RNA polymerase binding sites. Open circles above the boxes represent ribosomes translating mRNA (curved lines). It remains to be determined whether any of the transcribed ORFs in R. conorii are translated by ribosomes.

Labels in Fig. 1: phenylalanyl-tRNA synthetase β; alkaline phosphatase synthesis sensor protein; unknown protein; acetate kinase; propyl-endopeptidase; LPS 1,2-glycosyltransferase. ORF labels: Rc148, Rc149, Rc150, Rc215, Rc216, Rc217, Rc218, Rc702, Rc703, Rc704, Rc720, Rc721, Rc1042, Rc1043.

Based on an analysis of 785 orthologous, full-length genes in R. prowazekii and R. conorii, we estimated that the nonsynonymous substitution frequency (Ka) is 0.07 per site and that the synonymous substitution frequency (Ks) is 0.40 per site (Table 1). Here, Ka is the average number of substitutions at sites causing amino acid replacements, whereas Ks is the neutral exchange rate for substitutions with no effect on the amino acid sequences.
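The frequencies in Table 1 were estimated as described in [21,22]. The method of Nei and Gojobori [21] converts the observed proportion of differing sites into an estimated number of substitutions per site with the Jukes-Cantor correction for multiple hits. The sketch below shows only that correction step; the function names and the example proportions are hypothetical, and the counting of synonymous and nonsynonymous sites that precedes it is omitted.

```python
import math
from typing import Tuple


def jukes_cantor(p: float) -> float:
    """Substitutions per site estimated from a proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(p_nonsyn: float, p_syn: float) -> Tuple[float, float, float]:
    """Return (Ka, Ks, Ka/Ks) from proportions of differing sites."""
    ka = jukes_cantor(p_nonsyn)
    ks = jukes_cantor(p_syn)
    if ks == 0.0:
        raise ValueError("Ka/Ks is undefined when Ks is zero")
    return ka, ks, ka / ks


# Hypothetical proportions, chosen only to show the calculation.
ka, ks, ratio = ka_ks(p_nonsyn=0.05, p_syn=0.30)
print(f"Ka = {ka:.3f}, Ks = {ks:.3f}, Ka/Ks = {ratio:.2f}")
```

The correction always yields a value at least as large as the raw proportion, and the gap widens as sequences diverge, which is why it matters more for Ks than for Ka in a comparison such as this one.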

The corresponding Ka for 39 gene fragments derived from 13 split genes in R. conorii was estimated to be 0.16 substitutions per position (Table 1). This shows that the split genes have, on average, accumulated twice as many mutations at nonsynonymous sites as the full-length genes, suggesting that they evolve under fewer functional constraints. This could be due to the simple fact that different proteins evolve under different functional constraints, or that some or all of the split genes have recently lost their function. Indeed, no more than a twofold difference is to be expected for genes that currently evolve under no selective pressure in R. conorii. This is because the observed substitution frequencies for such genes represent those substitutions that have accumulated during their evolution as functional genes in the R. prowazekii and R. conorii lineages, plus those that have occurred subsequent to fragmentation in the R. conorii lineage.

To examine the difference in more detail, we sorted the split genes in R. conorii into five groups with different expression features (Fig. 1). The first group contains fragmented genes in which all internal ORFs are expressed (Fig. 1c); the second group includes genes in which only the 5′ terminal ORF is expressed (Fig. 1e); the third group consists of fragmented genes in which RNA is produced from two or more fragments, but in no particular order (Fig. 1g); the fourth group contains genes in which only the 3′ terminal ORF is expressed (Fig. 1i); and the fifth contains a few short ORFs, none of which is expressed (Fig. 1k). We observe that genes in four of the five groups have roughly twofold higher fixation rates for mutations at nonsynonymous sites (Ka = 0.14 to 0.16) than the set of full-length orthologues (Ka = 0.07) (Table 1). The second group is the only group with a lower frequency of substitutions at nonsynonymous sites (Ka = 0.09). However, this group consists of a single fragmented gene, leading to a less reliable estimate. Thus, a higher substitution frequency appears to be a characteristic feature of the split genes, irrespective of the different patterns of transcription.
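The grouping and the fold differences can be sketched as follows. The Ka values are those of Table 1; the `Group` names, the `classify` function and the decision to return `None` for a pattern that the five groups do not describe (a single internal ORF expressed) are assumptions made for illustration, not the procedure actually used to sort the genes.

```python
from enum import Enum
from typing import Optional, Sequence


class Group(Enum):
    ALL_EXPRESSED = "c-d"       # all internal ORFs expressed
    FIVE_PRIME_ONLY = "e-f"     # only the 5' terminal ORF expressed
    SEVERAL_FRAGMENTS = "g-h"   # two or more fragments expressed, but not all
    THREE_PRIME_ONLY = "i-j"    # only the 3' terminal ORF expressed
    NONE_EXPRESSED = "k-l"      # no ORF expressed


def classify(expressed: Sequence[bool]) -> Optional[Group]:
    """expressed[i] is True if ORF i (ordered 5' to 3') yields RNA."""
    if len(expressed) < 2:
        raise ValueError("a split gene consists of at least two ORFs")
    n_on = sum(expressed)
    if n_on == len(expressed):
        return Group.ALL_EXPRESSED
    if n_on == 0:
        return Group.NONE_EXPRESSED
    if n_on == 1 and expressed[0]:
        return Group.FIVE_PRIME_ONLY
    if n_on == 1 and expressed[-1]:
        return Group.THREE_PRIME_ONLY
    if n_on >= 2:
        return Group.SEVERAL_FRAGMENTS
    return None  # a single internal ORF expressed: outside the five groups


# Ka per group and for the full-length genes, as reported in Table 1.
FULL_LENGTH_KA = 0.07
GROUP_KA = {
    Group.ALL_EXPRESSED: 0.14,
    Group.FIVE_PRIME_ONLY: 0.09,
    Group.SEVERAL_FRAGMENTS: 0.16,
    Group.THREE_PRIME_ONLY: 0.15,
    Group.NONE_EXPRESSED: 0.15,
}
fold = {group: ka / FULL_LENGTH_KA for group, ka in GROUP_KA.items()}
for group, value in fold.items():
    print(f"group {group.value}: {value:.1f}-fold")
```

With these values, four groups fall between 2.0-fold and 2.3-fold, and the single-gene second group lies at about 1.3-fold.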

If the split genes are indeed not functional, we expect to find no difference in substitution frequency for expressed and unexpressed ORFs inside the fragmented genes. To examine systematically whether there is a stronger selective constraint on the gene fragments that still produce mRNA, we compared the substitution frequencies for 26 expressed ORFs with those of 13 unexpressed ORFs. No difference was found between the two sets of genes (Table 1), suggesting that the expressed gene fragments have not been more functionally constrained than the unexpressed gene fragments.
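A crude way to see how small this difference is relative to the reported uncertainty is to express the gap between the two means in units of their combined standard deviation. The sketch below uses the values of Table 1 and assumes that the two estimates are independent and that the ± values can be treated as standard deviations of the estimates; the function name is hypothetical, and this is not the comparison performed in the analysis.

```python
import math


def standardized_gap(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """Absolute difference of two means in units of the combined standard deviation."""
    return abs(mean_a - mean_b) / math.sqrt(sd_a ** 2 + sd_b ** 2)


# Expressed versus unexpressed ORFs, values from Table 1.
gap_ka = standardized_gap(0.16, 0.05, 0.17, 0.06)
gap_ks = standardized_gap(0.52, 0.12, 0.53, 0.15)
print(f"Ka gap: {gap_ka:.2f}, Ks gap: {gap_ks:.2f} (in combined standard deviations)")
```

Both gaps are far below one combined standard deviation, in line with the statement that no difference was found.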

We conclude that although transcription is maintained for some ORFs, these are accumulating mutations at the same high frequencies as the unexpressed ORFs. Together, the data suggest that the split genes are neutrally evolving sequences in the process of being eliminated from the R. conorii genome.

Translational readthrough, frameshifting and/or ribosome hopping?

In a few split genes, all internal ORFs are transcribed, possibly from the ancestral promoter site. Are these, presumably full-length, mRNAs translated by ribosomes? Several mechanisms could, in principle, account for the production of a protein despite the accumulation of internal stop codons (Fig. 1c,d). For example, if the newly created stop codon is leaky, translation might be able to proceed (readthrough) or bypass a stretch of noncoding nucleotides by translational frameshifting or ribosome hopping [17–20]. Alternatively, if translation start sites are available downstream of the internal termination codon, translation could be re-initiated on the same mRNA. If this process restores the function of the gene partially or completely, the ancestral, full-length gene will be replaced by a ‘mini-operon’ with several short genes. In this context, it is interesting to note that many bacterial operons, such as the ribosomal protein operon, contain stretches of short genes that encode subunits of the same enzyme [1]. Some of these operons could be the result of compensatory mutations that have been fixed in the population to preserve gene function subsequent to the accumulation of internal termination codons and frameshift mutations in the ancestral gene.

However, the irregular expression patterns of the split genes in R. conorii and the lack of conservation of split genes in the two Rickettsia species suggest that the accumulation of mutations does not normally result in the creation of functional ‘mini-operons’. A more likely scenario is that the fixation of internal termination codons in Rickettsia is most often followed by additional mutational changes. Indeed, the lack of a difference in substitution frequencies suggests that both expressed and unexpressed short ORFs are regions of the ancestral, functional gene in different stages of deterioration.

Table 1. Substitution frequencies of Rickettsia prowazekii genes and Rickettsia conorii genes and gene fragments sorted into different expression groups (a)

Set of genes                       n (ORF) (b)   Ka/Ks   Ka          Ks
Full-length genes, complete set    785           0.19    0.07±0.02   0.40±0.07
Split genes, complete set          13 (39)       0.30    0.16±0.05   0.53±0.13
Split genes, group c–d, Fig. 1     4 (10)        0.25    0.14±0.06   0.55±0.15
Split genes, group e–f, Fig. 1     1 (2)         0.26    0.09±0.03   0.35±0.10
Split genes, group g–h, Fig. 1     4 (17)        0.26    0.16±0.06   0.61±0.14
Split genes, group i–j, Fig. 1     3 (8)         0.40    0.15±0.05   0.37±0.10
Split genes, group k–l, Fig. 1     1 (2)         0.40    0.15±0.04   0.37±0.14
Expressed ORFs                     13 (26)       0.31    0.16±0.05   0.52±0.12
Unexpressed ORFs                   13 (13)       0.32    0.17±0.06   0.53±0.15

(a) The frequency of substitutions at nonsynonymous (Ka) and synonymous (Ks) codon positions. Substitution frequencies and standard deviations have been estimated as described in [21,22].
(b) n, number of genes in R. prowazekii; ORF, number of ORFs in R. conorii.
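As a consistency check on Table 1, the reported Ka/Ks column can be compared with the ratio of the rounded mean Ka and mean Ks in each row. This is only a sketch: whether the reported column is a ratio of means or a mean of per-gene ratios is not stated, so exact agreement is not expected, and the 0.02 tolerance and the variable names are assumptions.

```python
# (label, reported Ka/Ks, mean Ka, mean Ks), transcribed from Table 1.
TABLE_1 = [
    ("Full-length genes, complete set", 0.19, 0.07, 0.40),
    ("Split genes, complete set", 0.30, 0.16, 0.53),
    ("Split genes, group c-d", 0.25, 0.14, 0.55),
    ("Split genes, group e-f", 0.26, 0.09, 0.35),
    ("Split genes, group g-h", 0.26, 0.16, 0.61),
    ("Split genes, group i-j", 0.40, 0.15, 0.37),
    ("Split genes, group k-l", 0.40, 0.15, 0.37),
    ("Expressed ORFs", 0.31, 0.16, 0.52),
    ("Unexpressed ORFs", 0.32, 0.17, 0.53),
]


def ratio_gaps(rows):
    """Absolute gap between Ka/Ks recomputed from the means and the reported value."""
    return {label: abs(ka / ks - reported) for label, reported, ka, ks in rows}


gaps = ratio_gaps(TABLE_1)
largest = max(gaps, key=gaps.get)
for label, gap in gaps.items():
    print(f"{label}: gap {gap:.3f}")
```

Every row agrees within 0.02; the largest gap is for the full-length genes, where rounding of the small Ka value has the greatest relative effect.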

Conclusions

Our interpretation of these results is that the split genes in R. conorii are degraded genes in which mutations have started to accumulate, in spite of which transcription, and possibly translation, can continue. The enhanced substitution frequencies at nonsynonymous sites suggest that the split genes are no longer functional and that the expression driven by some of these fragments is most probably a temporary phenomenon, just as the accumulation of junk DNA is a temporary stage in the overall genome deterioration process [9–12]. A possible scenario for the gradual process whereby (1) the function of a gene is lost, (2) the expression is turned off, and (3) the sequence is eliminated, is outlined below.

The fixation of a frameshift mutation and/or an internal termination codon might first induce a halt in translation (Fig. 1c,d). It can be assumed that the ancestral promoter will continue to be active for a while, in particular if one or more of the shorter ORFs are translated and still able to maintain some functional role. However, in the absence of translational readthrough, frameshifting, hopping or re-initiation [17–20], the naked mRNA will be degraded and transcription will probably be terminated prematurely (Fig. 1e,f). This will expose ‘cryptic’ promoter sequences at which RNA polymerase can re-initiate transcription (Fig. 1g,h). Both promoters (the ancestral and the cryptic) might work simultaneously for a while, but, unless any of the short gene products are functionally selected for, either or both promoters will become inactivated as the gene accumulates more and more mutations (Fig. 1i,j). Finally, deletion mutations will remove any remaining regulatory signals, leaving only a few unexpressed fragments with weak similarities to their full-length orthologues in other species (Fig. 1k,l).

Thus, the balance between the rate of disruptive mutations and the rate at which cryptic transcriptional and translational initiation sites are exposed, or created by mutations, will determine the extent to which the original gene function can be recovered. A probable secondary effect of the highly simplified regulatory systems of obligate intracellular parasites is that gene expression could be more or less constitutive and that ‘false’ initiation of transcription and translation at internal gene sites might occur at higher than normal frequencies, especially inside degrading genes. Although it cannot be excluded that such processes might temporarily, or in rare cases permanently, recover the function lost through a disruptive mutation, degenerative processes are expected to dominate the evolution of obligate intracellular pathogens in the long term [4].

If transcription can indeed be driven by false initiation inside non-functional genes, it means that positive signals in microarray analyses of transcription profiles do not necessarily imply the presence of functional genes. It should be emphasized that other experimental data are needed to confirm a functional or regulatory role for any of the identified small RNAs. As we have shown here, comparative studies of gene conservation and substitution rates across closely related species could yield important clues about the functional significance of any observed RNA expression pattern.

Acknowledgements

The authors’ work is supported by grants from the Swedish Science Foundation (VR), the Foundation for Strategic Research (SSF), the Knut and Alice Wallenberg Foundation (KAW), the European Union (EU) and the National Science Foundation (NSF), USA, and the National Research Foundation, South Africa.

References

1 Lathe, W. et al. (2000) Gene context conservation of a higher order than operons. Trends Biochem. Sci. 25, 474–479
2 Mira, A. et al. (2001) Deletional bias and the evolution of bacterial genomes. Trends Genet. 17, 589–596
3 Wassarman, K.M. et al. (2001) Identification of novel small RNAs using comparative genomics and microarrays. Genes Dev. 15, 1637–1651
4 Andersson, S.G.E. and Kurland, C.G. (1998) Reductive evolution of resident genomes. Trends Microbiol. 6, 263–268
5 Andersson, S.G.E. et al. (1998) The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 396, 133–140
6 Ogata, H. et al. (2001) Mechanism of evolution in Rickettsia conorii and R. prowazekii. Science 293, 2093–2098
7 Cole, S.T. et al. (2001) Massive gene decay in the leprosy bacillus. Nature 409, 1007–1011
8 Shigenobu, S. et al. (2000) Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407, 81–86
9 Andersson, J.O. and Andersson, S.G.E. (1999) Genome degradation is an ongoing process in Rickettsia. Mol. Biol. Evol. 16, 1178–1191
10 Andersson, J.O. and Andersson, S.G.E. (2001) Pseudogenes, junk DNA, and the dynamics of Rickettsia genomes. Mol. Biol. Evol. 18, 829–839
11 Andersson, J.O. and Andersson, S.G.E. (1999) Insights into the evolutionary process of genome degradation. Curr. Opin. Genet. Dev. 9, 664–671
12 Ochman, H. and Moran, N.A. (2001) Genes lost and genes found: evolution of bacterial pathogenesis and symbiosis. Science 292, 1096–1098
13 Andersson, S.G.E. and Sharp, P.M. (1996) Codon usage and base composition in Rickettsia prowazekii. J. Mol. Evol. 42, 525–536
14 Lawrence, J.G. et al. (2001) Where are the pseudogenes in bacterial genomes? Trends Microbiol. 9, 535–540
15 Galitski, T. and Roth, J.R. (1997) Pathways for homologous recombination between chromosomal direct repeats in Salmonella typhimurium. Genetics 146, 751–767
16 Frank, C. et al. Genome deterioration: loss of repeated sequences and accumulation of junk DNA. Genetica (in press)
17 Buckingham, R.H. et al. (1997) Polypeptide chain release factors. Mol. Microbiol. 24, 449–456
18 Weiss, R. and Gallant, J. (1983) Mechanism of ribosome frameshifting during translation of the genetic code. Nature 302, 389–393
19 Herr, A.J. et al. (2000) Coupling of open reading frames by translational bypassing. Annu. Rev. Biochem. 69, 343–372
20 Huang, W.M. et al. (1988) A persistent untranslated sequence within bacteriophage T4 DNA topoisomerase gene 60. Science 239, 1005–1012
21 Nei, M. and Gojobori, T. (1986) Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3, 418–426
22 Ohta, T. and Nei, M. (1994) Variances and covariances of the number of synonymous and nonsynonymous substitutions per site. Mol. Biol. Evol. 11, 613–619

Wagied Davids
Haleh Amiri
Siv G.E. Andersson*
Dept of Molecular Evolution, University of Uppsala, Norbyvägen 18C, S-752 36 Uppsala, Sweden.
*e-mail: [email protected]