UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl)
UvA-DARE (Digital Academic Repository)
Anomalous DNA in prokaryotic genomes
van Passel, M.W.J.
Publication date: 2006
Citation for published version (APA): van Passel, M. W. J. (2006). Anomalous DNA in prokaryotic genomes.
Download date: 22 Jun 2021
https://dare.uva.nl/personal/pure/en/publications/anomalous-dna-in-prokaryotic-genomes(0d6ef0e9-55f3-4ea8-8c4d-8a4aa5cdd53e).html
Table of Contents
Chapter 1 Introduction
Chapter 2 An in vitro strategy for the selective isolation of anomalous DNA from unsequenced prokaryotic genomes
Chapter 3 δρ-web, an online tool to assess composition similarity of individual nucleic acid sequences
Chapter 4 An acquisition account of genomic islands based on genome signature comparisons
Chapter 5 Identification of anomalous sequences in Neisseria lactamica expands the neisserial gene pool
Chapter 6 Plasmid diversity in neisseriae
Chapter 7 Compositional discordance between prokaryotic plasmids and host chromosomes
Chapter 8 Summary and discussion, Dutch summary (samenvatting), list of publications and curriculum vitae
Chapter 1
Introduction
1.1 Prologue
Infectious diseases remain one of the leading causes of death worldwide.
Since the establishment of the germ theory of disease around 1870 by Louis Pasteur
and Robert Koch, which states that infectious diseases are caused by
microorganisms within the body, biomedical research has rigorously investigated
potential remedies against microbial infections. This resulted in better hygiene, the
production of vaccines (for example against anthrax, tetanus, polio), extensive
classes of antimicrobial agents against infectious bacteria (which cause diseases
such as gonorrhoea, pneumonia and meningitis), but also viral inhibitors. These
combined measures brought down the percentage of deaths due to infectious
diseases in the Netherlands from 30% at the beginning of the 20th century to 1.5%
at the beginning of this century. Still, according to the World Health Report 1996,
infectious diseases kill over 17 million people every year, of whom 9 million are
young children.
The field of genetics has contributed considerably to the quest for cures for
infections by elucidating bacterial strategies for causing disease. Historically,
genetics has been studied for about one and a half centuries. The Austrian monk
Gregor Mendel started investigating the basics of genetics in the 19th century using
peas, but it was not until 1944 that Oswald Avery and co-workers discovered that DNA
is the carrier of hereditary information. The structure of the DNA
molecule was finally resolved in 1953 by James Watson, Francis Crick and Rosalind
Franklin. In 1958, Matthew Meselson and Frederick Stahl showed that replication of
the two DNA strands, which occurs with every cell division, is semiconservative; this
explained how after every cell division the two daughter cells each contain an exact
copy of the original genetic information of the parent cell. Finally, the central dogma of
molecular biology was proposed concerning the flow of genetic information: DNA is
transcribed into an intermediate, RNA, which in turn is translated into the main
functional units of metabolism, namely proteins. Some of these proteins mediate
DNA replication, which brings the reproduction of DNA full circle.
In 1995 the first complete genome sequence of a free-living organism (the
bacterium Haemophilus influenzae) was published, which marked the start of the age
of genomics. Since then, genome sequencing has resulted in the publication of over
250 complete genome sequences, most of which originate from microbes (as the
genomes of microbes are relatively small and manageable compared to plant and
animal genome sequences). These genomes each contain the entire genetic data of
the organism in question, and so give insight into the organisation and expression of
genes, the metabolic potential of a microbe, the formation of different species, and
genome evolution. The final shape of all genomes results from hundreds of millions
of years of evolution, and as such they each represent their own account of the
recorded evolutionary history of life.
Studying genome evolution allows insight into the pathogen's side of the arms
race between the pathogen and the host (e.g. humans). Understanding genome
evolution is therefore key to finding new vaccines, antibiotics and other therapeutics.
This chapter aims to introduce the main topics of this thesis, which focuses on
bacterial genome composition, in particular that of Neisseria, and especially on horizontal
gene transfer (HGT), an important contributor to genome evolution. In order to
explain the implications of horizontal gene transfer, a little background in genomics
and bioinformatics is necessary, and is therefore included in this introduction.
1.2.1 Neisseria meningitidis
Neisseria meningitidis, or meningococcus, is a Gram-negative diplococcal
bacterium belonging to the family Neisseriaceae, a subdivision of the β-Proteobacteria.
It is an obligate human pathogen that inhabits the naso- and oropharynx. It has been
estimated that in this niche it can be encountered in approximately 10% of the
population (reviewed by [1]), although this may be a grave underestimation [2]. This
bacterium was first isolated by Anton Weichselbaum from the cerebrospinal fluid of a
meningitis patient and identified as the causal agent of a case of meningitis in 1887,
and initially named Diplococcus intracellularis [3]. In contrast with the closely related
gonococcus (Neisseria gonorrhoeae), isolated almost a decade earlier by Albert
Neisser [4], which invariably leads to disease in the infected host, the meningococcus
only causes disease in a fraction of the carriers of this bacterium. Therefore, N.
meningitidis could be regarded as a commensal bacterium [5]. However, sporadically
meningococci cross the mucosal barrier and enter the bloodstream, leading to
various clinical entities such as sepsis, meningitis or both simultaneously (reviewed
by [6]).
1.2.2 Bacterial typing
Meningococcal identification is based on a few distinctive features such as
shape, Gram-stain, and phenotypic characteristics. Meningococcal isolates are
grouped according to a number of antigenic and genotypic characteristics. This
enables intensive epidemiological surveillance, which is important for public health
decisions and the development of vaccination strategies. Traditionally,
meningococcal phenotypes were classified using polyclonal sera against surface
exposed structures [7], but nowadays serological typing is carried out with various
monoclonal antibodies more specifically aimed at the different neisserial antigenic
structures. The capsular polysaccharide, although occasionally absent, designates
the serogroups [8], of which the groups A, B, C, W-135 and Y are predominant
among clinical isolates. The major outer membrane proteins PorB and PorA define
the serotype and serosubtype, respectively, and finally, immunotypes are assigned
according to the lipopolysaccharide (LPS) structure [9, 10]. This results in the
following classification scheme for N. meningitidis:
[serogroup]:[serotype]:[serosubtype]:[immunotype], e.g. B:4:P1.7,4:L3,8. Meanwhile,
serological sero(sub)typing has been largely replaced by typing schemes based on
sequence data of the two variable regions (VR1 and VR2) of the PorA encoding gene
porA, and the partial nucleotide sequence of the outer-membrane protein encoding
gene fetA.
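The four-field serological designation described above can be handled as a simple colon-separated record. The following is a minimal sketch, assuming that the fields never contain a colon themselves; the class and function names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class MeningococcalType(NamedTuple):
    """Serological designation [serogroup]:[serotype]:[serosubtype]:[immunotype]."""
    serogroup: str
    serotype: str
    serosubtype: str
    immunotype: str


def parse_designation(designation: str) -> MeningococcalType:
    """Split a designation into its four colon-separated fields.

    Assumption: a colon only ever acts as the field separator, so commas
    inside a field (as in 'P1.7,4' or 'L3,8') are left untouched.
    """
    fields = designation.strip().split(":")
    if len(fields) != 4:
        raise ValueError(
            f"expected 4 colon-separated fields, got {len(fields)}: {designation!r}"
        )
    return MeningococcalType(*fields)


# The example designation given in the text.
example = parse_designation("B:4:P1.7,4:L3,8")
print(example.serogroup, example.serotype, example.serosubtype, example.immunotype)
```

Keeping the fields as plain strings avoids imposing any interpretation on the serosubtype and immunotype notation, which is not specified further here.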
With respect to genotyping approaches, the emphasis has shifted recently
from multilocus enzyme electrophoresis (MLEE, [11]), pulsed-field gel electrophoresis
(PFGE, [12]), and random amplified polymorphic DNA (RAPD, [13, 14]) to multilocus
sequence typing (MLST). The latter method allows a much higher resolution and is
by far more reproducible and unambiguously comparable (‘portable’) between
different laboratories [15]. Current MLST is based on the sequences of seven
housekeeping genes that contain sufficient variation to allow high resolution and
congruence, resulting in different sequence types, and is used for typing different
organisms (for details see http://www.mlst.net/). This MLST database is freely
available for global epidemiology analyses and surveillance, and can identify clonal
complexes [16] and even suggest the descent of isolates and/or clonal complexes
[17, 18].
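The principle of MLST, in which the allele numbers at seven housekeeping loci together define a sequence type, can be sketched as a lookup of an allelic profile. The locus names below are those commonly used for the Neisseria scheme; the allele numbers, the profile table and the labels "ST-A" and "ST-B" are invented for illustration and do not correspond to entries in the MLST database.

```python
from typing import Dict, Optional, Tuple

# Seven housekeeping loci (names as commonly used for the Neisseria scheme).
LOCI = ("abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm")

# Hypothetical profile table: tuple of allele numbers (in LOCI order) -> label.
KNOWN_PROFILES: Dict[Tuple[int, ...], str] = {
    (1, 3, 1, 1, 1, 1, 3): "ST-A",
    (1, 3, 1, 1, 1, 1, 4): "ST-B",
}


def sequence_type(alleles: Dict[str, int]) -> Optional[str]:
    """Return the label of an exactly matching profile, or None if it is unknown."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return KNOWN_PROFILES.get(profile)


def shared_alleles(a: Dict[str, int], b: Dict[str, int]) -> int:
    """Count the loci at which two isolates carry the same allele number."""
    return sum(a[locus] == b[locus] for locus in LOCI)


isolate_1 = dict(zip(LOCI, (1, 3, 1, 1, 1, 1, 3)))
isolate_2 = dict(zip(LOCI, (1, 3, 1, 1, 1, 1, 4)))
print(sequence_type(isolate_1), sequence_type(isolate_2))
print(shared_alleles(isolate_1, isolate_2), "of", len(LOCI), "loci shared")
```

Isolates that share most of their alleles, as the two above do, are the kind of closely related profiles that are grouped into clonal complexes; the exact grouping rule is not specified here and is therefore left out of the sketch.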
However, Neisseria are naturally transformable [19], and recombination
events between strains of different types can distort tree-like interpretations of
phylogenetic analysis. For example, when using MLST, Neisseria meningitidis is
depicted with a ‘fuzzy’ or unclear species definition, as species “…are not ideal
entities with sharp and unambiguous boundaries” [20]. However, MLST analyses
may support further modeling of how species may emerge.
1.2.3 Epidemiology/Incidence
The overall incidence of meningococcal disease varies considerably
throughout the world. In the Netherlands, the frequency is around 2/100,000
inhabitants per year [21], but during epidemics in sub-Saharan African countries the
incidence of disease has reached numbers up to 500/100,000 [22, 23]. Industrialised
countries have also experienced epidemics of N. meningitidis, for example Norway in
1974-1975 [24] and New Zealand from 1990 onwards [25, 26].
Widespread pandemics can be instigated by specific human migratory
patterns, such as the Hajj, and outbreaks of these clonal complexes can be followed by
global surveillance via the MLST database. A specific N. meningitidis clone with
serogroup W-135 emerged after the Hajj pilgrimage of 2000, infecting a total of over
40 Hajj pilgrims and their household contacts in the United Kingdom, France, the
Netherlands, and Oman, with an additional number of meningococcal disease cases
in Saudi Arabia related to this outbreak [27]. This serogroup W-135 clone also
caused local epidemics such as in Burkina Faso in 2002 [28]. However, recent
vaccination strategies in this region halted the epidemic [29].
In the Netherlands, the National Reference Laboratory for Bacterial Meningitis
(NRLBM, hosted by the Academic Medical Center and The National Institute for
Public Health and the Environment (RIVM)) has been collecting isolates of N.
meningitidis since 1959 and currently harbours over 37,000 meningitis and/or sepsis
causing isolates available for epidemiological studies. The annual reports of the
NRLBM show that the seasonal distribution of meningococcal disease finds its peak
in the first quarter of each calendar year. Also, the distribution of patients suffering
from meningococcal disease over the different age groups is not uniform;
age-specific incidences per 100,000 inhabitants in the age groups younger than 5 years
and between 15 and 19 years are substantially higher than in other age groups [21].
With the advent of molecular epidemiology new meningococcus variants have
been identified, amongst which the so-called Lineage III cluster, first defined in the
Netherlands. This hypervirulent B:4:P1.4 type cluster greatly increased in numbers in
disease cases, until it finally comprised over half of all Dutch meningococcal disease
isolates (figure 1 [21]). The emergence of this cluster halted in the mid-nineties, after
which a gradual decline of Lineage III meningococci was observed. Reasons for the
emergence and decline of this cluster remain obscure.
Figure 1. Meningococcal disease in the Netherlands over the last 40 years (only the main serogroups
are depicted). With hardly any serogroup A present, it is clear that serogroup B is the main cause of
meningococcal disease in the Netherlands, although recently the incidence of serogroup B is
diminishing. From 2001 onwards a substantial increase of serogroup C is visible, followed by a
decrease (kindly provided by dr. A. van der Ende, the Netherlands Reference Laboratory for Bacterial
Meningitis, Academic Medical Center, Amsterdam).
Also visible in figure 1 is a steep increase in the number of cases of
meningococcal disease in the beginning of 2002 in the Netherlands, primarily caused
by N. meningitidis serogroup C [30]. This led the public health authorities to start a
massive nationwide vaccination campaign by mid-2002; 3 million children aged
between 12 months and 19 years were vaccinated in the first year after the campaign
was started. Since then, an N. meningitidis serogroup C vaccine, based on a
conjugate of the polysaccharide capsule, has been included in the Dutch vaccination
scheme. This policy almost immediately led to a sharp decrease of meningococcal
disease caused by the serogroup C variant [31], which was predicted by the study on
meningococcal carriage performed in the UK [32]. However, N. meningitidis
serogroup B still causes many cases of meningococcal infection per year in the
Netherlands. Its capsule, consisting of polysaccharides resembling those on host
cells, is poorly immunogenic [33, 34]. This renders the development of a vaccine very
difficult; hence no commercial vaccine is currently available against this serogroup.
However, alternative epitopes, such as outer-membrane proteins, are being sought for
broad-coverage meningococcal vaccines (Pizza et al., 2000).
Interestingly, studies by Caugant and co-workers showed that among N.
meningitidis carrier isolates few meningococci of disease-causing genotypes are
found [11, 35], which emphasises the importance of epidemiological studies. The
related non-pathogenic Neisseria lactamica [36] has been suggested to be a human
coloniser causing natural immunity against the meningococcus [37, 38], and
potentially, non-pathogenic N. meningitidis variants may cause natural immunity as
well. Braun and co-workers reported the occurrence of cross-reactive epitopes that
are shared by N. meningitidis and N. lactamica [39], and Gorringe and co-workers
described initial research on an N. lactamica-based vaccine [40]. Another longitudinal
study showed high N. lactamica carriage rates among infants, which was interpreted
as support for the aforementioned natural immunity hypothesis [41]. Since vaccines
based on one or a few protein epitopes still lack sufficient coverage due to variation,
alternative strategies for vaccine development, such as using commensal Neisseriae,
are of interest.
1.2.4 Pathogenesis of meningococcal disease
Relatively little is known about the actual pathogenesis process of
meningococcal disease, as this process exclusively takes place in humans.
Moreover, this process usually occurs very suddenly and dramatically. However,
many different putative virulence-associated factors have been described, including
adhesion factors such as pili [42-45], the immunoglobulin protease IgA1 [46], the
putative RTX toxins Frp [47-49], the capsular polysaccharide [50], outer membrane
proteins Opa and Opc [51], and lipopolysaccharide (LPS or endotoxin) [52]. More
recently, the formation of biofilms has been studied [53]. Most in vivo studies
concerning invasive meningococcal disease have been performed with animal
models [54]. These animal models are supposed to simulate the infection process in
humans, but often these models require intraperitoneal injection of the virulent or
attenuated bacteria [55], which is a poor imitation of the actual infection route in
humans. However, intranasal immunisation studies in murine models, simulating a
more comparable route of infection, have been performed as well [56, 57], and the
recent introduction of transgenic mice that express human epitopes indicates
improvements in meningococcal infection models [58].
Next to animal models, some in vitro experiments have been conducted with
human tissue [51], but epidemiological data, sequence comparisons and analogies
with gonorrhoea can also give information about potential pathogenicity factors of the
meningococcus [59-61]. Interestingly, with recently developed population dynamics
models, it was shown that outbreaks of meningococcal disease are caused by
diversity in the pathogenicity of meningococcal strains [62]. This is an incentive for
performing genome comparisons between carrier strains of N. meningitidis and
clinical isolates, in order to identify bacterial genetic factors underlying invasiveness.
Our limited understanding of the factors that play an obligatory role in the
pathogenesis of invasive disease is highlighted by the recent description of a single
case of invasive disease caused by an unencapsulated strain in a presumably
immunocompetent patient [63]. This contrasts with the general assumption that the
presence of a capsule is pivotal to the virulence of N. meningitidis. Also, strains
lacking PorA, the major outer membrane porin that is currently being considered and
tested as a potential broad-range vaccine-candidate [64], have been found to be able
to cause invasive disease [65], questioning the practicability of such single protein-
based vaccines.
Although meningococcal disease in humans is hard to study directly, the
consequences of meningococcaemia are easily observed, and different studies
showed the devastating effects. In addition to a case fatality rate of approximately 5%
for meningitis, and up to 40% for meningococcal septicaemia [66], the sequelae of
the disease in survivors include deafness, loss of limbs and mental retardation [67].
This imposes a high burden for a prolonged time, due to the often relatively young
age of the patient. Koomen and colleagues have recently presented a number of
studies in which young patients that survived bacterial meningitis were tested for
several years for mental sequelae [68-70]. Among other findings, after
meningitis, children were more likely than ‘controls’ to underachieve at school [69].
Although the precise disease process is still largely unknown, a number of
predisposing factors for the human host have been determined. As pathogenesis is a
complex interplay between host and pathogen, the role of the host should not be
underestimated. Host risk factors include (passive) smoking [71], and reduced
immunocompetence [72]. Recently, a higher attack rate of meningococcal disease
has been observed in children with a pregnant mother [73], but the cause of this
is still unknown. Also, genetic polymorphisms among components of the diverse
cascades of the immune system are involved in heightened susceptibility for
meningitis [74]. This clearly shows that factors beyond the intrinsic pathogenic
potential of the microbe are important for disease to occur.
1.2.5 Phase variation and antigenic variation
Meningococci employ various strategies to evade the human immune system.
One of these strategies is the variation in gene expression levels via length
differences in simple sequence repeats, termed phase variation [75]. Differences in
the length of these sequence repeats, both homopolymeric tracts and other simple
repeats, which may occur in coding and/or promoter regions, are presumably the
result of slipped-strand mispairing during replication and can thereby influence gene
expression levels. Different studies suggested that various loci are responsible for
phase variation frequencies in various genes, such as transformation-associated
genes [76], pili and iron transport genes [77], but also genome maintenance genes
(such as mutS) may be involved [78].
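The effect of a length change in a homopolymeric tract within a coding region can be illustrated with a toy sequence. The sketch below only covers the coding-region case, in which gaining or losing a number of bases that is not a multiple of three shifts the reading frame; repeats in promoter regions, which modulate rather than abolish expression, are not modelled. The function names, the minimum tract length of six and the example gene are assumptions made for illustration.

```python
import re
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def homopolymeric_tracts(seq: str, min_len: int = 6) -> List[Tuple[int, str, int]]:
    """Return (start, base, length) for every run of a single base >= min_len."""
    pattern = r"([ACGT])\1{%d,}" % (min_len - 1)
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in re.finditer(pattern, seq.upper())]


def slip_tract(seq: str, start: int, length: int, delta: int) -> str:
    """Lengthen (delta > 0) or shorten (delta < 0) the tract found at 'start'."""
    if length + delta < 0:
        raise ValueError("tract cannot have negative length")
    base = seq[start]
    return seq[:start] + base * (length + delta) + seq[start + length:]


def is_intact_orf(seq: str) -> bool:
    """True if seq starts with ATG, is a multiple of 3 long, ends with a stop
    codon and contains no earlier in-frame stop codon."""
    if len(seq) % 3 != 0 or not seq.startswith("ATG"):
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[1:-1])


# Toy gene (hypothetical): ATG GCA CCC CCC CCC GAT TAA, with a tract of nine C's.
gene_on = "ATGGCACCCCCCCCCGATTAA"
start, base, length = homopolymeric_tracts(gene_on)[0]
gene_off = slip_tract(gene_on, start, length, -1)   # loss of one C: frameshift
print(is_intact_orf(gene_on), is_intact_orf(gene_off))
```

In this simplified picture a change of one or two bases switches the toy gene "off", while a change of three bases restores the frame and merely alters the length of the encoded product.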
This phase variation strategy gives the bacterium versatility in the expression of
genes including, but not restricted to, immunologically important surface structures
such as pili, outer membrane proteins or enzymes involved in capsular
polysaccharide biosynthesis, as well as adhesin encoding genes such as opC [79]. It
may also provide adaptation to different environmental niches, during colonisation
and potentially also during the dissemination process. Whole genome analysis of the
N. meningitidis MC58 genome sequence identified over 65 phase-variable genes
[80], whereas the comparative analysis of two additional N. meningitidis genome
sequences revealed a repertoire of over 100 putative phase variable genes [81].
These studies showed that the meningococcal genome sequence contains the most
extensive repertoire of phase variable genes described to date.
A different strategy employed by the meningococcus is called antigenic
variation. This strategy exploits differences in variants of a single surface component,
such as the outer membrane porin PorA, which displays not only variation in
expression levels via phase variation, but also variation in sequence composition,
resulting in antigenically different PorA proteins. Antigenic variation can result from
transformation-mediated recombination, point mutations or replacement of different
alleles present in the genome sequence itself, as has been observed for pilin
structures [82, 83]. Combined, phase variation and antigenic variation make the
meningococcus a highly versatile bacterium.
1.3 Genomics & Bioinformatics
Since the publication of the first complete genome sequence of a free living
organism in 1995, Haemophilus influenzae [84], over 231 prokaryotic and 33
eukaryotic genome sequences have been annotated, and over 1000 genome
sequencing projects are still ongoing (www.genomesonline.org, [85, 86]). This
enormous amount of data has been amassed in order to mine for better vaccine
candidates [87, 88], scan for virulence evolution [89], to fuel industrial interest in
probiotics [90], or to examine adaptation strategies to extreme conditions [91, 92], to
give only a few examples. The application area of prokaryotic genome sequencing
projects is nonetheless biased towards biomedical research and industrial purposes,
as 52% and 47% of the sequenced genomes, respectively, were selected for their
relevance in these fields (www.genomesonline.org). Together, this results in a poor
representation of biological diversity [93] and this bias in turn may have
consequences for the interpretation of certain types of analyses (e.g. species
diversity estimates due to sampling bias, protein function predictions).
With genome sequencing being highly automated, large scale projects such
as community sequencing are feasible and have been conducted [94, 95]. These
metagenome projects analyse all (microbial) DNA present in a chosen biotope (an
acid mine drainage pool in [94] and a seawater sample of the Sargasso Sea in [95]),
including the DNA of uncultivable microbes, which are still thought to make up most
of microbial life on earth (reviewed by [96]). This approach has proven to be of great
value in estimating oceanic microbial diversity [95] and intraspecies genetic diversity
and metabolic potential [94]. Also, datasets from these metagenomic projects can
instigate new data mining schemes, such as the search for the selenoproteome
in the Sargasso Sea environmental genome project [93]. Recently, metagenomic
analyses by Tringe and co-workers have shown that vast amounts of sequences are
needed to yield a complete genome of the predominant species in biologically
complex populations [97]. These authors did however find environment-specific
genes, which allow for habitat-specific fingerprinting.
Furthermore, the release of a great many microbial genome sequences
allowed a critical look at species definition and taxonomy, which until recently was
based solely on phenotypical, morphological and limited genotypical data ([98],
recently reviewed by [99]). Coenye and co-workers suggest a number of approaches
for assessing taxonomic relationships [99]. Genomes can be compared according to
their gene content and gene order (synteny), as well as their nucleotide composition,
such as GC-content and dinucleotide frequencies or genome signature comparisons.
These approaches were compared in a study comprising the lactic acid bacteria as a
test case, and it was concluded that the different whole genome approaches that
were used yielded very similar classification results [100].
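The nucleotide composition measures mentioned above can be made concrete with a small example. The sketch below computes the GC content and a dinucleotide relative abundance profile, rho(xy) = f(xy) / (f(x) f(y)), and compares two sequences by the mean absolute difference of their profiles. It is an illustration under simplifying assumptions, not the procedure of any particular study: frequencies are taken from the given strand only (published genome signature analyses commonly also include the reverse complement), sequences are assumed to contain only A, C, G and T, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict

BASES = "ACGT"


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def dinucleotide_signature(seq: str) -> Dict[str, float]:
    """Relative abundance f(xy) / (f(x) * f(y)) for all 16 dinucleotides.

    Dinucleotides containing a base that is absent from the sequence get 0.0.
    """
    seq = seq.upper()
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_di = len(seq), len(seq) - 1
    signature = {}
    for x, y in product(BASES, repeat=2):
        fx, fy = mono[x] / n_mono, mono[y] / n_mono
        fxy = di[x + y] / n_di
        signature[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return signature


def signature_distance(seq_a: str, seq_b: str) -> float:
    """Mean absolute difference between two dinucleotide signatures."""
    sig_a, sig_b = dinucleotide_signature(seq_a), dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16


# Toy sequences (hypothetical); real comparisons use much longer sequences.
print(gc_content("GGCCAATT"))
print(signature_distance("ACGTACGTACGT", "AACCGGTTAACC"))
```

Short sequences give noisy frequency estimates, so in practice such comparisons are only meaningful for fragments of considerable length.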
The release of these large amounts of data has fueled the information
technology field considerably, as new and optimized computational approaches were
necessary to manage these complex and extensive databases [101]. An outstanding
example of the enormity of these databases and the potential of bioinformatics was
the finding, by serendipity, of almost entire genome sequences of new
endosymbionts in the raw sequence traces of various Drosophila genome-
sequencing projects [102].
The discipline of bioinformatics could be regarded as a science or a facilitative
technology platform [103]. On the one hand, numerous applications have been
developed that allow users to scan data for motifs, for instance promoter sequences
or genome signatures (which are species specific oligonucleotide frequencies
observed in whole genome sequences). Other applications have also been
developed, such as phylogenomic tree builders, potential protein interaction partner
search tools, models for operon prediction in whole genomes, and visualisation
software such as Artemis, Bugview and Plasmapper [104-107], which are mostly aimed
at facilitating research. These applications are described, amongst others, in
specialised web issues of renowned journals. On the other hand, bioinformaticians
may directly develop hypotheses concerning biological phenomena, such as the turn-
over of gene content in Proteobacteria and Archaea [108], the assessment of
functional modules by the determination of pairwise protein interaction [109] or the
origins of gene repertoires in prokaryotes [110].
Large-scale databases have also found their place in specialised sections of
scientific journals. As a response to the maintenance of these large datasets, the
American Society for Microbiology (ASM) recently published a colloquium report
regarding maintenance and improvement of extensive genome sequencing
databases, as this is often neglected due to lack of funds and scientific merit [111].
1.4 Genome composition and evolution
In the process of genome evolution, three major forces play an important role:
gene genesis (e.g. via horizontal acquisition of DNA), gene loss, and genomic
rearrangements (figure 2) ([112] and reviewed by [113]).
The relation between gene content and genome size has been studied by
Konstantinidis and Tiedje [114]. They suggest that prokaryotic species with large
genome sizes have a more extensive metabolic potential, enabling survival in
(different) environments where resources are scarce. The main contributors to
genome expansion are acquisition of DNA and duplication events. As for gene
acquisition or genesis, Daubin and Ochman suggest that a substantial part of new
(small) genes is acquired via horizontal gene transfer from bacteriophages [115,
116], whereas other acquisition events involve much larger gene clusters such as
Genomic Islands (GIs) and Pathogenicity-Associated Islands (PAIs) [117]. The
origins of these large gene clusters remain obscure, which may be explained by the
relatively small number of different species that have been sequenced.
However, although genomes do incorporate new DNA, they do not grow ever
larger. Cellular processes that eliminate (excess) DNA from the genome must be
present. The existence of species with small genomes, often endosymbionts, is
suggestive of genome evolution leading to niche-specific organisms by the loss of
many regulatory functions [118, 119]. These small-genome organisms are thought to
be derived from larger genome-sized species [120], and may represent an illustration
of the process of genome reduction.
Figure 2. Depicted are the processes involved in genome size evolution in bacteria. Genome
expansion takes place by duplication and acquisition events, whereas genome reduction is mainly
driven by deletion, either by direct deletion or slow erosion via gene inactivation and subsequent
deletion (adapted from [121]).
A fine example of recent massive gene decay is found in the genome
sequence of the leprosy bacillus Mycobacterium leprae [122]. The bacterium is
strictly intracellular, but it has a relatively large genome size (3.3 Mb). It has,
presumably relatively recently, undergone massive reduction in genome size
resulting in many pseudogenes, thought to result from extensive recombination.
Pseudogenes are inactivated genes of which the remnants are still present, and
recent analyses of pseudogene content across diverse prokaryote genomes indicate
that pseudogenes are formed and eliminated rapidly from genome sequences [123].
In M. leprae the total number of predicted functional genes is around 1,600
(compared to almost 4,000 in the related M. tuberculosis with a similar genome size),
which is indicative of a genome in flux.
Genomic rearrangements were first visualized when two complete
genome sequences of the same species were determined and compared [124]. With
the exception of operons, in which functionally related genes are of necessity in
close proximity to each other, the order of genes is often poorly conserved among
bacteria during evolution [125]. This genomic flexibility may play a role in
the genesis of new genes (such as genomic region duplication events), in the
removal of unnecessary sequences, or in the alteration of transcription levels. Comparisons
of whole genome sequences are best depicted in whole genome alignment graphs,
as described by Eisen and co-workers [126]. The conservation of gene order is
expressed with the synteny parameter, which is used when quantifying genome
sequence similarity.
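How conservation of gene order might be quantified can be sketched with a simple adjacency measure. The text does not define the synteny parameter precisely, so the measure below (the fraction of neighbouring gene pairs in one genome that are also neighbours in the other, ignoring orientation) is an assumption chosen for illustration, as are the function names and the toy gene orders. Gene identifiers are assumed to be unique within a genome.

```python
from typing import FrozenSet, List, Set


def adjacent_pairs(order: List[str], circular: bool = True) -> Set[FrozenSet[str]]:
    """Unordered pairs of neighbouring genes in a gene order."""
    n = len(order)
    last = n if circular else n - 1
    return {frozenset((order[i], order[(i + 1) % n])) for i in range(last)}


def conserved_adjacency(order_a: List[str], order_b: List[str],
                        circular: bool = True) -> float:
    """Fraction of adjacencies in genome A that are also present in genome B.

    Only genes present in both genomes are considered.
    """
    shared = set(order_a) & set(order_b)
    a = [g for g in order_a if g in shared]
    b = [g for g in order_b if g in shared]
    if len(a) < 2:
        return 0.0
    pairs_a = adjacent_pairs(a, circular)
    pairs_b = adjacent_pairs(b, circular)
    return len(pairs_a & pairs_b) / len(pairs_a)


# Toy circular genomes (hypothetical): genome B carries an inversion of g3..g5.
genome_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
genome_b = ["g1", "g2", "g5", "g4", "g3", "g6"]
print(conserved_adjacency(genome_a, genome_b))
```

A single inversion breaks only the two adjacencies at its boundaries, which is why the measure stays relatively high for the toy example even though three genes have changed position.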
1.5 Horizontal gene transfer
Even before whole genome sequences were available, horizontal gene
transfer (HGT) was recognized and acknowledged as a factor contributing to
prokaryotic evolution, although the impact of HGT was not fully appreciated until the
genomic era was well underway. One of the best-known examples of HGT
is the spread of R-plasmids, which encode resistance against certain (types of) antibiotics
[127].
Horizontal gene transfer, sometimes referred to as lateral gene transfer,
constitutes an alternative to the orthodox vertical inheritance of genetic traits. There
are three distinct routes which can lead to horizontal acquisition of DNA: 1) direct
uptake of DNA by the cell (transformation), 2) directed DNA transfer via conjugation
and 3) bacteriophage-mediated DNA transfer (transduction) (see figure 3). These
three different routes of horizontal acquisition of DNA have all been studied
extensively; nowadays they constitute basic tools in molecular biology.
The first genome-scale analysis of HGT was performed by Lawrence and
Ochman [128], who found that approximately 18% of the Escherichia coli genome
has been acquired via horizontal gene transfer. This study of a single genome
sequence opened the door to further analyses, which seem to increase in accuracy
with the availability of more genome sequences (and more readily available
phylogenetic data) and of more parameters for identification procedures.
Although no actual natural transfer events have been witnessed, the results of
HGT can be recognized in different ways. The most obvious is the phylogenetic
approach, in which incongruence in evolutionary relationships between different gene
clusters hints at transfer events. Molecular phylogenetic analyses were originally
performed using nucleotide sequences present in all organisms: those of ribosomal
RNA [129]. Nowadays, with the availability of many different complete genome
sequences, weighted genome trees can be constructed [130], and discrepancies in
these analyses are often explained most parsimoniously by a horizontal transfer
event at one of the branches, rather than by a large number of deletion events
at a great many more branches of the phylogenetic tree.
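The parsimony argument can be made concrete with a small counting exercise. The Python sketch below assumes, purely for illustration, that a gene was present in the common ancestor of all taxa in a tree and can only be lost, and counts the minimum number of independent losses needed to explain its present distribution. That number can then be compared with the single event required by a transfer into one branch. The tree, the taxon names and the function `count_losses` are hypothetical.

```python
def count_losses(tree, carriers):
    """Minimum number of independent gene losses needed to explain the
    distribution of a gene, assuming it was present in the root ancestor
    and can only be lost (never gained).

    tree     -- nested tuples of leaf names, e.g. (("A", "B"), "C")
    carriers -- set of leaf names in which the gene is found
    Returns (gene_present_in_clade, losses_within_clade).
    """
    if isinstance(tree, str):  # leaf
        return tree in carriers, 0
    results = [count_losses(child, carriers) for child in tree]
    if not any(present for present, _ in results):
        # Whole clade lacks the gene: one loss further up can explain it.
        return False, 0
    losses = sum(n for _, n in results)
    # Each gene-free child clade needs one loss on the branch leading to it.
    losses += sum(1 for present, _ in results if not present)
    return True, losses


# Hypothetical tree of eight taxa; the gene is found only in taxon A.
tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
_, losses = count_losses(tree, {"A"})
```

For this tree the loss-only scenario needs several independent deletion events, whereas a single acquisition on the branch leading to A explains the same pattern. The gap widens as the tree grows.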
The second approach for the detection of acquisition events consists of
parametric analyses, and is more enthusiastically embraced by bioinformaticians, as
it is relatively easily implemented in software. Different parameters have been
proposed to detect horizontally acquired sequences, and the best-known
approaches are based on deviations in codon usage, GC percentage and dinucleotide
frequency (the latter also called the genome signature) [131]. Parametric
identification of horizontally transferred DNA is based on the genome hypothesis,
which proposes that for a given prokaryotic genus genomic DNA is relatively constant
in codon usage and GC content [132, 133]. Therefore, horizontally acquired
sequences may differ in codon usage and/or GC composition from the recipient
genome and can be identified in whole genome sequences. Improvements of these
approaches that permit better resolution have been published recently [134, 135].
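Two of the parameters mentioned above can be sketched in a few lines of Python. The example computes the GC fraction of a sequence and a dinucleotide-based genome signature using a commonly used formulation, the relative abundance rho_xy = f_xy / (f_x * f_y), together with the mean absolute difference between two signatures. This is a minimal sketch under those assumptions; the function names are illustrative and it does not reproduce any specific published implementation.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / total if total else 0.0


def genome_signature(seq):
    """Dinucleotide relative abundances rho_xy = f_xy / (f_x * f_y),
    computed over the sequence plus its reverse complement so that the
    result is strand-independent."""
    seq = seq.upper()
    rev = seq.translate(COMPLEMENT)[::-1]
    mono, di = Counter(), Counter()
    for strand in (seq, rev):
        mono.update(b for b in strand if b in "ACGT")
        di.update(
            x + y for x, y in zip(strand, strand[1:])
            if x in "ACGT" and y in "ACGT"
        )
    n_mono, n_di = sum(mono.values()), sum(di.values())
    signature = {}
    for x, y in product("ACGT", repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
        observed = di[x + y] / n_di if n_di else 0.0
        signature[x + y] = observed / expected if expected else 0.0
    return signature


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two genome signatures."""
    sig_a, sig_b = genome_signature(seq_a), genome_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16
```

A candidate sequence would be compared with the signature of its host genome: a small distance suggests a native-like composition, a large one an atypical composition. Short sequences give noisy estimates, so any interpretation depends on sequence length.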
Currently, with a great many genome sequences tested for putative horizontally
acquired genes, the emphasis has shifted toward functional analyses of horizontally
transferred sequences. Recent analyses of acquired genes suggest a bias towards
three functional categories: cell-surface, DNA-binding and pathogenicity-associated
[136]. The observed bias in functional categories may, however, be the result of the
aforementioned disproportionate availability of genome sequences of biomedically and
industrially relevant strains, as compared to other strains (www.genomesonline.org).
Now, with more genome sequences of environmental strains rapidly becoming
available, an increasing variety of acquired gene clusters providing diverse metabolic
capacities is being discovered, emphasising that horizontal genetic transfer is not
limited to virulence traits [117], but may also provide novel catabolic pathways
involved in, for example, xenobiotic degradation [91]. It should be noted, however, that the
parametric approach to the identification of compositionally dissimilar sequences
might not be sufficient to identify all horizontally acquired sequences. Exchange of
DNA between closely related species can lead to the acquisition of non-anomalous
sequences [137]. On the other hand, autochthonous (native) sequences can sometimes be
very different from the rest of the genome sequence [131], for instance because they are highly
expressed or have distinct features such as a strong bias for certain amino acids; ribosomal
RNA sequences are another example. Finally, there are genomic sequences that are
suspected to have been acquired via HGT, but to which no definite history or origin can be assigned.
This is a further drawback of parametric detection of anomalous sequences in
genome sequences: no clear cut-off value for the number of putative horizontally
acquired genes can be given without further phylogenetic support.
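The dependence on a cut-off can be illustrated with a sliding-window GC scan. The Python sketch below flags windows whose GC fraction deviates from the mean of all windows by more than a chosen number of standard deviations. The window size, step and `z_cutoff` are arbitrary illustrative values, the function names are hypothetical, and the test genome is synthetic.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G and C in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def atypical_windows(genome, window=1000, step=500, z_cutoff=2.0):
    """Return (start, gc, z) for windows whose GC fraction deviates from
    the mean over all windows by more than z_cutoff standard deviations.

    window, step and z_cutoff are arbitrary illustrative defaults.
    """
    starts = range(0, max(len(genome) - window, 0) + 1, step)
    values = [(s, gc_fraction(genome[s:s + window])) for s in starts]
    gcs = [gc for _, gc in values]
    mu, sigma = mean(gcs), pstdev(gcs)
    if sigma == 0:
        return []
    return [
        (s, gc, (gc - mu) / sigma)
        for s, gc in values
        if abs(gc - mu) / sigma > z_cutoff
    ]


# Synthetic genome: a 50% GC background with a 1 kb AT-only insert.
host = "ACGT" * 2500
genome = host[:4000] + "AT" * 500 + host[4000:]
strict = atypical_windows(genome, z_cutoff=2.0)
lenient = atypical_windows(genome, z_cutoff=1.5)
```

With the stricter cut-off only the window lying entirely within the insert is flagged; the more lenient value also flags the two windows that overlap its borders. The number of "anomalous" regions therefore depends directly on a threshold that the composition data alone cannot justify.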
Besides whole genome sequencing, several other techniques are available to
selectively isolate putative horizontally acquired sequences. These include
subtractive hybridization [138] and representational difference analysis [138, 139],
two techniques with which sequences that differ between two related strains are cloned and
analysed. These approaches rely on the lack of hybridisation of sequences
unique to one of the two strains with DNA of the other strain. However, no dedicated tool is available to score
individual sequences isolated with these techniques for composition dissimilarities
compared to a genome sequence, although for many of these putative horizontally
transferred sequences a representative genome sequence (i.e. a genomic context) is
available. Currently, applications that can test whether a nucleotide sequence is
atypical within a genomic context, and therefore putatively horizontally acquired,
rely solely on whole genome sequences, and disregard all other sequences
available in the databases [136, 140-143].
Obviously, horizontal gene transfer does not create ever-larger prokaryotic
genome sequences. Constraints on HGT must therefore exist that limit the amount
of acquired sequence that can be stably introduced into host genomes. For transfer
routes such as conjugation or transduction, a limited host range of the transferring
agents may restrict the horizontal spread of genes. Lawrence and Hendrickson
propose a different potential constraint on HGT [144], based on specific
oligonucleotide motifs, distributed asymmetrically between the two DNA strands,
which may be involved in genome replication processes. The differential distribution
of such specific oligonucleotide motifs between different species of microbes may
limit sequence exchange, as the introduction of foreign DNA would incur a selective
disadvantage that could potentially offset any benefits provided by the newly acquired
gene products [144]. How this would relate to large-scale genome rearrangements is
still unknown.
Incompatibility may play a significant role in HGT. As sequences are thought
to function optimally in the genomes where they originally belong, sequence
establishment in a new host may not always result in compatibility of the encoded
proteins with the host (intra- and extracellular) environment. Also, most proteins
interact with and/or are dependent on other proteins in complex systems, and the
absence of similar systems in a new host may restrict functionality of acquired
sequences [145]. As a result, it is thought that sequences are removed quickly from
the genome when they are not beneficial to the host [121].
Finally, restriction modification systems may constrain HGT
in a very different way. The defence hypothesis postulates that these
systems play a role in maintaining species identity [146]. Restriction modification
systems have two main activities: that of a restriction endonuclease and that of a
methylase. Both recognize the same nucleotide motif. The endonuclease activity
cleaves unmethylated DNA, which usually consists of acquired DNA that has not yet
been methylated. Genomic DNA is safe from the harmful activity of the
endonuclease, as the methylase protects all recognition sequences. However, the
mobile nature of these restriction modification systems calls into question their supposed
role in protecting species identity [147].
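The interplay between the two activities can be sketched as a toy model in Python. The example uses the EcoRI recognition motif GAATTC, which is cut after the first G, and treats methylation simply as a set of protected site positions. Only one strand is modelled, and the function `digest` and the test sequence are hypothetical illustrations rather than a description of any real tool.

```python
def digest(dna, motif="GAATTC", cut_offset=1, methylated_sites=frozenset()):
    """Cut dna at every occurrence of motif that is not methylated.

    cut_offset       -- position within the motif after which the strand
                        is cut (1 for EcoRI, which cuts G^AATTC)
    methylated_sites -- start positions of motif copies protected by the
                        methylase; these are left intact
    Only one strand is modelled in this simplified sketch.
    """
    fragments, last_cut = [], 0
    pos = dna.find(motif)
    while pos != -1:
        if pos not in methylated_sites:
            cut = pos + cut_offset
            fragments.append(dna[last_cut:cut])
            last_cut = cut
        pos = dna.find(motif, pos + 1)
    fragments.append(dna[last_cut:])
    return fragments


# The same sequence as unmethylated incoming DNA and as protected host DNA.
sequence = "TTGAATTCAAGGAATTCTT"
incoming = digest(sequence)
protected = digest(sequence, methylated_sites={2, 11})
```

Unmethylated incoming DNA is cut at both recognition sites, whereas the fully methylated copy remains a single intact molecule.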
Figure 3. Three different routes for horizontal gene transfer: transformation (uptake of naked DNA),
conjugation (plasmid-directed DNA transfer) and transduction (bacteriophage-mediated transfer of
DNA).
1.6 Outline of this thesis
The aim of this introduction was to outline the framework in which this thesis
should be placed, as well as to formulate the research questions of this thesis. The
initial aim of the project was to identify neisserial sequences that are responsible for
the hypervirulent nature of certain N. meningitidis genotypes. However, the aim
gradually shifted towards analyses of horizontal gene fluxes amongst prokaryotes in
general, and acquired gene clusters in Neisseriae in particular. In other words, how
can horizontally transferred sequences be isolated without strain-by-strain
comparisons? Next, how can individual sequences, suspected to be acquired via
HGT, be placed within a genomic context? Furthermore, we aimed to analyse
horizontal gene fluxes both intragenomically (with genomic islands) and between
species (with plasmids). The first step would be the development of an in vitro
strategy to isolate horizontally acquired sequences from Neisseriae. Different
techniques exist to identify anomalous sequences in silico in completely sequenced
genomes. However, any sequenced genome is merely a representative of a
particular species and it is therefore difficult to extrapolate to unsequenced isolates of
the same species. Bioinformatical approaches have facilitated horizontal gene
transfer identification procedures in whole genome sequences as well as the
mapping of horizontal gene transfer processes. There is a need to implement in vitro
approaches to detect DNA acquisition events in order to unravel this important
contributor to bacterial evolution and diversification.
Chapter two deals with a novel in vitro strategy to selectively isolate
anomalous sequences from unsequenced prokaryotic genomes. The availability of
multiple genome sequences allows comparative genomics to test new strategies to
isolate putative acquired sequences, without performing hybridizations. In this new
strategy we focus on compositional attributes such as the genome signature, instead
of specific sequences potentially involved in DNA transfer processes such as tRNA
synthetase encoding genes. The availability of representative genome sequences is
a prerequisite for this strategy, and fortunately the large number of ongoing whole
genome sequencing projects safeguards a steady increase of such representative
genome sequences, which could expand our technique to different prokaryotic
genera.
Chapter three describes the development of, and gives instructions for, a
web application that permits nucleotide composition analyses of individual sequences
in comparison to a suitable genomic context. The bioinformatic background of
chapter two is explained extensively in this chapter.
We hypothesized that different large putative horizontally acquired sequences
(i.e. Genomic Islands) within the same genome sequence may be compared with
each other, as it is reasonable to suggest that a single donor was responsible for
multiple transfer events. A comparison of Genomic Islands is described in chapter
four, and an application that allows users to investigate the acquisition account of
given prokaryotes is described. This approach might facilitate donor identification of
putative horizontally acquired genes.
Chapter five comprises a study focussed on the anomalous DNA content of a
Neisseria lactamica strain. N. lactamica is a human commensal residing in the
nasopharynx. This species however does not cause disease, although it is closely
related to the meningococcus. The strategy developed in chapter two to specifically
isolate putative horizontally acquired sequences is applied to the unsequenced N.
lactamica, and the analysis of the detected anomalous sequences in this species is
described.
A typical and well-known prokaryotic mobile element is the plasmid. Chapter
six reports on a range of plasmids isolated from N. lactamica strains. Selfish genetic
elements such as plasmids form a common vehicle for horizontal gene transfer
processes. Not much is known about plasmid sequence diversity amongst different
neisserial species. The (compositional) analyses of neisserial plasmids may reveal
what sequences are being transferred via these mobile elements.
Based on some of the results from chapter six, we performed a database
analysis of a large number of prokaryotic plasmids, which is described in chapter
seven. Compositional comparisons of plasmid sequences and their respective host
chromosome sequences may confirm compositional compatibility. If compositional
incompatibility were detected, this would imply that different selection pressures are exerted
on genomic and epigenetic sequences.
Finally, chapter eight summarises the findings of this thesis and discusses
these further within the context of current developments in the field of molecular
microbiology.
References
1. Yazdankhah, S.P. and D.A. Caugant, Neisseria meningitidis: an overview of the carriage state. J Med Microbiol, 2004. 53(Pt 9): p. 821-32.
2. Sim, R.J., et al., Underestimation of meningococci in tonsillar tissue by nasopharyngeal swabbing. Lancet, 2000. 356(9242): p. 1653-4.
3. Weichselbaum, A., Ueber die aetiologie der akuten meningitis cerebro-spinalis. Fortschr Med, 1887. 5: p. 573-583.
4. Neisser, A., Ueber eine der Gonorrhoe eigentuemliche Micrococcusform. Centralb. Med. Wissenschaften, 1879. 28: p. 497-500.
5. Taha, M.K., et al., The duality of virulence and transmissibility in Neisseria meningitidis. Trends Microbiol, 2002. 10(8): p. 376-82.
6. Tzeng, Y.L. and D.S. Stephens, Epidemiology and pathogenesis of Neisseria meningitidis. Microbes Infect, 2000. 2(6): p. 687-700.
7. Branham, S.E., Serological relationships among meningococci. Bacteriol Rev, 1953. 17(3): p. 175-88.
8. Frasch, C.E., W.D. Zollinger, and J.T. Poolman, Serotype antigens of Neisseria meningitidis and a proposed scheme for designation of serotypes. Rev Infect Dis, 1985. 7(4): p. 504-10.
9. Zollinger, W.D. and R.E. Mandrell, Outer-membrane protein and lipopolysaccharide serotyping of Neisseria meningitidis by inhibition of a solid-phase radioimmunoassay. Infect Immun, 1977. 18(2): p. 424-33.
10. Mandrell, R.E. and W.D. Zollinger, Lipopolysaccharide serotyping of Neisseria meningitidis by hemagglutination inhibition. Infect Immun, 1977. 16(2): p. 471-5.
11. Caugant, D.A., et al., Multilocus genotypes determined by enzyme electrophoresis of Neisseria meningitidis isolated from patients with systemic disease and from healthy carriers. J Gen Microbiol, 1986. 132(3): p. 641-52.
12. Bygraves, J.A. and M.C. Maiden, Analysis of the clonal relationships between strains of Neisseria meningitidis by pulsed field gel electrophoresis. J Gen Microbiol, 1992. 138(3): p. 523-31.
13. Woods, J.P., et al., Use of arbitrarily primed polymerase chain reaction analysis to type disease and carrier strains of Neisseria meningitidis isolated during a university outbreak. J Infect Dis, 1994. 169(6): p. 1384-9.
14. Bart, A., et al., Randomly amplified polymorphic DNA genotyping of serogroup A meningococci yields results similar to those obtained by multilocus enzyme electrophoresis and reveals new genotypes. J Clin Microbiol, 1998. 36(6): p. 1746-9.
15. Maiden, M.C., et al., Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci U S A, 1998. 95(6): p. 3140-5.
16. Jolley, K.A., M.S. Chan, and M.C. Maiden, mlstdbNet - distributed multi-locus sequence typing (MLST) databases. BMC Bioinformatics, 2004. 5: p. 86.
17. Feil, E.J., et al., eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data. J Bacteriol, 2004. 186(5): p. 1518-30.
18. Spratt, B.G., et al., Displaying the relatedness among isolates of bacterial species -- the eBURST approach. FEMS Microbiol Lett, 2004. 241(2): p. 129-34.
19. Catlin, B.W., Transformation of Neisseria meningitidis by deoxyribonucleates from cells and from culture slime. J Bacteriol, 1960. 79: p. 579-90.
20. Hanage, W.P., C. Fraser, and B.G. Spratt, Fuzzy species among recombinogenic bacteria. BMC Biol, 2005. 3(1): p. 6.
21. Van der Ende, A., Spanjaard, L, Vandenbroucke-Grauls, C.M.J.E., Bacterial meningitis in the Netherlands; annual report 2003. 2003, Netherlands Reference Laboratory for Bacterial Meningitis (Academic Medical Center and the National Institute of Public Health and the Environment): Amsterdam.
22. Achtman, M., Epidemic spread and antigenic variability of Neisseria meningitidis. Trends Microbiol, 1995. 3(5): p. 186-92.
23. Hart, C.A. and L.E. Cuevas, Meningococcal disease in Africa. Ann Trop Med Parasitol, 1997. 91(7): p. 777-85.
24. Bovre, K., et al., Neisseria meningitidis infections in Northern Norway: an epidemic in 1974-1975 due mainly to group B organisms. J Infect Dis, 1977. 135(4): p. 669-72.
25. Martin, D.R., et al., New Zealand epidemic of meningococcal disease identified by a strain with phenotype B:4:P1.4. J Infect Dis, 1998. 177(2): p. 497-500.
26. Sexton, K., et al., The New Zealand Meningococcal Vaccine Strategy: a tailor-made vaccine to combat a devastating epidemic. N Z Med J, 2004. 117(1200): p. U1015.
27. Fine, A., Layton, M., Hakim, A., Smith, P., Serogroup W-135 meningococcal disease among travelers returning from Saudi Arabia--United States, 2000. MMWR Morb Mortal Wkly Rep, 2000. 49(16): p. 345-6.
28. Decosas, J. and J.B. Koama, Chronicle of an outbreak foretold: meningococcal meningitis W135 in Burkina Faso. Lancet Infect Dis, 2002. 2(12): p. 763-5.
29. Ahmad, K., Vaccination halts meningitis outbreak in Burkina Faso. Lancet, 2004. 363(9417): p. 1290.
30. Van der Ende, A., Spanjaard, L, Vandenbroucke-Grauls, C.M.J.E., Bacterial meningitis in the Netherlands; annual report 2002. 2002, Netherlands Reference Laboratory for Bacterial Meningitis (Academic Medical Center and the National Institute of Public Health and the Environment): Amsterdam.
31. de Greeff, S.C., et al., [The first effect of the national vaccination campaign against meningococcal-C disease: a rapid and sharp decrease in the number of patients]. Ned Tijdschr Geneeskd, 2003. 147(23): p. 1132-5.
32. Maiden, M.C. and J.M. Stuart, Carriage of serogroup C meningococci 1 year after meningococcal C conjugate polysaccharide vaccination. Lancet, 2002. 359(9320): p. 1829-31.
33. Finne, J., et al., An IgG monoclonal antibody to group B meningococci cross-reacts with developmentally regulated polysialic acid units of glycoproteins in neural and extraneural tissues. J Immunol, 1987. 138(12): p. 4402-7.
34. Moe, G.R., S. Tan, and D.M. Granoff, Molecular mimetics of polysaccharide epitopes as vaccine candidates for prevention of Neisseria meningitidis serogroup B disease. FEMS Immunol Med Microbiol, 1999. 26(3-4): p. 209-26.
35. Caugant, D.A., et al., Asymptomatic carriage of Neisseria meningitidis in a randomly sampled population. J Clin Microbiol, 1994. 32(2): p. 323-30.
36. Hollis, D.G., G.L. Wiggins, and R.E. Weaver, Neisseria lactamicus sp. n., a lactose-fermenting species resembling Neisseria meningitidis. Appl Microbiol, 1969. 17(1): p. 71-7.
37. Oliver, K.J., et al., Neisseria lactamica protects against experimental meningococcal infection. Infect Immun, 2002. 70(7): p. 3621-6.
38. Cartwright, K.A., et al., The Stonehouse survey: nasopharyngeal carriage of meningococci and Neisseria lactamica. Epidemiol Infect, 1987. 99(3): p. 591-601.
39. Braun, J.M., et al., Neisseria meningitidis, Neisseria lactamica and Moraxella catarrhalis share cross-reactive carbohydrate antigens. Vaccine, 2004. 22(7): p. 898-908.
40. Gorringe, A., et al., The development of a meningococcal disease vaccine based on Neisseria lactamica outer membrane vesicles. Vaccine, 2005. 23(17-18): p. 2210-3.
41. Bennett, J.S., et al., Genetic diversity and carriage dynamics of Neisseria lactamica in infants. Infect Immun, 2005. 73(4): p. 2424-32.
42. DeVoe, I.W. and J.E. Gilchrist, Pili on meningococci from primary cultures of nasopharyngeal carriers and cerebrospinal fluid of patients with acute disease. J Exp Med, 1975. 141(2): p. 297-305.
43. Virji, M., et al., The role of pili in the interactions of pathogenic Neisseria with cultured human endothelial cells. Mol Microbiol, 1991. 5(8): p. 1831-41.
44. Virji, M., et al., Variations in the expression of pili: the effect on adherence of Neisseria meningitidis to human epithelial and endothelial cells. Mol Microbiol, 1992. 6(10): p. 1271-9.
45. Nassif, X., et al., Antigenic variation of pilin regulates adhesion of Neisseria meningitidis to human epithelial cells. Mol Microbiol, 1993. 8(4): p. 719-25.
46. Lomholt, H., et al., Molecular polymorphism and epidemiology of Neisseria meningitidis immunoglobulin A1 proteases. Proc Natl Acad Sci U S A, 1992. 89(6): p. 2120-4.
47. Thompson, S.A., et al., Neisseria meningitidis produces iron-regulated proteins related to the RTX family of exoproteins. J Bacteriol, 1993. 175(3): p. 811-8.
48. Thompson, S.A., L.L. Wang, and P.F. Sparling, Cloning and nucleotide sequence of frpC, a second gene from Neisseria meningitidis encoding a protein similar to RTX cytotoxins. Mol Microbiol, 1993. 9(1): p. 85-96.
49. Thompson, S.A. and P.F. Sparling, The RTX cytotoxin-related FrpA protein of Neisseria meningitidis is secreted extracellularly by meningococci and by HlyBD+ Escherichia coli. Infect Immun, 1993. 61(7): p. 2906-11.
50. Swartley, J.S., et al., Capsule switching of Neisseria meningitidis. Proc Natl Acad Sci U S A, 1997. 94(1): p. 271-6.
51. de Vries, F.P., et al., Neisseria meningitidis producing the Opc adhesin binds epithelial cell proteoglycan receptors. Mol Microbiol, 1998. 27(6): p. 1203-12.
52. Brandtzaeg, P., et al., Neisseria meningitidis lipopolysaccharides in human pathology. J Endotoxin Res, 2001. 7(6): p. 401-20.
53. Yi, K., et al., Biofilm formation by Neisseria meningitidis. Infect Immun, 2004. 72(10): p. 6132-8.
54. Yi, K., D.S. Stephens, and I. Stojiljkovic, Development and evaluation of an improved mouse model of meningococcal colonization. Infect Immun, 2003. 71(4): p. 1849-55.
55. Gorringe, A.R., et al., Experimental disease models for the assessment of meningococcal vaccines. Vaccine, 2005. 23(17-18): p. 2214-7.
56. Mackinnon, F.G., et al., Demonstration of lipooligosaccharide immunotype and capsule as virulence factors for Neisseria meningitidis using an infant mouse intranasal infection model. Microb Pathog, 1993. 15(5): p. 359-66.
57. de Jonge, M.I., et al., Intranasal immunisation of mice with liposomes containing recombinant meningococcal OpaB and OpaJ proteins. Vaccine, 2004. 22(29-30): p. 4021-8.
58. Johansson, L., et al., CD46 in meningococcal disease. Science, 2003. 301(5631): p. 373-5.
59. Perrin, A., X. Nassif, and C. Tinsley, Identification of regions of the chromosome of Neisseria meningitidis and Neisseria gonorrhoeae which are specific to the pathogenic Neisseria species. Infect Immun, 1999. 67(11): p. 6119-29.
60. Klee, S.R., et al., Molecular and biological analysis of eight genetic islands that distinguish Neisseria meningitidis from the closely related pathogen Neisseria gonorrhoeae. Infect Immun, 2000. 68(4): p. 2082-95.
61. Tinsley, C.R. and X. Nassif, Analysis of the genetic differences between Neisseria meningitidis and Neisseria gonorrhoeae: two closely related bacteria expressing two different pathogenicities. Proc Natl Acad Sci U S A, 1996. 93(20): p. 11109-14.
62. Stollenwerk, N., M.C. Maiden, and V.A. Jansen, Diversity in pathogenicity can cause outbreaks of meningococcal disease. Proc Natl Acad Sci U S A, 2004. 101(27): p. 10229-34.
63. Hoang, L.M., et al., Rapid and fatal meningococcal disease due to a strain of Neisseria meningitidis containing the capsule null locus. Clin Infect Dis, 2005. 40(5): p. e38-42.
64. Peeters, C.C., et al., Phase I clinical trial with a hexavalent PorA containing meningococcal outer membrane vesicle vaccine. Vaccine, 1996. 14(10): p. 1009-15.
65. van der Ende, A., et al., Outbreak of meningococcal disease caused by PorA-deficient meningococci. J Infect Dis, 2003. 187(5): p. 869-71.
66. Rosenstein, N.E. and B.A. Perkins, Update on Haemophilus influenzae serotype b and meningococcal vaccines. Pediatr Clin North Am, 2000. 47(2): p. 337-52, vi.
67. Edwards, M.S. and C.J. Baker, Complications and sequelae of meningococcal infections in children. J Pediatr, 1981. 99(4): p. 540-5.
68. Koomen, I., et al., Neuropsychology of academic and behavioural limitations in school-age survivors of bacterial meningitis. Dev Med Child Neurol, 2004. 46(11): p. 724-32.
69. Koomen, I., et al., Parental perception of educational, behavioural and general health problems in school-age survivors of bacterial meningitis. Acta Paediatr, 2003. 92(2): p. 177-85.
70. Koomen, I., et al., Hearing loss at school age in survivors of bacterial meningitis: assessment, incidence, and prediction. Pediatrics, 2003. 112(5): p. 1049-53.
71. Stanwell-Smith, R.E., et al., Smoking, the environment and meningococcal disease: a case control study. Epidemiol Infect, 1994. 112(2): p. 315-28.
72. Figueroa, J., J. Andreoni, and P. Densen, Complement deficiency states and meningococcal disease. Immunol Res, 1993. 12(3): p. 295-311.
73. van Gils, E.J., et al., Increased attack rate of meningococcal disease in children with a pregnant mother. Pediatrics, 2005. 115(5): p. e590-3.
74. Emonts, M., et al., Host genetic determinants of Neisseria meningitidis infections. Lancet Infect Dis, 2003. 3(9): p. 565-77.
75. van der Ende, A., et al., Variable expression of class 1 outer membrane protein in Neisseria meningitidis is caused by variation in the spacing between the -10 and -35 regions of the promoter. J Bacteriol, 1995. 177(9): p. 2475-80.
76. Alexander, H.L., A.R. Richardson, and I. Stojiljkovic, Natural transformation and phase variation modulation in Neisseria meningitidis. Mol Microbiol, 2004. 52(3): p. 771-83.
77. Alexander, H.L., A.W. Rasmussen, and I. Stojiljkovic, Identification of Neisseria meningitidis genetic loci involved in the modulation of phase variation frequencies. Infect Immun, 2004. 72(11): p. 6743-7.
78. Martin, P., et al., Involvement of genes of genome maintenance in the regulation of phase variation frequencies in Neisseria meningitidis. Microbiology, 2004. 150(Pt 9): p. 3001-12.
79. Sarkari, J., et al., Variable expression of the Opc outer membrane protein in Neisseria meningitidis is caused by size variation of a promoter containing poly-cytidine. Mol Microbiol, 1994. 13(2): p. 207-17.
80. Saunders, N.J., et al., Repeat-associated phase variable genes in the complete genome sequence of Neisseria meningitidis strain MC58. Mol Microbiol, 2000. 37(1): p. 207-15.
81. Snyder, L.A., S.A. Butcher, and N.J. Saunders, Comparative whole-genome analyses reveal over 100 putative phase-variable genes in the pathogenic Neisseria spp. Microbiology, 2001. 147(Pt 8): p. 2321-32.
82. Seifert, H.S., et al., DNA transformation leads to pilin antigenic variation in Neisseria gonorrhoeae. Nature, 1988. 336(6197): p. 392-5.
83. Gibbs, C.P., et al., Reassortment of pilin genes in Neisseria gonorrhoeae occurs by two distinct mechanisms. Nature, 1989. 338(6217): p. 651-2.
84. Fleischmann, R.D., et al., Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science, 1995. 269(5223): p. 496-512.
85. Bernal, A., U. Ear, and N. Kyrpides, Genomes OnLine Database (GOLD): a monitor of genome projects world-wide. Nucleic Acids Res, 2001. 29(1): p. 126-7.
86. Kyrpides, N.C., Genomes OnLine Database (GOLD 1.0): a monitor of complete and ongoing genome projects world-wide. Bioinformatics, 1999. 15(9): p. 773-4.
87. Pizza, M., et al., Identification of vaccine candidates against serogroup B meningococcus by whole-genome sequencing. Science, 2000. 287(5459): p. 1816-20.
88. Tettelin, H., et al., Complete genome sequence of Neisseria meningitidis serogroup B strain MC58. Science, 2000. 287(5459): p. 1809-15.
89. Holden, M.T., et al., Complete genomes of two clinical Staphylococcus aureus strains: evidence for the rapid evolution of virulence and drug resistance. Proc Natl Acad Sci U S A, 2004. 101(26): p. 9786-91.
90. Altermann, E., et al., Complete genome sequence of the probiotic lactic acid bacterium Lactobacillus acidophilus NCFM. Proc Natl Acad Sci U S A, 2005. 102(11): p. 3906-12.
91. Springael, D. and E.M. Top, Horizontal gene transfer and microbial adaptation to xenobiotics: new types of mobile genetic elements and lessons from ecological studies. Trends Microbiol, 2004. 12(2): p. 53-8.
92. Futterer, O., et al., Genome sequence of Picrophilus torridus and its implications for life around pH 0. Proc Natl Acad Sci U S A, 2004. 101(24): p. 9091-6.
93. Zhang, Y., D.E. Fomenko, and V.N. Gladyshev, The microbial selenoproteome of the Sargasso Sea. Genome Biol, 2005. 6(4): p. R37.
94. Tyson, G.W., et al., Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature, 2004. 428(6978): p. 37-43.
95. Venter, J.C., et al., Environmental genome shotgun sequencing of the Sargasso Sea. Science, 2004. 304(5667): p. 66-74.
96. Hugenholtz, P., Exploring prokaryotic diversity in the genomic era. Genome Biol, 2002. 3(2): p. REVIEWS0003.
97. Tringe, S.G., et al., Comparative metagenomics of microbial communities. Science, 2005. 308(5721): p. 554-7.
98. Konstantinidis, K.T. and J.M. Tiedje, Genomic insights that advance the species definition for prokaryotes. Proc Natl Acad Sci U S A, 2005. 102(7): p. 2567-72.
99. Coenye, T., et al., Towards a prokaryotic genomic taxonomy. FEMS Microbiol Rev, 2005. 29(2): p. 147-67.
100. Coenye, T. and P. Vandamme, Extracting phylogenetic information from whole-genome sequencing projects: the lactic acid bacteria as a test case. Microbiology, 2003. 149(Pt 12): p. 3507-17.
101. Kanehisa, M. and P. Bork, Bioinformatics in the post-sequence era. Nat Genet, 2003. 33 Suppl: p. 305-10.
102. Salzberg, S.L., et al., Serendipitous discovery of Wolbachia genomes in multiple Drosophila species. Genome Biol, 2005. 6(3): p. R23.
103. Ouzounis, C., Bioinformatics and the theoretical foundations of molecular biology. Bioinformatics, 2002. 18(3): p. 377-8.
104. Price, M.N., et al., A novel method for accurate operon predictions in all sequenced prokaryotes. Nucleic Acids Res, 2005. 33(3): p. 880-92.
105. Rutherford, K., et al., Artemis: sequence visualization and annotation. Bioinformatics, 2000. 16(10): p. 944-5.
106. Leader, D.P., BugView: a browser for comparing genomes. Bioinformatics, 2004. 20(1): p. 129-30.
107. Dong, X., et al., PlasMapper: a web server for drawing and auto-annotating plasmid maps. Nucleic Acids Res, 2004. 32(Web Server issue): p. W660-4.
108. Snel, B., P. Bork, and M.A. Huynen, Genomes in flux: the evolution of archaeal and proteobacterial gene content. Genome Res, 2002. 12(1): p. 17-25.
109. Snel, B., P. Bork, and M.A. Huynen, The identification of functional modules from the genomic association of genes. Proc Natl Acad Sci U S A, 2002. 99(9): p. 5890-5.
110. Lerat, E., et al., Evolutionary origins of genomic repertoires in bacteria. PLoS Biol, 2005. 3(5): p. e130.
111. Roberts, R.J., Karp, P., Kasif, S., Linn, S. & Buckley, M. R., An Experimental Approach to Genome Annotation. 2004, American Society for Microbiology: Washington DC.
112. Kunin, V. and C.A. Ouzounis, The balance of driving forces during genome evolution in prokaryotes. Genome Res, 2003. 13(7): p. 1589-94.
113. Cohan, F.M., What are bacterial species? Annu Rev Microbiol, 2002. 56: p. 457-87.
114. Konstantinidis, K.T. and J.M. Tiedje, Trends between gene content and genome size in prokaryotic species with larger genomes. Proc Natl Acad Sci U S A, 2004. 101(9): p. 3160-5.
115. Daubin, V. and H. Ochman, Bacterial genomes as new gene homes: the genealogy of ORFans in E. coli. Genome Res, 2004. 14(6): p. 1036-42.
116. Daubin, V. and H. Ochman, Start-up entities in the origin of new genes. Curr Opin Genet Dev, 2004. 14(6): p. 616-9.
117. Dobrindt, U., et al., Genomic islands in pathogenic and environmental microorganisms. Nat Rev Microbiol, 2004. 2(5): p. 414-424.
118. Moran, N.A., Microbial minimalism: genome reduction in bacterial pathogens. Cell, 2002. 108(5): p. 583-6.
119. Moran, N.A., Tracing the evolution of gene loss in obligate bacterial symbionts. Curr Opin Microbiol, 2003. 6(5): p. 512-8.
120. Moran, N.A. and A. Mira, The process of genome shrinkage in the obligate symbiont Buchnera aphidicola. Genome Biol, 2001. 2(12): p. RESEARCH0054.
121. Mira, A., H. Ochman, and N.A. Moran, Deletional bias and the evolution of bacterial genomes. Trends Genet, 2001. 17(10): p. 589-96.
122. Cole, S.T., et al., Massive gene decay in the leprosy bacillus. Nature, 2001. 409(6823): p. 1007-11.
123. Lerat, E. and H. Ochman, Recognizing the pseudogenes in bacterial genomes. Nucleic Acids Res, 2005. 33(10): p. 3125-32.
124. Alm, R.A., et al., Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature, 1999. 397(6715): p. 176-80.
125. Tillier, E.R. and R.A. Collins, Genome rearrangement by replication-directed translocation. Nat Genet, 2000. 26(2): p. 195-7.
126. Eisen, J.A., et al., Evidence for symmetric chromosomal inversions around the replication origin in bacteria. Genome Biol, 2000. 1(6): p. RESEARCH0011.
127. Leclercq, R., et al., Plasmid-mediated resistance to vancomycin and teicoplanin in Enterococcus faecium. N Engl J Med, 1988. 319(3): p. 157-61.
128. Lawrence, J.G. and H. Ochman, Molecular archaeology of the Escherichia coli genome. Proc Natl Acad Sci U S A, 1998. 95(16): p. 9413-7.
129. Woese, C.R. and G.E. Fox, Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Natl Acad Sci U S A, 1977. 74(11): p. 5088-90.
130. Gophna, U., W.F. Doolittle, and R.L. Charlebois, Weighted genome trees: refinements and applications. J Bacteriol, 2005. 187(4): p. 1305-16.
131. Karlin, S., Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes. Trends Microbiol, 2001. 9(7): p. 335-43.
132. Grantham, R., et al., Codon catalog usage and the genome hypothesis. Nucleic Acids Res, 1980. 8(1): p. r49-r62.
133. Lawrence, J.G. and H. Ochman, Amelioration of bacterial genomes: rates of change and exchange. J Mol Evol, 1997. 44(4): p. 383-97.
134. Zhang, R. and C.T. Zhang, A systematic method to identify genomic islands and its applications in analyzing the genomes of Corynebacterium glutamicum and Vibrio vulnificus CMCP6 chromosome I. Bioinformatics, 2004. 20(5): p. 612-22.
135. Sandberg, R., et al., Capturing whole-genome characteristics in short sequences using a naive Bayesian classifier. Genome Res, 2001. 11(8): p. 1404-9.
136. Nakamura, Y., et al., Biased biological functions of horizontally transferred genes in prokaryotic genomes. Nat Genet, 2004. 36(7): p. 760-6.
137. Linz, B., et al., Frequent interspecific genetic exchange between commensal Neisseriae and Neisseria meningitidis. Mol Microbiol, 2000. 36(5): p. 1049-58.
138. Straus, D. and F.M. Ausubel, Genomic subtraction for cloning DNA corresponding to deletion mutations. Proc Natl Acad Sci U S A, 1990. 87(5): p. 1889-93.
139. Lisitsyn, N., N. Lisitsyn, and M. Wigler, Cloning the differences between two complex genomes. Science, 1993. 259(5097): p. 946-51.
140. Hsiao, W., et al., IslandPath: aiding detection of genomic islands in prokaryotes. Bioinformatics, 2003. 19(3): p. 418-20.
141. Merkl, R., SIGI: score-based identification of genomic islands. BMC Bioinformatics, 2004. 5(1): p. 22.
142. Dufraigne, C., et al., Detection and characterization of horizontal transfers in prokaryotes using genomic signature. Nucleic Acids Res, 2005. 33(1): p. e6.
143. Tsirigos, A. and I. Rigoutsos, A new computational method for the detection of horizontal gene transfer events. Nucleic Acids Res, 2005. 33(3): p. 922-33.
144. Lawrence, J.G. and H. Hendrickson, Lateral gene transfer: when will adolescence end? Mol Microbiol, 2003. 50(3): p. 739-49.
145. Jain, R., M.C. Rivera, and J.A. Lake, Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci U S A, 1999. 96(7): p. 3801-6.
146. Jeltsch, A., Maintenance of species identity and controlling speciation of bacteria: a new function for restriction/modification systems? Gene, 2003. 317(1-2): p. 13-6.
147. Jeltsch, A. and A. Pingoud, Horizontal gene transfer contributes to the wide distribution and evolution of type II restriction-modification systems. J Mol Evol, 1996. 42(2): p. 91-6.
Chapter 2
An in vitro strategy for the selective isolation of anomalous DNA from prokaryotic genomes
M. W. J. van Passel1, A. Bart1, R. J. A. Waaijer2, A. C. M. Luyf2, A. H. C. van Kampen2, A. van der Ende1,*
1Department of Medical Microbiology, 2Bioinformatics Laboratory, Academic Medical Center, Amsterdam, the Netherlands
Adapted from Nucleic Acids Research (2004), 32(14):e114
Abstract
In sequenced genomes of prokaryotes, anomalous DNA (aDNA) can be
recognised by, among other features, atypical clustering of dinucleotides. We hypothesised
that atypical clustering of hexameric endonuclease recognition sites in aDNA allows
the specific isolation of anomalous sequences in vitro. Clustering of endonuclease
recognition sites in aDNA regions of eight published prokaryotic genome sequences
was demonstrated. In silico digestion of the Neisseria meningitidis MC58 genome,
using four selected endonucleases, revealed that of 27 of the predicted small
fragments (300 bp and
Introduction
Horizontal gene transfer (HGT) was first identified in 1944, in the same
experiment that demonstrated the transformation of non-virulent into virulent
Streptococcus pneumoniae [1]. The extent of HGT as an evolutionary phenomenon
was not addressed quantitatively on a genomic scale until Lawrence and Ochman
calculated that approximately 18% of the genome of Escherichia coli MG1655 had been
horizontally acquired since its divergence from the Salmonella lineage 100 million
years ago [2]. This identified HGT as a major factor in prokaryotic genome evolution.
Recently, an extensive database of horizontally transferred genes based on complete
bacterial and archaeal genomes has been made available [3].
The rationale behind the computational identification of horizontally transferred
DNA is the genome hypothesis, which proposes that, for a given prokaryotic genus,
genomic DNA is relatively constant in codon usage and GC content [4, 5]. In contrast,
horizontally acquired anomalous DNA differs in codon usage and/or GC composition
from the recipient genome and can therefore be identified when substantial sequence
information is available.
An additional parameter in lateral genomics is based on oligonucleotide
compositional extremes: the dinucleotide relative abundance values or genome
signature ρ* [6, 7]. The genome signature is constant among members of a genus,
but deviates substantially between members of different genera [8]. When used for
intragenomic comparisons, ρ* makes an excellent parameter for the identification of
anomalous DNA regions. Aberrant dinucleotide frequencies in aDNA are then
expressed as the genome dissimilarity δ*, being the average dinucleotide relative
abundance difference between the aDNA region and the whole genome [6-8].
Although the genome signature is capable of identifying clusters of alien genes and
acquired pathogenicity associated islands (PAI) with an atypical nucleotide
composition, highly expressed regions such as ribosomal clusters can also display
aberrant dinucleotide frequencies [8, 9].
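
As an illustration of the two quantities introduced above, the following minimal sketch computes the genome signature and the genome dissimilarity in Python. It assumes the usual definitions attributed to Karlin and colleagues: ρ*_xy = f*_xy / (f*_x f*_y), with frequencies taken over the sequence together with its reverse complement, and δ* as the mean absolute difference over the 16 dinucleotides. The function names are hypothetical and are not taken from the original work.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def genome_signature(seq):
    """Return the 16 symmetrised dinucleotide relative abundances rho*_xy."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = {b: 0 for b in "ACGT"}
    di = {"".join(p): 0 for p in product("ACGT", repeat=2)}
    # Count on both strands so that the signature is strand-independent.
    for strand in (seq, reverse_complement(seq)):
        for base in strand:
            if base in mono:
                mono[base] += 1
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair in di:          # skips pairs containing N or other symbols
                di[pair] += 1
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    rho = {}
    for pair, count in di.items():
        expected = (mono[pair[0]] / n_mono) * (mono[pair[1]] / n_mono)
        rho[pair] = (count / n_di) / expected if expected > 0 else 0.0
    return rho


def dissimilarity(seq_a, seq_b):
    """delta*: mean absolute difference of rho*_xy over the 16 dinucleotides."""
    rho_a, rho_b = genome_signature(seq_a), genome_signature(seq_b)
    return sum(abs(rho_a[p] - rho_b[p]) for p in rho_a) / 16
```

For an intragenomic comparison, seq_a would be the candidate region and seq_b the complete genome sequence.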
Still, to our knowledge, no method exists that uses (one of) these parameters
and enables the selective isolation of anomalous DNA sequences from a microbial
genome in vitro. In order to develop such a technique we investigated a special
group of oligonucleotide composition extremes: the local overrepresentation in a
genome of palindromic hexanucleotide sequences, specifically restriction
endonuclease recognition sites, in aDNA regions. Like the genomic dinucleotide and
tetranucleotide frequencies [10, 11], the frequencies of restriction sites vary between the
genomes of different microbial species [12]. This variation probably results from the
avoidance of cognate recognition sequences [13, 14]. An HGT event between
different organisms may therefore introduce clusters of certain restriction sites into the
recipient's genome. Digestion of the chromosomal DNA with a restriction
endonuclease that recognises such sites can then produce a limited number of small restriction
fragments, comprising potential anomalous DNA, which can be selectively amplified by
adaptor-linked PCR (ALP [15]). The resulting amplicons can subsequently be subcloned and
identified by sequence analysis.
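
The idea of locating clustered recognition sites can be sketched in a few lines of Python. This is only an illustration and not the TIGR tool used in this chapter: it treats the genome as a linear string, takes the four enzymes named in this chapter with their standard recognition sequences, and uses the distance between consecutive site starts as the fragment length (equal to the distance between cut points for a single enzyme). The function names and the default size limits of 300 bp and 5000 bp are assumptions chosen to match the text.

```python
import re

# Standard hexameric recognition sequences of the four enzymes used here.
SITES = {"ScaI": "AGTACT", "NheI": "GCTAGC", "SpeI": "ACTAGT", "BglII": "AGATCT"}


def site_positions(genome, site):
    """0-based start positions of every occurrence of `site` in `genome`."""
    # A lookahead is used so that overlapping occurrences would also be found.
    return [m.start() for m in re.finditer(f"(?={site})", genome.upper())]


def small_fragments(genome, site, min_len=300, max_len=5000):
    """(start, end) pairs of consecutive sites spaced min_len..max_len bp apart."""
    pos = site_positions(genome, site)
    return [(a, b) for a, b in zip(pos, pos[1:]) if min_len <= b - a <= max_len]
```

A call such as small_fragments(genome, SITES["BglII"]) would then list the candidate regions whose sequence can be retrieved for compositional analysis.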
Clustering of restriction endonuclease recognition sites in diverse aDNA
regions in prokaryotic genomes was illustrated by the in silico assessment of seven
genome sequences of five different species. For each genome, the restriction enzymes
whose hexameric recognition sites are underrepresented were identified,
and restriction fragments between clustered sites that were smaller than 5 kb were
analysed for nucleotide composition with respect to GC percentage and genomic
dissimilarity.
Next, the restriction fragments of N. meningitidis MC58 between 300 bp and 5
kb were analysed in silico for both GC content and genome signature compared to
the genomic values. Also, the restriction fragments obtained with the selected
restriction endonucleases from N. meningitidis MC58 and Z2491 strains were
compared.
Finally, in order to demonstrate the applicability of this technique in vitro,
adaptor-linked PCR was performed on chromosomal DNA from strain MC58 digested
by each of the selected restriction endonucleases. The resulting amplicons were
sequenced to verify the predicted sequence composition.
Material and Methods
Bacterial strain and growth conditions
N. meningitidis MC58 is a serogroup B:15:P1.7,16 strain isolated from a case
of invasive infection in the UK [16]. This wild-type MC58 strain lacks the erythromycin
resistance cassette insertion in the capsule gene locus in contrast to the sequenced
strain MC58 [17]. Neisseriae were grown on heated blood (chocolate) agar plates or
in liquid Tryptic Soy Broth (DIFCO) medium at 37°C in a humidified atmosphere of
5% CO2.
Chromosomal DNA preparation and digestion
Chromosomal DNA was isolated with the Puregene DNA isolation kit (Biozym).
Restriction digests and subsequent heat inactivation were carried out according to
the manufacturer’s instructions (Roche).
Adaptor-linked PCR and DNA sequencing
Adaptor-linked PCR was performed as described previously [18]. The adaptor and
linker sets were MP19 (5’-ACG TCG ACT ATC CAT GAA CAG ATC-3’) and MP23
(5’-GAT CTG TTC ATG-3’) for the ScaI-digested genomic template, MP24 (5’-ACC GAC
GTC GAC TAT CCA TGA ACA-3’) and MP20 (5’-CTA GTG TTC ATG-3’) for both
the NheI- and SpeI-digested chromosomal DNA, and MP24 and MP23 for the
BglII-digested genomic template. PCR amplicons were purified by agarose gel extraction
(Qiagen) and subcloned into a pCR2.1 vector (Invitrogen) according to the
manufacturer’s instructions. Escherichia coli DH5α was transformed by standard heat
shock procedure. The constructed plasmids were isolated with the Wizard Kit
(Promega). Inserts were sequenced using standard M13 primers or primer walking
on vector or genomic DNA according to the manufacturer’s instruction (ABI).
Sequences were analyzed using the Staden Package
(http://www.mrc-lmb.cam.ac.uk/pubseq/).
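
The pairing of adaptor and linker oligonucleotides with each enzyme can be checked with a short Python sketch. It assumes, as an illustration only, that the short oligonucleotide anneals with its 3' end to the 3' end of the long one, so that any unpaired 5' bases of the short oligonucleotide form the overhang that must match the digested template. The function name is hypothetical; the sequences are those listed in this subsection.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Oligonucleotide sequences as given in the text, written 5' to 3'.
OLIGOS = {
    "MP19": "ACGTCGACTATCCATGAACAGATC",
    "MP20": "CTAGTGTTCATG",
    "MP23": "GATCTGTTCATG",
    "MP24": "ACCGACGTCGACTATCCATGAACA",
}


def five_prime_overhang(long_oligo, short_oligo):
    """Unpaired 5' bases of short_oligo once its 3' part pairs with the
    3' end of long_oligo; returns None if no base can pair."""
    target = reverse_complement(long_oligo)
    # Find the longest 3' suffix of the short oligo that pairs with the
    # 3' end of the long oligo (a prefix of its reverse complement).
    for k in range(len(short_oligo), 0, -1):
        if target.startswith(short_oligo[-k:]):
            return short_oligo[:len(short_oligo) - k]
    return None
```

Under this assumption, MP19 with MP23 gives a blunt duplex (matching the blunt ends left by ScaI), MP24 with MP20 leaves a 5'-CTAG overhang (matching NheI and SpeI ends), and MP24 with MP23 leaves a 5'-GATC overhang (matching BglII ends).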
Software
The restriction site frequency tables from the various genomes were obtained
from http://tools.neb.com/~posfai/FINISHED. The in silico digestions of the various
sequenced genomes (for accession numbers see Table 1) were performed using the
Restriction Digest tool from The Institute for Genomic Research (TIGR)
(http://www.tigr.org). In silico retrieval and identification of the restriction fragments
was performed with the Position Search/Segment Retrieval tool from TIGR
(http://www.tigr.org). The different genomes of N. meningitidis were compared using
the Artemis Comparison Tool (ACT) (http://www.sanger.ac.uk).
Data analysis
Fragments were designated anomalous in GC composition if the GC content
of the fragment was below the 5th or above the 95th percentile of the genomic GC
content distribution, calculated with a window and step size identical to the fragment
length (http://www.tigr.org).
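
The percentile criterion can be sketched as follows in Python. This is not the TIGR implementation: the nearest-rank percentile definition, the treatment of the genome as a linear string and the function names are all assumptions made for illustration.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty DNA string."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def percentile(sorted_values, q):
    """Nearest-rank percentile (integer q, 1..100) of an ascending list."""
    rank = max(1, (q * len(sorted_values) + 99) // 100)  # integer ceiling
    return sorted_values[rank - 1]


def is_gc_anomalous(fragment, genome):
    """True if the fragment's GC content lies below the 5th or above the
    95th percentile of genomic windows of the same length."""
    w = len(fragment)
    # Non-overlapping windows: window size and step size equal the fragment length.
    windows = sorted(
        gc_fraction(genome[i:i + w]) for i in range(0, len(genome) - w + 1, w)
    )
    gc = gc_fraction(fragment)
    return gc < percentile(windows, 5) or gc > percentile(windows, 95)
```

Because the window size follows the fragment length, the reference distribution has to be recomputed for every fragment; short fragments are compared against a broader distribution than long ones.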
The δ* value for each restriction fragment was calculated as described earlier
by Karlin and colleagues [7]. In brief, the dinucleotide r