Identification and Characterization
of Gene Functions Involved in
Recalcitrant Compound
Degradation using Metagenomic
Data
Södertörn University | Department of Life Sciences
Bachelor Thesis 15 ECTS | Molecular Biology | Fall Term 2011
Molecular Biosciences
By: Tino Lawson
Supervisor: Ass. Prof. Sara Sjöling
Abstract
With the environmental problems caused by man-induced pollution by
persistent toxic compounds, the importance of finding remediation solutions
is immense. As an emerging field, microbial environmental biotechnology
may provide the tools to achieve novel solutions. Microbial communities in
the environment have biodegradation capacities which could be, and
historically have been, exploited for bioremediation. The novelty lies in
being able to access the capacity of the uncultured majority of the microbial
community. Knowledge in the field is growing steadily, and new approaches
such as metagenomics, together with access to databases and archives where
scientists share information and data, considerably facilitate the search.
Microorganisms are highly diverse in their metabolic pathways, some of
which have become highly developed during evolution; detoxification and
biotransformation of naturally occurring toxic compounds are therefore
not novel concepts. The environmental problem
occurs when synthetically manufactured compounds are less efficiently
biodegraded. However, improved knowledge about the degradation
potential in nature and the involved enzymes may help in developing
bioremediation procedures. For this reason, an enzyme involved in catabolic
pathways of chlorinated aromatic compounds, dienelactone hydrolase,
which has been less well studied, was selected as a target. This study
investigated the biogeographical distribution of the dienelactone hydrolase
gene identified in metagenomes sampled from different environments
globally in order to detect potential environmental patterns. Results may
cast light on its significance for degradation of chlorinated aromatic
compounds in nature. The results indicate a broad biogeographical
distribution of dienelactone hydrolase in varying microbial habitats in the
environment. The enzyme was found in environments ranging from water
and soil habitats to hypersaline, dechlorinating, hot-spring and other
extremophilic habitats, and the gene sequences shared high similarity
within each group. A broad environmental distribution suggests that
dienelactone hydrolase could be useful in bioremediation.
Key words: dienelactone hydrolase, recalcitrant, biodegradation, bioremediation,
metagenomics
“Perfection is achieved, not when there is nothing more to add, but when there is nothing left
to take away” -Antoine de Saint-Exupéry
Table of Contents
I. Introduction
II. Objectives
III. Background & Literature review
3.1 Biodegradation
3.2 Halocarbons
3.3 Dienelactone Hydrolase
3.4 Metagenomics
3.5 Bioremediation for biotechnological applications
IV. Methodology
4.1 Integrated Microbial Genomes (IMG)
4.2 Protein data bank (PDB)
4.3 Basic Local Alignment Search Tool (BLAST)
4.4 Clustal W
4.5 Construction of phylogenetic trees
4.6 Methodological strengths & limitations
V. Results
VI. Discussion
VII. Concluding remarks
VIII. Acknowledgments
IX. Appendix
X. References
Articles & Literature
Figures-references
Databases
I. Introduction
The production of anthropogenic synthetic compounds for various purposes
is inevitable in our industrialized world. Although these compounds are
needed and are therefore manufactured by the chemical industry, they can
pose a significant hazard both to the environment and to human health
(Klöppfer, 1994). This is due to their properties (e.g. lipophilicity or
chemical stability), particularly if the compounds are xenobiotic. For
example, various substituents can be added to a naturally occurring
aromatic compound, making it toxic or giving rise to toxic intermediates
during biodegradation (Atlas & Bartha 1998, p. 511). Consequently, humans
and other animals at the highest trophic levels become subject to
bioaccumulation and biomagnification (Ibanez et al. 2007, p. 232).
Therefore, organisms at the top of the food chain can be severely harmed
without being directly exposed to the toxic compounds, even when the
environmental concentrations of these compounds are below the level of
toxicity for the ecosystem in that particular area.
The majority of the synthetic compounds share enough similarity with
natural compounds so as to undergo microbial metabolism/degradation, i.e.
bioremediation (Dagley 1975, Atlas & Bartha 1998, p. 512). On the other
hand, recalcitrant xenobiotics have structures (and chemical bonds) that are
not recognized by the enzymes responsible for the degradative stages of the
microbial metabolism (Singh & Dwivedi 2004, p.60) and therefore resist
biodegradation, or are incompletely degraded, which leads to the
accumulation of these compounds in the environment. The main reasons
why a compound is recalcitrant are the presence of unusual substitutions
such as halogens (chlorine being the best example), unusual bonds, highly-
condensed aromatic rings and excessive molecular size/weight (Juhasz &
Naidu, 2000).
Microorganisms may use xenobiotic compounds as a source of energy,
carbon, nitrogen or sulfur and therefore degradation of many xenobiotic
chemicals requires microbial communities (Fetzner, 2001). Millions of
organic compounds are derived from natural biosynthetic pathways that are
present in animals, plants and microorganisms along with industrial
synthesis. Microorganisms not only generate, but they also degrade some of
these compounds as a part of their key role in the biogeochemical cycles of
the Earth (Madsen, 2011).
In addition to biodegradation, other processes contribute to the microbial
degradation of xenobiotics, namely biotransformation and co-metabolism.
These processes are of great
ecological significance. Taking advantage of the microbial capabilities in
order to safely remove toxic and persistent compounds from the
environment is what biotechnological bioremediation encompasses.
From a historical perspective, the accumulation of recalcitrant compounds
correlates with increased industrialization, implying that the Industrial
Revolution is one of the main reasons why the world has experienced
increasing global pollution and a growing number of contaminated
sites. In this text, the term ‘contaminated site’ is used to describe a specific
biogeographical area that is polluted by a specific recalcitrant compound.
With continued demand for products from the industries, there is no
evidence that the rate at which environmental sites are contaminated will
decrease in the near future. On the contrary, an expanding global population
will put strain on the environment. Due to an expected increase in demand,
the industrial synthesis of toxic compounds will most likely continue.
Therefore, novel solutions are needed to tackle these environmental
problems.
The contaminated sites could potentially be ‘cleaned up’ by controlled and
contained bioremediation. When selecting a bioremediation technique, the
limiting factors of degradation must be carefully assessed for each
contaminated site because each site is unique.
A first step in doing this could be the exploration of the genetic potential for
a specific bioremediation capacity present among the microbial
communities in the environment. An alternative to classical PCR-based
genetic studies is the analysis of genetic data of the total microbial
community, known as metagenomics. Metagenomic data differ from
genomic data in that they give genetic information about an entire microbial
community, as opposed to just one microbial organism. This is very
important because 16S rRNA gene sequence analysis has suggested that 26
microbial phyla have no cultured representatives (Rappe & Giovannoni
2003). Thus, in relation to this study, the information retrieved from
metagenomics is much more valuable as it is independent of previously
known sequence information.
Part of reaching these goals is to consider the natural biodegradation
capabilities of microbial consortia and, with the aid of metagenomic
databases, to acquire more knowledge about the diversity of a catabolic gene
within and among many different biogeographical areas.
Juhasz & Naidu (2000) report that there are microorganisms which are able
to degrade recalcitrant compounds, but which may not be prevalent in soils
where remediation is necessary. Consequently, it is of great value to gain
insight into the biogeographical distribution of such catabolic genes in
order to develop optimal remediation solutions.
In this study, the capacity of environmental habitats for biodegradation was
investigated by analyzing the presence of a selected catabolic gene
(involved in the degradation of recalcitrant compounds) to determine
whether it could be detected in any or many types of environments. In this
way we can gain more knowledge about the presence of catabolic enzymes
and recalcitrant compounds within a complex microbiome in the
environment. We thereby may also learn about the dynamics of the
bioremediation capacity in these environments in order to make realistic
interpretations of real-life conditions when developing solutions. This could
allow us to circumvent the limiting factors of bioremediation in a given
environment. Specifically, the presence and diversity of dienelactone
hydrolase, an enzyme involved in the degradation of halocarbons, were
investigated. This gene function was selected because bioremediation
solutions are needed for sites contaminated with halocarbons, which pose a
threat for the reasons mentioned earlier. Three central research questions were
formulated, which seek to answer (i) whether there is a pattern of how
dienelactone hydrolase is biogeographically distributed in the environment,
(ii) how dienelactone hydrolases from different strains are related, and (iii)
if and how dienelactone hydrolase can be applicable in bioremediation
techniques involving recalcitrant compounds.
II. Objectives
In order to answer the research questions mentioned above, the primary
objective of this study was to identify and characterize gene functions
involved in the degradation of recalcitrant compounds using available global
metagenomic data. Specifically, an enzyme involved in the degradation of
chlorinated aromatic compounds (i.e. dienelactone hydrolase) was
investigated. Dienelactone hydrolase operates exclusively in the catabolic
pathway of chlorocatechol. Hence, the study has focused on this catabolic
pathway, also known as the modified ortho-cleavage pathway. Through
phylogenetic analyses the aim was to reveal the evolutionary relationship of
the gene between different strains or species and to investigate if a
biogeographical distribution of the gene could be detected through the
comparison of metagenomes from different environments.
III. Background & Literature review
3.1 Biodegradation
The term biodegradation refers to the microbial process of breaking down
organic compounds into biomass, water, carbon dioxide, and the oxides or
mineral salts of other elements present. Mineralization refers to the
complete degradation of an organic compound into inorganic components.
However, under normal circumstances, mineralization also involves the
formation of biomass. Eventually, biomass will also undergo mineralization.
When an organic compound is broken down to a less complex organic
compound, the process is known as incomplete/partial degradation (Atlas &
Bartha, 1998).
Many factors affect the performance of the biodegradation process, such as
the physical and chemical properties of the environment. Major causes of
the persistence of many xenobiotics include sorption to soil and sediment
and entrapment in micropores (Elzerman & Coates 1987). There are other
parameters that influence the extent and the rate at which biodegradation
takes place. For example, the bioavailability of the substrate plays a key role
in this. Chemical structure, concentration of substrate, environmental
conditions (pH, temperature, salinity, presence of inhibitory molecules,
availability of nutrients/oxygen/electron donors/electron acceptors),
composition and size of consortium (which microorganisms are present),
and the physicochemical properties of the environment all affect the
bioavailability of the substrate (Paul & Clark, 1989). Furthermore,
bioavailability is also controlled by the physical state, solubility and binding
affinity (to soil or sediment particles) of the xenobiotic compound.
Ultimately, as mentioned, sorption and entrapment in micropores are major
causes for the persistence of many xenobiotics (Elzerman & Coates 1987).
Numerous xenobiotics are toxic to organisms at high concentrations. This
includes the degradative microorganisms. However, it is important to keep
in mind that a minimum concentration of xenobiotic is required in order to
induce the synthesis of the catabolic genes responsible for the degradation
process (Fetzner 2001). It is also important to examine the
products formed by a biodegradation process, and not only the
disappearance of a compound, because the disappearance of a
compound from its environment does not always mean that it has been
biodegraded. Moreover, biodegradation rates vary from compound to
compound, ranging from days to decades. A good example of a very
persistent xenobiotic is the insecticide DDT (1,1,1-trichloro-2,2-bis[p-
chlorophenyl]ethane).
Recalcitrant compounds such as the halocarbons in focus in this study have
natural counterparts with similar structures, implying that there are
catabolic enzymes that could degrade recalcitrant compounds. Dienelactone
hydrolase is such an enzyme, so learning about it might help when
developing new remediation strategies. This study provides information
about the kinds of environmental habitats in which dienelactone hydrolase
is present, casting light on the types of consortia in which the gene is
found.
3.2 Halocarbons
The carbon-halogen bond is highly stable, and a great amount of energy is
needed to break it, which makes halocarbons chemically stable (Atlas
& Bartha 1998, p. 514). Halocarbons are a large group comprising solvents,
refrigerants, and haloaromatics, e.g. chlorobenzenes, chlorophenols,
chlorobenzoates and polychlorinated biphenyls. The aerobic
biodegradability of haloaromatics commonly decreases as the number of
halogen substituents increases. However, dechlorination of the same
compounds under anaerobic conditions occurs relatively easily (U.S.
Environmental Protection Agency 2000). Various Pseudomonas strains use
dioxygenases to readily convert mono- and dichlorobenzenes to
chlorocatechols under aerobic conditions (Potrawfke et al. 1998). The
aerobic degradative pathways of haloaromatics converge at
chlorosubstituted catechols (Fig. 1). Chlorosubstituted catechols are
converted to 3-oxoadipate in a way similar to that seen in catechol
metabolism. Catechol metabolism is also known as the ortho-cleavage
pathway; chlorosubstituted catechols are converted to 3-oxoadipate via the
modified ortho-cleavage pathway. It is designated the modified
ortho-cleavage pathway because of differences in substrate specificities
and because the dienelactone hydrolases of this pathway do not
convert 3-oxoadipate enol-lactone, which is an intermediate in catechol
catabolism (Schlömann 1994).
3.3 Dienelactone Hydrolase
Dienelactone hydrolase (DLH) is an enzyme present in many prokaryotes
(bacteria and archaea) and in a few eukaryotes (fungi) (Pathak & Ollis,
1990). DLH catalyses the hydrolysis of dienelactone. Dienelactone is a
cyclic ester, and DLH converts it to maleylacetate following the ring
cleavage reaction. DLH is part of the β-ketoadipate catabolic pathway,
which functions as a biodegradative pathway for toxic aromatic compounds.
This pathway results in intermediates of the tricarboxylic acid cycle, i.e.
acetyl CoA and succinyl CoA (Figure 1).
All DLHs contain the catalytic triad Cys-His-Asp, and DLH is notable as
the only known naturally occurring enzyme that contains this triad
(Beveridge & Ollis, 1994). Inhibitor binding studies suggest that
dienelactone is held in the active site by hydrophobic interactions around the
lactone ring and by ion pairs between its carboxylate and Arg-81 and
Arg-206. The catalysis probably involves the formation of a covalently
bound acyl intermediate via a tetrahedral intermediate (Cheah et al., 1993).
There are three types of dienelactone hydrolase: type I, type II and type III.
Type I hydrolyses trans-dienelactone faster than cis-dienelactone. Type II
hydrolyses cis-dienelactone faster than trans-dienelactone. Type III
hydrolyses both isomers (Schlömann et al. 1990). Three genes
code for dienelactone hydrolase: clcD, tcbE and tfdE, located on the
plasmid pJP4 (Schlömann 1994).
Pseudomonas sp. B13 is a suitable example because it
contains both the ortho-cleavage pathway and the modified ortho-cleavage
pathway. In Pseudomonas sp. B13, DLH is a monomeric protein with 236
amino acid residues and a molecular weight of 25,500 Da. It is made up of
seven helices and eight β-strands and is therefore an α/β protein (Pathak &
Ollis, 1990). Fig. 2 illustrates the structure of DLH, retrieved from the
Protein Data Bank (PDB). Fig. 1 illustrates the modified ortho-cleavage
pathway and the enzymes involved.
The Enzyme Commission number of dienelactone hydrolase is EC 3.1.1.45.
The Clusters of Orthologous Groups number is COG0412. Its locus
identifier (GenBank accession number) is NP_743344.
In a study conducted by Schreiber et al. (1980), it was found that in the
catabolism of 4-fluorobenzoate by Pseudomonas sp. B13, chlorocatechol
1,2-dioxygenase and chloromuconate cycloisomerase showed no activity. This
observation suggests that, in this particular strain, only dienelactone
hydrolase and maleylacetate reductase are required in the catabolism of
4-fluorobenzoate (along with the enzymes for benzoate catabolism). This led
to the conclusion that some bacteria have both dienelactone
hydrolase and maleylacetate reductase activities but lack the other two
enzymes that are necessary for complete chlorocatechol degradation
(Schreiber et al. 1980). In another study, it was confirmed that the catechol
catabolic pathway and the chlorocatechol pathway contain corresponding
enzymes with similar substrate specificities: catechol 1,2-dioxygenase and
muconate cycloisomerase in the catechol pathway, and chlorocatechol
1,2-dioxygenase and chloromuconate cycloisomerase in the chlorocatechol
pathway. These counterparts are genetically very similar. On the other
hand, 3-oxoadipate enol-lactone
hydrolases (from the catechol pathway) and dienelactone hydrolases (from
the chlorocatechol pathway) are not similar genetically and cannot use the
same substrate (Schlömann, 1994). The observations from these studies
suggested that DLH was derived from a preexisting pathway.
3.4 Metagenomics
Metagenomics is both a set of research techniques and a research field
that grew out of genomics. In metagenomics, instead of
investigating genomes of individual organisms as in genomics, the genomes
of all the individuals in a microbial community of a particular environment
are analyzed simultaneously (Handelsman, 2004). This addresses two
problems common in traditional clinical and environmental microbiology.
First, microbes exhibit great genomic diversity, and second, many microbes
cannot be cultured in the laboratory. It is estimated that merely 0.1-10% of
all microorganisms can be cultured in the laboratory (Amann et al., 1995).
This is why genomics of cultured isolates is insufficient.
Metagenomics involves the analysis of the genomes of the total microbial
community. Generally, there are two approaches: sequence-based analysis and
activity-based analysis. Sequence-based screening provides sequence data
on the total genomic content of the microbial community analyzed, whereas
the activity-based approach can detect novel gene products and enzymatic
activities expressed by genes that may otherwise not have been identifiable
based on prior sequence knowledge (Riesenfeld et al. 2004). This study
draws on information from the sequence-based approach. The first step in
metagenomics is to extract total community DNA directly from the
microbes living in a particular environment. The total community DNA, or
the metagenome, is then sequenced using high-throughput sequencing.
The sequence-based approach can entail complete sequencing of clones
containing phylogenetic anchors that indicate which taxonomic group the
DNA fragment probably belongs to. Another option is to perform random
sequencing, and once a gene of interest is identified, "phylogenetic anchors
can be sought in the flanking DNA to provide a link of phylogeny with the
functional gene" (Zeyaullah et al. 2009). The steps involved in a typical
sequence-based metagenome project are sample processing, sequencing,
assembly, binning, annotation, statistical analysis and data
storage and sharing (Thomas et al., 2012) (Figure 3). Sample processing is
the first and most important step because extracted DNA should be
representative of all cells present in the sample. Sequencing technologies
include next-generation sequencing (NGS). Binning is the process of sorting
DNA sequences into groups that could characterize an individual genome or
genomes from closely related organisms. For genome annotation, there are
existing pipelines such as Integrated Microbial Genomes (IMG) (Markowitz
et al., 2009). In this study, metagenomic data from different habitats
provided by other studies were accessed from the IMG/M database.
Dienelactone hydrolase or its homologs were searched for in different
metagenomes and, when detected, the environments from which the
metagenomes originated, i.e. the microbiomes, were identified in order to
conduct a biogeographical investigation based on phylogenetic analyses.
The activity-based approach relies on cloning of the extracted DNA into
metagenomic libraries, which are screened for expression of particular traits
(Stein et al., 1996). For example, an enzyme activity for degrading a
recalcitrant compound can be screened for (Lorenz et al., 2002). The
activity-based metagenomics approach was less suitable for this study and
was not used.
3.5 Bioremediation for biotechnological applications
In the field of biotechnology, the microbiological degradation processes
discussed in the introduction can be exploited in order to remove
toxic compounds from the environment. These processes are collectively
known as bioremediation. Bioremediation can be performed in numerous
ways: by single isolated microorganisms, by enzymes isolated from
microorganisms or by a natural consortium of soil microorganisms. This
makes the method not only innovative but also environmentally
sound, because the microorganisms used occur naturally in soil and
groundwater environments (Reineke et al., 2011). Additionally, under
optimal conditions bioremediation does not create waste, and it
does not require expensive equipment or heavy labour (U.S. Environmental
Protection Agency, 2000).
Scientists working with bioremediation aim to optimize environmental
factors in order to speed up the rate at which biodegradation takes place.
Doing so allows the proliferation of the appropriate types of
microorganisms, which results in an abundant and diverse population of
these microorganisms (Philp et al., 2009).
Bioremediation techniques include intrinsic, in situ and ex situ
bioremediation as well as phytoremediation. For easily degraded
contaminants, biostimulation has been
successful in both contaminated soil (Machackova et al., 2008) and marine
environments (Harayama et al., 1999). Biostimulation involves the addition
of suitable electron donors, electron acceptors or nutrients (Reineke et
al., 2011). Bioaugmentation refers to the application of specific
microorganisms to contaminated sites in order to enhance the biological
activity of the indigenous populations (Pepper et al., 2002), and
bioaugmentation studies have been conducted successfully (Dybas et al.,
2002). Bioaugmentation is used in cases where intrinsic bioremediation
does not work, and is usually only applied to highly recalcitrant compounds.
Recalcitrant compounds can contaminate many different types of
sites, some of which do not directly affect human health. However,
once sites with which humans are in close contact (e.g. groundwater)
have been contaminated, it is vital that efficient
methods are used to clean up the site rapidly. Previous studies of
contaminated groundwater in the U.S. have investigated an interception
technology known as Permeable Reactive Biobarriers (PRB), which has
proven to be efficient for contaminated groundwater (Puls et al., 1999). This
is promising because groundwater is one of the environments which could, if
contaminated, potentially threaten an entire population.
Each of the methods described above has advantages and disadvantages
depending on the type of contaminated site. This means that a careful
assessment needs to be made before bioremediation strategies and
techniques are implemented at a given contaminated site. Methods that have
previously proved effective for e.g. oil spills involve the application of
nutrients along with well-adapted microorganisms to a particular
environment. In one study, bioremediation of oil-contaminated sites in the
open environment proved successful through seeding of naturally adapted
Pseudomonas putida strains
and the addition of fertilizer. The results of this study indicated an increased
viable count and degradation capacity of the inoculum (Raghavan &
Vivekanandan, 1999).
IV. Methodology
Sequences obtained from cultured organisms alone do
not provide sufficient information for the overall analysis, since the number
of sequenced DLHs from cultured organisms of different habitats is limited.
Therefore, sequences were retrieved from a metagenomic database. The data
had to be retrieved in the form of amino acid sequences so that a
phylogenetic analysis could be performed. As a first step in this
comparative analysis of sequences, a query FASTA amino acid sequence
corresponding to dienelactone hydrolase (DLH) was retrieved from
GenBank at NCBI. The sequence was selected from the group of organisms
in which the presence of both a normal and a modified ortho-cleavage
pathway was first discovered, namely Pseudomonas (strain sp. B13).
To retrieve FASTA amino acid sequences of DLH from metagenomic
data, several databases were available, including the database for Integrated
Microbial Genomes with Microbiome samples, IMG/M
(img.jgi.doe.gov/cgi-bin/m/main.cgi), and MG-RAST
(http://metagenomics.anl.gov/). These two databases were the obvious
options. In this study, IMG/M was the metagenomic database of choice for
the reasons described in section 4.6.
Additionally, other databases and tools were used, namely the Protein Data
Bank (www.pdb.org), the Basic Local Alignment Search Tool or BLAST
(blast.ncbi.nlm.nih.gov/Blast.cgi), and Clustal W (accessed at two websites
cited in the reference list), which is used for multiple sequence alignments
and to create phylogenetic trees.
4.1 Integrated Microbial Genomes (IMG)
IMG is a database and tool for the analysis and annotation of genome and
metagenome datasets. In this study, it was used for comparative analysis of
protein sequences. The database contains microbial genome and microbial
metagenome data, providing information about gene content and functional
capabilities (Technical report 2008). Using this database, dienelactone
hydrolase was searched for within all available microbiomes both by
COG number (Clusters of Orthologous Groups number) and by
entering the name of the gene product, in this case dienelactone hydrolase.
Using the amino acid FASTA sequence of Pseudomonas as a query
sequence, a BLASTp search (Basic Local Alignment Search Tool for
proteins; see section 4.3) was performed against selected microbiomes.
The query sequence was the 265-amino-acid DLH from Pseudomonas putida
KT2440, with GenBank accession number NP_743344.
Using the gene product name and COG number (dienelactone hydrolase and
COG0412 respectively), homologs were searched for within the selected
microbiomes. In this study, all microbiomes available at the time were used
because it was possible. This generated a list of the microbiomes and the
number of genes found within the microbiomes and further gave access to
more details and information on the individual sequences. From here,
corresponding FASTA amino acid sequences could be retrieved for use in
Clustal W.
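As an illustration of the data format handled at this step, the following is a minimal Python sketch of reading FASTA-formatted amino acid records into memory. It is not the procedure used in the study, where sequences were exported through the IMG/M web interface; the record names and sequences below are hypothetical placeholders.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} mapping."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequences may span several lines
    return records


# Hypothetical records, for illustration only (not data from the study).
example = """>hypothetical_soil_hit_1
MKTAYIAKQR
QISFVKSHFS
>hypothetical_marine_hit_2
MLTGEWQ
""".splitlines()

records = parse_fasta(example)
```

Each sequence is stored as one string regardless of how many lines it spans in the file, so its length can be checked directly before alignment.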
4.2 Protein data bank (PDB)
PDB is a database and archive used to obtain basic information about
biological macromolecules (proteins and nucleic acids), such as structural
data and taxonomy information; it is also useful for visualizing
three-dimensional structures of proteins. The database contains structures
determined by nuclear magnetic resonance (NMR) and X-ray
crystallography (Berman et al. 2000). The three-dimensional crystal
structure of DLH from Pseudomonas putida
KT2440 was retrieved from PDB and is shown in Figure 2.
4.3 Basic Local Alignment Search Tool (BLAST)
BLAST is used in molecular biology to compare a query sequence with
other sequences that can be found in protein or nucleotide databases. In
doing so, information about similarities and differences between the
20
sequences is obtained. It is one of the most widely used tools for homology
analysis (Altschul et al. 1990). In this study, once the amino acid FASTA
sequence of Pseudomonas putida KT2440 had been collected from GenBank at
NCBI, it was submitted to the IMG database as the query of a BLASTp (BLAST
for proteins) search against
selected microbiomes. This generated a list of homologs from the selected
microbiomes with varying percent identity to the query sequence. From this
list, FASTA sequences were selected under the following parameters: the
selected sequences (i) shared at least 30% amino acid identity with the
query sequence, and (ii) were at least 200 amino acids long.
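The two selection criteria can be expressed as a simple filter. The sketch
below is illustrative only: the Hit record, the function names and the four
toy hits are hypothetical, and the percent identity values are assumed to be
those reported by BLASTp for each hit.

```python
from dataclasses import dataclass

MIN_IDENTITY = 30.0  # criterion (i): percent amino acid identity
MIN_LENGTH = 200     # criterion (ii): sequence length in amino acids


@dataclass
class Hit:
    name: str
    percent_identity: float  # assumed to come from the BLASTp hit list
    sequence: str


def passes_criteria(hit):
    """True if the hit satisfies both selection criteria."""
    return (hit.percent_identity >= MIN_IDENTITY
            and len(hit.sequence) >= MIN_LENGTH)


def select_hits(hits):
    """Keep only the hits that satisfy both criteria, preserving order."""
    return [h for h in hits if passes_criteria(h)]


# Toy hits (hypothetical values, for illustration only)
hits = [
    Hit("hit1", 45.0, "A" * 250),  # passes both criteria
    Hit("hit2", 28.5, "A" * 260),  # identity below 30%
    Hit("hit3", 62.0, "A" * 150),  # shorter than 200 amino acids
    Hit("hit4", 30.0, "A" * 200),  # exactly at both thresholds
]

selected = select_hits(hits)
print([h.name for h in selected])
```

Both thresholds are treated as inclusive, matching the wording "at least".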
4.4 Clustal W
Clustal W is a multiple sequence alignment tool. Multiple alignments are
used to detect patterns and homology in order to characterize protein
families. They can also be used, for example, as a first step in predicting
the secondary structure of new sequences (Thompson et al. 1994). After
FASTA sequences had been selected on the basis of amino acid identity and
sequence length, they were submitted to Clustal W in order to carry out a
phylogenetic analysis.
4.5 Construction of phylogenetic trees
Clustal W was used at two different websites (cited in the reference list)
to align the selected sequences according to sequence similarity before
creating the dendrograms. There are several methods for analyzing
similarities and for constructing trees that show relationships, or
evolutionary distances as in a phylogenetic tree (Saitou & Imanishi 1989).
Two widely used families are distance-based and character-based methods,
also known as phenetic and cladistic methods respectively. Phenetic methods
measure the pair-wise dissimilarity of genes, whereas cladistic methods
evaluate several possible trees and choose the one that best explains the
evolutionary relationships (Duncan et al. 1980). In this study, the
phylogenetic analysis was based solely on distance-based methods, which
include the unweighted pair group method with arithmetic mean (UPGMA) and
the neighbour joining (NJ) method.
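To make the idea of a distance-based method concrete, the following minimal
Python sketch computes pair-wise dissimilarities from aligned sequences and
clusters them with UPGMA. It is an illustration under stated assumptions,
not the implementation used by Clustal W: the distance is a simple
proportion of differing positions (p-distance), and the four aligned toy
sequences and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions between two aligned sequences,
    ignoring columns where either sequence has a gap ('-')."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(names, dist):
    """Cluster by UPGMA. `dist` maps frozenset({a, b}) to a distance.
    Returns nested tuples describing the order in which clusters join."""
    sizes = {name: 1 for name in names}  # cluster label -> number of members
    d = dict(dist)
    while len(sizes) > 1:
        # Join the two clusters with the smallest distance
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        for c in sizes:
            # Size-weighted mean = arithmetic mean over all member pairs
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


# Toy aligned sequences (hypothetical, for illustration only)
aligned = {
    "A": "MGQRISFNVN",
    "B": "MGQRISFSVN",
    "C": "MGQWTELETP",
    "D": "MG-WTELQTP",
}
dist = {
    frozenset((x, y)): p_distance(aligned[x], aligned[y])
    for x, y in combinations(aligned, 2)
}
tree = upgma(list(aligned), dist)
print(tree)
```

In this toy example the two most similar pairs are joined first and the two
resulting clusters are joined last, which is the kind of grouping that is
read off a dendrogram.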
There are several ways of constructing phylogenetic trees, each of which
differs slightly in its method. For this reason, it is important to
understand how the types of trees differ when interpreting them. Both the
UPGMA and the NJ methods can be applied to construct either a rooted or an
unrooted tree. Rooted trees show the common ancestor of the species under
study, i.e. the evolutionary path, whereas unrooted trees only show the
relationships among the species (Page & Holmes 1998). When constructing the
five phylogenetic trees in this study, all the methods above were applied
in an attempt to increase the validity of the interpretations in the final
analysis.
The trees were constructed at the following websites:
- http://pir.georgetown.edu/pirwww/search/multialn.shtml
- http://www.genome.jp/tools/clustalw/
These were the only two websites found that could generate dendrograms with
a scale and a value for each branch length, and they were sufficient for
this study.
4.6 Methodological strengths & limitations
The methods described above have limitations. Although they are efficient
and suitable when applied to appropriate studies, both their general
limitations and the limitations that could ultimately affect the results of
this study are described below.
Having access to a variety of databases is advantageous because they
provide the tools for scientists and researchers to assemble and utilize
otherwise inaccessible data. The metagenomic databases IMG/M and MG-RAST
are both useful tools, with similarities and differences: they are similar
in binning methods but differ in feature prediction. Both IMG/M and MG-RAST
use similarity-based binning algorithms (as opposed to compositional-based
binning algorithms). A similarity-based algorithm matches an unknown
fragment of DNA against a reference database in order to classify (and
hence bin) the sequence (Thomas et al. 2012). Compositional-based binning
is not reliable for short fragments, as is the case with the 236 amino acid
long DLH sequence.
Feature prediction is the process of labeling sequences as genes or genomic
elements and identifying protein coding sequences (CDS). Feature
prediction varies slightly between IMG/M and MG-RAST. Both provide a
standardized pipeline, but IMG/M is described as having "higher"
sensitivity (Thomas et al. 2012). According to Thomas et al. (2012), IMG/M
is also the only system that integrates all datasets into a single protein
level abstraction. For these reasons, IMG/M was used in this study.
A further limitation of this study lies in the choice of methods for
constructing the phylogenetic trees, because different methods could result
in different interpretations. Moreover, the use of certain methods (e.g.
character-based methods) was not feasible in this study, due to time and
labour constraints.
Generally, distance-based methods are more rapid than character-based
methods, and are not as time- and labour-consuming (Page & Holmes 1998).
However, distance-based methods discard supplementary information (i.e.
individual substitutions among sequences) needed to produce an accurate
tree that predicts the most probable evolutionary relationship (Bruno et
al. 2000).
As mentioned in 4.5, distance-based methods include the UPGMA and the NJ
method. The UPGMA algorithm does not necessarily mirror evolutionary
descent because it assigns equal weight to the distances and assumes a
constant molecular clock (Backeljau et al. 1996). The NJ method is rapid
and, depending on the application, gives better results than the UPGMA
method because it does not assume equal rates of change along all branches.
The disadvantages of the NJ method are that it creates just one tree,
neglecting other possible trees, and that it can create a biased tree due
to errors in distance estimates, which become exponentially larger for
longer distances (Saitou & Nei 1987).
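For reference, the NJ criterion as it is commonly stated in textbooks can be
summarized as follows. The notation (n taxa in the current distance matrix,
pair-wise distances d(i,j), new internal node u) is introduced here for
illustration only and is not taken from the sources cited above.

```latex
\begin{align*}
% Selection criterion: the pair (i, j) with the smallest Q(i, j) is joined
Q(i,j) &= (n-2)\,d(i,j) - \sum_{k=1}^{n} d(i,k) - \sum_{k=1}^{n} d(j,k) \\
% Branch lengths from the joined taxa to the new node u
\delta(i,u) &= \tfrac{1}{2}\,d(i,j)
  + \frac{1}{2(n-2)}\Bigl[\sum_{k=1}^{n} d(i,k) - \sum_{k=1}^{n} d(j,k)\Bigr] \\
\delta(j,u) &= d(i,j) - \delta(i,u) \\
% Distance from the new node u to every remaining taxon k
d(u,k) &= \tfrac{1}{2}\bigl[d(i,k) + d(j,k) - d(i,j)\bigr]
\end{align*}
```

Because the two branch lengths to the new node can differ, NJ does not force
all lineages to have evolved at the same rate, whereas UPGMA places both
joined clusters at the same distance from their common node.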
Conveniently, only the detection of clusters, and their association with
specific environments, is of interest in this study, so the methods chosen
and described above are appropriate and sufficient for the final cluster
analysis. However, it would be preferable and more reliable to use all
available methods, which take different parameters into consideration when
constructing the dendrograms.
V. Results
50 protein sequences and 23 genomes containing the genes encoding the
protein of COG0412 were matched with dienelactone hydrolase and related
enzymes in IMG. DLH was found to be present in 8 phyla: Euryarchaeota,
Crenarchaeota, Ascomycota, Aquificae, Cyanobacteria, Actinobacteria,
Firmicutes and Proteobacteria. The proteins found were distributed over 13
classes and 15 orders.
The sequence from Pseudomonas putida KT2440 was used as a query sequence,
and the search produced hits in almost all of the 90 microbiomes available
in IMG/M. Along with the query sequence, an additional 3 sequences
representing the 3 different types of DLH were retrieved from Pseudomonas
aeruginosa using the COG database, in order to ensure that all three types
of DLH were taken into consideration in the phylogenetic analysis. All the
retrieved sequences (65 sequences) were similar enough to the query
sequence to undergo phylogenetic analysis: all were at least 200 amino
acids long and had a minimum of 30% amino acid identity with the query
sequence. The retrieved sequences came from 23 of the 90 available
microbiomes.
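The figures above can be cross-checked against the per-microbiome counts
listed in Table 1 of the Appendix. The short Python sketch below does this
arithmetic. It assumes that the 65 sequences consist of the environmental
sequences in Table 1 plus the query sequence and the three Pseudomonas
aeruginosa sequences; this breakdown is an interpretation of the text, and
the variable names are hypothetical.

```python
# Sequences per microbiome as listed in Table 1 (Appendix), in table order.
# Table 1 has 24 rows because the two Bath Hot Springs communities are
# listed separately; Table 2 treats them as one microbiome.
table1_counts = [1, 2, 1, 2, 2, 1, 1, 5, 6, 1, 1, 18,
                 1, 1, 2, 1, 1, 2, 2, 2, 2, 1, 2, 3]

query_sequences = 1      # Pseudomonas putida KT2440 query (assumed included)
reference_sequences = 3  # Pseudomonas aeruginosa sequences (assumed included)

environmental = sum(table1_counts)
total = environmental + query_sequences + reference_sequences

microbiomes_with_sequences = 23
microbiomes_available = 90
coverage = 100 * microbiomes_with_sequences / microbiomes_available

print(environmental, total, round(coverage, 1))
```

Under these assumptions the counts in Table 1 are consistent with the total
of 65 sequences, and the selected sequences represent roughly a quarter of
the available microbiomes.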
A phylogenetic analysis was carried out on the environmental sequence
fragments retrieved from IMG, in order to see the evolutionary relationship
between them. This analysis indicated whether phylogenetically similar DLH
sequences are biogeographically close or have a habitat in common. To this
end, Clustal W was used for a multiple sequence alignment, and the
dendrograms were generated from the Clustal W alignment (see Appendix). In
the alignment, three positions were observed to be highly conserved, i.e.
the residues Cys, His and Asp.
In the dendrograms, the microbiomes listed in Table 1 (as well as the query
sequence and three sequences retrieved from Pseudomonas aeruginosa)
were used.
The results are shown in the dendrograms (see Appendix). The names of the
sequences are abbreviated in the dendrograms; Table 2 shows which names
correspond to which microbiomes.
The results demonstrate a broad biogeographical distribution. From the
dendrograms it can be seen that the environments in which DLH was
present include groundwater and soil environments, where homologs of the
query sequence were found to be abundant. Extreme environments also
contained the protein, but no homologs were found in animal microflora.
The dendrograms also show that sequences from groundwater communities do
not form any cluster distinct from the other sequences. Instead, they are
spread across the dendrogram and are part of many clusters, showing a close
relationship to all of the microbiomes. The dechlorinating communities
form small separate clusters in the dendrograms, whereas the hot-spring
communities form one cluster. One of the 3 types of DLH is less common
than the other 2 types.
In the alignment, three positions were observed to be highly conserved,
i.e. the residues Cys, His and Asp. These are highlighted in the multiple
sequence alignment shown in the Appendix: histidine (His) is highlighted in
red at position 119, aspartic acid (Asp) in blue at position 154, and
cysteine (Cys) in green at position 251.
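Conserved positions of this kind can also be located programmatically. The
following Python sketch reports the columns of an alignment in which a
single residue is shared by a given fraction of the sequences. The function
name conserved_columns and the three short aligned sequences are
hypothetical and are not taken from the alignment in the Appendix.

```python
from collections import Counter


def conserved_columns(alignment, min_fraction=1.0):
    """Return (position, residue) for every column in which one residue
    (not a gap) occurs in at least `min_fraction` of the sequences.
    Positions are 1-based, as in the text."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    result = []
    for i in range(length):
        # Count residues in this column, leaving gaps out of the count
        counts = Counter(s[i] for s in seqs if s[i] != "-")
        if not counts:
            continue  # column contains only gaps
        residue, count = counts.most_common(1)[0]
        if count / len(seqs) >= min_fraction:
            result.append((i + 1, residue))
    return result


# Toy alignment (hypothetical, for illustration only)
toy_alignment = {
    "s1": "MC-AHKD",
    "s2": "LCGAHRD",
    "s3": "MCGSHKD",
}

fully_conserved = conserved_columns(toy_alignment)
print(fully_conserved)
```

With the default threshold only columns that are identical in every
sequence are reported; lowering min_fraction would also report columns that
are conserved in most, but not all, sequences.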
VI. Discussion
In this study, a broad biogeographical distribution of dienelactone
hydrolase was observed. When constructing the phylogenetic trees, amino
acid sequences were taken from samples of genes from all environments in
which the protein was found (and in which the sequences had at least 30%
amino acid identity with the query sequence). As indicated by the results,
homologs of DLH were found in e.g. groundwater, soil and extreme
environments. These extreme environments include saline, dechlorinating
and thermophilic microbiomes. The sequences most similar to the amino acid
sequence of Pseudomonas putida KT2440 were found in groundwater. Since the
initial search for DLH gave hits in almost all the available microbiomes in
IMG/M, the gene appears to be abundant in the environment, present in many
microorganisms found in many different habitats. This suggests that DLH is
probably part of another, pre-existing pathway, because it must have taken
a long time for the gene to be acquired by so many microorganisms in such
differing habitats.
The dienelactone hydrolase from Pseudomonas putida KT2440 was placed within
a group of related sequences from metagenomes of saline environments. The
analyses also show a close relationship with groundwater and dechlorinating
communities. This could mean that the gene was transferred horizontally
from contaminated groundwater to other microbiomes. Moreover, in this
study, DLH sequences taken from extremophiles tend to cluster, meaning that
they are phylogenetically related. A likely explanation is that, as
expected, they are only found in a specific extreme environment, and are
therefore very similar to each other genetically and different from strains
of other environments.
Sequences taken from water and soil samples agree with our expectations.
These sequences are spread out in the phylogenetic tree, meaning that there
are many variants of the gene in many environments. This makes sense,
because microorganisms that live in soil and water are exposed to
conditions that are ubiquitous on earth.
The gene was also found in various other extreme-condition environments,
e.g. Uranium Contaminated Groundwater FW106, the Acromyrmex echinator
fungus garden, an oxygen-depleted microbiome and more. This indicates that
the gene is present, and possibly active, even in extreme-condition
environments. These results therefore suggest that the dynamics within
many microbial consortia, including extremophilic microbiomes, are in one
way or another affected by the activity of the gene.
During the course of this study, several problems arose that directly
affected the interpretation of the results. When the three different types
of DLH were selected from Pseudomonas aeruginosa, it was not known which
type they corresponded to (DLH I, DLH II or DLH III) because this
information was not provided by the COG database. Consequently, it cannot
be concluded from this study that one type of DLH is predominant in one
specific environment.
The phylogenetic trees show great genetic variation of the DLH gene. The
most probable reason is that the genes are found in different environments
and have undergone adaptation as a result. The fact that the gene is spread
across different habitats may be because different variants have each
adapted to a specific habitat. The sequences taken from extremophilic
environments and other groups might have been derived directly from
groundwater consortia, possibly through horizontal gene transfer. From an
evolutionary perspective, this means that, at some point, DLH underwent a
minor alteration in order to retain its activity in a new environment. This
alteration may not have caused enough structural change to alter the
conformation and ultimately the function of the protein, but was sufficient
to maintain its catalytic activity.
From this analysis, it cannot be concluded which types of DLH are found in
which types of environment. However, it is still clear that microbiomes
from certain extreme environments display specific conserved regions that
are unique to them.
Dienelactone hydrolase from Pseudomonas putida KT2440 is clustered in such
a group, which suggests that the organism has most if not all of the genes
needed to survive in a wide range of environments. This is in accord with
our initial assumption, namely that DLH from Pseudomonas strains has been
shown to occur as three different types, hence the close phylogenetic
relation to a variety of microbiomes.
In the alignment, it is clear that three positions in the sequences are
highly conserved. DLH is described as the only naturally occurring enzyme
with the catalytic triad Cys-His-Asp, so this triad was expected to be
conserved throughout the entire alignment; this was the case. Apart from
this, it was difficult to detect which additional regions (unique to the
sequences within that cluster) were conserved among the hot spring/boiling
water sequences.
The results suggest that DLH could be used in developing remediation
techniques because of its widespread biogeographical distribution. If the
extent of DLH activity within a microbiome can be detected, the
interactions between the microorganisms in that habitat could be
investigated further. This could help in producing appropriate mixed
cultures for bioremediation, suited to cleaning up a given contaminated
site. It is therefore imperative to find out what induces the activation of
these genes. If methods can be developed to determine the activity of
catabolic genes in a given environment, this would facilitate the detection
of microbial consortia capable of degrading recalcitrant compounds. It
would then be possible to access these microbiomes directly instead of
first looking for the gene before determining its activity.
Dienelactone hydrolase can be used in the biotechnological field of
bioremediation by developing enzyme assays that detect the activity of this
gene. In doing so, more knowledge can be gained about the environmental
conditions under which the gene is active, thereby assisting scientists as
they attempt to develop novel methods.
Contaminated sites have proved to be a direct threat to human populations.
For example, the contamination of groundwater is hard to detect because
groundwater is a hidden resource, which makes it easy to overlook as a
serious problem. According to Morris et al. (2003), approximately two
billion people depend on aquifers for drinking water, and 40% of the
world's food is produced by irrigated agriculture that relies largely on
groundwater. Despite the need for effective bioremediation methods, such
methods are still under development.
It is reasonable to expect a correlation between contaminated sites and
clusters of certain genetic variants of DLH. Studies show the presence of
thermophiles in oil reservoirs (Fardeau et al., 2004). This is an important
finding in relation to this study, because it suggests that the DLH variant
present in our thermophile cluster could be utilized in bioremediation in
extreme environments.
A novel option in the development of bioremediation techniques could be to
consider genetically modified organisms (GMOs), which could be an effective
option in relation to this study. Genetically modified strains or consortia
(with the desired DLH catalytic capabilities) could be developed to
overcome the limiting factors of contaminant biodegradation. These strains
or consortia could then be introduced directly into contaminated
groundwater (along with nutrients if necessary), or Permeable Reactive
Biobarriers (PRB) could be developed. However, genetic modification of
microorganisms is controversial for the simple reason that little is known
about whether the fate of such organisms can be controlled. Also, since the
genes from groundwater were spread out in the dendrograms, DLH could be
used in relatively cheap bioremediation methods such as in situ
bioremediation.
Since the studies by Dybas et al. (2002) mentioned previously have shown
bioaugmentation to be a successful technique in the bioremediation of
recalcitrant compounds, it could be used when natural populations are few
or poorly adapted, and when intrinsic bioremediation does not work. This
means that bioaugmentation would be a relevant method to use if
bioremediation techniques involving DLH were to be developed.
Several other methods would be feasible, each having advantages and
disadvantages depending on the specific bioremediation site. According to
Philp et al. (2009), some compounds are more rapidly degraded
anaerobically. This is especially true for the dechlorination of
halocarbons. This suggests that sequential anaerobic and aerobic treatment
would be the best alternative for the bioremediation of highly chlorinated
compounds.
Also, since the studies performed by Schreiber et al. (1980) and Schlömann
(1994) indicated that DLH was derived from a preexisting pathway, and since
the results of this study reveal that DLH has a widespread biogeographical
distribution, it can be concluded that the gene is present in a great
number of microorganisms adapted to a variety of environments. The gene has
therefore had a long period of time to be passed on to other microorganisms
through horizontal gene transfer. The more time passes, the more probable
it is for a variety of microorganisms to acquire the gene. Hence, instead
of turning to controversial methods such as genetic modification, it should
be possible to extract adapted strains from a variety of habitats and grow
them in the laboratory. Thereafter, mixed inocula could be produced (e.g.
adapted-strain/fertilizer inocula) and used for the bioremediation of
certain contaminated sites, as suggested by Raghavan & Vivekanandan
(1999).
VII. Concluding remarks
Dienelactone hydrolase was found to be present in many environments,
ranging from soil and groundwater environments to extreme environments.
The evolutionary relationship between different strains was readily
apparent in the dendrograms. Groundwater and soil samples were scattered
across the entire dendrogram, whilst extremophiles exhibited a closer
relationship, as shown by the clusters. Therefore, there seems to be an
adaptation in the extremophiles that is exclusive to their extreme
environments.
This observation is based on the rooted dendrogram, which displays two
distinct clusters that seem to comprise samples that share a very close
genetic relationship, hence the absence of sequences retrieved from other
environments within these clusters. These two distinct clusters are the
dechlorinating and hot spring/boiling water microbiomes. The difference
between the two well-conserved clusters is that the genes retrieved from
hot spring/boiling water environments are only found in a certain (rare)
type of geographical area, whereas the dechlorinating communities are
geographically more widespread.
Our expectations were confirmed: microorganisms from soil and groundwater
are found across many environments because of their abundance. This
abundance is consistent with the fact that the conditions present in these
environments are widespread on our planet.
From the results of this study it can be concluded that DLH is likely to be
useful in bioremediation in the future, but further research on the matter
is warranted. The best results will probably come from bioremediation
techniques that combine pre-assessed effective methods custom made for the
environment under decontamination, and that circumvent obstacles such as
limiting factors and other parameters. Enzyme assays will have to be
developed in order to detect the activity of catabolic genes in a given
environment.
VIII. Acknowledgments
I would foremost like to thank Associate Professor Sara Sjöling & Karin
Hjort for supervising and guiding me through the process. Second, I want to
thank the Department of Life Sciences at Södertörn University for making
this thesis possible. Last but not least, I would like to thank my examiner for
reviewing and assessing this report.
I am blessed to be expecting a daughter in August, so, any work that I do
from now on is strictly dedicated to her, and my family. Mutatis Mutandis.
IX. Appendix
Fig. 1. Illustration of the modified ortho-cleavage pathway with related enzymes. TCC = Tricarboxylic acid cycle. Degradative pathways for 3-chlorocatechol, 4-chlorocatechol and 3,5-dichlorocatechol in Rhodococcus opacus 1CP (Moiseeva et al., 2002).
Fig. 2. 3D crystal structure of dienelactone hydrolase (DLH) from Pseudomonas putida
KT2440 retrieved from the Protein Data Bank (PDB). The coloured molecules are glycerol
and sulphate ions. Image of PDB entry 1ZI6 (Kim, H.K., Liu, J.W., Carr, P.D., Ollis, D.L.
(2005) Following directed evolution with crystallography: structural changes observed in
changing the substrate specificity of dienelactone hydrolase. Acta Crystallogr., Sect. D
61: 920-931) created with Jmol (J.L. Moreland, A. Gramada, O.V. Buzko, Q. Zhang, P.E.
Bourne (2005) The Molecular Biology Toolkit (MBT): a modular platform for developing
molecular visualization applications. BMC Bioinformatics 6:21).
Fig. 3. Flow diagram of a typical metagenome project. Dashed arrows indicate steps
that can be omitted (Thomas et al., 2012).
-1 sequence of microbiome from ANAS dechlorinating bioreactor
-2 sequences of microbiome from Acid mine drainage
-1 sequence of microbiome from Acromyrmex echinator fungus garden
-2 sequences of microbiome from Air microbial communities Singapore indoor air filters
-2 sequences of microbiome from Aquatic dechlorinating community (KB-1)
-1 sequence of microbiome from Bath Hot Springs, filamentous community
-1 sequence of microbiome from Bath Hot Springs, planktonic community
-5 sequences of microbiome from Bison Hot Spring Pool, Yellowstone
-6 sequences of microbiome from Fossil microbial community
-1 sequence of microbiome from Marine microbial community from Deepwater Horizon Oil Spill
-1 sequence of microbiome from Hot Spring microbial communities from Yellowstone National Park
-18 sequences of microbiome from Oak Ridge Pristine Groundwater
-1 sequence of microbiome from Saline water microbial communities from Great Salt Lake
-1 sequence of microbiome from Sediment and water microbial communities from Great Boiling Spring
-2 sequences of microbiome from Uranium Contaminated Groundwater FW106
-1 sequence of microbiome from Marine planktonic communities from Hawaii Ocean Times Series Station (oxygen minimum layer)
-1 sequence of microbiome from Bacterial pyrene-degrading mixed culture
-2 sequences of microbiome from Groundwater dechlorinating community (KB-1) from synthetic mineral medium in Toronto, ON, sample from site contaminated with chlorinated ethenes
-2 sequences of microbiome from Hypersaline Mat
-2 sequences of microbiome from Marine Trichodesmium cyanobacterial communities the Bermuda Atlantic Time-Series
-2 sequences of microbiome from Lake Washington Formaldehyde enrichment
-1 sequence of microbiome from PCE-dechlorinating mixed culture
-2 sequences of microbiome from Soil microbial communities from Puerto Rico rain forest, that decompose switchgrass
-3 sequences of microbiome from Thermal compost enrichment from Puerto Rico rainforest
Table 1. Number of sequences taken from each of the microbiomes used in the dendrograms.
NAME OF MICROBIOME (IMG/M) | MODIFIED NAME (Dendrogram)
ANAS dechlorinating bioreactor | ANAS
Acid mine drainage | Acid
Acromyrmex echinator fungus garden | Acromyrmex
Air microbial communities Singapore indoor air filters | Air
Aquatic dechlorinating community (KB-1) | Aquatic
Bath Hot Springs (filamentous & planktonic communities) | BathHotS
Bison Hot spring pool Yellowstone | Bison
Fossil microbial community | Fossil
Marine microbial community from Deepwater Horizon Oil Spill | Oil
Hot Spring microbial communities from Yellowstone National Park | Hot
Oak Ridge Pristine Groundwater | Oak
Saline water microbial communities from Great Salt Lake | Saline
Sediment and water microbial communities from Great Boiling Spring | Sediment_water_boiling
Uranium Contaminated Groundwater FW106 | Uranium
Marine planktonic communities from Hawaii Ocean Times Series Station (oxygen minimum layer) | O2-minimum
Bacterial pyrene-degrading mixed culture | Pyrene-degrading
Groundwater dechlorinating community (KB-1) from synthetic mineral medium in Toronto, ON, sample from site contaminated with chlorinated ethenes | Groundwater-dechlorinating
Hypersaline Mat | Hypersaline
Marine Trichodesmium cyanobacterial communities the Bermuda Atlantic Time-Series | Cyanobacterial-community
Lake Washington Formaldehyde enrichment | Formaldehyde-enrichment
PCE-dechlorinating mixed culture | PCE-dechlorinating
Soil microbial communities from Puerto Rico rain forest, that decompose switchgrass | Rainforest-soil
Thermal compost enrichment from Puerto Rico rainforest | Rainforest-thermophile
Table 2. Modified names (inserted in dendrogram) and corresponding names as in IMG/M
Multiple Sequence Alignment
Fossil4 --------------------------------------------------
Oak9 --------------------------------------------------
Air1 --------------------------------------------------
Fossil1 --------------------------------------------------
Cyanobacterial-community --------------------------------------------------
PA2682 --------------------------------------------------
Uranium1 --------------------------------------------------
Rainforest-Thermophiles1 --------------------------------------------------
Acid1 ---------------------------------------------MKKTF
Acid2 --------------------------------------------------
Hypersaline --------------------------------------------------
Cyanobacterial-community2 --------------------------------------------------
O2-minimum --------------------------------------------------
Bison5 -------------------PKGMHMKDLVQDGDSLVAKTAFEDGVDRRVF
Uranium2 --------------------------------------------------
Acromyrmex --------------------------------------------------
Formaldehyde-enrichment2 ------------------------------SQDDHFNSLVPETPIDRRGF
Aquatic_dechlorinating_ --------------------------------------------------
Groundwater-dechlorinating --------------------------------------------------
ANAS --------------------------------------------------
Oak11 -----------------------------------------------MSI
Oak1 --------------------------------------------------
Pseudomonas --------------------------------------------------
PA1166 --------------------------------------------------
Hypersaline2 --------------------------------------------------
Oak14 --------------------------------------------------
PA1597 --------------------------------------------------
Oak18 --------------------------------------------------
Saline1 -----------------------MRVAGFVLILCTLPLLAGCGSDSGSEA
Oak3 --------------------------------------------------
Bison2 --------------------------------------------------
Bison3 --------------------------------------------------
BathHotS1 --------------------------------------------------
BathHotS2 --------------------------------------------------
Bison1 --------------------------------------------------
Bison4 --------------------------------------------------
Sediment_Water_Boiling --------------------------------------------------
Rainforest-Thermophiles2 --------------------------------------------------
Oak17 --------------------------------------------------
Pyrene-degrading --------------------------------------------------
Fossil2 --------------------------------------------------
Rainforest-Thermophiles MADIKKEDIK------QEVFDLYDDYAHNRIDRREFVQKLSLYAVGGLTV
Fossil5 MTPIRKRASD----FHPHILEIFDGYVHGAISKRDFIKQAGKFAAAGVTG
Oak12 MT--RLTAKD----FAPELLELYDGYAHGKINRREFLDRAALFTLGGLTA
Marine_Oil MLMNNQQEENNGHLIPLEAFNWYDEYAHGLIDRRTFIARLSMLVTATLTL
Oak10 ----------------------------------------KRLALTGVAL
Oak7 --------------MDQKFITLFDRFTHGGMNRRTFMEKLTILAGSATAA
Oak4 ---------------------------------------------MKRIG
Oak5 ---------------------------------------------MKRIG
Fossil6 --------------------------------------------------
Fossil3 --------------------------------------------------
Oak2 --------------------------------------------------
Oak13 ------------------------------MCDQDHFDVDKLEFETKGLV
Oak15 -----------------------------------------GLAKSARFR
Aquatic_dechlorinating_2 --------------------------------------------------
Groundwater-dechlorinating1 --------------------------------------------------
Hot --------------------------------------------------
Formaldehyde-enrichment --------------------------------------------------
Rainforest-Soil --------------------------------------------------
Rainforest-Soil2 --------------------------------------------------
PCE-dechlorinating --------------------------------------------------
Oak6 ---------------------------------MTKGLSKGCQKDFPSGA
Oak16 --------------------------------------------------
Air2 --------------------------------------------------
Oak8 --------------------------------------------------
Saline2 --------------------------------------------------
Fossil4 -----------------------LRAPWWPGHRNRIDAGISRLPSPAAFG
Oak9 -----------------------IRVPVTGG---EMSAYMS-LPK---KG
Air1 -----------------------MSATNTTIPALDSEGEIPAYVARPDAD
Fossil1 --------------------------MTNSLTVTTPDGKFDAYVAMPAKL
Cyanobacterial-community --------------------------MKITISSN-YDETFTADLKIPTST
PA2682 ------------------------MGQYVSIAASDGSGRFDAYLALPASG
Uranium1 -------------------------MGQQINIPTSGTQCIGAYMAQASGK
Rainforest-Thermophiles1 -------------------------MGQWTELETP-AGSVAAWQADPPGT
Acid1 YIDVEQGGILMTEKQQMPEFFDRSAISGIAAEECRITSSLGGAFARPKGS
Acid2 ----------------MGEWIER----GLAFKEF------------PAGE
Hypersaline ---------------------------------------------MPVGE
Cyanobacterial-community2 --------------------MTQLKIHTTHIQVPNGDLQIDSYLAQPLEA
O2-minimum -------------------------------------------LRGTTTG
Bison5 LKAAVGSGFAAATLPVMAQSMIQTDTSGLSAGDHIIVINGQDVPVYRAQP
Uranium2 --------------------------------------------MSRA--
Acromyrmex ----------------------------VSAPFPVILVVHEIFGINDY--
Formaldehyde-enrichment2 IAAALAAGFAVTAGPVLG-QAIKTPMDGLEGGDISIGDIPAYYAVPKAG-
Aquatic_dechlorinating_ -----------MAQTKKMHS--------ETVNYKDGETELQGYLVYDENL
Groundwater-dechlorinating -----------MAQTKKMHS--------ETVNYKDGETELQGYLVYDENL
ANAS MKMLVAGLLFCVAMVFPMASGAGAAVRMETYPYGKGEVRLLGQLAWDDAV
Oak11 MTHRGSAILFWLVLIGCLPS-AQAAIQGQAVEYRDGDTVLEGYVAYDDAH
Oak1 --------------------------------------------------
Pseudomonas ---MNMRALLALTLMCSAALAQAAVVTREIPYQDDDGNRLVGYYAYDDAL
PA1166 -----MRLLCALLLIACAASVQAAIQTREMPYRSADGTRMVGYFAYDDSK
Hypersaline2 -----------------------------ISYED-EGVPLTGHLYWDDAI
Oak14 ------------------------------VVYQIDGQSYESRLAFDASH
PA1597 ----------------------MSEIRVEPVAYDIDGQPYEGQLVYDASH
Oak18 --------------------------------------------------
Saline1 ERMAEEHEGDTPTATEAAQAPKIPVEGRTVTYGQQNGTARTGYLAAPADV
Oak3 -----MKRIAIFLALLAFAASAFAAEGRTVTYKSGNDTISAVLYAPAKTM
Bison2 --------------------------MGQRISFNVNGVEVSGYLAEPENL
Bison3 --------------------------MGQRISFSVNGVEVSGYLAEPENL
BathHotS1 --------------------------MGQRISFNVNGVEVSGYLAEPENL
BathHotS2 --------------------------MGQRISFNVNGVEVSGYLAEPENL
Bison1 --------------------------MGQRISFNVNGVEVSGYLAEPENM
Bison4 --------------------------MGQRISFSVNGVEVSGYLAEPENM
Sediment_Water_Boiling --------------------------MGQRISFSINGIEVSGYLAEPENM
Rainforest-Thermophiles2 -------------------------MKTETLQFETANGATTAYVAMPDNA
Oak17 ---------------------------------ASNGEQARGYLSLPSGG
Pyrene-degrading -------------------------MGHLLDFKRPDNTNCRGYLAT-AGQ
Fossil2 -MLLSRLSPDYGMPEQVSFNDPDILASYEKYDSPNGNGEIEGYLVKPTAA
Rainforest-Thermophiles SSLMSFLMPDYKNRTQVKADDPRIQAEYITYASPKGAGTMKGLLCMPSDV
Fossil5 AMILDQLQPNYAWAAQVEPDDPSILSERISYDSPEGHGKIIGLMAKPVGA
Oak12 SALLAALSPNYALAEQVKFTDPDIVADYITYPSPKGNGTVRGYLVRPAKA
Marine_Oil SVLTSALIPNYAKAEQVSFNDQDIIAKYSTFSSPEGHGEGRGYLVLPAYI
Oak10 TAITEGLMPNYALGQQVRKDDERIKATYETVQSPMGNGSIKGYFVRPTSA
Oak7 NALLPLLENNYARADILPEGDPRIVSQTLEYKG--GAG----YFVKPSAE
Oak4 LVLVLLIAPGISAQDWARVKLEKSPRHREWVTVKHEGRAVETFVAYPESK
Oak5 LVLVLLIAPGISAQDWARVKLEKSPRHREWVTVKHEGRAVETFVAYPESK
Fossil6 --------------------------------------TMHNFVAYPERS
Fossil3 ---------------------------------------MSTYHVAP-AE
Oak2 -------------------------MIEKEVRVTSRNGVIPSFAVCPEGP
Oak13 TRRQFGVLLGAGMAMLLPRVVNAVAVTDGEVTITTPDGTCDAYFVHP-AS
Oak15 DDEAFIELLADVDAVIGAAAQPTPRVTTTKIEIATADGKCPSYVFRPEGT
Aquatic_dechlorinating_2 ---TSREPAQEPAVGDEPSPPFPYTAEDVDFGDARAGIRLAGTLTVPEGK
Groundwater-dechlorinating1 ---TSREPAQEPAVGDEPSPPFPYTAEDVDFGDARAGIRLAGTLTVPEGK
Hot -------------------------MEEKVRYKSFDGKEVEAFLVKGGDK
Formaldehyde-enrichment ----------------------MQNPTSEQIEISTADGLMPAVLAHPVVA
Rainforest-Soil ---------------------------MSIQCETVRYGDQVAFFAAPERH
Rainforest-Soil2 ---------------------------------------MPAWLAVPKSS
PCE-dechlorinating ----------------------QAFEEREVVLNAGTDWELPGTLALPVKR
Oak6 NSPILLRTQKPAIDPIHKSEPLMTIKDNEIVEVPTPTGPMRTYVFRPTAE
Oak16 ---------------------------------------VRRLADQEQAA
Air2 --------------------------MGEWITLDTHYGPVRAWQATPEGK
Oak8 MTNRIAGLPGKSKGDRVGAEQRLLPAPGRNRRGRVGEGDADHTLLRDGEQ
Saline2 -----MPGPARTGPLHPEARSRTDRRRFLASAGALGTALLAGCLGDDTES
Fossil4 NR----------PGVLVLQ--------EISGSRLDGRHP-DWLAGEGFTA
Oak9 KG----------PGIVVLQ--------EIFGVNESMRKVCDFLASRQFTA
Air1 SP----------RAIIVIP--------EIFGVNAGIRKKCDDWAAEGYLV
Fossil1 PA----------PVVVVIQ--------EIFGVNPVMRGIADDYAKQGYIA
Cyanobacterial-community PA----------PGLIIIQ--------EIFGVNEVMRNIADRYAQLGYVA
PA2682 KG----------PGVVIGQ--------EIFGVNANMRAVADLYAEEGYVA
Uranium1 PK----------GGLLVIQ--------EIYGVNAHMRSVVDRFARLGYTA
Rainforest-Thermophiles1 PR----------GGLVVIQ--------EIFGVNPHIRAVADGYAAEGYVV
Acid1 G---------PHPAMIVFM--------EAFGLNGFIKDFLRLLAAEGFFA
Acid2 K---------KVPGIILLM--------EAYGVNEHFRRLAARLAGWGYAV
Hypersaline E---------SLPGVVVLQ--------EIFGVNDHIRDVTQRIAQEGYVA
Cyanobacterial-community2 G---------LFPAVVVFQ--------EIFGVNNHIREVTENIAKEGYVA
O2-minimum S---------PGPAGLVIM--------EAFGVDAHIMDVARRLATEGYVT
Bison5 EGR------SNLPVVLVIS--------EIFGVHEHIKDVARRFAKAGYLA
Uranium2 -----------------------------------------RFAKQGYLA
Acromyrmex -----------------------------------IRDICRRLAEAGYLA
Formaldehyde-enrichment2 ---------GRRPVLLVVT--------EIWGLHEYIKDTCRRLAKAGYFA
Aquatic_dechlorinating_ TS--------PAPGVLVVH--------EWMGLNDYAKHRADMLAELGYVA
Groundwater-dechlorinating TS--------PAPGVLVVH--------EWMGLNDYAKHRADMLAELGYVA
ANAS KG--------PRPAVLVVH--------EWWGLNNYARERASALASMGYIA
Oak11 TQ--------PRPGVLVVH--------EWKGLNEYAKRRARQLAELGYIA
Oak1 --------------VLVCH--------EGSGLDRHAKGRAERLAGLGYAA
Pseudomonas DG--------KRPGIVVVH--------EWWGLNDYAKRRARDLAALGYKA
PA1166 PG--------IRPGVIVVH--------EWWGLNDYAKRRARDLAELGYSA
Hypersaline2 AD--------ERPGILVIH--------EWWGLNDYAKKRARMLAELGYVA
Oak14 KG--------PLPGLLMAP--------NWRGVSAGAEEIAKRVAAKGYVV
PA1597 AG--------PRPGLLMAP--------NWMGVSAAALDIARQVAGRGHVV
Oak18 -----------LPGLVVIH--------EWWGLNDDIKAVTRRLAAEGYVA
Saline1 DSVRSARGGDALPGIVVIH--------EWWGLNDNVRAATRRLAGEGYRA
Oak3 KG--------KLPAIVIIH--------EWWGLNDWVQEQASKWADQGYVT
Bison2 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA
Bison3 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA
BathHotS1 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA
BathHotS2 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA
Bison1 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA
Bison4 QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFCA
Sediment_Water_Boiling QK--------QAPLIMVFH--------EWWGLVKHIEDVCDRFAREGFYA
Rainforest-Thermophiles2 DA--------SKAVILI-H--------EWWGLNDHIKDIANRYAAEGFIA
Oak17 TG----------PGVIVVQ--------EWWGLNPQIKGVADRLASEGFVA
Pyrene-degrading DR----------PGVVVIQ--------EWWGLNDQICGVADRFARAGFNA
Fossil2 TG--------KIPAVLVIH--------ENRGLNPYVKDVAHRVAKAGFLA
Rainforest-Thermophiles KE--------KLAGVIVVH--------ENRGLNPYIEDVARRTALAGFVA
Fossil5 TG--------KLPAVLVIH--------ENRGLNPYIEDVARRMAKAGYLA
Oak12 AG--------KLPAVVVVH--------ENRGLNPYIEDVARRLAKAGFIA
Marine_Oil AN--------KAPVVLVVH--------ENRGLNPYIKDVARRLAKAGFIA
Oak10 DSREAMPT--KLPGVIVVH--------ENRGLNPHIEDVARRFAMENFMA
Oak7 G---------KYPGVIVIH--------ENRGLNPHIKDVARRLAVEGFAV
Oak4 DK---------TPVVLVIH--------EIFGMTDWVEDLADQVAEAGYIA
Oak5 DK---------TPVVLVIH--------EIFGMTDWVEDLADQVAEAGYIA
Fossil6 DK---------APVFIVIH--------ENRGLNEWARSFTDQLAKKGFIA
Fossil3 GKY---------PPVILYM--------DAPGIREELRDFARRIAAQGYFV
Oak2 GAF---------PGIILYM--------DAPGIREELRNLARRIAKHGYFC
Oak13 GTA---------PGVLLWP--------DIFGRRPAMHQMAKRLAESGYSV
Oak15 GPF---------PAVLVFM--------DGLGIRPAILEIGERLSTYGYFV
Aquatic_dechlorinating_2 RP---------FPGVVLVS--------G----SGPQNRDEEILGHRPFLV
Groundwater-dechlorinating1 RP---------FPGVVLVS--------G----SGPQNRDEEILGHRPFLV
Hot AG------------IIVIS--------EIWGLTQFIQNVARRLASLGYTS
Formaldehyde-enrichment RG-----------AVVVVQ--------EAFGVTSHLESICHRLASVGWLA
Rainforest-Soil RES--------LPAVIVVQ--------EVFGLGGHIEDVARRIAAAGYAA
Rainforest-Soil2 TP---------VPGVVVLH--------DVFGMSRDHRNQANWLADAGFLA
PCE-dechlorinating GAS---------PAVVLVHGSGANDRDETIGPNKPFRDLAWGLASQGIVV
Oak6 GR---------YPGILLYS--------EIFQVTGPIRRTAAMLAGHGFVV
Oak16 QD---------QDQVLAAD-----------GVAEDREQRVGQAHHPGDRQ
Air2 PR----------GGLVVIQ--------EIFGANPHIRAVADDYAAKGYAV
Oak8 MDT-------EHADMRAVGDASEADTVAACALDYFSDGPCGRLVRQAVAV
Saline2 TDD--------DGGTDDGGDGG-----TDGGGDDNGGTDDGGDDGDGIDD
Fossil4 LCPDLFWRIEPGIQITDK--------------------TEAELNRAFELF
Oak9 VCPDLFWRAEPGVELK-----------------------ETEFERARALR
Air1 IAPDIFWRFAPGVELNPD--------------------VEAELQEAFGYF
Fossil1 VCPDLFWRIEPGINITDQ--------------------TEEEWKQAFGYY
Cyanobacterial-community IIPDLFWRQEPGIELSAQ--------------------SEDDWKKAFELY
PA2682 LVPDLFWRLQPGVDLG-Y--------------------DEAAFAKAIELF
Uranium1 IAPAFFDHLETGVELD-Y--------------------DRAGTHKGKQLV
Rainforest-Thermophiles1 LAPAFFDPVERGVELG-Y--------------------GEDGFARGRALV
Acid1 VAPDLY-----EGKIYEY--------------------SDFSGAIG--HL
Acid2 LVPDLYRRFPEERRVVAY--------------------SDRETAMG--NL
Hypersaline IAPALYQR-VAPGFETGY--------------------TEADLKIGKEYK
Cyanobacterial-community2 IAPSIYQR-QAPGFEVGY--------------------TEEDIILGRKYK
O2-minimum AAPDLFHR-GGR-LASAP--------------------YDKLAEYRDQLR
Bison5 IAPDLFVR----QGDPTK--------------------IANIADLMKDII
Uranium2 LAPELFVR----QGDAHN--------------------ASSIADLMTNIV
Acromyrmex IAPDLFFR----HSDPAS--------------------FSSPQQLKNELV
Formaldehyde-enrichment2 VANDPYYR----LGELWK--------------------LTQIKEVLAKAN
Aquatic_dechlorinating_ FAVDIYGVNNLPNDM-------------------------QGAAAMAGKF
Groundwater-dechlorinating FAVDIYGVNNLPNDM-------------------------QGAAAMAGKF
ANAS LAADIYGEGFATTDP-------------------------SKARELAGKF
Oak11 FAADMYGKGVLAKDH-------------------------DEAAKLSGVF
Oak1 FALDYHGDG-KPLGR-------------------------DEMMDRLGQL
Pseudomonas LAIDMYGDGKHTEHP-------------------------QDAQAFMAAA
PA1166 LAIDMYGEGKHTEHP-------------------------QDAMAFMQAA
Hypersaline2 FAADMYGNDQVTDQP-------------------------SQAREWMQEV
Oak14 LIADLYGQKVRPSNG-------------------------DEAGAAMMPL
PA1597 LVADLYGRDVRPQNG-------------------------DEAGAAMMPL
Oak18 LAVDLYGGKTAATP---------------------------DAAEALTND
Saline1 LAVDLYGGAVAETP---------------------------DSAQALMGQ
Oak3 LAVDLYRGKVATDR---------------------------DMAHELMRG
Bison2 FAIDLYKGKTADNP--------------------------EDAGKLMMDL
Bison3 FAIDLYKGKTADNP--------------------------EDAGKLMMDL
BathHotS1 FAIDLYKGKTADNP--------------------------EDAGKLMMDL
BathHotS2 FAIDLYKGKTADNP--------------------------EDAGKLMMDL
Bison1 FAIDLYKGKTADNP--------------------------EDAGKLMMDL
Bison4 FAIDLYKGKTADNP--------------------------EDAGKLMMDL
Sediment_Water_Boiling FAIDLYKGKTADNP--------------------------EDAGKLMMDL
Rainforest-Thermophiles2 IAPDLYRGTIATDP--------------------------QEASKLMHGL
Oak17 LAPDLYRGELAGHDE-------------------------MDRAGELMSK
Pyrene-degrading LAPDLYHGRIT--QD-------------------------ANEASHMMNG
Fossil2 FAPDGLSSVGGYPG---------------------------NDAEGKALQ
Rainforest-Thermophiles LAPDALTPLGGYPG---------------------------NDDEGRALQ
Fossil5 LAPDGLSPLGGYPG---------------------------NDDEGRTMQ
Oak12 LAPDGLTSVGGYPG---------------------------NDEKGVELQ
Marine_Oil FAPDILHTLGGYPG---------------------------NDDEGRKMQ
Oak10 FAPDGLTSVGGFPG---------------------------NDFQGGQLF
Oak7 LAPDYLSGLGGTPE---------------------------DADKARDMI
Oak4 VAPDLLSGMGPNGGRSSD--------------------F-AQG-KTMEAV
Oak5 VAPDLLSGMGPNGGRSSD--------------------F-AQG-KTMEAV
Fossil6 VAPDLISNTVEGFEKTTD--------------------F-ENSDAARSAI
Fossil3 LLPDMYYRQGELRFDLSKG---------------KEE----MKR-MFGAM
Oak2 LLPDMYYRLGQLRFDFVRR---------------AEG----MRATMFAAM
Oak13 LVVNPFYRVKKAPTADAGA---------------ATP----IQQLMP-LA
Oak15 LLPDLYYRFGPYAPMDARA---------------IFTDPEKIKELRERFF
Aquatic_dechlorinating_2 LADYLTRRGIAVLRYDDR-----------------------GVGDSKGSF
Groundwater-dechlorinating1 LADYLTRRGIAVLRYDDR-----------------------GVGDSKGSF
Hot LAPNLYSREGDLFSPENISGVMRRFFSIPPEKRGDQEFISRIVAELNERE
Formaldehyde-enrichment VAPALYHRQGSPVFAY------------------------DDLAGVMPVI
Rainforest-Soil IAPDLYAVDGVRPPHMTSARIERAFGVTRSLAPELAEDPVAKAAALARLP
Rainforest-Soil2 LAPDLYYHGGRLICIR-----------------------------HVIRD
PCE-dechlorinating LRYDKRTR--------------------------------VHASQMAGLR
Oak6 VAPEIYHEFEPAGTVLAYD--------------------QAGADRGNALK
Oak16 QQPDARAHRQAQADLP--------------------------GPRPLVLG
Air2 LAPSFFDLAESAPDS--------------------------TEPPELPYD
Oak8 VDQAHSGAVREHLRP-------------------------RGAVGAAVLQ
Saline2 DPVTLERRAREYMQLQGEG---------------------SFEAAFERFA
Fossil4 GLFNQDTGLQDIRTSLSALRAL-------------DACS---GKAGAIGY
Oak9 GKMNDDQVTDDIASAIAFLRKH-------------PACD---GTVGVVGY
Air1 GQYDADDGVKDIEAAIRWLHAQ-------------GA-----GKVGAVGF
Fossil1 QAFNVDKGVEDIAATMAQARKI-------------DGAN---GKVGVVGF
Cyanobacterial-community QGFDENKGVDDLISTMKTLKKL-------------PECS---GQVGTIGF
PA2682 QRIDLDAAVDDIAACIEHLRQR-------------EEVVH--AGIGFVGF
Uranium1 TELGLERALEDVASAAEAIAS---------------AGR-----IGTVGY
Rainforest-Thermophiles1 QELGTARALAILQAAAERLRAD-------------LAARQAPTAVGTVGY
Acid1 SRLKDDVVMEQTRQTLEWLE---------------KRPDVQKDRTGALGF
Acid2 SRLKDEEAKEDISRCLDILR---------------NDPRVDRDRIGVVGF
Hypersaline AQTKAEELLGDIQGAIDYLR---------------EQTPVKSNAIGCIGF
Cyanobacterial-community2 EQTKASELLGDIQATINYLK---------------TLPTVKSDKFGCIGF
O2-minimum VGFSDQTVLSDVEAAVRQLQ---------------IDPGVKG-PIGIVGF
Bison5 SKTPDAQVMSDLDTVVNWAR---------------QRG-GDIERLGITGF
Uranium2 AKVPDAQVMGDLDACVAWAR---------------ENG-GDTSRLGITGF
Acromyrmex GKVADREVLADLDHAANWAA---------------THG-GDLRRLGLTGF
Formaldehyde-enrichment2 S-LADEQAFSDLDAVVAWAG---------------THKRANVARLGITGF
Aquatic_dechlorinating_ KS-DRKLMRQRISLGLDELK---------------KQPNVNVNKIAAIGY
Groundwater-dechlorinating KS-DRKLMRQRISLGLDELK---------------KQPNVNVNKIAAIGY
ANAS RAGDRALLRERVNSALAALK---------------THPLADKGRVAAIGY
Oak11 RN-DRQLMRRRAKAGLEALS---------------KHPLTDPSRLAAIGY
Oak1 MG-DPDRIRAIGRAGLDVLL---------------AQPEVDPGRVAAIGY
Pseudomonas MK-DPAAAAARFDAGLELLK---------------KQPNVNKHQLGAVGY
PA1166 TR-DADAAKARFLAGLELLK---------------RQPQTDPSQIAAIGY
Hypersaline2 TV-DPELWRQRADAGLAQLK---------------AAADVDDAQIAAIGY
Oak14 KN-DRPLLNKRMQAALEQLQGQ-------------AEAAVDTSKLATFGF
PA1597 KN-DRALLRKRMQAALAALRGQ-------------ALAAVDTTRQAAFGF
Oak18 VYADPDGTRRNLQQAYDYLE---------------KYAFAP--RIATIGW
Saline1 AMREPSRLVENVRDGRAYLS---------------SEADAP--RTALLGW
Oak3 LDQE--RAVADMRAGIVYLK---------------SLPNVDGARIGSIGW
Bison2 MQNRLQEAENMVRASLEYFKKENI----------GYVPRVGEFMFGATGY
Bison3 MQNRLQEAENMVRASLEYFKKENI----------GYVPRVGEFMFGATGY
BathHotS1 MQNRLQEAESMVRASLEYFKKENI----------GYVPRIGEFMFGATGY
BathHotS2 MQNRLQEAENMVRASLEYFKKEDI----------GYVPRVGEFMFGATGY
Bison1 MQNRLQEAEDMVRAALEYFKKNDI----------GYVPRVGEFMCGAIGY
Bison4 MQNRLQEAEDMVRAALEYFRKNDI----------GYVPRVGEFMCGATGY
Sediment_Water_Boiling MQNRLDEAEKMVQASLEYFKRENI----------GYVPRVGEFMFGATGY
Rainforest-Thermophiles2 A---IEDGLDTIKNAMDAAR-----------------AKYGITHFGITGY
Oak17 MP-MERAAR-DMSGAVDFLAAH---------------PAVTGAGIGAIGF
Pyrene-degrading LD-FPGATHQDIHGAVTHLQR-------------------ISSQVGVMGF
Fossil2 ATVDGTKLMNDFFAGFEHLM----------------GHEASTGKVGAVGF
Rainforest-Thermophiles AKRNREEILEDFIAGYDYLK----------------KHPKCTGRIGVVGF
Fossil5 RTLDGAKLMEDFFAAFEFLR----------------DHDGSTGKVGAVGF
Oak12 QKVDPTKLMNDFFAAIEWLM----------------HHDSSTGKVGITGF
Marine_Oil SSMDRTKIEADFIAAAKFIK----------------SHPQCSGKLGAVGF
Oak10 MKVDGNKMREDMVAAANWLR----------------SRPDCNGKICATGF
Oak7 GTLTPEGIDSSSSAALAAVK----------------ANPACNGKAGAVGF
Oak4 SHLPPDQITADLNAVVDYAL----------------KLPASNGKLYVTGF
Oak5 SHLPPDQITADLNAVVDYAL----------------KLPASNGKLYVTGF
Fossil6 YGLDPDNVTQDLNAVLKYAK----------------SIKAGNGEIYVVGF
Fossil3 GTLNNALVMDDTRGMLDYLASA--------------PLAKAGP-RGCIGY
Oak2 NSLTNALVMEDTSAWLGFLEAQ--------------DKVKTGP-VGCVGY
Oak13 QALNETTHMSDAKAFVAWLDQQ--------------PSVAKNRKIGTQGY
Oak15 PHASPEKILADTGAFLVWLDSQ--------------PDVKPGG-IGTTGY
Aquatic_dechlorinating_2 QSATTFDFVDDARAALDFLAE---------------QPEVDARRVGVIGH
Groundwater-dechlorinating1 QSATTFDFVDDARAALDFLAE---------------QPEVDARRVGVIGH
Hot KRIYETLVVNRASTEDRMLKDLEHGY--------NYLKSMGISKYGVIGF
Formaldehyde-enrichment QTLTAAGIETDIDSALGYLH----------------ARGFEAHHCATLGF
Rainforest-Soil DGDAIAETEAGIQAAFAGMAGFTASLRKAFRYVVGERPETKGQKVACVGF
Rainforest-Soil2 LMARTGPAFDDVEAGRTWLLS----------------RRECSGRVGVIGF
PCE-dechlorinating DMTVEDEVIHDAVAAVELLRN---------------TDEVDPDRVFVVGH
Oak6 TTKELGSYDSDARAALAYLK----------------ALPVCTGKLGVMGI
Oak16 QLAGEDREEDDVVDAEDDLQTG-----------------EGGQGDEVLGR
Air2 GDDIAGWVGRAREQGSAVFEAK----------------DAQNHRRDYAGA
Oak8 VTAVKRHARQAVAGQSLLLRAY--------------QMLRRGFSHRRVGA
Saline2 ESVAEQVSVADIESGWEQVVQT-------------TGSFESVLAVEFQGI
*
Fossil4 CLGGLLAYRTACHTD--SDASVGYYGVS--------IEDRLAEAAG----
Oak9 CWGGMLAYLTAVRHK--PDAAVGYYGVG--------IEQRLDLAKN----
Air1 CLGGRLAYMTAARTD--IDASVGYYGVM--------IDQMLNESHA----
Fossil1 CLGGLLTFLSATRTD--GDAFAVYYGGG--------MDNYVGEADN----
Cyanobacterial-community CLGGKLAYLMATRSI--AECNVSYYGVG--------IEKNLDEASN----
PA2682 CMGGKLAYLAATRTD--VSCSVGYYGMG--------IEALLDEAKQ----
Uranium1 CWGGTVALLAALRLG--LPS-VSYYGAR--------NLPFLHEV------
Rainforest-Thermophiles1 CWGGSMALLAALRLG--LPS-VSYYGAR--------NLALLDECEREDPR
Acid1 CMGGRLTFLSLTTFPEKLKAGVSYYGGSIGHEGLDGLGRKEV-LSG----
Acid2 CMGGRLAFLSAGWFGEKIKAAVPFYGGGIGAPKGFFPGHTEVPLSL----
Hypersaline CFGGHVAYLAAT-LPDIKATASFYG----AGIATMTPGGNEPTISR----
Cyanobacterial-community2 CFGGHVTYLAAT-LPEIQAAASFYG----AGIATGTPGGGNPTITL----
O2-minimum CLGGRVSFVSAANVPGLAAAAVYYPG--NLVPAADAPTGTIRALEE----
Bison5 CWGGRITWLYAAHNPKVKAGVAWYG----RLTGDATANSPKHPVDV----
Uranium2 CWGGRIAWLYCAHNPAVKAGVVWYG----RLVGDKTALTPLQPLDI----
Acromyrmex CWGGRIAWLYATHNPQLQAAVAWYG----HLHPQITLRQPVTPVDA----
Formaldehyde-enrichment2 CRAGRTIWMYTAHSKRVKAGVAWYG----SLMPFGPNATG--PLDV----
Aquatic_dechlorinating_ CFGGTVVLELARSGA-DIAGVVSFHGG---------LDT--PMPED----
Groundwater-dechlorinating CFGGTVVLELARSGA-DIAGVVSFHGG---------LDT--PMPED----
ANAS CFGGTAVLELARSGA-ELDGVVSFHGG---------LGT--QVPAT----
Oak11 CFGGMTVLELARNGE-PLRGIVTFHGA---------LST--PHPED----
Oak1 CFGGTMALELARSGA-DLGAVVGFHSG---------LGT--QRPAQ----
Pseudomonas CFGGKVVLDAARRGE-KLDGVVSFHGA---------LAT--QTPAK----
PA1166 CFGGKIVLDMARQGL-PLAGVASFHGA---------LGT--ATPAS----
Hypersaline2 CFGGGTVLQMAYGGS-DIDGVVSFHGS---------LPA--APEEV----
Oak14 CFGGCCSLELARTGA-PLKAAVSFHGT---------LDT--PNPAD----
PA1597 CFGGCCALELARDGA-ELKAFVSFHGT---------LDT--PDPAH----
Oak18 DLGGEWSLQTALQYPGALDAAVMYYGR---------GVF--MDRDR----
Saline1 CFGGGMTYRTLAEEASAFDAAVAYYGT---------PDP--LAGEA----
Oak3 CMGGGMSFRLAVGEP-TLKAAVINYG----------GVT--SDPAV----
Bison2 CCGGTCVWYFGSRIE-DFKALVPYYG----------LYK--LAEID----
Bison3 CCGGTCTWYFGSRME-DFKALVPYYG----------LYK--LAEID----
BathHotS1 CCGGTCVWYFGSKIE-DFKALVPYYG----------LYK--LAEID----
BathHotS2 CCGGTCVWYFGSRIE-DFKALVPYYG----------LYR--LAEID----
Bison1 CCGGTCVWYFGSRLE-DFKALVPYYG----------LYK--LAEID----
Bison4 CCGGTCTWYFGSRME-DFKALVPYYG----------LYK--LAEID----
Sediment_Water_Boiling CCGGTCTWYFGSKFE-EFKALVPYYG----------LYK--LANID----
Rainforest-Thermophiles2 CMGGTFSLRAACELE-GVSAAAPFYG----------DIP--DEEV-----
Oak17 CMGGGLVLVLGCLRADKISAVVPFYGV---------LGFDDDNAPD----
Pyrene-degrading CMGGALTIAA-AVHVPALSAAVCFYGI---------P---PQEFAD----
Fossil2 CYGGGVCNALAVAYPE-MGASVPFYGR---------QAS---AAD-----
Rainforest-Thermophiles CFGGWVANMMAVRVPD-LGAAVPFYGG---------QPN---DED-----
Fossil5 CYGGGVC-------------------L---------EP------------
Oak12 CYGGGVANAAAVAYPE-LGAAVSFYGR---------QPE---AKD-----
Marine_Oil CFGGYIVN------------------------------------------
Oak10 CFGGGVANFLGVRLGENLAATAPFYGG---------NPA---LPD-----
Oak7 CWGGGAVNSLAVIDQG-LGAGVAYYGS---------QPA---AAD-----
Oak4 CWGGGQSFRFATNRGD-LAAAFVFYGP----------PP---KSDD----
Oak5 CWGGGQSFRFATNRGD-LAAAFVFYGP----------PP---KSDD----
Fossil6 CWGGSQSFRFATSAGDEIEAAMVFYGT---------GPQ---EASA----
Fossil3 CMSGQYVVSAAGTFPNDFTASASLYGVG--------IVTDQPDTPH---H
Oak2 CMSGRYVTTAAARFGNRFAASASLYGVG--------IVTDAEDSPH---L
Oak13 CMGGPIAFRTAAAVPDRVGAVGSFHGGG--------LVTTTPNSPH---L
Oak15 CMGGMLSLLAAGTYPDRVVAAASFHGAR--------LATDAPDSPH---L
Aquatic_dechlorinating_2 SEGAIVASILAARGAEDGQAADENSKAG--------ARAAFIVLLG----
Groundwater-dechlorinating1 SEGAIVASILAARGAEDGQAADENSKAG--------ARAAFIVLLG----
Hot CMGGGLSFQLSTQLP--FDATVVYYGR---------NPRSIEDISR----
Formaldehyde-enrichment CMGGIVSMYAATRTA--LGAAVTFYGGG--------VATGRFGFPP-LID
Rainforest-Soil CMGGGLSALLACEEE-GLSGAAIYYG----------MPPDPGAAVS----
Rainforest-Soil2 CMGGGFALLLASGHG--FSAASVNYGG-----------PLAKDVED----
PCE-dechlorinating SLGGYLAPRIAAEAG-HVAGVVILAGH-------------VRPLQD----
Oak6 CIGGHLSFRAAMNPEALAGVCFYATDIHKRGLG---KGTHDNTLDR----
Oak16 EQGGEEVGHGAGSSDAPVHVQCRCPGAAR------------PPAAG----
Air2 ALAGGDLYALLSAPAPGLLSWARLNPIG----------------------
Oak8 RFFESVSRQAQQFAPLFLHNIHSKGGTG--------MGKFIELKAS----
Saline2 ESGVAVVRVETAHTLARNTWQVSLNDEG-------ILGSVTTGQEP---Y
Fossil4 ----ISAPLMLHIAGADQFVPAAAQARLH---DGLGSNPHVTLHDYPGKE
Oak9 ----LSCPLMLHYAELDQYASPEVAAKVR---ATYQGDPRVTVWEYPKVG
Air1 ----IAHPLMLHIPTEDHLVDHDAQKKIH---EGLDPHPKVTLHDYQGLD
Fossil1 ----IKQPVIIHLAGNDEYIPAEAQDVIK---DALADHSLTELHFYPGRD
Cyanobacterial-community ----IEHPLMLHIAEKDDYVSPEIQSQLK---VELRNYSLVEIHSYPNVN
PA2682 ----IKGRLVLHFAEQDAYCPQQARDAIL---PCLRNLPKTELYLYPGVD
Uranium1 ----PKAPVLFHFGEKDQHITPEMVQKHR---DALP---QMDVYTYP-AD
Rainforest-Thermophiles1 LVAEPKAMVMFHFGEQDPSIPAEAVAAHR---QRLP---QMPLFVYP-AG
Acid1 -AGRLKSPILLLYGAKDDSIPSEEHGRIAKTLSALDKT--YLLSVYPDAP
Acid2 -VPGIRADLLLLYGGKDDFIPEEERNAVAKALSAANRS--FRMETFPDAG
Hypersaline -TQDITGTIHLFFGLDDASIPAEQVNQIEAELKKHQIA--HQIFRYEGAD
Cyanobacterial-community2 -TPKISGTIYCFFGTEDPLIPIEQVDQIEAELQKHQIK--HRVFRYP-AN
O2-minimum -CGKLDIPIIGFFGNNDANPTPEIVGQLDAELTKLGKQ--HDFNAYDDAG
Bison5 -AQGLKVPVLGLYGGKDTGIPLESVERMKTELAKGNSR--SEIVVFQPSG
Uranium2 -AYTLKTPVLGLYGAQDSSISQDSIDLMWQTLIHAGNH--SMFVVYPDAG
Acromyrmex -AASLTAPVLGLYGALDPMITAENVALMQQALRAANSD--SEIITYPDAG
Formaldehyde-enrichment2 -TDRLNAPVLGLYGGADAGIPLAHVERMRAGLFAFGKDKQSPIHVYPDAG
Aquatic_dechlorinating_ -AKNIKCKVLVCTGGDDPNVPPKQVEAFEKEMRDANVD--WQVKSYGGAV
Groundwater-dechlorinating -AKNIKCKVLVCTGGDDPNVPPKQVEAFEKEMRDANVD--WQVKSYGGAV
ANAS -AGGIKAKVLVLHGADDPSVPPAEVQAFQDEMRKSGAD--WQMVAYGGAV
Oak11 -ARQIKGKVLVLHGAHDTFVGSDEVAVFEADMKAAGVD--YRIIRYPDAV
Oak1 -PGEVKAAILVCIGADDPLVPAEQRAAFEAEMRVAQVD--WRMNLYGGAM
Pseudomonas -PGVVRADILVEHGAADSMVTPQQVEAFKAEMDAAKVN--YQFVSIEGAK
PA1166 -KGSVKAKILVEHGSADSLVPAKDLDALKQELSAAGAD--YRVVIQDGAK
Hypersaline2 -YGKIKPEILVLHGQADSFVAPEVVTNFQDKLEAAGAN--WEMDIYGGAR
Oak14 -AKNIKGSVLVLHGASDPLVPKEQLPAFEDEMNAAGVD--WQLLSYGGAV
PA1597 -ARNIKGAVLVLDGASDPLVPREQLPAFAREMTDAGVD--WQLTSYGGAV
Oak18 -LAPLNVPVLGFYGGDDKSIPVRQVQEFRARLLELGKN--AEVLIVPHAD
Saline1 -LQALETPILAHFGTQDQAVPIDAARKFRDRMEDAGT---SLAYHEYDAG
Oak3 -LGKIHASILGIFGGKDRGIPLDDVTAMAAE-------------------
Bison2 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVN--AQFLIYAGVD
Bison3 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVN--AQFLIYAGVD
BathHotS1 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVD--AQFLIYAGVD
BathHotS2 -FSKIKAPVLAVHAGMDAFIPLSEVMEAIQKCNENKVN--AQFLIYAGVD
Bison1 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVN--AQFLIYAGVD
Bison4 -FSKIKAPVLAVHAGMDAFVPLSEVMEAIQKCNENKVN--AQFLIYAGVD
Sediment_Water_Boiling -FSKIKAPVLAVHAGMDAFIPLSEVMEAIQKCNENKLN--AQFLIYAGVD
Rainforest-Thermophiles2 -LKKLNVPTVFISGTRDQWINTEKVGQLEDIAERNELP--LESLKYD-AD
Oak17 -WSKLAARVEGHYAETDGFFPVEKVQALEAGLKGLGKQ--VSLHVYPGTG
Pyrene-degrading -PANIRIPFQGHFATQDDWCVPSMVDALEATLQKTGVS--AEIHRYEAA-
Fossil2 -VPKIQAPLMLQYGGLDERVN-AGWPEYEAALKANNKE--YIAHFYEGAN
Rainforest-Thermophiles -VPKIKAPLLIHYAELDTRVN-EGWPAYESALKAHNKE--YTMYMYPKAN
Fossil5 ---------ILTYGG-----------------------------------
Oak12 -VRRIKAPIMLHYGELDTRIN-EGWPAY----------------------
Marine_Oil --------------------------------------------------
Oak10 -VPKIKAAVLVHHGELDTRLA-MAWPAYQAELNKAKIP--NEGHIYPGAV
Oak7 -VPKITAPLLLHYGSLDERID-AGIPAFEAALKAASKT--YEIHMYEGAN
Oak4 -MPRIKAPVYGFYAGNDARIG-ATIPDAEKQMKTADIP------------
Oak5 -MPRIKAPVYGFYAGNDARIG-ATIPDAEKQMKTADIP------------
Fossil6 -YASIKVPVHGFYGGADNRVN-ATIEGSEKAMNTYDKT--FTYEIYDGAG
Fossil3 LASKIKGELYLGFAETDMYVPDNVIPELRAALDQHKVD--YRLDTWPGTE
Oak2 LVDKIKGEMYYGFAEIDEHVPEKVIPTLRQGLDKAGVR--YGLDVFAGAR
Oak13 QAAKTKAQF-----------------------------------------
Oak15 LAPKMKARVYVAGAMEDMFFTDDMKARLEEALTKAGVD--HVVETYP-AR
Aquatic_dechlorinating_2 -GPGVRGDELLLMQSAALGRAMGVSEEQISEANRLNREL-YSIAMSDGDV
Groundwater-dechlorinating1 -GPGVRGDELLLMQSAALGRAMGVSEEQISEANRLNREL-YSIAMSDGDV
Hot ----LKGPVLGIYAGEDSAINNGVPDMVRAMFQYGKEL---EMKIYPKTY
Formaldehyde-enrichment IAPMLQTPWLGLYGDLDKGIPFDDVELLRTAASNAPVP--TTVVRYADAD
Rainforest-Soil ----IRCPVIGFYGAEDGRVN-AGLPAFEAALAVAKAS--FEKRMYPGAG
Rainforest-Soil2 -FLQTACPVVGSYGGLAQWERGVADQLAAALERALVAH---DVKEYPDAG
PCE-dechlorinating -LIVAQTEYLLGLDGELSDEDRAQLEQIEQIKGAIDHLDAAAPGMYFGAP
Oak6 -VKEIQGEMLMIWGRQDPHVPREGRALIYNAMTDAGLN--FTWHEFNGQH
Oak16 LAPRPGYHLAMIISDNETVDLATPAGPMRTHVFRPAAPGRYPGLILYSEI
Air2 -------TLLLPLGAWLTAFAAVMLLSERIFIRWLDYLERVAAIYARGRF
Oak8 -DGHKFAAYLAESSGKARGGVVVIPEIFGVNSHIKQTSDGYAADGYRVVS
Saline2 EWDPPAYAEESAFEEMAVTLDAPGSCELGGTLSVPTGEASVPGVVIVHGS
Fossil4 HAFARPDGL------------HYAADSANLANQRTLDFLHRNLD------
Oak9 HAFARPGGG------------HFDARAADLADMRTLAFLVQKMVGHDKRG
Air1 HGFAATMGN------------RRNEEGAQLADGRTRAFFAEHLA------
Fossil1 HAFARKGGA------------HYDKNDAETANARTAEFFRANLV------
Cyanobacterial-community HGFARIGGK------------DYDLEAANLAHSRTMEFFQKYLG------
PA2682 HAFARVGGM------------HFDKPAYLMAHERSIAALKREIGPNFDLS
Uranium1 HAFNRDGSS------------PYHEASAKLALQRTLAFFEQHLDGA----
Rainforest-Thermophiles1 HAFNRDVDPR-----------AYHEPSARLARERSLEFLAARLGAAA---
Acid1 HGFSCWQRS------------SYREEAAMPAWKLARFFLENTLKNAGK--
Acid2 HGFFCEDRP------------SYHKESADRSEILLREFLDRHLKSAPSRN
Hypersaline HGFFCDQRA------------SYNPQAAKDAWEKVKTLFQQEL-------
Cyanobacterial-community2 HGFFCDHRS------------SYNAQAATDAWIKTKELFDEQL-------
O2-minimum HGFFCDARD------------SYRADAAKDAWAKTLAFFDRHLGGGSATS
Bison5 HAFHADYRP------------SYNEADAKDGWKRALNWFAKHGVAF----
Uranium2 HAFYADYRP------------SYVEADAKDGWRRALAWFKGHGVV-----
Acromyrmex HAFHADYRP------------NYHAESAQDGWQRMLEWFGRYGVAPAHPG
Formaldehyde-enrichment2 HGFHADYRP------------SYRKQDA----------------------
Aquatic_dechlorinating_ HSFTNPASG-----NDNSKGAAYNEKADKRSWEDMKLFFNEIFK------
Groundwater-dechlorinating HSFTNPASG-----NDNSKGAAYNEKADKRSWEDMKLFFNEIFK------
ANAS HTFTNPAAG-----NDPSRGSAYNEKAALRSWEHMKAFFAEIFR------
Oak11 HSFTVPEAG-----DDPSKGMAYNPDADRQSWEAM---------------
Oak1 HSFTNPDAT-----VSDFPGVAYHQPTDERSWRAMLDFFEEVF-------
Pseudomonas HGFTNPDADRLSHGEHGGPDIGYNKAADERSWADMQAFFKKVFK------
PA1166 HGFTNPDAD--AHKGHG-LDIGYDRQADRRSWADLQAFLKDIFGQG----
Hypersaline2 HGFTNPDAG-----DYGIDNLKHDPQADARSWARMQSFFNELFAD-----
Oak14 HSFTDPHAN-------VPGMMMYDAKTAARAFQSMHNLLDEVFKG-----
PA1597 HSFTDPNAK-------LPGKMHYDARTSRRAFQAMDDLLAEVFA------
Oak18 HSFANPSSA------------TYNAQAANEAWTATLAFLERHLKLDTPTR
Saline1 HAFANPSGE------------SYEPAAAEQAWTRTTDFLQTHLTR-----
Oak3 --------------------------------------------------
Bison2 HAFFNDTRP-----------EVYHEEYAKDVWLKTIEFFRTHLL------
Bison3 HAFFNDTRP-----------EVYHEEYAKDVWLKTIEFFRTHLL------
BathHotS1 HAFFNDTRP-----------EVYHEEYAKDVWLKTIEFFRTHLL------
BathHotS2 HAFFNDTRP-----------EVYHEEYAKDVWQKTIEFFRTHLL------
Bison1 HAFFNDTRP-----------EVYHEEYAKDVWQKTIQFFRTHLL------
Bison4 HAFFNDTRP-----------EVYHEEYAKDVWLKTIEFFRTHLL------
Sediment_Water_Boiling HAFFNDTRP-----------EVYHEEYARDVWQKTIEFFRTHLT------
Rainforest-Thermophiles2 HAFFNNTRP-----------EVYNETAARDAWAKVIGFFNDKL-------
Oak17 HAFANETNALG----------TYDADAAQLAWERSVTFLHDNLG------
Pyrene-degrading HGFFNERRGD-----------VYNANAASQAWERMIAFFTRHLD------
Fossil2 HGFHNDS-TPRYD-----------EAQAALAWQRTIDFFGEKLA------
Rainforest-Thermophiles HGFHNDT-TPRYD-----------KESAELAWKRTIDFFNQKLK------
Fossil5 --------------------------------------------------
Oak12 --------------------------------------------------
Marine_Oil --------------------------------------------------
Oak10 HGFNCDA-TPERY-----------NK------------------------
Oak7 HAFNNDTNAARYN-----------KDAAELGWSRTVAFLKKHVS------
Oak4 --------------------------------------------------
Oak5 --------------------------------------------------
Fossil6 HAFMRSGDDPNAE-----------IGNPNVAARNASWERLLNIIKGNQPE
Fossil3 HGFCFPERA------------AYVEAAAEG-------VWKLGL-------
Oak2 HGFQFPERD------------VYDTHAAEASWAKIVAMWDRNLK------
Oak13 --------------------------------------------------
Oak15 HGWVPRDTP------------VHD--------------------------
Aquatic_dechlorinating_2 PTLREKVVR-----------------VMEDPIDSTSDLTEQRRCL-----
Groundwater-dechlorinating1 PTLREKVVR-----------------VMEDPIDSTSDLTEQRRCL-----
Hot HAFATEGGP------------VYNEAAAKDAWDRTVRFFTRILG------
Formaldehyde-enrichment HGFNCDDRP-----------AVYNAVAADDAWQRTLAW------------
Rainforest-Soil HGFFCDDRP------------SYHEGAARDSYWRLLQFFSRVLSG-----
Rainforest-Soil2 HSFMNRHRGYGFLRIVQLRSIGYNEPATMDARRRIVAFFNLHLKEQYSNA
PCE-dechlorinating PAYWVDLRKYDPVKTARALDKPLLILQGERDYQVTMDDFARWQEGLDGQD
Oak6 AFLRDEGYR-------------YDPALAHLGYTLVLQLFRRKLGEG----
Oak16 FQVTGPIRR-------------SAAMLAGHGFVVAVPEGYHELEAPG---
Air2 SVRPLQAMN---------------APAEIRTMARTLDEMA----------
Oak8 PAMFDRAQRN------YATGYSQPEIEAGRAIMQKLDWKQAILDVQAA--
Saline2 GPVDRDGTYGSNKPYKELAWGLASRGIAVLRYDKRTDACDVALSDLTI--
Fossil4 --------------------------------------------------
Oak9 LL------------------------------------------------
Air1 --------------------------------------------------
Fossil1 --------------------------------------------------
Cyanobacterial-community --------------------------------------------------
PA2682 ALWDEHVRHEFDTRDVAATMATMVAEPYVNHVPTLTGGVGQRELSRFYRH
Uranium1 --------------------------------------------------
Rainforest-Thermophiles1 --------------------------------------------------
Acid1 --------------------------------------------------
Acid2 G-------------------------------------------------
Hypersaline --------------------------------------------------
Cyanobacterial-community2 --------------------------------------------------
O2-minimum SR------------------------------------------------
Bison5 --------------------------------------------------
Uranium2 --------------------------------------------------
Acromyrmex S-------------------------------------------------
Formaldehyde-enrichment2 --------------------------------------------------
Aquatic_dechlorinating_ --------------------------------------------------
Groundwater-dechlorinating --------------------------------------------------
ANAS --------------------------------------------------
Oak11 --------------------------------------------------
Oak1 --------------------------------------------------
Pseudomonas --------------------------------------------------
PA1166 --------------------------------------------------
Hypersaline2 --------------------------------------------------
Oak14 --------------------------------------------------
PA1597 --------------------------------------------------
Oak18 PQ------------------------------------------------
Saline1 --------------------------------------------------
Oak3 --------------------------------------------------
Bison2 --------------------------------------------------
Bison3 --------------------------------------------------
BathHotS1 --------------------------------------------------
BathHotS2 --------------------------------------------------
Bison1 --------------------------------------------------
Bison4 --------------------------------------------------
Sediment_Water_Boiling --------------------------------------------------
Rainforest-Thermophiles2 --------------------------------------------------
Oak17 --------------------------------------------------
Pyrene-degrading --------------------------------------------------
Fossil2 --------------------------------------------------
Rainforest-Thermophiles --------------------------------------------------
Fossil5 --------------------------------------------------
Oak12 --------------------------------------------------
Marine_Oil --------------------------------------------------
Oak10 --------------------------------------------------
Oak7 --------------------------------------------------
Oak4 --------------------------------------------------
Oak5 --------------------------------------------------
Fossil6 R-------------------------------------------------
Fossil3 --------------------------------------------------
Oak2 --------------------------------------------------
Oak13 --------------------------------------------------
Oak15 --------------------------------------------------
Aquatic_dechlorinating_2 --------------------------------------------------
Groundwater-dechlorinating1 --------------------------------------------------
Hot --------------------------------------------------
Formaldehyde-enrichment --------------------------------------------------
Rainforest-Soil --------------------------------------------------
Rainforest-Soil2 PTGA----------------------------------------------
PCE-dechlorinating DVTFIRY-------------------------------------------
Oak6 --------------------------------------------------
Oak16 --------------------------------------------------
Air2 --------------------------------------------------
Oak8 --------------------------------------------------
Saline2 --------------------------------------------------
Fossil4 --------------------------------------------------
Oak9 --------------------------------------------------
Air1 --------------------------------------------------
Fossil1 --------------------------------------------------
Cyanobacterial-community --------------------------------------------------
PA2682 HFIHGNPPDMTLTPISRTVGALQVVDEFVMRFTHSCEIDWLLPGVPPTGR
Uranium1 --------------------------------------------------
Rainforest-Thermophiles1 --------------------------------------------------
Acid1 --------------------------------------------------
Acid2 --------------------------------------------------
Hypersaline --------------------------------------------------
Cyanobacterial-community2 --------------------------------------------------
O2-minimum --------------------------------------------------
Bison5 --------------------------------------------------
Uranium2 --------------------------------------------------
Acromyrmex --------------------------------------------------
Formaldehyde-enrichment2 --------------------------------------------------
Aquatic_dechlorinating_ --------------------------------------------------
Groundwater-dechlorinating --------------------------------------------------
ANAS --------------------------------------------------
Oak11 --------------------------------------------------
Oak1 --------------------------------------------------
Pseudomonas --------------------------------------------------
PA1166 --------------------------------------------------
Hypersaline2 --------------------------------------------------
Oak14 --------------------------------------------------
PA1597 --------------------------------------------------
Oak18 --------------------------------------------------
Saline1 --------------------------------------------------
Oak3 --------------------------------------------------
Bison2 --------------------------------------------------
Bison3 --------------------------------------------------
BathHotS1 --------------------------------------------------
BathHotS2 --------------------------------------------------
Bison1 --------------------------------------------------
Bison4 --------------------------------------------------
Sediment_Water_Boiling --------------------------------------------------
Rainforest-Thermophiles2 --------------------------------------------------
Oak17 --------------------------------------------------
Pyrene-degrading --------------------------------------------------
Fossil2 --------------------------------------------------
Rainforest-Thermophiles --------------------------------------------------
Fossil5 --------------------------------------------------
Oak12 --------------------------------------------------
Marine_Oil --------------------------------------------------
Oak10 --------------------------------------------------
Oak7 --------------------------------------------------
Oak4 --------------------------------------------------
Oak5 --------------------------------------------------
Fossil6 --------------------------------------------------
Fossil3 --------------------------------------------------
Oak2 --------------------------------------------------
Oak13 --------------------------------------------------
Oak15 --------------------------------------------------
Aquatic_dechlorinating_2 --------------------------------------------------
Groundwater-dechlorinating1 --------------------------------------------------
Hot --------------------------------------------------
Formaldehyde-enrichment --------------------------------------------------
Rainforest-Soil --------------------------------------------------
Rainforest-Soil2 --------------------------------------------------
PCE-dechlorinating --------------------------------------------------
Oak6 --------------------------------------------------
Oak16 --------------------------------------------------
Air2 --------------------------------------------------
Oak8 --------------------------------------------------
Saline2 --------------------------------------------------
Fossil4 --------------------------------------------------
Oak9 --------------------------------------------------
Air1 --------------------------------------------------
Fossil1 --------------------------------------------------
Cyanobacterial-community --------------------------------------------------
PA2682 FVEIPMLGVVRFRGDRLYHEHIYWDQAGVLVQIGLLDPQGLPVAGVESAR
Uranium1 --------------------------------------------------
Rainforest-Thermophiles1 --------------------------------------------------
Acid1 --------------------------------------------------
Acid2 --------------------------------------------------
Hypersaline --------------------------------------------------
Cyanobacterial-community2 --------------------------------------------------
O2-minimum --------------------------------------------------
Bison5 --------------------------------------------------
Uranium2 --------------------------------------------------
Acromyrmex --------------------------------------------------
Formaldehyde-enrichment2 --------------------------------------------------
Aquatic_dechlorinating_ --------------------------------------------------
Groundwater-dechlorinating --------------------------------------------------
ANAS --------------------------------------------------
Oak11 --------------------------------------------------
Oak1 --------------------------------------------------
Pseudomonas --------------------------------------------------
PA1166 --------------------------------------------------
Hypersaline2 --------------------------------------------------
Oak14 --------------------------------------------------
PA1597 --------------------------------------------------
Oak18 --------------------------------------------------
Saline1 --------------------------------------------------
Oak3 --------------------------------------------------
Bison2 --------------------------------------------------
Bison3 --------------------------------------------------
BathHotS1 --------------------------------------------------
BathHotS2 --------------------------------------------------
Bison1 --------------------------------------------------
Bison4 --------------------------------------------------
Sediment_Water_Boiling --------------------------------------------------
Rainforest-Thermophiles2 --------------------------------------------------
Oak17 --------------------------------------------------
Pyrene-degrading --------------------------------------------------
Fossil2 --------------------------------------------------
Rainforest-Thermophiles --------------------------------------------------
Fossil5 --------------------------------------------------
Oak12 --------------------------------------------------
Marine_Oil --------------------------------------------------
Oak10 --------------------------------------------------
Oak7 --------------------------------------------------
Oak4 --------------------------------------------------
Oak5 --------------------------------------------------
Fossil6 --------------------------------------------------
Fossil3 --------------------------------------------------
Oak2 --------------------------------------------------
Oak13 --------------------------------------------------
Oak15 --------------------------------------------------
Aquatic_dechlorinating_2 --------------------------------------------------
Groundwater-dechlorinating1 --------------------------------------------------
Hot --------------------------------------------------
Formaldehyde-enrichment --------------------------------------------------
Rainforest-Soil --------------------------------------------------
Rainforest-Soil2 --------------------------------------------------
PCE-dechlorinating --------------------------------------------------
Oak6 --------------------------------------------------
Oak16 --------------------------------------------------
Air2 --------------------------------------------------
Oak8 --------------------------------------------------
Saline2 --------------------------------------------------
Fossil4 ------------------------
Oak9 ------------------------
Air1 ------------------------
Fossil1 ------------------------
Cyanobacterial-community ------------------------
PA2682 KLLDESLPSNRLMARWAASEGLGL
Uranium1 ------------------------
Rainforest-Thermophiles1 ------------------------
Acid1 ------------------------
Acid2 ------------------------
Hypersaline ------------------------
Cyanobacterial-community2 ------------------------
O2-minimum ------------------------
Bison5 ------------------------
Uranium2 ------------------------
Acromyrmex ------------------------
Formaldehyde-enrichment2 ------------------------
Aquatic_dechlorinating_ ------------------------
Groundwater-dechlorinating ------------------------
ANAS ------------------------
Oak11 ------------------------
Oak1 ------------------------
Pseudomonas ------------------------
PA1166 ------------------------
Hypersaline2 ------------------------
Oak14 ------------------------
PA1597 ------------------------
Oak18 ------------------------
Saline1 ------------------------
Oak3 ------------------------
Bison2 ------------------------
Bison3 ------------------------
BathHotS1 ------------------------
BathHotS2 ------------------------
Bison1 ------------------------
Bison4 ------------------------
Sediment_Water_Boiling ------------------------
Rainforest-Thermophiles2 ------------------------
Oak17 ------------------------
Pyrene-degrading ------------------------
Fossil2 ------------------------
Rainforest-Thermophiles ------------------------
Fossil5 ------------------------
Oak12 ------------------------
Marine_Oil ------------------------
Oak10 ------------------------
Oak7 ------------------------
Oak4 ------------------------
Oak5 ------------------------
Fossil6 ------------------------
Fossil3 ------------------------
Oak2 ------------------------
Oak13 ------------------------
Oak15 ------------------------
Aquatic_dechlorinating_2 ------------------------
Groundwater-dechlorinating1 ------------------------
Hot ------------------------
Formaldehyde-enrichment ------------------------
Rainforest-Soil ------------------------
Rainforest-Soil2 ------------------------
PCE-dechlorinating ------------------------
Oak6 ------------------------
Oak16 ------------------------
Air2 ------------------------
Oak8 ------------------------
Saline2 ------------------------
CLUSTAL W (1.82) multiple sequence alignment
X. References
Articles & Literature
1. Altschul S. F., Gish W., Miller W., Myers E. W. & Lipman D. J., (1990). Basic
Local Alignment Search Tool, J. Mol. Biol. 215: 403-410
2. Amann, R. I., Ludwig, W. & Schleifer, K. H., (1995). Phylogenetic Identification
and In situ Detection of Individual Microbial Cells Without Cultivation, Microbiol. Rev.
59: 143-169
3. Anzai Y., Kim H., Park J-Y., Wakabayashi H. & Oyaizu H., (2000).
Phylogenetic Affiliation of the Pseudomonads Based on 16S rRNA Sequence,
International Journal of Systematic & Evolutionary Microbiology, 50, 1563-1589
4. Atlas, Ronald M. & Bartha, Richard, (1998). Microbial Ecology, Fundamentals
and Applications, 4th Ed. Benjamin/Cummings Publishing Company, Inc. Citing:
Dagley, S. (1975) Essays in Biochemistry. Vol.2 pp. 81-130
5. Backeljau, T., De Bruyn, L., De Wolf, K. & Jordaens, K. (1996). Multiple
UPGMA and Neighbor-joining Trees and the Performance of Some Computer
Packages. Mol. Biol. Evol. 13(2):309-313
6. Berman H. M., Westbrook J., Feng Z., Gilliland G., Bhat T. N., Weissig H.,
Shindyalov I. N. & Bourne P. E., (2000). The Protein Data Bank, Oxford University
Press, Nucleic Acids Research, Vol. 28, No.1 235-242
7. Beveridge A. J. & Ollis D. L., (1994). A Theoretical Study of Substrate-induced
Activation of Dienelactone Hydrolase, Oxford Journals, Oxford University Press
8. Bruno, W. J., Socci, N. D. & Halpern, A. L., (2000). Weighted Neighbor Joining: A
Likelihood-Based Approach to Distance-Based Phylogeny Reconstruction, Mol. Biol.
Evol. 17(1):189-19
9. Cheah, E., Ashley, G. W., Gary, J. & Ollis, D., (1993). Catalysis by Dienelactone
Hydrolase: A Variation on the Protease Mechanism, Proteins 16(1):64-78
10. Duncan, T., Phillips, R. B. & Wagner Jr., W. H., (1980). A Comparison of
Branching Diagrams Derived by Various Phenetic and Cladistic Methods, Systematic
Botany, 5(3): pp.264-293
11. Dybas, M. J., Hyndman, D. W., Heine, R. et al., (2002). Development, Operation
and Long-term Performance of a Full-scale Biocurtain Utilizing Bioaugmentation,
Environmental Science and Technology 36:3635-3644
12. Elzerman, A.W. & Coates, J.T., (1987). Hydrophobic Organic Compounds on
Sediments, Equilibria and Kinetics of Sorption, Sources and Fates of Aquatic Pollutants,
pp. 263-317
13. Fardeau, M. L., Salinas, M. B., L’Haridon, S. et al. (2004). Isolation from oil
reservoirs of novel thermophilic anaerobes phylogenetically related to
Thermoanaerobacter subterraneus: reassignment of T. subterraneus,
Thermoanaerobacter yonseiensis, Thermoanaerobacter tengcongensis and
Carboxydibrachium pacificum to Caldanaerobacter subterraneus gen. nov., sp. nov.,
comb. nov., as four novel subspecies. International Journal of Systematic and
Evolutionary Microbiology, 54:467-474
14. Fetzner S., (2001). Biodegradation of Xenobiotics, Biotechnology, Vol X,
Encyclopedia of Life Support Systems (EOLSS), Department of Microbiology,
University of Oldenburg, D-26111 Oldenburg, Germany
15. Handelsman, J., (2004). Metagenomics: Application of Genomics to Uncultured
Microorganisms, Microbiol. Mol. Biol. Rev., Vol. 68, No.4, p.669-685
16. Harayama, S., Kishira, H., Kasai, Y. & Shutsubo, K., (1999). Petroleum
biodegradation in marine environments. J. Mol. Microbiol. and Biotechnol. 1:63-70
17. Ibanez, J.G., Hernandez-Esparza, M. & Doria-Serrano, C., (2007).
Environmental Chemistry, Fundamentals, Springer Science Business Media, LLC. Page
232 Bioconcentration, bioaccumulation and biomagnification
18. Juhasz, A. L., Naidu, R., (2000). Bioremediation of High Molecular Weight
Polycyclic Aromatic Hydrocarbons: A Review of the Microbial Degradation of
Benzo[a]pyrene, Elsevier, International Biodeterioration & Biodegradation 45 (2000)
57-88
19. Klöpffer, W., (1994). Environmental Hazard Assessment of Chemicals and
Products, Environ. Sci. & Pollut. Res. 1(2):108-116
20. Lorenz, P., Liebeton, K., Niehaus, F. & Eck, J., (2002). Screening for Novel
Enzymes for Biocatalytic Processes: Accessing the Metagenome as a Resource of Novel
Functional Sequence Space, Curr. Opin. Biotechnol. 13: 572-577
21. Machackova, J., Wittlingerova, Z., Vlk, K., Zima, J. & Linka, A., (2008).
Comparison of two methods for assessment of in situ jet-fuel remediation efficiency,
Water, Air and Soil Pollution 187:181-194
22. Madsen E. L., (2011). Microorganisms and their roles in fundamental
biogeochemical cycles, Elsevier, Cornell University Vol. 22, Issue 3, pages 456-464
23. Markowitz, V. M., Mavromatis, K., Ivanova, N. N., Chen, I. M., Chu, K.,
Kyrpides, N. C., (2009). IMG ER: A System For Microbial Genome Annotation Expert
Review and Curation, Bioinformatics 2009, 25(17):2271-2278
24. Morris, B. L., Lawrence, A. R. L., Chilton, P. J. C. et al. (2003). Groundwater
and its susceptibility to degradation: a global assessment of the problem and options for
management. Early Warning and Assessment Report Series RS.03-3, United Nations
Environment Programme, Nairobi, Kenya.
25. National Research Council, (2007). The New Science of Metagenomics, The
National Academies Press
26. Page, R. & Holmes, E., (1998). Molecular Evolution: A Phylogenetic Approach,
Blackwell Science, Chapter 2, ISBN 0-86542-889-1
27. Pathak, D. & Ollis, D., (1990). Refined Structure of Dienelactone Hydrolase at
1.8Å, J. Mol. Biol., Elsevier, Vol.214, Issue 2, pages 497-525
28. Paul, E. A. & Clark, F. E., (1989). Soil Microbiology and Biochemistry, pp.11-31,
Academic Press, New York
29. Pepper, I. L., Gentry, T. J., Newby, D. B., Roane, T. M. & Josephson, K. L.,
(2002). The Role of Cell Bioaugmentation and Gene Bioaugmentation in the
Remediation of Co-contaminated Soils, Application of Technology to Chemical-mixture
Research, Environ. Health Perspect.110(suppl.6):943-946
30. Philp, J. C., Atlas, R. M. & Cunningham, C. J., (2009). Bioremediation,
Encyclopedia of Life Sciences, John Wiley & Sons, Ltd. Online posting date: 15th
March 2009
31. Potrawfke, T., Timmis, K. N. & Wittich, R-M., (1998). Degradation of 1,2,3,4-
Tetrachlorobenzene by Pseudomonas chlororaphis RW71, Applied and Environmental
Microbiology, Vol. 64, No. 10, p. 3798-3806
32. Puls, R. W., Paul, C. J. & Powell, R. M., (1999). The Application of In-Situ
Permeable Reactive (Zero-Valent Iron) Barrier Technology For the Remediation of
Chromate-contaminated Groundwater: A Field Test, Elsevier, Vol. 14, Issue 8,
pp. 989-1000
33. Raghavan, P. U. M. & Vivekanandan, M., (1999). Bioremediation of oil-spilled
sites through seeding of naturally adapted Pseudomonas putida, Elsevier, International
Biodeterioration & Biodegradation 44 (1999) 29-32
34. Rappe, M. S. & Giovannoni S. J., (2003). The Un-Cultured Microbial Majority.
Annu. Rev. Microbiol. 57:369-94
35. Reineke, W., Mandt, C., Kaschabek, S. R. & Pieper, D. H., (2011). Chlorinated
Hydrocarbon Metabolism, In: eLS. John Wiley & Sons, Ltd: Chichester. DOI:
10.1002/9780470015902.a0000472.pub3
36. Riesenfeld, C. S., Schloss, P. D. & Handelsman, J., (2004). Metagenomics:
Genomic Analysis of Microbial Communities, Annu. Rev. Genet. 2004. 38:525-52
37. Rothmel, R. K. & Chakrabarty, A. M., (1990). Microbial Degradation of
Synthetic Recalcitrant Compounds, Pure & Appl. Chem., Vol. 62, No. 4, pp. 769-779
38. Saitou, N. & Nei, M., (1987). The Neighbor-joining Method: A New Method for
Reconstructing Phylogenetic Trees. Mol. Biol. Evol. 4(4):406-425
39. Saitou, N. & Imanishi, T., (1989). Relative Efficiencies of the Fitch-Margoliash,
Maximum-Parsimony, Maximum-Likelihood, Minimum-Evolution, and Neighbor-joining
Methods of Phylogenetic Tree Construction in Obtaining the Correct Tree.
Mol. Biol. Evol. 6(5):514-525
40. Schlömann, M., (1994). Evolution of chlorocatechol catabolic pathways,
Biodegradation Kluwer Academic Publishers 5: 301-321
41. Schlömann, Schmidt & Knackmuss, (1990) Different Types of Dienelactone
Hydrolase in 4-Fluorobenzoate Utilizing Bacteria, J. Bacteriol., 172(9):5112
42. Schreiber, Hellwig, Dorn, Reineke & Knackmuss, (1980). Critical reactions in
fluorobenzoic acid degradation by Pseudomonas sp. B13, Applied and Environmental
Microbiology, Vol. 39, No. 1, p. 58-67
43. Singh, D. P. & Dwivedi, S. K. (2004) Environmental Microbiology and
Biotechnology, First Edition, New Age International (P) Ltd, Publishers. Page 60
44. Stein, J. L., Marsh, T. L., Wu, K. Y., Shizuya, H. & DeLong, E. F., (1996).
Characterization of Uncultivated Prokaryotes: Isolation and Analysis of a
40-kilobase-pair Genome Fragment From a Planktonic Marine Archaeon, J. Bacteriol.
178: 591-599
45. Technical Report LBNL-63614, (2008). Using IMG-M, Comparative Analysis with
the IMG/M System, Addendum to Using IMG, Department of Energy Joint Genome
Institute, Lawrence Berkeley National Laboratory
46. Thomas, T., Gilbert, J. & Meyer, F. (2012). Metagenomics - A Guide From
Sampling to Data Analysis, Microbial Informatics and Experimentation 2012, 2:3
47. Thompson J. D., Higgins D. G. & Gibson T. J., (1994). CLUSTAL W: improving
the sensitivity of progressive multiple sequence alignments through sequence weighting,
position-specific gap penalties and weight matrix choice, Oxford University Press,
Nucleic Acids Research, Vol.22, No.22, p. 4673-4680
48. U.S. Environmental Protection Agency, (2000). Engineered Approaches to In Situ
Bioremediation of Chlorinated Solvents: Fundamentals and Field Applications, Office
of Solid Waste and Emergency Response, Technology Innovation Office, Washington
49. Zeyaullah, Md, Kamli, M.R., Islam, B., Atif, M., Benkhayal, F.A., Nehal, M.,
Rizvi, M.A. & Ali, A. (2009). Metagenomics - An Advanced Approach For Non-
Cultivable Microorganisms, Biotechnology and Molecular Biology Reviews Vol. 4 (3),
pp. 049-054
Figures-references
Fig. 1:
50. Moiseeva, Solyanikova, Kaschabek, Gröning, Thiel, Golovleva & Schlömann,
(2002). A New Modified ortho Cleavage Pathway of 3-Chlorocatechol Degradation by
Rhodococcus opacus 1CP: Genetic and Biochemical Evidence, Journal of Bacteriology,
Vol. 184, No. 19, p. 5282-5292
Fig. 2:
51. Image of 1ZI6 (Following directed evolution with crystallography: structural
changes observed in changing the substrate specificity of dienelactone hydrolase.
(2005) Acta Crystallogr.,Sect.D61: 920-931, Kim, H.K., Liu, J.W., Carr, P.D., Ollis,
D.L.) created with J.mol (J.L. Moreland, A. Gramada, O.V. Buzko, Q. Zhang, P.E.
Bourne (2005) The Molecular Biology Toolkit (MBT): a modular platform for
developing molecular visualization applications. BMC Bioinformatics 6:21)
Fig. 3:
51. Thomas, T., Gilbert, J. & Meyer, F. (2012). Metagenomics - A Guide From
Sampling to Data Analysis, Microbial Informatics and Experimentation 2012, 2:3
Databases
1) Protein Data Bank (PDB)
http://www.pdb.org
RCSB Protein Data Bank H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N.
Bhat, H. Weissig, I. N. Shindyalov, P. E. Bourne, (2000). The Protein Data Bank,
Nucleic Acids Research, 28: 235-242
2) IMG/M
http://img.jgi.doe.gov/m/
IMG: the integrated microbial genomes database and comparative analysis system
Victor M. Markowitz, I-Min A. Chen, Krishna Palaniappan, Ken Chu, Ernest
Szeto, Yuri Grechkin, Anna Ratner, Biju Jacob, Jinghua Huang, Peter Williams,
Marcel Huntemann, Iain Anderson, Konstantinos Mavromatis, Natalia N. Ivanova
and Nikos C. Kyrpides
3) Clustal W
Websites for Clustal W multiple alignments:
Dendrogram 1 & 3: http://pir.georgetown.edu/pirwww/search/multialn.shtml
Dendrogram 2: http://www.genome.jp/tools/clustalw/