Muñoz-Clares et al. BMC Plant Biology 2014, 14:149
http://www.biomedcentral.com/1471-2229/14/149
RESEARCH ARTICLE Open Access
Exploring the evolutionary route of the acquisition of betaine aldehyde dehydrogenase activity by plant ALDH10 enzymes: implications for the synthesis of the osmoprotectant glycine betaine
Rosario A Muñoz-Clares1*, Héctor Riveros-Rosas2, Georgina Garza-Ramos2, Lilian González-Segura1, Carlos Mújica-Jiménez1 and Adriana Julián-Sánchez2
Abstract
Background: Plant ALDH10 enzymes are aminoaldehyde dehydrogenases (AMADHs) that oxidize different ω-amino or trimethylammonium aldehydes, but only some of them have betaine aldehyde dehydrogenase (BADH) activity and produce the osmoprotectant glycine betaine (GB). The latter enzymes possess alanine or cysteine at position 441 (numbering of the spinach enzyme, SoBADH), while those ALDH10s that cannot oxidize betaine aldehyde (BAL) have isoleucine at this position. Only the plants that contain A441- or C441-type ALDH10 isoenzymes accumulate GB in response to osmotic stress. In this work we explored the evolutionary history of the acquisition of BAL specificity by plant ALDH10s.
Results: We performed extensive phylogenetic analyses and constructed and characterized, kinetically and structurally, four SoBADH variants that simulate the parsimonious intermediates in the evolutionary pathway from I441-type to A441- or C441-type enzymes. All mutants were correctly folded and had average thermal stabilities and similar activity with aminopropionaldehyde, but whereas A441S and A441T exhibited significant activity with BAL, A441V and A441F did not. The kinetics of the mutants were consistent with their predicted structural features obtained by modeling, and confirmed the importance of position 441 for BAL specificity. The acquisition of BADH activity could have happened through any of these intermediates without detriment to the original function or protein stability. Phylogenetic studies showed that this event occurred independently several times during angiosperm evolution when an ALDH10 gene duplicate changed the critical Ile residue for Ala or Cys in two consecutive single mutations. ALDH10 isoenzymes frequently group in two clades within a plant family: one includes peroxisomal I441-type isoenzymes, the other peroxisomal and non-peroxisomal I441-, A441- or C441-type isoenzymes. Interestingly, plants that accumulate high levels of GB have non-peroxisomal A441- or C441-type isoenzymes, while low-GB accumulators have the peroxisomal C441-type, suggesting some limitations in peroxisomal GB synthesis.
Conclusion: Our findings shed light on the evolution of the synthesis of GB in plants, a metabolic trait of great ecological and physiological relevance for their tolerance to drought, hypersaline soils and cold. Together, our results are consistent with smooth evolutionary pathways for the acquisition of the BADH function from ancestral I441-type AMADHs, thus explaining the relatively high occurrence of this event.
Keywords: Osmoprotection, Osmotic stress, Aminoaldehyde dehydrogenase, Enzyme kinetics, Substrate specificity, Enzyme subcellular location, Protein stability, Protein structure, Protein evolution
* Correspondence: [email protected]
1 Departamento de Bioquímica, Facultad de Química, Universidad Nacional Autónoma de México, México D.F., México
Full list of author information is available at the end of the article
© 2014 Muñoz-Clares et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Background
The synthesis of the osmoprotectant glycine betaine (GB) is a metabolic trait of great adaptive importance that allows the plants possessing it to contend with osmotic stress caused by drought, salinity or low temperatures. Since these adverse environmental conditions are the major limitations of agricultural production, engineering the synthesis of GB in crops that naturally lack this ability has been, and still is, a biotechnological goal for improving their tolerance to osmotic stress (reviewed in [1]). Also, it is becoming increasingly appreciated that the GB content of an edible plant is valuable for human and animal nutrition [2].
In plants, GB is formed by the NAD+-dependent oxidation
of betaine aldehyde (BAL) catalyzed by betaine aldehyde dehydrogenases (BADHs). Within the aldehyde dehydrogenase (ALDH) superfamily, plant BADHs belong to family 10 [3], whose members are ω-aminoaldehyde dehydrogenases (AMADHs) that in vitro can oxidize small aldehydes possessing an ω-primary amine group, such as 3-aminopropionaldehyde (APAL) and 4-aminobutyraldehyde (ABAL) [4-12], a trimethylammonium group, such as 4-trimethylaminobutyraldehyde (TMABAL) [9,12], or a dimethylsulfonium group, such as 3-dimethylsulfoniopropionaldehyde [4,5]. In vivo, depending on the substrate used, these enzymes may participate in diverse biochemical processes, which range from the catabolism of polyamines to the synthesis of several osmoprotectants (alanine betaine, 4-aminobutyrate, carnitine or 3-dimethylsulfoniopropionate) in addition to GB. Although the biochemically characterized plant ALDH10s oxidize all the above-mentioned aldehydes, only some of them efficiently use BAL as substrate [9,13-18] and therefore can participate in the synthesis of GB. The difference in BAL specificity among the plant ALDH10s was puzzling given the high structural similarity between BAL and the other ω-aminoaldehydes, as well as between the plant ALDH10 enzymes. Recently, by means of X-ray crystallography, in silico model building, site-directed mutagenesis, and kinetic studies of the ALDH10 enzyme from spinach (SoBADH), we found that a single amino acid residue, at position 441, is critical for an ALDH10 enzyme being able to accept or reject BAL as a substrate [19]. This residue, located in the second sphere of interaction with the substrate behind the indole group of the tryptophan equivalent to W456 in SoBADH, determines the size of the pocket formed by the Tyr and Trp residues equivalent to Y160 and W456 (SoBADH numbering) where the bulky trimethylammonium group of BAL binds.
If this residue is an Ile, it pushes the Trp against the Tyr, thus hindering the binding of BAL, whereas if it is an Ala or a Cys the Trp adopts a conformation that leaves enough room for productive BAL binding [19]. This conclusion was drawn by Díaz-Sánchez et al. [19] by comparing the crystal structure of SoBADH (PDB code 4A0M) with those of the two pea AMADH enzymes, which do not have
BADH activity (PsAMADH1 and PsAMADH2, PDB codes 3IWK and 3IWJ, respectively [12]), and was later confirmed by Kopečný et al. [20] when they reported the crystal structures of the maize ALDH10 isoenzyme, which contains Cys at the position equivalent to 441 (SoBADH numbering) and exhibits BADH activity (ZmAMADH1a; PDB code 4I8P), and of a tomato ALDH10 isoenzyme, which contains Ile at this position and is devoid of BADH activity (SlAMADH1; PDB code 4I9B). Moreover, by correlating the reported level of BADH activity of ALDH10 enzymes with the presence of either of these residues, Díaz-Sánchez et al. [19] predicted that those enzymes that have an Ile at position 441 (hereafter named I441-type isoenzymes) would have only AMADH activity, while those that have either Ala or Cys (hereafter named A441- or C441-type isoenzymes) would also exhibit BADH activity. Since an almost perfect correlation was found between the reported ability of the plant to accumulate GB and the presence of an ALDH10 isoenzyme with proved or predicted BADH activity, it was proposed that the absence of this kind of isoenzyme is a major limitation for the synthesis of GB in plants [19]. Indeed, a significant BADH activity would be necessary not only to produce significant levels of GB but also to prevent the accumulation, up to toxic concentrations, of BAL, which is formed in the oxidation of choline by choline monooxygenase (CMO).
Amino acid sequence analysis showed that most plants
have two ALDH10 isoenzymes, probably as a consequence of gene duplication, and that the I441-type isoenzyme was the commonest [19]. The latter observation led to the suggestions that this residue corresponds to the ancestral feature in the plant ALDH10 family, and that a functional specialization occurred in some plants when the Ile at the position equivalent to 441 of SoBADH mutated to Ala or Cys in one of the two copies of the duplicated gene [19]. Since the codons for Ile differ from those for Ala or Cys in two positions, we reasoned that any of these changes had to occur through an intermediate. To explore the evolutionary history of the synthesis of GB in plants, we generated and characterized the SoBADH mutants A441V, A441S, A441T and A441F, which simulate the four parsimonious intermediates in the pathway from the plant ALDH10 isoenzymes exhibiting only AMADH activity, exemplified by the A441I SoBADH mutant, to those that also exhibit BADH activity, exemplified by the wild-type SoBADH or the A441C mutant. In this work, by comparing the kinetic properties and the thermostabilities of the mutants with those of the wild-type enzyme, we confirm that the size of the residue at position 441 greatly affects the specificity for betaine aldehyde, and conclude that the acquisition of the new BADH function occurred without detriment to either the oxidation of other aminoaldehydes or the protein stability. Also, we present here
strong phylogenetic evidence that confirms that peroxisomal I441-type isoenzymes correspond to the ALDH10 ancestral form and that independent duplication events occurred in monocot and eudicot plants. Indeed, the change to A441-type isoenzymes was the commonest in eudicots, whereas the change to C441-type isoenzymes was the commonest in monocots.
Results
Phylogenetic analysis of the ALDH10 enzymes
We expanded the amino acid sequence alignments of plant ALDH10 enzymes, including in this phylogenetic study three times more sequences than in previous works [19,20]. The retrieved non-redundant sequences belong mainly to plants (122 sequences), but ALDH10 proteins were also found in fungi, protists, and proteobacteria; none were found in animals or archaea (Additional file 1: Table S1). Figure 1 shows an unrooted phylogenetic tree that includes all identified ALDH10 sequences (panel A), as well as detailed phylogenetic trees from monocots (panel B) and eudicots (panel C). As expected, land plants (Embryophytes) form a well-supported monophyletic group, as do Spermatophytes (seed plants) and Angiosperms (flowering plants). In Figure 1B it can be observed that primitive plants with a known genome, like Ostreococcus tauri, O. lucimarinus, Micromonas pusilla, Chlamydomonas reinhardtii, Volvox carteri (Chlorophyta), Physcomitrella patens (Bryophyta) and Selaginella moellendorffii (Lycopodiophyta), contain only one ALDH10 enzyme. Interestingly, all these enzymes possess Ile at the position equivalent to 441 (SoBADH numbering), which is also the residue most frequently found in ALDH10 enzymes of the other plant families (Figures 1B and 1C). Thus, among the 122 non-redundant plant ALDH10 sequences analyzed, 88 possess Ile, 19 Ala, and 10 Cys. Only three ALDH10 isoenzymes (from Vitis vinifera, Solanum tuberosum and Pandanus amaryllifolius) have Val at this position, and two (from Aeluropus lagopoides and Theobroma cacao) have Thr. These data strongly support the previous proposal [19] that I441-type isoenzymes correspond to the ancestral protein of the ALDH10 family.
Figure 1B also shows that all known monocot ALDH10
genes cluster together, which suggests that the duplicated ALDH10 genes in monocots originated after the monocot-eudicot divergence. All monocot plants of known genome possess two genes coding for ALDH10 proteins, except maize, which possesses three genes. As previously found [20], in the Poaceae family, which includes most of the known sequences from monocots, each of the two ALDH10 genes forms a different clade in the phylogenetic tree: one (which we name Poaceae 1) exclusively includes I441-type isoenzymes while the second (which we name Poaceae 2) mainly contains C441-type isoenzymes. Because of the limited number of monocot ALDH10
sequences available, it is not yet possible to know whether or not every monocot family, besides Poaceae, possesses two ALDH10 isoenzymes.
Eudicots of known genomes have a variable number of
genes coding for ALDH10 proteins (Figure 1C). Some species have only one gene: Ricinus communis (Euphorbiaceae), Citrus clementina (Rutaceae), Aquilegia coerulea (Ranunculaceae), Fragaria vesca (Rosaceae), Cucumis sativus (Cucurbitaceae), and Mimulus guttatus (Phrymaceae). Others have two genes: Arabidopsis thaliana, A. lyrata, Capsella rubella, Brassica rapa (Brassicaceae), Glycine max, Medicago truncatula (Fabaceae), Gossypium raimondii, Theobroma cacao (Malvaceae), Populus trichocarpa (Salicaceae), Solanum lycopersicum, S. tuberosum (Solanaceae), and Beta vulgaris (Amaranthaceae). Another, Vitis vinifera (Vitaceae), has three genes. Glycine max, in addition to the two ALDH10 genes, possesses an additional copy that corresponds to a pseudogene. The complex distribution pattern exhibited by ALDH10 genes in eudicots strongly suggests that several independent gene-duplication events occurred during their evolution after the monocot-eudicot divergence (Figure 1C). Thus, at least four independent duplication events, those that took place in Fabaceae, Salicaceae, Solanaceae and Amaranthaceae, exhibit a very high bootstrap support (>90%). In species of the Brassicaceae, Fabaceae, Salicaceae, Rosaceae, and Solanaceae families the protein coded by the duplicate gene conserved the Ile at the position equivalent to 441, but in plants of the Amaranthaceae and Acanthaceae this residue was changed to an Ala, in Malvaceae to an Ala or Thr, and in the only sequenced species of Vitaceae to a Val. As in the case of monocots, two different clades can be observed in the phylogenetic tree of several eudicot families: the first includes the original I441-type isoenzymes, with the only exception of Solanum tuberosum, where this Ile mutated to a Val; the second includes the duplicate I441-type isoenzymes or the A441-, V441- or T441-type isoenzymes derived from the I441-type. 
Interestingly, the majority of the A441-type isoenzymes are clustered in the Amaranthaceae 2 clade, with the exception of the A441-type of Amaranthus hypochondriacus, which is phylogenetically very close to the I441-type of the same plant, suggesting a recent duplication event. The genome of this plant has not yet been completely sequenced, so it could be that this plant possesses another A441-type isoenzyme that groups with the Amaranthaceae 2 clade. Also, we cannot yet explain the unexpected position of the A441-type isoenzyme from Ophiopogon japonicus (a monocot), which clustered with the A441-type isoenzymes in the Amaranthaceae 2 clade. Since the A441-type isoenzymes are predicted to have BADH activity, i.e. the ability to oxidize BAL [19], it is interesting that O. japonicus CMO also has higher amino acid sequence identity with CMO proteins from the Amaranthaceae family than with CMO from monocots [21]. One possible explanation for this anomalous behavior is that both genes were acquired by O. japonicus by horizontal gene transfer, which is a significant force in the evolution of plant genomes [22,23]. Further studies are needed to provide evidence for or against this possibility.

Figure 1 Phylogenetic analysis of plant ALDH10 enzymes. A) Unrooted phylogenetic tree that includes all identified ALDH10 protein sequences showing the taxonomic group to which they belong. B) Monocot and non-flowering plant ALDH10 sequences. C) Eudicot ALDH10 sequences. Indicated are the presence/absence of a peroxisomal-targeting signal PST1 that fits the consensus sequence (S/A/C)-(K/R/H)-(L/M) (in red and underlined) as well as the amino acid residue and codon at the position equivalent to 441 of SoBADH. The tree was inferred from 500 replicates using the ML method [61]. The best tree with the highest log likelihood (−32886.4851) is shown. Similar trees were obtained with MP, ME and NJ methods. The analysis involved 131 amino acid sequences (122 from plants and 9 from non-plants). In panel A the branches of the unrooted tree are drawn to scale, with the bar length indicating the number of substitutions per site. In panels B and C only the branch topology is shown. The proportion of replicate trees in which the associated taxa clustered together in a bootstrap test (500 replicates) is given next to the branches. Branches with a very low bootstrap value (<20%) are collapsed. For each sequence, the accession number and the name assigned in published papers (in the case of proteins previously studied) are given. X indicates an unidentified amino acid/nucleotide.

We confirmed the previous observation [19,20] that
the majority of the I441-type isoenzymes possess a peroxisomal targeting signal type 1 (PST1) that fits the consensus sequence (S/A)-(K/R)-(L/M/I) [24,25], whereas all the A441-type and the majority of the C441-type isoenzymes lack it (Figure 1). In the case of the C441-type, the exceptions are those from maize, sorghum and foxtail millet (Setaria italica), which have the SKL signal, and that of Zoysia, which has an SKI signal. The peroxisomal targeting signal was lost by the change of a residue or by truncation of the C-terminal region. The codon that encodes the missing Leu in the peroxisomal targeting signal of some of the C441-type isoenzymes of Poaceae 2 can be changed to a stop codon with only one point mutation. The same occurs with the genes that code for A441-type isoenzymes from the Amaranthaceae family, where the codons for the first missing Ser can be transformed to a stop codon with just one point mutation. The sequence divergence pattern of the Amaranthaceae supports the proposal that the loss of the peroxisomal signal PST1 occurred in this family before the gene duplication that gave rise to the A441-type isoenzymes. In other eudicot and monocot families, the enzymes with a truncated or mutated C-terminus cluster in the phylogenetic clades 2, a finding that gives additional support to the idea that these enzymes derived from the peroxisomal I441-type of clades 1.
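
The C-terminal consensus quoted in the preceding paragraph lends itself to a simple pattern check. The following Python sketch is only an illustration under stated assumptions: the function name `has_peroxisomal_signal` and the toy sequences are hypothetical, and the pattern encodes the (S/A)-(K/R)-(L/M/I) consensus as written in the text, not necessarily the procedure used to annotate Figure 1.

```python
import re

# C-terminal tripeptide consensus (S/A)-(K/R)-(L/M/I), as quoted in the text.
CONSENSUS = re.compile(r"[SA][KR][LMI]$")


def has_peroxisomal_signal(protein_seq: str) -> bool:
    """Return True if the last three residues match the consensus."""
    return CONSENSUS.search(protein_seq.strip().upper()) is not None


# Hypothetical toy C-termini for illustration; not real ALDH10 sequences.
examples = {
    "ends_SKL": "MAAPLVKSKL",
    "ends_SKI": "MAAPLVKSKI",
    "truncated": "MAAPLVK",
}
for name, seq in examples.items():
    print(name, has_peroxisomal_signal(seq))
```

A truncated or mutated C-terminus, as described for the clade 2 isoenzymes, simply fails the match in this sketch.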
Possible ALDH10 evolutionary intermediates
The phylogenetic analysis described above strongly supports that A441- and C441-type isoenzymes evolved from the I441-type ones. Ile can be coded by three different triplets, ATT, ATC and ATA, and of these the most frequently found in monocot and eudicot ALDH10 genes is ATT and the least frequent is ATA, which indeed was not found in monocots; alanine is coded by four, GCT, GCC, GCA, and GCG, of which GCT is the most used in eudicot ALDH10 genes; and cysteine is coded by two, TGT and TGC, and both are present in monocot ALDH10
genes. The observed frequency of each of these codons in monocots and eudicots is given in Figure 2A. From these data it can be observed that the triplet ATT at this position is more frequent than the ATC one.
Since the most parsimonious pathways from Ile to either
Ala or Cys involve two nucleotide substitutions, there should have been intermediates in the evolution from the I441-type to the A441- or C441-type ALDH10 isoenzymes. Several pathways could be followed depending on the Ile codon of the original enzyme, but in all cases the amino acid substitution in the evolutionary intermediate has to be either Val or Thr in the pathway from Ile to Ala, and Phe or Ser in that from Ile to Cys (Figure 2B). This is consistent with the five different amino acids found at the critical position 441 in the ALDH10 enzymes of known sequence (Figure 1). Ile is the most frequently coded amino acid (71.9%), followed by Ala (15.7%), Cys (8.3%), Val (2.5%), and Thr (1.6%). Interestingly, ALDH10 isoenzymes containing Ser or Phe at the position equivalent to 441, which are the possible intermediates in the pathway from I441 to C441, have not been found so far. Since C441-type isoenzymes are present in monocots, and the number of available ALDH10 sequences from monocots is still low (30 sequences) when compared with the number of available eudicot sequences (82 sequences), it is to be expected that the missing intermediates will be found when the number of known monocot ALDH10 sequences increases.
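
The codon argument above can be checked by brute force. The sketch below is a minimal illustration, assuming the standard nuclear genetic code; the helper names (`CODON_TABLE`, `hamming`, `intermediates`) are ours and are not taken from the article. It enumerates every codon that lies one substitution away from an Ile codon and one substitution away from an Ala (or Cys) codon, considering only codon pairs that differ at exactly two positions.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def intermediates(src_aa: str, dst_aa: str) -> set:
    """Amino acids encoded by the middle codon of every two-step path."""
    src = [c for c, aa in CODON_TABLE.items() if aa == src_aa]
    dst = [c for c, aa in CODON_TABLE.items() if aa == dst_aa]
    found = set()
    for s, d in product(src, dst):
        if hamming(s, d) != 2:  # keep only the most parsimonious pairs
            continue
        for mid, aa in CODON_TABLE.items():
            if hamming(s, mid) == 1 and hamming(mid, d) == 1:
                found.add(aa)
    return found


print(sorted(intermediates("I", "A")))  # ['T', 'V']
print(sorted(intermediates("I", "C")))  # ['F', 'S']
```

Under these assumptions the enumeration returns Val and Thr for the Ile-to-Ala route and Phe and Ser for the Ile-to-Cys route, in agreement with the four intermediates named in the text.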
Construction and kinetic characterization of the SoBADH A441 mutants
To simulate the possible evolutionary intermediates we generated four SoBADH variants, A441V, A441S, A441T, and A441F, and characterized them both kinetically and structurally. In a previous work [19] we had constructed the A441I mutant, which represents the putative original ALDH10 isoenzyme with only AMADH activity, i.e., devoid of significant BADH activity, and the A441C mutant, which, together with the wild-type SoBADH, represents those isoenzymes that in addition to the AMADH activity also have BADH activity. The kinetics of the wild-type SoBADH and of the A441I and A441C mutant enzymes were previously studied using BAL, APAL, ABAL and TMABAL as substrates [19], at pH 8.0 and at fixed 0.2 mM NAD+, both of which are nearly physiological conditions [26,27]. Now we extend that work by studying the steady-state kinetics of the other four mutants with BAL or APAL as substrates. APAL can be used as representative of the other ω-aminoaldehydes that do not have a bulky trimethylammonium group close to the carbonyl group, and whose binding is therefore not sterically constrained by the size of the cavity formed by the residues equivalent to Y160 and W456 (SoBADH numbering). Consequently, the kinetics of these aldehydes are similar to each other and very different from the kinetics of BAL, as previously found not only in the wild-type SoBADH but also in the A441C and A441I mutants [19].

Figure 2 Evolutionary pathways for the acquisition of BADH activity by plant ALDH10 enzymes. Possible nucleotide changes in the pathway from the isoenzymes containing Ile at position 441 (SoBADH numbering) to those containing Ala (A) or Cys (B) at this position. The frequency of the observed codons for monocots (M; 30 sequences) and eudicots (E; 82 sequences) is given. Experimentally observed codons and amino acids are underlined.

Taking the kinetics of BAL as the criterion, two groups
of enzymes were observed: one that includes the wild-type and the A441C, A441S and A441T mutants, which had relatively high kcat, low Km(BAL) and high kcat/Km(BAL) values, and another formed by the A441V, A441F and A441I mutants, which exhibited low kcat, high Km(BAL), and very low kcat/Km(BAL) values, particularly the A441I mutant (Figure 3 and Table 1). The enzymes in the first group have, therefore, significant BADH activity, while the enzymes in the second group will be devoid of this activity at the expected intracellular BAL levels, which should not be high given the known toxicity of this aldehyde [28]. It has to be noted that the BADH activity of the enzymes in the first group is achieved not only by their much smaller Km(BAL) values, which most likely reflect a much better binding of the aldehyde, but also, although less importantly, by their significantly higher kcat(BAL) values when compared with the enzymes of the second group. We also found clear differences between these two groups in their Km(NAD+) values, which were higher in the enzymes exhibiting BADH activity than in the enzymes with only AMADH activity, with the exception of A441F (Table 1). As the Km value for the first substrate in a bi-bi ordered steady-state kinetic mechanism depends not only on the second-order rate constant of the binding of this substrate but also on first-order rate constants associated with the steps after the central ternary complex is formed, the finding that the BADH enzymes have higher Km(NAD+) values than the AMADH-only ones could be due to the higher kcat values of the former.
The differences between the two groups of enzymes vanish when their kinetics with APAL as substrate are compared (Table 2), indicating that APAL oxidation was hardly affected by the kind of residue at position 441. The exception was the A441F mutant, which showed lower kcat/Km values for both BAL and APAL than the wild-type and the other mutant enzymes, which suggests important structural alterations in the active site of this mutant. The catalytic efficiencies (kcat/Km) for APAL of the wild-type and mutant SoBADH enzymes are higher than those for BAL (Table 2), as has been found with other A441- or C441-type ALDH10 isoenzymes that in addition to the AMADH activity exhibit BADH activity [7,9,20]. Another difference between the kinetics with BAL and APAL is that APAL produced a small but clear substrate inhibition in the wild-type and mutant SoBADHs, while this inhibition was not observed in the saturation kinetics with BAL in the concentration range studied. The observed degree of inhibition by high APAL concentrations was roughly the same in all the enzymes, but the highest APAL concentration used in our experiments, 0.2 mM, did not allow the accurate estimation of the substrate inhibition constant (KIS) values, which are not given in Table 2 for this reason. As previously shown [29], substrate inhibition arises from the non-productive binding of the aldehyde to the enzyme-NADH complex. Therefore, these results suggest that the enzyme-NADH complexes have lower affinity for BAL than for APAL, as is also the case for the productive enzyme-NAD+ complexes, as judged by the Km values for the aldehydes.
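
To make the comparison of the two groups concrete, the sketch below recomputes kcat/Km from the mean kcat and Km(BAL) values reported in Table 1 and evaluates the Michaelis-Menten rate at a single BAL concentration. It is an illustration under stated assumptions, not the authors' fitting procedure: the function names, the 50 μM BAL concentration and the KIS value are hypothetical, and the substrate-inhibition expression is the standard textbook form, assumed here because Equation 1 of the article is not reproduced in this section.

```python
def michaelis_menten(s_um: float, kcat: float, km_um: float) -> float:
    """Turnover rate (s^-1 per subunit) at substrate concentration s_um (uM)."""
    return kcat * s_um / (km_um + s_um)


def substrate_inhibition(s_um: float, kcat: float, km_um: float, kis_um: float) -> float:
    """Standard substrate-inhibition form (assumed, not taken from the article)."""
    return kcat * s_um / (km_um + s_um * (1.0 + s_um / kis_um))


def catalytic_efficiency(kcat: float, km_um: float) -> float:
    """kcat/Km in mM^-1 s^-1; Km is converted from micromolar to millimolar."""
    return kcat / (km_um / 1000.0)


# Mean kcat (s^-1) and Km(BAL) (uM) values taken from Table 1.
BAL_PARAMS = {
    "Wild type": (3.36, 98.0),
    "A441T": (2.64, 180.0),
    "A441V": (0.54, 512.0),
    "A441I": (0.74, 1791.0),
}

BAL_UM = 50.0  # hypothetical BAL concentration, chosen only for illustration

for enzyme, (kcat, km) in BAL_PARAMS.items():
    eff = catalytic_efficiency(kcat, km)
    rate = michaelis_menten(BAL_UM, kcat, km)
    print(f"{enzyme}: kcat/Km = {eff:.2f} mM^-1 s^-1, rate at {BAL_UM:.0f} uM = {rate:.3f} s^-1")

# APAL with the wild-type enzyme (kcat and Km from Table 2); the KIS of
# 1000 uM is a made-up placeholder, since KIS could not be estimated.
print(substrate_inhibition(200.0, 0.99, 3.9, 1000.0))
```

Because the efficiencies are recomputed from rounded means, they match the tabulated kcat/Km values only approximately.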
Structural characterization of the SoBADH A441 mutantsSince the two main aspects that determine the evolutionof a protein are function and protein stability, we inves-tigated whether the changes made at position 441 affect
Figure 3 Effects of mutation of A441 on the steady-state kinetic parameters of SoBADH. Wild-type and mutant SoBADH enzymes wereassayed at pH 8.0 and 30°C with BAL as variable substrate at fixed 0.2 mM NAD+. Other conditions are given in the Methods section. The kineticparameter values were calculated from the best fit of initial velocity data to the Michaelis-Menten equation by non-linear regression. Each saturationcurve was determined at least in duplicate using enzymes from two different purification batches. Bars indicate standard deviations. In the inset thekcat/Km values of A441V, A441F and A441I are plotted using a scale smaller than that of the main figure.
Muñoz-Clares et al. BMC Plant Biology 2014, 14:149 Page 7 of 16http://www.biomedcentral.com/1471-2229/14/149
the structure and/or thermal stability of the mutant en-zymes. The amino acid substitutions made at position441 were well tolerated, and the levels of expression ofsoluble mutant proteins were similar to that of the wild-type enzyme (results nor shown). The mutations did notaffect either the native dimeric state of the enzymes, asjudged by gel filtration experiments (results not shown),or the protein secondary structure, as judged by theiralmost identical far-UV CD spectra (Figure 4A). Changesin the protein tertiary structure could be detected intheir CD spectra in the near-UV range (Figure 4B), wherethe signals originate from aromatic residues. The near-UV-CD spectrum of wild-type SoBADH shows well-
Table 1 Steady-state kinetic parameters of wild-type and mut
Enzyme Variable substrate kcat (s
BAL
Wild type 3.36 ±
A441C 2.99 ±
A441S 3.29 ±
A441T 2.64 ±
A441V 0.54 ±
A441F 0.29 ±
A441I 0.74 ±
NAD+
Wild type 4.25 ±
A441C 2.20 ±
A441S 3.27 ±
A441T 2.39 ±
A441V 0.51 ±
A441F 0.39 ±
A441I 0.68 ±
Initial velocities were obtained at 30°C in 50 mM HEPES-KOH buffer, pH 8.0, containof NAD+ was 0.2 mM, and in the experiments with variable NAD+ the fixed BAL concenfixed 0.2 mM NAD+. The apparent kinetic parameters were estimated by non-linear reggiven in the Table are the mean ± standard deviation of the kinetic parameters estimadifferent purification batches. Values for kcat are expressed per subunit.
defined positive maximum bands at 284 and 291 nm,characteristic of tryptophan residues, and a minimum be-tween 260 to 280 nm [30], which were also observed inthe mutant enzymes. The exception was A441F, which ex-hibited an altered near-UV CD spectrum with a pro-nounced decrease in the intensity of the peaks at 284 and291 nm and a reduction of the deep of the trough between260 to 280 nm. These changes probably are the result ofthe interaction of the Phe benzyl ring with the neighborside chain of W456, as will be discussed below.The stability of the mutant SoBADH enzymes was mea-
sured by thermal denaturation, which was monitored byfollowing the far-UV CD signal at 222 nm. As previously
ant SoBADH enzymes in the oxidation of BAL
Kinetic parameters-1) Km (μM) kcat/Km (mM-1s-1)
0.13 98 ± 15 35 ± 4
0.19 90 ± 6 33 ± 2
0.12 119 ± 8 28 ± 3
0.16 180 ± 6 15 ± 1
0.05 512 ± 79 1.1 ± 0.3
0.02 605 ± 26 0.48 ± 0.05
0.05 1791 ± 115 0.41 ± 0.05
0.16 22 ± 2 195 ± 8
0.09 14 ± 1 179 ± 17
0.18 29 ± 3 114 ± 20
0.00 24 ± 4 100 ± 15
0.00 6.4 ± 0.5 80 ± 5
0.02 18 ± 1 22 ± 0
0.03 2.8 ± 0.0 243 ± 11
ing 0.1 mM EDTA. In the experiments with variable BAL, the fixed concentrationtrations were at least 10-times their appKm values estimated for each enzyme atression fit of the experimental data to the Michaelis-Menten equation. The valuested in two duplicate saturation experiments performed with enzymes from two
Table 2 Steady-state kinetic parameters of wild-type and mutant SoBADH enzymes in the oxidation of APAL
Kinetic parameters
Enzyme Variable substrate kcat (s-1) Km (μM) kcat/Km (mM-1s-1)
APAL
Wild type 0.99 ± 0.04 3.9 ± 0.1 256 ± 5
A441C 0.67 ± 0.02 0.72 ± 0.16 931 ± 188
A441S 1.12 ± 0.00 2.0 ± 0.5 550 ± 142
A441T 1.50 ± 0.12 4.6 ± 0.5 326 ± 13
A441V 0.52 ± 0.10 1.1 ± 0.3 473 ± 216
A441F 0.34 ± 0.04 3.7 ± 0.0 92 ± 11
A441I 1.85 ± 0.06 4.8 ± 1.3 375 ± 80
NAD+
Wild type 0.99 ± 0.04 4.0 ± 0.1 250 ± 7
A441C 0.89 ± 0.00 2.6 ± 0.5 342 ± 76
A441S 0.88 ± 0.02 2.8 ± 0.0 314 ± 13
A441T 0.76 ± 0.02 4.0 ± 0.4 190 ± 16
A441V 0.54 ± 0.06 1.7 ± 0.2 318 ± 2
A441F 0.43 ± 0.01 5.9 ± 1.4 73 ± 17
A441I 2.11 ± 0.01 5.5 ± 0.1 382 ± 9
Initial velocities were obtained at 30°C in 50 mM HEPES-KOH buffer, pH 8.0, containing 0.1 mM EDTA. In the experiments with variable APAL, the fixed concentra-tion of NAD+ was 0.2 mM, and in the experiments with variable NAD+ the fixed APAL concentrations were at least 10-times their appKm values estimated for eachenzyme at fixed 0.2 mM NAD+. The apparent kinetic parameters were estimated by non-linear regression fit of the experimental data to the Michaelis-Menten equation(saturation by NAD+ at fixed APAL) or to Equation 1 given in the main text (saturation by APAL at fixed NAD+). The values given in the Table are the mean ± standarddeviation of the kinetic parameters estimated in two duplicate saturation experiments performed with enzymes from two different purification batches. Values for kcatare expressed per enzyme subunit. Substrate inhibition constants for APAL are not given because they could not be accurately estimated in the concentration rangeused in these experiments, but the observed degree of inhibition by high APAL concentrations was roughly the same in all the enzymes.
found with the wild-type enzyme [30], all mutant proteins were irreversibly denatured at 90°C and the melting curves exhibited monophasic transitions (Figure 4C). The irreversibility of the thermal denaturation of all enzymes precludes equilibrium thermodynamic analysis of the process. However, the use of the same measurement parameters and of the same experimental conditions allowed us to evaluate the possible effect of the changed amino acid on the stability of the mutant enzymes by comparing the midpoints of their thermal transitions, i.e. their apparent Tm values. The estimated apparent Tm values of the A441C, A441S, A441T and A441F mutants were similar to that of the wild-type enzyme, around 50°C, but those of A441V and A441I were approximately 10°C higher (Figure 4C). These findings clearly indicate the stabilizing effect of the presence of a hydrophobic side chain of medium or large volume inside the protein at position 441. In the case of the A441F mutant, although the side chain introduced is highly hydrophobic and the minimized model of this mutant indicates that it may make π-stacking interactions with the side chain of W456 (see below), the strain exerted on the protein to accommodate the bulky benzyl ring of Phe, also indicated by the model, probably causes a decrease in the stability of this enzyme when compared with that of the A441I or A441V mutants. Not considering the A441F mutant, it is interesting that the differences in thermostability of the enzymes also indicate the existence of the same two groups identified by the differences in the kinetic parameters of the reaction with BAL as substrate. Clearly both effects depend on the packing of the side chains in the region surrounding position 441.
Models of the SoBADH A441 mutants
To interpret the observed kinetic and stability properties of the SoBADH mutants, we estimated the possible position and contacts of the changed residue 441 in the SoBADH structure by performing in silico mutations followed by energy minimization of the mutated structures. The results of these simulations are consistent with the known crystal structures of the ALDH10 enzymes from pea and tomato, which have Ile at position 441, and maize, which has a Cys at this position (Figure 5). This supports the validity of the models of the mutant enzymes for which there is no homologous crystal structure. When compared with the wild-type enzyme, the models of A441V and, particularly, of A441I show a displacement of W456 similar to that observed in the crystal structures of the I441-type isoenzymes of pea and tomato. This displacement causes the narrowing of the
Figure 4 Effects of mutation of A441 on the structural properties of SoBADH. Conformational characteristics of wild-type and mutant SoBADH enzymes examined by near- (A) and far-UV (B) CD spectra. (C) Thermal denaturation followed by changes at 222 nm in the far-UV CD signal. The temperature range was 20–90°C and the scan rate 1.5°C/min. The solid lines represent the best fit of the thermal transition data to a sigmoidal Boltzmann function by non-linear regression.
Figure 5 Structural comparisons of the A441 region of ALDH10 isoenzymes. A) Superimposition of the side chains of residues at positions 441, 443 and 456 (SoBADH numbering) in the known crystal structures of plant ALDH10 enzymes. Side chains are shown as sticks with oxygen atoms in red, nitrogen in blue, and sulphur in yellow. Carbon atoms are green in SoBADH (PDB code 4A0M), cyan in PsAMADH2 (PDB code 3IWJ), magenta in ZmAMADH1a (PDB code 4I8P), and black in SlAMADH1 (PDB code 4I9B). B) Superimposition of the same region in the minimized models of the in silico SoBADH mutants in which the residue at position 441 was changed. Side chains are shown as sticks with oxygen atoms in red, nitrogen in blue, and sulphur in yellow. Carbon atoms are green in the wild-type enzyme, magenta in A441C, grey in A441S, yellow in A441T, brown in A441V, salmon in A441F, and cyan in A441I. In the figure, the wild-type and A441T models mask the A441C and A441S models, respectively. The figure was generated using PyMOL (www.pymol.org).
cavity where the trimethylammonium group of BAL binds, thus explaining the low BADH activity of these two mutants. In contrast, W456 occupies almost the same position in the models of A441C, A441S and A441T as in the wild-type spinach and maize enzymes (Figure 5A and B), which is consistent with the significant BADH activity exhibited by these three mutants.
In the crystal structure of SoBADH the side chain of A441 interacts only with the active site residue W456, but in the known structures of the other three plant ALDH10s the residue at this position also makes contacts with W443 (SoBADH numbering) (Additional file 2: Figure S1A and Table S2). In all plant ALDH10s of known sequence, W456 is highly conserved (97.5%) and W443 strictly (100%) conserved. W443 is located at the interface between monomers and has a similar conformation in the ALDH10 crystal structures so far determined. The models of the SoBADH A441 mutants show that the number of contacts made by the side chain of the residue at position 441 increases considerably as the size of the side chain increases (Additional file 2: Figure S1B and Table S2). Regardless of the position of the thiol group at the start of the simulation, we always obtained a model of A441C that has the thiol
group pointing at W443, as in the ZmAMADH1a crystal structure, where the sulphur of the cysteine residue at position 441 makes closer contacts with W448 (equivalent to W443 in SoBADH) than with W461 (equivalent to W456 in SoBADH). The serine residue in A441S can have two alternative positions: one in which the hydroxyl group points at W443 and another in which it points at W456. The first of these two possible conformations, which is similar to that adopted by Cys, is less stable since the hydroxyl in this conformation would be too far from the aromatic ring of W443 to make any interaction with it. The second conformation, which is the one of the model shown in Additional file 2: Figure S1B, is favored because the hydroxyl group makes van der Waals contacts, and possibly polar-π interactions, with the ring of W456. In the A441T mutant, the hydroxyl group has only one possible conformation, the one that points to W456, similar to that of the Ser in the A441S mutant model. In the A441T minimized model the distance of the hydrogen atom of the hydroxyl group from the centroid of the aromatic face of the W456 ring is 2.96 Å, suggesting the possible existence of a hydroxyl-aromatic-ring hydrogen bond of the kind described by Levitt and Perutz [31]. The Val side chain in A441V fits well in this position and makes contacts with W456. The side chain of the Ile in the mutant A441I makes almost the same contacts as those observed in the crystal structures of the pea and tomato enzymes (PDB codes 3IWJ and 4I9B, respectively), and similarly pushes the side chain of W456. Finally, in the mutant A441F, the only possible way to accommodate the bulky benzyl group is by stacking against the aromatic ring of W456. However, the A441F model showed a clash of the F441 ring with the main chain in the region of residues P455, W456 and G457, which causes this region to move in order to accommodate the phenylalanine residue, thus narrowing the aldehyde entrance tunnel (not shown).
Discussion
Importance of size, polarity and conformation of the side chain at position 441 of SoBADH for the kinetics and stability
All data in this work agree with our previous proposal that only one amino acid residue at position 441 is critical for ALDH10 enzymes to accept or reject BAL as substrate [19]. I441-type isoenzymes possess low or very low activity with BAL whereas A441- and C441-type isoenzymes exhibit high activity with BAL [19]. The SoBADH mutants that have a residue of similar size to Ala or Cys, such as A441S or A441T, exhibit a high activity with BAL, but the mutants with a bulky nonpolar residue, such as A441V or A441I, have a very low activity with BAL (Figure 3). The exquisite sensitivity of the SoBADH affinity for BAL to the size of the side chain of the residue at position 441 is reflected in the finding that the Km(BAL) of the A441T mutant is lower than that of A441V. Val and Thr have similar sizes, but the methyl group of Val pointing at W456 is a hydroxyl group in Thr. Since the van der Waals radius of oxygen is 0.27 Å smaller than that of carbon [32], this difference would result in a lesser steric impediment of W456 to the binding of the trimethylammonium group of BAL in the A441T than in the A441V mutant. Also, the polar-π interaction between the hydroxyl group of T441 and the aromatic ring of W456 suggested by the model would reduce the distance between these two groups, thus widening the trimethylammonium-binding cavity when compared with that of the A441V mutant. Although the van der Waals radius of oxygen is 0.39 Å smaller than that of sulphur [32], the A441C mutant has a slightly but significantly lower Km(BAL) than A441S, which can be explained by the different conformations adopted by their side chains according to the models (Figure 5B and Additional file 2: Figure S1B). In the A441C mutant the atom closest to the W456 ring is a hydrogen, as in the wild-type enzyme, whereas in the A441S mutant the position of this hydrogen is occupied by a bulkier hydroxyl group. This could explain why the A441C SoBADH mutant has kinetic parameters for BAL similar to those of the wild-type enzyme. Regarding the mutant A441F, which has the bulkiest of the side chains introduced at this position, it was surprising to us that it had a lower Km(BAL) than the mutant A441I. This may be due to the stacking of the aromatic ring of F441 with that of W456, as suggested by the model, which would result in a more compact packing of both side chains, and therefore in a lesser steric impediment of W456 to the binding of the trimethylammonium group of BAL than in the A441I mutant. The movement of the main chain, also suggested by our model, and the consequent narrowing of the aldehyde-binding tunnel could explain the low kcat/Km(APAL) exhibited by this enzyme.
In summary, it is not only the size but also the conformation adopted by the side chain of residue 441 that matters in determining the position of the side chain of W456, and therefore the size of the pocket where the trimethylammonium group of BAL binds and the affinity for BAL. Assuming that Km values are an indication of affinity, the size of the residue at position 441 also appears to affect the binding of the nucleotide, although to a much lesser extent than the binding of the aldehyde, for reasons that are not yet clear.
The acquisition of new functions by proteins is limited because most mutations have destabilizing effects, particularly if the mutated residue is buried [33], as is the residue at position 441. However, the findings that the variants of SoBADH with six different residues at position 441 were correctly folded and have thermal stabilities in the range expected for mesophilic enzymes indicate that this position is highly evolvable and able
to accommodate several mutations, some of them giving rise to a new function: the BADH activity and, therefore, the capacity to participate in the synthesis of the osmoprotectant GB. The higher thermal stability of the A441V and A441I mutants relative to that of the wild-type enzyme and the A441C, A441S and A441T mutants is consistent with the extensive interactions made by the side chains of Val and Ile, as indicated by our models and the known crystal structures of ALDH10 enzymes that contain Ile at this position, and with the hydrophobicity of these two residues. In the case of the A441F mutant the strain exerted on the protein by the bulky benzyl ring of Phe probably causes a decrease in the stability of this enzyme when compared with that of the A441I or A441V mutants.
Evolution of plant ALDH10 enzymes
One of the strategies used by plants to respond to the demands of their environment involves the evolution of novel metabolic pathways by recruiting enzymes from old ones. The phylogenetic evidence presented here indicates that this is the case of the enzymes exhibiting BADH activity that participate in the synthesis of the osmoprotectant glycine betaine (GB), which in angiosperms evolved from I441-type ALDH10 enzymes.
Our results show that non-flowering plants possess only one ALDH10 gene expressing an I441-type enzyme, but flowering plants (monocots plus eudicots) have a variable number of ALDH10 genes coding for I441-, A441- or C441-type isoenzymes. The presence of additional gene copies in monocots and eudicots can be explained by several small- and large-scale duplications, which were accompanied by gene losses and genome rearrangements [34]. Indeed, the distribution of the duplicated ALDH10 enzymes that can be seen in the phylogenetic tree in Figure 1 is in agreement with previous observations: (i) various eudicot families independently underwent genome duplication [35,36]; (ii) one whole-genome duplication occurred early in the monocot lineage after its divergence from the eudicot lineage [37]; and (iii) recent duplication events took place in maize, soybean and grape [34]. Duplication is a prominent feature of the plant genomic architecture, and much of plant diversity may have arisen following the duplication and adaptive specialization of pre-existing genes [38]. In this context, duplication of the ALDH10 gene in some plants allowed a functional specialization when I441 mutated to A441 or C441 in one of the two copies of the duplicated gene, as proposed by Díaz-Sánchez et al. [19]. The ancient enzymes most probably had the same AMADH activity as the present-time I441-type isoenzymes, but they may also have had a vestigial BADH activity, which was improved during the evolution of plants via an evolutionary pathway of sequential
single mutations of the duplicates. In this way, the A441- or C441-type isoenzymes acquired an extra BADH activity, necessary for GB synthesis, while the original peroxisomal I441-type isoenzymes remained as AMADHs devoid of BADH activity and most likely retained their metabolic role. Although the gain of the BADH activity does not imply the loss of the other AMADH activities when assayed in vitro ([19]; this work), the above observations indicate that the physiological functions of these isoenzymes are not interchangeable. Indeed, in many plant species the duplication of the ALDH10 gene did not result in functional redundancy, since the duplicated genes gave rise not only to enzymes with BADH activity but also to non-peroxisomal I441-type AMADH isoenzymes, as in the Solanaceae, Malvaceae, Rosaceae and Brassicaceae families, whose different cellular location probably allows them to perform a physiological function different from that of the peroxisomal ones. This is one possible reason for the permanence of the duplicate genes that encode non-peroxisomal I441-type isoenzymes.
Our phylogenetic analysis also shows that all known
plant genomes contain at least one ALDH10 gene coding for a peroxisomal I441-type isoenzyme, which is the one present even in those plants with a sequenced genome that possess only one ALDH10 gene. This underscores the importance of the biochemical processes in which these peroxisomal I441-type isoenzymes participate, and indicates that, as mentioned above, in vivo they cannot be replaced by the ALDH10 isoenzymes that have gained the extra BADH activity, i.e. the A441- or C441-type isoenzymes. Together, our findings strongly support that the activity of the peroxisomal I441-type isoenzymes is essential for the plant, which is not unexpected considering that the AMADH activity of these enzymes is involved in the catabolism of polyamines taking place in the peroxisome [39]. As mentioned above, the cytosolic I441-type isoenzyme may be involved in other physiological processes, for instance the synthesis of several osmoprotectants, such as 4-aminobutyrate, β-alanine, or carnitine, by oxidizing ABAL, APAL or TMABAL. The different intracellular locations of the ALDH10 isoenzymes, predicted and in some cases experimentally proven, suggest that there are two kinds of the I441-type, peroxisomal (the commonest) and cytosolic, both devoid of significant BADH activity, and possibly two kinds of the A441- and C441-type (chloroplastic and cytosolic in the first case, and peroxisomal and cytosolic in the second), all of them having BADH activity.
Regarding the intracellular location, the Amaranthaceae
A441-type isoenzymes most likely are chloroplastic, like those from spinach [40] and sugarbeet [28], although they lack a typical chloroplast-targeting signal [14]. Therefore, in the Amaranthaceae the synthesis of GB most likely takes place in the chloroplasts, as experimentally proven
for spinach [13]. This is in accordance with the chloroplastic location of the CMO enzyme in this plant [41] and with the fact that the CMO activity requires the electrons provided by reduced ferredoxin [17] and plant ferredoxins are plastidic proteins [42]. But in the Poaceae plants that have the C441-type isoenzyme, GB synthesis may take place in the cytosol, since some of these isoenzymes are non-peroxisomal and do not have a clear chloroplast-targeting signal, or in peroxisomes, since other C441-type isoenzymes have the peroxisomal signal. Indeed, it has been suggested that in barley the synthesis of GB may take place in the peroxisome, given the finding of a peroxisomal CMO in this plant [43]. But this proposal is at odds with the proven non-peroxisomal location of the C441-type ALDH10 isoenzyme of barley [9], and with the fact that peroxisomal plant ferredoxins have not been reported so far. Regardless of the peroxisomal or non-peroxisomal CMO location, it seems probable that the need to transport either choline or BAL into the peroxisomes limits the synthesis of GB in those plants with C441-type peroxisomal isoenzymes in comparison with those in which this isoenzyme is non-peroxisomal. This is in full agreement with the finding that the level of GB accumulation observed in cereal plants that have a predicted peroxisomal C441-type isoenzyme (maize, sorghum and foxtail millet) is much lower than that in plants that have the non-peroxisomal C441-type, such as wheat and barley [44-46]. It could be predicted that other Poaceae species which have a moderate ability to accumulate GB, such as Pennisetum [44] and Panicum [46], have peroxisomal C441-type isoenzymes, while those that are high GB accumulators, such as Secale [44,46], have the non-peroxisomal C441-type isoenzyme.
Since the ability to synthesize the osmoprotectant GB
protects the plant against the most frequent environmental stresses such as salinity, drought, and low temperatures, as well as indirectly against other stresses that usually accompany the former, such as oxidative stress and high temperatures (rev. in [47,48]), it is clear that the presence of A441- or C441-type isoenzymes provides a strong adaptive advantage, not only to halophytes, which grow in habitats where saline soils are prevalent, but also to mesophytes, which may experience sporadic episodes of water deficit or freezing temperatures. This, together with the relative ease of the evolutionary process, explains why the A441- and C441-type isoenzymes evolved independently several times during the evolution of flowering plants. However, the A441- or C441-type isoenzymes exhibit a limited phyletic distribution (Figures 1B and 1C), which agrees with the finding that GB accumulation is also restricted to some eudicot and monocot families [19]. This finding further supports the proposal that a significant BADH activity is essential for GB accumulation [19]. Indeed, a significant BADH
activity would be necessary not only to produce the levels of GB needed for an osmoprotectant, but also to prevent the accumulation of BAL, which is produced by CMO, up to toxic concentrations. But although the BADH activity is a sine qua non condition for the plant to become a GB-accumulator, the current experimental evidence, obtained by studying wild-type GB-accumulator plants and transgenic non-accumulator plants that express BADH or CMO genes (rev. in [1]), suggests that the acquisition of the ability to synthesize GB by evolving a BADH activity in some plant ALDH10 enzymes should have been accompanied by other adaptations, such as: (i) the gain of a significant CMO activity; (ii) the adequate intracellular targeting of both the CMO and BADH enzymes; (iii) the regulation by osmotic stress signals of the expression of the CMO and BADH genes, as well as of those coding for critical enzymes of the choline biosynthetic pathway; and (iv) probably also the development of an efficient transport of choline and GB through intracellular membranes.
Evolutionary pathway for the acquisition of BADH activity by plant ALDH10 enzymes
Our results indicate that plant ALDH10 enzymes may have followed several evolutionary pathways for the acquisition of BADH activity, particularly the ones involving any of the parsimonious intermediates with Val, Thr or Ser at the position equivalent to 441 of SoBADH. The evolutionary intermediate carrying F441 would have a diminished AMADH activity, as judged by the low kcat/Km values of the SoBADH A441F mutant, which could make the route through it less probable. But as the enzymes that underwent these mutations were coded by gene duplicates and the original I441-type enzymes were conserved, the plant would not have experienced any deleterious effect even if the mutation to Phe took place. But this mutation would not have provided any advantage to the plant either, in contrast to the Ala or Ser mutations, and even to the mutation that led to non-peroxisomal I441-type isoenzymes. Thus, the Phe441 mutation could have been eliminated during evolution. Interestingly, natural ALDH10s of the T441-type (2 sequences) and V441-type (3 sequences), which are the intermediates in the parsimonious pathway to A441 from I441, were found within the plant ALDH10 protein sequences retrieved from public databases. We anticipate that enzymes of the S441-type will be found when more monocot sequences are known. The relatively high frequency of A441-type ALDH10 isoenzymes may result from a bias in the available reports, since most of the studies have been carried out in halophytic plants in which these enzymes participate in their osmotolerance mechanisms. Regarding the C441-type isoenzymes, they are present in cereal plants, which have been extensively studied for their high agronomical interest.
Figure 6 Comparison of thermal stabilities and catalytic efficiencies of wild-type and A441 SoBADH mutants. The percentage of catalytic efficiency for the oxidation of BAL (green down triangles) or APAL (red up triangles) was calculated from the kcat/Km values given in Table 1 and Table 2, respectively, taking as 100% the values of the A441I mutant, which was assumed to represent the ancestral AMADH enzyme. Thermal stability data (apparent Tm values) are those given in Figure 4. SoBADH enzymes: A, wild-type; C, A441C; S, A441S; T, A441T; V, A441V; F, A441F; I, A441I.
Indeed, in cereals an evolutionary process directed by humans to increase their tolerance to drought or saline stress may have occurred.
The mutation of the original Ile residue at this position to Val, Thr or Ser could be considered neutral, since it would not have had any deleterious effect on the primitive AMADH function, as judged by the kcat/Km(APAL) values of the mutant enzymes studied here. But while in the case of the mutation to Val the resulting enzyme would still be an AMADH devoid of significant BADH activity, in the case of the mutation to Thr or Ser the enzyme would have experienced an important increase in the vestigial BADH activity of the original enzyme, which would make these enzymes true BADHs. The kinetic data of the mutant enzymes suggest that the BADH activity of the T441-type or the S441-type intermediates would be increased if a second mutation to Ala or Cys took place. The same BADH activity would be gained after the mutation of Val to Ala. These increases could be of enough physiological importance to explain the selection of the enzymes with this second mutation to either Ala or Cys, which in turn would explain, at least in part, the relatively higher frequency of the A441- or C441-type isoenzymes over that of the V441-, T441-, or S441-type.
The decrease in the thermal stability of those ALDH10
enzymes that underwent the mutational event by which they acquired BADH activity, as suggested by our results with the SoBADH mutants at position 441, would not compromise their stability under physiological conditions. Nor would folding be affected, as judged by the similar level of expression of these mutant enzymes as soluble proteins in E. coli. As a summary, Figure 6 shows the comparison of the thermal stabilities and catalytic efficiencies of the seven SoBADH proteins studied in this work (the wild type and six mutants) to illustrate the possible fitness landscape of the evolution of the BADH activity within the ALDH10 family. As can be seen, the change of the Ile at position 441 does not affect the catalytic efficiency with APAL as substrate but it has a profound effect on the catalytic efficiency with BAL. The thermostability cost of this change would be outweighed by the advantage of the gain of the BADH activity. Indeed, this may be the reason for the scarcity of V441-type isoenzymes in the present-time plant ALDH10s, which is rather surprising given that the change of I441 for V441 has almost no effect on either AMADH activity or protein stability. It can be speculated that a subsequent mutation to A441 is so advantageous for the plant that most of the V441 enzymes evolved to this final stage.
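The normalization used in Figure 6 (catalytic efficiency expressed as a percentage of that of the A441I mutant) can be reproduced from the tabulated kcat/Km values. The following Python sketch does so for APAL using the mean values of Table 2; the function and variable names are hypothetical, and the standard deviations are ignored for simplicity.

```python
# Mean kcat/Km for APAL (mM^-1 s^-1), copied from Table 2.
APAL_EFFICIENCY = {
    "Wild type": 256,
    "A441C": 931,
    "A441S": 550,
    "A441T": 326,
    "A441V": 473,
    "A441F": 92,
    "A441I": 375,
}


def percent_of_reference(values, reference="A441I"):
    """Express each value as a percentage of the reference enzyme's value."""
    ref = values[reference]
    return {name: 100.0 * value / ref for name, value in values.items()}


relative = percent_of_reference(APAL_EFFICIENCY)
for name, pct in relative.items():
    print(f"{name}: {pct:.0f}%")
```

On this scale only A441F falls well below the reference, in line with the low kcat/Km(APAL) of this mutant noted in the text.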
Conclusions
The phylogenetic, biochemical, and structural evidence presented here supports relatively smooth evolutionary processes for the gain of the BADH activity in plants, and therefore for the acquisition of the potential to synthesize the osmoprotectant GB, since evolution took place in duplicated genes and the four parsimonious evolutionary intermediates are functionally and structurally feasible, particularly three of them. Our results strongly support that ancestral genes encoding peroxisomal I441-type isoenzymes devoid of BADH activity mutated on several occasions during terrestrial plant evolution, so that the present-time A441- or C441-type isoenzymes that exhibit BADH activity do not form a monophyletic group inside the ALDH10 protein family. We also provide additional evidence of the critical role played by the residue at position 441 (SoBADH numbering) in the affinity for BAL of plant ALDH10 enzymes. Finally, our findings about the differences in intracellular location of the isoenzymes with BADH activity indicate that the peroxisomal synthesis of GB is constrained by some factor(s), and explain the known fact that there are high- and medium-GB accumulator plants, the former having non-peroxisomal A441- or C441-type isoenzymes, the latter being those with a peroxisomal C441-type.
Methods
Chemicals and biochemicals
Betaine aldehyde chloride, the diethylacetal of APAL and NAD+ were obtained from Sigma-Aldrich Química, S.A.
de C.V. (Toluca, México). APAL was freshly prepared by hydrolyzing its diethylacetal form as described by Flores and Filner [49]. The exact concentration of the resulting free APAL was determined in each experiment by the amount of NADH produced after its complete oxidation in the reaction catalyzed by SoBADH in the presence of an excess of NAD+.
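The end-point determination of APAL described above amounts to a Beer-Lambert calculation. The Python sketch below illustrates it under assumptions that are not stated in the text: that NADH was quantified by its absorbance at 340 nm, that the commonly used molar absorptivity of 6220 M⁻¹ cm⁻¹ applies, and that one NADH is formed per APAL oxidized. The function name and the example absorbance change are hypothetical.

```python
# Commonly used molar absorptivity of NADH at 340 nm (assumption, not from the text).
NADH_EPSILON_340 = 6220.0  # M^-1 cm^-1


def apal_concentration_mM(delta_a340, path_cm=1.0):
    """APAL concentration in the cuvette (mM) from the total absorbance change.

    Assumes complete oxidation and a 1:1 stoichiometry between APAL and NADH.
    """
    molar = delta_a340 / (NADH_EPSILON_340 * path_cm)
    return molar * 1000.0


# Hypothetical end-point absorbance change in a 1-cm cuvette.
print(f"{apal_concentration_mM(0.311):.3f} mM")
```

The concentration of the APAL stock would then follow from the dilution applied in the assay.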
Site-directed mutagenesis, production and purification of wild-type and mutant SoBADH enzymes
The plasmid pET28-SoBADH, containing the full sequence of the spinach BADH gene and an N-terminal His-tag [19], was used as template for site-directed mutagenesis, which was performed via polymerase chain reaction (PCR) using the Quick Change XL-II Site Directed Mutagenesis system (Agilent) and the following mutagenic primers: A441V, GAAGGCTCTAGAAGTTGGAGTTGTTTGGGTTAATTGCTCAC (forward) and TTGTGAGCAATTAACCCAAACAACTCCAACTTCTAGAGCC (reverse); A441S, GAAGGCTCTAGAAGTTGGAACTGTTTGGGTTAATTGCTCAC (forward) and TTGTGAGCAATTAACCCAAACAGTTCCAACTTCTAGAGCC (reverse); A441T, GAAGGCTCTAGAAGTTGGAACCGTTTGGGTTAATTGCTCAC (forward) and TTGTGAGCAATTAACCCAAACGGTTCCAACTTCTAGAGCC (reverse); A441F, GAAGGCTCTAGAAGTTGGATTTGTTTGGGTTAATTGCTCAC (forward) and TTGTGAGCAATTAACCCAAACAAATCCAACTTCTAGAGCC (reverse). The non-complementary mutagenic codons are in italics. Mutagenesis was confirmed by DNA sequencing. The expression and purification of the recombinant proteins were carried out as reported [19]. Protein concentrations were determined spectrophotometrically, using the molar absorptivity at 280 nm of 86,400 M−1 cm−1 deduced from the amino acid sequence by the method of Gill and von Hippel [50].
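Because the italics that marked the mutagenic codons are not preserved in this transcript, the codons can be recovered by comparing the four forward primers position by position. The Python sketch below does this with the sequences exactly as transcribed above; the helper names are hypothetical and the codon table contains only the four codons needed here, taken from the standard genetic code. Note that, as transcribed, the A441S primer carries ACT at the variable position, which the standard code assigns to Thr, so this sequence should be checked against the original publication; the sketch reports what is printed and does not attempt to correct it.

```python
# Forward mutagenic primers, copied as transcribed in the text.
FORWARD_PRIMERS = {
    "A441V": "GAAGGCTCTAGAAGTTGGAGTTGTTTGGGTTAATTGCTCAC",
    "A441S": "GAAGGCTCTAGAAGTTGGAACTGTTTGGGTTAATTGCTCAC",
    "A441T": "GAAGGCTCTAGAAGTTGGAACCGTTTGGGTTAATTGCTCAC",
    "A441F": "GAAGGCTCTAGAAGTTGGATTTGTTTGGGTTAATTGCTCAC",
}

# Subset of the standard genetic code covering only the codons found below.
CODON_TO_RESIDUE = {"GTT": "Val", "ACT": "Thr", "ACC": "Thr", "TTT": "Phe"}


def variable_columns(sequences):
    """Indices at which equal-length, aligned sequences differ."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


columns = variable_columns(list(FORWARD_PRIMERS.values()))
start, end = columns[0], columns[-1] + 1  # contiguous block of differing positions
mutagenic_codons = {name: seq[start:end] for name, seq in FORWARD_PRIMERS.items()}

for name, codon in mutagenic_codons.items():
    print(name, codon, CODON_TO_RESIDUE.get(codon, "?"))
```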
Activity assay and kinetic characterization of the wild-type and mutant SoBADH enzymes
Steady-state initial velocities of wild-type SoBADH and its mutants were measured spectrophotometrically as reported [19] at pH 8.0 using 0.2 mM NAD+ and variable concentrations of BAL or APAL. The fixed concentration of NAD+ used in these experiments was saturating, as indicated by the finding of very close kcat values in experiments where the nucleotide was the variable substrate and the aldehyde was kept constant at a concentration ten times its Km value. All assays were initiated by addition of the enzyme. Each saturation curve was determined at least in duplicate using enzymes from two different purification batches. Initial velocity data were analyzed by non-linear regression calculations using the Michaelis-Menten equation. Equation 1 was used for the APAL saturation data, which exhibited substrate inhibition:
$$\frac{v}{[E]} = \frac{k_{cat}\,[S]}{K_m + [S]\left(1 + [S]/K_{IS}\right)} \qquad (1)$$
where v is the experimentally determined initial velocity, [E] is the enzyme concentration, kcat is the first-order rate constant for product formation at saturating substrate (equal to the maximal velocity, Vmax, divided by the enzyme concentration), [S] is the concentration of the variable substrate, Km is the concentration of substrate at half-maximal velocity, and KIS is the substrate inhibition constant. Although substrate inhibition of SoBADH by aminoaldehydes is partial [19], the highest substrate concentration used in our experiments did not allow distinguishing between total and partial inhibition, and therefore Equation 1 for total inhibition was used.
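The behaviour of Equation 1 can be illustrated numerically. The Python sketch below evaluates the total substrate inhibition model, shows that it reduces to the Michaelis-Menten equation when KIS is very large, and locates the substrate concentration at which the rate is maximal, which for this model is the square root of Km times KIS. The parameter values and function names are hypothetical and are not taken from the data in this work.

```python
import math


def rate_per_enzyme(s, kcat, km, kis):
    """v/[E] for total substrate inhibition: kcat*S / (Km + S*(1 + S/KIS))."""
    return kcat * s / (km + s * (1.0 + s / kis))


def michaelis_menten(s, kcat, km):
    """v/[E] without substrate inhibition."""
    return kcat * s / (km + s)


# Hypothetical parameters: kcat in s^-1, Km and KIS in µM.
kcat, km, kis = 1.0, 4.0, 400.0

# Setting the derivative of the model with respect to S to zero gives S = sqrt(Km*KIS).
s_opt = math.sqrt(km * kis)

for s in (2.0, 20.0, s_opt, 80.0, 400.0):
    print(f"[S] = {s:6.1f} µM  v/[E] = {rate_per_enzyme(s, kcat, km, kis):.3f} s^-1")
```

With these values the rate rises up to [S] = 40 μM and declines at higher concentrations, which is the qualitative pattern that the APAL saturation curves are described as showing.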
Circular dichroism spectroscopy
CD signals were recorded with a Jasco J-715 spectropolarimeter (Jasco Inc., Easton, MD) equipped with a Peltier-type temperature control system (Model PTC-423S) and calibrated with d-10-(+)-camphorsulfonic acid. Near-UV (250–320 nm) and far-UV (200–250 nm) CD spectra were recorded for samples of 0.25 or 1.0 mg/mL protein concentration, respectively, placed in quartz cuvettes of 1.0-cm and 0.1-cm path length, respectively. Data were collected at 0.5 nm (near-UV) or 1.0 nm (far-UV) intervals, with a bandwidth of 1.0 nm and at a scan rate of 20 nm/min. Spectra were averaged over 5 scans and the average spectrum of a reference sample without protein was subtracted. The observed ellipticities were converted to mean residue ellipticities [Θ] on the basis of a mean molecular mass per residue of 109.1. Thermally induced protein denaturation was monitored by following the changes in ellipticity at 222 nm while increasing the temperature from 20 to 90°C at a constant rate of 1.5°C/min. Determination of apparent Tm values was performed by non-linear regression fit of the data to a single Boltzmann sigmoidal function. ORIGIN software (OriginLab Corp.) was used for data analysis and display.
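The two calculations mentioned in this section can be sketched in Python as follows. The first function applies the usual conversion of an observed ellipticity in millidegrees to a mean residue ellipticity, with the mean residue mass of 109.1 given in the text; the second is a single Boltzmann sigmoid in its common four-parameter form, whose midpoint corresponds to the apparent Tm. The function names, the parameterization of the sigmoid and the example numbers are assumptions made for illustration; the fitting itself (performed in ORIGIN in this work) is not reproduced.

```python
import math


def mean_residue_ellipticity(theta_mdeg, conc_mg_per_ml, path_cm, mrw=109.1):
    """[Θ] in deg cm^2 dmol^-1 from the observed ellipticity in millidegrees."""
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_per_ml)


def boltzmann(temp, low, high, tm, slope):
    """Single Boltzmann sigmoid: 'low' plateau below and 'high' plateau above tm."""
    return high + (low - high) / (1.0 + math.exp((temp - tm) / slope))


# Hypothetical far-UV reading: -20 mdeg at 1.0 mg/mL in a 0.1-cm cuvette.
print(mean_residue_ellipticity(-20.0, 1.0, 0.1))

# At temp == tm the signal lies halfway between the two plateaus.
print(boltzmann(50.0, low=-10.0, high=-2.0, tm=50.0, slope=2.0))
```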
In silico mutagenesis and modeling
Mutations in the crystal structure of SoBADH (PDB code 4A0M) at position 441 were generated in silico using the standard rotamer library of Coot [51]. Mutant models were subjected to a 1000-step energy-minimization process using the Amber force field parameters in UCSF Chimera [52].
Sequence analyses
ALDH10 amino acid and nucleotide sequences were retrieved by Blast searches at the NCBI site [53] (http://blast.ncbi.nlm.nih.gov/Blast.cgi) or the Phytozome v9.1 database ([54]; http://www.phytozome.net/). Progressive multiple amino acid sequence alignments were performed with ClustalX version 2 ([55]; http://www.clustal.org/clustal2/) using as a guide a structural alignment constructed with the VAST algorithm [56] that included all non-redundant ALDH10 protein structures in the PDB [57]. Amino acid sequence alignments were corrected manually using BioEdit ([58]; http://www.mbio.ncsu.edu/bioedit/bioedit.html) according to gapped BLASTP results. To identify protein sequences as members of the ALDH10 family, the phylogenetic analyses previously performed by Julián-Sánchez et al. [59] were used as a reference. Phylogenetic analyses were conducted using the MEGA5 software ([60]; http://www.megasoftware.net). Four methods were used to infer phylogenetic relationships: maximum likelihood (ML), maximum parsimony (MP), minimum evolution (ME), and neighbor joining (NJ). The amino acid substitution model described by Whelan and Goldman [61], using a discrete Gamma distribution with five categories, was chosen as the best substitution model, since it gave the lowest Bayesian Information Criterion values and corrected Akaike Information Criterion values [62] in MEGA5 [60]. The gamma shape parameter value (+G parameter = 1.1824) was estimated directly from the data with MEGA5. Confidence for the internal branches of the phylogenetic tree obtained using the ML method was determined through bootstrap analysis (500 replicates).
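The model-selection criteria mentioned above can be illustrated with a short Python sketch. It uses the textbook definitions AIC = 2k − 2lnL, AICc = AIC + 2k(k+1)/(n − k − 1) and BIC = k·ln(n) − 2lnL, where k is the number of free parameters and n the sample size (taken here, as an assumption, to be the number of alignment sites). The candidate models, log-likelihoods and parameter counts are invented for illustration; they are not MEGA5 output from this work.

```python
import math

def aic(log_l, k):
    """Akaike Information Criterion."""
    return 2.0 * k - 2.0 * log_l

def aicc(log_l, k, n):
    """AIC with the small-sample correction (requires n > k + 1)."""
    return aic(log_l, k) + 2.0 * k * (k + 1) / (n - k - 1)

def bic(log_l, k, n):
    """Bayesian Information Criterion."""
    return k * math.log(n) - 2.0 * log_l

# Hypothetical candidates: model name -> (log-likelihood, free parameters).
candidates = {
    "WAG": (-25500.0, 199),
    "WAG+G": (-24900.0, 200),
    "JTT+G": (-25050.0, 200),
}
n_sites = 500  # assumed sample size (alignment length), illustration only

best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
best_aicc = min(candidates, key=lambda m: aicc(*candidates[m], n_sites))

if __name__ == "__main__":
    for name, (log_l, k) in candidates.items():
        print(f"{name:6s} BIC = {bic(log_l, k, n_sites):10.1f} "
              f"AICc = {aicc(log_l, k, n_sites):10.1f}")
    print("lowest BIC:", best_bic, "| lowest AICc:", best_aicc)
```

The model with the lowest score is preferred under each criterion; in this made-up example both criteria agree.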
Additional files
Additional file 1: Table S1. Aldehyde dehydrogenases identified as members of the ALDH10 family.
Additional file 2: Figure S1. Interactions of the residue at the position equivalent to A441 of SoBADH. Table S2. Distances of the side-chain atoms of the residue at position 441 to their closest neighbors.
Abbreviations
ALDH: Aldehyde dehydrogenase; ABAL: 4-Aminobutyraldehyde; AMADH: Aminoaldehyde dehydrogenase; APAL: 3-Aminopropionaldehyde; BADH: Betaine aldehyde dehydrogenase; BAL: Betaine aldehyde; CMO: Choline monooxygenase; GB: Glycine betaine; PDB: Protein data bank; PsAMADH: Pea AMADH; SlAMADH: Tomato AMADH; SoBADH: Spinach BADH; TMABAL: 4-Trimethylaminobutyraldehyde; WT: Wild type; ZmAMADH: Maize AMADH.
Competing interests
All the authors declare that they have no competing interests.
Authors’ contributions
RAMC conceived, designed and coordinated the study, analyzed kinetic and structural data, and wrote the manuscript. HRR and AJS carried out the phylogenetic analysis, interpreted and wrote the results of this analysis, and contributed to the general discussion and writing of the manuscript. CMJ constructed and purified the mutant proteins and performed the kinetic experiments. GGR carried out the CD and thermostability experiments and analyzed these data. LGS made the in silico mutants, constructed their models and helped with the making of the figures. All authors read and approved the final manuscript.
Acknowledgements
This work was supported by UNAM-PAPIIT grants IN217814 to RAMC and IN216513 to HRR. We thank Dr. LP Martínez-Castilla (Faculty of Chemistry, UNAM) for helpful discussions and Dr. I. Chávez-Béjar (Faculty of Chemistry, UNAM) for her help with the sequencing of the recombinant mutant enzymes.
Author details
1Departamento de Bioquímica, Facultad de Química, Universidad Nacional Autónoma de México, México D.F., México. 2Departamento de Bioquímica, Facultad de Medicina, Universidad Nacional Autónoma de México, México D.F., México.
Received: 27 March 2014 Accepted: 22 May 2014 Published: 29 May 2014
References
1. Chen THH, Murata N: Glycinebetaine protects plants against abiotic stress: mechanisms and biotechnological applications. Plant Cell Environ 2011, 34(1):1–20.
2. Craig SA: Betaine in human nutrition. Am J Clin Nutr 2004, 80(3):539–549.
3. Vasiliou V, Bairoch A, Tipton KF, Nebert DW: Eukaryotic aldehyde dehydrogenase (ALDH) genes: human polymorphisms, and recommended nomenclature based on divergent evolution and chromosomal mapping. Pharmacogenetics 1999, 9(4):421–434.
4. Vojtechová M, Hanson AD, Muñoz-Clares RA: Betaine-aldehyde dehydrogenase from amaranth leaves efficiently catalyzes the NAD-dependent oxidation of dimethylsulfoniopropionaldehyde to dimethylsulfoniopropionate. Arch Biochem Biophys 1997, 337(1):81–88.
5. Trossat C, Rathinasabapathi B, Hanson AD: Transgenically expressed betaine aldehyde dehydrogenase efficiently catalyzes oxidation of dimethylsulfoniopropionaldehyde and ω-aminoaldehydes. Plant Physiol 1997, 113(4):1457–1461.
6. Šebela M, Brauner F, Radová A, Jacobsen S, Havliš J, Galuszka P, Peč P: Characterisation of a homogeneous plant aminoaldehyde dehydrogenase. Biochim Biophys Acta 2000, 1480(1–2):329–341.
7. Livingstone JR, Maruo T, Yoshida I, Tarui Y, Hirooka K, Yamamoto Y, Tsutui N, Hirasawa E: Purification and properties of betaine aldehyde dehydrogenase from Avena sativa. J Plant Res 2003, 116(2):133–140.
8. Oishi H, Ebina M: Isolation of cDNA and enzymatic properties of betaine aldehyde dehydrogenase from Zoysia tenuifolia. J Plant Physiol 2005, 162(10):1077–1086.
9. Fujiwara T, Horia K, Ozaki K, Yokota Y, Mitsuya S, Ichiyanagi T, Hattori T, Takabe T: Enzymatic characterization of peroxisomal and cytosolic betaine aldehyde dehydrogenases in barley. Physiol Plant 2008, 134(1):22–30.
10. Bradbury LMT, Gillies SA, Brushett D, Waters DLE, Henry RJ: Inactivation of an aminoaldehyde dehydrogenase is responsible for fragrance in rice. Plant Mol Biol 2008, 68(4–5):439–449.
11. Brauner F, Šebela M, Snégaroff J, Peč P, Meunier J-C: Pea seedling aminoaldehyde dehydrogenase: primary structure and active site residues. Plant Physiol Biochem 2003, 41(1):1–10.
12. Tylichová M, Kopečný D, Moréra S, Briozzo P, Lenobel R, Snégaroff J, Šebela M: Structural and functional characterization of plant aminoaldehyde dehydrogenase from Pisum sativum with a broad specificity for natural and synthetic aminoaldehydes. J Mol Biol 2010, 396(4):870–882.
13. Hanson AD, May AM, Grumet R, Bode J, Jamieson GC, Rhodes D: Betaine synthesis in chenopods: localization in chloroplasts. Proc Natl Acad Sci U S A 1985, 82(11):3678–3682.
14. Weretilnyk EA, Hanson AD: Betaine aldehyde dehydrogenase from spinach leaves: purification, in vitro translation of the mRNA, and regulation by salinity. Arch Biochem Biophys 1989, 271(1):56–63.
15. Arakawa K, Takabe T, Sugiyama T, Akazawa T: Purification of betaine-aldehyde dehydrogenase from spinach leaves and preparation of its antibody. J Biochem 1987, 101(6):1485–1488.
16. Valenzuela-Soto E, Muñoz-Clares RA: Purification and properties of betaine aldehyde dehydrogenase extracted from detached leaves of Amaranthus hypochondriacus L. subjected to water deficit. J Plant Physiol 1994, 143(2):145–152.
17. Burnet M, Lafontaine PJ, Hanson AD: Assay, purification, and partial characterization of choline monooxygenase from spinach. Plant Physiol 1995, 108(2):581–588.
18. Hibino T, Meng YL, Kawamitsu Y, Uehara N, Matsuda N, Tanaka Y, Ishikawa H, Baba S, Takabe T, Wada K, Ishii T, Takabe T: Molecular cloning and functional characterization of two kinds of betaine-aldehyde dehydrogenase in betaine-accumulating mangrove Avicennia marina (Forsk.) Vierh. Plant Mol Biol 2001, 45(3):353–363.
19. Díaz-Sánchez AG, González-Segura L, Mújica-Jiménez C, Rudiño-Piñera E, Montiel C, Martínez-Castilla LP, Muñoz-Clares RA: Amino acid residues critical for the specificity for betaine aldehyde of the plant ALDH10 isoenzyme involved in the synthesis of glycine betaine. Plant Physiol 2012, 158(4):1570–1582.
20. Kopečný D, Končitíková R, Tylichová M, Vigouroux A, Moskalíková H, Soural M, Šebela M, Moréra S: Plant ALDH10 family: identifying critical residues for substrate specificity and trapping a thiohemiacetal intermediate. J Biol Chem 2013, 288(13):9491–9507.
21. Joseph S, Murphy D, Bhave M: Glycine betaine biosynthesis in saltbushes (Atriplex spp.) under salinity stress. Biologia 2013, 68(5):879–895.
22. Richardson AO, Palmer JD: Horizontal gene transfer in plants. J Exp Bot 2007, 58(1):1–9.
23. Bock R: The give-and-take of DNA: horizontal gene transfer in plants. Trends Plant Sci 2010, 15(1):11–22.
24. Lingner T, Kataya AR, Antonicelli GE, Benichou A, Nilssen K, Chen X-Y, Siemsen T, Morgenstern B, Meinicke P, Reumann S: Identification of novel plant peroxisomal targeting signals by a combination of machine learning methods and in vivo subcellular targeting analyses. Plant Cell 2011, 23(4):1556–1572.
25. Chowdhary G, Kataya ARA, Lingner T, Reumann S: Non-canonical peroxisome targeting signals: identification of novel PTS1 tripeptides and characterization of enhancer elements by computational permutation analysis. BMC Plant Biol 2012, 12:142.
26. Werdan K, Heldt HW: Accumulation of bicarbonate in intact chloroplasts following a pH gradient. Biochim Biophys Acta 1972, 283(3):430–441.
27. Heineke D, Riens B, Grosse H, Hoferichter P, Peter U, Flugge U, Heldt HW: Redox transfer across the inner chloroplast envelope membrane. Plant Physiol 1991, 95(4):1131–1137.
28. Rathinasabapathi B, McCue KF, Gage DA, Hanson AD: Metabolic engineering of glycine betaine synthesis: plant betaine aldehyde dehydrogenases lacking typical transit peptides are targeted to tobacco chloroplasts where they confer betaine aldehyde resistance. Planta 1994, 193(2):155–162.
29. Vojtechová M, Rodríguez-Sotres R, Valenzuela-Soto EM, Muñoz-Clares RA: Substrate inhibition by betaine aldehyde dehydrogenase from leaves of Amaranthus hypochondriacus L. Biochim Biophys Acta 1997, 1341(1):49–57.
30. Garza-Ramos G, Mújica-Jiménez C, Muñoz-Clares RA: Potassium and ionic strength effects on the conformational and thermal stability of two aldehyde dehydrogenases reveal structural and functional roles of K+-binding sites. PLoS One 2013, 8(1):e54899.
31. Levitt M, Perutz MF: Aromatic rings act as hydrogen bond acceptors. J Mol Biol 1988, 201(4):751–754.
32. Alvarez S: A cartography of the van der Waals territories. Dalton Trans 2013, 42(24):8617–8636.
33. Tokuriki N, Stricher F, Serrano L, Tawfik DS: How protein stability and new functions trade off. PLoS Comput Biol 2008, 4(2):e1000002.
34. Van de Peer Y, Fawcett JA, Proost S, Sterck L, Vandepoele K: The flowering world: a tale of duplications. Trends Plant Sci 2009, 14(12):680–688.
35. Van de Peer Y, Maere S, Meyer A: The evolutionary significance of ancient genome duplications. Nat Rev Genet 2009, 10(10):725–732.
36. Fawcett JA, Maere S, Van de Peer Y: Plants with double genomes might have had a better chance to survive the Cretaceous–Tertiary extinction event. Proc Natl Acad Sci U S A 2009, 106(14):5737–5742.
37. Tang H, Bowers JE, Wang X, Paterson AH: Angiosperm genome comparisons reveal early polyploidy in the monocot lineage. Proc Natl Acad Sci U S A 2010, 107(1):472–477.
38. Flagel LE, Wendel JF: Gene duplication and evolutionary novelty in plants. New Phytol 2009, 183(3):557–564.
39. Moschou PN, Sanmartin M, Andriopoulou AH, Rojo E, Sanchez-Serrano JJ, Roubelakis-Angelakis KA: Polyamine oxidase responsible for a full back-conversion pathway in Arabidopsis. Plant Physiol 2008, 147(4):1845–1857.
40. Weigel P, Weretilnyk EA, Hanson AD: Betaine aldehyde oxidation by spinach chloroplasts. Plant Physiol 1986, 82(3):753–759.
41. Brouquisse R, Weigel P, Rhodes D, Yocum CF, Hanson AD: Evidence for a ferredoxin-dependent choline monooxygenase from spinach chloroplast stroma. Plant Physiol 1989, 90(1):322–329.
42. Hanke G, Mulo P: Plant type ferredoxins and ferredoxin-dependent metabolism. Plant Cell Environ 2013, 36(6):1071–1084.
43. Mitsuya S, Kuwahara J, Ozaki K, Saeki E, Fujiwara T, Takabe T: Isolation and characterization of a novel peroxisomal choline monooxygenase in barley. Planta 2011, 234(6):1215–1226.
44. Hitz WD, Hanson AD: Determination of glycine betaine by pyrolysis-gas chromatography in cereals and grasses. Phytochemistry 1980, 19(11):2371–2374.
45. Lerma C, Rich PJ, Ju GC, Yang W-J, Hanson AD, Rhodes D: Betaine deficiency in maize: complementation tests and metabolic basis. Plant Physiol 1991, 95(4):1113–1119.
46. Ishitani M, Arakawa K, Mizuno N, Kishitani S, Takabe T: Betaine aldehyde dehydrogenase in the gramineae: levels in leaves of both betaine-accumulating and nonaccumulating cereal plants. Plant Cell Physiol 1993, 34(3):493–495.
47. Chen THH, Murata N: Glycinebetaine: an effective protectant against abiotic stress in plants. Trends Plant Sci 2008, 13(9):499–505.
48. Ahmad R, Lim CJ, Kwon S-Y: Glycine betaine: a versatile compound with great potential for gene pyramiding to improve crop plant performance against environmental stresses. Plant Biotechnol Rep 2013, 7:49–57.
49. Flores HE, Filner P: Polyamine catabolism in higher plants: characterization of pyrroline dehydrogenase. Plant Growth Reg 1985, 3:277–291.
50. Gill SC, von Hippel PH: Calculation of protein extinction coefficients from amino acid sequence data. Anal Biochem 1989, 182(2):319–326.
51. Emsley P, Cowtan K: Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 2004, 60:2126–2132.
52. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE: UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem 2004, 25(13):1605–1612.
53. Benson DA, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW: GenBank. Nucleic Acids Res 2014, 42:D32–D37.
54. Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, Mitros T, Dirks W, Hellsten U, Putnam N, Rokhsar DS: Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res 2012, 40:D1178–D1186.
55. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG: Clustal W and Clustal X version 2.0. Bioinformatics 2007, 23(21):2947–2948.
56. Madej T, Lanczycki CJ, Zhang D, Thiessen PA, Geer RC, Marchler-Bauer A, Bryant SH: MMDB and VAST+: tracking structural similarities between macromolecular complexes. Nucleic Acids Res 2013, 42:D297–D303.
57. Rose PW, Bi C, Bluhm WF, Christie CH, Dimitropoulos D, Dutta S, Green RK, Goodsell DS, Prlic A, Quesada M, Quinn GB, Ramos AG, Westbrook JD, Young J, Zardecki C, Berman HM, Bourne PE: The RCSB protein data bank: new resources for research and education. Nucleic Acids Res 2013, 41:D475–D482.
58. Hall TA: BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acids Symp Ser 1999, 41:95–98.
59. Julián-Sánchez A, Riveros-Rosas H, Martínez-Castilla LP, Velasco-García R, Muñoz-Clares RA: Phylogenetic and structural relationships of the betaine aldehyde dehydrogenases. In Enzymology and Molecular Biology of Carbonyl Metabolism. 13th edition. Edited by Weiner H, Plapp B, Lindahl R, Maser E. West Lafayette, IN: Purdue University Press; 2007:64–76.
60. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 2011, 28(10):2731–2739.
61. Whelan S, Goldman N: A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 2001, 18(5):691–699.
62. Posada D, Buckley TR: Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst Biol 2004, 53(5):793–808.
doi:10.1186/1471-2229-14-149
Cite this article as: Muñoz-Clares et al.: Exploring the evolutionary route of the acquisition of betaine aldehyde dehydrogenase activity by plant ALDH10 enzymes: implications for the synthesis of the osmoprotectant glycine betaine. BMC Plant Biology 2014 14:149.
Table S1. Aldehyde dehydrogenases identified as members of the ALDH10 family
Organisms with a completely sequenced genome are shaded in light blue. Phytozome data are colored green and NCBI data black.
Organism
Lineage
Genome assembly
Number of genes
ALDH10
Protein accession number
Gene locus
Amino acid sequence length
Plants
Chlorophyta
Mamiellophyceae
Micromonas pusilla CCMP1545
Chlorophyta; Mamiellophyceae; Mamiellales
Phytozome: partially finished v3 genome assembly and annotation
10 660 total loci containing protein-coding transcripts
XP_003064309
(C1N8N9)
MicpuC2.e_gw1.17.47.1
(MICPUCDRAFT_23534)
505 aa
Micromonas pusilla RCC299
Chlorophyta; Mamiellophyceae; Mamiellales
Phytozome: finished v3 genome assembly and annotation
17 chromosomes
10 103 total loci containing protein-coding transcripts
Absent
Ostreococcus lucimarinus CCE9901
Chlorophyta; Mamiellophyceae; Mamiellales
Phytozome: finished v.2 genome assembly and annotation
21 chromosomes
7,796 total loci containing protein-coding transcripts
XP_001420154
estExt_Genewise_ext.C_Chr_100439
OSTLU_50750
Chromosome 10
523 aa
Ostreococcus tauri
Chlorophyta; Mamiellophyceae; Mamiellales
8 110 genes
CAL56178
XP_003081654
Ot10g02210
Chromosome 10
506 aa
Trebouxiophyceae
Coccomyxa subellipsoidea C-169
Chlorophyta; Trebouxiophyceae; Coccomyxaceae
Phytozome: improved v2 genome assembly and annotation
9 629 total loci containing protein-coding transcripts
EIE20955
I0YRD6
fgenesh1_pm.14_#_152
COCSUDRAFT_37718
498 aa
Chlorophyceae
Chlamydomonas reinhardtii (strain: CC-503 cw92 mt+)
(Chlamydomonas smithii)
Chlorophyta; Chlorophyceae; Chlamydomonadales;
Chlamydomonadaceae
Phytozome: v5.3.1 of Chlamydomonas annotations
17 linkage groups (chromosomes)
12,264 stable gene (13,450 stable transcript)
XP_001699134
A8JAX9
Cre13.g605650
CHLREDRAFT_24394
504 aa
Volvox carteri f. nagariensis (strain: Eve, forma: nagariensis)
(Green alga)
Chlorophyta; Chlorophyceae; Chlamydomonadales; Volvocaceae
Phytozome: version 2, 8x genome assembly and annotation
14,971 total loci containing protein-coding transcripts
(314 total alternatively spliced transcripts)
XP_002947147
D8TLH8
Vocar20010574m.g
VOLCADRAFT_73155
503 aa
Streptophyta
Embryophyta
Physcomitrella patens subsp. patens (Moss)
Streptophyta; Embryophyta; Bryophyta; Bryophytina; Bryopsida;
Funariidae; Funariales; Funariaceae
Phytozome: v3.0 assembly (preliminary)
27 chromosomes
26610 loci containing protein-coding transcripts
XP_001756623
A9RQZ4
PHYPADRAFT_204867
507 aa
Selaginella moellendorffii (Spikemoss)
Streptophyta; Embryophyta; Tracheophyta; Lycopodiophyta;
Isoetopsida; Selaginellales; Selaginellaceae
Phytozome: v1.0 Dec 20, 2007 FilteredModels3 annotation
27 chromosomes
22 285 protein-coding transcripts
XP_002974519
D8RTU8
174224
SELMODRAFT_174224
503 aa
XP_002990779
D8T5B3
SELMODRAFT_272160
503 aa
(redundant copy because the
assembled genome includes two
haplotypes)
Gymnosperm
Picea sitchensis (Sitka spruce) (Pinus sitchensis)
Tracheophyta; Spermatophyta; Coniferopsida; Coniferales; Pinaceae
ABK24463
A9NV02
(ABK24261)
503 aa
Angiosperm
Liliopsida (monocots)
Aegilops tauschii (Tausch's goatgrass)
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales;
Poaceae; BEP clade; Pooideae; Triticeae
EMT10403
M8C5I2
F775_31015
503 aa
Aeluropus lagopoides
Liliopsida; Poales; Poaceae; PACMAD clade; Chloridoideae;
Cynodonteae; Aeluropinae
AEV53927
G9JNB0
334 aa (fragment)
Agropyron cristatum
(Crested wheatgrass) (Bromus cristatus)
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Triticeae
ACZ67850
D2K6H6
394 aa (fragment)
Brachypodium distachyon
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Brachypodieae
BioProject PRJNA74771
v1.0 8x assembly
5 chromosomes
26,552 loci containing protein-coding transcripts
XP_003579919
LOC100826926
Chromosome 5
506 aa
XP_003574495
LOC100845613
Chromosome 3
501 aa
Hordeum brevisubulatum
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Triticeae
AAS66641
505 aa
Hordeum vulgare subsp. vulgare (domesticated barley)
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales;
Poaceae; BEP clade; Pooideae; Triticeae /cultivar="Haruna-nijyo
BAB62846
BBD2
503 aa
BAB62847
BBD1
506 aa
Hordeum vulgare
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Triticeae /cultivar:
116 Jeonju Native Korca
Q40024
LOC548296
505 aa
(BAB62847 ortholog)
Leymus chinensis
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Triticeae
BAD86758
LcBADH2
502 aa
BAM10432
BAD86757
LcBADH1
506 aa
Lolium perenne (Perennial ryegrass)
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Poeae; Loliinae
AFA36547
288 aa (fragment)
Ophiopogon japonicus
Liliopsida; Asparagales; Asparagaceae; Nolinoideae
ABG34273
Q153G6
BADH
500 aa
Oryza sativa Japonica Group
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales;
Poaceae; BEP clade; Ehrhartoideae; Oryzeae
BioProject PRJNA13141:
30534 genes (12 chromosomes)
28555 proteins
BioProject PRJNA13139
37032 genes (12 chromosomes)
35394 proteins
XP_482470
NP_001061833
BADH2
Os08g0424500
(Phytozome: LOC_Os08g32870)
Chromosome 8
503 aa
O24174
NP_001053016
Os04g0464200
BADH1
(Phytozome: LOC_Os04g39020)
Chromosome 4
505 aa
Oryza sativa Indica Group (long-grained rice)
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales;
Poaceae; BEP clade; Ehrhartoideae; Oryzeae
BioProject PRJNA361
39285 genes (12 chromosomes)
37358 proteins
CM000133
Chromosome 8
503 aa
(unreported gene)
EEC77423
B8AV58
OsI_16212
Chromosome 4
505 aa
Almost identical to:
ABB83473
505 aa
Pandanus amaryllifolius
Liliopsida; Pandanales; Pandanaceae
AFD62259
H9BS79
BADH2
387 aa (fragment)
Setaria italica (foxtail millet)
Liliopsida; Poales; Poaceae; PACMAD clade; Panicoideae; Paniceae
chromosome-scale release of the 8.3x whole genome shotgun
assembly (98.9% of the sequence data is represented in the 9
pseudomolecules).
35,471 loci containing protein-coding transcripts
Si009902m
Si009902m.g
505 aa
Si013592m
Si013592m.g
505 aa
Sorghum bicolor (sorghum)
Liliopsida; Poales; Poaceae; PACMAD clade; Panicoideae;
Andropogoneae
NCBI: BioProject PRJNA38691, PRJNA13876; Assembly: Sorbi1
10 chromosomes
33,080 genes
Phytozome: v1.0 release, comprising the Sbi1 assembly and Sbi1.4
gene set
20 chromosomes
34,496 loci containing protein-coding transcripts
AAC49268_NC012875
SORBIDRAFT_06q019200
Chromosome 6
506 aa
XP_002444357
SORBIDRAFT_07g020650
Chromosome 7
505 aa
Triticum aestivum (bread wheat)
Liliopsida; Poales; Poaceae; BEP clade; Pooideae; Triticeae
AAL05264
CBK51339
503 aa
Triticum urartu (Red wild einkorn)
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales;
Poaceae; BEP clade; Pooideae; Triticeae
EMS68665
M8ATW1
TRIUR3_04819
417 aa
EMS48376
M7YN64
TRIUR3_05640
544 aa
Zea mays
Liliopsida; Poales; Poaceae; PACMAD clade; Panicoideae;
Andropogoneae
BioProject B73RefGen_V2
10 chromosomes
39 454 genes
AAT70230
NP_001105781
LOC606443
AMADH1B
505 aa
NP_001157807
(PDB: 4I8P)
C0P9J6
gpm154
AMADH1A
LOC100302679
505 aa
NP_001157804
AMADH2
506 aa
Zoysia tenuifolia
Liliopsida; Poales; Poaceae; PACMAD clade; Chloridoideae;
Zoysieae; Zoysiinae
BAD34955
clone="18
BAD34954
BAD34947
504 aa
BAD34957
clone="ZBD1"
BAD34952
BAD34953
507 aa
Eudicotyledons
Amaranthus hypochondriacus (grain amaranth)
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
O04895
AAB58165
BADH4
Chloroplastic
501 aa
AAB70010
O22467
BADH17
500 aa
Ammopiptanthus mongolicus
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Thermopsideae
ABC86862
Q2I693
241 aa (fragment)
Aquilegia coerulea Goldsmith (Rocky mountain columbine;
Colorado blue columbine)
Streptophyta; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; eudicotyledons; Ranunculales; Ranunculaceae
Phytozome: initial 8X unmapped genome assembly and the version
1.1 annotation
24,823 loci containing protein-coding transcripts
Aquca_002_00337
503 aa
Arabidopsis lyrata subsp. lyrata (Lyre-leaved rock-cress)
eudicotyledons; core eudicotyledons; rosids; malvids; Brassicales;
Brassicaceae
Phytozome: includes JGI release v1.0
32670 protein-coding transcripts
XP_002877599
D7LRJ5
ARALYDRAFT_906054
937567
ALDH10A9
503 aa
XP_002889000
D7KSI7
ARALYDRAFT_895359
926872
ALDH10A8
501 aa
Arabidopsis thaliana (thale cress)
eudicotyledons; core eudicotyledons; rosids; malvids; Brassicales;
Brassicaceae; Camelineae
Phytozome; includes TAIR annotation release 10 of the Arabidopsis
thaliana genome release 9
27416 loci containing protein-coding transcripts
AAL34161
NP_190400
ALDH10A9
AT3G48170
Chromosome 3
Peroxisome
503 aa
Q9S795
NP_565094
NP_001185399
ALDH10A8
AT1G74920
ALDH10A8
Chromosome 1
501 aa
Atriplex centralasiatica
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
AAM19157
Q8L8I3
BADH
500 aa
Atriplex hortensis
(Mountain spinach)
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
ABF72123
Q19QV8
(P42575; same gene)
500 aa
Atriplex micrantha
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
ABM97658
A2TJX7
BADH
500 aa
Atriplex prostrata
(Spear-leaved orache) (Atriplex triangularis)
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
AAM08913
Q8RX99
BADH1
(AAP13999)
500 aa
AAM08914
Q8RX98
BADH2
424 aa (fragment)
Atriplex tatarica
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
ABQ18317
A5HMM5
BADH
500 aa
Avicennia marina (betaine-accumulating mangrove) (Grey
mangrove) (Sceura marina)
eudicotyledons; core eudicotyledons; asterids; lamiids; Lamiales;
Acanthaceae; Avicennioideae
BAB18544
Q9FRY2
Clone 13
502 aa
BAB18543
Q9FRY3
Clone 2
503 aa
Beta vulgaris (sugar beet)
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
NCBI: BioProject PRJNA176558, assembly BwSeq-1
9 chromosomes
P28237
CAA41377
CAA41376
Chloroplastic
500 aa
(not reported yet at NCBI)
Locus 1572
500 aa
Brassica napus (rape)
eudicotyledons; core eudicotyledons; rosids; malvids; Brassicales;
Brassicaceae; Brassiceae
AAQ55493
503 aa
Brassica rapa (field mustard; Turnip)
eudicotyledons; core eudicotyledons; rosids; malvids; Brassicales;
Brassicaceae; Brassiceae
Phytozome: v1.2 annotation is on assembly v1.1.
26,374 total loci containing protein-coding transcripts
Bra003781
Bra003781
501 aa
Bra019528
Bra019528
503 aa
Camellia sinensis (tea)
eudicotyledons; core eudicotyledons; asterids; Ericales; Theaceae
AFP19449
I7F760
505 aa
Capsella rubella
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; malvids; Brassicales; Brassicaceae;
Camelineae
Phytozome: initial 22x genome assembly and a preliminary
annotation
26,521 total loci containing protein-coding transcripts
EOA23841
CARUB_v10017058mg
503 aa
EOA35065
CARUB_v10020176mg
501 aa
Carthamus tinctorius (safflower)
eudicotyledons; core eudicotyledons; asterids; campanulids;
Asterales; Asteraceae; Carduoideae; Cardueae; Centaureinae
ADW80905
G8D1H1
510 aa
Chorispora bungeana
eudicotyledons; core eudicotyledons; rosids; malvids; Brassicales;
Brassicaceae; Chorisporeae
AAV67891
Q5Q033
502 aa
Chrysanthemum lavandulifolium (Daisy) (Dendranthema
lavandulifolium)
eudicotyledons; core eudicotyledons; asterids; campanulids;
Asterales; Asteraceae; Asteroideae; Anthemideae; Artemisiinae
AAY33871
Q4U5F2
DlBADH1
503 aa
AAY33872
Q4U5F1
DlBADH2
506 aa
Cicer arietinum (chickpea) (garbanzo)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; fabids; Fabales; Fabaceae; Papilionoideae;
Cicereae
XP_004508822
LOC101506136
Chromosome Ca7
503 aa
XP_004501961
LOC101507930
Chromosome Ca5
503 aa
Citrus clementina (clementine mandarin)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; malvids; Sapindales; Rutaceae
Phytozome: (clementine1.0) integrates 1.560 M ESTs with homology
and ab initio-based gene predictions.
9 chromosomes
24,533 protein-coding loci
Ciclev10019800m
Ciclev10019800m.g
505 aa
Ciclev10019822m
Ciclev10019800m.g
501 aa
(same locus, alternative transcript)
Citrus sinensis (sweet orange)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; malvids; Sapindales; Rutaceae
Phytozome: (v.1) of the assembly is 319 Mb spread over 12,574
scaffolds
25,376 protein-coding loci
Located in scaffold_0016_14
(non-reported gene)
Corylus heterophylla
eudicotyledons; core eudicotyledons; rosids; fabids; Fagales;
Betulaceae
ADW80331
E9NRZ9
503 aa
Cucumis melo (muskmelon)
eudicotyledons; core eudicotyledons; rosids; fabids; Cucurbitales;
Cucurbitaceae; Benincaseae
AEK81574
G1EH68
503 aa
Cucumis sativus (cucumber)
eudicotyledons; core eudicotyledons; rosids; fabids; Cucurbitales;
Cucurbitaceae; Benincaseae
Phytozome: Roche/JGI v1 annotation
21491 loci containing protein-coding transcripts
XP_004138132
LOC101213657
Cucsa.197230
503 aa
Fragaria vesca subsp. vesca (woodland strawberry)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; fabids; Rosales; Rosaceae; Rosoideae;
Potentilleae; Fragariinae
NCBI BioProject: PRJNA66853, PRJNA60037; assembly:
FraVesHawaii_1.0
7 chromosomes
25,247 genes
23,319 proteins
Phytozome: includes gene annotation of genome release 1.1.
7 chromosomes
32,831 loci containing protein-coding transcripts
XP_004299590
mrna09492.1-v1.0-hybrid
LOC101300913
gene09492-v1.0-hybrid
505 aa
Glycine max (Soybean) (Glycine hispida)
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Phaseoleae
50202 genes (20 chromosomes)
44642 proteins
Phytozome: Soybean Glyma1.0 annotation; 20 chromosomes, with a
small additional amount of mostly repetitive sequence in unmapped
scaffolds.
54,175 protein-coding loci and 73,320 transcripts have been
predicted
NP_001234990
BADH1
B0M1A6
Glyma06g19820
Chromosome 6
503 aa
NP_001238427
B0M1A5
ADN03184
BADH2
Glyma05g01770
Chromosome 5
503 aa
XP_003550754
LOC100795267
Glyma17g10120
Chromosome 17
315 aa (pseudogene)
Gossypium hirsutum (upland cotton)
eudicotyledons; core eudicotyledons; rosids; malvids; Malvales;
Malvaceae; Malvoideae
AAR23816
503 aa
Gossypium raimondii (cotton)
eudicotyledons; core eudicotyledons; rosids; malvids; Malvales;
Malvaceae; Malvoideae
Phytozome: v2.1 annotation release is on genome assembly v2.0
13 chromosomes
37,505 protein coding genes
Gorai.001G071200.2
Gorai.001G071200
503 aa
Gorai.007G047400.1
Gorai.007G047400
502 aa
Halocnemum strobilaceum
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
AFB74193
H6VX92
BADH
500 aa
Halostachys caspica
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
ABO45931
A4LAP2
500 aa
Haloxylon ammodendron
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
ACS96437
D0E0H5
BADH
500 aa
Haloxylon persicum
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
AEW31327
G9I1P9
BADH
500 aa
Helianthus annuus (common sunflower)
eudicotyledons; core eudicotyledons; asterids; campanulids;
Asterales; Asteraceae; Asteroideae; Heliantheae alliance; Heliantheae
ACU65243
C8CBI9
BADH
503 aa
Jatropha curcas
eudicotyledons; core eudicotyledons; rosids; fabids; Malpighiales;
Euphorbiaceae; Crotonoideae; Jatropheae
ABO69575
B2BBY6
BADH
503 aa
(AFY98894, 521 aa; incorrect insertion)
Kalidium foliatum
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
ABI95806
Q06AI9
500 aa
Ligusticum sinense
eudicotyledons; core eudicotyledons; asterids; campanulids; Apiales;
Apiaceae; Apioideae; apioid superclade; Selineae
ADL61811
H2KKR8
508 aa
Lycium barbarum (Matrimony vine)
eudicotyledons; core eudicotyledons; asterids; lamiids; Solanales;
Solanaceae; Solanoideae; Lycieae
ACQ99195
D2DEK8
503 aa
Medicago sativa (alfalfa)
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Trifolieae
AFS33786
J9XXT4
BADH
505 aa
Medicago truncatula (barrel medic)
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Trifolieae
45000 genes (8 chromosomes)
46092 proteins
ABE82378
XP_003608928
G7JNS2
(AFK34878)
(AFK43263)
MTR_4g106510
Chromosome 4
503 aa
(ACJ85836; fragment)
AFK44601
(NC_016409)
Chromosome 3
503 aa
Mimulus guttatus (spotted monkey flower; Yellow monkey flower)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; asterids; lamiids; Lamiales; Phrymaceae
Phytozome: includes the JGI gene annotation v1.1 of assembly v1.0
26718 loci containing protein-coding genes
mgv1a004929m
mgv1a004929m.g
504 aa
Morus alba var. multicaulis (mulberry)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; fabids; Rosales; Moraceae
AGG82698
M4NDU5
501 aa
Panax ginseng (Korean ginseng)
eudicotyledons; core eudicotyledons; asterids; campanulids; Apiales;
Araliaceae
AAQ76705
Q6JSK3
BADH1
503 aa
Phaseolus vulgaris (Kidney bean; French bean)
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Phaseoleae
Phytozome: release of V1.0, the first chromosome scale version
11 chromosomes
Phvul.009G182300.1
Phvul.009G182300
Chromosome 9
503 aa
Phvul.003G196700.1
Phvul.003G196700
Chromosome 3
503 aa
Pisum sativum (pea)
eudicotyledons; core eudicotyledons; rosids; fabids; Fabales;
Fabaceae; Papilionoideae; Fabeae
CAC48393
Q93YB2
(PDB: 3IWJ)
AMADH2
503 aa
CAC48392
(PDB: 3IWK)
AMADH1
503 aa
Populus euphratica (Euphrates poplar)
eudicotyledons; core eudicotyledons; rosids; fabids; Malpighiales;
Salicaceae; Saliceae
AFA53117
H6V966
BADH2
503 aa
AFA53116
H6V965
BADH1
503 aa
Populus trichocarpa (Populus balsamifera subsp. trichocarpa) (black
cottonwood tree; Western balsam poplar)
eudicotyledons; core eudicotyledons; rosids; fabids; Malpighiales;
Salicaceae; Saliceae
JGI v3.0 gene annotation of assembly v3
41335 loci containing protein-coding transcripts
73013 protein-coding transcripts
45555 protein-coding genes (19 chromosomes)
XP_002322147
(B9IEL4)
Potri.015G070600
(POPTRDRAFT_666405)
Chromosome LGXV
503 aa
XP_002318630
(B9I351)
Potri.012G075600
(POPTRDRAFT_661953)
Chromosome LGXII
503 aa
Prunus persica (peach)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; fabids; Rosales; Rosaceae; Maloideae;
Amygdaleae
EMJ18033
M5X943
PRUPE_ppa022568mg
503 aa
EMJ11127
M5WBJ6
PRUPE_ppa004563mg
503 aa
Pyrus betulifolia (birch-leaf pear)
eudicotyledons; core eudicotyledons; rosids; fabids; Rosales;
Rosaceae; Amygdaloideae; Maleae
AER10508
G8EWJ4
503 aa
Ricinus communis (Castor bean)
eudicotyledons; core eudicotyledons; rosids; fabids; Malpighiales;
Euphorbiaceae; Acalyphoideae; Acalypheae
Phytozome: includes TIGR/JCVI release v0.1
31221 protein-coding transcripts
XP_002511463
B9R8Y8
RCOM_1512620
30147.t000284
503 aa
Sesuvium portulacastrum (Shoreline sea purslane) (Portulaca
portulacastrum)
eudicotyledons; core eudicotyledons; Caryophyllales; Aizoaceae
AEK98521
G1FG21
500 aa
Solanum lycopersicum (tomato) (Lycopersicon esculentum)
eudicotyledons; core eudicotyledons; asterids; lamiids; Solanales;
Solanaceae; Solanoideae; Solaneae
NCBI BioProject: PRJNA66163, assembly SL2.40
12 chromosomes
27,466 genes
Phytozome: ITAG2.3 made by International Tomato Annotation
Group (ITAG) on assembly ITAG2.3
12 chromosomes
34,727 loci containing protein-coding transcripts
AAX73303
Q56R04
(PDB: 4I8Q; 4I9B)
Chromosome 6
AMADH1
Solyc06g071290.2
504 aa
NP_001234235
B6ECN9
Chromosome 3
AMADH2
Solyc03g113800.2
505 aa
Solanum torvum
eudicotyledons; core eudicotyledons; asterids; lamiids; Solanales;
Solanaceae; Solanoideae; Solaneae
AFA37976
H6V8T9
BADH
505 aa
Solanum tuberosum (Potato) (SOLTU)
eudicotyledons; core eudicotyledons; asterids; lamiids; Solanales;
Solanaceae; Solanoideae; Solaneae
Phytozome: Solanum tuberosum Group Phureja DM1-3 516R44
(CIP801092) Genome Annotation v3.4 mapped to pseudomolecule
sequence
12 chromosomes
35,119 loci containing protein-coding transcripts
PGSC0003DMP400055759
PGSC0003DMT400083025
PGSC0003DMG400033028
504 aa
PGSC0003DMP400042549
PGSC0003DMT400063205
PGSC0003DMG400024582
505 aa
Spinacia oleracea (spinach)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; Caryophyllales; Amaranthaceae; Chenopodioideae;
Anserineae
P17202
(PDB: 4A0M)
chloroplastic
497 aa
Suaeda liaotungensis
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
AAL33906
Q8W5A1
BADH
501 aa
Suaeda maritima (Annual sea blite) (Suaeda spicata)
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
AFW04226
K7T3P2
501 aa
Suaeda salsa (Seepweed) (Chenopodium salsum)
eudicotyledons; core eudicotyledons; Caryophyllales; Amaranthaceae
ABG23669
Q155V4
BADH
501 aa
Theobroma cacao cultivar="Matina 1-6" (cacao) (cocoa)
Tracheophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core
eudicotyledons; rosids; malvids; Malvales; Malvaceae;
Byttnerioideae
Phytozome: V1.1 release
29,452 total loci containing protein-coding transcripts
EOY20585
Domain 1
TCM_011968
Chromosome 3
503 aa
EOY20585
Domain 2
TCM_011968
Chromosome 3
511 aa
Comprises two adjacent ALDH10 genes
Vitis vinifera (wine grape)
eudicotyledons; core eudicotyledons; rosids; Vitales; Vitaceae
NCBI BioProject: PRJNA34679
19 chromosomes
24,508 genes
Phytozome: 12X March 2010 release of the draft genome and
annotation of Vitis vinifera by the French-Italian Public Consortium
for Grapevine Genome Characterization
19 chromosomes
26346 loci containing protein-coding transcripts
XP_003634296
LOC100251899
Chromosome 17
497 aa
(CBI15092) fragment
XP_002283690
D7SHY3
GSVIVT01007829001
LOC100246770
GSVIVT01007829001
Chromosome 17
503 aa
XP_002281984
GSVIVG01032588001
LOC100250859
GSVIVG01032588001
Chromosome 14
499 aa
Fungi
Ascomycota
Schizosaccharomyces pombe (fission yeast)
Fungi; Dikarya; Ascomycota; Taphrinomycotina;
Schizosaccharomycetes; Schizosaccharomycetales;
Schizosaccharomycetaceae
Bioproject: PRJNA127, PRJNA13836, PRJNA20755
3 chromosomes
5133 protein coding genes
O59808
NP_588102
SPCC550.10
Chromosome III
500 aa
Protist
Alveolata
Perkinsus marinus ATCC 50983
Eukaryota; Alveolata; Perkinsea; Perkinsida; Perkinsidae
Bioproject: PRJNA46451, PRJNA12737
23654 protein coding genes
XP_002784436
Pmar_PMAR003695
394 aa (fragment)
incomplete on both ends
Cryptophyta
Guillardia theta CCMP2712
Eukaryota; Cryptophyta; Pyrenomonadales; Geminigeraceae
Bioproject: PRJNA223305, PRJNA53577
24822 protein coding genes
EKX35999
GUITHDRAFT_117911
542 aa
Stramenopiles
Ectocarpus siliculosus (brown alga)
Eukaryota; Stramenopiles; PX clade; Phaeophyceae; Ectocarpales;
Ectocarpaceae
Bioproject: PRJEA42625
4005 protein coding genes
CBN79171
Esi_0010_0069
517 aa
Phytophthora sojae
Eukaryota; Stramenopiles; Oomycetes; Peronosporales
Bioproject: PRJNA17989
26489 protein coding genes
EGZ28599
G4YMH3
PHYSODRAFT_248120
473 aa
EGZ28472
G4YIQ2
PHYSODRAFT_284276
418 aa
Bacteria
Alphaproteobacteria
Rhizobium leguminosarum bv. trifolii WSM2297
Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;
Rhizobiaceae; Rhizobium/Agrobacterium group
WP_003575619
EJC83393
ZP_18315655
Rleg4DRAFT_5155
496 aa
Rhizobium sp. CCGE 510
Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;
Rhizobiaceae; Rhizobium/Agrobacterium group
EJT02404
ZP_10838419
J6DKY5
RCCGE510_27926
plasmid="pRspCCGE510d"
496 aa
Betaproteobacteria
Burkholderia ambifaria MC40-6
Betaproteobacteria; Burkholderiales; Burkholderiaceae; Burkholderia
6784 genes
ZP_01555907
YP_001809915
BamMC406_3226
Chromosome 2
493 aa
Burkholderia multivorans ATCC 17616
Betaproteobacteria; Burkholderiales; Burkholderiaceae; Burkholderia
6373 genes
ZP_01568328
YP_001585229
Bmul_5267
Chromosome 2
493 aa
Burkholderia pseudomallei 305
Betaproteobacteria; Burkholderiales; Burkholderiaceae; Burkholderia
ZP_01765086
BURPS305_6395
482 aa
Burkholderia sp. 383
Burkholderia cepacia R18194
Betaproteobacteria; Burkholderiales; Burkholderiaceae; Burkholderia
7824 genes
YP_366413
Bcep18194_C6720
Chromosome 3
490 aa
Gammaproteobacteria
Colwellia psychrerythraea 34H (COLPS)
5054 genes (chromosome)
YP_266864
CPS_0096
491 aa
Pseudomonas fluorescens Pf0-1
Gammaproteobacteria; Pseudomonadales; Pseudomonadaceae;
Pseudomonas
5833 genes (chromosome)
YP_350198
Pfl01_4470
483 aa
Pseudomonas protegens Pf-5 (PSEF5)
Gammaproteobacteria; Pseudomonadales; Pseudomonadaceae;
Pseudomonas
6233 genes (chromosome)
YP_261892
PFL_4811
482 aa
YP_260018
PFL_2912
476 aa
Pseudomonas putida W619 (PSEPW)
5309 genes (chromosome)
ZP_01639992
YP_001751324
PputW619_4475
476 aa
[Figure S1 panel labels. (A) SoBADH: A441, W443, W456; ZmAMADH1a: C446, W448, W461; PsAMADH2: I444, W446, W459; SlAMADH1: I445, W447, W460. (B) SoBADH A441 mutants C441, S441, T441, V441, F441 and I441, each shown with W443 and W456.]
Figure S1 Interactions of the residue at position equivalent to Ala441 of SoBADH. (A) Molecular surface representations showing the side-chains of residues at position 441, 443 and 456 (SoBADH numbering) in the known crystal structures of plant ALDH10 enzymes. (B) Molecular surface representations of the same residues in the minimized models of the in silico SoBADH mutants in which A441 was changed. Side-chains are shown as sticks with oxygen atoms in red, nitrogen in blue, and sulfur in yellow. Carbon atoms are colored as shown. The following PDB codes were used: SoBADH, 4A0M; PsAMADH2, 3IWJ; ZmAMADH1a, 4I8P; SlAMADH1, 4I9B. The figure was generated using PyMOL (www.pymol.org).
Table S2. Distances of the side-chain atoms of the residue at position 441 to their closest neighbors
Enzyme (residue) Atoms Distance (Å)
SoBADH (A441) CB A441-CZ3 W456 3.71
ZmAMADH1a (C446) CB C446-CZ3 W461 3.90
SG C446-CZ3 W461 3.94
SG C446-CZ2 W448 3.62
SG C446-CE2 W448 3.52
SG C446-NE1 W448 3.76
PsAMADH2^b (I444) CD1 I444-CG W459 3.90
CD1 I444-CD2 W459 3.39
CD1 I444-CE2 W459 3.54
CD1 I444-CZ2 W459 3.85
CD1 I444-CZ3 W459 3.84
CD1 I444-CE3 W459 3.55
CG2 I444-CH2 W446 3.74
CG2 I444-CZ2 W446 3.59
CG2 I444-CE2 W446 3.81
SlAMADH1 (I445) CD1 I445-CH2 W460 3.81
CD1 I445-CD2 W460 3.51
CD1 I445-CE2 W460 3.56
CD1 I445-CZ2 W460 3.74
CD1 I445-CZ3 W460 3.76
CD1 I445-CE3 W460 3.62
CG2 I445-CH2 W447 3.81
CG2 I445-CZ2 W447 3.54
CG2 I445-CE2 W447 3.67
SoBADH A441C mutant CB C441-CZ3 W456 3.67
SG C441-CZ3 W456 3.94
SG C441-CE2 W443 3.74
SoBADH A441S mutant CB S441-CZ3 W456 3.67
OG S441-CZ3 W456 3.56
OG S441-CE3 W456 3.50
SoBADH A441T mutant CB T441-CZ3 W456 3.69
CG2 T441-CZ3 W456 3.72
OG1 T441-CZ3 W456 3.34
OG1 T441-CE3 W456 3.30
SoBADH A441V mutant CG2 V441-CZ3 W456 3.73
CG2 V441-CE3 W456 3.70
SoBADH A441F mutant CG F441-CZ3 W456 3.60
CG F441-CE3 W456 3.56
CD1 F441-CE3 W456 3.70
CE1 F441-CE3 W456 3.67
CZ F441-CE3 W456 3.48
CZ F441-CD2 W456 3.70
CZ F441-CG W456 3.74
CD2 F441-CZ3 W456 3.59
CD2 F441-CE3 W456 3.36
CE2 F441-CE3 W456 3.32
CE2 F441-CD2 W456 3.40
CE2 F441-CG W456 3.71
SoBADH A441I mutant CD1 I441-CZ3 W456 3.41
CD1 I441-CH2 W456 3.86
CD1 I441-CE3 W456 3.17
CD1 I441-CD2 W456 3.41
CD1 I441-CE2 W456 3.85
^a The distances given are the average of those observed in the two monomers of each crystal structure or model. A cutoff of 4.0 Å was applied.
^b The crystal structure of the PsAMADH2 enzyme is very similar in this region to that of PsAMADH1, which is not included here for this reason. The PDB codes of the crystal structures are: SoBADH, 4A0M; ZmAMADH1a, 4I8P; PsAMADH2, 3IWJ; SlAMADH1, 4I9B.
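As a minimal sketch of how contact distances like those in Table S2 could be tabulated, the Python snippet below averages atom-pair distances over two monomers and keeps the pairs within the 4.0 Å cutoff. The coordinates, the dictionary keys used as atom labels, and the function name `averaged_contacts` are hypothetical illustrations and are not taken from the PDB entries. Applying the cutoff to the averaged distance, rather than to each monomer separately, is also an assumption.

```python
import math

CUTOFF = 4.0  # Å, the cutoff stated in the table footnote


def averaged_contacts(monomers, residue_atoms, neighbor_atoms, cutoff=CUTOFF):
    """Average each atom-pair distance over all monomers and keep close pairs.

    monomers: list of dicts mapping an atom label to its (x, y, z) coordinates.
    Returns a dict {(residue_atom, neighbor_atom): mean distance in Å}.
    """
    contacts = {}
    for src in residue_atoms:
        for dst in neighbor_atoms:
            mean = sum(math.dist(m[src], m[dst]) for m in monomers) / len(monomers)
            # Assumption: the cutoff is applied to the averaged distance.
            if mean <= cutoff:
                contacts[(src, dst)] = round(mean, 2)
    return contacts


# Hypothetical coordinates (Å) for two monomers; NOT real crystallographic data.
monomer_a = {
    "CB A441": (0.0, 0.0, 0.0),
    "CZ3 W456": (3.0, 2.0, 0.0),
    "CH2 W456": (5.0, 0.0, 0.0),
}
monomer_b = {
    "CB A441": (0.0, 0.0, 0.0),
    "CZ3 W456": (3.0, 2.0, 1.0),
    "CH2 W456": (5.0, 0.0, 1.0),
}

contacts = averaged_contacts(
    [monomer_a, monomer_b],
    residue_atoms=["CB A441"],
    neighbor_atoms=["CZ3 W456", "CH2 W456"],
)
print(contacts)  # {('CB A441', 'CZ3 W456'): 3.67}
```

With real structures, the coordinate dictionaries would be filled from the deposited PDB files or from the minimized models, one dictionary per monomer.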