

Breakthrough Technologies

Relative Mass Defect Filtering of Mass Spectra: A Path to Discovery of Plant Specialized Metabolites¹ [OPEN]

E.A. Prabodha Ekanayaka², Mary Dawn Celiz³, and A. Daniel Jones*

Department of Chemistry (E.A.P.E., A.D.J.) and Department of Biochemistry and Molecular Biology (M.D.C., A.D.J.), Michigan State University, East Lansing, Michigan 48824

ORCID ID: 0000-0002-7408-6690 (A.D.J.).

The rapid identification of novel plant metabolites and assignments of newly discovered substances to natural product classes present the main bottlenecks to defining plant specialized phenotypes. Although mass spectrometry provides powerful support for metabolite discovery by measuring molecular masses, ambiguities in elemental formulas often fail to reveal the biosynthetic origins of specialized metabolites detected using liquid chromatography-mass spectrometry. A promising approach for mining liquid chromatography-mass spectrometry metabolite profiling data for specific metabolite classes is achieved by calculating relative mass defects (RMDs) from molecular and fragment ions. This strategy enabled the rapid recognition of an extensive range of terpenoid metabolites in complex plant tissue extracts and is independent of retention time, abundance, and elemental formula. Using RMD filtering and tandem mass spectrometry data analysis, 24 novel elemental formulas corresponding to glycosylated sesquiterpenoid metabolites were identified in extracts of the wild tomato Solanum habrochaites LA1777 trichomes. Extensive isomerism was revealed by ultra-high-performance liquid chromatography, leading to evidence of more than 200 distinct sesquiterpenoid metabolites. RMD filtering led to the recognition of the presence of glycosides of two unusual sesquiterpenoid cores that bear limited similarity to known sesquiterpenes in the genus Solanum. In addition, RMD filtering is readily applied to existing metabolomics databases and correctly classified the annotated terpenoid metabolites in the public metabolome database for Catharanthus roseus.

Plant metabolic networks generate amazing chemical diversity, but our understanding of the genetic factors responsible for plant chemistry remains primitive. The discovery and identification of metabolites has posed the greatest bottleneck in recent efforts to exploit metabolomics to address questions about the basis for biosynthetic diversity in the plant kingdom (Ji et al., 2009; Zhou et al., 2012). Since the specialized metabolism of nonmodel plants is taxonomically restricted, metabolite databases offer a poor representation of plant chemical diversity, and de novo recognition and discovery of metabolite chemistry is necessary. A common strategy for metabolite discovery has often started with the generation of tandem mass spectrometry (MS/MS) spectra, usually beginning with the most abundant metabolites, and uses characteristic fragment ions to assign metabolites to a particular class of compounds. Flavonoid identification from MS/MS spectra is often successful because most flavonoids yield MS/MS fragment ions characteristic of their flavonoid cores (Ma et al., 1997; Li et al., 2013). However, when MS/MS spectra fail to display class-characteristic fragment ions, the recognition of a metabolite’s structural class is less obvious.

Specialized plant metabolites are often grouped as polyphenolic, terpenoid, alkaloid, polyketide, or fatty acid metabolites based upon the biosynthesis of their core scaffolds, which often undergo subsequent metabolic decoration such as glycosylation. Among phytochemicals, terpenoids offer perhaps the greatest structural diversity. This feature makes them useful as chemical defenses and as the foundation for candidate drugs (Ajikumar et al., 2008; Goodger and Woodrow, 2011), and the commercial importance of terpenes makes their discovery and synthesis an important research focus (Zwenger and Basu, 2008). Terpenoids exhibit remarkable structural diversity resulting from varied metabolic cyclizations, oxidations, rearrangements, and branching reactions (Chappell, 1995; Mizutani and Ohta, 2010) and from diversity in glycosylation (Dembitsky, 2006; Goodger and Woodrow, 2011). Such structural diversity challenges investigators to recognize novel terpenoids in a complex matrix (Pfander and Stoll, 1991; Fraga, 2012), because few features in the MS/MS spectra of nonvolatile terpenoids provide reliable keys for their annotation as terpenoids. As a result, nonvolatile terpenoids represent an underappreciated group of plant specialized metabolites.

¹ This work was supported by the National Science Foundation (grant nos. IOS–1025636 and DBI–0604336), the National Institutes of Health (grant no. 1RC2 GM092521), and Michigan AgBioResearch (grant no. MICL02143).

² Present address: MPI Research, 54943 North Main Street, Mattawan, MI 49071.

³ Present address: Maryland Department of Health and Mental Hygiene, 201 West Preston Street, Baltimore, MD 21201.

* Address correspondence to [email protected].

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: A. Daniel Jones ([email protected]).

[OPEN] Articles can be viewed without a subscription.

www.plantphysiol.org/cgi/doi/10.1104/pp.114.251165

Plant Physiology®, April 2015, Vol. 167, pp. 1221–1232, www.plantphysiol.org © 2014 American Society of Plant Biologists. All Rights Reserved.


Advances in chromatography and mass spectrometry (MS) have enabled the detection of a broad range of natural products, and characteristic ions in mass spectra have been useful for distinguishing compound classes. While gas chromatography-MS has enabled the identification of volatile and semivolatile terpenes for decades, it is not a suitable approach for nonvolatile conjugated terpenoids unless they are first cleaved to form volatile products or derivatized to increase volatility. Furthermore, MS/MS fragment ions characteristic of terpenoid glycosides have yet to be documented, and the characterization of conjugated terpenoids has been limited largely to saponins that share a common steroidal or triterpenoid core (Challinor and De Voss, 2013). In contrast with other specialized metabolite classes, the diversity of terpenoid cores dictates that fragment ions specific to terpenoids often fail to provide for the universal recognition of metabolites within this class, particularly for two situations: (1) when terpenoids are glycosylated and MS/MS spectra are dominated by fragment ions derived from the carbohydrate, and (2) when mass spectra are generated in negative ion mode, which often yields limited cleavage of carbon-carbon bonds in the terpenoid core that might serve as terpenoid indicators. The structural diversity of the terpenoid cores yields different fragments in MS/MS spectra of different nonvolatile terpenoids, as has been demonstrated for a series of saponins (Huhman and Sumner, 2002). Therefore, annotations of terpene glycosides in a metabolite profile have been driven by the absence of fragment ions in mass spectra that represent other classes of molecules (Ward et al., 2011).

Despite its limited capabilities in differentiating stereoisomers, MS plays important roles in the discovery of natural products and the elucidation of their structures (Lei et al., 2011). Modern medium- to high-resolution mass spectrometers have provided greater (low-ppm) mass measurement accuracy. Such mass measurement errors may be more pronounced than measurements for an individual sample when they represent an average mass extracted from large metabolomics data sets. For metabolites of relatively low molecular mass, such measurements provide sufficient information to assign molecular formulas, but for metabolites of higher (greater than 500 D) molecular masses, formula assignments often are ambiguous owing to the large number of formulas consistent with a molecular mass (Kind and Fiehn, 2007). Moreover, assignments of molecular formulas often fail to yield reliable assignments of metabolites to specific biosynthetic origins.
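The growth in formula ambiguity with mass can be illustrated with a brute-force count. The following is a minimal sketch, not the procedure of Kind and Fiehn (2007): it assumes only C, H, N, and O, a 5-ppm tolerance, at most four nitrogen atoms, and two crude chemical plausibility limits. The function name `candidate_formulas` and all of its parameters are hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def candidate_formulas(neutral_mass, ppm=5.0, max_n=4):
    """List (C, H, N, O) counts whose monoisotopic mass is within `ppm` of neutral_mass."""
    tol = neutral_mass * ppm * 1e-6
    hits = []
    for c in range(1, int(neutral_mass // MASS["C"]) + 1):
        for n in range(max_n + 1):
            for o in range(c + 3):  # crude cap on oxygen count (assumption)
                rest = neutral_mass - c * MASS["C"] - n * MASS["N"] - o * MASS["O"]
                if rest < -tol:
                    break  # already heavier than the target without any hydrogen
                h = round(rest / MASS["H"])  # only the nearest H count can fit
                if h < 0 or h > 2 * c + n + 2:
                    continue  # more hydrogens than a saturated molecule allows
                if (h + n) % 2:
                    continue  # neutral closed-shell molecules need H + N even
                if abs(rest - h * MASS["H"]) <= tol:
                    hits.append((c, h, n, o))
    return hits


glucose = candidate_formulas(180.06339)          # C6H12O6
tetraglycoside = candidate_formulas(870.40966)   # C39H66O21, neutral farnesol tetraglycoside
print(len(glucose), len(tetraglycoside))
```

Under these assumptions the list for the 870-D glycoside is longer than the list for glucose, which is the ambiguity that motivates a filter that does not depend on formula assignment.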

In this report, we examine specialized metabolites of the wild tomato Solanum habrochaites LA1777, which has been studied extensively for its plant defense compounds, including volatile sesquiterpenoids and acylsugars (Coates et al., 1988; Ghosh et al., 2014). Our recent discovery of a few glycosylated sesquiterpenoids in this accession suggested the metabolic capacity to form such metabolites in the genus (Ekanayaka et al., 2014). It is the intent of this report to present a framework for the accelerated discovery of terpenoid glycosides from mass spectra generated using common instruments such as time-of-flight mass spectrometers that provide intermediate mass resolution and low-ppm mass accuracy using S. habrochaites LA1777 as an example.

Figure 1. The complexity of a plant extract is evident from the number of peaks in an ultra-high-performance liquid chromatography (UHPLC)-MS base peak intensity chromatogram generated from a leaf dip extract of S. habrochaites LA1777. Automated peak detection yielded 3,280 retention time-mass pair features. Analysis was performed using a 110-min chromatographic gradient and detected in negative ion mode.


FOUNDATIONS OF RELATIVE MASS DEFECT FILTERING FOR MINING PLANT METABOLOMES FOR NOVEL TERPENOID GLYCOSIDES

The power of high-resolution or exact-mass MS for metabolite identification lies in the unique mass of each element and isotope. A common approach that reflects this derives from the idea of a mass defect, which is defined as the deviation of each atom’s mass from the integer-rounded mass (nominal mass). The absolute value of the mass defect of an ion reflects the ion’s elemental composition, because each element has a unique mass defect. A positive absolute mass defect usually reflects a large number of hydrogen atoms, because the atomic mass of hydrogen is slightly greater than the rounded-off integer (by +7.83 mD), and carbon (exactly 12 D) does not contribute to the mass defect. Oxygen has a smaller negative mass defect (−5.09 mD) and nitrogen has a small positive defect (+3.07 mD), and these elements often are fewer in number than hydrogen in specialized metabolites. The absolute mass defect of an ion represents the sum of the mass defects for all atoms in the molecule. Absolute mass defects serve as the basis for assigning elemental formulas from high-resolution mass spectra, but mass measurement accuracy often falls short of providing unambiguous formula assignments, particularly for higher Mr substances where the number of elemental formulas within mass measurement error can be large.

An alternative and promising strategy relies on normalizing the absolute mass defect to an ion’s mass, known as the relative mass defect (RMD). Since absolute mass defect largely reflects the total hydrogen content, RMD serves as a measure of fractional hydrogen content (Stagliano et al., 2010), which in turn reflects the reduced state of carbon that derives from the contributions of metabolic precursors. RMD is calculated in ppm as (mass defect/measured monoisotopic mass) × 10⁶. For the terpene building block isoprene, the RMD of 920 ppm reflects its high hydrogen content (11.8 weight percent hydrogen). This value remains constant for larger monoterpene, diterpene, and triterpene oligomers, which share the same fractional hydrogen content. This demonstrates how RMD values aid the grouping of metabolites based on common biosynthetic precursors, despite differences in molecular mass and absolute mass defect. Metabolic oxidations of a sesquiterpene decrease RMD values, as shown by the shift in RMD to 830, 752, and 692 ppm upon the addition of one and two oxygen atoms and subsequent oxidative dehydrogenation (e.g. C15H24O, C15H24O2, and C15H22O2). Terpenoid metabolites usually require one or more oxygen atoms in order to be detected by liquid chromatography (LC)-MS using electrospray ionization, and in many organisms, they are conjugated to more polar groups (e.g. glycosides or phosphates). Such conjugations decrease a terpenoid’s RMD value further: each glucosylation adds C6H10O5, so glucosylation of a sesquiterpene alcohol (to form C21H34O6) would decrease RMD to 616 ppm. Additional oxidation or conjugation by malonate (addition of C3H2O3) would decrease the RMD value of terpenoid metabolites yet further, and acylation by aliphatic acids (e.g. acetylation) may increase RMD if the fractional hydrogen content of the acyl group is greater than that of the core molecule. Since conjugated terpenoids usually consist of a terpenoid core that is rich in reduced carbon and conjugate groups (carbohydrates, malonate esters) of low hydrogen content, RMD values of glycosylated sesquiterpenoids range from approximately 400 to 600 ppm. In contrast, polyphenolic metabolites have lower hydrogen content, and their RMD is usually less than 300 ppm (e.g. 230 ppm for salicylic acid and 167 ppm for kaempferol).
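The RMD values quoted above can be reproduced from elemental formulas with a few lines of code. This is a minimal sketch: the helper names `mono_mass` and `rmd_ppm` are hypothetical, only C, H, N, and O are supported, and the nominal mass is taken as the rounded monoisotopic mass, which is an assumption that holds only while the absolute mass defect stays below 0.5 D.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def mono_mass(formula):
    """Monoisotopic mass of a neutral formula such as 'C15H24O'."""
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    return sum(MONO[element] * int(count or 1) for element, count in tokens)


def rmd_ppm(mass):
    """Relative mass defect in ppm: (mass - nominal mass) / mass * 1e6."""
    return (mass - round(mass)) / mass * 1e6


for name, formula in [
    ("isoprene", "C5H8"),
    ("sesquiterpene alcohol", "C15H24O"),
    ("sesquiterpene diol", "C15H24O2"),
    ("dehydrogenated diol", "C15H22O2"),
    ("glucosylated sesquiterpene alcohol", "C21H34O6"),
    ("salicylic acid", "C7H6O3"),
    ("kaempferol", "C15H10O6"),
]:
    print(f"{name:36s} {formula:10s} {rmd_ppm(mono_mass(formula)):6.0f} ppm")
```

The same `rmd_ppm` function can be applied directly to a measured m/z value, since the calculation needs no formula assignment.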

Terpenoid glycosides represent a diverse class of phytochemicals that have been understudied due to the challenges in their identification and structure elucidation (Pfander and Stoll, 1991; Sahu and Achari, 2001). While some of these compounds display biological activity (Chang et al., 2002; da Silva et al., 2008), much remains to be learned about their synthesis and functionality in plants (Maier et al., 1995). In this report, the application of RMD filtering is presented as a quickly calculated measure that can advance the annotation of novel plant metabolites from metabolite profiling analyses and databases.

RESULTS AND DISCUSSION

Recognition of Sesquiterpene Glycosides from Ion RMDs

Analysis of leaf extracts of S. habrochaites LA1777 using LC-multiplexed collision-induced dissociation (CID)-MS

Table I. Characteristic fragment ions observed in negative ion mode MS/MS spectra for various sugar oligosaccharides and monosaccharides

The masses shown correspond to [M-H]⁻ formed by each sugar group. The MS/MS spectra of candidate terpenoid compounds were examined for the presence of these fragment ions for identification of the presence of these oligosaccharides in the terpenoids.

Negative Ion Mode Fragment Ion (Theoretical Exact Mass), m/z | RMD, ppm | Common Sugar Moiety
503.1618 | 321 | Trihexoside (hexose-hexose-hexose; C18H31O16⁻)
485.1512 | 312 | Trihexoside-water (C18H29O15⁻)
589.1622 | 275 | Trihexoside malonate ester (C21H33O19⁻)
571.1516 | 265 | Trihexoside malonate ester-water
341.1089 | 319 | Dihexoside (hexose-hexose; C12H21O11⁻)
323.0984 | 304 | Dihexoside-water (C12H19O10⁻)
179.0561 | 313 | Monohexoside (hexose; C6H11O6⁻)
221.0667 | 302 | Monohexoside acetate ester (C8H13O7⁻)
161.0455 | 283 | Monohexoside-water (C6H9O5⁻)
101.0232 | 230 | Fragment ion from hexoses (C4H5O3⁻)
113.0228 | 202 | Fragment ion from hexoses
125.0244 | 195 | Fragment ion from hexoses


in negative ion mode yielded evidence of complex mixtures of metabolites, dominated by acylsucroses and flavonoid glycosides (Fig. 1). Automated peak detection, deisotoping, integration, and retention time alignment using Waters MarkerLynx XS software yielded a total of 3,280 mass-to-charge ratio (m/z)-retention time pairs, which are estimated to represent more than 1,000 distinct metabolites owing to the formation of multiple adduct ions, noncovalent dimer ions, and fragment ions.

Sorting of the RMD values for the entire automated peak-picking data set revealed that 3,199 (98%) of the ions had positive absolute mass defects, but 2% had negative absolute mass defects typical of inorganic salt cluster ions and instrument contaminants (e.g. trifluoroacetate, NaHPO4⁻), and these were filtered from further consideration. The ions with positive mass defects were divided into bins, with 1,805 (55% of total) falling in the RMD range of 400 to 650 ppm and 1,177 (36% of total) with RMD from 200 to 400 ppm, the latter range being typical of polyphenols. Since the objective of this exercise was the annotation of sesquiterpene glycosides from this data set, three boundary conditions satisfied by sesquiterpenoid glycosides were proposed: (1) the maximum RMD for a sesquiterpene glycoside is estimated as 636 ppm based on the theoretical m/z of 383.2439 for [M-H]⁻ of farnesol monoglycoside (C21H35O6⁻); (2) the minimum RMD that a sesquiterpene glycoside (maximum of four hexose moieties) can display is 463 ppm, calculated from the theoretical m/z of 869.4024 for [M-H]⁻ of farnesol tetraglycoside; and (3) the minimum

Figure 2. Negative ion mode multiplexed CID mass spectra of S. habrochaites LA1777 metabolites. A, Acylsugar S4:22; RMD = 492 ppm for [M+formate]⁻. B, Acylsugar S4:23; RMD = 402 ppm for [M+formate]⁻. C, Acylsugar S4:17; RMD = 440 ppm for [M+formate]⁻. D, Negative ion mode MS/MS spectrum of products of m/z 609 [M+formate]⁻ of campherenane diol diglucoside. E, MS/MS spectrum of products of m/z 811 [M-H]⁻ from campherenane diol triglycoside malonate ester. F, Negative ion mode multiplexed CID mass spectrum of the triterpenoid glycoalkaloid tomatine from S. habrochaites LA1777. Values for RMD of the major fragment ions are presented. All displayed negative ion mode CID mass spectra were obtained using a collision potential of −60 V, and MS/MS spectra were obtained using a collision potential of −50 V. All chromatographically resolved isomers displayed fragments of the same m/z values.


nominal m/z of a sesquiterpenoid monoglycoside should be 383 based on farnesol monoglycoside. It is notable that some terpenoid compounds are esterified to malonate, as evidenced by the malonylated diterpene glycosides of Nicotiana attenuata and the malonylated sesquiterpenes of Panax ginseng (Guangzhi et al., 2005; Heiling et al., 2010; Ruan et al., 2010; Sun et al., 2011). To account for the acetate or malonate esters of sesquiterpenoid glycosides and to allow for experimental error in mass measurement, we propose that nearly all compounds in this class should fall in the RMD range of 440 to 640 ppm. The number of detected ions with RMD of 440 to 640 ppm was 1,280 (38% of total), and this number was reduced to 1,076 (33% of total) after application of the low-mass (m/z 383) cutoff. Next, this list of putative sesquiterpenoid glycosides was sorted by descending peak area, and the 200 most abundant ions were selected for further processing (Supplemental Table S1).

Distinguishing Terpenoid Glycosides from Other Compounds
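The filtering steps described above (RMD window, low-mass cutoff, ranking by peak area) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' MarkerLynx workflow: the `Feature` record, the `shortlist` function, and the retention times and peak areas in the example are hypothetical, while the m/z values and thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    mz: float       # measured m/z of the ion
    rt_min: float   # retention time, minutes
    area: float     # integrated peak area


def rmd_ppm(mz):
    """Relative mass defect in ppm, using the rounded m/z as nominal mass."""
    return (mz - round(mz)) / mz * 1e6


def shortlist(features, rmd_min=440.0, rmd_max=640.0, min_mz=383.0, top_n=200):
    """Keep ions inside the RMD window and above the mass cutoff, ranked by area."""
    kept = [
        f for f in features
        if f.mz >= min_mz and rmd_min <= rmd_ppm(f.mz) <= rmd_max
    ]
    return sorted(kept, key=lambda f: f.area, reverse=True)[:top_n]


# Boundary conditions (1) and (2) follow from the two farnesol glycoside ions.
print(round(rmd_ppm(383.2439)), round(rmd_ppm(869.4024)))

# Hypothetical retention times and areas; m/z values are quoted from the text.
features = [
    Feature(mz=811.3587, rt_min=58.1, area=4.0e4),
    Feature(mz=563.3055, rt_min=61.3, area=9.0e4),
    Feature(mz=401.2520, rt_min=66.0, area=2.5e4),
    Feature(mz=447.0933, rt_min=20.4, area=8.0e5),  # RMD near 209 ppm, polyphenol-like
    Feature(mz=239.1997, rt_min=66.0, area=1.0e4),  # below the m/z 383 cutoff
    Feature(mz=112.9856, rt_min=1.2, area=5.0e5),   # negative absolute mass defect
]
for f in shortlist(features):
    print(f.mz, round(rmd_ppm(f.mz)))
```

Because ions with negative absolute mass defects give negative RMD values, they fall outside the window without a separate filtering step.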

Applying the RMD and molecular mass criteria described allows for the inclusion of nearly all sesquiterpene glycosides, but the final list also may include some nonterpenoids (Supplemental Table S1). To distinguish these, calculation of the RMD of fragment ions generated using either MS/MS or non-mass-selective CID can provide additional discriminating information. Fragment ion RMD values distinguish terpenoid glycosides from other compounds, since losses of all carbohydrate moieties will yield a fragment ion corresponding to a terpenoid core, yielding RMD values greater than 800 ppm. Furthermore, even if all sugars are not removed during fragmentation, the RMD values of fragment ions formed by terpenoid glycosides will be greater than that of the pseudomolecular ion, because RMDs are less than 350 ppm for the neutral hexose substructures and their fragments, values much less than for terpenoid cores (Table I). Therefore, terpenoid glycosides are characterized by fragment ions that display increasing RMD as their masses decrease from the removal of glycoside groups. This phenomenon is illustrated in Figure 2, which shows the MS/MS spectra of several abundant metabolites, with pseudomolecular ion RMD falling in the 440- to 636-ppm range. Among the fragment ions of these compounds, only the fragments m/z 199 (RMD of approximately 840 ppm; Fig. 2, A and B) display RMD close to that of isoprene or its oligomers (919 ppm), but m/z 199 has a mass too low to be an oxygenated sesquiterpenoid core (C15H24 would be 204 D). In addition, there is no systematic increase of RMD as fragment masses decrease among the fragment ions in any of these compounds, suggesting that the groups being lost have high hydrogen contents similar to, or greater than, the intact molecule. These findings suggest that the molecules are not terpenoid glycosides; in fact, the metabolites whose MS/MS spectra are depicted in Figure 2, A to C, are all acylsucroses. In contrast, the MS/MS spectra of glycosides of the sesquiterpenoid campherenane diol (discussed below) shown in Figure 2, D and E, show a systematic increase in RMD as major fragment masses decrease, consistent with a hydrogen-rich terpenoid core and neutral mass losses of glycosides. Only the less-abundant carbohydrate-derived product ions of m/z 161 and 323 (Fig. 2D) and m/z 179 and 323 (Fig. 2E) deviate from this trend.

Annotation of Sesquiterpene Diol Glycosides from S. habrochaites LA1777
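The fragment-ion criterion described above can be expressed as a simple monotonicity test. The sketch below is illustrative only: the function names, the 0.01 m/z matching tolerance, and the second spectrum are hypothetical, while the sugar ion masses come from Table I and the first spectrum uses fragment m/z values reported in this article for the triglycoside malonate ester detected at m/z 811.

```python
# Theoretical m/z of carbohydrate-derived ions listed in Table I.
SUGAR_IONS = (
    503.1618, 485.1512, 589.1622, 571.1516, 341.1089, 323.0984,
    179.0561, 221.0667, 161.0455, 101.0232, 113.0228, 125.0244,
)


def rmd_ppm(mz):
    """Relative mass defect in ppm, using the rounded m/z as nominal mass."""
    return (mz - round(mz)) / mz * 1e6


def is_sugar_ion(mz, tol=0.01):
    """True if mz matches a Table I carbohydrate ion within tol (assumed tolerance)."""
    return any(abs(mz - ref) <= tol for ref in SUGAR_IONS)


def rmd_rises_as_mass_falls(mzs):
    """Check that RMD increases steadily from the heaviest to the lightest ion,
    ignoring carbohydrate-derived ions, which are expected to deviate."""
    ions = sorted((m for m in mzs if not is_sugar_ion(m)), reverse=True)
    rmds = [rmd_ppm(m) for m in ions]
    return len(rmds) >= 2 and all(a < b for a, b in zip(rmds, rmds[1:]))


def core_candidates(mzs, min_rmd=800.0):
    """Fragments whose RMD exceeds the value expected for terpenoid cores."""
    return [m for m in mzs if rmd_ppm(m) > min_rmd]


# Precursor and fragments reported for the m/z 811 metabolite, plus two sugar ions.
glycoside = [811.3587, 767.3707, 725.3587, 563.3057, 401.2516, 239.2017,
             323.0984, 179.0561]
# Hypothetical spectrum in which RMD falls as mass falls (not from the article).
non_terpenoid = [681.3700, 527.2700, 373.1700]

print(rmd_rises_as_mass_falls(glycoside), core_candidates(glycoside))
print(rmd_rises_as_mass_falls(non_terpenoid), core_candidates(non_terpenoid))
```

A strict inequality between successive ions is a design choice of this sketch; with real data, a tolerance on each step may be more appropriate given mass measurement error.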

An example workflow for metabolite annotation follows. In the list of the 200 S. habrochaites LA1777 metabolite ions with greatest peak areas within RMD 440 to 640 ppm (as discussed above), a metabolite m/z of 609 was ranked 17th in peak area (Supplemental Table S1). Negative ion mode multiplexed CID mass spectra of these compounds yielded fragment ions that displayed a systematic increase of RMD with decreasing mass, consistent with losses of neutral fragments lower in hydrogen content than the intact molecule. This observation flagged the metabolite as a potential terpenoid glycoside. In order to ensure that these fragment ions were derived from the proposed pseudomolecular ion, MS/MS spectra were generated.

Since multiplexed CID results can be complicated by the formation of fragments arising from other coeluting metabolites, the MS/MS spectrum of products of m/z 609 (Fig. 2D) was generated. A fragment ion at m/z 563 was observed, corresponding to the loss of HCOOH. Since formic acid was in the mobile phase, m/z 609 was annotated as [M+formate]⁻ of a metabolite of 564 D. Such ionization behavior is common for glycosides that lack acidic functional groups. The next most abundant fragment was m/z 401, corresponding to one less hexose moiety

Figure 3. Relationships between negative ion mode MS/MS fragment ion masses (m/z values) and RMD values for a triterpenoid glycoalkaloid (tomatine; indicated by triangles), a sesquiterpene triglycoside malonate ester (sesquiterpene triglycoside-811; indicated by diamonds), and a tetraacylsucrose (AS2-765; indicated by circles) from S. habrochaites LA1777 leaf dip extract.


Table II. Groups of metabolites identified as sesquiterpenoid glycosides from S. habrochaites LA1777 based on RMD filtering of molecular and fragment ions

Δm, Difference between theoretical and measured m/z values, in parts per million.

Compound No. | Experimental m/z | RMD (ppm) | Proposed Elemental Formula of Neutral Molecule | Δm (ppm) | Compound Type | Measured m/z of Terpenoid Core Fragment Ion | Elemental Formula of Terpenoid Core Fragment Ion | Δm (ppm) | RMD of Terpenoid Core Fragment (ppm) | No. of Isomers
1 | 545.2932 | 538 | C27H46O11 | −1 | Sesquiterpene II diglycoside | 221.1527 | C14H21O2⁻ | −9 | 691 | 8
2 | 587.3049 | 519 | C29H48O12 | −3 | Sesquiterpene II diglycoside acetate ester | 221.1529 | C14H21O2⁻ | −8 | 692 | 1
3 | 631.2929 | 464 | C37H44O13 | 3 | Sesquiterpene II diglycoside malonate ester | 221.1565 | C14H21O2⁻ | 8 | 708 | 2
4 | 707.3112 | 440 | C32H52O17 | −3 | Sesquiterpene II triglycoside | 221.1539 | C14H21O2⁻ | −4 | 696 | 4
5 | 793.3474 | 438 | C36H58O19 | 1 | Sesquiterpene II diol triglycoside malonate ester | 221.1527 | C14H21O2⁻ | −9 | 691 | 11
6 | 401.2520 | 628 | C21H38O7 | −6 | Campherenane diol monoglycoside | 239.1997 | C15H27O2⁻ | −8 | 836 | 6
7 | 443.2632 | 594 | C23H40O8 | −4 | Campherenane diol monoglycoside acetate ester | 239.1994 | C15H27O2⁻ | −9 | 834 | 14
8 | 445.2421 | 544 | C22H38O9 | −4 | Campherenane diol monoglycoside derivative | 239.1999 | C15H27O2⁻ | −7 | 836 | 5
9 | 487.2536 | 520 | C24H40O10 | −3 | Campherenane diol monoglycoside malonate ester | 239.1994 | C15H27O2⁻ | −9 | 834 | 9
10 | 563.3055 | 542 | C27H48O12 | −3 | Campherenane diol diglycoside | 239.2048 | C15H27O2⁻ | 13 | 857 | 7
11 | 605.3152 | 521 | C29H50O13 | −4 | Campherenane diol diglycoside acetate ester | 239.1993 | C15H27O2⁻ | −10 | 834 | 9
12 | 649.3042 | 469 | C30H50O15 | −5 | Campherenane diol diglycoside malonate ester | 239.1993 | C15H27O2⁻ | −10 | 834 | 14
13 | 725.3607 | 497 | C33H58O17 | 2 | Campherenane diol triglycoside | 239.2007 | C15H27O2⁻ | −4 | 840 | 6
14 | 811.3587 | 442 | C36H60O20 | −2 | Campherenane diol triglycoside malonate ester | 239.2006 | C15H27O2⁻ | −4 | 839 | 13
15 | 411.1986 | 483 | C21H32O8 | −9 | Sesquiterpene III monoglycoside | 249.1481 | C15H21O3⁻ | −4 | 595 | 13
16 | 453.2082 | 459 | C23H34O9 | −9 | Sesquiterpene III monoglycoside acetate ester | 249.1483 | C15H21O3⁻ | −3 | 596 | 14
17 | 497.2011 | 404 | C24H34O11 | −3 | Sesquiterpene III monoglycoside malonate ester | 249.1475 | C15H21O3⁻ | −6 | 592 | 11
18 | 413.2143 | 519 | C21H34O8 | −9 | Sesquiterpene I monoglycoside | 251.1634 | C15H23O3⁻ | −8 | 651 | 17
19 | 455.2270 | 499 | C22H36O9 | −4 | Sesquiterpene I monoglycoside acetate ester | 251.1638 | C15H23O3⁻ | −6 | 653 | 21
20 | 499.2162 | 433 | C24H36O12 | −5 | Sesquiterpene I monoglycoside malonate ester | 251.1643 | C15H23O3⁻ | −4 | 655 | 11

(Table continues on following page.)


than m/z 563. RMD values of both of these fragments (539 and 631 ppm for m/z 563 and 401, respectively) increased as the m/z of the fragment decreased (Fig. 2D), consistent with annotation as a terpenoid glycoside. However, no prominent fragment ions with high RMD typical of terpenoid cores were observed with m/z < 401. The fragment ion mass of m/z 401.2532 suggested a formula of C21H37O7−, which has six more than the 15 carbon atoms that are usually found in sesquiterpenes, suggesting the presence of an additional hexose not released during fragmentation. A fragment ion of m/z 323.098 (RMD = 303 ppm) was tentatively assigned as [dihexose−H−H2O]− (C12H19O10−), suggesting a diglucoside where the two hexose groups are linked to one another. Additional evidence for a glycoside was provided by the fragment at m/z 161.04 (RMD = 271 ppm), corresponding to C6H9O5−. Further characterization of this compound required purification and structure determination using one-dimensional and two-dimensional NMR, since further fragmentation of the core was not observed in negative ion mode and the mass spectra were not consistent with previously known metabolites. The NMR spectra confirmed the structure as the sesquiterpenoid glycoside campherenane diol diglucoside, as we reported earlier (Ekanayaka et al., 2014).

Another example of how RMD values guide the discovery of more complex terpenoid glycosides is presented in the form of a metabolite detected in negative ion profiling of S. habrochaites LA1777 leaf trichomes. The metabolite was detected as m/z 811.3587 (RMD = 442 ppm), perhaps higher in molecular mass and with lower RMD than expected for a sesquiterpenoid glycoside. Its MS/MS product ion spectrum (Fig. 2E) shows fragments formed by the loss of CO2 to give m/z 767.3707 (RMD = 483 ppm) followed by the loss of C2H2O to give m/z 725.3587 (RMD = 495 ppm). Both fragments are characteristic of malonate esters. Further fragmentation generated ions of m/z 563.3057 (RMD = 543 ppm), m/z 401.2516 (RMD = 627 ppm), and m/z 239.2017 (RMD = 841 ppm). With the exception of common carbohydrate fragment ions, RMD values increased as fragment ion mass decreased, again consistent with a terpenoid glycoside. The fragment ion at m/z 239.2017 was annotated as a sesquiterpenoid core (C15H27O2−), as it did not undergo further fragmentation. Based on these observed characteristics, this compound can be annotated as a sesquiterpenoid triglycoside malonate ester.
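The trend described for the m/z 811.3587 metabolite can be checked numerically. The following is a minimal Python sketch, not the authors' software. It assumes that RMD is the fractional part of the measured m/z divided by that m/z, expressed in ppm, which reproduces the values quoted above to within a few ppm. The function names are illustrative.

```python
import math

def rmd_ppm(mz: float) -> float:
    """Relative mass defect in ppm.

    Assumption: the mass defect is the fractional part of m/z, which
    holds for ions whose defect is positive and smaller than 1 Da.
    """
    return (mz - math.floor(mz)) / mz * 1e6

def rmd_increases_as_mz_decreases(mz_values) -> bool:
    """True if RMD rises strictly when ions are ordered from the
    highest to the lowest m/z."""
    ordered = sorted(mz_values, reverse=True)
    rmds = [rmd_ppm(mz) for mz in ordered]
    return all(a < b for a, b in zip(rmds, rmds[1:]))

# Precursor and product ions quoted in the text for m/z 811.3587
ions = [811.3587, 767.3707, 725.3587, 563.3057, 401.2516, 239.2017]

for mz in ions:
    print(f"m/z {mz:9.4f}  RMD = {rmd_ppm(mz):6.1f} ppm")
print("terpenoid-like trend:", rmd_increases_as_mz_decreases(ions))
```

Common carbohydrate fragment ions (for example m/z 323.098 or 161.04 in the preceding spectrum) have low RMD and would break the trend, so a check of this kind would need to exclude them first, as the text notes.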

Table II. (Continued from previous page.)

| Compound No. | Experimental m/z | RMD (ppm) | Proposed Elemental Formula of Neutral Molecule | Δm (ppm) | Compound Type | Measured m/z of Terpenoid Core Fragment Ion | Elemental Formula of Terpenoid Core Fragment Ion | Δm (ppm) | RMD of Terpenoid Core Fragment (ppm) | No. of Isomers |
|---|---|---|---|---|---|---|---|---|---|---|
| 21 | 617.2795 | 453 | C29H45O14 | −2 | Sesquiterpene I diglycoside acetate ester | 251.1636 | C15H23O3− | −7 | 652 | 4 |
| 22 | 661.2693 | 407 | C30H46O16 | −3 | Sesquiterpene I diglycoside malonate ester | 251.1639 | C15H23O3− | −6 | 653 | 8 |
| 23 | 823.3245 | 394 | C36H56O21 | 1 | Sesquiterpene I triglycoside malonate ester | 251.1640 | C15H23O3− | −5 | 653 | 5 |
| 24 | 985.3742 | 380 | C30H50O26 | −3 | Sesquiterpene I tetraglycoside malonate ester | 251.1641 | C15H23O3− | −5 | 654 | 11 |

Figure 4. HPLC-MS extracted ion chromatogram profiles for masses of three compounds purified from S. habrochaites LA1777 showing evidence of 10 isomers of [M−H]− (m/z 661) of sesquiterpene I diol diglucoside malonate ester (compound 22 formula from Table II; A), three isomers of [M+formate]− (m/z 633) of sesquiterpene II alcohol diglucoside acetate ester (compound 2 formula from Table II; B), and eight isomers of [M+formate]− (m/z 591) of sesquiterpene II alcohol diglucoside (compound 1 formula from Table II; C). Structures of compounds 1, 2, and 22 determined by NMR and MS/MS are presented. Carbon atoms are numbered in accordance with NMR assignments in Supplemental Table S2.

The application of RMD analysis is not limited to sesquiterpenoid metabolites but is readily extended to other related and unrelated substances. MS/MS spectra of the triterpenoid glycoalkaloid tomatine displayed similar behavior. The RMD of the [M−H]− ion of tomatine (m/z 1032.5) is 520 ppm. The major fragments display an increasing RMD from 520 to 674 ppm with decreasing product ion mass and correspond to neutral losses of relatively hydrogen-deficient carbohydrate moieties (Fig. 2F). The relationship between fragment ion RMD value and ion mass (m/z) for a tetraacylsucrose, a sesquiterpenoid glycoside, and the triterpenoid glycoside tomatine shows how terpenoid glycosides can be distinguished from other compounds based on fragment ion RMD (Fig. 3). For the two glycosides that possess a terpenoid core, RMD values of fragment ions that contain the terpenoid core are greater than that for the precursor ion, whereas the reverse is true, with the exception of the fatty acyl anion at m/z 199, for the tetraacylsucrose.
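Why the loss of a carbohydrate moiety raises the RMD of the remaining ion can be illustrated with a short calculation. The Python sketch below is an illustration under stated assumptions rather than part of the published workflow: it takes the neutral loss to be an anhydrohexose residue (C6H10O5), uses standard monoisotopic atomic masses, and computes RMD as the fractional mass divided by the mass in ppm. The helper names are hypothetical.

```python
import math

# Monoisotopic atomic masses in Da
ATOMIC_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

def formula_mass(counts: dict) -> float:
    """Monoisotopic mass of a neutral species given element counts."""
    return sum(ATOMIC_MASS[element] * n for element, n in counts.items())

def rmd_ppm(mass: float) -> float:
    """Relative mass defect in ppm (fractional mass / mass)."""
    return (mass - math.floor(mass)) / mass * 1e6

# Assumed neutral loss: anhydrohexose residue, C6H10O5
HEXOSE_RESIDUE = {"C": 6, "H": 10, "O": 5}
loss = formula_mass(HEXOSE_RESIDUE)

precursor = 563.3057          # fragment m/z quoted in the text
product = precursor - loss    # predicted ion after losing one hexose

print(f"hexose residue: {loss:.4f} Da, RMD = {rmd_ppm(loss):.0f} ppm")
print(f"before loss:    m/z {precursor:.4f}, RMD = {rmd_ppm(precursor):.0f} ppm")
print(f"after loss:     m/z {product:.4f}, RMD = {rmd_ppm(product):.0f} ppm")
```

Because the sugar residue carries a lower RMD than the ion it leaves, removing it concentrates the hydrogen-rich terpenoid portion and the RMD of the product ion goes up. The predicted product lies close to the m/z 401.2516 fragment reported for the sesquiterpenoid triglycoside malonate ester.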

The utility of RMD filtering was then assessed by applying RMD-based filtering criteria for sesquiterpenoid glycosides, as described above, to the most abundant 200 metabolite ions in the list of S. habrochaites LA1777 metabolites detected by nontargeted LC-MS profiling. There were 224 peaks annotated as sesquiterpene glycosides, including multiple isomers for each elemental formula, as presented in Table II. Three different sesquiterpenoid core formulas were established from MS/MS spectra, including the campherenane diol core (C15H28O2). The MS/MS data generated for each of these compounds and the RMD of each fragment ion are presented in Supplemental Figures S1 to S21.

Discovery of Conjugated Terpenoid Glycosides from S. habrochaites LA1777

RMD filtering of the list of m/z-retention time pairs extracted from nontargeted metabolite profiling of S. habrochaites LA1777 revealed numerous m/z values consistent with sesquiterpenoid glycosides. Among those giving the greatest integrated peak areas were three nominal masses (m/z 661, 633, and 591) that gave CID mass spectra suggestive of glycosides, yielded multiple chromatographic peaks consistent with several isomers (Fig. 4), and were judged to represent metabolites in sufficient abundance for their isolation and characterization by NMR spectroscopy.

Three individual isomers designated with formulas 1, 2, and 22 (Table II) were selected based on RMD criteria and were purified using HPLC. Their NMR spectra (presented in Supplemental Table S2) revealed structures with methyl branching consistent with isoprenoid precursors, although the aglycone cores differed in structure from known volatile or nonvolatile sesquiterpene metabolites within the genus Solanum. The core for the isolated compound (Fig. 4A, peak 8a; detected as m/z 661 in negative ion mode) was consistent with an oxidized core of formula C15H24O3, and this core formula was designated as sesquiterpene I. NMR spectra revealed the purified isomer to be an acyclic metabolite with a ketone group near the center of the carbon chain (Supplemental Fig. S22). NMR spectra of two additional metabolites (Fig. 4, B [peak 3b; m/z 633] and C [peak 6c; m/z 591]) suggested that both were glycosides of a common aglycone core of C15H26O, and this was designated sesquiterpene II. The two masses were consistent with the compounds differing by the attachment of one acetyl group. Structures of the three metabolites are presented in Figure 4.

Figure 5. Histogram of RMD values for C. roseus metabolites extracted from the Medicinal Plants Consortium metabolite database (available at http://metnetdb.org/PMR). The highlighted region corresponds to the range of RMD values anticipated for monoterpene indole alkaloid pathway intermediates. Data were generated by UHPLC-time-of-flight MS in positive ion mode.

Figure 6. RMD filtering relationships between pseudomolecular and fragment ion masses for the terpene indole alkaloids vinblastine (A), vindoline (B), and catharanthine (C) from C. roseus. A slight increase of RMD of fragment ions is observed as their m/z decreases, consistent with a relatively hydrogen-rich terpenoid core.


Neither terpenoid core displays much structural similarity to volatile sesquiterpenes of S. habrochaites or other documented metabolites within the genus, nor do their structures suggest that they are formed by the action of common terpene synthase enzymatic transformations. RMD filtering accelerated the recognition of the presence of these unusual compounds, but it is clear that more investigation is needed to determine the biosynthetic origins and the biological functions of these metabolites.

Application of RMD Filtering for the Mining of Intermediates in Terpene Indole Alkaloid Biosynthesis in Catharanthus roseus

The original metabolome data set from the Medicinal Plants Consortium database has 3,229 m/z-retention time pairs derived from nontargeted LC-MS analysis of C. roseus tissue extracts. The distribution of RMD values for all detected ions is presented in Figure 5. Applying the boundary conditions discussed below resulted in 2,109 m/z-retention time pairs (65%) that satisfy the criteria as potential terpene alkaloid pathway intermediates, and this finding suggests that a large fraction of the specialized metabolome may be derived from common or similar precursors (Supplemental Table S3). All 12 monoterpene indole alkaloids annotated in the Medicinal Plants Consortium database were correctly assigned as lying within the RMD search criteria. The only annotated metabolite falling outside this range was sucrose, reflecting the success of the filtering in excluding metabolites from different structural classes. RMD values of fragment ions can be used to distinguish potential terpene indole alkaloids from nonterpenoid compounds. Fragment ions of terpene indole alkaloids that retain the terpenoid component display a characteristic increase of RMD relative to the parent ion as the fragment ion m/z values decrease. This can be inferred from the relationship between RMD and fragment ion masses reported for vinblastine, vindoline, and catharanthine (Fig. 6). Furthermore, the CID mass spectra of terpene indole alkaloids (Supplemental Figs. S23–S25) display a number of even-mass fragment ions, which are characteristic of fragment ions containing an odd number of nitrogen atoms.

The applications discussed in this report demonstrate the applicability of RMD filtering for discovering terpenoid compounds even when the terpenoid component represents only a minority fraction of the mass of the molecules of interest and has been subjected to a number of biotransformations, as in the case of vinblastine. Even so, RMD filtering distinguishes vinblastine from other compounds, and the gradual increase of the RMD of fragment ions with decreasing ion mass suggests the presence of a terpenoid core. This parallels the sesquiterpene triglycoside malonate esters in S. habrochaites, in which the sesquiterpene component was likewise a minor fraction of metabolite mass.
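The observation about even-mass fragment ions follows from the nitrogen rule: for singly charged even-electron ions such as [M+H]+ and most CID fragments, an even nominal m/z implies an odd number of nitrogen atoms. A minimal Python sketch of that parity check is given below. It assumes ions composed only of C, H, N, and O, the helper names are illustrative, and the one-nitrogen fragment formula is a hypothetical example rather than an ion taken from the spectra.

```python
import re

# Nominal (integer) masses of the most abundant isotopes
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16}

def parse_formula(formula: str) -> dict:
    """Parse a simple formula such as 'C20H25N2O2' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts

def nominal_mass(formula: str) -> int:
    """Nominal mass of an ion formula (electron mass is negligible here)."""
    return sum(NOMINAL[el] * n for el, n in parse_formula(formula).items())

def odd_nitrogen_expected(nominal_mz: int) -> bool:
    """Nitrogen rule for singly charged even-electron ions:
    an even nominal m/z implies an odd number of nitrogen atoms."""
    return nominal_mz % 2 == 0

examples = {
    "C10H13N2": "tryptamine [M+H]+ (from the text)",
    "C20H25N2O2": "iridotrial + tryptamine [M+H]+ (from the text)",
    "C9H10N": "hypothetical one-nitrogen fragment",
}
for ion, label in examples.items():
    mz = nominal_mass(ion)
    n_count = parse_formula(ion).get("N", 0)
    print(f"{label}: nominal m/z {mz}, N atoms = {n_count}, "
          f"odd N expected = {odd_nitrogen_expected(mz)}")
```

In practice the rule is applied in the other direction: an even nominal m/z among the fragments of a protonated alkaloid flags an ion that has retained an odd number of nitrogen atoms.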

CONCLUSION

The range of specialized metabolites in the plant kingdom is astounding, yet deep explorations into the metabolomes of nonmodel plants face enormous data sets and would benefit from tools that guide a focus on specific biosynthetic classes. Analyses using LC-MS with multiplexed CID generate molecular and fragment mass information for all ionized metabolites, provided that the ion signal is sufficient. When coupled with medium- to high-resolution mass measurements, this approach reduces the need for separate MS/MS analyses for all metabolites and generates information about molecular and fragment masses with sufficient accuracy to allow for useful RMD measurements. Analysis of the RMD variation among precursor ions and product/fragment ions accelerates metabolite discovery by eliminating signals with RMD values that are inconsistent with a target class of metabolites. For this investigation, while we cannot exclude the possibility that the RMD values consistent with terpenoid conjugates may include nonterpenoids, this sorting removes many nonterpenoid metabolites and guides a focus on candidate terpenoid conjugates. This strategy enabled the annotation of more than 200 novel sesquiterpene glycosides from a plant system that has been studied for several decades. We anticipate that RMD filtering of nontargeted metabolite profiles will propel the annotation and identification of terpenoid glycosides that have been underappreciated owing to the lack of a systematic method for the recognition of their presence.

The research discussed here used negative ion mode MS/MS for the annotation of terpene glycosides from complex matrices and provides, to our knowledge, the first evidence for a remarkably extensive and diverse group of sesquiterpenoid glycosides in S. habrochaites LA1777. We recognize that RMD values for molecular mass alone yield limited resolution of compound classes, and many metabolites are derived from multiple precursors (e.g. prenylated polyphenols and terpenoid glycosides). For more refined annotations, examinations of RMD values of molecular and fragment ions as well as neutral mass losses provide evidence of multiple biosynthetic precursors, as was demonstrated for the terpenoid glycosides described in this report. In addition, RMD filtering is equally applicable to positive ion mode data sets, as demonstrated by its application in the correct annotation of known terpene indole alkaloids and the assignment of more than 1,000 ions as candidate terpenoid intermediates from C. roseus. We envision that the development of algorithms that provide automated classification of metabolites based on molecular and fragment RMD values will accelerate discoveries of gene functions that regulate plant chemistry.

MATERIALS AND METHODS

Plant Material

Solanum habrochaites LA1777 plants were grown in Michigan State University plant growth chambers (28°C, 16 h of light/8 h of dark day/night cycle, 150 μmol m−2 s−1, 96% humidity) for 6 weeks from seeds obtained from the C.M. Rick Tomato Genetics Resource Center (University of California, Davis). Ten leaflets harvested from each plant (6 weeks postgermination) were extracted by dipping in 5 mL of methanol:water (80:20, v/v) for about 30 s. Three biological replicates were used for profiling. Extracts were concentrated by drying under a stream of N2 gas at room temperature, and the residues were redissolved in 0.5 mL of methanol:water (80:20, v/v). The extract was centrifuged (10,000g for 10 min at 25°C) to remove debris; the supernatant was transferred to an autosampler vial for LC-MS and liquid chromatography-tandem mass spectrometry (LC-MS/MS) analyses.

LC-MS and MS/MS Analyses

Initial exploration of the complexity of S. habrochaites LA1777 extracts was performed using a Waters LCT Premier time-of-flight mass spectrometer coupled to Shimadzu LC-20AD pumps. Separations were performed using an Ascentis Express C18 UHPLC column (2.1 × 100 mm, 2.7 μm; Supelco), and metabolites were detected using electrospray ionization in negative ion mode. Solvents used were 0.15% (v/v) aqueous formic acid, pH 2.85 (A), and methanol (B). The LC gradient (A:B) was as follows: 0 to 1 min (99:1), 1.01 to 100 min (linear ramp to 20:80), 100.01 to 101 min (linear ramp to 1:99), 101 to 105 min (hold at 1:99), 105 to 106 min (linear ramp to 99:1), and hold at 99:1 over 106 to 110 min. The flow rate was 0.3 mL min−1, and column temperature was held at 35°C. Sample volume injected to the column was 10 μL. Mass spectra were acquired over m/z 50 to 1,500 using dynamic range extension. Mass resolution (mass divided by mass peak width [M/ΔM] measured as the full width-half maximum) was approximately 10,000. Five parallel collision energy functions were used by switching the aperture 1 voltage between 5, 20, 40, 60, and 80 V with 0.1 s per function. Other parameters include capillary voltage of 2.50 kV, desolvation temperature of 350°C, source temperature of 100°C, cone gas (N2) at 40 L h−1, and desolvation gas (N2) at 350 L h−1.

All LC-MS/MS experiments used for the characterization of novel terpenoid metabolites from S. habrochaites LA1777 were performed using a Waters Xevo G2-S QToF mass spectrometer coupled to a Waters Acquity ultra-high pressure LC system. The same chromatographic column and solvents used in experiments performed on the LCT Premier were used here. The solvent gradient (A:B) was as follows: 0 to 1 min (99:1), 1.01 to 4 min (linear ramp to 55:45), 4.01 to 9 min (hold at 55:45), step to 50:50 and hold at 50:50 over 9.01 to 14 min, step to 1:99 and hold at this ratio over 14.01 to 17 min, and step to 99:1 and hold at this composition over 17.01 to 20 min. The flow rate was 0.3 mL min−1, and column temperature was held at 35°C. Mass spectra were acquired using negative ion mode electrospray ionization and dynamic range extension over m/z 50 to 1,500, with mass resolution (M/ΔM, full width-half maximum) of approximately 20,000. Five parallel collision energy functions were used, with 0.1 s per function. Collision cell potentials used for negative ion mode fragmentation for each function were 5, 15, 25, 35, and 60 V. Other parameters include capillary voltage of 2.14 kV, desolvation temperature of 280°C, source temperature of 90°C, cone gas (N2) at 0 L h−1, and desolvation gas (N2) at 800 L h−1.

Data Processing

Automated peak detection, integration, and retention time alignment were performed using Waters MarkerLynx XS software, and lists of m/z values, retention times, and extracted ion chromatogram peak areas were exported as text files and processed further using Microsoft Excel software. The lowest collision energy function (function 1) was used for peak detection, integration, retention time alignment, and deisotoping. The parameters used with MarkerLynx processing were as follows: marker intensity threshold, 800 counts; mass window, 0.05 Da; retention time window, 0.25 min; m/z range, 100 to 1,500; and retention time range, 0.5 to 20 min. Peak smoothing was not applied.
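To make the role of the mass and retention time windows concrete, the sketch below treats two detected peaks as the same feature when both their m/z and their retention time agree within the stated windows. This is an illustrative Python sketch and not the MarkerLynx algorithm, whose internals are not described here. Treating each window as a maximum absolute difference is an assumption, and the `Peak` type, the function names, and the example peaks are hypothetical.

```python
from dataclasses import dataclass

MASS_WINDOW_DA = 0.05       # mass window from the text
RT_WINDOW_MIN = 0.25        # retention time window from the text
MZ_RANGE = (100.0, 1500.0)  # m/z range from the text
RT_RANGE = (0.5, 20.0)      # retention time range (min) from the text

@dataclass(frozen=True)
class Peak:
    mz: float
    rt_min: float
    area: float

def in_processing_range(peak: Peak) -> bool:
    """True if the peak lies inside the stated m/z and retention time ranges."""
    return (MZ_RANGE[0] <= peak.mz <= MZ_RANGE[1]
            and RT_RANGE[0] <= peak.rt_min <= RT_RANGE[1])

def same_feature(a: Peak, b: Peak) -> bool:
    """Assumption: each window is a maximum absolute difference."""
    return (abs(a.mz - b.mz) <= MASS_WINDOW_DA
            and abs(a.rt_min - b.rt_min) <= RT_WINDOW_MIN)

# Hypothetical peaks: two replicate detections and a later-eluting isomer
rep1 = Peak(mz=811.3587, rt_min=12.40, area=15200.0)
rep2 = Peak(mz=811.3621, rt_min=12.52, area=14100.0)
isomer = Peak(mz=811.3590, rt_min=13.10, area=9800.0)

print(same_feature(rep1, rep2))    # aligned across replicates
print(same_feature(rep1, isomer))  # kept separate by retention time
```

Under this reading, isomers that share an m/z value remain distinct m/z-retention time pairs as long as they elute more than one retention time window apart.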

Structure Elucidation of Candidate Sesquiterpenoid Glycosides from S. habrochaites LA1777

Details of the experimental procedures used to isolate sesquiterpenoid glycoside metabolites and determine their structures using LC-MS/MS and NMR spectroscopy are presented in the Supplemental Text S1. Metabolite structures, NMR chemical shifts (Supplemental Table S2), accurate mass measurements, and MS/MS data are included. NMR assignments were made based on one-dimensional 1H and 13C spectra and two-dimensional 1H-1H correlation spectroscopy and Nuclear Overhauser Effect spectroscopy or 1H-13C Heteronuclear Single Quantum Coherence, Heteronuclear Multiple Bond Correlation, and Total Correlation Spectroscopy spectra. Since none of the metabolite identities have been confirmed by synthesis, structures should be considered putatively annotated compounds or level 2 based on the Metabolomics Standards Initiative guidelines (Sumner et al., 2007).

LC-MS Analysis of Catharanthus roseus Metabolites

C. roseus tissue extracts were analyzed by LC-MS, and the data are available to the public at the Medicinal Plants Consortium Metabolome database (http://metnetdb.org/PMR/; Wurtele et al., 2012). Analyses were performed using a Waters LCT Premier time-of-flight mass spectrometer coupled to Shimadzu LC-20AD pumps. Separations were performed using an Ascentis Express C18 UHPLC column (2.1 × 50 mm, 2.7 μm; Supelco), and the ionized compounds were detected in positive ion mode electrospray ionization. Solvents used were 10 mM aqueous ammonium formate, pH 2.85 (A), and a 1:1 mixture of methanol and acetonitrile (B). The solvent gradient (A:B) was as follows: 0 to 1 min (90:10), 1.01 to 23 min (linear ramp to 10:90), 23.01 to 27 min (hold at 10:90), linear ramp to 90:10 by 28 min, and hold at 90:10 over 28 to 32 min. The flow rate was 0.3 mL min−1, and column temperature was held at 40°C. Mass spectra were acquired using positive ion mode electrospray ionization and dynamic range extension over m/z 50 to 1,500, with mass resolution (M/ΔM, full width-half maximum) of approximately 8,500. Four parallel collision energy functions were used, with 0.15 s per function. Collision cell potentials used for positive ion mode fragmentation for each function were 20, 40, 60, and 80 V. Other parameters include capillary voltage of 3 kV, desolvation temperature of 350°C, source temperature of 100°C, cone gas (N2) at 40 L h−1, and desolvation gas (N2) at 350 L h−1.

Annotation of C. roseus Metabolomes

The performance of the RMD filtering approach was evaluated by applying it to annotate metabolites from C. roseus, which accumulates monoterpene indole alkaloids (Svoboda et al., 1959; O’Connor and Maresh, 2006). This was performed by establishing appropriate boundary conditions for monoterpene metabolites and assessing whether the proposed monoterpene-derived intermediates from C. roseus documented in the Medicinal Plants Consortium Metabolome database were correctly classified. The precursors of monoterpene indole alkaloids are tryptamine {theoretical m/z of 161.1073 ([M+H]+); RMD = 666 ppm} and the iridoid glycoside secologanin {theoretical m/z of 389.1442 ([M+H]+); RMD = 371 ppm}. One proposed precursor of secologanin is iridotrial {theoretical m/z of 183.1016 ([M+H]+); RMD = 555 ppm; Miettinen et al., 2014}. The biosynthetic pathway dictates that the indole component of these compounds originated from the tryptamine group, while the iridotrial acts as a precursor of the monoterpene component (El-Sayed and Verpoorte, 2007). Direct condensation of iridotrial with tryptamine would generate the simplest form of terpene indole alkaloid with elemental formula C20H25N2O2+ {theoretical m/z of 325.1911 ([M+H]+); RMD = 588 ppm}. Strictosidine {theoretical m/z of 531.2337 ([M+H]+); RMD = 440 ppm} is formed by condensation of the monoterpenoid glycoside secologanin with tryptamine (El-Sayed and Verpoorte, 2007). Additional biosynthetic steps result in the formation of more complicated terpene indole alkaloids, including vinblastine, which is consistent with the formation of a strictosidine dimer {theoretical m/z of 1,061.46 ([M+H]+); RMD = 434 ppm}. Based on this information, the minimum RMD for mining for terpene indole alkaloids can be proposed as about 420 ppm, and the maximum RMD can be estimated as about 588 ppm. To account for errors in mass measurements, the RMD range of 350 to 600 ppm was employed, and the mass range was estimated to span from about m/z 325 to 1,061 for [M+H]+ ions. Therefore, compounds with RMDs ranging from 350 to 600 ppm and m/z ranging from 320 to 1,100 were selected as representing potential pathway intermediates involved in monoterpene indole alkaloid biosynthesis in C. roseus.
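The boundary conditions above translate directly into a two-part filter on each m/z value. The following Python sketch is one possible implementation, not the authors' Excel workflow. It assumes RMD is computed as the fractional part of m/z divided by m/z in ppm, which reproduces the RMD values quoted above for the reference ions. The function names and the dictionary of reference ions are illustrative.

```python
import math

RMD_RANGE_PPM = (350.0, 600.0)  # RMD window from the text
MZ_RANGE = (320.0, 1100.0)      # [M+H]+ m/z window from the text

def rmd_ppm(mz: float) -> float:
    """Relative mass defect in ppm (fractional part of m/z over m/z)."""
    return (mz - math.floor(mz)) / mz * 1e6

def is_candidate(mz: float) -> bool:
    """True if an ion satisfies both the m/z and the RMD criteria."""
    return (MZ_RANGE[0] <= mz <= MZ_RANGE[1]
            and RMD_RANGE_PPM[0] <= rmd_ppm(mz) <= RMD_RANGE_PPM[1])

# Theoretical [M+H]+ values quoted in the text
reference_ions = {
    "tryptamine": 161.1073,
    "iridotrial": 183.1016,
    "iridotrial + tryptamine": 325.1911,
    "secologanin": 389.1442,
    "strictosidine": 531.2337,
}

for name, mz in reference_ions.items():
    print(f"{name:24s} m/z {mz:9.4f}  RMD = {rmd_ppm(mz):5.0f} ppm  "
          f"candidate = {is_candidate(mz)}")
```

Tryptamine and iridotrial fall below the m/z window and are rejected even though iridotrial's RMD is in range, which shows why the mass limits are applied together with the RMD limits. Applied to an exported list of m/z-retention time pairs, the same predicate would select the candidate pathway intermediates.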


Supplemental Data

The following supplemental materials are available.

Supplemental Figure S1. Negative ion mode product ion MS/MS spectrum of products from [M−H]− (m/z 649) for campherenan-2,12-diol diglucoside malonate ester.

Supplemental Figure S2. Negative ion mode product ion MS/MS spectrum of products from [M−H]− (m/z 487) for campherenan-2,12-diol monoglucoside malonate ester.

Supplemental Figure S3. Negative ion mode product ion MS/MS spectrum of products from [M+formate]− (m/z 609) for campherenan-2,12-diol diglucoside.

Supplemental Figure S4. Negative ion mode product ion MS/MS spectrum of products from [M+formate]− (m/z 651) for campherenan-2,12-diol diglucoside acetate ester.

Supplemental Figure S5. Negative ion mode product ion MS/MS spectrum of products from [M+formate]− (m/z 771) for a campherenane-2,12-diol triglycoside.

Supplemental Figure S6. Negative ion mode product ion MS/MS spectrum of products from [M+formate]− (m/z 591) for a sesquiterpene II dihexoside.

Supplemental Figure S7. Negative ion mode product ion MS/MS spectrum of products from [M−H]− (m/z 631) for a sesquiterpene II dihexoside malonate ester.

Supplemental Figure S8. Negative ion mode product ion MS/MS spectrum of products from [M−H]− (m/z 793) for a sesquiterpene II trihexoside malonate ester.

Supplemental Figure S9. Negative ion mode product ion MS/MS spectrum of products from [M+HCOO]− (m/z 633) for a sesquiterpene II dihexoside acetate ester.

Supplemental Figure S10. Negative ion mode product ion MS/MS spectrum of products from [M−H]− (m/z 661) for a sesquiterpene I diglycoside malonate ester.

Supplemental Figure S11. Negative ion mode product ion MS/MS spectrum of products from [M−H]− (m/z 499) for sesquiterpene I monoglycoside malonate ester.

Supplemental Figure S12. Negative ion mode product ion MS/MS spectrum of [M−H]− (m/z 455) for sesquiterpene I monoglycoside acetate ester.

Supplemental Figure S13. Negative ion mode product ion MS/MS spectrum of [M−H]− (m/z 413) of sesquiterpene I monoglycoside.

Supplemental Figure S14. Negative ion mode product ion MS/MS spectrum of [M+HCOO]− (m/z 499) for sesquiterpene I monoglycoside malonate ester.

Supplemental Figure S15. Negative ion mode product ion MS/MS spectrum of [M−H]− (m/z 497) of sesquiterpene III monoglycoside malonate ester.

Supplemental Figure S16. Negative ion mode product ion MS/MS spectrum of products of [M−H]− (m/z 411) of sesquiterpene III monoglycoside.

Supplemental Figure S17. Negative ion mode product ion MS/MS spectrum of products from [M−H]− (m/z 617) for sesquiterpene I diglycoside acetate ester.

Supplemental Figure S18. Negative ion mode product ion MS/MS spectrum of products from [M−H]− (m/z 823) for sesquiterpene I triglycoside malonate ester.

Supplemental Figure S19. Negative ion mode product ion MS/MS spectrum of products from [M−H]− (m/z 985) for sesquiterpene I tetraglycoside malonate ester.

Supplemental Figure S20. Negative ion mode product ion MS/MS spectrum of products from [M+HCOO]− (m/z 489) for campherenane diol monoglycoside acetate ester.

Supplemental Figure S21. Negative ion mode product ion MS/MS spectrum of products from [M+HCOO]− (m/z 753) for sesquiterpene II triglycoside.

Supplemental Figure S22. Proposed structures of sesquiterpene I diol diglucoside malonate ester, sesquiterpene II alcohol diglucoside acetate ester, and sesquiterpene II alcohol diglucoside.

Supplemental Figure S23. Positive ion mode multiplexed CID mass spectrum of vinblastine (m/z 811.4 is [M+H]+) from C. roseus.

Supplemental Figure S24. Positive ion mode multiplexed CID mass spectrum of vindoline (m/z 457.2 is [M+H]+) from C. roseus.

Supplemental Figure S25. Positive ion mode multiplexed CID mass spectrum of catharanthine (m/z 337.2 is [M+H]+) from C. roseus.

Supplemental Table S1. Extracted ion chromatogram peak areas for S. habrochaites LA1777 leaf dip extracts satisfying the mass and relative mass defect criteria consistent with sesquiterpenoid glycosides.

Supplemental Table S2. 1H and 13C NMR chemical shifts for sesquiterpene glycoside compound formulas 1, 2, and 22 listed in Table II.

Supplemental Table S3. List of metabolite ions from C. roseus mined from the Medicinal Plants Consortium database (http://metnetdb.org/PMR) that satisfy proposed mass and relative mass defect criteria for monoterpene indole alkaloids and their pathway intermediates.

Supplemental Text S1. Supplemental information.

ACKNOWLEDGMENTS

We thank Drs. Robert Last, Tony Schilmiller, and Dean DellaPenna (Michigan State University) and Eran Pichersky (University of Michigan) for valuable suggestions and comments; Lijun Chen (Michigan State University Research Technology Support Facility Mass Spectrometry and Metabolomics Core staff) for assistance with analyses performed on the Xevo G2-S QToF mass spectrometer; Dr. Joseph Chappell, Scott Kinison, and Yunsoo Yeo (University of Kentucky) for processing C. roseus tissues; Dr. Eve Wurtele, Manhoi Hur, Nick Ransom, and Luda Rizshsky (Iowa State University) for the development and organization of the online Plant Metabolomics Resource database that includes the C. roseus metabolome data set used in this report; and all of the National Institutes of Health Medicinal Plants Consortium participants for access to extracts of numerous medicinal plant tissues.

Received October 9, 2014; accepted February 5, 2015; published February 6, 2015.

LITERATURE CITED

Ajikumar PK, Tyo K, Carlsen S, Mucha O, Phon TH, Stephanopoulos G (2008) Terpenoids: opportunities for biosynthesis of natural product drugs using engineered microorganisms. Mol Pharm 5: 167–190

Challinor VL, De Voss JJ (2013) Open-chain steroidal glycosides, a diverse class of plant saponins. Nat Prod Rep 30: 429–454

Chang J, Xuan LJ, Xu YM, Zhang JS (2002) Cytotoxic terpenoid and immunosuppressive phenolic glycosides from the root bark of Dictamnus dasycarpus. Planta Med 68: 425–429

Chappell J (1995) Biochemistry and molecular biology of the isoprenoid biosynthetic pathway in plants. Annu Rev Plant Physiol Plant Mol Biol 46: 521–547

Coates RM, Denissen JF, Juvik JA, Babka BA (1988) Identification of alpha-santalenoic and endo-beta-bergamotenoic acids as moth oviposition stimulants from wild tomato leaves. J Org Chem 53: 2186–2192

da Silva VC, Giannini M, Carbone V, Piacente S, Pizza C, Bolzani VD, Lopes MN (2008) New antifungal terpenoid glycosides from Alibertia edulis (Rubiaceae). Helv Chim Acta 91: 1355–1362

Dembitsky VM (2006) Astonishing diversity of natural surfactants. 7. Biologically active hemi- and monoterpenoid glycosides. Lipids 41: 1–27

Ekanayaka EA, Li C, Jones AD (2014) Sesquiterpenoid glycosides from glandular trichomes of the wild tomato relative Solanum habrochaites. Phytochemistry 98: 223–231

El-Sayed M, Verpoorte R (2007) Catharanthus terpenoid indole alkaloids: biosynthesis and regulation. Phytochem Rev 6: 277–305

Fraga BM (2012) Natural sesquiterpenoids. Nat Prod Rep 29: 1334–1366

Ghosh B, Westbrook TC, Jones AD (2014) Comparative structural profiling of trichome specialized metabolites in tomato (Solanum lycopersicum) and S. habrochaites: acylsugar profiles revealed by UHPLC/MS and NMR. Metabolomics 10: 496–507


Goodger JQ, Woodrow IE (2011) a,b-Unsaturated monoterpene acid glucoseesters: structural diversity, bioactivities and functional roles. Phytochemistry72: 2259–2266

Guangzhi S, Zhi L, Xianggao L, Yinan Z, Jiyan W (2005) Isolation andidentification of two malonyl-ginsenosides from the fresh root of Panaxginseng. Chin J Anal Chem 33: 1783–1786

Heiling S, Schuman MC, Schoettner M, Mukerjee P, Berger B, SchneiderB, Jassbi AR, Baldwin IT (2010) Jasmonate and ppHsystemin regulatekey malonylation steps in the biosynthesis of 17-hydroxygeranyllinaloolditerpene glycosides, an abundant and effective direct defense againstherbivores in Nicotiana attenuata. Plant Cell 22: 273–292

Huhman DV, Sumner LW (2002) Metabolic profiling of saponins in Medicago sativa and Medicago truncatula using HPLC coupled to an electrospray ion-trap mass spectrometer. Phytochemistry 59: 347–360

Ji HF, Li XJ, Zhang HY (2009) Natural products and drug discovery: can thousands of years of ancient medical knowledge lead us to new and powerful drug combinations in the fight against cancer and dementia? EMBO Rep 10: 194–200

Kind T, Fiehn O (2007) Seven golden rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry. BMC Bioinformatics 8: 105

Lei Z, Huhman DV, Sumner LW (2011) Mass spectrometry strategies in metabolomics. J Biol Chem 286: 25435–25442

Li C, Schmidt A, Pichersky E, Shi F, Jones AD (2013) Identification of methylated flavonoid regioisomeric metabolites using enzymatic semisynthesis and liquid chromatography-tandem mass spectrometry. Metabolomics 9: 92–101

Ma YL, Li QM, Van den Heuvel H, Claeys M (1997) Characterization of flavone and flavonol aglycones by collision-induced dissociation tandem mass spectrometry. Rapid Commun Mass Spectrom 11: 1357–1364

Maier W, Peipp H, Schmidt J, Wray V, Strack D (1995) Levels of a terpenoid glycoside (blumenin) and cell wall-bound phenolics in some cereal mycorrhizas. Plant Physiol 109: 465–470

Miettinen K, Dong L, Navrot N, Schneider T, Burlat V, Pollier J, Woittiez L, van der Krol S, Lugan R, Ilc T, et al (2014) The seco-iridoid pathway from Catharanthus roseus. Nat Commun 5: 3606

Mizutani M, Ohta D (2010) Diversification of P450 genes during land plant evolution. Annu Rev Plant Biol 61: 291–315

O’Connor SE, Maresh JJ (2006) Chemistry and biology of monoterpene indole alkaloid biosynthesis. Nat Prod Rep 23: 532–547

Pfander H, Stoll H (1991) Terpenoid glycosides. Nat Prod Rep 8: 69–95

Ruan CC, Liu Z, Li X, Liu X, Wang LJ, Pan HY, Zheng YN, Sun GZ, Zhang YS, Zhang LX (2010) Isolation and characterization of a new ginsenoside from the fresh root of Panax ginseng. Molecules 15: 2319–2325

Sahu NP, Achari B (2001) Advances in structural determination of saponins and terpenoid glycosides. Curr Org Chem 5: 315–334

Stagliano MC, DeKeyser JG, Omiecinski CJ, Jones AD (2010) Bioassay-directed fractionation for discovery of bioactive neutral lipids guided by relative mass defect filtering and multiplexed collision-induced dissociation. Rapid Commun Mass Spectrom 24: 3578–3584

Sumner LW, Amberg A, Barrett D, Beale MH, Beger R, Daykin CA, Fan TW, Fiehn O, Goodacre R, Griffin JL, et al (2007) Proposed minimum reporting standards for chemical analysis. Metabolomics 3: 211–221

Sun LR, Yan J, Zhou L, Li ZR, Qiu MH (2011) Two new triterpene glycosides with monomethyl malonate groups from the rhizome of Cimifuga foetida L. Molecules 16: 5701–5708

Svoboda GH, Neuss N, Gorman M (1959) Alkaloids of Vinca rosea Linn. (Catharanthus roseus G. Don.). V. Preparation and characterization of alkaloids. J Am Pharm Assoc Am Pharm Assoc 48: 659–666

Ward JL, Baker JM, Llewellyn AM, Hawkins ND, Beale MH (2011) Metabolomic analysis of Arabidopsis reveals hemiterpenoid glycosides as products of a nitrate ion-regulated, carbon flux overflow. Proc Natl Acad Sci USA 108: 10762–10767

Wurtele ES, Chappell J, Jones AD, Celiz MD, Ransom N, Hur M, Rizshsky L, Crispin M, Dixon P, Liu J, et al (2012) Medicinal plants: a public resource for metabolomics and hypothesis development. Metabolites 2: 1031–1059

Zhou B, Xiao JF, Tuli L, Ressom HW (2012) LC-MS-based metabolomics. Mol Biosyst 8: 470–481

Zwenger S, Basu C (2008) Plant terpenoids: applications and future potentials. Biotechnol Mol Biol Rev 3: 001–007

