1-s2.0-S0168945212000830-main

18
Plant Science 191–192 (2012) 53–70 Contents lists available at SciVerse ScienceDirect Plant Science jo u rn al hom epa ge: www.elsevier.com/locate/plantsci Review Are we ready for genome-scale modeling in plants? Eva Collakova a,, Jiun Y. Yen b , Ryan S. Senger b a Department of Plant Pathology, Physiology, and Weed Science, 308 Latham Hall, Virginia Tech, Blacksburg, VA, USA b Biological Systems Engineering Department, Virginia Tech, Blacksburg, VA, USA a r t i c l e i n f o Article history: Received 26 January 2012 Received in revised form 17 April 2012 Accepted 18 April 2012 Available online 3 May 2012 Keywords: AraGEM C4GEM Flux balance analysis Genome-scale model Metabolic network reconstruction Photorespiration Photosynthesis Plant primary and secondary metabolism a b s t r a c t As it is becoming easier and faster to generate various types of high-throughput data, one would expect that by now we should have a comprehensive systems-level understanding of biology, biochemistry, and physiology at least in major prokaryotic and eukaryotic model systems. Despite the wealth of available data, we only get a glimpse of what is going on at the molecular level from the global perspective. The major reason is the high level of cellular complexity and our limited ability to identify all (or at least impor- tant) components and their interactions in virtually infinite number of internal and external conditions. Metabolism can be modeled mathematically by the use of genome-scale models (GEMs). GEMs are in silico metabolic flux models derived from available genome annotation. These models predict the combination of flux values of a defined metabolic network given the influence of internal and external signals. GEMs have been successfully implemented to model bacterial metabolism for over a decade. However, it was not until 2009 when the first GEM for Arabidopsis thaliana cell-suspension cultures was generated. Genome- scale modeling (“GEMing”) in plants brings new challenges primarily due to the missing components and complexity of plant cells represented by the existence of: (i) photosynthesis; (ii) compartmentation; (iii) variety of cell and tissue types; and (iv) diverse metabolic responses to environmental and developmental cues as well as pathogens, insects, and competing weeds. This review presents a critical discussion of the advantages of existing plant GEMs, while identifies key targets for future improvements. Plant GEMs tend to be accurate in predicting qualitative changes in selected aspects of central carbon metabolism, while secondary metabolism is largely neglected mainly due to the missing (unknown) genes and metabolites. As such, these models are suitable for exploring metabolism in plants grown in favorable conditions, but not in field-grown plants that have to cope with environmental changes in complex ecosystems. AraGEM is the first GEM describing a photosynthetic and photorespiring plant cell (Arabidopsis thaliana). We demonstrate the use of AraGEM given the current (limited) knowledge of plant metabolism and reveal the unexpected robustness of AraGEM by a series of in silico simulations. The major focus of these simulations is on the assessment of the: (i) network connectivity; (ii) influence of CO 2 and photon uptake rates on cellular growth rates and production of individual biomass components; and (iii) stability of plant central carbon metabolism with internal pH changes. © 2012 Elsevier Ireland Ltd. Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 1.1. Metabolic networks for GEMing are reconstructed from genome annotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 1.2. Constraining the model allows minimizing the number of possible flux solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 1.2.1. Stoichiometric constraints (mass balancing) assume no changes of fluxes with time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 1.2.2. Constraining the flux solution space minimizes the number of possible flux distribution solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 1.2.3. Objective functions are a mathematical representation of cellular “goals” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 1.3. GEMs can be used to predict metabolic phenotypes in microorganisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 2. Challenges associated with GEMing in plants revolve around the complexity of plant cell metabolism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 2.1. Lack of comprehensiveness in plant metabolic networks and missing components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 2.2. Compartmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Corresponding author. Tel.: +1 540 231 7867. E-mail address: [email protected] (E. Collakova). 0168-9452© 2012 Elsevier Ireland Ltd. http://dx.doi.org/10.1016/j.plantsci.2012.04.010 Open access under CC BY-NC-ND license. Open access under CC BY-NC-ND license.

description

adalaha gaya bahasa yang digunakan samurai

Transcript of 1-s2.0-S0168945212000830-main

Page 1: 1-s2.0-S0168945212000830-main

R

A

Ea

b

a

ARRAA

KACFGMPPP

C

0h

Plant Science 191– 192 (2012) 53– 70

Contents lists available at SciVerse ScienceDirect

Plant Science

jo u rn al hom epa ge: www.elsev ier .com/ locate /p lantsc i

eview

re we ready for genome-scale modeling in plants?

va Collakovaa,∗, Jiun Y. Yenb, Ryan S. Sengerb

Department of Plant Pathology, Physiology, and Weed Science, 308 Latham Hall, Virginia Tech, Blacksburg, VA, USABiological Systems Engineering Department, Virginia Tech, Blacksburg, VA, USA

r t i c l e i n f o

rticle history:eceived 26 January 2012eceived in revised form 17 April 2012ccepted 18 April 2012vailable online 3 May 2012

eywords:raGEM4GEMlux balance analysisenome-scale modeletabolic network reconstruction

hotorespirationhotosynthesislant primary and secondary metabolism

a b s t r a c t

As it is becoming easier and faster to generate various types of high-throughput data, one would expectthat by now we should have a comprehensive systems-level understanding of biology, biochemistry, andphysiology at least in major prokaryotic and eukaryotic model systems. Despite the wealth of availabledata, we only get a glimpse of what is going on at the molecular level from the global perspective. Themajor reason is the high level of cellular complexity and our limited ability to identify all (or at least impor-tant) components and their interactions in virtually infinite number of internal and external conditions.Metabolism can be modeled mathematically by the use of genome-scale models (GEMs). GEMs are in silicometabolic flux models derived from available genome annotation. These models predict the combinationof flux values of a defined metabolic network given the influence of internal and external signals. GEMshave been successfully implemented to model bacterial metabolism for over a decade. However, it was notuntil 2009 when the first GEM for Arabidopsis thaliana cell-suspension cultures was generated. Genome-scale modeling (“GEMing”) in plants brings new challenges primarily due to the missing components andcomplexity of plant cells represented by the existence of: (i) photosynthesis; (ii) compartmentation; (iii)variety of cell and tissue types; and (iv) diverse metabolic responses to environmental and developmentalcues as well as pathogens, insects, and competing weeds. This review presents a critical discussion of theadvantages of existing plant GEMs, while identifies key targets for future improvements. Plant GEMs tendto be accurate in predicting qualitative changes in selected aspects of central carbon metabolism, whilesecondary metabolism is largely neglected mainly due to the missing (unknown) genes and metabolites.As such, these models are suitable for exploring metabolism in plants grown in favorable conditions,but not in field-grown plants that have to cope with environmental changes in complex ecosystems.

AraGEM is the first GEM describing a photosynthetic and photorespiring plant cell (Arabidopsis thaliana).We demonstrate the use of AraGEM given the current (limited) knowledge of plant metabolism andreveal the unexpected robustness of AraGEM by a series of in silico simulations. The major focus of thesesimulations is on the assessment of the: (i) network connectivity; (ii) influence of CO2 and photon uptakerates on cellular growth rates and production of individual biomass components; and (iii) stability ofplant central carbon metabolism with internal pH changes.

© 2012 Elsevier Ireland Ltd.

ontents

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541.1. Metabolic networks for GEMing are reconstructed from genome annotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541.2. Constraining the model allows minimizing the number of possible flux solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

1.2.1. Stoichiometric constraints (mass balancing) assume no changes of fluxes with time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541.2.2. Constraining the flux solution space minimizes the number of possible flux distribution solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551.2.3. Objective functions are a mathematical representation of cellular “goals” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

1.3. GEMs can be used to predict metabolic phenotypes in microorganisms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Open access under CC BY-NC-ND license.

2. Challenges associated with GEMing in plants revolve around the comple2.1. Lack of comprehensiveness in plant metabolic networks and miss2.2. Compartmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

∗ Corresponding author. Tel.: +1 540 231 7867.E-mail address: [email protected] (E. Collakova).

168-9452© 2012 Elsevier Ireland Ltd. ttp://dx.doi.org/10.1016/j.plantsci.2012.04.010

Open access under CC BY-NC-ND license.

xity of plant cell metabolism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57ing components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Page 2: 1-s2.0-S0168945212000830-main

54 E. Collakova et al. / Plant Science 191– 192 (2012) 53– 70

2.3. Photosynthesis and photorespiration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582.4. Diversity of plant cell and tissue types and responses to the environment influence the objective functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3. GEMing in plants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593.1. Arabidopsis thaliana cell-suspension culture GEM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593.2. Seed GEMs in plants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

3.2.1. Compartmented barley seed endosperm flux balance analysis models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613.2.2. Compartmented flux balance analysis models of developing Brassica napus embryos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

3.3. AraGEM, an Arabidopsis flux balance analysis model representing a C3 plant cell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633.4. C4GEM, maize, sugarcane, and sorghum flux balance analysis models representing C4 plant cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633.5. Maize GEM iRS1563 representing C4 plants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

4. Evaluating AraGEM through simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645. How can GEMing in plants be improved? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666. Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 . . . . . .

1

fengm[scaic(ms[mt

1g

arrbgtbatardecesccGCatro

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. Introduction

Genome-scale models (GEMs) have proven useful in bacteriaor: (i) elucidating metabolism and physiology; (ii) assessing thessentiality of the metabolic steps; and (iii) for designing ratio-al metabolic engineering strategies [1–5]. The first GEM wasenerated in 1999 for Haemophilus influenza with the limited infor-ation about available genome annotation and metabolic data

6]. With the development and improvement of high-throughputequencing, proteomic, and metabolomic technologies, GEMs wereonstructed for many organisms including bacteria, cyanobacteria,nd fungi, animals, and plants [7–13]. The first GEMs generatedn plants are as follows: (i) Arabidopsis thaliana cell-suspensionulture GEM [14]; (ii) barley seed flux balance models [15,16];iii) AraGEM and C4GEM models for Arabidopsis and the monocots

aize, sugarcane, and sorghum, respectively [17,18]; (iv) rape-eed models [19–21]; and (v) updated AraGEM and C4GEM models22]. These models and challenges associated with genome-scaleodeling (“GEMing”) will be reviewed here after a brief introduc-

ion to generating GEMs.

.1. Metabolic networks for GEMing are reconstructed fromenome annotation

GEMs are in silico metabolic flux models derived from genomennotation that contain stoichiometry of all known metaboliceactions of an organism of interest [3,23]. Metabolic flux (totaleaction rate) is typically defined as the amount of a metaboliteeing converted by a particular enzyme per unit of time and perram (dry cell weight) of cells (or biomass). Each metabolic reac-ion in every pathway of a reconstructed network is representedy relevant metabolites (including cofactors) and an enzyme or

transporter encoded by the corresponding gene(s) present inhe genome of the organism. These gene-to-protein-to-reactionssociations are indispensable in deciding whether a particulareaction takes place in the organism of interest as well as vali-ating mutant predictions and implementing the actual metabolicngineering strategies [24]. Fully sequenced genomes, or at leastomprehensive high-quality transcriptomic data, are needed toxtract the information about the participating enzymes and, sub-equently, metabolic reactions and pathways. The databases mostommonly used to extract the metabolic information for spe-ific organisms include: (i) Kyoto Encyclopedia for Genes andenomes (KEGG); (ii) the SEED database; (iii) BioCyc and Ara-yc; (iv) Reactome; (v) TAIR; and (vi) the Biochemical Genetic

nd Genomic knowledgebase (BiGG) [25–31]. It is well establishedhat many inconsistencies exist among these databases, and theesulting network must be manually curated by an expert of therganism of interest. These irregularities include inaccurate gene

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

annotation, inconsistent metabolite names, and incomplete for-mulas as well as the presence of unbalanced reactions forwater, protons, high-energy cofactors, and monomers/polymers.However, several groups in the field are dedicated to build-ing consistency among databases and enabling fully automatedgenome-scale metabolic network reconstructions [32]. In the caseof plant cells, when compartmentation of metabolic pathwaysin individual organelles is desired, one can consult a numberof databases for the localization of the corresponding enzymesand their isoforms and seek transporters enabling metabolitetransport between organelles. These databases include: (i) SUBA;(ii) Plant Proteome Database (PPDB); (iii) AraPerox; and (iv)UniproKB/SwissProt [33–36].

1.2. Constraining the model allows minimizing the number ofpossible flux solutions

The reconstructed metabolic network and its reaction sto-ichiometry (stored in a stoichiometric matrix) extracted fromdatabases determine the possible routes of carbon flow and cofactorbalancing. This represents the first constraint, as metabolites canonly be converted along pathways specific to the studied system.Nevertheless, this model enables infinite possibilities for combina-tions of all allowed fluxes and needs to be further constrained tominimize the number of the possible flux distribution solutions.GEMs tend to be under-determined, constraint-based flux mod-els yielding multiple solutions to a single problem [37,38]. As withany under-determined system, one has to minimize the large num-ber of possible flux solutions that are allowed by the structure ofthe metabolic network to a few likely solutions. This goal can beachieved by using various constraints on flux values within individ-ual steps in metabolic pathways [39]. Examples of these constraints,which include mass balancing, “physico-thermo-chemical”, andactual flux measurements and are used to define a “window” ofpossible values a flux can have in the studied system, are brieflydescribed here. Detailed information on constraint-based modelingand its applications can be found elsewhere [5,10,39–43].

1.2.1. Stoichiometric constraints (mass balancing) assume nochanges of fluxes with time

During metabolism, concentrations of individual metaboliteschange with time t, which is represented by Eq. (1), where Xi isthe concentration of the metabolite M, Sij is a stoichiometric coeffi-cient matrix containing the stoichiometry of compounds i (placed

in rows) for each reaction j (placed in columns) in the metabolicnetwork. The parameter �j is a vector containing the flux val-ues through all reactions in the network [3]. The stoichiometricmatrix S contains all metabolites and reactions that ultimately are
Page 3: 1-s2.0-S0168945212000830-main

ience 191– 192 (2012) 53– 70 55

ri

asmTEs

S

tTwmttrsra[ftew(eatc“mrt

1o

iucremcandstoms([ssw[i

Fig. 1. Simplified GEMing workflow (modified from [47]). Step 1: Compartmentedmetabolic network is extracted from genome annotation and available metabolicand protein location databases. Metabolites are shown as circles of different colors;external substrates and biomass products are in gray and black, respectively, whilethe internal metabolites are in white. Enzymes E localized to different compartmentscarry out the reactions and transporters T enable metabolite uptake and transport.Stoichiometry and mass balancing S·� = 0 is applied. Step 2: Other constraints (Sec-tion 1.2.2) are implemented to minimize the phenotypic allowable solution space(shown as a heptahedron for all possible combinations of three fluxes �1,2,3) withinthe 3D flux vector space. Imagine that this space would be 15-dimensional for theexample of the network (for 15 fluxes represented by 8 enzymatic steps and 7transporters) and 1,798-dimensional for the updated AraGEM [22]. Step 3: Specific

E. Collakova et al. / Plant Sc

econstructed from the genome annotation of the studied organ-sm, as described above.

dXi

dt=

j

Sij · �j (1)

Because the rate of cell growth is much slower than the rate ofny enzymatic reaction, one may assume metabolic pseudo steady-tate for very short time scales such that the rate of change ofetabolite concentrations dXi/dt is assumed to be zero [3,24,44].

his is the basis for steady-state mass balancing represented byq. (2). This commonly termed “flux balance equation” is typicallyolved by linear programming software.

· � = 0 (2)

By applying the pseudo steady-state assumption, the differen-ial equation (Eq. (1)) is replaced with a linear equation (Eq. (2)).his was originally designed for bacteria growing in a chemostat,here the culture was believed to very nearly achieve steady-stateetabolism. Usefulness of the modeling strategy has resulted in

his approach being applied more broadly. However, for a GEM,here are usually more reactions than metabolites in S and weefer to such a system as under-determined [37,38]. The many fluxolutions to an under-determined metabolic network must be nar-owed down by introducing additional constraints to reduce thellowable flux solution space into a “phenotypic solution space”38]. An example of the phenotypic solution space is shown in Fig. 1or three fluxes as a heptahedron. In theory, this phenotypic solu-ion space contains all possible flux combinations that the cell couldver use. The particular combination of flux values that the cellill use is dependent on the influences of: (i) the cellular objective

e.g., maximizing growth rate); (ii) genetic regulation; and (iii) thenvironmental and developmental cues. The question we are reallysking is: What combination of the fluxes can be expected in a cellhat has adapted to the specific conditions (e.g., a nitrogen-starvedell)? In GEMing, knowledge of a “cellular objective” and realisticconstraints” are required to solve this problem by linear program-ing with Eq. (2). The topic of cellular objective functions will be

evisited after introducing other potential methods for constraininghe GEM.

.2.2. Constraining the flux solution space minimizes the numberf possible flux distribution solutions

The solution space can be further reduced in size by implement-ng “physico-thermo-chemical” constraints. The most commonlysed of these constraints include thermodynamic and enzymeapacity constraints [45–47]. Implementing thermodynamicsequires that each reaction is described by its standard Gibb’s freenergy of formation, which together with known (or assumed)etabolite concentrations, helps to determine the reversibility

onstraints for each reaction. Thus, thermodynamically unfavor-ble reactions are removed by constraining them to zero flux [47]. Aumber of reactions operate only in a single direction. For example,ecarboxylases tend to catalyze only the decarboxylation of theirubstrates at physiological conditions. In the case of the absence ofhe corresponding carboxylase in the studied organism, the rangef possible flux solutions can be restricted by constraining theinimal decarboxylase flux to zero. The maximal possible flux con-

traints can be set by using maximum capacity for a specific enzyme�max) if known from in vitro or in vivo enzyme activity studies24]. Maximum flux constraints can also be based on measuredubstrate uptake rates, which directly relate to the maximal pos-

ible biomass formed in a growing cell [24]. More sophisticatedays to constrain the model include proton and cofactor balancing

24,48] and integrating a variety of high-throughput data throughntegrative metabolic omic analysis [49]. The resulting phenotypic

objective functions (Section 1.2.3) are minimized or maximized to obtain solutionsthat lie within the phenotypic space (a single solution with flux values of, e.g., 1, 2,and 0.5 mmol g−1 dry weight h−1 for �1, �2, and �3, respectively, is shown as a circle).

solution space can ideally be further explored to identify a singleflux distribution (or a family of fluxes) to describe the relationshipsamong competing metabolic pathways or between metabolism andthe cellular environment. Method development for extracting themost useful flux distributions from the phenotypic space remainsan important area of systems biology research.

1.2.3. Objective functions are a mathematical representation ofcellular “goals”

As mentioned above, the phenotypic solution space containsa very large number of possible flux combinations. Each reac-tion is represented by a range of flux values (from minimum tomaximum) that the reaction could ever have in the reconstructedmetabolic network. The actual solution (a combination of valuesfor all fluxes in the network, or what modelers like to call “fluxdistribution”) is dependent on the “goal” of the cell. This goal iscommonly described by one or more objective functions whenthe flux balance equation is solved by linear programming [38].Anthropomorphically oriented cellular goals are only a mathemat-ical convenience of capturing the influence of the environmentor genotype on the flux phenotype for modeling purposes and

should be viewed as such in this review. The set of flux valueswill be different in the cell that, for example, operates to grow asfast as possible from those in a cell that operates to generate andstore chemical energy and this is reflected in objective functions.
Page 4: 1-s2.0-S0168945212000830-main

5 ience

I(p(pt(mpgtitctm

nnapscfmatsipodctb

m[mnoeeemusecg

MsdimasccwmMfSt

6 E. Collakova et al. / Plant Sc

n the bacterial world, these objective functions typically include:i) regulating growth (minimizing or usually maximizing biomassroduction); (ii) increasing the production of desired metabolitesmaximizing fluxes through particular pathways); (iii) minimizingroduction of undesired metabolites (minimizing fluxes throughhe relevant pathway(s)); and/or (iv) chemical energy productione.g., ATP or reductant) [1,3,38,50]. The rationale behind maxi-

izing the biomass production is based on the natural selectionrocess. Bacteria growing under favorable growth conditions willrow and divide as fast as possible to out-compete any other poten-ially present bacterial species and become the dominant speciesn that environment. On the other hand, many bacteria are knowno grow slowly and export antimicrobial agents as a mechanism ofompetition. For chemostat-based GEMing studies when competi-ion is not an issue, the most commonly used objective function is

aximizing the biomass production.The production of existing (e.g., antibiotics, solvents, etc.) or

ovel metabolites (e.g., by metabolic engineering) is usually con-ected with low growth rates. In that case, the cellular goal becomes

bi-level optimization, maximizing the growth rate as well as theroduction of a target metabolite. Some highly reduced metabolitesuch as fatty acids found in oil and membranes require significanthemical energy in the form of ATP and reductant (e.g., NADPH)or their synthesis. If the goal is to produce high levels of these

etabolites then maximizing the fluxes through the pathways thatre known to produce these high-energy cofactors will also needo be considered as another objective function. It is apparent thatome objective functions need to be maximized, while others min-mized to obtain a desired metabolic phenotype. Regardless, just asrotein folding is a thermodynamically driven process, distributionf metabolic flux throughout a network is also thermodynamicallyriven. Researchers in systems biology are still looking for theseonnections. Until found, the use of multiple objective functionso describe cellular function is the best approach to study multipleehaviors in the cell.

From the mathematical perspective, flux balance analysis is theost popular method to explore the phenotypic solution space

1,3,38]. This approach uses linear programming (with Eq. (2)) toinimize or maximize the objective function(s) within the phe-

otypic solution space to find the flux solution(s). Because thebjective function is either minimized or maximized, it representsxtremes at either end of the average (possibly physiologically rel-vant) objective function. As a result, the solution usually lies at thedges of the phenotypic solution space (Step 3 in Fig. 1). There isuch debate about how well this represents an in vivo scenario. The

se of multiple objective functions is needed to model for instancetressful or nutrient-limited conditions that are not represented byxtreme objective functions, which currently represents a majorhallenge of GEMing. Much research is also in progress to build-inenetic regulation to this approach.

Modeling (including linear programming) is typically done inATLAB using the COBRA toolbox that has been developed as a

hared resource with numerous useful tools for GEMing [51,52]. Asiscussed above, due to the large phenotypic solution space, there

s usually more than one solution, but linear programming and theajority of flux balance analysis algorithms commonly only return

single set of solutions [2]. From all possible solutions, the one thathows the smallest value for the sum of all fluxes will be typicallyhosen to represent the specific objective function(s) [53]. Thishoice is based on the assumption that cells will conserve resourceshen it comes to nutrients and energy and will, therefore, likelyinimize the total carbon flux through the metabolic network [54].

uch research has been performed to identify suitable objective

unctions in microbes. One such example is the Biological Objectiveolution Search algorithm [55], which relies on experimental datao locate a suitable objective for the network. With a description of

191– 192 (2012) 53– 70

GEMs and their representation by flux values established in thesesections, the next sections describe the applications of GEMs.

1.3. GEMs can be used to predict metabolic phenotypes inmicroorganisms

GEMs provide information about the distribution of fluxesin the genome-scale metabolic network for: (i) a specific geno-type; (ii) a developmental stage governed by regulatory factors;and (iii) environmental conditions. These are built into the GEMthrough the stoichiometric matrix, constraints, and chosen objec-tive functions. Historically, GEMs have been said to provide theconnection between the genotype and/or environmental influencesand resulting metabolic phenotypes, represented by fluxes. Gener-ating genetic mutants in silico along with implementing diverseobjective functions through flux balance analysis and comparingthe resulting GEMs offer an insight into the changes of fluxes withinthe metabolic network [42]. This type of in silico enzymatic reac-tion (‘gene’) knockout analysis is also useful in identifying essentialmetabolic reactions and genes by constraining the flux throughthe candidate reaction to zero to address the question of whetherall biomass components are still capable of being produced bythe metabolic network [56–58]. The reaction (and correspondinggene(s) encoding enzyme(s) that catalyze it) will be consideredessential if other reactions of the network cannot synthesize thecomponents of biomass. The wild type and knockout mutants canthen be compared in effort to recognize metabolic re-programmingthat occurs in response to genetic or environmental perturbations.Of course, correct gene-to-protein-to-reaction associations [24] areneeded for validation purposes, specifically to genetically mod-ify expression of relevant genes targeted for mutant analysis andmetabolic engineering.

Several more appropriate optimization methods in addition toflux balance analysis have been used to predict the metabolicphenotype in perturbed cells. Two most commonly used meth-ods include the “minimization of metabolic adjustment” [59] andthe “regulatory on/off minimization” [60] algorithms. Minimiza-tion of metabolic adjustment uses linear (as flux balance analysis)or quadratic optimization and the assumption that the affected cellwill prefer to minimize the effect of the genetic or environmentalperturbation. This is reflected as the smallest possible change in thetotal flux between the wild type and perturbed cells [59]. The regu-latory on/off minimization algorithm is similar to minimization ofmetabolic adjustment except that instead of the minimization ofthe total flux value, the total number of altered fluxes is minimized[60]. The rationale behind regulatory on/off minimization is basedon the assumption that the cell will dump the extra carbon to shortas opposed to long alternate pathways as a metabolic adaptationto the specific perturbation. Minimization of metabolic adjust-ment and regulatory on/off minimization are based on attractiveassumptions that coincide with the idea of conservation of cellularresources committed to metabolism, while achieving the cellularobjective. Minimization of metabolic adjustment tends to be moreaccurate for an initial mutant (before strain evolution), while fluxbalance analysis and regulatory on/off minimization were shownto be suitable for an adapted mutant [60].

GEMs are invaluable tools in rational metabolic engineering inmicroorganisms. So how can GEMs be used to predict genetic mod-ifications to obtain a desired metabolic phenotype (e.g., to producelarge amounts of a highly valuable metabolite) in a systematicway without knowing which genes to knockout? The OptKnock,OptFlux, and OptORF approaches (and their improved versions)

are three commonly used algorithms that perform systematicgene knockout simulations with GEMs, given specific conditions,to enhance production of specific metabolites [39,61–63]. Thesemethods were useful in increasing the production of lactate in
Page 5: 1-s2.0-S0168945212000830-main

ience

Eemstt

2a

2m

etaptovw“risvs

oiw3s[1tn3tifagmmrewtghbfPt

Cmliprip

E. Collakova et al. / Plant Sc

scherichia coli K-12, succinate in Mannheimia succiniproducens, andthanol in Saccharomyces cerevisiae [64]. Efforts have also beenade to model the production of valuable chemicals during the

tationary phase of culture growth by instituting additional objec-ive functions and minimizing the metabolic adjustment betweenhe exponential and stationary growth phases [65].

. Challenges associated with GEMing in plants revolveround the complexity of plant cell metabolism

.1. Lack of comprehensiveness in plant metabolic networks andissing components

Ideally, a GEM should contain all possible metabolites andnzymes organized in interconnected pathways that can exist inhe given organism, based on its genome sequence. This can be

challenge especially for complex eukaryotic organisms, such aslants, when the function of most proteins remains unknown. Tryo imagine that the task is to put together a functional car, for whichne has all parts, but limited knowledge about the function of indi-idual parts and how they assemble and function together. Thoseith limited knowledge may still have a pretty good idea about the

engine, fuel injection, and transmission systems” of the plant cellepresented by the central carbon metabolism. But, this knowledges probably not so great regarding the obscure parts representingecondary metabolism. In general, we could assemble a workingehicle, but we would not know if we had a sedan, a truck, or achool bus.

The E. coli K-12 genome contains 4,290 predicted protein-codingpen reading frames, of which 1,366 (32%) have been includedn a comprehensive genome-scale reconstruction metabolic net-

ork iJO1366 [12]. Of the 1,366 metabolism-related genes, only8 were inferred through bioinformatics. Conversely, Arabidop-is is predicted to have 28,775 genes as of September 11, 201129]. Molecular function was experimentally determined for only3% of all predicted genes, 44% of gene functions were compu-ationally predicted, 30% are of unknown function, and 12% areot annotated [29]. The maize genome is predicted to contain2,540 genes and 53,764 transcripts [66]. For about 42% of theotal maize genes there is very little or no functional annotationnformation available [22]. The high percentage of plant genes withunction inferred only through bioinformatics has raised concernsbout mis-annotation. The problem with mis-annotation and theenes with unknown function relates mostly to plant secondaryetabolism and transporters. Here, many enzymes and the inter-ediates of specialized metabolic pathways as well as transporters

emain uncharacterized regarding their substrate specificities. Forxample, methyltransferases involved in tocopherol biosynthesisere originally mis-annotated as sterol methyltransferases until

heir true function was confirmed by a combination of reverseenetics and biochemical characterization [67]. In some cases,omologues in different organisms have different functions. Inacteria, PurU (N5-formyl tetrahydrofolate deformylase) providesormyl groups for purine biosynthesis [68], but two ArabidopsisurU homologues are involved in photorespiration and do not seemo have anything to do with purine biosynthesis [69].

The situation gets even more complicated with metabolites.ollectively, plants produce over 200,000 primary and secondaryetabolites, but structures of only about 10–20% of these metabo-

ites are known [70,71]. The collection of all metabolites in a cells referred to as the metabolome [72–74]. Obviously, only some

athways are active under specific conditions at any given time,esulting in the production of a portion of all possible metabolitesn specific cells/tissues, which exacerbates the efforts to capture thelant metabolome. As such, the metabolic networks extracted from

191– 192 (2012) 53– 70 57

fully sequenced plant genomes and metabolic network databaseswill have many gaps and errors. Automated gap-filling algorithmssuch as GapFind and GapFill can be implemented [75]. However,these must be and often are not used with caution. There areinstances when a particular step or a pathway is present in oneorganism but not the other, or when completely different pathwaysin different organisms can produce the same metabolite.

For example, plants use homoserine kinase that provides O-phosphohomoserine for the biosynthesis of both threonine andmethionine [76]. In bacteria, threonine is made from homoserinethe same way as in plants, but methionine is made using pathwaysinvolving succinylated or acetylated as opposed to the phospho-rylated forms of homoserine [77]. Threonine catabolism involvesthreonine dehydrogenase in microorganisms and animals [78,79].However, the activity and roles of putative plant threonine dehy-drogenases in threonine degradation remain to be demonstratedexperimentally [76]. This problem relates to differences betweenprokaryotic and eukaryotic organisms and also to organisms inthe same taxa. Unlike other bacteria and plants, Clostridium aceto-butylicum is a fermentative microorganism that has an incompleteTCA cycle and does not use oxidative pentose phosphate pathway[80,81]. However, both initial drafts of the flux balance analysismodel for this organism had different versions of the TCA cycle thatalso differed from what was eventually found through 13C-basedsteady-state metabolic flux analysis [48,65,81,82]. Metabolic net-works that are extracted from databases using bioinformatics toolsand then used for GEMing should still be curated by an expert andsupplemented with biochemical data where necessary.

2.2. Compartmentation

Central carbon metabolism in bacteria occurs inside the cytosol,so many bacterial GEMs include only ‘extracellular’ and ‘intra-cellular’ compartments. However, microbial GEMs that consideradditional compartments, such as the ‘periplasm’ of Gram negativebacteria, do exist [12,58,83]. Minimizing compartments is preferredto the mathematical modeling aspect. From the modeling perspec-tive, the extracellular metabolites include substrates and productsof metabolism as well as some transported cofactors. The intra-cellular compounds represent the intermediates of metabolism.The substrates can originate from the extracellular compartment inabundance and are provided to the cell through transporters oftenconstrained to a desired flux. In the case of multicellular plants,the substrates are produced elsewhere (e.g., photosynthetic assim-ilate or CO2 from the environment) and supplied to the sink cells.The products must accumulate and constitute cellular biomass orare excreted to the extracellular medium. Internal metabolites areassociated with at least one reaction that produces them and at leastone reaction that consumes them for those reactions to receive flux.All metabolites transported between the extracellular and intra-cellular compartments must be accompanied by specific transportreactions that supply or remove metabolites from the extracellu-lar environment. Identifying relevant transporters and determiningtheir substrate specificities remains of the major challenges inGEMing.

Plant cells have several compartments defined by organellesthat separate different metabolic processes and provide variableconditions for enzymes. Many plant pathways of central carbonmetabolism are duplicated, which is represented by the presenceof differentially localized isoforms of the same enzymatic activi-ties in proteomes from individual cellular compartments. In manydifferent plant cell types, glycolytic enzymes are present in both

cytosol and chloroplasts [84–87]. The existence of different com-partments and enzyme isoforms in plant cells exacerbates GEMingin terms of metabolic network reconstruction and the actual mod-eling. First, plant metabolic networks are more complex and fluxes
Page 6: 1-s2.0-S0168945212000830-main

5 ience

twtisamipsptemnm

ccdsmPtrnbiibo[aisitlfttlicaoula

2

ctTombacNocpr

8 E. Collakova et al. / Plant Sc

hrough duplicated (and this is true for any set of parallel path-ays whether duplicated or alternate) can be dissected only when

here are differences in the production or consumption of chem-cal energy between the parallel pathways. Second, deciding thepecific compartment for the particular isoform is cumbersomes there is only bioinformatic localization prediction available forany proteins. Ideally, direct experimental evidence (e.g., protein

mport and/or co-localization studies), should be combined withroteomic studies for the specific cell/tissue/organ type and (if pos-ible) at relevant developmental stages and conditions. With theroteomic studies, caution is advised, as many cytosolic proteinsend to be associated with individual organelles. Many glycolyticnzymes in the cytosol are bound to the outside membranes ofitochondria and chloroplasts probably enabling metabolite chan-

eling and transport [88] and could be misinterpreted as beingitochondrial and plastidic.Third, the cytosol and different organelles provide different

onditions for metabolism in terms of: (i) pH; (ii) salt con-entrations; and (iii) energy/redox status. Not considering theifferences in environments in diverse compartments also haserious implications when thermodynamic constraints are imple-ented to narrow down possible flux distribution solutions.

ossible reversibility of a reaction depends on thermodynamics andhe concentration of participating metabolites and cofactors. Cur-ent methods of isolating organelles for metabolomic purposes areot reliable [89], though non-aqueous fractionation can potentiallye useful [90]. The major problem of non-aqueous fractionation

s associated with the inability to separate metabolites presentn different organelles to discrete regions, but this problem cane overcome by implementing marker molecules for differentrganelles in combination with novel computational approaches90]. Other novel approaches using laser-capture microdissectionnd pressure catapulting coupled with high spatial resolution imag-ng mass spectrometry (referred to as the “mass microscope”)how a lot of promise for measuring levels of major metabolitesn specific cell types and even large organelles [91,92]. However,hese approaches still need significant improvements. The reso-ution (10 �m) of this “mass microscope” is too low to be usefulor distinguishing metabolites in cytosol, mitochondria, and plas-ids. Sensitivity is also low, as only major metabolites present athe levels of about 1% of dry weight can be detected [92]. Metabo-ite charge depends on its pKa and pH of the environment and itnfluences enzyme activity. Considering pH changes in differentompartments is useful for proton balancing. Thermodynamic bal-ncing constraints have not yet been applied to plant GEMs andnly one has included proton balancing [22]. Currently, research isnderway to allow GEMing to aid in the quantification of metabo-

ite fluxes between different tissues. These obstacles should beddressed as additional models are created.

.3. Photosynthesis and photorespiration

The existence of photosynthesis and photorespirationontributes to the complexity of plant metabolic networks. Pho-osynthetic rates can be measured under specific conditions [93].hese measurements can then be used to constrain the flux modelr used to validate GEM predictions. However, in some recalcitrantodel systems, including developing seeds, photosynthesis cannot

e easily measured, and the validation of flux models is not easilychieved in these systems [94]. When modeling photosyntheticells, the light reactions produce chemical energy in the form ofADPH and ATP from splitting water, while the dark reactions

f the Calvin–Benson–Bassham cycle consume these high-energyofactors to ultimately produce sucrose. The linear electron trans-ort chain produces both NADPH and ATP, while the dark reactionsequire more ATP than NADPH. To balance the NADPH and ATP

191– 192 (2012) 53– 70

production, ferredoxin will transfer electrons back to photosystemI (through the cytochrome bf complex and plastocyanin) ratherthan to NADP reductase in the alternate cyclic electron transportand only ATP is produced [95]. Both cyclic and linear electrontransport chains must be included in the model and the relativefluxes through these routes depend on ATP and NADPH demand bythe dark reactions and other pathways. RubisCO can also fix O2 inC3 plants and photorespiration should be included. Photorespira-tion is minimized or excluded in C4 plants and the relevant C4 cycleinvolving phosphoenolpyruvate carboxylase in the mesophyll cellsis included in the model. Obviously, photosynthetic fluxes must beconstrained to zero in non-photosynthetic plant cells.

Significant modeling challenges exist involving the changesin central carbon metabolism of photosynthetic organisms dur-ing the diurnal and circadian rhythms. Many models have aimedto predict light-dependent photosynthetic metabolism, but in allcases biomass composition was used for plants grown in stan-dard light/dark regimes. As such, these GEMs may represent anaverage of mixed metabolism (represented by average flux val-ues) over a long period of time during biomass accumulation.A GEM provides a combination of static flux values that wouldexplain the resulting biomass composition, given all implementedconstraints and objective functions. The clashing of two differenttypes of metabolic activities that are then averaged leads to a fail-ure to adhere to the metabolic steady-state assumption neededfor flux balance analysis. Different pathways will be active underlight compared to dark conditions. Under light, photosystems willbe active, generating chemical energy used for CO2 fixation inthe Calvin–Benson–Bassham cycle. Sucrose and starch are made.In the dark, the light reactions will no longer be active, chang-ing the redox status of the cell. The chemical energy for theCalvin–Benson–Bassham cycle has to come from other sources, e.g.,respiration, in the dark. Dynamic flux balance analysis or moresophisticated types of kinetic modeling provide a way to modelchanges in fluxes [53,96]. Biomass and internal metabolites can bemeasured at short intervals during a time course involving diurnalor circadian metabolite changes for dynamic flux balance analysis.Metabolite levels and fluxes are assumed not to change dramati-cally over a short period of time of the measurements, conformingto the steady-state assumption. Subsequently, the measurementsfor each time point are used to model fluxes individually by fluxbalance analysis to obtain a view of dynamic change in fluxes in nor-mally non-stationary model systems. Additional GEMing tools thatwill aid this problem are new unpublished methods that removethe dependency of the biomass composition. These approachesremain in their infancy and are not discussed further in this review.

2.4. Diversity of plant cell and tissue types and responses to theenvironment influence the objective functions

Plants are sessile organisms that have evolved numerous sophis-ticated mechanisms in response to environmental changes. Thesemechanisms involve: (i) sensing specific types of biotic and abi-otic stimuli; (ii) transducing relevant signals; and (iii) turningon/off genes involved in the actual response to the changes in theenvironment. These responses include various types of metabolicand physiological adaptation processes, from the molecular to thewhole plant and even ecosystem level. At the same time, the plantfollows its own developmental program that is also tightly con-nected to changes in physiology and metabolism in specific organsand cell and tissue types. From the GEMing perspective, this trans-lates into an enormous complexity of possible metabolic networks

and flux combinations in these networks, as the responses to devel-opmental and environmental changes are associated with globalmetabolic reprogramming. Available high-throughput transcrip-tomic, proteomic, and metabolomic data are needed to refine the
Page 7: 1-s2.0-S0168945212000830-main

ience

nt(

wMinltptSsitpiiHgm

amcdsamwwairwfdmwpa

3

(etoop(iombon

3

Ti

E. Collakova et al. / Plant Sc

etwork after reconstructing it from genome annotation to ensurehat all enzymes and metabolites relevant to the model systemspecific cell types under defined conditions) are present.

Such complexity leads to an inherent problem associatedith GEMing, the selection of appropriate objective functions.inimization of metabolic adjustment and regulatory on/off min-

mization are algorithms designed to minimize total flux andumber of fluxes, respectively, and eliminate redundant paral-

el pathways. Plant transcriptomes and proteomes often revealhat the enzymes of the redundant pathways are still present androbing metabolism with 13C-baleled substrates demonstrates thathere is indeed carbon flowing through these pathways [33,97–99].o, what is the cause for this discrepancy? Does a cell not try to con-erve resources by minimizing fluxes? It probably does, but GEMingnvolves very simple and limited number of major objective func-ions. Other possible cellular goals are not considered and manyathways will not be ‘needed’ in the model (but needed in real-

ty) when these goals are not included. Bacteria are much simplern that sense, though they also respond to environmental changes.owever, from the metabolic engineering perspective, bacteria arerown under optimized controlled growth conditions that maxi-ize the production of the target metabolite.What about plants? Given the existing toolsets, it is conceiv-

ble that GEMs can be used to predict phenotypes and designetabolic engineering strategies for specific cell type and growth

onditions. However, there is no guarantee that these specific con-itions will be met in the field. In fact, plants growing in the field willimultaneously experience numerous abiotic and biotic stressesnd will adapt accordingly. The cellular objectives become muchore complicated than maximizing growth. Constantly changingeather and plant–plant, –pathogen, and –herbivore interactionsill cause up-regulation of various metabolic and physiological

daptations and defense mechanisms in field-grown plants. Signif-cant cellular resources in terms of carbon and energy sources areequired for these mechanisms to operate. As such, substantial fluxill be redirected from the central carbon metabolism intended

or growth to the pathways of secondary metabolism for the pro-uction of defense metabolites. Limited information on secondaryetabolism in a plant GEM will diminish its predictive capabilitieshen plant–environment and plant–pathogen interactions are inlay. Clearly this is an area that will benefit from incorporation ofdvanced experimental research into the systems biology.

. GEMing in plants

It is clear that GEMing in plants is extremely challenging, but as:i) the high-throughput technologies are improved; (ii) novel onesmerge; and (iii) the missing parts of metabolic networks are iden-ified, it will be possible to gather a systems level of understandingf metabolic processes. Currently, there are several GEMs devel-ped for plants. These models adequately describe the primarilyhotosynthetic or non-photosynthetic central carbon metabolismTable 1). While secondary metabolism is included in these models,t must be largely neglected in flux calculations due to unavailabilityf complete metabolomic and proteomic data. However, all theseodels were able to correctly predict certain aspects of central car-

on metabolism in specific cell types. This leads to the questionf how much information about the metabolic network is reallyeeded to be able to predict specific phenotypes.

.1. Arabidopsis thaliana cell-suspension culture GEM

Cell-suspension cultures are a useful tool in plant research.hough not photosynthetic, they have been used as an approx-mation of a heterotrophic plant cell (e.g., root cells) with the

191– 192 (2012) 53– 70 59

assumption that at least some aspects of central carbon metabolismwill be similar between cell cultures and diverse specialized rootcell types [100,101]. Clearly, the diversity of non-photosyntheticplant cells cannot easily be captured with simple cell culture mod-els. Such models should be viewed as a stepping stone for moreappropriate models using specific/relevant constraints and objec-tive functions applicable to plant metabolism in specialized cells.They should also be viewed more as models demonstrating theapplication of modeling approaches to plant cells rather than pro-viding meaningful metabolic and mutant predictions. The modelingapproaches used to generate the first plant cell-culture model [14]were similar to those in bacteria. As with bacteria, using cultures ofhomogeneous plant cells clearly offers advantages when it comesto simplifications of the models and the ability to use a number ofassumptions that greatly simplify the modeling process. Not deal-ing with: (i) diverse cell types; (ii) photosynthesis; and (iii) theneed for compartments in bacterial cells provide an obvious advan-tage, but at the same time, this diminishes the relevancy and thepredictive power of the analogous plant models.

The Arabidopsis cell-culture GEM is a minimization-of-metabolic-adjustment-like model based on linear programmingwith the minimization of the total flux as an objective function[14]. Organellar separation of metabolic pathways is not includedin this model. The metabolic network for this GEM was extractedfrom AraCyc [30] using ScrumPy [102]. The network was carefullycurated to the final version of 855 reactions that were mass bal-anced for major atoms (each reaction has the same number of eachatom on both sides). Unbalanced reactions in the model wouldupset the fundamental laws of mass and energy conservation. Ara-Cyc is a useful tool for most applications, and improvements areanticipated for its use with GEMing. However, it contains errors informulas as well as orphan reactions and metabolites, which pre-vents accurate atom balancing of the model; nearly 700 AraCycreactions were unbalanced for protons, of which over 370 reactionsalso did not have the correct oxygen balancing [14].

Constraints included the stoichiometric mass balancing and thephysiologically relevant biomass composition. This was based onexperimental measurements of individual major biomass compo-nents in cells growing on glucose as the sole carbon source. Solublemetabolites, salts, and CO2 were not accounted for in the totalbiomass. While the soluble metabolites are difficult to quantify (andindividually they do not contribute much to the final biomass), CO2lost during metabolism contributes significantly to the biomassespecially in heavily respiring heterotrophic cells. The total netflux of CO2 production can be easily measured [103] and used asan additional constraint. Nevertheless, this setup is rather cleverbecause rather than maximizing the biomass production as in mostbacterial studies, specific biomass production with the correct com-position [50] is defined based on the typical growth of real plant cellcultures under standard growth conditions. Specific biomass infor-mation including composition is mathematically expressed in socalled “biomass constituting equation”, representing a sum of theamounts of all significant individual components constituting thebiomass.

The goal of this study was to interrogate the phenotypic solutionspace. In other words, the authors sought to ascertain which fluxesand by how much will these fluxes change in cells existing at vari-ous energy states (cells with different ATP availabilities) driven bygrowth conditions. It is interesting to note that only 232 reactionswere active and only 42 reactions responded to systematic alter-ations in ATP demands [14]. As expected, the responsive reactionswere all related to the production or consumption of chemical

energy. Minimization of the total flux coincides with the objectiveof minimizing the number of active routes (fluxes), which couldexplain the low number of reactions carrying fluxes [14]. The objec-tive function used here prevented the use of most reactions (they
Page 8: 1-s2.0-S0168945212000830-main

60 E. Collakova et al. / Plant Science 191– 192 (2012) 53– 70

Table 1Summary of plant flux balance analysis models. All models presented here used biomass composition measurements as constraints and flux balance analysis except for thebna572, which is a flux variability analysis model* [19,20]. Photosynthetic models assume that the ratio of RubisCO CO2 fixation and oxygenation is 3:1. Most of these modelsused physiologically irrelevant, extreme objective functions because the real ones are not known. With the exception of the updated AraGEM and C4GEM [22], secondarymetabolism is not included.** denotes corresponding number of reactions and metabolites in the updated AraGEM model.

Models ReactionsMetabolites

Objective functions Biological question Comments

Arabidopsis – heterotrophic cellsuspension culture [14]

1,4061,253

• Minimize total flux Assessing changes in metabolismat different energy states in aheterotrophic cell

• Objective function prevented manyredundant reactions from being used• No compartments

Barley – photoheterotrophic andheterotrophic cereal endosperm[15,16]

257234

• Maximize growth (linearoptimization)• Minimize total flux(quadratic optimization)

Estimating the influence ofhypoxia and aerobic conditions onmetabolic fluxes

• Small compartmented tissue-specificmodel (65 transporters)• 2nd model – improved with theactual differential O2 and metabolitemeasurements within the endosperm

Rape – photoheterotrophicmetabolism in developing oilseed [21]

313262

• Actual substrate uptake andbiomass production ratesbased on high-qualitymeasurements from theliterature

Predicting metabolic regulationand mutant phenotypes related tooil accumulation

• Flux balance analysis used in acombination with principal componentanalysis to minimize the number ofsolutions (objective functions werephysiologically relevant)• Regulation of oil biosynthesispredicted

Rape – bna572, photoheterotrophicmetabolism in developing oilseed*[19,20]

572376

• Flux balance: minimizesubstrate uptake rates(photons and carbon andnitrogen sources)• Flux variability: fixedobjective function, thenminimize and maximize eachreaction

Exploring phenotypic solutionspace with flux variability analysisin search for alternate activepathways in developing rapeseedembryos

• Highly comprehensive andcompartmented oilseed model• Flux variability along with in silicomutant analysis – exploration ofphenotypic solution space obtainedfrom flux balance analysis and the useof alternate pathways in relation tocarbon use efficiency• Some fluxes in the bna572 modelwere comparable to the fluxes in thecorresponding 13C metabolic fluxanalysis model• Flux balance and variability analyses– unable to select less efficientalternate pathways that are known tobe active in vivo

Arabidopsis – AraGEM (iRS1597, anupdated AraGEM**) – C3

photosynthesis, photorespiration, andheterotrophic metabolism inmesophyll cells [18,22]

1,567(1,798**)1,748(1,820**)

• Minimize substrate uptakerates (photons for C3 andsucrose for heterotrophicmetabolism)

Assessing metabolic changes in C3

vs. photosynthetic andheterotrophic metabolism,respectively; predicting sources ofenergy in mesophyll cells

• Compartmented model of C3

photosynthesis• Good qualitative model predictingexpected metabolic changes

Maize, sorghum, sugar cane – C4GEM –C4 photosynthesis in mesophyll andbundle sheath cells [17]

1,5881,755

• As in AraGEM As in AraGEM, but comparingfluxes in 2 cell types anddistinguishing different types of C4

photosynthesis

• Compartmented 2-cell model of C4

photosynthesis containing 112transporters• Fluxes in photosystem I and II weredistinguished in 2 cell types• Detailed analysis of high-energycofactor requirements in differentmodes of C4 photosynthesis

Maize – iRS1563, an updated C4GEM[22]

1,9851,825

• Maximize biomassproduction

Predicting metabolic phenotypesin known lignin biosynthesismutants and assessing the effectsof changes in cell wall composition

• Expanded C4GEM specific to maize• Known aspects of secondarymetabolism included• Photon utilization and CO2 uptake –

ciurbsocflglTi

arried zero flux). In reality, there is much metabolic redundancyn plant central carbon metabolism and plant cells simultaneouslyse these redundant and parallel pathways. Carbon can easily beedirected from one pathway to the other if the main route islocked, resulting in the same amount of cell biomass and compo-ition. The flux solutions obtained in [14] were biased by the usef the highly limiting objective function and the question is howomparable these modeled fluxes are to the situation in vivo. Twoux maps were generated for Arabidopsis cell-suspension cultures

rown under heterotrophic conditions at two different oxygenevels [99] using 13C-based steady-state metabolic flux analysis.his procedure involves a combination of measurements andterative modeling [104,105]. Accumulation of individual biomass

on biomass production not constrained• Validation provided by existing dataon lignin biosynthetic mutants

components and substrate uptake rate measurements are used toconstrain fluxes. Tracking the carbon from substrates through adefined metabolic network to final products of metabolism using13C techniques enables one to discern which pathways are activeand, in some cases, even dissection of the forward and reverse fluxesthrough reversible reactions [104,106]. Metabolic flux analysisshowed that under both normal- and high-oxygen conditions, bothglycolysis and TCA cycle were very active and that the TCA cyclebehaved as a cycle [99]. In the case of the GEM, the overall fluxes

of these two pathways were low and TCA was not a complete cycle[14]. This case illustrates the need for experimentally-derived con-straints in central carbon metabolism. Installing a level of geneticregulation in a plant GEM to satisfy the need for constraints is an
Page 9: 1-s2.0-S0168945212000830-main

ience

ab

bmbttctaiimfan

isaogBbtdGdtmtsdwnbttd

3

tspcgSclssa(uocmaafstt

E. Collakova et al. / Plant Sc

mbitious but ultimately necessary approach from a systemsiology perspective.

It is difficult to compare these two models in this caseecause the metabolic network of the metabolic flux analysisodel lacked reactions involving ribulose-1,5-bisphosphate car-

oxylase/oxygenase (RubisCO) and photorespiratory pathway. Inheory, RubisCO is not supposed to be active in non-photosyntheticissues, such as non-green seeds, roots, and plant cell-suspensionultures. RubisCO is assumed to be inactive and is not included inhe metabolic flux analysis models of maize and sunflower seedsnd maize root-tips [107–110]. However, this enzyme is involvedn increasing the carbon use efficiency in green oilseeds by refix-ng significant amounts of CO2 lost in photoheterotrophic-type

etabolism [103,111]. RubisCO and photorespiratory enzymes areound in the proteomes of plant cell cultures [112], but their in vivoctivities and physiological relevance to heterotrophic metabolismeed to be experimentally elucidated.

There was a discrepancy around CO2 and O2 fixation by RubisCOn the flux balance analysis model. The two gasses compete for theame active site in RubisCO [113]. This means that CO2 and O2 fix-tion cannot be separated when both gases are present and thenly way to shift the preference for either carboxylation or oxy-enation is by changing the relative availability of the two gases.ased on the flux balance analysis model, RubisCO was active withoth CO2 and O2 at low ATP demand, but the activity towards thewo substrates was disconnected, when O2, but not CO2 was pre-icted to be used at all times regardless of the ATP demand in theEM [14]. These results contradict the expectations for high ATPemand regarding the separation of carboxylation and oxygena-ion. At high ATP demands, substrates are oxidized at high levels,

eaning that O2 is consumed and CO2 produced, which would favorhe carboxylation of ribulose-1,5-bisphosphate by RubisCO. Sub-equently, both the oxygenation and photorespiration should beisfavored [14]. The most likely explanation for this discrepancyas that RubisCO-catalyzed carboxylation and oxygenation wereot linked. There was also a need to constrain these two insepara-le reactions together by using experimentally-derived constraintso drive the relative flux through carboxylation versus oxygena-ion based on the actual relative availabilities of CO2 and O2 underifferent energy conditions.

.2. Seed GEMs in plants

Seeds provide means of sexual propagation in plants, buthey are also a source of nutritious and useful chemicals. Dryeed biomass consists primarily of proteins, lipids, starch andolysaccharides, and cell wall material and the relative levels andomposition of these individual storage compounds depend on theenotype of the plant species and growth conditions [114,115].eed storage compounds are synthesized by the pathways ofentral carbon metabolism during seed development and are mobi-ized during seed germination, providing nutrients and energy foreedling establishment before photosynthesis takes over, enablingeedling growth [94,103,110,116]. Maternally provided organicnd inorganic nutrients are metabolized through the: (i) glycolysis;ii) pentose phosphate pathway; (iii) TCA cycle; and (iv) individ-al pathways leading to amino acids for protein, fatty acids foril, and monosaccharide precursors for starch, oligosaccharide, andell wall syntheses [94,103,110]. Seeds also accumulate secondaryetabolites, providing protection against herbivores, but individu-

lly, these compounds contribute less than 1% to the final biomassnd due to poor flux value scalability so far have been excluded

rom the seed models. For example, glucosinolates, which are con-idered major secondary metabolites in Brassicales, constitute lesshan 1% and 4% of biomass in rape and Arabidopsis seed, respec-ively [117]. In addition, glucosinolates are synthesized most likely

191– 192 (2012) 53– 70 61

in both maternal tissues and the embryo, but the specific locationsand their contribution to overall glucosinolate synthesis remaincontroversial [118,119]. Glucosinolate metabolism in seeds can-not be currently accurately modeled due to this ambiguity in thebiosynthesis location. It is quite acceptable to argue that the levelsof these protective metabolites will increase in field-grown plantsand we have to include the relevant metabolic pathways in futureGEMs intended for elucidating seed metabolism for metabolic engi-neering purposes.

Bacteria will maximize their growth while saving the resourcesunder optimal conditions as one of the mechanisms to becomethe dominant species in the environment. Similar to bacteria, onewould expect developing seeds to maximize the production ofbiomass of optimal composition with the minimal effort underfavorable conditions, enabling the competition with other plantspecies during germination. As such, real biomass measurementscan be used to constrain the model, and the maximization ofbiomass production and minimization of total flux appear to beacceptable objective functions. However, as noted in Section 3.1,minimizing total flux is completely inappropriate for complexeukaryotic cells due to active redundant and parallel pathways andthe lack of knowledge about additional required, but ignored pro-cesses that will influence the biomass production in seeds. Despitethese issues, seed central carbon metabolism is relatively simpleand usually fulfills the metabolic steady-state requirement neededfor flux balance analysis [104,120]. Therefore, it is not surprisingthat the very first flux balance model in plants [15] was focused onelucidating seed metabolism.

3.2.1. Compartmented barley seed endosperm flux balanceanalysis models

The endosperm of developing cereal seeds represents a combi-nation of photoheterotrophic and heterotrophic metabolism andis the site of the synthesis of major storage compounds starchand seed storage proteins [121]. Maternally provided sucrose isused in glycolysis or in starch biosynthesis, while asparagine andglutamine are sources of nitrogen for other amino acids for seedstorage protein synthesis. The endosperm also has to metaboli-cally adapt to low O2 availability as it becomes less permeable toO2 with increasing density, while still being able to make seedstorage compounds, ensuring seed growth [121,122]. This bio-chemical and physiological information and other genomic andproteomic data provided the basis to refine the model of central car-bon metabolism in developing barley endosperm extracted from anumber of databases [15]. The final model contained four differentcompartments: (i) a single extracellular compartment for thesubstrates, cofactors, and biomass compounds and (ii) three intra-cellular compartments (cytosol, mitochondria, and amyloplast) forinternal metabolites. Particular focus was given to transporters,enabling the movement of internal metabolites between com-partments. Barley endosperm comprises the majority, while theembryo less than 4% of dry, husk-free barley seed biomass [123].As such, the actual composition of barley seeds (predominantlyrepresenting the endosperm) was used to constrain the biomassfluxes. Two successive objective functions were used: (i) maximiz-ing growth by linear programming followed by (ii) minimizing theoverall flux by quadratic optimization [124]. Fluxes were modeledin barley endosperm under: (i) anoxic; (ii) hypoxic; and (iii) aero-bic conditions by simulating different O2 and sucrose uptake rates[15].

However, barley endosperm is not quite uniform and it containsa gradient of O2 levels, from aerobiosis at the periphery to hypoxia

in the middle of the seed. High O2 levels (over 450 �M) are observedat the periphery where photosynthesis occurs and low O2 levels(less than 5 �M) in the middle of illuminated barley seeds [16]. Thismeans that the metabolism is different in the endosperm located
Page 10: 1-s2.0-S0168945212000830-main

6 ience

ceictctbclbhoadau[

mhsopaer(taiepwttaFtis

mppoptgRdwtmsclte

ewatflb

2 E. Collakova et al. / Plant Sc

lose to the seed coat and in the middle of the endosperm. It waslegantly shown by using non-invasive in vivo magnetic resonancemaging that alanine is made as final product of fermentation in theentral endosperm where O2 levels are low and then transportedhrough plasmodesmata to peripheral endosperm layers where it isonverted to pyruvate and used in aerobic metabolism [16]. Addi-ional metabolomic measurements on dissected different layers ofarley endosperm demonstrated changes in several metabolites ofentral carbon metabolism [16]. As such, the individual endospermayers corresponding to different levels of O2 have to be used foriomass measurements individually. It appears that high degree ofypoxia in cereal seeds is relevant not only to later stages of devel-pment when the endosperm is already densely packed with starchnd storage proteins, lowering the permeability to O2 [121], but alsoeep within seeds [16]. To account for these differences in O2 levelsnd central carbon metabolism, flux balance analysis was re-donesing the original network [15] and these relevant measurements16].

Barley endosperm flux balance analysis models predictedetabolic changes associated with anoxia and different levels of

ypoxia relative to the aerobic conditions in developing barleyeeds [15,16]. Both models predicted the expected accumulationf alanine under hypoxic conditions [15,16]. From the qualitativeerspective, most of the predicted changes in flux distributionsssociated with anaerobiosis vs. aerobiosis were expected. Forxample, the models described in detail the gradual switch fromespiration and high starch production in aerobic endosperme.g., young seeds or peripheral endosperm exposed to light)o an incomplete TCA cycle, high fluxes through the glycolyticnd fermentative pathways, and low starch/biomass productionn hypoxic dense endosperm (e.g., mature seeds or the centralndosperm in the darkness). The non-cyclic TCA cycle is observedrimarily because succinate dehydrogenase connects the TCA cycleith the mitochondrial electron transport chain, which carries lit-

le flux when O2 is limited. Based on these simulations, the rate ofhe biomass accumulation should decline with the increasing agend density of developing barley seeds as O2 availability decreases.rom the quantitative perspective, the model predictions regardinghe growth rates were in accordance with the experimental find-ngs [15]. This is not surprising when the actual composition of seedtorage compounds was used to constrain biomass fluxes.

The original model also predicted the existence of unexpectedetabolic adaptations to anoxia [15]. First, the futile cycle involving

hosphofructokinase and pyrophosphate:fructose-6-phosphate 1-hosphotransferase running in opposite directions leads to the usef ATP and generation of pyrophosphate during anoxia. Pyrophos-hate produced by ADP-glucose pyrophosphorylase could be usedo help drive sucrose mobilization for starch synthesis by UDP-lucose pyrophosphorylase in anoxic endosperm. Second, theubisCO bypass was required for growth in anoxic endospermue to an already incomplete TCA cycle and low O2 availability,hich limited additional available routes of carbon flow [15]. While

he RubisCO bypass can re-fix carbon lost during central carbonetabolism in developing green oilseeds that are capable of photo-

ynthesis [103,111], its activity may be irrelevant in heterotrophicereal endosperm during anoxia. Based on O2 measurements, bar-ey endosperm is probably never completely anoxic [16]. As such,he relevance of the futile cycle and RubisCO bypass to cerealndosperm metabolism remains elusive.

Overall, these flux balance analysis models predicted barleyndosperm metabolism during growth. The next question is howell the predicted internal fluxes are in an agreement with the

ctual internal fluxes. Many GEMs predict the biomass fluxes, buthere is little agreement between the modeled and real internaluxes [125]. This presents an enormous opportunity for systemsiologists. One also has to keep in mind that plant metabolism

191– 192 (2012) 53– 70

is resilient and highly regulated, and plant cells utilize redundantpathways that: (i) may be missing from the model or (ii) not cho-sen as possible solutions due to the flux-minimization bias of theoptimization algorithms.

3.2.2. Compartmented flux balance analysis models of developingBrassica napus embryos

Rapeseeds predominantly accumulate oil and protein as storagecompounds and represent an important source of food and biofuel[115]. Oil and proteins are synthesized through the central carbonand nitrogen metabolic pathways during embryo development.The embryos are green and capable of performing photosynthesis,which generates over 50% of the reducing power and ATP used inextensive oil accumulation [94,126]. A flux balance analysis modelwas reconstructed primarily from the AraCyc database [30] andavailable literature. The objective function was based on data fromthe literature regarding substrate uptake and rapeseed biomassproduction during the early stages of oil accumulation. Flux balanceanalysis was used along with principal component analysis andin silico mutant analysis to investigate “metabolic network plas-ticity” with specific emphasis on the metabolic regulation of lipidproduction. The principal component analysis-assisted explorationof the entire phenotypic solution space was based on the reduc-tion of the dimensionality in a highly complex biological system toidentify viable solutions. The predicted growth including lipid accu-mulation and its regulation by the transcription factor WRINKLED1agreed with the experimental observations [21].

Brassica napus bna572 model is so far the most comprehensiveseed model containing 10 compartments, including apoplastic andintermembrane spaces and thylakoid lumen [20]. Flux variabilityanalysis [53] was used to interrogate other possible optimal fluxdistribution solutions in this flux balance analysis model to pre-dict metabolic fluxes under photoheterotrophic and heterotrophicconditions in combination with different organic and inorganicnitrogen sources, respectively [19,20]. Flux variability analysisenables to assess ranges of flux values representing optimal solu-tions given used constraints and objective functions. This meansthat alternate and redundant pathways are also explored by thisapproach and may be chosen as an optimal solution. The actualbiomass production rates including the carbon balance and house-keeping energy requirements of cultured B. napus embryos wereused as constraints, while the substrate uptake (photons and carbonand nitrogen sources) was minimized in flux balance analysis [20].For flux variability analysis, fixed objective function followed bysequential minimizing and maximizing each reaction, respectively,were used to narrow the ranges for flux distributions. Individualfluxes were categorized into 18 different types and 6 classes basedon the flux variability/stability and essentiality/substitutability toassess which pathways were active under four different conditions.

The predicted fluxes were compared to the actual fluxesobtained by 13C-based steady-state metabolic flux analysis [19].It is very important to keep in mind that these two models haddifferent network structures and performing any direct compari-son of these models is extremely difficult. Differences in networkstructure are a result of poor resolution of compartmentation ofsome pathways (e.g., cytosolic and plastidic glycolysis) or miss-ing steps (e.g., plastidic malic enzyme as a source of carbon andenergy for fatty acid synthesis) in 13C-based flux models [19]. Onthe other hand, 13C-based flux models are well constrained basedon 13C-signatures reflecting real fluxes. The flux variability anal-ysis yielded predictions that were in most part consistent withthe actual fluxes in developing rapeseed embryos. However, the

results obtained from the correlation analysis used for flux compar-ison are somewhat misleading. Although the correlation coefficientwas high when bna572 was compared to the 13C-flux model [19],this result was biased because of all small fluxes falling together
Page 11: 1-s2.0-S0168945212000830-main

ience

cdsbbtomkfbcicewaiaoflb

3r

ac(isCMafliatAp8Atciyanii1m

tTbtuflwpcaat

E. Collakova et al. / Plant Sc

lose to zero. In fact, numerous small positive or negative (oppositeirection) fluxes in 13C-flux model were compared to the corre-ponding small, but several-fold different or zero fluxes in thena572 model. For example, the TCA cycle never carried any fluxetween �-ketoglutarate and malate in the bna572 model, whilehere was a small positive flux between these metabolites basedn 13C metabolic flux analysis [19]. There were also clearly otherore significant disagreements between the two models. Pathways

nown to be active in vivo that represented unfavorable “waste-ul” or “energetically inefficient” processes were never selected toe part of the solution by flux variability analysis due to insuffi-ient constraints and extreme objective functions. These pathwaysncluded: (i) amino acid uptake (ammonia uptake was energeti-ally most efficient); (ii) glucose and fructose uptake (sucrose wasnergetically most efficient); and (iii) pentose-phosphate path-ay (reductant was produced exclusively by photosynthesis and

number of dehydrogenases) [19,20,94,126–128]. However, it ismportant to note that flux variability analysis is extremely useful,s it enables predictions regarding the existence and essentialityf novel and alternate pathways not considered in small 13C-basedux models. These predictions then can be tested experimentallyy 13C-labeling approaches in plants [104,120].

.3. AraGEM, an Arabidopsis flux balance analysis modelepresenting a C3 plant cell

AraGEM v.1.0 is a flux balance analysis model and it was the firstttempt to predict and compare the central carbon metabolism inompartmented plant cells representing: (i) photosynthetic leaves;ii) photorespiring leaves; and (iii) non-photosynthetic respir-ng cells/tissues/organs (e.g., roots). The first, photosynthesis-onlycenario is realistic for plants grown under saturating levels ofO2 when the oxygenation reaction of RubisCO is suppressed.odeling of: (i) photosynthetic and (ii) photorespiring leaves

llowed estimation of the effects of photorespiration on changes ofuxes in central carbon metabolism. The network reconstruction

nvolved all available databases that provided the stoichiometrynd enzyme localization information as well as the informa-ion pertaining to individual gene-enzyme-reaction relationships.raGEM includes several compartments (cytosol, mitochondria,lastids, peroxisomes, and vacuoles) and 18 plasma membrane and3 interorganellar transporters [18]. A new, updated version ofraGEM network, iRS1597was generated using up-to-date anno-

ations, but this network was not used for modeling [22]. Duringuration, 75 reactions that were absent from KEGG and AraCyc weredentified as essential for biomass production by flux balance anal-sis. Fluxes leading to the synthesis of sugars, amino acids, fattycids, nucleotides, vitamins and cofactors, and cell-wall compo-ents represent 47 different biomass drains that were included

n the biomass constituting equation. The uptake of photons, CO2,norganic salts, and selected sugars and amino acids represent8 uptake fluxes. During network reconstruction, 446 singletonetabolites were also identified [18].In non-photosynthetic cells, photosynthetic and photorespira-

ory fluxes were set to zero, while sucrose uptake was allowed.he rate of biomass production was held constant and the actualiomass composition measurements [129] were used. The objec-ive function involved the minimization of sucrose uptake andtilization. For photosynthetic cells without photorespiration, theux through the oxygenation reaction of RubisCO was set to zero,hile for normally photosynthesizing and photorespiring meso-hyll cells the carboxylation-oxygenation ratio of RubisCO was

onstrained to 3:1, based on available data [130]. In both cases, CO2nd photon uptake were set as free fluxes (subjected to modeling)nd sucrose uptake from the extracellular space was constrainedo zero. Leaf biomass composition [131] was used to constrain the

191– 192 (2012) 53– 70 63

biomass fluxes. The objective function involved minimizing thephoton uptake rates.

AraGEM is an extensive model of Arabidopsis metabolism, andits development is considered an important achievement. Giventhe complexity of plant metabolism, a number of simplificationshad to be implemented to reduce this complexity and to enableflux balance analysis. Carbon and energy resources needed forgeneral cell maintenance, degradation and component turn-overprocesses, and futile cycles are estimated at about 37% of totalcellular needs [125], and were included as a simple ATP main-tenance term in the AraGEM model. The inclusion of fatty acidswas represented solely by palmitic acid and reduced the complex-ities that arise with lipid heterogeneity. The resulting impacts ofthese simplifications on the relationship between cell growth andphoton uptake rates are minimal. However, the impacts of simi-lar simplifications on fluxes through individual pathways remainunder investigation for several models. Despite the simplifications,AraGEM was able to predict metabolic phenotypes in photosyn-thetic and non-photosynthetic Arabidopsis cells quite accuratelyfrom the qualitative perspective. These predictions included: (i) theactivation of the traditional photorespiratory cycle and correspond-ing biomass loss as CO2 during photorespiration; (ii) activation ofsucrose degradation, glycolysis, and TCA cycle in root cells; and(iii) identity of specific sources of reductant for nitrogen reduc-tion in light and dark conditions. AraGEM also provided novelpredictions regarding metabolic flexibility and the production ofhigh-energy cofactors in different compartments under variousconditions [18].

3.4. C4GEM, maize, sugarcane, and sorghum flux balance analysismodels representing C4 plant cells

C4 metabolism enables some plants such as maize, sugarcane,and sorghum to minimize water loss and photorespiration byconcentrating CO2 levels used by RubisCO [132–134]. Briefly, C4metabolism involves CO2 fixation by phosphoenolpyruvate car-boxylase, generating malate in mesophyll cells. Malate is thentransported to bundle-sheath cells, where its decarboxylation topyruvate by malic enzyme takes place. In the bundle-sheath cells,CO2 released from malate is fixed by RubisCO in the traditionalCalvin–Benson–Bassham cycle. Pyruvate is transported back tothe mesophyll cells and enters the C4 cycle. There are species-specific variations to C4 metabolism depending on whether NAD-or NADP-dependent malic enzymes and phosphoenolpyruvate car-boxykinase and mitochondria are involved, and these modes wereaccounted for in the selected model C4 plant species. Compart-mented AraGEM was used as a template for the reconstructionof C4GEM, to which pathways and transporters related to the C4metabolism were added [17]. Species-specific gene annotation wasimplemented to generate models of central carbon metabolism inmaize (11,623 genes), sugarcane (3,881 genes), and sorghum (3,557genes) [17].

GEMing of C4 metabolism offers a unique opportunity to inter-rogate the interactions between two different cell types and servesas a basis for future studies involving modeling of multiple celltypes. Bundle-sheath and mesophyll cells were modeled as dupli-cated subsets of the whole system and the intercellular transport bythe plasmodesmata provided the needed flux link between thesetwo cell types [17]. Metabolites were exchanged between the cellsand remained ‘intracellular’ in both cells. In a single-cell model,they would be excreted from the cell and considered ‘extracellu-lar’. The network differences between the cell types were achieved

by implementing cell-specific constraints based on available tran-scriptomic and/or proteomic data for that cell type. For example,the C4 cycle operates only in mesophyll cells, so fluxes through thiscycle can be constrained to zero in the bundle-sheath cells.
Page 12: 1-s2.0-S0168945212000830-main

6 ience

tmtmammarommthtANpccsmcf

stacdfaafirctotpoeaMcsfPbflpoAlp

3

mdaoah

4 E. Collakova et al. / Plant Sc

C4GEM delineated major differences in metabolism betweenhe bundle-sheath and mesophyll cells that overall agreed with

aize proteomic data. Bundle-sheath cells tend to have high fluxeshrough the pathways relevant to CO2 assimilation, while the

esophyll cells have highly active pathways involving pyruvatend nucleotide metabolism as well as nitrogen assimilation andetabolism [17]. A surprising prediction that came out of theodel was the existence of differential fluxes in photosystems I

nd II represented by the linear and cyclic electron transport chains,espectively, in two cell types. This distinction was enabled becausef the differences in ATP requirements in the bundle-sheath versusesophyll cells and for the different malate decarboxylation modesentioned above. C4 cycling in mesophyll cells requires more ATP

han C3 photosynthesis and therefore, mesophyll cells should havead very active cyclic electron transport chain around photosys-em I, which produces only ATP and helps keep the appropriateTP/NADPH balance. However, C4GEM predicted that for theADP-dependent malic enzyme mode, only linear electron trans-ort chain was active in mesophyll cells, while the bundle-sheathells had only cyclic electron transport chain. Based on simulations,yclic electron transport chain is more efficient in the bundle-heath than in mesophyll cells with the NADP-malic enzyme-typeetabolism [17]. Because the photon uptake was minimized, the

yclic electron transport was selected as the most efficient solutionor ATP production in the bundle-sheath, but not in mesophyll cells.

Minimizing the uptake of substrates (e.g., photons anducrose under photoautotrophic and photoheterotrophic condi-ions, respectively) with maximal production of biomass of theppropriate composition provides the ability to assess the effi-iency of cellular metabolism. It would be interesting to see fluxistribution predictions for physiologically relevant conditionsor a C4 plant model that includes constraints relating to thectual capacities (maximal quantum yield) of the photosystemsnd the downstream photosynthetic components including CO2xation based on available measurements [135]. Using photonsather than the components of the photosystems is a mathemati-al convenience that allows controlling the downstream fluxes. Ishe minimization of the photon uptake a physiologically relevantbjective function for a photosynthetic cell? We are questioninghis objective function because the amount of light absorbed byhotosynthetic apparatus does not translate to the same amountf electron flux. Not every photon transferring its energy to anlectron is used in photochemistry and there are processes suchs photoinhibition that are not accounted for in these models.inimizing photon uptake might be relevant to very low light

onditions or during an abiotic stress when photosynthesis isuppressed [136,137], but not to a healthy plant growing underavorable conditions when photons and minerals are abundant.hotosystems are usually hit with excess light energy that needs toe dissipated as heat (non-photochemical quenching of chlorophylluorescence) or by release of fluorescent light from excited chloro-hylls [138]. To assess the influence of photon and CO2 uptake ratesn biomass production, we performed a simulation study usingraGEM [18], which also used the minimization of photon uti-

ization as an objective function. Results of these simulations areresented in Section 4.

.5. Maize GEM iRS1563 representing C4 plants

An updated model of C4GEM (iRS1563) was generated foraize mesophyll and bundle-sheath cells [22]. This model

escribes more detailed biosyntheses of cell wall components

nd lipids and includes selected, well-studied pathways of sec-ndary metabolism (flavonoids, terpenoids, phenylpropanoids,nthocyanins, carotenoids, chlorophylls, glucosinolates, and someormones) not present in previous models of central carbon

191– 192 (2012) 53– 70

metabolism [17,18]. iRS1563 contains 674 more metabolites and893 more reactions than C4GEM, which is relevant mostly to cen-tral carbon metabolism. This model also has 445 reactions and369 metabolites that are unique to maize when compared to theupdated AraGEM (iRS1597) due to the differences between C3 andC4 metabolism as well as in secondary metabolism. All reactions areelementally and charge balanced in six compartments [22]. How-ever, challenges still exist for proton balancing based on pKa valuesto enable organelles to operate at different pH values. Measure-ments of biomass composition for maize and other closely relatedspecies were used to constrain biomass fluxes in this flux balanceanalysis model [22].

As with C4GEM, iRS1563 was designed to predict flux distri-butions under different physiological conditions represented byphotosynthesis, photorespiration, and respiration in mesophyll andbundle-sheath cells of a C4 plant. Unfortunately, no data or discus-sion was provided to compare this model to C4GEM in terms offlux predictions for these different types of physiological condi-tions. Changing lignin composition in cell walls without affectingtotal biomass accumulation is a challenging task important for cel-lulosic biofuel production [139,140]. Therefore, the authors focusedon predicting metabolic phenotypes in two well-characterizedlignin biosynthesis mutants under photosynthetic conditions andon prediction validation [22]. Fluxes through the steps disruptedin the mutants, cinnamyl alcohol dehydrogenase and caffeate 3-O-methyltranferase, were set to 10% of the wild-type activitiespredicted by the model. Presence of a residual activity of theseenzymes was necessary to ensure lignin biomass production. Max-imal photon utilization and CO2 uptake were allowed. The fluxbalance analysis model was about 81% accurate in predicting theexpected qualitative changes in mutant fluxes when compared tothe experimental findings [22].

4. Evaluating AraGEM through simulations

The AraGEM model (v. 1.0) [18] was simulated in an effort todemonstrate to plant modelers: (i) how critical constraints canimpact overall model results; and (ii) the robustness of the AraGEMmodel. To achieve this goal, the following were examined: (i)the relationship between the specific growth rate and the photonuptake rate given a constrained CO2 uptake rate; (ii) the impact ofvarying the stoichiometry of the biomass equation; (iii) the abilityof AraGEM to synthesize all compounds in the model; and (iv) theimpact of a net influx or efflux of free protons on the overall modelconversion. These factors play a key role in using GEMs to guidemetabolic engineering in silico. As shown in Fig. 2, it is critical toconsider how a metabolic engineering strategy might influence thegrowth rate, photon uptake rate, and CO2 uptake rate of the mutantplant. If one of these parameters is known (or can be assumed),AraGEM will calculate the rest. However, it is presently unreal-istic to assume that a GEM (of any kind) can predict all of theseparameters for a genetic mutant without some input or constraintssupplied by the user. For each curve in Fig. 2, the steady-statesolution for a given CO2 uptake rate lies at the junction betweenthe horizontal and vertical portions of the curve. At lower photonuptake rates, relative to this junction, an imbalance between CO2and the required photons for growth is observed, leading to a modelthat does not converge in flux balance analysis. At higher photonuptake rates, this represents a scenario in which photons are con-sumed in excess. The regulatory structure of the plant cell does notallow either scenario to happen in the physical world.

Next, the biomass constituting equation of the AraGEM modelwas investigated to determine its impact on the relationshipbetween the growth rate and the photon uptake rate and to addressthe question how well individual biomass components need to be

Page 13: 1-s2.0-S0168945212000830-main

E. Collakova et al. / Plant Science

Fig. 2. AraGEM model conversion with constrained CO2 uptake. The relationshipbetween the cellular growth rate and photon uptake rate is shown for cases in whicht4

dtcgtwic

model, only 435 receive non-zero (or non-trivial) flux. A simulation

Ftsa

he CO2 uptake rate was constrained to 0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, and.5 mmol g−1 dry weight h−1 as labeled in the figure.

etermined. The latter is especially relevant to plant models, ashe biomass measurements for different biomass components oftenome from diverse types of measurements, experiments, researchroups, and even cell types and plant species. For these simula-ions, the CO uptake rate was held constant at 2.5 mmol g−1 dry

2eight h−1. The following components of the biomass constitut-

ng equation were varied: (i) cellulose and xylose; (ii) the ligninomponents (4-coumaryl alcohol, coniferyl alcohol, and sinapyl

ig. 3. Some biomass components influence the modeling results more than others. The rehe stoichiometric coefficients of specific components by multipliers of 0, 0.5, 1, 1.5, 2, atoichiometric parameters of the following biomass components were varied: (A) celluloslcohol, and (iii) sinapyl alcohol; (C) lipids; and (D) maintenance ATP requirements of the

191– 192 (2012) 53– 70 65

alcohol); (iii) the lipids; and (iv) the maintenance ATP built intothe biomass constituting equation. The stoichiometric coefficientfor each of these parameters was varied by multipliers of 0; 0.5; 1;1.5; 2; and 4, relative to their values in the AraGEM model, while thecomposition of the rest of the biomass components was kept con-stant (Fig. 3). For the cases of varying lignin components (Fig. 3B)and lipids (Fig. 3C), very little influence is observed for the rela-tionship between the growth rate and photon uptake rate. Thecellulose and xylose components (Fig. 3A) show greater sensitiv-ity and lead to a reduction in the growth and photon uptake ratesas their stoichiometric coefficients are increased in the biomassconstituting equation. The ATP maintenance component (Fig. 3D)shows by far the greatest to the relationship between the growthrate and photon uptake rate of the cell. As the stoichiometry ofrequired ATP maintenance was increased, the photon uptake rate toachieve the steady-state growth rate increased considerably. Thisresult is common among several GEMs and has also been reportedfor the bacteria when increases in ATP demands are accompaniedby increases in substrate uptake rates [5]. Nevertheless, the AraGEMmodel is considered to be a robust model, providing a fairly consis-tent relationship between the growth rate and photon uptake rateeven when considerable changes are made to its biomass consti-tuting equation.

Additional simulations were performed with the AraGEM modelto test its robustness and capabilities. When the biomass consti-tuting equation provided with the AraGEM model is used, fluxesof central carbon metabolism dominate the flux balance analysisresults, and little to no flux is observed through the secondarymetabolite pathways. Of the 1,567 total reactions in the AraGEM

study was performed to determine whether this was: (i) the resultof an incomplete model or (ii) a function of secondary metabo-lites not being included in the biomass constituting equation. We

lationship between the growth rate and the photon uptake rate is shown for varyingnd 4 (direction shown from 0 to 4 by arrows). CO2 uptake was held constant. Thee and xylose; (B) lignin components including (i) 4-coumaryl alcohol, (ii) coniferyl

biomass equation.

Page 14: 1-s2.0-S0168945212000830-main

66 E. Collakova et al. / Plant Science

Fig. 4. AraGEM is not sensitive to pH change. The number of reactions remaininguH

cmaiTAtcAimscfmbaomeg[

udaepe0rrct1pGpw(d

proper objective functions of specific plant cell types will improve

nchanged by varying the specific proton flux from (i) −10, (ii) 0, and (iii) 10 mmol+ g−1 dry weight h−1.

onstructed an algorithm where every metabolite of the AraGEModel was added to the biomass constituting equation individually,

nd the model was tested for growth. If the cell was able to grown silico, the metabolite was capable of being produced by AraGEM.he results of this study revealed that all 1,748 metabolites of theraGEM model are capable of being produced. This further reveals

hat all 1,567 reactions of the model are complete and capable ofarrying flux. This also indicates that the 435 reactions used by theraGEM model, as published, are a function of its biomass constitut-

ng equation. This leads to the following important point that if thisodel is to be used to study secondary metabolite pathways, these

econdary metabolites must be added (manually) to the biomassonstituting equation. This means that the AraGEM model is readyor additional uses involving secondary metabolism (given small

odifications to the biomass constituting equation). This is applica-le primarily for studies involving plant-environment interactionss well as interactions of plants with pathogen, herbivores, andther plants or any study highlighting the importance of secondaryetabolism. As is the case of all GEMs, the biomass constituting

quation ultimately determines how a metabolic network is used,iven constraints applied to substrate uptake and product secretion5].

Finally, the impact of the specific proton flux was investigatedsing the AraGEM model. The specific proton flux was initiallyeveloped for bacteria and describes the total rate at which protonsre secreted or consumed by the cell [48]. Obviously, this param-ter is more important to acid-producing bacteria than it is forlants. To study the specific proton flux using AraGEM, a protonxchange reaction was installed and constrained to values of −10,, and 10 mmol protons g−1 dry cell weight h−1. These values cor-espond to proton efflux, no proton exchange, and proton influx,espectively. Flux balance analysis was applied to all cases, and fewhanges in fluxes were observed. The number of reactions main-aining identical fluxes is shown in Fig. 4. Of the 1,567 reactions,,449 maintained an identical flux for all three cases of the specificroton flux. This represents a considerable change from bacterialEMs, which are particularly sensitive to the specific proton fluxarameter [5]. Collectively, the results of these simulation studies

ith AraGEM suggest that this model is: (i) completely functional;

ii) robust; and (iii) has a metabolic network that is functionallyifferent from GEMs created for bacteria.

191– 192 (2012) 53– 70

5. How can GEMing in plants be improved?

Constructing the first compartmented plant GEMs describingphotoautotrophic and photoheterotrophic metabolism includingphotorespiration and C4 metabolism in two different cell typesrequired an enormous effort and was quite an accomplishment.The models described here represent the first stepping stone inplant GEMing, but there is room for improvement. Modeling involv-ing these plant genome-scale reconstructions provided mostlyqualitative predictions of flux distributions for selected aspects ofprimarily central carbon metabolism, but most importantly an ideaabout model properties and a framework for improvements. One ofthe major obstacles in plant GEMing is the large number of missingparts and unknown interactions among the known parts. Reversegenetics and enzyme activity and specificity studies along with sub-cellular localization will slowly populate the list of genes/enzymesof known metabolic functions facilitating far better gene anno-tation. Transcriptomics and proteomics are extremely useful indelineating the presence of transcripts and proteins in plant cellsand subcellular localization of enzymes [112,141,142]. Advanc-ing high-throughput non-targeted metabolomics will help identifynovel secondary metabolites that need to be included in plantGEMs, but this effort has to be connected to metabolite localizationto relevant cell types and organelles and for defined developmen-tal stages and growth conditions. Laser-capture microdissectionand pressure catapulting, highly sensitive liquid chromatography-tandem mass spectrometry, and novel approaches including the‘mass microscope’ [89,91,92] will become indispensible for thisgoal. Recent advancements in non-aqueous fractionation will alsoaid in these goals [90]. Considering that there are many differ-ent plant cell types that are recalcitrant to separation from othercells and virtually an infinite number of internal and environmentalstimuli and their combinations, this is a Sisyphean task. Probablythe best approach with the current knowledge base is to focus oncell types and conditions of interest. New parts and functionalitiescan be added to the existing networks to generate comprehensivetemplate models for major plant species, from which specific net-works operating in cells of interest can be extracted. This procedureis similar to what evolved from the first E. coli K-12 MG1655 modelsfor bacteria [7,12,56].

Current plant GEMs tend to use the existing data to constrain thebiomass fluxes, but often from different studies of different biomassmetabolites, even extrapolated from different plant species andthe same is true for model validation. Often, predicted flux dis-tribution validation involves solely experimental evidence for thepresence of an enzyme in the similar tissues that was predictedto be active in in silico modeling. The following relates to prob-lems with different data sources used in GEMing. Biomass andits composition are highly variable traits that largely depend onthe developmental stage and environmental conditions. GEMingshould include measurements relevant to the model system at adefined developmental stage and conditions and, if possible, fromappropriate cell types. Additional flux constraints should be addedto minimize the phenotypic solution space. Plant models are cur-rently only constrained based on biomass fluxes, substrate uptake,or a broad range of physiologically relevant metabolite concen-trations [14,15,17,18,21,22], but do not consider thermodynamicconstraints for the internal reactions or other constraints. Onlymaize iRS1563 was fully balanced for all elements and includedproton and charge balancing [22].

Objective functions used routinely for bacterial GEMing maynot be the most appropriate for GEMing in plant cells. Identifying

the predictive capabilities of resulting models. Because plant cellsare exposed to complex signals at the same time, modeling willrequire multiple and more complex objective functions. The idea

Page 15: 1-s2.0-S0168945212000830-main

ience

ogoiemactcmoapni

itfltbptciopawrci

asspeapldsfytfeft1

flittsacsamfsc[

E. Collakova et al. / Plant Sc

f minimizing the use of resources while maximizing the outputiven circumstances is probably valid, but difficult to prove becausef these circumstances surrounding growth, which often has miss-ng information and is not considered in the model as resource andnergy drains (e.g., responses to pathogens and herbivores). Cellaintenance, “wasteful” degradation processes, and futile cycles

lso should be considered individually, as they represent signifi-ant sinks of resources and energy that are different in different cellypes [18,22,125]. In addition, plant cells respond metabolically toonvoluted stimuli coming from the adjacent cells and the environ-ent, which also requires redirection of carbon flow to additional,

ften neglected pathways. It is advisable to start looking at inter-ctions of cells in tissues and include these pathways as active inlant GEMs. The Nielsen and Maranas groups are elegantly pio-eering the way of addressing these problems with their systems

nvolving two cell types in C4 plants [17,22].Using the minimization of fluxes or steps as objective functions

n GEMing tends to get rid of “redundant” pathways. If there arewo redundant pathways that have similar cofactor balance, theux minimization algorithms will choose one pathway, despitehe fact that they are both active in vivo, which can be proveny transcriptomic, metabolomic, and fluxomic approaches. Theseathways may seem to be redundant, but obviously they are neededo allow metabolic adaptation in plant cells and are a result ofomplex metabolic regulation that cannot easily be accounted forn a flux balance analysis model. Finding the minimum numberf appropriate objective functions that correctly represent cellularhysiology and metabolism is probably one of the most problem-tic challenges in GEMing in general. One will probably get awayith using a few objective functions if appropriate, physiologically

elevant, flux constraints are implemented. Then such properlyonstrained models should start yielding accurate predictions ofnternal fluxes.

The presented plant flux balance analysis models were reason-bly accurate in predicting biomass fluxes, which is not surprising,ince the actual cellular biomass composition was used to con-train most of these models. Prediction of internal fluxes in theselant models was only qualitative and the model was consid-red a success when the expected pathways were predicted to bective in silico. Flux balance analysis has been found to performoorly in isolated cases in predicting the internal fluxes regard-

ess of the model system. This is possibly a result of too manyegrees of freedom in these under-determined models and pos-ibly due to the use of extreme, physiologically irrelevant objectiveunctions. Comparison of results obtained from flux balance anal-sis and 13C-based steady-state metabolic flux analysis revealedhat the actual flux distributions are within the phenotypicallyeasible solution space of bacterial flux balance models [125]. How-ver, the sampling of this solution space with extreme objectiveunctions used in flux balance analysis yielded flux combinationshat were substantially different from the actual flux distributions.3C-based fluxes were in close agreement with the actual in vivouxes and therefore, 13C-based steady-state metabolic flux analysis

s suitable to provide validation of flux balance analysis predic-ions [125]. The situation is similar in plant systems, althoughhe direct comparison with the actual fluxes is not always pos-ible. Comparison of developing rapeseed flux balance/variabilitynd 13C-based flux models revealed that while some fluxes wereomparable between the two models, others were not and manyuboptimal, but in vivo-active alternate pathways were not selecteds part of solution [19,20]. Flux variability analysis (all possibleetabolic routes are considered and the resulting range of fluxes

or each reaction are returned from the analysis) [53] used in thesetudies should be expanded with proper experimentally-derivedonstraints and the exclusion of extreme objective functions19,20].

191– 192 (2012) 53– 70 67

A significant challenge for GEMing in the future involvesthe addition of flux constraints on internal fluxes or at branchpoints within the metabolic network itself. 13C-based steady-statemetabolic flux analysis provided an accurate representation ofinternal flux distributions, as demonstrated by the consistencybetween the predicted and the actual, experimentally measured13C-labeling in internal metabolites [125]. As such, this approachshould be used to provide validation for flux balance analysisflux distribution predictions or these approaches should be usedtogether to obtain the most appropriate predictions of fluxesand metabolic phenotypes. Alternatively, rather than using 13C-based steady-state metabolic flux analysis to validate flux balanceor variability models, 13C-based flux measurements should beused as physiologically relevant flux constraints, which wouldpreclude the use of extreme objective functions (Jackie Shanks,personal communication). If necessary, unique objective func-tions (or multiple combinations of these functions) instead of thephysiologically irrelevant extreme ones should be used. Theseappropriate objective functions may ultimately be back-calculatedby comparing measured fluxes and flux balance analysis results[47,55]. However, 13C-based steady-state metabolic flux anal-ysis cannot be used in purely photosynthetic systems, where13CO2 is the sole source of carbon because every carbon inevery metabolite will become labeled to the same degree. Insuch cases, kinetic modeling such as isotopically nonstation-ary 13C-based metabolic flux analysis is needed [143]. Despitethe many successes of GEMing, significant challenges lie aheadfor systems biology research not only in plants, but also inbacteria.

6. Conclusions

The current plant GEMs have provided very promising predic-tions. However, given the extreme level of complexity of plantmetabolism, much research is left to be done. Missing parts can beincorporated as technology involving different omics progresses.Future focus will be on secondary metabolism. Choosing the rightmodel systems, for which experimental validations are available,as well as using known constraints and appropriate objectivefunctions will play a crucial role in improving plant GEMs. How-ever, the missing parts do not relate to only to metabolites andenzymes, but also to known and novel regulators in both pri-mary and secondary metabolism. Fluxes represent the cellularmetabolic phenotypes and inherently reflect convoluted regula-tory mechanisms. In another words, the final flux value througha specific step is a result of the influence of all possible activeregulatory mechanisms that can affect the enzyme activity of thisstep. The question is, how we can dissect these mechanisms andintegrate them into the model by associating relevant regula-tors with the enzymes. Whether we like it or not, integration ofregulation into GEMs will become important not only for under-standing the basis of plant metabolic plasticity, but also for rationalmetabolic engineering, as regulation tends to “overwrite” effects ofchanges to expression levels of biosynthetic genes. Then we willbe able to address the questions of metabolic adaptations in plantsexposed to diverse stimuli and generate predictions for rationalplant metabolic engineering to improve plant stress tolerance andresistance to pathogens and herbivores and ultimately biomassproduction. So, are we ready for GEMing in plants? Definitely!

Acknowledgements

Research funding was obtained from NSF-MCB-1052145 and theBiodesign and Bioprocessing Research Center at Virginia Tech. EC isprimarily responsible for writing the review. RS and JY performed

Page 16: 1-s2.0-S0168945212000830-main

6 ience

tc

R

8 E. Collakova et al. / Plant Sc

he simulations study with the AraGEM model. RS provided the dis-ussion of AraGEM model simulations and edited the manuscript.

eferences

[1] K.J. Kauffman, P. Prakash, J.S. Edwards, Advances in flux balance analysis,Current Opinion in Biotechnology 14 (2003) 491–496.

[2] E. Murabito, E. Simeonidis, K. Smallbone, J. Swinton, Capturing the essence ofa metabolic network: a flux balance analysis approach, Journal of TheoreticalBiology 260 (2009) 445–452.

[3] C.H. Schilling, J.S. Edwards, D. Letscher, B.O. Palsson, Combining pathwayanalysis with flux balance analysis for the comprehensive study of metabolicsystems, Biotechnology and Bioengineering 71 (2000) 286–306.

[4] L.M. Liu, R. Agren, S. Bordel, J. Nielsen, Use of genome-scale metabolic modelsfor understanding microbial physiology, FEBS Letters 584 (2010) 2556–2564.

[5] R.S. Senger, Biofuel production improvement with genome-scale models: therole of cell composition, Biotechnology Journal 5 (2010) 671–685.

[6] J.S. Edwards, B.O. Palsson, Systems properties of the Haemophilus influen-zae Rd metabolic genotype, Journal of Biological Chemistry 274 (1999)17410–17416.

[7] D.J. Baumler, R.G. Peplinski, J.L. Reed, J.D. Glasner, N.T. Perna, The evolutionof metabolic networks of E. coli, BMC System Biology 5 (2011) 182.

[8] N.C. Duarte, et al., Global reconstruction of the human metabolic networkbased on genomic and bibliomic data, Proceedings of the National Academyof Sciences of the United States of America 104 (2007) 1777–1782.

[9] I. Famili, J. Forster, J. Nielson, B.O. Palsson, Saccharomyces cerevisiae pheno-types can be predicted by using constraint-based analysis of a genome-scalereconstructed metabolic network, Proceedings of the National Academy ofSciences of the United States of America 100 (2003) 13134–13139.

[10] A.M. Feist, M.J. Herrgard, I. Thiele, J.L. Reed, B.O. Palsson, Reconstruction ofbiochemical networks in microorganisms, Nature Reviews Microbiology 7(2009) 129–143.

[11] P.C. Fu, Genome-scale modeling of Synechocystis sp PCC 6803 and predictionof pathway insertion, Journal of Chemical Technology and Biotechnology 84(2009) 473–483.

[12] J.D. Orth, et al., A comprehensive genome-scale reconstruction ofEscherichia coli metabolism-2011, Molecular Systems Biology 7 (2011) 1–9.

[13] O. Rolfsson, B.O. Palsson, I. Thiele, The human metabolic reconstruction Recon1 directs hypotheses of novel human metabolic functions, BMC System Biol-ogy 5 (2011) 155.

[14] M.G. Poolman, L. Miguet, L.J. Sweetlove, D.A. Fell, A genome-scale metabolicmodel of Arabidopsis and some of its properties, Plant Physiology 151 (2009)1570–1581.

[15] E. Grafahrend-Belau, F. Schreiber, D. Koschutzki, B.H. Junker, Flux balanceanalysis of barley seeds: a computational approach to study systemic prop-erties of central metabolism, Plant Physiology 149 (2009) 585–598.

[16] H. Rolletschek, et al., Combined noninvasive imaging and modelingapproaches reveal metabolic compartmentation in the barley endosperm,Plant Cell 23 (2011) 3041–3054.

[17] C.G.D. Dal’Molin, L.E. Quek, R.W. Palfreyman, S.M. Brumbley, L.K. Nielsen,C4GEM, a genome-scale metabolic model to study C4 plant metabolism, PlantPhysiology 154 (2010) 1871–1885.

[18] C.G.D. Dal’Molin, L.E. Quek, R.W. Palfreyman, S.M. Brumbley, L.K. Nielsen,AraGEM, a genome-scale reconstruction of the primary metabolic networkin Arabidopsis, Plant Physiology 152 (2010) 579–589.

[19] J. Hay, J. Schwender, Computational analysis of storage synthesis in devel-oping Brassica napus L. (oilseed rape) embryos: flux variability analysis inrelation to 13C metabolic flux analysis, Plant Journal 67 (2011) 513–525.

[20] J. Hay, J. Schwender, Metabolic network reconstruction and flux variabilityanalysis of storage synthesis in developing oilseed rape (Brassica napus L.)embryos, Plant Journal 67 (2011) 526–541.

[21] E. Pilalis, A. Chatziioannou, B. Thomasset, F. Kolisis, An in silico compart-mentalized metabolic model of Brassica napus enables the systemic study ofregulatory aspects of plant central metabolism, Biotechnology and Bioengi-neering 108 (2011) 1673–1682.

[22] R. Saha, P.F. Suthers, C.D. Maranas, Zea mays iRS1563: a comprehensivegenome-scale metabolic reconstruction of maize metabolism, PLoS One 6(2011) 1–12.

[23] E. Ruppin, J.A. Papin, L.F. de Figueiredo, S. Schuster, Metabolic reconstruction,constraint-based analysis and game theory to probe genome-scale metabolicnetworks, Current Opinion in Biotechnology 21 (2010) 502–510.

[24] J.L. Reed, T.D. Vo, C.H. Schilling, B.O. Palsson, An expanded genome-scalemodel of Escherichia coli K-12 (iJR904 GSM/GPR), Genome Biology 4 (2003)R54.

[25] P.D. Karp, et al., Expansion of the BioCyc collection of pathway/genomedatabases to 160 genomes, Nucleic Acids Research 33 (2005) 6083–6089.

[26] H. Ogata, et al., KEGG: Kyoto encyclopedia of genes and genomes, NucleicAcids Research 27 (1999) 29–34.

[27] R. Overbeek, et al., The subsystems approach to genome annotation and its use

in the project to annotate 1000 genomes, Nucleic Acids Research 33 (2005)5691–5702.

[28] J. Schellenberger, J.O. Park, T.M. Conrad, B.O. Palsson, BiGG: a BiochemicalGenetic and Genomic knowledgebase of large scale metabolic reconstruc-tions, BMC Bioinformatics 11 (2010) 213.

191– 192 (2012) 53– 70

[29] P. Lamesch, et al., The Arabidopsis Information Resource (TAIR): improvedgene annotation and new tools, Nucleic Acids Research 40 (2012)D1202–D1210.

[30] P.F. Zhang, et al., MetaCyc and AraCyc. Metabolic pathway databases for plantresearch, Plant Physiology 138 (2005) 27–37.

[31] D. Croft, et al., Reactome: a database of reactions, pathways and biologicalprocesses, Nucleic Acids Research 39 (2011) D691–D697.

[32] C.S. Henry, et al., High-throughput generation, optimization and analysis ofgenome-scale metabolic models, Nature Biotechnology 28 (2010), 977-U922.

[33] J.L. Heazlewood, R.E. Verboom, J. Tonti-Filippini, I. Small, A.H. Millar, SUBA:the Arabidopsis subcellular database, Nucleic Acids Research 35 (2007)D213–D218.

[34] Q. Sun, et al., PPDB, the Plant Proteomics Database at Cornell, Nucleic AcidsResearch 37 (2009) D969–D974.

[35] S. Reumann, C.L. Ma, S. Lemke, L. Babujee, AraPerox., A database of putativeArabidopsis proteins from plant peroxisomes, Plant Physiology 136 (2004)2587–2608.

[36] B. Boeckmann, et al., The SWISS-PROT protein knowledgebase and its supple-ment TrEMBL in 2003, Nucleic Acids Research 31 (2003) 365–370.

[37] H.P.J. Bonarius, G. Schmid, J. Tramper, Flux analysis of underdeterminedmetabolic networks: the quest for the missing constraints, Trends in Biotech-nology 15 (1997) 308–314.

[38] A. Varma, B.O. Palsson, Metabolic flux balancing – basic concepts, scientificand practical use, Bio-Technology 12 (1994) 994–998.

[39] N.D. Price, J.L. Reed, B.O. Palsson, Genome-scale models of microbial cells:evaluating the consequences of constraints, Nature Reviews Microbiology 2(2004) 886–897.

[40] A.M. Feist, B.O. Palsson, The growing scope of applications of genome-scalemetabolic reconstructions using Escherichia coli, Nature Biotechnology 26(2008) 659–667.

[41] R. Mahadevan, B.O. Palsson, D.R. Lovley, In situ to in silico and back: elucidatingthe physiology and ecology of Geobacter spp. using genome-scale modelling,Nature Reviews Microbiology 9 (2011) 39–50.

[42] C.B. Milne, P.J. Kim, J.A. Eddy, N.D. Price, Accomplishments in genome-scalein silico modeling for industrial and medical biotechnology, BiotechnologyJournal 4 (2009) 1653–1670.

[43] J.A. Papin, N.D. Price, S.J. Wiback, D.A. Fell, B.O. Palsson, Metabolic path-ways in the post-genome era, Trends in Biochemical Sciences 28 (2003)250–258.

[44] E.T. Papoutsakis, Equations and calculations for fermentations of butyric acidbacteria, Biotechnology and Bioengineering 26 (1984) 174–187.

[45] C.S. Henry, M.D. Jankowski, L.J. Broadbelt, V. Hatzimanikatis, Genome-scalethermodynamic analysis of Escherichia coli metabolism, Biophysical Journal90 (2006) 1453–1461.

[46] M.D. Jankowski, C.S. Henry, L.J. Broadbelt, V. Hatzimanikatis, Group contri-bution method for thermodynamic analysis of complex metabolic networks,Biophysical Journal 95 (2008) 1487–1499.

[47] J.L. Reed, B.O. Palsson, Thirteen years of building constraint-based in silicomodels of Escherichia coli, Journal of Bacteriology 185 (2003) 2692–2699.

[48] R.S. Senger, E.T. Papoutsakis, Genome-scale model for Clostridium aceto-butylicum: Part II. Development of specific proton flux states and numericallydetermined sub-systems, Biotechnology and Bioengineering 101 (2008)1053–1071.

[49] K. Yizhak, T. Benyamini, W. Liebermeister, E. Ruppin, T. Shlomi, Integratingquantitative proteomics and metabolomics with a genome-scale metabolicnetwork model, Bioinformatics 26 (2010) i255–i260.

[50] A.M. Feist, B.O. Palsson, The biomass objective function, Current Opinion inMicrobiology 13 (2010) 344–349.

[51] S.A. Becker, et al., Quantitative prediction of cellular metabolism withconstraint-based models: the COBRA Toolbox, Nature Protocols 2 (2007)727–738.

[52] J. Schellenberger, et al., Quantitative prediction of cellular metabolism withconstraint-based models: the COBRA Toolbox v2.0, Nature Protocols 6 (2011)1290–1307.

[53] R. Mahadevan, C.H. Schilling, The effects of alternate optimal solutions inconstraint-based genome-scale metabolic models, Metabolic Engineering 5(2003) 264–276.

[54] K. Smallbone, E. Simeonidis, Flux balance analysis: a geometric perspective,Journal of Theoretical Biology 258 (2009) 311–315.

[55] E.P. Gianchandani, M.A. Oberhardt, A.P. Burgard, C.D. Maranas, J.A. Papin,Predicting biological system objectives de novo from internal state measure-ments, BMC Bioinformatics 9 (2008) 43.

[56] J.S. Edwards, B.O. Palsson, The Escherichia coli MG1655 in silico metabolicgenotype: its definition, characteristics, and capabilities, Proceedings of theNational Academy of Sciences of the United States of America 97 (2000)5528–5533.

[57] J. Forster, I. Famili, B. Palsson, J. Nielsen, Large-scale evaluation of in silico genedeletions in Saccharomyces cerevisiae, OMICS 7 (2003) 193–202.

[58] I. Thiele, et al., A community effort towards a knowledge-base and mathemat-ical model of the human pathogen Salmonella Typhimurium LT2, BMC SystemBiology 5 (2011) 8.

[59] D. Segre, D. Vitkup, G.M. Church, Analysis of optimality in natural and per-turbed metabolic networks, Proceedings of the National Academy of Sciencesof the United States of America 99 (2002) 15112–15117.

[60] T. Shlomi, O. Berkman, E. Ruppin, Regulatory on/off minimization ofmetabolic flux changes after genetic perturbations, Proceedings of the

Page 17: 1-s2.0-S0168945212000830-main

ience

E. Collakova et al. / Plant Sc

National Academy of Sciences of the United States of America 102 (2005)7695–7700.

[61] A.P. Burgard, P. Pharkya, C.D. Maranas, OptKnock., A bilevel programmingframework for identifying gene knockout strategies for microbial strain opti-mization, Biotechnology and Bioengineering 84 (2003) 647–657.

[62] J. Kim, J.L. Reed, OptO.R.F., Optimal metabolic and regulatory perturbationsfor metabolic engineering of microbial strains, BMC System Biology 4 (2010)53.

[63] I. Rocha, et al., OptFlux: An open-source software platform for in silicometabolic engineering, BMC System Biology 4 (2010) 45.

[64] R. Mahadevan, Constraint-based genome-scale models of cellularmetabolism, in: C.D. Smolke (Ed.), The Metabolic Pathway EngineeringHandbook, Taylor & Francis Group, Boca Raton, 2010.

[65] J. Lee, H. Yun, A.M. Feist, B.O. Palsson, S.Y. Lee, Genome-scale reconstructionand in silico analysis of the Clostridium acetobutylicum ATCC 824 metabolicnetwork, Applied Microbiology and Biotechnology 80 (2008) 849–862.

[66] P.S. Schnable, et al., The B73 maize genome: complexity, diversity, anddynamics, Science 326 (2009) 1112–1115.

[67] D. Shintani, D. DellaPenna, Elevating the vitamin E content of plants throughmetabolic engineering, Science 282 (1998) 2098–2100.

[68] P.L. Nagy, G.M. McCorkle, H. Zalkin, PurU a source of formate forPurT-dependent phosphoribosyl-N-formylglycinamide synthesis, Journal ofBacteriology 175 (1993) 7066–7073.

[69] E. Collakova, et al., Arabidopsis 10-formyl tetrahydrofolate deformylases areessential for photorespiration, Plant Cell 20 (2008) 1818–1832.

[70] R.J. Bino, et al., Potential of metabolomics as a functional genomics tool, Trendsin Plant Science 9 (2004) 418–425.

[71] E. Pichersky, D.R. Gang, Genetics and biochemistry of secondary metabo-lites in plants: an evolutionary perspective, Trends in Plant Science 5 (2000)439–445.

[72] K. Saito, F. Matsuda, Metabolomics for functional genomics, systems biology,and biotechnology, in: S. Merchant, W.R. Briggs, D. Ort (Eds.), Annual Reviewof Plant Biology, 2010, pp. 463–489.

[73] O. Fiehn, Metabolomics – the link between genotypes and phenotypes, PlantMolecular Biology 48 (2002) 155–171.

[74] A.R. Fernie, Metabolome characterisation in plant system analysis, FunctionalPlant Biology 30 (2003) 111–120.

[75] V.S. Kumar, M.S. Dasika, C.D. Maranas, Optimization based automated cura-tion of metabolic reconstructions, BMC Bioinformatics 8 (2007) 212.

[76] G. Jander, V. Joshi, Aspartate-derived amino acid biosynthesis in Arabidopsisthaliana, in: The Arabidopsis Book, The American Society of Plant Biologists,2009, pp.c1–15.

[77] U. Gophna, E. Bapteste, W.F. Doolittle, D. Biran, E.Z. Ron, Evolutionary plastic-ity of methionine biosynthesis, Gene 355 (2005) 48–57.

[78] B.R. Epperly, E.E. Dekker, l-Threonine dehydrogenase from Escherichia coli –identification of an active-site cysteine residue and metal-ion studies, Journalof Biological Chemistry 266 (1991) 6086–6092.

[79] A.J. Edgar, Molecular cloning and tissue distribution of mammalian l-threonine 3-dehydrogenases, BMC Biochemistry 3 (2002) 19.

[80] D. Amador-Noguez, et al., Systems-level metabolic flux profiling elucidatesa complete, bifurcated tricarboxylic acid cycle in Clostridium acetobutylicum,Journal of Bacteriology 192 (2010) 4452–4461.

[81] S.B. Crown, et al., Resolving the TCA cycle and pentose-phosphate pathway ofClostridium acetobutylicum ATCC 824: isotopomer analysis, in vitro activitiesand expression analysis, Biotechnology Journal 6 (2011) 300–305.

[82] R.S. Senger, E.T. Papoutsakis, Genome-scale model for Clostridium aceto-butylicum: Part I. Metabolic network resolution and analysis, Biotechnologyand Bioengineering 101 (2008) 1036–1052.

[83] A.M. Feist, et al., A genome-scale metabolic reconstruction for Escherichia coliK-12 MG1655 that accounts for 1260 ORFs and thermodynamic information,Molecular Systems Biology 3 (2007) 121.

[84] V.M.E. Andriotis, N.J. Kruger, M.J. Pike, A.M. Smith, Plastidial glycolysis indeveloping Arabidopsis embryos, New Phytologist 185 (2010) 649–662.

[85] S. Baud, I.A. Graham, A spatiotemporal analysis of enzymatic activitiesassociated with carbon metabolism in wild-type and mutant embryos ofArabidopsis using in situ histochemistry, Plant Journal 46 (2006) 155–169.

[86] L.E. Anderson, J.A. Bryant, A.A. Carol, Both chloroplastic and cytosolic phos-phoglycerate kinase isozymes are present in the pea leaf nucleus, Protoplasma223 (2004) 103–110.

[87] W.C. Plaxton, The organization and regulation of plant glycolysis, AnnualReview of Plant Physiology and Plant Molecular Biology 47 (1996) 185–214.

[88] P. Giege, et al., Enzymes of glycolysis are functionally associated with themitochondrion in Arabidopsis cells, Plant Cell 15 (2003) 2140–2151.

[89] L.W. Sumner, et al., Spatially resolved plant metabolomics, in: R.D. Hall (Ed.),Annu. Plant Rev. Biol. Plant Metabolomics, John Wiley & Sons, Chichester,West Sussex, UK, 2011, pp. 343–366.

[90] S. Klie, et al., Analysis of the compartmentalized metabolome – a validation ofthe non-aqueous fractionation technique, Frontiers in Plant Science 2 (2011)1–27.

[91] J. Grassl, N.L. Taylor, A.H. Millar, Matrix-assisted laser desorption/ionisationmass spectrometry imaging and its development for plant protein imaging,

Plant Methods 7 (2011) 21.

[92] T. Harada, et al., Visualization of volatile substances in different organelleswith an atmospheric-pressure mass microscope, Analytical Chemistry 81(2009) 9153–9157.

191– 192 (2012) 53– 70 69

[93] N.R. Baker, J. Harbinson, D.M. Kramer, Determining the limitations and reg-ulation of photosynthetic energy transduction in leaves, Plant, Cell andEnvironment 30 (2007) 1107–1125.

[94] J. Schwender, J.B. Ohlrogge, Y. Shachar-Hill, A flux model of glycolysis and theoxidative pentosephosphate pathway in developing Brassica napus embryos,Journal of Biological Chemistry 278 (2003) 29442–29453.

[95] C. Miyake, Alternative electron flows (water-water cycle and cyclic electronflow around PSI) in photosynthesis: molecular mechanisms and physiologicalfunctions, Plant Cell Physiology 51 (2010) 1951–1963.

[96] A.L. Meadows, R. Karnik, H. Lam, S. Forestell, B. Snedecor, Application ofdynamic flux balance analysis to an industrial Escherichia coli fermentation,Metabolic Engineering 12 (2010) 150–160.

[97] J. Kilian, et al., The AtGenExpress global stress expression data set: protocols,evaluation and model data analysis of UV-B light, drought and cold stressresponses, Plant Journal 50 (2007) 347–363.

[98] Y. Shimada, H. Goda, Y. Yoshida, An introduction to AtGenExpress. How touse the data sets and what to learn from large scale gene expression analysis,Plant Cell Physiology 46 (2005), S141-S141.

[99] T.C.R. Williams, et al., Metabolic network fluxes in heterotrophic Arabidopsiscells: stability of the flux distribution under different oxygenation conditions,Plant Physiology 148 (2008) 704–718.

[100] C.J. Baxter, et al., The metabolic response of heterotrophic Arabidopsis cellsto oxidative stress, Plant Physiology 143 (2007) 312–325.

[101] M. Lehmann, et al., The metabolic response of Arabidopsis roots to oxidativestress is distinct from that of heterotrophic cells in culture and highlights acomplex relationship between the levels of transcripts, metabolites, and flux,Molecular Plant 2 (2009) 390–406.

[102] M.G. Poolman, ScrumPy: metabolic modelling with Python, IEE Proceedings– Systems Biology 153 (2006) 375–378.

[103] D.K. Allen, J.B. Ohlrogge, Y. Shachar-Hill, The role of light in soybean seedfilling metabolism, Plant Journal 58 (2009) 220–234.

[104] R.G. Ratcliffe, Y. Shachar-Hill, Measuring multiple fluxes through plantmetabolic networks, Plant Journal 45 (2006) 490–511.

[105] W. Wiechert, M. Mollney, S. Petersen, A.A. de Graaf, A universal frame-work for C-13 metabolic flux analysis, Metabolic Engineering 3 (2001)265–283.

[106] W. Wiechert, A.A. deGraaf, Bidirectional reaction steps in metabolic networks:1. Modeling and simulation of carbon isotope labeling experiments, Biotech-nology and Bioengineering 55 (1997) 101–117.

[107] A.P. Alonso, V.L. Dale, Y. Shachar-Hill, Understanding fatty acid synthesis indeveloping maize embryos using metabolic flux analysis, Metabolic Engineer-ing 12 (2010) 488–497.

[108] A.P. Alonso, F.D. Goffman, J.B. Ohlrogge, Y. Shachar-Hill, Carbon conversionefficiency and central metabolic fluxes in developing sunflower (Helianthusannuus L.) embryos, Plant Journal 52 (2007) 296–308.

[109] A.P. Alonso, et al., A metabolic flux analysis to study the role of sucrose syn-thase in the regulation of the carbon partitioning in central metabolism inmaize root tips, Metabolic Engineering 9 (2007) 419–432.

[110] A.P. Alonso, D.L. Val, Y. Shachar-Hill, Central metabolic fluxes in theendosperm of developing maize seeds and their implications for metabolicengineering, Metabolic Engineering 13 (2011) 96–107.

[111] J. Schwender, F. Goffman, J.B. Ohlrogge, Y. Shachar-Hill, Rubisco without theCalvin cycle improves the carbon efficiency of developing green seeds, Nature432 (2004) 779–782.

[112] K. Baerenfaller, et al., Genome-scale proteomics reveals Arabidopsisthaliana gene models and proteome dynamics, Science 320 (2008)938–941.

[113] I. Andersson, A. Backlund, Structure and function of Rubisco, Plant Physiologyand Biochemistry 46 (2008) 275–291.

[114] S. Baud, J.P. Boutin, M. Miquel, L. Lepiniec, C. Rochat, An integrated overviewof seed development in Arabidopsis thaliana ecotype WS, Plant Physiology andBiochemistry 40 (2002) 151–160.

[115] R.J. Weselake, et al., Increasing the flow of carbon into seed oil, BiotechnologyAdvances 27 (2009) 866–878.

[116] M.J. Holdsworth, L. Bentsink, W.J.J. Soppe, Molecular networks regulating Ara-bidopsis seed maturation, after-ripening, dormancy and germination, NewPhytologist 179 (2008) 33–54.

[117] P.D. Brown, J.G. Tokuhisa, M. Reichelt, J. Gershenzon, Variation of glucosi-nolate accumulation among different organs and developmental stages ofArabidopsis thaliana, Phytochemistry 62 (2003) 471–481.

[118] S.X. Chen, B.L. Petersen, C.E. Olsen, A. Schulz, B.A. Halkier, Long-distancephloem transport of glucosinolates in Arabidopsis, Plant Physiology 127(2001) 194–201.

[119] H.H. Nour-Eldin, B.A. Halkier, Piecing together the transport pathway ofaliphatic glucosinolates, Phytochemistry Reviews 8 (2009) 53–67.

[120] D.K. Allen, I.G.L. Libourel, Y. Shachar-Hill, Metabolic flux analysis inplants: coping with complexity, Plant, Cell and Environment 32 (2009)1241–1257.

[121] J.T. van Dongen, et al., Phloem import and storage metabolism are highly coor-dinated by the low oxygen concentrations within developing wheat seeds,Plant Physiology 135 (2004) 1809–1821.

[122] H. Rolletschek, W. Weschke, H. Weber, U. Wobus, L. Borisjuk, Energy state andits control on seed development: starch accumulation is associated with highATP and steep oxygen gradients within barley grains, Journal of ExperimentalBotany 55 (2004) 1351–1359.

Page 18: 1-s2.0-S0168945212000830-main

7 ience

0 E. Collakova et al. / Plant Sc

[123] M.M. Paluska, A.K. Dobrenz, R.T. Ramage, Seed size and seedling componentsin Arivat barley, Journal of the Arizona-Nevada Academy of Science 14 (1979)88–90.

[124] R. Schuetz, L. Kuepfer, U. Sauer, Systematic evaluation of objective functionsfor predicting intracellular fluxes in Escherichia coli, Molecular Systems Biol-ogy 3 (2007) 119.

[125] X.W. Chen, A.P. Alonso, D.K. Allen, J.L. Reed, Y. Shachar-Hill, Synergy between13C-metabolic flux analysis and flux balance analysis for understandingmetabolic adaption to anaerobiosis in E. coli, Metabolic Engineering 13 (2011)38–48.

[126] J. Schwender, J.B. Ohlrogge, Probing in vivo metabolism by stable isotopelabeling of storage lipids and proteins in developing Brassica napus embryos,Plant Physiology 130 (2002) 347–361.

[127] B.H. Junker, J. Lonien, L.E. Heady, A. Rogers, J. Schwender, Parallel determina-tion of enzyme activities and in vivo fluxes in Brassica napus embryos grown onorganic or inorganic nitrogen source, Phytochemistry 68 (2007) 2232–2242.

[128] J. Schwender, Y. Shachar-Hill, J.B. Ohlrogge, Mitochondrial metabolism indeveloping embryos of Brassica napus, Journal of Biological Chemistry 281(2006) 34040–34047.

[129] G.J. Niemann, J.B.M. Pureveen, G.B. Eijkel, H. Poorter, J.J. Boon, Differentialchemical allocation and plant adaptation – A Py-Ms study of 24 species dif-fering in relative growth rate, Plant Soil 175 (1995) 275–289.

[130] R.R. Wise, J.K. Hoober, Synthesis, export and partitioning of end products ofphotosynthesis, in: R.R. Wise, J.K. Hoober (Eds.), Structure and Function ofPlastids, Springer, Dordrecht, The Netherlands, 2007, pp. 274–288.

[131] H. Poorter, M. Bergkotte, Chemical composition of 24 wild species differingin relative growth rate, Plant, Cell and Environment 15 (1992) 221–229.

[132] G.E. Edwards, V.R. Franceschi, E.V. Voznesenskaya, Single-cell C4 photosyn-thesis versus the dual-cell (Kranz) paradigm, Annual Review of Plant Biology55 (2004) 173–196.

191– 192 (2012) 53– 70

[133] W. Majeran, Y. Cai, Q. Sun, K.J. van Wijk, Functional differentiation of bun-dle sheath and mesophyll maize chloroplasts determined by comparativeproteomics, Plant Cell 17 (2005) 3111–3140.

[134] S. von Caemmerer, R.T. Furbank, The C4 pathway: an efficient CO2 pump,Photosynthesis Research 77 (2003) 191–207.

[135] W.R. Chen, X. Yang, Z.L. He, Y. Feng, F.H. Hu, Differential changes in photosyn-thetic capacity, 77K chlorophyll fluorescence and chloroplast ultrastructurebetween Zn-efficient and Zn-inefficient rice genotypes (Oryza sativa) underlow zinc stress, Physiologia Plantarum 132 (2008) 89–101.

[136] C.H. Foyer, H. Vanacker, L.D. Gomez, J. Harbinson, Regulation of photo-synthesis and antioxidant metabolism in maize leaves at optimal andchilling temperatures: review, Plant Physiology and Biochemistry 40 (2002)659–668.

[137] S.I. Allakhverdiev, et al., Heat stress: an overview of molecular responses inphotosynthesis, Photosynthesis Research 98 (2008) 541–550.

[138] Z.R. Li, S. Wakao, B.B. Fischer, K.K. Niyogi, Sensing and responding to excesslight, Annual Review of Plant Biology 60 (2009) 239–260.

[139] O.J. Sanchez, C.A. Cardona, Trends in biotechnological production of fuelethanol from different feedstocks, Bioresource Technology 99 (2008)5270–5295.

[140] V. Mechin, et al., In search of a maize ideotype for cell wall enzymatic degrad-ability using histological and biochemical lignin characterization, Journal ofAgricultural and Food Chemistry 53 (2005) 5872–5881.

[141] A. Brautigam, U. Gowik, What can next generation sequencing do for you?Next generation sequencing as a valuable tool in plant research, Plant Biology

12 (2010) 831–841.

[142] F.M. Canovas, et al., Plant proteome analysis, Proteomics 4 (2004) 285–298.[143] J.D. Young, A.A. Shastri, G. Stephanopoulos, J.A. Morgan, Mapping pho-

toautotrophic metabolism with isotopically nonstationary 13C flux analysis,Metabolic Engineering 13 (2011) 656–665.