
Combinatorial completion for the reconstruction of metabolic networks, and application to the brown alga model Ectocarpus siliculosus

Sylvain Prigent

Dr Anne Siegel, IRISA
Dr Thierry Tonon, SBR

November 14th, 2014

Sylvain Prigent PhD defense November 14th, 2014 1 / 49


Outline

1 Introduction

2 Combinatorial completion

3 Global workflow

4 Biological results

5 Conclusion and perspectives


Introduction

Ectocarpus siliculosus

Stramenopiles:
  Diverged from opisthokonta and plantae more than 1 billion years ago

Secondary endosymbiosis
  Capture red alga ⇒ plastids

Evolved many unusual characteristics
  Adaptation to intertidal zone
  Acclimation to abiotic stresses

A complex evolutionary history

Cock et al., 2009

Introduction

Ectocarpus siliculosus, available data

An annotated genome (Cock et al., 2010);

Transcriptomic data (Dittami et al., 2009);

Metabolite profiling (Gravot et al. 2010, Dittami et al., 2011);

Knowledge on its adaptation and acclimation capacities to environmental changes

Can genomic data explain metabolite profiling, adaptation and acclimation capacities?


Introduction

Systems biology

"To understand complex biological systems requires the integration of experimental and computational research – in other words a systems biology approach." Kitano, 2002

Metabolic networks: relevant biological scale to study functionality and adaptation

Machado et al., 2011

Introduction

Metabolic networks

Metabolic network: complete set of metabolic reactions that determine the physiological and biochemical properties of a cell.

Large scale models of metabolic pathways

source: expasy


Introduction

Metabolic networks

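Reactions:
  R1: 1 A → 1 B
  R2: 1 B + 2 C → 3 D

Network representation:

[Figure: network linking metabolites A, B, C and D through reactions R1 and R2 (stoichiometric weights 2 and 3 on R2), with enzymes 1 to 3 and the genome annotations 1 to 3 they derive from]

Stoichiometric matrix:

       R1   R2
  A    -1    0
  B     1   -1
  C     0   -2
  D     0    3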
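The link between the reaction list and the stoichiometric matrix can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function name `stoichiometric_matrix` are assumptions made here, not something shown in the presentation.

```python
# Reactions R1 and R2 from the slide, written as {metabolite: signed coefficient}
# (negative = consumed, positive = produced). The data layout is an assumption.
reactions = {
    "R1": {"A": -1, "B": 1},
    "R2": {"B": -1, "C": -2, "D": 3},
}


def stoichiometric_matrix(reactions):
    """Return (metabolites, reaction_ids, S), where S[i][j] is the
    coefficient of metabolite i in reaction j."""
    reaction_ids = sorted(reactions)
    metabolites = sorted({m for coeffs in reactions.values() for m in coeffs})
    S = [[reactions[r].get(m, 0) for r in reaction_ids] for m in metabolites]
    return metabolites, reaction_ids, S


metabolites, reaction_ids, S = stoichiometric_matrix(reactions)
for name, row in zip(metabolites, S):
    print(name, row)
```

Each row is a metabolite and each column a reaction, which matches the layout of the matrix on the slide.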


Introduction

Studying metabolic networks using Mixed Integer Linear Programming

Flux Balance Analysis

  To predict unique distribution of internal fluxes
  To hypothesize maximization of biomass: maximize Z = c^T v

Flux Variability Analysis

  To predict range of fluxes related to biomass
  To maximize and minimize v
  To identify 3 classes of reactions: obligatory, blocked and alternatives

Highly dependent on stoichiometry, structure and cofactor equilibrium of the network
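For reference, both analyses are usually written as linear programs. The formulation below is the common textbook one and is not taken from the slides: S is the stoichiometric matrix, v the flux vector, c the objective weights, and the flux bounds l, u and the optimality fraction γ are notation assumed here.

```latex
\begin{align*}
Z^{*} &= \max_{v}\; c^{T} v && \text{s.t. } S v = 0,\; l \le v \le u \\
v_i^{\max} &= \max_{v}\; v_i && \text{s.t. } S v = 0,\; l \le v \le u,\; c^{T} v \ge \gamma Z^{*} \\
v_i^{\min} &= \min_{v}\; v_i && \text{s.t. } S v = 0,\; l \le v \le u,\; c^{T} v \ge \gamma Z^{*}
\end{align*}
```

The first line is FBA; the last two are solved for each reaction i in FVA. One plausible reading of the three classes, assumed here, is that a reaction is blocked when its range is reduced to 0, obligatory when its range excludes 0, and alternative otherwise.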


Introduction

Metabolic networks reconstruction

1. Draft reconstruction

1| Obtain genome annotation.
2| Identify candidate metabolic functions.
3| Obtain candidate metabolic reactions.
4| Assemble draft reconstruction.
5| Collect experimental data.

2. Refinement of reconstruction

6| Determine and verify substrate and cofactor usage.
7| Obtain neutral formula for each metabolite.
8| Determine the charged formula.
9| Calculate reaction stoichiometry.
10| Determine reaction directionality.
11| Add information for gene and reaction localization.
12| Add subsystems information.
13| Verify gene−protein-reaction association.
14| Add metabolite identifier.
15| Determine and add confidence score.
16| Add references and notes.
17| Flag information from other organisms.
18| Repeat Steps 6 to 17 for all genes.
19| Add spontaneous reactions to the reconstruction.
20| Add extracellular and periplasmic transport reactions.
21| Add exchange reactions.
22| Add intracellular transport reactions.
23| Draw metabolic map (optional).
24−32| Determine biomass composition.
33| Add biomass reaction.
34| Add ATP-maintenance reaction (ATPM).
35| Add demand reactions.
36| Add sink reactions.
37| Determine growth medium requirements.

3. Conversion of reconstruction into computable format

38| Initialize the COBRA toolbox.
39| Load reconstruction into Matlab.
40| Verify S matrix.
41| Set objective function.
42| Set simulation constraints.

4. Network evaluation

43−44| Test if network is mass- and charge-balanced.
45| Identify metabolic dead-ends.
46−48| Perform gap analysis.
49| Add missing exchange reactions to model.
50| Set exchange constraints for a simulation condition.
51−58| Test for stoichiometrically balanced cycles.
59| Re-compute gap list.
60−65| Test if biomass precursors can be produced in standard medium.
66| Test if biomass precursors can be produced in other growth media.
67−75| Test if the model can produce known secretion products.
76−78| Check for blocked reactions.
79−80| Compute single gene deletion phenotypes.
81−82| Test for known incapabilities of the organism.
83| Compare predicted physiological properties with known properties.
84−87| Test if the model can grow fast enough.
88−94| Test if the model grows too fast.

Data assembly and dissemination

95| Print Matlab model content.
96| Add gap information to the reconstruction output.

[Overlay labels on the protocol: Data mining and knowledge representation; Simulation and completion; Metabolic draft; Genomic; Functional]

Reconstructing metabolic networks: a task highly dependent on data sources

Thiele & Palsson, 2010


Introduction

Previous metabolic network reconstructions

[Figure: phylogenetic tree of the species with an existing metabolic reconstruction, with Ectocarpus siliculosus marked on the tree]

Blue: reconstructed phylum (count of metabolic reconstructions)
Red: unrepresented phylum (no metabolic reconstructions exist)

Bacteria: Proteobacteria (32), Firmicutes (8), Actinobacteria (6), Cyanobacteria (3), Tenericutes (2), Thermotogae (1), Chloroflexi (1), Bacteroidetes (1)
Archaea: Euryarchaeota (4), Crenarchaeota (1)
Eukaryota: Ascomycota (8), Arthropoda (2), Streptophyta (2), Chlorophyta (2), Apicomplexa (2), Chordata (2), Euglenozoa (1)

Automatic workflows exist for bacteria

Microscope, the SEED, Pathway tools

Rely on genome structure, genetic perturbations

Monk et al., 2014


Introduction

Metabolic network reconstructions for algae

Species | Annotation format | Draft reconstruction | Functional refinement
Chlamydomonas reinhardtii (1) | KEGG | KEGG extraction | Manual
Chlamydomonas reinhardtii (2) | Pre-existing network | Manual | No information
Ostreococcus (3) | KEGG | KEGG extraction | Automatic
Phaeodactylum tricornutum (4) | KEGG | KEGG extraction | Manual
Ectocarpus siliculosus | html pages | ? | ?

Data mining and automatic refinement are needed to reconstruct the metabolic network of Ectocarpus siliculosus

(1) Dal’Molin et al., 2011 (2) Chang et al., 2011 (3) Krumholz et al., 2012 (4) Fabris et al., 2012


Introduction

An overview of the completion problem

[Figure: a seed metabolite S and two target metabolites T1 and T2]

Problem: minimizing the number of added reactions to produce the targets from the seeds


Introduction

An overview of the completion problem

[Figure: example network with seed S, metabolites A to G, targets T1 and T2, and reactions R1 to R5; legend: draft reactions, putative reactions]

Completion | Minimality | Functional?
R1, R2 | Cardinal | Yes
R1, R3, R5 | Subset | No
R4, R1, R5 | Subset | Yes

Problem: minimizing the number of added reactions to produce the targets from the seeds


Introduction

Description of the problem

[Figure: example network with seed S, metabolites A to G, targets T1 and T2, and reactions R1 to R5]

Search space:
  A metabolic draft: directed bipartite graph R_draft;
  A database of reactions: R_db;
  A group of metabolic seeds: M_seed ⊂ M;
  A group of metabolic targets: M_target ⊂ M;
  The search space: R = R_draft ∪ R_db

Completion:
  A group of reactions R_completion ⊆ R_db \ R_draft such that M_target is reachable from M_seed in the network
  ((R_draft ∪ R_completion) ∪ (M_draft ∪ M_completion), E_draft ∪ E_completion)

Problem: find a minimal completion

Highly dependent on reachability
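The definition above can be made concrete with a brute-force sketch on a toy instance. Everything in it is hypothetical: the draft reaction `d1`, the database reactions `X1` to `X5` and their topology are invented for illustration and do not reproduce the network drawn on the slides, and exhaustive enumeration is used only because the instance is tiny (it is not the solving strategy presented later).

```python
from itertools import combinations


def scope(reactions, seeds):
    """Metabolites reachable from the seeds: a reaction fires once all its
    reactants are reached, and then adds its products."""
    reached = set(seeds)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions:
            if reactants <= reached and not products <= reached:
                reached |= products
                changed = True
    return reached


def minimal_completions(draft, database, seeds, targets):
    """Enumerate all subset-minimal completions, smallest first (toy sizes only)."""
    found = []
    names = sorted(database)
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            chosen = set(combo)
            if any(prev <= chosen for prev in found):
                continue  # contains a smaller completion, so not subset-minimal
            network = list(draft.values()) + [database[n] for n in combo]
            if targets <= scope(network, seeds):
                found.append(chosen)
    return found


# Hypothetical instance: each reaction is (reactants, products).
draft = {"d1": ({"S"}, {"A"})}
database = {
    "X1": ({"A"}, {"B"}),
    "X2": ({"B"}, {"T1", "T2"}),
    "X3": ({"A"}, {"T1"}),
    "X4": ({"A"}, {"C"}),
    "X5": ({"C"}, {"T2"}),
}

all_minimal = minimal_completions(draft, database, {"S"}, {"T1", "T2"})
smallest = min(len(c) for c in all_minimal)
cardinal = [c for c in all_minimal if len(c) == smallest]
print(all_minimal, cardinal)
```

In this toy case two completions are subset-minimal, but only the two-reaction one is also cardinality-minimal, which is the distinction the "Minimality" column of the overview table draws.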


Introduction

Metabolic network gap-filling

[Figure: example network with seed S, metabolites A to G, targets T1 and T2, and reactions R1 to R5]

Name | Producibility | Minimality criteria | Completeness
Optstrain (1) & SMILEY (2) | FBA | Cardinal | Unique solution
GapFill (3) | FBA | Cardinal | Unique solution
Christian et al. (4) | Topologic | Subsets | Sampling
Network-expansion (5) | Topologic | Cardinal | Exhaustive

Are topologic studies precise enough to perform gap-filling?

(1) Pharkya et al., 2004 (2) Reed et al., 2006 (3) Satish Kumar et al., 2007 (4) Christian et al., 2009 (5) Schaub and Thiele, 2009


Introduction

Conclusion

How can we perform accurate and exhaustive gap-filling that scales to targeted applications?

Which kind of metabolic network reconstruction pipeline can we propose for non-classical species?

Which biological knowledge do we gain by reconstructing the metabolic network of Ectocarpus siliculosus?


Combinatorial completion

Outline

1 Introduction

2 Combinatorial completion
  The combinatorial problem
  Meneco and functionality
  Improving Network-expansion

3 Global workflow

4 Biological results

5 Conclusion and perspectives


Combinatorial completion The combinatorial problem

Combinatorial problem

[Figure: example network with seed S, metabolites A to G, targets T1 and T2, and reactions R1 to R5]

A topological study of the network and the database
Implement producibility criteria proposed in (1)
Solve combinatorial problem

Reachability:

A metabolite is producible iff:
  It is a seed
  It is a product of a reaction, if all reactants of this reaction are producible

Problem: find a minimal completion with respect to reachability

(1) Ebenhoh et al., 2004


Combinatorial completion The combinatorial problem

How to solve combinatorial problems?

Dedicated algorithm
Use constraint solvers

Answer Set Programming


Combinatorial completion The combinatorial problem

Answer Set Programming in a nutshell

Declarative programming

High-level modeling language (ASP ≃ Prolog expressivity)
  The order of rules has no impact
  No infinite loops in the resolution

High-performance solving capabilities (ASP ≃ SAT, ILP)
  SAT and deductive database techniques for ASP
  Optimisation with different heuristics

Different reasoning modes
  Enumeration
  Intersection
  Union


Combinatorial completion The combinatorial problem

Reachability in Network-expansion

[Figure: example network with seed S, metabolites A to G, targets T1 and T2, and reactions R1 to R5]

Topological study of the network

Reachability:

A metabolite is producible iff:
  It’s a seed
  It’s a product of a reaction, if all reactants of this reaction are producible

Computing scope of seeds in ASP:

    scope(M) :- seed(M).
    scope(M) :- product(M,R), reaction(R,N), scope(M2) : reactant(M2,R).

Thiele & Schaub, 2009
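For readers unfamiliar with ASP, the two rules can be paraphrased as a fixpoint computation. The Python sketch below reuses the predicate names `seed`, `reactant` and `product` as plain sets of tuples; the facts themselves are toy values invented here, and the function `compute_scope` is a hypothetical helper, not part of Network-expansion or Meneco.

```python
# Toy facts in the vocabulary of the ASP rules (hypothetical values).
seed = {"S"}
reactant = {("S", "r1"), ("A", "r2"), ("X", "r2"), ("A", "r3")}
product = {("A", "r1"), ("T1", "r2"), ("T2", "r3")}


def compute_scope(seed, reactant, product):
    """Iterate the two rules until no new metabolite is derived."""
    reactions = {r for _, r in reactant} | {r for _, r in product}
    scope = set(seed)  # rule 1: every seed is in the scope
    while True:
        # rule 2: a reaction fires when all of its reactants are in the scope
        # (a reaction without reactants fires vacuously, as in the ASP rule)
        fired = {
            r for r in reactions
            if all(m in scope for m, r2 in reactant if r2 == r)
        }
        new = {m for m, r in product if r in fired} - scope
        if not new:
            return scope
        scope |= new


result = compute_scope(seed, reactant, product)
print(sorted(result))
```

Here `r2` never fires because its reactant `X` is not producible, so `T1` stays outside the scope: this is the kind of gap that completion has to fill.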


Combinatorial completion Meneco and functionality

Topologic completion vs stoichiometric studies?

Benchmark on Palsson’s E. coli network
FVA: identification of obligatory, blocked and alternative reactions
Degradation of the network → 3,600 replicates

[Figure: percentage of retrieved reactions for each reaction class (obligatory, blocked, alternatives) at degradation rates of 10%, 20%, 30% and 40%]

Most obligatory reactions are identified
Blocked reactions are missed

Topological criteria are precise enough to recover functionality


Combinatorial completion Improving Network-expansion

Limitations of Network-expansion

[Figure: solving time in seconds (log scale, 0.1 to 100000) versus number of reactions in the database (5000 to 10000, then the full database), for the Clasp solver]

Does not scale to large metabolic reaction databases

Reversible reactions are split into two reactions

Improvements are mandatory


Combinatorial completion Improving Network-expansion

Changing solver (LPNMR 2013)

[Figure: solving time in seconds (log scale, 0.1 to 100000) versus number of reactions in the database (5000 to 10000, then the full database), for the Clasp and Unclasp solvers]

Solution size is small (∼ 10-100) with respect to the size of the search space (∼ 10,000)

Use of a new ASP solver: constraint relaxation

Using unsatisfiable cores enables finding optima in linear time


Combinatorial completion Improving Network-expansion

Reversibility (LPNMR 2013)

New representation of reversibility in the encoding

Fit with biological reality

Smaller solution space

[Figure: solving time in seconds (log scale, 0.1 to 10) versus number of reactions in the database (5000 to 10000, then the full database), with and without the reversibility encoding]

Improving biological relevance


Combinatorial completion Improving Network-expansion

Conclusion

Topological criteria are efficient to do the completion

Computation time improved by changing the solver

Biological relevance improved by changing encoding of reversibility

Collet et al., LPNMR, 2013

⇒ Meneco

Packaged as a Python package

Available online

http://mobyle.genouest.org/
http://bioasp.github.io/meneco/


Global workflow

Outline

1 Introduction

2 Combinatorial completion

3 Global workflow
  Creating metabolic draft
  Completion
  Study of the completion

4 Biological results

5 Conclusion and perspectives


Global workflow Creating metabolic draft

Building a metabolic draft

Functional annotation

Genome annotations are not standardized

May lose information

Orthology search in related species

Gene sequences have diverged

Orthology search may fail

Combining annotations and orthology information to improve draft reconstruction


Global workflow Creating metabolic draft

Building a metabolic draft for Ectocarpus siliculosus


Global workflow Creating metabolic draft

Merging two metabolic drafts

If the two drafts are not based on the same database

Unification of identifiers needed
  Cross-references
  Same reactants & products ⇒ same reaction

⇒ MeMap/MeMerge
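The rule "same reactants & products ⇒ same reaction" can be illustrated with a small sketch. It is not the MeMap/MeMerge implementation, whose internals are not described in the slides: the function names, the `metabolite_map` cross-reference table and all identifiers below are hypothetical, and the sketch ignores stoichiometry and direction.

```python
def signature(reactants, products):
    """Identify a reaction by its sets of reactants and products."""
    return (frozenset(reactants), frozenset(products))


def merge_drafts(draft_a, draft_b, metabolite_map):
    """Merge two drafts of the form {reaction_id: (reactants, products)}.

    metabolite_map translates metabolite identifiers of draft_b into the
    namespace of draft_a (the cross-references)."""
    merged = dict(draft_a)
    known = {signature(*rxn): rid for rid, rxn in draft_a.items()}
    same_as = {}  # reaction of draft_b -> equivalent reaction already present
    for rid, (reactants, products) in draft_b.items():
        translated = (
            [metabolite_map.get(m, m) for m in reactants],
            [metabolite_map.get(m, m) for m in products],
        )
        sig = signature(*translated)
        if sig in known:
            same_as[rid] = known[sig]
        else:
            merged[rid] = translated
            known[sig] = rid
    return merged, same_as


# Hypothetical drafts built from two databases with different identifiers.
draft_a = {"a1": (["glc", "atp"], ["g6p", "adp"])}
draft_b = {
    "b1": (["M1", "M2"], ["M3", "M4"]),
    "b2": (["M3"], ["f6p"]),
}
metabolite_map = {"M1": "glc", "M2": "atp", "M3": "g6p", "M4": "adp"}

merged, same_as = merge_drafts(draft_a, draft_b, metabolite_map)
print(sorted(merged), same_as)
```

Once identifiers are unified, `b1` is recognised as a duplicate of `a1`, while `b2` is added as a new reaction expressed in the first draft's namespace.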


Global workflow Creating metabolic draft

Merging metabolic drafts for Ectocarpus siliculosus


Global workflow Completion

Meneco

25 of 50 target metabolites not producible

Completion using MetaCyc database and Meneco

∼ 1 hour for the union

Minimal number of reactions to add in the network: 44

4,320 different sets of 44 reactions can complete the network

Union of these sets: 60 reactions

Completion is highly combinatorial
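The relation between the enumerated solutions, their union and the reactions common to all of them can be shown on a toy example. The three completions below are invented, and the sketch simply post-processes an explicit list of solutions; it makes no claim about how the figures above were actually computed.

```python
def summarize(completions):
    """Union and intersection of a non-empty list of completions
    (each completion is a set of reaction identifiers)."""
    union = set().union(*completions)
    ubiquitous = set.intersection(*completions)
    return union, ubiquitous


# Hypothetical enumeration of three minimal completions of equal size.
completions = [
    {"rxn_a", "rxn_b", "rxn_c"},
    {"rxn_a", "rxn_b", "rxn_d"},
    {"rxn_a", "rxn_c", "rxn_e"},
]

union, ubiquitous = summarize(completions)
print(len(union), sorted(ubiquitous))
```

The union can stay small even when the number of solutions is large, because the solutions differ only in a few interchangeable reactions.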


Global workflow Study of the completion

Semantic analysis of the 4,320 solutions

35 reactions are ubiquitous
Some reactions are mutually exclusive
  Never present together in the same completion
  Should be biologically equivalent

[Figure: structure of the completions of EctoGEM-combined (1,785 reactions, 1,981 compounds)]

35 ubiquitous reactions: Dihydrofolatesynth-RXN, H2neopterinaldol-RXN, RXN-9655, ...
1/2 reactions: RXN-9549, RXN3O-9780
2/3 reactions: Phosphoglycerate-phosphatase-RXN, Glycerol-dehydrogenase-NADP+-RXN, 3-phosphoglycerate-phosphatase-RXN
1/3 reactions: Fatty-acid-synthase-RXN, Fatty-acyl-CoA-synthase-RXN, RXN-12766
1/3 reactions: ACP-S-acetyltransfer-RXN, RXN-2361, 2.3.1.180-RXN
1 reaction: Adenylylsulfate-reductase-RXN
1 reaction: RXN-8389
1/4 reactions: RXN-961, Glyoxylate-reductase-NADP+-RXN, Glycolate-reductase-RXN, Glycolald-dehydrog-RXN
1/2 pairs of reactions: RXN-9634 + RXN3O-5304, RXN-9634 + RXN-9543

Before: 60 reactions, 4,320 completions
After: 56 reactions, 432 completions

Semantic analysis reduced the combinatorics of the completion
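As a sanity check on these counts, the number of completions can be recomputed from the choice groups of the figure. The reading of the groups below (pick k reactions out of a group of n, written k/n) is an interpretation of the extracted slide text, and treating the groups as independent is an assumption made here.

```python
from math import comb, prod

# (number of reactions picked, size of the group), as read from the figure:
# 1/2, 2/3, 1/3, 1/3, 1/4 reactions and 1/2 pairs of reactions.
choice_groups = [(1, 2), (2, 3), (1, 3), (1, 3), (1, 4), (1, 2)]

# With independent groups, the counts of possible choices multiply.
n_completions = prod(comb(size, picked) for picked, size in choice_groups)
print(n_completions)
```

Under that reading the product is 432, which agrees with the number of completions reported after the semantic analysis.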


Global workflow Study of the completion

Looking for enzymes in the genome

Proposed completions should have a biological relevance

For each reaction:

  Build a hidden Markov model from existing sequences
  Search for this model in the genome

If match found:

  Gene previously not or badly annotated
  Helping manual curation

Focus on particular enzymes provides new insights into the reannotation of the genome


Global workflow Study of the completion

EctoGEM 1.0

Prigent et al., Plant Journal, 2014

Global workflow Study of the completion

Conclusion

Data mining and knowledge representation

Combining data sources

Automatic combinatorial completion

  Many solutions but not so many reactions
  Scaling

Towards an automatic workflow

Helping manual curation

Pre-treatment and post-treatment of data are mandatory


Biological results

Outline

1 Introduction

2 Combinatorial completion

3 Global workflow

4 Biological results
  Functionality
  Reannotation of genes
  New insights into aromatic amino acid synthesis

5 Conclusion and perspectives


Biological results Functionality

Functionality of the obtained network

Development of a specific biomass function

  Bibliographic study
  30 metabolites

Flux Balance Analysis study

Network functionally valid

Topologic completion was sufficient to have a functional network


Biological results Reannotation of genes

Reannotations

[Figure: proportion of words in the names of the pathways in which the genes are involved]

56 genes reannotated

Reannotation of biosynthesis pathways


Biological results New insights into aromatic amino acid synthesis

Aromatic amino acid biosynthesis

Reconstruction of metabolic network pinpoints a different pathway when compared to other stramenopiles


Biological results New insights into aromatic amino acid synthesis

Aromatic amino-acid biosynthesis

Arrows: bifunctional enzymes

New insights into the evolution of aromatic amino acids synthesis


Biological results New insights into aromatic amino acid synthesis

Conclusion

Reconstruction process provides new insights into the physiology of organisms

Reconstruction of Ectocarpus siliculosus metabolic network enables a better understanding of:

  Metabolism of Ectocarpus siliculosus
  Evolution of aromatic amino acid biosynthesis


Conclusion and perspectives

Outline

1 Introduction

2 Combinatorial completion

3 Global workflow

4 Biological results

5 Conclusion and perspectives
  Conclusion
  Perspectives


Conclusion and perspectives Conclusion

Conclusion

Topologic completion

Sufficient to obtain a functional network

Semi-automatic pipeline to reconstruct metabolic networks

New insights into the evolution of Ectocarpus siliculosus

Reannotation of the genome


Conclusion and perspectives Perspectives

Perspectives in bioinformatics

Improving the network

New metabolite profiling

Better completion

New RNA-seq data

How to include them in the pipeline?

Ectocarpus siliculosus associated with bacteria

  Study the association between metabolic networks of different origins
  Holobiont metabolic network


Conclusion and perspectives Perspectives

Perspectives in computer science

Continue improvements on Meneco

Studying subset minimality

New incremental solving in ASP

Deeper study of the effect of cycles

New semantics of producibility

Preliminary results: totally different completions

[Figure: example network with seed S, metabolites A to G, targets T1 and T2, and reactions R1 to R5]


Conclusion and perspectives Perspectives

Perspectives in biology

Study adaptation and acclimation

A freshwater species of Ectocarpus has been sequenced

  Reconstruct its metabolic network
  Comparative metabolic network analysis

Use transcriptomic data

  Decomposition of the network
  Find up- or down-regulated parts of this network in response to stress


Conclusion and perspectives Perspectives

Thanks for your attention
