Outline 1 1.INTRODUCTION 2. METABOLOMICS WORKFLOW 3. CHALLENGES AND LIMITATIONS.


Outline

1. INTRODUCTION

2. METABOLOMICS WORKFLOW

3. CHALLENGES AND LIMITATIONS


1. INTRODUCTION

Sensitive and specific methods, but…

… targeted approaches focus on particular compounds / activities

Increasing need for methods allowing a global characterisation whatever the situation

Objective: to address, in a comprehensive manner, complex situations that were previously dealt with piecemeal

We only search for the “known” & we only find what we search for

Concept of biological fingerprint

From targeted to untargeted approaches

1. INTRODUCTION Different levels of investigation

Genomics / Transcriptomics: DNA & RNA (GENOTYPE): what may happen

Proteomics: proteins & peptides: what makes it happen

Metabolomics: metabolites (PHENOTYPE): what happens

GLOBAL APPROACHES without any a priori hypothesis

Different levels of investigation but similar objectives

Large-scale analysis of biological systems (cells, tissues, complex matrices)

• Transcriptomics: large-scale analysis of mRNA transcripts

• Proteomics: large-scale analysis of proteins

• Metabolomics: large-scale analysis of metabolites

• other “…omics”

Markers = genes, proteins, metabolites, …

1. INTRODUCTION Different levels of investigation

METABONOMICS “…measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification…” Nicholson et al., 1999.

METABOLOMICS “...the complete set of metabolites/low-molecular-weight intermediates, which are context dependent, varying according to the physiology, developmental or pathological state of the cell, tissue, organ or organism…” Oliver, 1998

1. INTRODUCTION Definitions

In practice, same final objective: compare patterns, signatures or “fingerprints” of metabolites that change in response to external stimuli, using a differential analysis of samples collected from two (or more) populations, namely “case” and “control”.

Large diversity of Molecular Weight (from 50 to 1500 Da)

Organic acids - sugars - fatty acids - lipids - amino acids - peptides - vitamins - …

METABOLOME

Multiple classes of compounds => different chemical properties => different physical properties

Extended range of concentrations: from pmol to mmol

What is the metabolome?

Metabolites come from:

● catabolism (degradation of organic matter)

● anabolism (synthesis of components by the cell)

● external sources: diet, medication, … (xenometabolome)

1. INTRODUCTION

The metabolome size:

● Plants: more than 200 000 metabolites

● Human: unknown, but estimated at thousands of metabolites, and even more if external sources (diet, medicines…) are considered

1. INTRODUCTION

1. INTRODUCTION Concept

Organisms - Plants - Biological fluids - Food matrices

« Controls » versus « Cases »

1/ Collection and preparation of the samples (2 or more sub-groups to be compared)

2/ Generation of fingerprints and search for differences in the metabolic profiles (= potential biomarkers)

3/ Biological interpretation of observations

Biochemical profile: UP / DOWN regulation

1. INTRODUCTION 3 different approaches

● Metabolite targeted analysis: detection and precise quantification of a single or small set of target compounds

● Metabolic profiling: analysis of a group of metabolites either related to a specific metabolic pathway or a class of compounds

● Metabolic fingerprinting

Scope

Accuracy

http://manet.illinois.edu/pathways.php

Outline

1. INTRODUCTION

2. METABOLOMICS WORKFLOW

3. CHALLENGES AND LIMITATIONS

DEFINITION AND STUDY DESIGN

EXPERIMENT / SAMPLE COLLECTION

SAMPLE PREPARATION

METABOLOMICS PROFILES GENERATION

DATA PROCESSING

DATA ANALYSIS

STRUCTURAL ELUCIDATION / BIOLOGICAL INTERPRETATION

[Figure: data table (list of samples × list of ions, with ion description / statistical indicators) and PCA score plot, t[1] versus t[2], coloured according to series (1, 2, 3); R2X[1] = 0.525, R2X[2] = 0.113; ellipse: Hotelling's T2 (95%)]

2.1 Study Design – Sample Collection

Study design is a step that requires multidisciplinary skills:

- statisticians

- biologists / clinicians

- chemists

Different questions have to be answered at this stage:

● A priori knowledge of the subject?

● Confounding factors?

● Statistical power?
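The statistical power question can be explored before any sample is collected. The sketch below is only an illustration, assuming a two-group comparison of a single metabolite with a two-sided test and the usual normal approximation; the function name, the effect sizes and the default alpha and power are assumptions, not values from this course.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of samples per group for a two-sided, two-group comparison.

    effect_size is the standardized difference between the group means
    (difference divided by the common standard deviation).
    Normal approximation: n = 2 * (z_(1 - alpha/2) + z_power)^2 / effect_size^2
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 / effect_size ** 2
    return math.ceil(n)


# Illustrative effect sizes: the smaller the expected difference, the more samples are needed.
for d in (1.0, 0.5):
    print(f"effect size {d}: about {samples_per_group(d)} samples per group")
```

In an untargeted study thousands of ions are tested at once, so a correction for multiple testing would raise these numbers; the sketch ignores that.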


● Sample selection?

- for mammalian species: urine, plasma, serum, tissue…?

- for plants: root, leaf, flower…?

● Feasibility of collection?

● Single collection protocol?

● Interruption of the metabolism? (e.g. quenching with liquid nitrogen or organic solvents)

● Sample storage? -80 °C? -20 °C?

2.1 Study Design – Sample Collection

2.2 Sample preparation

Matrices and protocols

Liquid matrices:

- Filtering

- Protein removal (10 kDa cut-off)

- Freeze-drying for urine samples to normalize the dilution factor

- Dilution

Solid matrices:

- Homogenization

- Freeze-drying for better extraction capabilities

Extraction solvents, from most polar (+) to least polar (-): MeOH, ACN/Acetone, Ethyl Acetate, Chloroform

Different solvents may give access to different parts of the metabolome

Further protocols

- Liquid/liquid partitioning

- Solid phase extraction (for better selectivity)

- Derivatization

What is an optimal sample preparation?

Elimination of interferences (limit matrix effects)

Repeatable and reproducible

Simple and fast

Maximisation of the information to answer the biological question

Fingerprint as exhaustive as possible

2.2 Sample preparation


2.3 Metabolomics profiles generation Mass measurement principle

[Figure: principle of the mass measurement, with m/z reading]

IONISATION: metabolite -> intact metabolite = precursor ion

FRAGMENTATION: fragmented metabolite; each part is a product ion


[Figure: mass spectrum, abundance versus m/z, from the smallest fragment to the largest fragment]

Resolution: ability to distinguish two peaks of slightly different mass-to-charge ratios, ΔM, in a mass spectrum

● Resolution: R = M / ΔM

[Figure: simulated spectra of CO (m/z = 27.99491) and N2 (m/z = 28.00614), relative abundance versus m/z, resolving power at FWHM. At R = 1000 the two ions give a single peak at m/z 28.0; at R = 5000 two peaks are separated at m/z 27.99491 and 28.00614]

With increased resolving power, increased information is obtained from the acquisition
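The CO / N2 example can be checked numerically from the definition R = M / ΔM. The sketch below is only an illustration: the function name is hypothetical, M is taken as the mean of the two m/z values, and the comparison "R >= M / ΔM" is a simplified criterion that ignores peak shape and relative abundance.

```python
def required_resolving_power(mz_a: float, mz_b: float) -> float:
    """Resolving power R = M / delta_M needed to separate two ions (M = mean m/z)."""
    delta = abs(mz_a - mz_b)
    if delta == 0:
        raise ValueError("identical m/z values cannot be separated")
    return ((mz_a + mz_b) / 2) / delta


CO_MZ = 27.99491  # m/z of CO given in the example
N2_MZ = 28.00614  # m/z of N2 given in the example

needed = required_resolving_power(CO_MZ, N2_MZ)
for r in (1000, 5000):
    status = "resolved" if r >= needed else "not resolved"
    print(f"R = {r}: CO and N2 are {status} (about {needed:.0f} needed)")
```

The required value falls between the two resolving powers of the example, which is consistent with one merged peak at R = 1000 and two peaks at R = 5000.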

2.3 Metabolomics profiles generation Mass measurement principle

[Figure: two mass spectra (abundance versus m/z), with a zoom]

2.3 Metabolomics profiles generation Mass spectrometry: different analysers

RESOLVING POWER: MASS SPECTROMETER

LOW (< 1000): QUADRUPOLE, ION TRAP

MIDDLE (> 5000): TOF

HIGH (up to 100 000): ORBITRAP

ULTRA-HIGH (up to 2 000 000): FT-ICR

2.3 Metabolomics profiles generation Mass spectrometry: different acquisition modes

[Figure, panels (a) to (d): acquisition modes, from the precursor ion set to the product ion set]

Full scan: Scan

Precursor ion scanning: Scan - Dissociation - Select

Neutral loss scanning: Scan - Dissociation - Scan

Multiple Reaction Monitoring: Selection - Dissociation - Selection

Full scan mode

● When there is no presupposed hypothesis on the subject… be as open as possible

More information to process, hence less visibility of the differences between groups

[Figure: full scan profiles of a « Control » sample and a « Test » sample]

Precursor ion scan mode

● When there is a presupposed hypothesis on the subject… more specificity can be useful

Product ion scan: example of estradiol-17-sulfate

[Figure: profiles of a control animal and of a treated animal; scheme Scan - Dissociation - Select, from the precursor ion set to the product ion set]

When prior knowledge is available, relevant information can be made easier to find with appropriate acquisition parameters

● Coupling mass spectrometry with GC or LC

Gas Chromatography: volatile compounds; EI (CI); apolar columns (polar columns)

Liquid Chromatography: polar and ionic compounds; ESI (APCI, APPI); reverse phase HPLC (HILIC)

Many choices have to be made to reach the most relevant part of the metabolome (to answer the biological question)

Often you make do with the configuration that you have in the lab…

2.3 Metabolomics profiles generation

2.4 Data processing The problem

1 sample = 3D matrix (time, m/z, intensity)

1 sample = 1 total ion current (intensity versus time, in min)

1 sample = x HR mass spectra (intensity versus m/z), typically 2 scans [m/z 60-1000] per second

Complexity of a metabolic fingerprint: need for a well-defined peak list

Metabolic profiles to be compared: from the initial information to analysable information
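The relation between the 3D matrix, the total ion current and a single-ion trace can be sketched in a few lines. This is a toy illustration, not the output of any vendor software: the scan structure, the function names, the intensities and the 5 ppm tolerance are all assumptions.

```python
from typing import List, Tuple

# One scan = (retention time in min, list of (m/z, intensity) pairs).
Scan = Tuple[float, List[Tuple[float, float]]]


def total_ion_current(scans: List[Scan]) -> List[Tuple[float, float]]:
    """Sum all intensities of each scan: one (time, intensity) point per scan."""
    return [(rt, sum(i for _, i in peaks)) for rt, peaks in scans]


def ppm_window(mz: float, ppm: float) -> Tuple[float, float]:
    """m/z interval of +/- ppm parts per million around a target m/z."""
    half_width = mz * ppm * 1e-6
    return mz - half_width, mz + half_width


def extracted_ion_chromatogram(scans: List[Scan], mz: float, ppm: float = 5.0):
    """Keep, in each scan, only the signal falling inside the m/z window."""
    low, high = ppm_window(mz, ppm)
    return [(rt, sum(i for m, i in peaks if low <= m <= high)) for rt, peaks in scans]


# Hypothetical toy data: three scans with made-up intensities.
scans = [
    (4.90, [(151.09651, 200.0), (279.15909, 50.0)]),
    (4.98, [(151.09651, 180.0), (279.15912, 900.0)]),
    (5.06, [(151.09651, 210.0), (279.15905, 60.0)]),
]

tic = total_ion_current(scans)
eic = extracted_ion_chromatogram(scans, 279.15909)
print("TIC:", tic)
print("EIC:", eic)
```

Real raw files hold thousands of scans with hundreds of signals each, which is why dedicated processing tools are needed to reduce them to a peak list.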

2.4 Data processing The problem

Need for tools to convert the 3D matrix into a table of descriptors (list of samples × list of ions, with ion description / statistical indicators)

DATA PROCESSING: Raw data -> Filtration -> Peak picking -> Alignment -> Reporting -> Statistical analysis

2.4 Data processing Many tools

SIEVE

[Figure: LC-HRMS data of sample 121112011 (FTMS + c ESI, full MS [65.00-1000.00]). Extracted ion chromatogram, m/z = 279.15769-279.16049, relative abundance versus time (RT: 0.00 - 25.00 min). Mass spectrum of scan #585 (RT: 4.98 min), relative abundance versus m/z, with the ion at m/z 279.15909 assigned to dibutylphthalate]

Filtration -> Peak picking -> Alignment -> Reporting

Analytical background noise

2.4 Data processing

Peak picking: report the signal abundance observed for each ion [m/z; rt]i in all the analyzed samples. Various methods exist, all largely influenced by multiple end-user parameters.

[Figure: extracted ion chromatogram (signal intensity versus rt, in min) giving integrated peak areas; extracted mass spectrum (signal intensity versus m/z) giving ion signal intensities]

2.4 Data processing
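As an illustration of the "integrated peak area" reported for one ion, the sketch below integrates an extracted ion chromatogram with the trapezoidal rule. It is a deliberately simple assumption: real peak-picking tools also detect the peak boundaries and estimate the baseline, which are given by hand here, and the data points are made up.

```python
from typing import List, Tuple


def peak_area(eic: List[Tuple[float, float]], rt_start: float, rt_end: float,
              baseline: float = 0.0) -> float:
    """Trapezoidal area of the signal between rt_start and rt_end (inclusive).

    The constant baseline is subtracted first; negative values are set to zero.
    """
    points = [(rt, max(i - baseline, 0.0)) for rt, i in eic if rt_start <= rt <= rt_end]
    area = 0.0
    for (rt0, i0), (rt1, i1) in zip(points, points[1:]):
        area += (rt1 - rt0) * (i0 + i1) / 2
    return area


# Hypothetical chromatographic peak: (rt in min, signal intensity).
eic = [(4.8, 0.0), (4.9, 100.0), (5.0, 400.0), (5.1, 100.0), (5.2, 0.0)]

print(peak_area(eic, 4.8, 5.2))                  # whole peak
print(peak_area(eic, 4.9, 5.1))                  # narrower integration window
print(peak_area(eic, 4.8, 5.2, baseline=100.0))  # higher baseline
```

The three calls return different areas for the same signal, which is the practical meaning of "largely influenced by multiple end-user parameters".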

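Peak alignment

Samples: one injection takes tens of minutes

• Stability / reproducibility of retention times? Alignment required

[Figure: the same ion MxTy observed in several samples (intensity versus retention time tr), matched within a tolerance: MxTy +/- e]

2.4 Data processing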
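The "MxTy +/- e" idea can be sketched as a tolerance-based grouping of features across samples. This is a minimal greedy version written for illustration only; the tolerances, the sample names and the feature values are assumptions, and real tools usually also correct retention time drift before grouping.

```python
from typing import Dict, List, Tuple

Feature = Tuple[float, float, float]  # (m/z, retention time in min, intensity)


def align(samples: Dict[str, List[Feature]], mz_tol: float = 0.005, rt_tol: float = 0.2):
    """Greedy alignment of features across samples.

    A feature joins the first existing row whose reference (m/z, rt) lies within
    both tolerances and that has no value yet for this sample; otherwise it
    opens a new row.
    """
    rows = []  # each row: {"mz": ..., "rt": ..., "intensities": {sample: intensity}}
    for name, features in samples.items():
        for mz, rt, intensity in features:
            for row in rows:
                if (abs(row["mz"] - mz) <= mz_tol and abs(row["rt"] - rt) <= rt_tol
                        and name not in row["intensities"]):
                    row["intensities"][name] = intensity
                    break
            else:
                rows.append({"mz": mz, "rt": rt, "intensities": {name: intensity}})
    return rows


# Hypothetical features detected in two samples.
samples = {
    "sample_1": [(279.1591, 4.98, 900.0), (151.0965, 2.10, 200.0)],
    "sample_2": [(279.1593, 5.05, 450.0), (304.1754, 7.30, 120.0)],
}

rows = align(samples)
for row in rows:
    print(f"M{row['mz']:.4f} T{row['rt']:.2f}", row["intensities"])
```

Ions missing from a sample simply have no entry in the row; how such missing values are filled is another end-user choice.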


Generation of final report

2.4 Data processing

Peak list which can be analyzed to extract relevant information (differences in intensity for defined ions between groups)

[Table layout: one row per detected ion; columns: characteristics (m/z, RT), then signal intensities in samples (Sample 1, Sample 2, …)]

Need for tools to explore the inherent structure and meaning of the data set

2.5 Statistical Analysis

In metabolomics, K >> N (K variables, N samples)

X: data table (N samples × K variables); Y: class membership

• Exploratory methods (unsupervised): exploring similarities and differences among samples and data variables, e.g. PCA

• Modeling methods (supervised): construction of classification / discrimination models, e.g. PLS-DA

2.5 Statistical Analysis Principal Component Analysis

PCA:

• Process that transforms a number of possibly correlated variables into a smaller number of uncorrelated latent variables or principal components

• Reduces thousands of variables into 2-3 key features

Need to define new axes to project the observations from the original space (k dimensions, variables v1 … vk) onto 2D - 3D surfaces with maximal information and minimal deformation.

[Figure: score plot]
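The projection described above can be sketched without any statistics package. The code below computes only the first principal component by power iteration, as an illustration: the function name, the toy table and the number of iterations are assumptions, and statistical software typically uses more robust algorithms (NIPALS, SVD) and proper scaling.

```python
import math
from typing import List, Tuple


def first_principal_component(data: List[List[float]],
                              iterations: int = 200) -> Tuple[List[float], List[float]]:
    """Return (loadings, scores) of the first principal component.

    Columns are mean-centred, then the direction of maximal variance is
    found by power iteration on X^T X.
    """
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    x = [[row[j] - means[j] for j in range(k)] for row in data]

    w = [1.0 / math.sqrt(k)] * k  # arbitrary unit start vector
    for _ in range(iterations):
        t = [sum(xi[j] * w[j] for j in range(k)) for xi in x]       # t = X w
        w = [sum(x[i][j] * t[i] for i in range(n)) for j in range(k)]  # w = X^T t
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0:
            raise ValueError("start vector is orthogonal to the data")
        w = [v / norm for v in w]
    scores = [sum(xi[j] * w[j] for j in range(k)) for xi in x]
    return w, scores


# Hypothetical table: 4 samples (rows) x 3 ions (columns).
data = [
    [10.0, 5.0, 1.0],
    [11.0, 5.5, 1.2],
    [20.0, 10.0, 0.9],
    [21.0, 10.5, 1.1],
]

loadings, scores = first_principal_component(data)
means = [sum(row[j] for row in data) / len(data) for j in range(3)]
total_ss = sum((row[j] - means[j]) ** 2 for row in data for j in range(3))
r2x = sum(t * t for t in scores) / total_ss  # share of variance carried by the component
print("loadings:", [round(v, 3) for v in loadings])
print("scores:", [round(t, 2) for t in scores])
print("R2X[1]:", round(r2x, 4))
```

Here one latent variable summarizes almost all the variation of three correlated variables, and the scores separate the first two samples from the last two.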

2.5 Statistical Analysis Partial Least Square Discriminant Analysis

Need to define new axes to project the observations from the original space (p dimensions) onto 2D surfaces with maximal distances between the pre-determined groups

PLS-DA:

• Data reduction as for PCA but…

• Process that uses a multiple linear regression technique to find the direction of maximum covariance between a data set (X) and a class membership (Y)

• Extracted features are in the form of latent variables
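The "direction of maximum covariance" has a simple closed form for the first latent variable when Y is a single class column: after centring, the weight vector is X^T y scaled to unit length. The sketch below shows only this first step, under that assumption; it is not a full PLS-DA (no further components, no cross-validation), and the toy table is made up so that the first ion carries a large variation unrelated to the class.

```python
import math
from typing import List, Tuple


def first_pls_component(data: List[List[float]],
                        classes: List[float]) -> Tuple[List[float], List[float]]:
    """First PLS weight vector w and scores t = X w for a single class column y.

    After mean-centring X and y, w = X^T y / ||X^T y|| is the unit direction
    whose scores have maximal covariance with y.
    """
    n, k = len(data), len(data[0])
    x_means = [sum(row[j] for row in data) / n for j in range(k)]
    y_mean = sum(classes) / n
    x = [[row[j] - x_means[j] for j in range(k)] for row in data]
    y = [c - y_mean for c in classes]

    w = [sum(x[i][j] * y[i] for i in range(n)) for j in range(k)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("X carries no covariance with the class membership")
    w = [v / norm for v in w]
    scores = [sum(x[i][j] * w[j] for j in range(k)) for i in range(n)]
    return w, scores


# Hypothetical table: ion 1 varies strongly for a reason unrelated to the class,
# ion 3 shows a small class-related change.
data = [
    [10.0, 5.0, 1.0],  # control
    [20.0, 5.0, 1.0],  # control
    [10.0, 5.0, 3.0],  # treated
    [20.0, 5.0, 3.0],  # treated
]
classes = [0.0, 0.0, 1.0, 1.0]

weights, scores = first_pls_component(data, classes)
print("weights:", weights)
print("scores:", scores)
```

An unsupervised analysis of the same table would be dominated by ion 1, whereas the supervised direction points at ion 3 and separates the two classes.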

[Figure: supervised analysis (PLS-DA), OPLS score plot, t[1] versus to[1], coloured according to control / treated (C/T); R2X[1] = 0.0711, R2X[XSide Comp. 1] = 0.549; ellipse: Hotelling T2 (0.95). Separation: Controls / Treated]

[Figure: PCA score plot, t[1] versus t[2] (PC1, PC2), coloured according to sex (M/F); R2X[1] = 0.545, R2X[2] = 0.106; ellipse: Hotelling T2 (0.95). Separation: Males / Females]

Example of PCA / PLS-DA performed on metabolomics profiles acquired from urine samples collected from control animals versus animals treated with an anabolic substance

A commonly encountered problem: biological variability.

Example: in metabolomics, the studied factor is commonly associated with a subtle signature that is difficult to reveal by unsupervised analyses.

2.5 Statistical Analysis

● First step = identification of the MS molecular ion

Electron Ionization (EI)

Molecular ion = radical cation produced from the neutral molecule that has lost one electron: M+• (mass: M)

Chemical Ionization (CI)

Pseudomolecular ion:

In positive mode:

- adduct with a proton: (M+H)+ (mass: M+1)

- adduct with the reagent gas: (M+NH4)+ (mass: M+18)

In negative mode:

- radical anion: M-• (mass: M)

- loss of a proton: (M-H)- (mass: M-1)

Atmospheric Pressure Ionization sources (ESI, APCI…)

Pseudomolecular ion:

In positive mode:

- adduct with a proton: (M+H)+ (mass: M+1)

- other adducts: (M+Na)+, (M+K)+ … (mass: M+23 or M+39 …)

In negative mode:

- loss of a proton: (M-H)- (mass: M-1)

- acetate adduct: (M+CH3COO)- (mass: M+59)

2.6 Structure Elucidation

Application example in positive electrospray (ESI)

Leucine: what is the molecular ion?

[Figure: mass spectrum with ions at m/z 86, 132, 149, 154, 170 and 263]

m/z 86: fragment

m/z 132: (M+H)+

m/z 149: (M+NH4)+

m/z 154: (M+Na)+

m/z 170: (M+K)+

m/z 263: (2M+H)+
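The leucine exercise can be solved systematically: try every molecular mass implied by a (peak, adduct) pair and keep the one that explains the most peaks. The sketch below is only an illustration using the nominal mass shifts of positive mode (M+1, M+18, M+23, M+39, 2M+1); the function names and the scoring rule are assumptions.

```python
from typing import Dict, List

# Nominal mass shifts of positive-mode ions; (2M+H)+ is handled separately.
ADDUCT_SHIFTS = {"(M+H)+": 1, "(M+NH4)+": 18, "(M+Na)+": 23, "(M+K)+": 39}


def explained_peaks(m: int, peaks: List[int]) -> Dict[int, str]:
    """Map each observed peak to the ion it would be if the molecular mass were m."""
    expected = {m + shift: name for name, shift in ADDUCT_SHIFTS.items()}
    expected[2 * m + 1] = "(2M+H)+"
    return {p: expected[p] for p in peaks if p in expected}


def best_molecular_mass(peaks: List[int]) -> int:
    """Try every mass implied by a (peak, adduct) pair; keep the one explaining most peaks."""
    candidates = set()
    for p in peaks:
        candidates.update(p - shift for shift in ADDUCT_SHIFTS.values())
        if (p - 1) % 2 == 0:  # (2M+H)+ requires an odd nominal m/z
            candidates.add((p - 1) // 2)
    return max(candidates, key=lambda m: (len(explained_peaks(m, peaks)), m))


peaks = [86, 132, 149, 154, 170, 263]  # nominal m/z values of the leucine example
mass = best_molecular_mass(peaks)
assignment = explained_peaks(mass, peaks)
print("molecular mass:", mass)
for p in peaks:
    print(p, assignment.get(p, "not an adduct of M (fragment?)"))
```

The search returns M = 131, the nominal mass of leucine, and leaves m/z 86 unexplained, in agreement with its assignment as a fragment.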

When the molecular ion has been identified…

● Second step: determination of the elemental composition thanks to:

- High resolution

- Nitrogen rule

- Isotopic pattern

- Atom valence

● Third step: from the elemental composition to chemical structure

- Database searching

- MS/MS experiment

- Molecule polarity

- Use of other instruments (NMR, IR, …)

- …

Confirmation of a chemical structure can only be made by the comparative analysis of the corresponding authentic reference standard
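Two of the criteria of the second step, accurate mass and the nitrogen rule, are easy to express in code. The sketch below is an illustration restricted to neutral molecules made of C, H, N and O; the function names are hypothetical, and the monoisotopic masses are rounded to five decimals.

```python
from typing import Dict

# Monoisotopic masses (u) of the most abundant isotopes, rounded to 5 decimals.
MONOISOTOPIC = {"C": 12.00000, "H": 1.00783, "N": 14.00307, "O": 15.99491}
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16}


def monoisotopic_mass(formula: Dict[str, int]) -> float:
    """Accurate mass of a neutral molecule, e.g. {"C": 6, "H": 13, "N": 1, "O": 2}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def satisfies_nitrogen_rule(formula: Dict[str, int]) -> bool:
    """Nitrogen rule for a neutral C, H, N, O molecule:
    odd nominal mass <=> odd number of nitrogen atoms."""
    nominal = sum(NOMINAL[element] * count for element, count in formula.items())
    return nominal % 2 == formula.get("N", 0) % 2


leucine = {"C": 6, "H": 13, "N": 1, "O": 2}
wrong_candidate = {"C": 6, "H": 12, "N": 1, "O": 2}  # hypothetical composition to reject

print(round(monoisotopic_mass(leucine), 4), satisfies_nitrogen_rule(leucine))
print(round(monoisotopic_mass(wrong_candidate), 4), satisfies_nitrogen_rule(wrong_candidate))
print(round(monoisotopic_mass({"C": 1, "O": 1}), 5), round(monoisotopic_mass({"N": 2}), 5))
```

The last line gives the accurate masses of CO and N2, two compositions with the same nominal mass of 28 that only high resolution can distinguish.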

2.7 Biological interpretation

Give biological sense to the observations made

Biochemical profile: UP / DOWN regulation

Outline

1. INTRODUCTION

2. METABOLOMICS WORKFLOW

3. CHALLENGES AND LIMITATIONS

From a biological point of view…

● Challenge 1: be able to characterize a subtle signature

Limitation 1: it may be hidden by other sources of variability

● Challenge 2: be able to characterize one system through the generation of a unique metabolic profile

Limitation 2: the metabolome is a dynamic system (for example, diurnal and seasonal variation in human studies…)

● Challenge 3: be able to connect genome and metabolome (systems biology)

Limitation 3: difficulty in collecting both types of information

… these biological challenges correspond to future directions in research


From an analytical point of view…

● Challenge 1: be able to characterize the whole metabolome

Limitation 1: at the moment, no single instrument is versatile enough to analyze such chemical diversity

● Challenge 2: long-term repeatability of analytical sequences (when hundreds to thousands of samples are analyzed)

Limitation 2: the stability of MS instruments is still acknowledged to be insufficient; need for an efficient normalization method based on Quality Controls

● Challenge 3: be reproducible between analytical platforms to allow comparison

Limitation 3: the protocols used differ; need for standardized procedures

… these analytical challenges correspond to future directions in research
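One possible form of normalization with Quality Controls is sketched below, as an illustration only. It assumes that the same QC sample is injected at intervals along the sequence, that the drift of one ion between two QC injections is linear, and that QC intensities are positive; the function name and the numbers are hypothetical.

```python
from typing import List, Tuple

Injection = Tuple[int, bool, float]  # (injection order, is_qc, intensity of one ion)


def qc_drift_correction(injections: List[Injection]) -> List[float]:
    """Correct the intensities of one ion for signal drift along the sequence.

    Each intensity is divided by the QC intensity expected at its position
    (linear interpolation between the bracketing QC injections), then rescaled
    by the mean QC intensity. Injections must be sorted by injection order.
    """
    qcs = [(order, value) for order, is_qc, value in injections if is_qc]
    if len(qcs) < 2:
        raise ValueError("at least two QC injections are needed")
    qc_mean = sum(value for _, value in qcs) / len(qcs)

    def expected_qc(order: int) -> float:
        # Outside the QC-bracketed range, use the nearest QC value.
        if order <= qcs[0][0]:
            return qcs[0][1]
        if order >= qcs[-1][0]:
            return qcs[-1][1]
        for (o0, v0), (o1, v1) in zip(qcs, qcs[1:]):
            if o0 <= order <= o1:
                return v0 + (v1 - v0) * (order - o0) / (o1 - o0)

    return [value / expected_qc(order) * qc_mean for order, _, value in injections]


# Hypothetical sequence: the QC signal drops from 100 to 60 during the run.
injections = [
    (1, True, 100.0),
    (2, False, 180.0),
    (3, False, 80.0),
    (5, True, 60.0),
]

corrected = qc_drift_correction(injections)
print(corrected)
```

After correction the two QC injections give the same value, and the sample injected second, which contained twice the QC level of the ion, is reported at twice the corrected QC value.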
