Outline 1 1.INTRODUCTION 2. METABOLOMICS WORKFLOW 3. CHALLENGES AND LIMITATIONS.
Uploaded by caitlin-fox
Outline
1. INTRODUCTION
2. METABOLOMICS WORKFLOW
3. CHALLENGES AND LIMITATIONS
1. INTRODUCTION
Sensitive and specific methods, but…
… targeted approaches focus on particular compounds / activities
Increasing need for methods allowing a global characterisation whatever the situation
Objective: to address, in a comprehensive manner, complex situations that were previously dealt with piecemeal
We only search for the “known”, and we only find what we search for
Concept of biological fingerprint
From targeted to untargeted approaches
1. INTRODUCTION Different levels of investigation
GENOTYPE → PHENOTYPE
Genomics / Transcriptomics: DNA & RNA (what may happen)
Proteomics: proteins & peptides (what makes it happen)
Metabolomics: metabolites (what happens)
GLOBAL APPROACHES without any a priori
Different levels of investigation but similar objectives
Large-scale analysis of biological systems (cells, tissues, complex matrices)
• Transcriptomics: large-scale analysis of mRNA transcripts
• Proteomics: large-scale analysis of proteins
• Metabolomics: large-scale analysis of metabolites
• other “…omics”
Markers = genes, proteins, metabolites, …
1. INTRODUCTION Different levels of investigation
METABONOMICS: “…measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification…” (Nicholson et al., 1999)
METABOLOMICS: “...the complete set of metabolites/low-molecular-weight intermediates, which are context dependent, varying according to the physiology, developmental or pathological state of the cell, tissue, organ or organism…” (Oliver, 1998)
1. INTRODUCTION Definitions
In practice, both share the same final objective: to compare patterns, signatures or “fingerprints” of metabolites that change in response to external stimuli, using a differential analysis of samples collected from two (or more) populations, namely “case” and “control”.
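The differential analysis between “case” and “control” can be illustrated for a single metabolite. This is only a minimal sketch under stated assumptions: the intensities are invented, the function names are hypothetical, and a log2 fold change with Welch's t statistic is just one common way to compare two groups, not necessarily the method used by the authors.

```python
import math
import statistics


def log2_fold_change(case, control):
    """log2 of the ratio of mean intensities (case over control)."""
    return math.log2(statistics.mean(case) / statistics.mean(control))


def welch_t(case, control):
    """Welch's t statistic for two groups with possibly unequal variances."""
    var_case = statistics.variance(case) / len(case)
    var_control = statistics.variance(control) / len(control)
    difference = statistics.mean(case) - statistics.mean(control)
    return difference / math.sqrt(var_case + var_control)


# Hypothetical intensities of one metabolite (arbitrary units).
control = [100.0, 110.0, 90.0, 105.0, 95.0]
case = [200.0, 220.0, 180.0, 210.0, 190.0]

print(f"log2 fold change: {log2_fold_change(case, control):.2f}")
print(f"Welch t statistic: {welch_t(case, control):.2f}")
```

In a real fingerprint the same comparison is repeated for thousands of ions, so some correction for multiple testing would normally be needed.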
1. INTRODUCTION What is the metabolome?
Metabolites come from:
● catabolism (degradation of organic matter)
● anabolism (synthesis of components by the cell)
● external sources: diet, medication, … (xenometabolome)
METABOLOME
Large diversity of molecular weights (from 50 to 1500 Da)
Multiple classes of compounds => different chemical properties => different physical properties
Organic acids - sugars - fatty acids - lipids - amino acids - peptides - vitamins - …
Extended range of concentrations: from pmol to mmol
1. INTRODUCTION The metabolome size
● Plants: more than 200 000 metabolites
● Human: unknown, but estimated at thousands of metabolites, and even more if external sources (diet, medicines…) are considered
1. INTRODUCTION Concept
Organisms - Plants - Biological fluids - Food matrices
« Controls » versus « Cases »
1/ Collection and preparation of the samples (2 or more sub-groups to be compared)
2/ Generation of fingerprints and search for differences in the metabolic profiles (= potential biomarkers)
3/ Biological interpretation of observations
Biochemical profile: REGULATION (UP / DOWN)
1. INTRODUCTION 3 different approaches
● Metabolite targeted analysis: detection and precise quantification of a single or small set of target compounds
● Metabolic profiling: analysis of a group of metabolites either related to a specific metabolic pathway or a class of compounds
● Metabolic fingerprinting
[Figure: scope versus accuracy of the three approaches; image source: http://manet.illinois.edu/pathways.php]
Outline
1. INTRODUCTION
2. METABOLOMICS WORKFLOW
3. CHALLENGES AND LIMITATIONS
2. METABOLOMICS WORKFLOW
DEFINITION AND STUDY DESIGN
EXPERIMENT / SAMPLE COLLECTION
SAMPLE PREPARATION
METABOLOMICS PROFILES GENERATION
DATA PROCESSING
DATA ANALYSIS
STRUCTURAL ELUCIDATION / BIOLOGICAL INTERPRETATION
[Figure: data table, list of samples × list of ions, with ion description / statistical indicators]
[Figure: PCA score plot (SIMCA 13.0, model total_1.M2, PCA-X), t[1] versus t[2], coloured according to Obs ID (Serie); R2X[1] = 0.525, R2X[2] = 0.113; ellipse: Hotelling's T2 (95%)]
2.1 Study Design – Sample Collection
Study design is a step that requires multidisciplinary skills:
- statisticians
- biologists / clinicians
- chemists
Different questions have to be answered at this stage:
● A priori knowledge of the subject?
● Confounding factors?
● Statistical power?
● Sample selection?
- for mammalian species: urine, plasma, serum, tissue…?
- for plants: root, leaf, flower…?
● Feasibility of collection?
● Single collection protocol?
● Interruption of the metabolism? (e.g. quenching with liquid nitrogen or organic solvents)
● Sample storage? -80°C? -20°C?
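The “statistical power” question can be made concrete with a rough sample-size estimate. The sketch below assumes a two-sided comparison of the means of two groups for a single variable and uses the usual normal approximation; the function name and the default values of alpha and power are illustrative choices, not recommendations from the slides.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_(1-alpha/2) + z_power) / d)^2,
    where d is the standardized effect size (difference / standard deviation).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.5, 1.0, 2.0):
    print(f"effect size {d}: about {samples_per_group(d)} samples per group")
```

Because a metabolomics study tests many variables at once, the numbers obtained this way are best read as a lower bound.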
2.2 Sample preparation
Matrices and protocols
Liquid matrices:
- Filtering / protein removal (10 kDa cut-off)
- Freeze-drying for urine samples to normalize the dilution factor
- Dilution
Solid matrices:
- Homogenization
- Freeze-drying for better extraction capabilities
Extraction solvents, from most polar (+) to least polar (-): MeOH, ACN/acetone, ethyl acetate, chloroform
Different solvents may give access to different parts of the metabolome
Further protocols
- Solid phase extraction (for better selectivity)
- Derivatization
- Liquid/liquid partitioning
What is an optimal sample preparation?
Elimination of interferences (limit matrix effects)
Repeatable and reproducible
Simple and fast
Maximisation of the information to answer the biological question
Fingerprint as exhaustive as possible
2.3 Metabolomics profiles generation Mass measurement principle
Metabolite → IONISATION → intact metabolite = precursor ion
Precursor ion → FRAGMENTATION → fragmented metabolite: each part is a product ion
m/z reading: mass spectrum of abundance versus m/z, from the smallest fragment to the largest fragment
Resolution: ability to distinguish two peaks of slightly different mass-to-charge ratios, ΔM, in a mass spectrum
● Resolution = $M / \Delta M$
Example: CO (m/z = 27.99491) and N2 (m/z = 28.00614)
[Figure: simulated spectra of CO and N2 (relative abundance versus m/z, 27.92 to 28.10), resolving power at FWHM. At R = 1000, a single peak appears at m/z 28.0; at R = 5000, two peaks appear at m/z 27.99491 and 28.00614.]
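The CO / N2 example can be checked numerically from the definition Resolution = M / ΔM. This is a minimal sketch: the two m/z values come from the slide, M is taken as the mean of the two masses (an assumption), and the simple threshold ignores peak shape and relative intensities.

```python
def required_resolving_power(mz_a, mz_b):
    """R = M / delta M, taking M as the mean of the two m/z values."""
    mean_mz = (mz_a + mz_b) / 2
    return mean_mz / abs(mz_a - mz_b)


CO_MZ = 27.99491   # value given on the slide
N2_MZ = 28.00614   # value given on the slide

needed = required_resolving_power(CO_MZ, N2_MZ)
print(f"Resolving power needed to separate CO and N2: about {needed:.0f}")
for r in (1000, 5000):
    verdict = "separated" if r >= needed else "not separated"
    print(f"R = {r}: {verdict}")
```

The result sits between the two settings shown in the figure, which is consistent with a single peak at R = 1000 and two peaks at R = 5000.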
With increased resolving power, more information is obtained from the acquisition
[Figure: the same mass spectrum (abundance versus m/z) at two resolving powers, with zoom]
2.3 Metabolomics profiles generation Mass spectrometry: different analysers
RESOLVING POWER: MASS SPECTROMETER
LOW (< 1000): QUADRUPOLE, ION TRAP
MIDDLE (> 5000): TOF
HIGH (up to 100 000): ORBITRAP
ULTRA-HIGH (up to 2 000 000): FT-ICR
2.3 Metabolomics profiles generation Mass spectrometry: different acquisition modes
Mode: precursor ion set → dissociation → product ion set
(a) Full scan: scan
(b) Precursor ion scanning: scan → dissociation → select
(c) Neutral loss scanning: scan → dissociation → scan
(d) Multiple reaction monitoring: selection → dissociation → selection
Full scan mode
● When there is no presupposed hypothesis on the subject… be as open as possible
More information to process => less visibility of the differences between groups
[Figure: full scan profiles of a « Control » sample and a « Test » sample]
2.3 Metabolomics profiles generation Mass spectrometry: different acquisition modes
Precursor ion scan mode (scan → dissociation → select):
● When there is a presupposed hypothesis on the subject… more specificity can be useful
Product ion scan: example of estradiol-17-sulfate
[Figure: control animal versus treated animal]
When prior knowledge is available, relevant information can be made easier to find with appropriate acquisition parameters
2.3 Metabolomics profiles generation
● Coupling mass spectrometry with GC or LC
Gas chromatography: volatile compounds; EI (CI); apolar columns (polar columns)
Liquid chromatography: polar and ionic compounds; ESI (APCI, APPI); reverse phase HPLC (HILIC)
Many choices have to be made to reach the most relevant part of the metabolome (to answer the biological question)
Often, you make do with the configuration available in the lab…
2.4 Data processing Problem statement
Metabolic profiles to be compared
1 sample = 1 total ion current (intensity versus time, in min)
1 sample = x high-resolution mass spectra (intensity versus m/z), typically 2 scans [m/z 60-1000] per second
1 sample = 3D matrix (time × m/z × intensity)
Initial information → analysable information
Complexity of a metabolic fingerprint => need for a well-defined peak list
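The relation between the 3D matrix, the total ion current and a single-ion trace can be sketched on toy data. The scans, m/z values and function names below are invented for illustration; real raw files would be read with a vendor or open-format reader, which is outside this sketch.

```python
# Each scan is (retention_time_min, [(mz, intensity), ...]); values are made up.
scans = [
    (0.0, [(100.05, 10.0), (279.159, 5.0)]),
    (0.5, [(100.05, 12.0), (279.159, 50.0)]),
    (1.0, [(100.05, 11.0), (279.159, 20.0)]),
]


def total_ion_current(scans):
    """Sum all intensities in each scan: one (time, TIC) point per scan."""
    return [(rt, sum(i for _, i in peaks)) for rt, peaks in scans]


def extracted_ion_chromatogram(scans, mz_min, mz_max):
    """Sum only the intensities whose m/z falls in [mz_min, mz_max]."""
    return [
        (rt, sum(i for mz, i in peaks if mz_min <= mz <= mz_max))
        for rt, peaks in scans
    ]


print(total_ion_current(scans))
print(extracted_ion_chromatogram(scans, 279.15, 279.17))
```

The total ion current collapses the m/z dimension entirely, whereas the extracted ion chromatogram keeps one narrow m/z window, which is what peak picking works on.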
2.4 Data processing
Need for tools to convert the 3D matrix into a table of descriptors (list of samples × list of ions, with ion description / statistical indicators)
2.4 Data processing Many tools
Raw data → DATA PROCESSING (filtration → peak picking → alignment → reporting) → statistical analysis
Example of tool: SIEVE
2.4 Data processing
Filtration → Peak picking → Alignment → Reporting
Filtration: analytical background noise
[Figure: file \\ORBITRAP\Xcalibur\...\ALR\121112011 (2-picolinic acid, positive mode). Top: extracted ion chromatogram, m/z 279.15769-279.16049, relative abundance versus time (0 to 25 min). Bottom: mass spectrum of scan #585 at RT 4.98 min (FTMS + c ESI Full ms [65.00-1000.00]), relative abundance versus m/z; the ion at m/z 279.15909 is annotated as dibutylphthalate.]
2.4 Data processing
Filtration → Peak picking → Alignment → Reporting
Peak picking: report the signal abundance observed for each ion [m/z; rt]i in all the analysed samples. Various methods exist, and they are strongly influenced by multiple end-user parameters.
- Extracted ion chromatogram (signal intensity versus rt, in min): integrated peak areas
- Extracted mass spectrum (signal intensity versus m/z): ion signal intensities
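Peak picking on an extracted ion chromatogram can be sketched as a threshold followed by an integration. This is a deliberately simple illustration, not the algorithm of any particular software: the noise threshold stands in for the end-user parameters mentioned above, and the data and function names are hypothetical.

```python
def trapezoid_area(times, intensities):
    """Integrate an extracted ion chromatogram with the trapezoidal rule."""
    area = 0.0
    for k in range(1, len(times)):
        width = times[k] - times[k - 1]
        area += width * (intensities[k] + intensities[k - 1]) / 2
    return area


def pick_peak(times, intensities, noise_threshold):
    """Return (apex_time, area) for the tallest signal, or None if it is noise.

    The peak is extended left and right from the apex while the signal stays
    above the threshold; the threshold is an end-user parameter.
    """
    apex = max(range(len(intensities)), key=intensities.__getitem__)
    if intensities[apex] <= noise_threshold:
        return None
    left = apex
    while left > 0 and intensities[left - 1] > noise_threshold:
        left -= 1
    right = apex
    while right < len(intensities) - 1 and intensities[right + 1] > noise_threshold:
        right += 1
    area = trapezoid_area(times[left:right + 1], intensities[left:right + 1])
    return times[apex], area


rt = [4.8, 4.9, 5.0, 5.1, 5.2, 5.3]          # minutes, made-up values
signal = [1.0, 20.0, 100.0, 40.0, 2.0, 1.0]  # arbitrary units, made-up values
print(pick_peak(rt, signal, noise_threshold=5.0))
```

Changing the threshold changes both the peak boundaries and the reported area, which illustrates why the results depend so much on user settings.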
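2.4 Data processing
Filtration → Peak picking → Alignment → Reporting
Peak alignment
One injection: tens of minutes
• Stability/reproducibility of retention times across samples? Alignment required
[Figure: intensity versus retention time (tr) in several samples; the same feature MxTy is matched across samples within a tolerance, MxTy ± e]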
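Alignment within a tolerance (MxTy ± e) can be sketched as a nearest-match search in both m/z and retention time. The tolerances, feature values and function name are assumptions chosen for illustration; real tools usually also correct retention-time drift before matching.

```python
def align_features(reference, sample, mz_tol=0.005, rt_tol=0.2):
    """Match each reference feature to the closest sample feature within tolerances.

    Features are (mz, rt_min, intensity) tuples. Returns a list of
    (reference_feature, matched_sample_feature_or_None).
    """
    matches = []
    for ref in reference:
        ref_mz, ref_rt, _ = ref
        candidates = [
            f for f in sample
            if abs(f[0] - ref_mz) <= mz_tol and abs(f[1] - ref_rt) <= rt_tol
        ]
        # Among candidates, keep the one closest in retention time.
        best = min(candidates, key=lambda f: abs(f[1] - ref_rt), default=None)
        matches.append((ref, best))
    return matches


# Made-up features: (m/z, retention time in min, intensity).
reference = [(279.1591, 4.98, 2.6e6), (151.0965, 6.30, 1.0e5)]
sample = [(279.1593, 5.05, 2.1e6), (151.0964, 6.90, 9.0e4)]

for ref, match in align_features(reference, sample):
    print(ref, "->", match)
```

This sketch does not enforce one-to-one matching, so two reference features could claim the same sample feature; a production workflow would need to resolve such conflicts.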
2.4 Data processing
Filtration → Peak picking → Alignment → Reporting
Generation of final report
Peak list that can be analysed to extract relevant information (differences in intensity for defined ions between groups)
Detected ions: characteristics (m/z, RT) and signal intensities in samples (Sample 1, Sample 2, …)
Need for tools to explore the inherent structure and meaning of the data set
2.5 Statistical Analysis
In metabolomics, K >> N (K variables, N samples)
• Exploratory methods (unsupervised, X only): exploring similarities and differences among samples and data variables, e.g. PCA
• Modeling methods (supervised, X and Y): construction of classification / discrimination models, e.g. PLS-DA
2.5 Statistical Analysis Principal Component Analysis
Need to define new axes to project the observations from the original space (k dimensions: v1, v2, …, vk) onto 2D or 3D surfaces with maximal information and minimal deformation.
PCA:
• Process that transforms a number of possibly correlated variables into a smaller number of uncorrelated latent variables, or principal components
• Reduces thousands of variables to 2-3 key features
[Figure: score plot]
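The idea of replacing correlated variables with a new axis can be sketched without any external library. The example below finds only the first principal component by power iteration on a tiny invented data set; the function names are hypothetical, and real studies would use a dedicated statistics package and usually scale the variables first.

```python
import math


def center(rows):
    """Subtract the mean of each variable (column)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_component(rows, iterations=200):
    """First principal axis (loadings) and scores, by power iteration on X^T X."""
    x = center(rows)
    k = len(x[0])
    loading = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, loading)) for row in x]
        new = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(k)]
        norm = math.sqrt(sum(v * v for v in new))
        loading = [v / norm for v in new]
    scores = [sum(a * b for a, b in zip(row, loading)) for row in x]
    return loading, scores


# Four made-up samples described by three variables; v1 and v2 are correlated.
samples = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.4],
    [3.0, 5.9, 0.6],
    [4.0, 8.0, 0.5],
]

loading, scores = first_component(samples)
print("loadings:", [round(v, 3) for v in loading])
print("scores:  ", [round(s, 3) for s in scores])
```

The scores are the coordinates plotted on a score plot, and the loadings show how much each original variable contributes to the new axis.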
2.5 Statistical Analysis Partial Least Squares Discriminant Analysis
Need to define new axes to project the observations from the original space (k dimensions: v1, v2, …, vk) onto 2D surfaces with maximal distances between the pre-determined groups.
PLS-DA:
• Data reduction as for PCA, but…
• Process that uses a multiple linear regression technique to find the direction of maximum covariance between a data set (X) and a class membership (Y)
• Extracted features are in the form of latent variables
[Figure: score plot, t[2] versus t[3]]
2.5 Statistical Analysis
A commonly encountered problem: biological variability.
Example: in metabolomics, the studied factor is commonly associated with a discrete signature that is difficult to reveal with unsupervised analyses.
Example of PCA / PLS-DA performed on metabolomics profiles acquired from urine samples collected from control versus treated animals (anabolic substance):
[Figure: PCA score plot, PC1 versus PC2 (SIMCA-P+ 12, model B+T PUB.M1, PCA-X), t[1] versus t[2], coloured according to Obs ID (M/F); R2X[1] = 0.544619, R2X[2] = 0.106451; ellipse: Hotelling T2 (0.95). Panel title: Males / Females.]
[Figure: supervised analysis, PLS-DA score plot (SIMCA-P+ 12, model Raw div MT.M3, OPLS/O2PLS), t[1] versus to[1], coloured according to Obs ID (C/T); R2X[1] = 0.0711369, R2X[XSide Comp. 1] = 0.548962; ellipse: Hotelling T2 (0.95). Panel title: Controls / Treated.]
● First step = identification of the MS molecular ion
Electron Ionization (EI)
Molecular ion: radical cation M+• (mass: M), produced from the neutral molecule that has lost one electron
Chemical Ionization (CI)
Pseudomolecular ion:
In positive mode:
- adduct with a proton: (M+H)+ (mass: M+1)
- adduct with the reagent gas: (M+NH4)+ (mass: M+18)
In negative mode:
- radical anion: M-• (mass: M)
- loss of a proton: (M-H)- (mass: M-1)
Atmospheric Pressure Ionization sources (ESI, APCI…)
Pseudomolecular ion:
In positive mode:
- adduct with a proton: (M+H)+ (mass: M+1)
- other adducts: (M+Na)+, (M+K)+… (mass: M+23 or M+39…)
In negative mode:
- loss of a proton: (M-H)- (mass: M-1)
- acetate adduct: (M+CH3COO)- (mass: M+59)
2.6 Structure Elucidation
Application example in positive electrospray (ESI)
Leucine: what is the molecular ion?
m/z 86: fragment
m/z 132: (M+H)+
m/z 149: (M+NH4)+
m/z 154: (M+Na)+
m/z 170: (M+K)+
m/z 263: (2M+H)+
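The leucine example can be reproduced with the nominal mass shifts listed for positive-mode ions. This is a minimal sketch: it assumes the nominal mass of the molecule is already known (131 for leucine, C6H13NO2), whereas in practice the mass is the unknown and one looks for a value of M that explains several peaks at once. The dictionary and function names are hypothetical.

```python
# Nominal (integer) mass shifts for positive-mode ions, as used on the slides.
POSITIVE_ADDUCTS = {
    "(M+H)+": lambda m: m + 1,
    "(M+NH4)+": lambda m: m + 18,
    "(M+Na)+": lambda m: m + 23,
    "(M+K)+": lambda m: m + 39,
    "(2M+H)+": lambda m: 2 * m + 1,
}


def annotate(peaks, nominal_mass):
    """Label each observed nominal m/z with the adduct it matches, if any."""
    expected = {shift(nominal_mass): name for name, shift in POSITIVE_ADDUCTS.items()}
    return {mz: expected.get(mz, "unassigned (possible fragment)") for mz in peaks}


LEUCINE_NOMINAL_MASS = 131  # C6H13NO2
observed = [86, 132, 149, 154, 170, 263]

annotations = annotate(observed, LEUCINE_NOMINAL_MASS)
for mz, label in annotations.items():
    print(mz, label)
```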
2.6 Structure Elucidation
When the molecular ion has been identified…
● Second step: determination of the elemental composition thanks to:
- High resolution
- Nitrogen rule
- Isotopic pattern
- Atom valence
● Third step: from the elemental composition to the chemical structure
- Database searching
- MS/MS experiment
- Molecule polarity
- Use of other instruments (NMR, IR, …)
- …
Confirmation of a chemical structure can only be made by comparative analysis with the corresponding authentic reference standard
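Two of the criteria for the elemental composition, high-resolution mass and the nitrogen rule, can be sketched in a few lines. The example uses the ion at m/z 279.15909 annotated as dibutylphthalate in the spectrum shown earlier, with the formula C16H22O4 supplied here as known chemistry rather than taken from the slides. The function names are hypothetical, and only C, H, N and O are handled.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16}
PROTON = 1.00727647


def parse_formula(formula):
    """Turn 'C16H22O4' into {'C': 16, 'H': 22, 'O': 4}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def mz_protonated(formula):
    """Theoretical m/z of the singly charged (M+H)+ ion."""
    counts = parse_formula(formula)
    return sum(MONOISOTOPIC[e] * n for e, n in counts.items()) + PROTON


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def obeys_nitrogen_rule(formula):
    """For C/H/N/O molecules: odd nominal mass goes with an odd number of N."""
    counts = parse_formula(formula)
    nominal_mass = sum(NOMINAL[e] * n for e, n in counts.items())
    return nominal_mass % 2 == counts.get("N", 0) % 2


measured = 279.15909  # m/z read from the spectrum in this document
theoretical = mz_protonated("C16H22O4")
print(f"theoretical (M+H)+: {theoretical:.5f}")
print(f"mass error: {ppm_error(measured, theoretical):.2f} ppm")
print("nitrogen rule satisfied:", obeys_nitrogen_rule("C16H22O4"))
```

A small ppm error only shows that the formula is compatible with the measurement; as stated above, confirmation still requires the authentic reference standard.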
2.7 Biological interpretation
Give biological sense to the observations made
Biochemical profile: REGULATION (UP / DOWN)
Outline
1. INTRODUCTION
2. METABOLOMICS WORKFLOW
3. CHALLENGES AND LIMITATIONS
3. CHALLENGES AND LIMITATIONS
From a biological point of view…
● Challenge 1: be able to characterize a discrete signature
Limitation 1: it may be hidden by other sources of variability
● Challenge 2: be able to characterize one system through the generation of a unique metabolic profile
Limitation 2: the metabolome is a dynamic system (for example, diurnal and seasonal variation in human studies…)
● Challenge 3: be able to connect genome and metabolome (systems biology)
Limitation 3: it is difficult to collect both types of information
… these biological challenges correspond to future directions in research
3. CHALLENGES AND LIMITATIONS
From an analytical point of view…
● Challenge 1: be able to characterize the whole metabolome
Limitation 1: at the moment, no single instrument is versatile enough to analyse such chemical diversity
● Challenge 2: long-term repeatability of analytical sequences (when hundreds to thousands of samples are analysed)
Limitation 2: the stability of MS instruments is still acknowledged to be insufficient; an efficient way of normalizing with quality controls is needed
● Challenge 3: be reproducible between analytical platforms to allow comparison
Limitation 3: the protocols used differ; standardization procedures are needed
… these analytical challenges correspond to future directions in research
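Normalization with quality controls can be illustrated with a very simple scheme. The sketch below assumes that a pooled quality-control (QC) sample is injected several times in each batch and divides every intensity by the median QC intensity of its batch. This is one possible approach among many, the data are invented and the function name is hypothetical.

```python
from statistics import median


def qc_normalize(intensities, batches, is_qc):
    """Divide each intensity by the median QC intensity of its batch.

    intensities: signal of one ion across the injection sequence
    batches:     batch label of each injection
    is_qc:       True where the injection is a pooled quality-control sample
    """
    qc_by_batch = {}
    for value, batch, qc in zip(intensities, batches, is_qc):
        if qc:
            qc_by_batch.setdefault(batch, []).append(value)
    factors = {batch: median(values) for batch, values in qc_by_batch.items()}
    return [value / factors[batch] for value, batch in zip(intensities, batches)]


# Made-up sequence: the instrument response halves between batch A and batch B.
signal = [100.0, 120.0, 100.0, 50.0, 60.0, 50.0]
batch = ["A", "A", "A", "B", "B", "B"]
qc = [True, False, True, True, False, True]

normalized = qc_normalize(signal, batch, qc)
print(normalized)
```

After normalization the two study samples have the same relative intensity although their raw signals differ by a factor of two, which is the kind of drift that QC injections are meant to correct.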