http://www.bits.vib.be/training
BITS MS Data Processing – Protein Inference
UGent, Gent, Belgium – 16 December 2011
Kenny Helsens (kenny.helsens@UGent.be)
Lennart Martens (lennart.martens@ebi.ac.uk)
Proteomics Services Group, European Bioinformatics Institute
Hinxton, Cambridge, United Kingdom (www.ebi.ac.uk)
Kenny Helsens
Computational Omics and Systems Biology Group
Department of Medical Protein Research, VIB
Department of Biochemistry, Ghent University
Ghent, Belgium
Introduction to proteomics
Adapted from the NCBI Science Primer: http://www.ncbi.nih.gov/About/primer/genetics_cell.html
- Primary structure (sequence)
- Secondary structure (structural elements)
- Tertiary structure (3D shape)
- Modifications (dynamic, function)
- Processing (targeting, activation)
…YSFVATAER…
phosphorylation
trypsin
platelet activity
The central paradigm
Principle
cells → cell lysis, protein extraction → protein mixture (Protein A, Protein B, Protein C, Protein D)
2D-PAGE (pI against Mr)
Chemistry toolbox
2D-PAGE separation of proteins (Est. 1975)
[Figure: gel-based workflow. Protein extraction yields a complex protein mixture, which is separated by 2D-PAGE (pI against MW); a tryptic digest is followed by MS analysis, then fragmentation and MS/MS analysis. Spectra are plotted as relative intensity (%) against m/z. Image source: http://www.akh-wien.ac.at/biomed-research/htx/platweb1.htm]
2D-PAGE separation of proteins (Est. 1975)
[Figure: gel-free workflow. Protein extraction yields a complex protein mixture; an enzymatic digest turns it into an extremely complex peptide mixture; separation/selection produces less complex peptide fractions for MS analysis and data-dependent MS/MS analyses. Spectra are plotted as relative intensity (%) against m/z. Image source: http://www.akh-wien.ac.at/biomed-research/htx/platweb1.htm]
Overall gel-free proteomics workflow
• ICAT (Gygi et al., 1999)
• MudPIT (Washburn et al., 2001)
• Accurate Mass Tags for proteome analysis (Conrads et al., 2000)
• Signature Peptides approach for proteomics (Ji et al., 2000)
• AA-based covalent chromatography peptide selection (Wang & Regnier, 2001)
• Affinity-based enrichment of phosphopeptides (Oda et al., 2001)
• ICAT for phosphopeptides (Zhou et al., 2001)
• Reversible biotinylation of Cys-peptides (Spahr et al., 2000)
• COFRADIC (Gevaert et al., 2002)
Going gel-free in the new millennium
Pros:
• Massive increase in mixture redundancy (e.g. membrane proteins)
• Easier separation of peptides instead of proteins
• Mixture complexity can be reduced by peptide selection (Cys-peptides, Met-peptides, N-terminal peptides, phospho-peptides, …)
• Choice of selection technique, depending on circumstances/analyte
• Additional processing information (N-terminal peptides)
Cons:
• Corresponding increase in mixture complexity (from a few thousand proteins to hundreds of thousands of peptides!)
• Loss of protein-level information (pI, MW, isoforms)
• Peptide selection in turn reduces the redundancy of the mixture
• Massive amounts of data generated (10,000 spectra per hour)
• Unadapted database search engines (N-terminal processing)
An overview of the pros and cons
AN INCOMPLETE OVERVIEW
OF GEL-FREE TECHNIQUES
[Figure: column layout with SCX (strong cation exchanger) and RP (reverse-phase resin), coupled to ESI-based MS]
• Orthogonal, 2D separation of peptides
• Analogy with 2D-PAGE: SCX corresponds to pI, RP corresponds to Mr
MudPIT: that which we call a rose…
e.g., Escherichia coli: 4,349 predicted proteins
• if 100% expressed: 109,934 detectable tryptic peptides
• if 50% expressed: 54,967 detectable tryptic peptides
Sample complexity increased by more than an order of magnitude (about 25-fold)!
But what about the complexity?
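The peptide counts above come from an in silico digest of the predicted proteome. A minimal sketch of such a digest follows, assuming the common trypsin rule (cleave after K or R unless the next residue is P) and a peptide length window as a stand-in for "detectable". The function name, the length window and the toy sequence are illustrative assumptions; the slides do not state which detectability criterion was used.

```python
def tryptic_digest(sequence, min_length=6, max_length=30):
    """In silico tryptic digest without missed cleavages.

    Cleaves after K or R unless the next residue is P, then keeps only
    peptides inside an assumed 'detectable' length window.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R
        peptides.append(sequence[start:])
    return [p for p in peptides if min_length <= len(p) <= max_length]


# Toy sequence (hypothetical, not a real protein)
toy_protein = "MASTEELKGGSAVLRPLLNNKDEK"
all_peptides = tryptic_digest(toy_protein, min_length=1)
detectable = tryptic_digest(toy_protein)

# Ratio behind the slide's numbers for E. coli
peptides_per_protein = 109_934 / 4_349

print(all_peptides)
print(detectable)
print(round(peptides_per_protein, 1))
```

Run over a whole proteome, the same function would give a peptide count per protein; the slide's figures correspond to roughly 25 detectable peptides per predicted protein.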
What happens when there are 100,000 peptides present?
How often do we need to repeat an analysis of an identical sample in order to obtain reasonable coverage?
The explorative aspect
A thought experiment seems appropriate
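One way to put numbers on this thought experiment is the sketch below. It assumes, unrealistically, that every run samples a fixed number of distinct peptides uniformly at random and independently of earlier runs; real data-dependent acquisition favours abundant, well-ionising peptides, so actual coverage would grow more slowly. The figure of 10,000 peptides sampled per run is a hypothetical choice for illustration.

```python
import math


def expected_coverage(n_peptides, n_sampled_per_run, n_runs):
    """Expected fraction of peptides seen at least once after n_runs,
    assuming uniform random sampling in every run."""
    if n_sampled_per_run >= n_peptides:
        return 1.0
    p_missed_per_run = 1.0 - n_sampled_per_run / n_peptides
    return 1.0 - p_missed_per_run ** n_runs


def runs_needed(n_peptides, n_sampled_per_run, target_coverage):
    """Smallest number of runs whose expected coverage reaches the target
    (target_coverage must be below 1.0)."""
    if n_sampled_per_run >= n_peptides:
        return 1
    p_missed_per_run = 1.0 - n_sampled_per_run / n_peptides
    return math.ceil(math.log(1.0 - target_coverage) / math.log(p_missed_per_run))


N_PEPTIDES = 100_000       # from the slide
SAMPLED_PER_RUN = 10_000   # hypothetical assumption

for runs in (1, 5, 10, 20):
    print(runs, round(expected_coverage(N_PEPTIDES, SAMPLED_PER_RUN, runs), 3))
print(runs_needed(N_PEPTIDES, SAMPLED_PER_RUN, 0.95))
```

Under these assumptions ten repeat analyses reach only about 65% expected coverage, and 29 runs are needed for 95%, which is the point of the question: repeating the analysis gives diminishing returns.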
[Figure: the explorative aspect, with panels for 2002, 2006 and 2010]
Complete coverage
Tissue → one cell type → one organelle/compartment → subset of proteins → subset of peptides
Techniques at each level:
• Cells: laser capture microdissection, flow cytometry
• Compartments: differential detergent fractionation, differential centrifugation
• Proteins: gel filtration, 1D gel electrophoresis, ion exchange
• Peptides: ICAT method, COmbined FRActional DIagonal Chromatography (COFRADIC)
Preselected, representative peptides
More coverage by reducing population size
Isotope Coded Affinity Tag
1) Modify cysteine residues using a molecule consisting of 3 parts:
• a thiol reactive group
• a biotin label
• a linker that may contain light or heavy atoms
2) Digest proteins
3) Affinity isolation of labeled cysteine-peptides
4) Use cysteine-peptides for LC-MS/MS analysis
Peptide selection techniques: ICAT
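The effect of step 3 on mixture complexity can be sketched as a simple filter that keeps only cysteine-containing peptides. This is a toy illustration: the protein names and peptide sequences are made up, and the function name is hypothetical.

```python
def select_cysteine_peptides(peptides_by_protein):
    """Mimic the affinity step: keep only peptides that contain cysteine.

    Proteins without any cysteine-containing peptide drop out entirely.
    """
    selected = {}
    for protein, peptides in peptides_by_protein.items():
        cys_peptides = [p for p in peptides if "C" in p]
        if cys_peptides:
            selected[protein] = cys_peptides
    return selected


# Made-up tryptic peptides for three hypothetical proteins
toy_digest = {
    "protein_A": ["MASTEELK", "GGCAVLR", "LLNNK"],
    "protein_B": ["TTPEWCK", "ACDEFGHIK"],
    "protein_C": ["SSGLVPR", "DEAVNK"],
}

selected = select_cysteine_peptides(toy_digest)
n_before = sum(len(peptides) for peptides in toy_digest.values())
n_after = sum(len(peptides) for peptides in selected.values())

print(n_before, n_after)
print(sorted(selected))
```

In this toy set the selection cuts the peptide count from 7 to 3, but protein_C has no cysteine and is lost, which illustrates the trade-off between reduced complexity and reduced redundancy.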
[Figure: structure of the ICAT reagent, consisting of a biotin tag, a linker with eight positions X, and a thiol-specific reactive group. Heavy reagent: X = deuterium; light reagent: X = hydrogen.]
The linker allows differential proteome analysis!
Resulting mass difference: 8 amu per labeled cysteine.
From: Gygi SP et al., Nature Biotechnology, 1999
The ICAT molecule
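The nominal 8 amu difference follows from replacing eight hydrogens by deuterium. A small sketch of the arithmetic, using standard monoisotopic masses, shows the exact spacing and how it changes with charge state and the number of labeled cysteines. The function name is an illustrative assumption.

```python
MASS_H = 1.00782503  # monoisotopic mass of 1H in Da
MASS_D = 2.01410178  # monoisotopic mass of 2H (deuterium) in Da


def icat_pair_spacing(n_labels=1, charge=1, n_deuterium=8):
    """m/z spacing between the light- and heavy-labeled form of a peptide.

    n_labels is the number of labeled cysteines in the peptide;
    n_deuterium is the number of H -> D substitutions per reagent molecule.
    """
    return n_labels * n_deuterium * (MASS_D - MASS_H) / charge


print(round(icat_pair_spacing(), 4))                     # one Cys, 1+
print(round(icat_pair_spacing(charge=2), 4))             # one Cys, 2+
print(round(icat_pair_spacing(n_labels=2, charge=2), 4))  # two Cys, 2+
```

A singly charged peptide with one cysteine shows a pair spaced by about 8.05 in m/z; for a doubly charged ion the spacing halves to about 4.03.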
COmbined FRActional DIagonal Chromatography
• Selection technique based on diagonal chromatography
• Versatile – requires only a specific modification that changes chromatographic properties
• Already applied to methionine, cysteine, N-terminal, nitrosylated, glycosylated, phosphorylated and ATP-binding peptides
• N-terminal analysis is well-suited for detecting proteolytic events
From: Gevaert et al., Molecular & Cellular Proteomics, 2002; Gevaert et al., Nature Biotechnology, 2003
Peptide selection techniques: COFRADIC
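[Figure, primary run: chromatogram (AU against time, with the gradient). Separate and collect in fractions.]
Chemical (or enzymatic) alteration of a subset of peptides in separate or combined fractions
[Figure, secondary run: chromatogram (AU against time, with the gradient). Altered peptides display changed chromatographic properties and elute earlier or later (-, +) than the original fraction.]
Alternatively: selected peptides are not altered (=0), while non-selected peptides are altered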
COFRADIC in principle
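The sorting logic can be sketched as follows: a peptide collected in a primary fraction is selected if, after the alteration step, it elutes outside that fraction's time window in the secondary run (or, in the alternative mode, if it stays inside). Retention times, the size of the shift and all names below are hypothetical values chosen for illustration only.

```python
def cofradic_select(peptides, fraction_start, fraction_end, shift,
                    select_shifted=True):
    """Toy model of one COFRADIC fraction.

    peptides: list of (name, primary_rt, is_altered) tuples.
    shift: retention-time change (minutes) caused by the alteration.
    select_shifted=True collects peptides that leave the original window;
    select_shifted=False collects peptides that stay in it (the '=0' mode).
    """
    selected = []
    for name, primary_rt, is_altered in peptides:
        if not (fraction_start <= primary_rt < fraction_end):
            continue  # not collected in this primary fraction
        secondary_rt = primary_rt + shift if is_altered else primary_rt
        stays_in_window = fraction_start <= secondary_rt < fraction_end
        if stays_in_window != select_shifted:
            selected.append(name)
    return selected


# Hypothetical peptides: (name, primary retention time in min, altered?)
toy_peptides = [
    ("pep1", 41.0, True),
    ("pep2", 42.5, False),
    ("pep3", 43.0, True),
    ("pep4", 50.0, True),   # elutes outside the 40-44 min fraction
]

shifted = cofradic_select(toy_peptides, 40.0, 44.0, shift=-5.0)
unshifted = cofradic_select(toy_peptides, 40.0, 44.0, shift=-5.0,
                            select_shifted=False)
print(shifted)
print(unshifted)
```

With these toy values, pep1 and pep3 move out of the 40 to 44 minute window and are collected in the first mode, while pep2 is the only peptide collected in the alternative mode.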
Methionine COFRADIC (Gevaert et al., 2002)
[Figure: H2O2 oxidation between the primary run and the secondary run converts methionine to methionine sulfoxide.]
N-terminal COFRADIC (Gevaert et al., 2003)
[Figure: TNBS modification between the primary run and the secondary run. N-terminal peptides carry an acetylated N-terminus (Ac) and are not modified; internal peptides carry a free N-terminal amine (NH2), which TNBS converts to a trinitrophenyl group (three NO2 substituents). Lysine side chains are acetylated (NH-Ac) in both classes, and the peptides end in Arg.]
COFRADIC in practice (I)
Cysteine COFRADIC (Gevaert et al., 2004)
[Figure: Ellman's reagent converts cysteine to TNB-cysteine; TCEP reduction between the primary run and the secondary run restores free cysteine.]
COFRADIC in practice (II)
[Figure: log10(mass of N-terminal peptide) plotted against log10(mass of C-terminal peptide). ~60% detectable!]
COFRADIC in practice (III)
Thank you!
Questions?