BITS - Introduction to proteomics

Post on 18-Nov-2014


description

This is the second presentation of the BITS training on 'Mass spec data processing'. It reviews the methods for separating protein mixtures prior to further analysis. Thanks to the Compomics Lab of the VIB for their contribution.

Transcript of BITS - Introduction to proteomics

BITS MS Data Processing – Protein Inference
UGent, Gent, Belgium – 16 December 2011

Kenny Helsens
kenny.helsens@UGent.be

Lennart MARTENS
lennart.martens@ebi.ac.uk

Proteomics Services Group
European Bioinformatics Institute

Hinxton, Cambridge
United Kingdom
www.ebi.ac.uk

kenny helsens

kenny.helsens@ugent.be

Computational Omics and Systems Biology Group

Department of Medical Protein Research, VIB
Department of Biochemistry, Ghent University

Ghent, Belgium

introduction to proteomics


Adapted from the NCBI Science Primer
http://www.ncbi.nih.gov/About/primer/genetics_cell.html

- Primary structure (sequence)

- Secondary structure (structural elements)

- Tertiary structure (3D shape)

- Modifications (dynamic, function)

- Processing (targeting, activation)

…YSFVATAER…

phosphorylation

trypsin

platelet activity

The central paradigm


Principle

Figure: cells undergo cell lysis and protein extraction, yielding a protein mixture (Protein A, Protein B, Protein C, Protein D) that is separated by 2D-PAGE along pI and Mr. Additional label: chemistry toolbox.

2D-PAGE separation of proteins (Est. 1975)


Figure labels: protein extraction, complex protein mixture, 2D-PAGE separation (axes: pI, MW), tryptic digest, MS analysis, fragmentation, MS/MS analysis. Spectra are plotted as relative intensity (%) against m/z.

http://www.akh-wien.ac.at/biomed-research/htx/platweb1.htm

2D-PAGE separation of proteins (Est. 1975)


Figure labels: protein extraction, complex protein mixture, enzymatic digest, extremely complex peptide mixture, separation / selection, less complex peptide fractions, MS analysis, data-dependent MS/MS analyses. Spectra are plotted as relative intensity (%) against m/z.

http://www.akh-wien.ac.at/biomed-research/htx/platweb1.htm

Overall gel-free proteomics workflow


• ICAT (Gygi et al., 1999)

• MudPIT (Washburn et al., 2001)

• Accurate Mass Tags for proteome analysis (Conrads et al., 2000)

• Signature Peptides approach for proteomics (Ji et al., 2000)

• AA-based covalent chromatography peptide selection (Wang & Regnier, 2001)

• Affinity-based enrichment of phosphopeptides (Oda et al., 2001)

• ICAT for phosphopeptides (Zhou et al., 2001)

• Reversible biotinylation of Cys-peptides (Spahr et al., 2000)

• COFRADIC (Gevaert et al., 2002)

Going gel-free in the new millennium


Pros:

• Massive increase in mixture redundancy (e.g. membrane proteins)

• Easier separation of peptides instead of proteins

• Mixture complexity can be reduced by peptide selection (Cys-peptides, Met-peptides, N-terminal peptides, phospho-peptides, …)

• Choice of selection technique, depending on circumstances/analyte

• Additional processing information (N-terminal peptides)

Cons:

• Corresponding increase in mixture complexity (from a few thousand proteins to hundreds of thousands of peptides!)

• Loss of protein-level information (pI, MW, isoforms)

• Peptide selection in turn reduces the redundancy of the mixture

• Massive amounts of data generated (10,000 spectra per hour)

• Unadapted database search engines (N-terminal processing)

An overview of the pros and cons


AN INCOMPLETE OVERVIEW OF GEL-FREE TECHNIQUES


Figure labels: SCX (strong cation exchanger), RP (reverse-phase resin), ESI-based MS

• Orthogonal, 2D separation of peptides

• Analogy with 2D-PAGE: SCX takes the place of pI, RP takes the place of Mr

MudPIT: that which we call a rose…


e.g., Escherichia coli: 4,349 predicted proteins

if 100% expressed: 109,934 detectable tryptic peptides

if 50% expressed: 54,967 detectable tryptic peptides

Sample complexity increased by more than one order of magnitude (about 25 peptides per protein)!

But what about the complexity?
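The jump from proteins to peptides can be illustrated with a minimal in silico digest. This is only a sketch: it assumes the common trypsin rule (cleave after K or R, but not before P), and the length window used to call a peptide "detectable" as well as the toy sequences are hypothetical, since the slide does not state which criteria produced its counts.

```python
def tryptic_digest(sequence):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides


def count_detectable(proteins, min_len=6, max_len=30):
    """Count peptides whose length falls in an ASSUMED detectable window."""
    return sum(
        1
        for protein in proteins
        for peptide in tryptic_digest(protein)
        if min_len <= len(peptide) <= max_len
    )


# Ratio implied by the numbers quoted on the slide (not recomputed here).
PEPTIDES_PER_PROTEIN = 109_934 / 4_349

if __name__ == "__main__":
    toy_proteome = ["MKRPAKLR", "MSTAYSFVATAERGGLK"]  # hypothetical sequences
    for protein in toy_proteome:
        print(protein, "->", tryptic_digest(protein))
    print("detectable:", count_detectable(toy_proteome))
    print("peptides per protein on the slide:", round(PEPTIDES_PER_PROTEIN, 1))
```

A real estimate would run the digest over a full proteome FASTA file and would likely filter on mass rather than length.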


What happens when there are 100,000 peptides present?

How often do we need to repeat an analysis of an identical sample in order to obtain reasonable coverage?

The explorative aspect

A thought experiment seems appropriate
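One way to play through this thought experiment is a simple sampling model. The sketch below assumes that every run identifies a fixed number of peptides drawn uniformly at random and independently of earlier runs; real instruments favour abundant, well-ionising peptides, so this is an optimistic simplification. The figure of 10,000 identifications per run is a hypothetical parameter, not a value from the slides.

```python
import math
import random


def expected_coverage(n_total, n_per_run, runs):
    """Expected fraction of peptides seen at least once after `runs` runs."""
    return 1.0 - (1.0 - n_per_run / n_total) ** runs


def runs_needed(n_total, n_per_run, target):
    """Smallest number of runs whose expected coverage reaches `target`."""
    miss_per_run = 1.0 - n_per_run / n_total
    return math.ceil(math.log(1.0 - target) / math.log(miss_per_run))


def simulate_coverage(n_total, n_per_run, runs, seed=0):
    """Monte Carlo check of the formula under the same uniform assumption."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(runs):
        seen.update(rng.sample(range(n_total), n_per_run))
    return len(seen) / n_total


if __name__ == "__main__":
    n_total, n_per_run = 100_000, 10_000  # n_per_run is hypothetical
    for runs in (1, 5, 10, 20):
        print(runs, round(expected_coverage(n_total, n_per_run, runs), 3))
    print("runs for 95% coverage:", runs_needed(n_total, n_per_run, 0.95))
```

Under these assumptions coverage grows as 1 - (1 - n/N)^k, so each additional run contributes less than the one before it, and reaching 95% takes 29 runs.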


The explorative aspect

Figure labels: 2002, 2006, 2010, complete coverage


Tissue

→ one cell type (level: cells)
• Laser capture microdissection
• Flow cytometry

→ one organelle / compartment (level: compartments)
• Differential detergent fractionation
• Differential centrifugation

→ subset of proteins (level: proteins)
• Gel filtration
• 1D-gel electrophoresis
• Ion exchange

→ subset of peptides (level: peptides)
• ICAT method
• COmbined FRActional DIagonal Chromatography (COFRADIC)

Preselected, representative peptides

More coverage by reducing population size


Isotope Coded Affinity Tag

1) Modify cysteine residues using a molecule consisting of 3 parts:

• a thiol reactive group

• a biotin label

• a linker that may contain light or heavy atoms

2) Digest proteins

3) Affinity isolation of labeled cysteine-peptides

4) Use cysteine-peptides for LC-MS/MS analysis

Peptide selection techniques: ICAT
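The effect of steps 3 and 4 on mixture complexity can be sketched in a few lines. The example below only mimics the selection in silico by keeping peptides that contain a cysteine (one-letter code C); the protein names and peptide lists are hypothetical toy data.

```python
def select_cys_peptides(peptides):
    """Keep only the peptides an ICAT affinity step could retain."""
    return [peptide for peptide in peptides if "C" in peptide]


def proteins_still_represented(protein_peptides):
    """Names of proteins with at least one cysteine-containing peptide."""
    return {
        name
        for name, peptides in protein_peptides.items()
        if select_cys_peptides(peptides)
    }


if __name__ == "__main__":
    # Hypothetical digest results: protein name -> tryptic peptides.
    digest = {
        "protA": ["ACDEFGK", "LLSTVR", "GHIMNPK"],
        "protB": ["QWCER", "TTYCK"],
        "protC": ["AAGGLK", "SSTPVR"],  # no cysteine at all
    }
    all_peptides = [p for peptides in digest.values() for p in peptides]
    selected = select_cys_peptides(all_peptides)
    print(len(all_peptides), "peptides reduced to", len(selected))
    print("proteins still represented:", sorted(proteins_still_represented(digest)))
```

The toy data also shows the trade-off: the mixture gets simpler, but a protein without cysteine-containing peptides (protC here) drops out of the analysis entirely.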


Figure: structure of the ICAT reagent, built from biotin, a linker carrying eight positions marked X, and a thiol-specific reactive group.

heavy reagent: X = deuterium
light reagent: X = hydrogen

The linker allows differential proteome analysis!

Mass difference between the heavy and the light reagent = 8 amu.

From: Gygi SP et al., Nature Biotechnology, 1999

The ICAT molecule
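In a spectrum, the heavy and light forms of the same peptide show up as a pair of peaks separated by the tag's mass difference divided by the charge. The sketch below searches a peak list for such pairs. It assumes eight H-to-D substitutions per tag (about 8.05 Da using monoisotopic masses) and one tag per cysteine; the function name, tolerance and example m/z values are hypothetical.

```python
H_MASS = 1.007825  # monoisotopic mass of 1H, in Da
D_MASS = 2.014102  # monoisotopic mass of 2H, in Da
DELTA_PER_TAG = 8 * (D_MASS - H_MASS)  # about 8.0502 Da per labeled cysteine


def find_icat_pairs(mz_values, charge=1, n_cys=1, tol=0.01):
    """Return (light, heavy) m/z pairs spaced by the expected ICAT shift."""
    expected = n_cys * DELTA_PER_TAG / charge
    peaks = sorted(mz_values)
    pairs = []
    for i, light in enumerate(peaks):
        for heavy in peaks[i + 1:]:
            diff = heavy - light
            if diff > expected + tol:
                break  # peaks are sorted, later ones are even further away
            if abs(diff - expected) <= tol:
                pairs.append((light, heavy))
    return pairs


if __name__ == "__main__":
    peaks = [500.25, 504.2751, 600.3, 608.3502]  # hypothetical m/z values
    print("1+ pairs:", find_icat_pairs(peaks, charge=1))
    print("2+ pairs:", find_icat_pairs(peaks, charge=2))
```

The intensity ratio within each pair is what carries the differential (light versus heavy sample) information; the sketch only locates the pairs.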


COmbined FRActional DIagonal Chromatography

• Selection technique based on diagonal chromatography

• Versatile – requires only a specific modification that changes chromatographic properties

• Already applied to methionine, cysteine, N-terminal, nitrosylated, glycosylated, phosphorylated and ATP-binding peptides

• N-terminal analysis is well-suited for detecting proteolytic events

From: Gevaert et al., Molecular & Cellular Proteomics, 2002
Gevaert et al., Nature Biotechnology, 2003

Peptide selection techniques: COFRADIC


Figure: two chromatograms (labels: AU, time, gradient); peaks in the second are marked -, + or =0.

• Separate and collect in fractions

• Chemical (or enzymatic) alteration of a subset of peptides, in separate or combined fractions

• Altered peptides display changed chromatographic properties (-, +)

• Alternatively: selected peptides are not altered (=0), while non-selected peptides are altered

COFRADIC in principle
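The sorting logic can be sketched as a small simulation: collect a fraction in the first run, alter a subset of its peptides, rerun it, and keep either the peptides that left the original elution window or those that stayed. All retention times, the size of the shift and the `Peptide` class are hypothetical; a methionine test stands in for the chemical alteration.

```python
from dataclasses import dataclass


@dataclass
class Peptide:
    sequence: str
    rt_first: float  # retention time in the first run, in minutes


def rerun_rt(peptide, is_altered, shift):
    """Retention time in the rerun; only altered peptides are shifted."""
    return peptide.rt_first + shift if is_altered(peptide) else peptide.rt_first


def cofradic_select(peptides, is_altered, shift, window, keep_shifted=True):
    """Select peptides from one fraction by whether they leave its window."""
    start, end = window
    fraction = [p for p in peptides if start <= p.rt_first < end]
    selected = []
    for peptide in fraction:
        moved = not (start <= rerun_rt(peptide, is_altered, shift) < end)
        if moved == keep_shifted:
            selected.append(peptide)
    return selected


if __name__ == "__main__":
    mixture = [Peptide("AMSTK", 20.5), Peptide("LLGEK", 21.0), Peptide("MPEPK", 35.0)]
    has_met = lambda p: "M" in p.sequence  # stand-in for the alteration
    shifted = cofradic_select(mixture, has_met, shift=-5.0, window=(20.0, 22.0))
    unshifted = cofradic_select(mixture, has_met, shift=-5.0, window=(20.0, 22.0),
                                keep_shifted=False)
    print("shifted (-):", [p.sequence for p in shifted])
    print("unshifted (=0):", [p.sequence for p in unshifted])
```

Setting `keep_shifted=False` corresponds to the alternative mode in the last bullet, where the peptides of interest are the ones that do not move.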


Methionine COFRADIC (Gevaert et al., 2002)

Figure: between the primary run and the secondary run, H2O2 oxidation converts methionine (side chain CH2-CH2-S-CH3) into methionine-sulfoxide.

N-terminal COFRADIC (Gevaert et al., 2003)

Figure: between the primary run and the secondary run, a TNBS modification is applied.

N-terminal peptides (unchanged by TNBS):
Ac-AA1-AA2-AA3-AA4-...-Arg
Ac-AA1-Lys(NH-Ac)-AA3-AA4-...-Arg

Internal peptides (free N-terminus before TNBS):
NH2-AA1-AA2-AA3-AA4-...-Arg
NH2-AA1-AA2-Lys(NH-Ac)-AA4-...-Arg

After TNBS modification, the N-terminal amine of the internal peptides carries a group with three NO2 substituents.

COFRADIC in practice (I)


Cysteine COFRADIC (Gevaert et al., 2004)

Figure: cysteine is converted to TNB-cysteine by Ellman’s reagent and converted back to cysteine by TCEP reduction; the panels are labeled primary run and secondary run.

COFRADIC in practice (II)


COFRADIC in practice (III)

Figure: plot with axes log10(Mass N-terminal Peptide) and log10(Mass C-terminal Peptide)

~60% Detectable!


Thank you!

Questions?