
A look at data for quantitative analysis using MSight and Phenyx

Atelier Protéomique Quantitative, 25-27 June 2007, La Grande Motte

Pierre-Alain Binz

Swiss Institute of Bioinformatics; GeneBio SA

Already said

• Importance of biological question, sample choice, experimental strategy
• Complexity of sample is a challenge for MS
  – Peak capacity, concentration range, chemical properties, …
• Many methods with pros and cons
  – iTRAQ, SILAC, ICAT, MRM, label-free, …
• Many instrumental settings: heterogeneity of data
  – type, amount, resolution
• Many bioinformatics tools
  – Identification, signal detection, quantitation
• Validation methods

P-A Binz, Atelier Proteomique Quantitative, juin 2007


Outlook

• Visualise LC-MS data
• Detect signal
• Align LC-MS runs
• Match images (differential analysis)
• Add identification results
• Quantitation with search engine

What data for quantitation?

• MS data: dimensions
  – m/z
  – Intensity
  – Rt, pI, scan number
• Secondary data
  – Sample (one, more than one)
  – Molecular interpretation (peptide, protein)
  – Quantitation method (label description, comparison method, thresholds, corrections)

Look at LC-MS data

• Raw MS traces or peaklists (spectrum view or gel view)
• Chromatographic profiles (TIC, XIC)
• 2D images (LC-MS)
• Annotated spectra
• Overlapped spectra, head-to-head view
• Overlapped images
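The chromatographic profiles in the list above can be derived from the 2D LC-MS data itself. The following is a minimal sketch, assuming the run is held as a NumPy array with one row per scan and one column per m/z bin; the function names and the array layout are illustrative and are not taken from MSight.

```python
import numpy as np

def tic(intensity: np.ndarray) -> np.ndarray:
    """Total ion chromatogram: sum of all intensities in each scan (rows = scans)."""
    return intensity.sum(axis=1)

def xic(intensity: np.ndarray, mz_axis: np.ndarray,
        mz_center: float, half_width: float) -> np.ndarray:
    """Extracted ion chromatogram: per-scan sum inside an m/z window."""
    mask = np.abs(mz_axis - mz_center) <= half_width
    return intensity[:, mask].sum(axis=1)

# Hypothetical toy data: 2 scans, 3 m/z bins.
toy_intensity = np.array([[1.0, 2.0, 3.0],
                          [4.0, 5.0, 6.0]])
toy_mz = np.array([500.0, 500.5, 501.0])
toy_tic = tic(toy_intensity)
toy_xic = xic(toy_intensity, toy_mz, mz_center=500.5, half_width=0.1)
```

A TIC collapses the whole m/z axis, while an XIC keeps only a narrow window around one m/z value, which is why the XIC is the profile usually read for a single peptide.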


Visualise LC-MS data: spectrum view, gel view, chromatograms

[Figure: spectrum view, gel view and chromatograms; axes: m/z, intensity (I), Rt]

2D representation

[Figure: a 38x26 grid of raw intensity values, with m/z along one axis and Rt along the other]

Example: LC-ESI-Q-TOF

• Sample: 42-59 kDa extract of human BJAB B-cell line
• Time: interval 0-45 min, sampling rate 3 s, 900 spectra
• Mass: interval 400-1200 m/z, sampling rate 0.025 m/z, 32'000 measures
• Total: 28'800'000 measures (55 MB)

[Figure: LC-MS image; axes: m/z (400-1200) and time (0-40 min); scale bars: 200 Da, 20 min]

Data display principle

• MS data: 32000x900
• Image part to display: 6000x400
• Screen size: 800x600
• Projection of the image part onto the screen

[Figure: projection diagram; axes: m/z, time]
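The projection step maps a data region that is larger than the screen onto the available pixels. The slide does not say which reduction MSight applies, so the sketch below assumes max-binning (each pixel shows the strongest value of the cells it covers); the function name is hypothetical.

```python
import numpy as np

def project_max(region: np.ndarray, out_rows: int, out_cols: int) -> np.ndarray:
    """Downsample a 2D intensity array to at most out_rows x out_cols.

    Each output cell holds the maximum of the input cells it covers
    (an assumed reduction; sum or mean would work the same way).
    """
    rows, cols = region.shape
    out_rows = min(out_rows, rows)   # never upsample
    out_cols = min(out_cols, cols)
    # Integer start index of every bin; strictly increasing because out <= in.
    row_starts = (np.arange(out_rows) * rows) // out_rows
    col_starts = (np.arange(out_cols) * cols) // out_cols
    reduced = np.maximum.reduceat(region, row_starts, axis=0)
    return np.maximum.reduceat(reduced, col_starts, axis=1)

# Hypothetical toy region of 4 x 6 cells projected onto 2 x 3 pixels.
toy_region = np.arange(24).reshape(4, 6)
toy_screen = project_max(toy_region, 2, 3)
```

Max-binning keeps narrow peaks visible after heavy downsampling, which averaging would flatten; that is the reason for choosing it in this sketch.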

Full image

[Figure: the full LC-MS image; axes: m/z (400-1200) and time (0-40 min); scale bars: 200 Da, 20 min]

Zoom 256x

Less than 0.001 % of the data displayed

[Figure: zoomed region; axes: m/z (657.5-660.5) and time (32.5-33 min); scale bars: 0.5 Da, 30 s; isotopic peak spacing: 0.33 (3+), 0.5 (2+)]
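The charge labels in the zoomed view follow from the spacing between isotopic peaks: neighbouring isotopes differ by roughly one 13C/12C mass difference, divided by the charge. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
C13_SPACING = 1.00335  # approximate 13C - 12C mass difference, in Da

def charge_from_spacing(spacing_mz: float) -> int:
    """Estimate the charge state from the m/z spacing of isotopic peaks."""
    return round(C13_SPACING / spacing_mz)

charge_a = charge_from_spacing(0.33)  # spacing read from the zoomed view
charge_b = charge_from_spacing(0.5)
```

With the two spacings shown on the slide this gives 3 and 2, matching the 3+ and 2+ labels.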


MSight

• LC-MS data analysis tool
• Developed by the Proteome Informatics Group of the Swiss Institute of Bioinformatics
• Based on Melanie 2D gel analysis software

It looks a bit like Melanie

http://www.expasy.org

Why MSight?

• Generate and evaluate LC-MS images
  – Import LC-MS and MS/MS runs from various MS instruments and formats
  – Workspace to manage experiments and data
  – Rich visualisation and annotation
  – Visualise the complexity of an LC-MS run
  – Detect contaminants, running aberrations
• Perform peak detection from raw LC-MS data
  – Improve Rt and m/z accuracy using 2D
• Quantitation and comparison
  – Alignment and matching of LC-MS "images"
  – Quantitation reports for differential expression analysis
  – Label-free quantitation
  – Generation of inclusion/exclusion list
• Integrate with identification tools (Phenyx)
  – Annotate MS "peaks" with peptide identity labels
  – Use the annotations to validate matching peaks across LC-MS experiments

Import

• Raw LC-MS and MS/MS data format
  – Native format (yep, baf, fid, T2D, dat)
  – mzXML, mzData
  – ASCII exports
• Handle big original files (100MB-1GB)
• Include profile LC-MS trace and MS/MS spectra

Visualisation

• Open multiple images
• Zoom in/out
• Chromatographic profile ("XIC")
• Spectrum view
• Editable and searchable annotations
  – landmarks, Rt, m/z, peptide sequence, hyperlinks, others
• Synchronisation between views
• Superpose images in transparency mode and complementary colors
• 3D view

Artefacts

[Figures: two LC-MS images showing artefacts; scale bars on the first: 1 min, 100 Da]


Mass calibration

Example: SDS-MALDI-TOF

• Mass: interval 560-3000 m/z, sampling rate 0.05 m/z, 48'800 measures
• 90 spectra
• Total: 4'392'000 measures

[Figure: scale bars: 500 Da, 2 Da; label: 0.15]

Contaminants

[Figure: LC-MS image at two zoom levels; scale bars: 2 Da, 30 s and 100 Da, 5 min; signal series spaced by 44 Da: polymer PEG]
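A 44 Da ladder like the PEG series can also be searched for in a peak list. The sketch below is an illustration only: it assumes centroided m/z values, a known charge, and uses 44.026 Da (the monoisotopic mass of one C2H4O repeat unit); the function name, tolerance and minimum ladder length are hypothetical choices, not MSight parameters.

```python
PEG_UNIT = 44.026  # monoisotopic mass of one C2H4O repeat unit, in Da

def find_peg_ladders(mz_values, charge=1, tol=0.02, min_length=3):
    """Return runs of peaks whose successive m/z differ by PEG_UNIT / charge."""
    step = PEG_UNIT / charge
    peaks = sorted(mz_values)
    ladders, used = [], set()
    for start in peaks:
        if start in used:
            continue
        ladder = [start]
        while True:
            target = ladder[-1] + step
            # First peak within tolerance of the expected next rung, if any.
            nxt = next((p for p in peaks if abs(p - target) <= tol), None)
            if nxt is None:
                break
            ladder.append(nxt)
        if len(ladder) >= min_length:
            ladders.append(ladder)
            used.update(ladder)
    return ladders

# Hypothetical peak list: four rungs of a ladder plus two unrelated peaks.
toy_peaks = [371.1, 415.126, 459.152, 503.178, 600.3, 700.4]
toy_ladders = find_peg_ladders(toy_peaks)
```

Peaks flagged this way are candidates for exclusion before quantitation; a visual check on the 2D image remains the validation step the slides rely on.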

Contaminants (2)

[Figure: scale bars: 5 min, 100 Da]

Redundancy: Peptide modifications

[Figure: LC-MS image of a spot from a 2DE gel; scale bars: 10 min, 100 Da]

[Figure: zoom; scale bars: 2 min, 5 Da; label 5.33 (3+), shown twice; label: Oxidation]

[Figure: scale bars: 10 min, 100 Da; charge-state labels: 2+, 3+, 4+, 5+; label: Oxidation]
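The 5.33 (3+) label is consistent with one oxidation: adding one oxygen atom (about 15.9949 Da) shifts a triply charged ion by a third of that mass on the m/z axis. A short sketch of the calculation, with a hypothetical function name:

```python
OXIDATION = 15.9949  # monoisotopic mass of one oxygen atom, in Da

def oxidation_shift(charge: int) -> float:
    """m/z shift between a peptide ion and its singly oxidised form."""
    return OXIDATION / charge

shift_3plus = round(oxidation_shift(3), 2)
shift_2plus = round(oxidation_shift(2), 2)
```

The same division by charge applies to any modification mass, which is how redundant signals of one peptide can be linked across charge states.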


Outlook

• Visualise LC-MS data
• Detect signal
• Align LC-MS runs
• Match images (differential analysis)
• Add identification results
• Quantitation with search engine

Peak detection

• Detect and quantify MS peaks in a 2D image
• Interactive use
• Manual validation via visualisation
• Export in centroid mode

Peak detection variability

• High vs low resolution in m/z axis
  – Isotopic profile vs bump
• Sampling resolution (Rt and m/z)
  – LC-MALDI < ESI-MS with MS/MS < ESI-MS (QTOF<LTQ)
• Noise (chemical, electronic)
• Shape (rectangle, circle, other)
• Intensity (max, sum, fit max, integrate)
• And for quantitation:
  – Detect peaks in each sample and then compare, vs align first and use one single shape per aligned feature
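The intensity options in the list above give different numbers for the same peak. The sketch below illustrates three of them on a rectangular peak region, assuming a NumPy array of raw intensities and constant sampling steps; "fit max" is left out because the slides do not describe the fitted model. The function name and the rectangle-rule integral are assumptions.

```python
import numpy as np

def peak_intensities(region, rt_step: float, mz_step: float) -> dict:
    """Max, sum and integrated intensity of a rectangular peak region."""
    region = np.asarray(region, dtype=float)
    total = float(region.sum())
    return {
        "max": float(region.max()),
        "sum": total,
        # Rectangle rule: every cell covers rt_step x mz_step.
        "integral": total * rt_step * mz_step,
    }

# Hypothetical 3 x 3 region, sampled every 3 s and every 0.025 m/z.
toy_peak = [[0, 1, 0],
            [1, 4, 1],
            [0, 1, 0]]
toy_measures = peak_intensities(toy_peak, rt_step=3.0, mz_step=0.025)
```

Max is robust to the region outline but sensitive to noise spikes; sum and integral depend on the chosen shape, which is why the slide lists shape and intensity as separate sources of variability.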

Locating the source of noise

[Figure: LC-MS image at two zoom levels; scale bars: 5 min, 5 Da, 15 s and 2 min, 1 Da; label: 37.15]

Streak

[Figure: LC-MS image of a streak with spectra over m/z 807-810; scale bars: 10 min, 1 Da; labels: 2000 (5+), 12000 (2+), 3000 (2+), 28 min; features marked a to n]


Peptide deconvolution

[Figure: time: 31.9 min; scale bars: 1 Da, 1 min; charge-state labels: 2+, 2+, 4+, 2+]

Outlook

• Visualise LC-MS data
• Detect signal
• Align LC-MS runs
• Match images (relative quantitation)
• Add identification results
• Quantitation with identification results

Alignment and comparison

• Align images via landmarks (corrections for local deviations)

• Match images (pair peaks together)

• Report relative quantification information
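Landmark-based alignment with local corrections can be pictured as a piecewise-linear mapping of retention times between pairs of landmarks. The sketch below is one simple way to do this and is not claimed to be MSight's transformation; the function name and the landmark values are hypothetical.

```python
import numpy as np

def align_rt(rt_values, landmarks_run, landmarks_ref):
    """Map retention times of one run onto a reference run.

    Landmarks are paired positions of the same feature in both runs;
    between two landmarks the correction is interpolated linearly.
    Values outside the landmark range are clamped by np.interp.
    """
    order = np.argsort(landmarks_run)
    x = np.asarray(landmarks_run, dtype=float)[order]
    y = np.asarray(landmarks_ref, dtype=float)[order]
    return np.interp(rt_values, x, y)

# Hypothetical landmarks (minutes): run B elutes later at first, earlier at the end.
toy_aligned = align_rt([15.0, 25.0],
                       landmarks_run=[10.0, 20.0, 30.0],
                       landmarks_ref=[11.0, 20.0, 28.0])
```

Because each segment has its own slope, a local deviation in one part of the gradient does not distort the rest of the run, which a single global shift could not achieve.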


Alignment

[Figure: alignment of two images by a transformation; m/z axis: 620-632; scale bar: 4 min]

Migration variability

[Figure: images A and B and their difference A - B; scale bars: 1 min, 2 Da]

Outlook

• Visualise LC-MS data
• Detect signal
• Align LC-MS runs
• Match images (differential analysis)
• Add identification results
• Quantitation with identification results


Quantitation

• Protein mixture
  – 32-45 kDa fraction of lysate from a culture of a B-cell line
  – ~ 1 pmol
  – up to 180 proteins detectable in this sample when analysed extensively by LC-MS/MS
• BSA: +26 fmol, +83 fmol, +520 fmol

[Figure: LC-MS images for the three BSA amounts; peptide LGEYGFQNAL, 740.35 (2+); scale bars: 10 Da, 5 min]

[Figure: zoom on a 3+ signal for the three BSA amounts; scale bars: 2 Da, 2 min]

[Figure: further view of the +26 fmol, +83 fmol and +520 fmol signals]

Differential (low resolution)

[Figure: BSA vs BSA+Lyz; scale bars: 5 min, 20 Da]

Differential analysis

[Figure: images A and B and their difference A-B; scale bar: 100 Da]

[Figure: zoom on A and A-B; scale bar: 2 Da]


Outlook

• Visualise LC-MS data
• Detect signal
• Align LC-MS runs
• Match images (differential analysis)
• Add identification results
• Quantitation with search engine

Coupling with identification

• So far, quantitation without consideration of molecular interpretation
• To quantitate proteins, need to select signals and couple them with peptide identifications

Phenyx

A software platform dedicated to the identification and characterization of proteins and peptides from mass spectrometry data

• Developed by GeneBio, in collaboration with the Swiss Institute of Bioinformatics (SIB)
• Launched in September 2004 (version 1.8)
• Version 2.3 in April 2007
• Rapid development and recognized tool
• Integration in a number of third-party software (Scaffold, TPP, MSight, ProteinScape, Proteus LIMS, …)
• Adopted by a number of large renowned Proteomics centres

http://www.phenyx-ms.com
http://phenyx.vital-it.ch/pwi

Some features

• Core calculation
  – Robust and flexible scoring including log likelihood measures
  – Conflict resolution algorithm
  – Use of annotations in databases (PTMs, variants, AA modifs, …)
• Flexible and interactive interface: the "Phenyx Web Interface"
  – User and jobs properties (user privileges, job sharing)
  – Manual validation functionality
  – Import third party jobs (Mascot, Sequest, X!Tandem, Popitam, …)
  – Many exports (native Phenyx, Excel, XML, text, …)
  – Results comparison functionality
• Integration of Phenyx into workflows: a job follows a suite of configurable events (pre-processing, processing and post-processing)

The Phenyx Web Interface

[Figure: screenshots of the interface: desktop, submission, results views, results comparison, management console, and Excel, XML and text exports]

Integrate MSight and Phenyx

• Example: Annotate LC-MS images with peptide identifications

[Diagram: raw LC-MS; peaklists; Phenyx interface; exported peptide identifications; annotated images]
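Annotating an image with exported identifications amounts to matching each detected peak to an identified peptide that lies close to it in both m/z and retention time. The sketch below shows that matching under stated assumptions: the record layout, the tolerances and the function name are hypothetical, and the real MSight and Phenyx exchange format is not reproduced here.

```python
def annotate_peaks(peaks, identifications, mz_tol=0.05, rt_tol=0.5):
    """Attach a peptide sequence to each (m/z, Rt) peak when one matches.

    peaks: list of (mz, rt) tuples.
    identifications: list of dicts with keys "mz", "rt", "sequence".
    A peak keeps sequence=None when no identification falls within tolerance.
    """
    annotated = []
    for mz, rt in peaks:
        label = None
        for ident in identifications:
            if abs(ident["mz"] - mz) <= mz_tol and abs(ident["rt"] - rt) <= rt_tol:
                label = ident["sequence"]
                break  # first match wins in this simple sketch
        annotated.append({"mz": mz, "rt": rt, "sequence": label})
    return annotated

# Hypothetical peaks and one hypothetical identification.
toy_peaks = [(650.30, 30.1), (621.00, 21.2)]
toy_ids = [{"mz": 650.32, "rt": 30.3, "sequence": "PEPTIDEK"}]
toy_annotated = annotate_peaks(toy_peaks, toy_ids)
```

Peaks left without a label are still quantifiable signals, which is the point made on the undersampling slides: only a fraction of the visible peptides ends up identified.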


Phenyx results are stored as annotations in the images


LC-MS and MS/MS: undersampling

LC-MS and LC-MS/MS on a QStar of 49-62 kDa SDS-separated and trypsin-digested proteins from a human B-cell line. Focus on a small time x m/z region (about 1/250 of the full run).

[Figure: LC-MS image; axes: m/z (621-655) and time (21.15-34.85 min)]

LC-MS and MS/MS: undersampling

• 7/40 peptides analysed
• 3/7 identified
• < 10% positively identified using stringent criteria

[Figure: the same region annotated with the identified peptides FFADLLDYIK, SLDLDSIIAEVK and LALDLEIATYR; axes: m/z (621-655) and time (21.15-34.85 min)]

Outlook

• Visualise LC-MS data
• Detect signal
• Align LC-MS runs
• Match images (relative quantitation)
• Add identification results
• Quantitation with search engine

Quantitation with search engine

• Use of MS/MS data
  – Reporter ions: isobaric labeling (iTRAQ, TMT)
  – emPAI (~ratio observed/predicted peptides)
  – Multiplex (SILAC, 18O)
• Use of MS raw traces
  – Stable isotope labeling (ICAT, SILAC, AQUA, 18O, ICPL, …)
  – Label-free
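Of the MS/MS-based options, emPAI is the simplest to write down: it is derived from the ratio of observed to observable peptides of a protein. The sketch below uses the commonly published form, emPAI = 10^(observed / observable) - 1; whether Phenyx computes it in exactly this way is not stated in the slides, and the function name is hypothetical.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """Exponentially modified protein abundance index.

    n_observed: number of distinct peptides identified for the protein.
    n_observable: number of peptides predicted to be observable.
    """
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1

# Hypothetical protein: 5 of 10 observable peptides were identified.
toy_empai = empai(5, 10)
```

The index needs only the identification results, which is why it is grouped with the methods that work from peaklists rather than from raw MS traces.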

Quantitation: needed information

• Need identified peptides
• Need access to intensities (MS/MS and MS)
• Need quantitation method
  – Labeling method (fixed, variable mode)
  – Definition of "pairs"
  – Intensity correction factors
  – Thresholds for what peptides to consider (confidence levels, scores, #pep / protein)
  – Create report, calculate ratios, evaluate outliers
  – Include in search engine GUI


A quantitation module for Phenyx

[Diagram: generic quantitation methods; prediction of co-peptides; extraction of intensities at MS level and at MS/MS level; calculation of ratios; export]

A quantitation module for Phenyx

[Diagram: the quantitation module, with the InSilicoSpectro and Phenyx Perl APIs; files: (Phenyx) result file, labeling config file (xml), InSilicoDef definition file (xml), quantitation result file (text); external statistics (R)]

One possible integration with MSight (label-free)

[Diagram: two parallel runs, each going from raw LC-MS and peaklists to exported peptide identifications; annotated images; align, compare; annotated peptide ratios]

Phenyx: generate reports from identification results

Perl scripts to generate many kinds of exports

Example for iTRAQ

Examples of filters and search parameters that alter quantitation results

• Minimal number of peptides per protein
• Minimal number of proteotypic peptides
• Minimal score for each peptide
• Filter on redundancy
  – same sequence (same or different charge states)
  – same exact primary structure
  – embedded sequences (missed cleavages, etc.)
• Remove outliers (quant values > threshold CV)
• Number of missed cleavages allowed
• Semi-tryptic peptides and fully unspecific cleavages
• Number of queried modifications


Effect of filters

• Only valid peptides: 6 proteins, 22 peptides
• Min. 3 valid peptides: 4 proteins, 19 peptides
• Min. 3 valid peptides, intensities >10'000: 4 proteins, 15 peptides
• Min. 3 valid peptides, intensities >10'000, CV<20%: 2 proteins, 7 peptides

Filter          # proteins   # peptides
Z-score         6            22
+ 3 peptides    4            19
+ Intensity     4            15
+ CV            2            7

False discovery rate export

FDR = (# peptides in decoy database) / (# peptides in forward database) = f(z-score and p-value)

[Figure: number of valid hits as a function of z-score; series: true hits, hits in reverse; x-axis: z-score (4.0-14.0); y-axis: # hits (0-10000)]

[Figure: FDR (hits in rev / hits in fwd) as a function of z-score; x-axis: z-score (5.0-10.0); y-axis: FDR (0%-20%)]
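The ratio of reverse to forward hits can be evaluated at any score threshold. The sketch below assumes two plain lists of z-scores, one per database; the function name and the toy scores are hypothetical and do not reproduce the data behind the plots.

```python
def fdr_at_threshold(forward_scores, decoy_scores, threshold):
    """FDR estimate: hits in the reverse database / hits in the forward database."""
    n_fwd = sum(1 for s in forward_scores if s >= threshold)
    n_rev = sum(1 for s in decoy_scores if s >= threshold)
    return n_rev / n_fwd if n_fwd else 0.0

# Hypothetical z-scores.
toy_forward = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
toy_decoy = [5, 6]
toy_fdr_loose = fdr_at_threshold(toy_forward, toy_decoy, threshold=5)
toy_fdr_strict = fdr_at_threshold(toy_forward, toy_decoy, threshold=7)
```

Scanning the threshold upwards and recording the FDR gives a curve of the kind shown on the slide, from which a z-score cut-off for a target error rate can be read.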


Calibration status of instrument (3 datasets)

[Figure: z-score (3-19) plotted against delta m/z (-0.6 to 0.4)]

Effect of the search parameters

• 1 round, only 3 fixed mods: 131 valid, 75% cov.
• 2 rounds, add variable mods: 205 valid, 84% cov.
• 2 rounds, with all mods and half cleaved: 348 valid, 90% cov.

Import jobs into Phenyx

Mascot

X!Tandem

Sequest

Phenyx

Manual validation and then quantitation as if Phenyx job

Results comparison tool

What protein in what job? What peptide in what protein/job?

Concatenate results from different runs/search engines

And then go to quantitation…

Summary

• LC-MS data and 2D image analysis (MSight)
  – Rich source of information
  – Detect strange behaviors (discontinuity, contaminations, QC issues)
  – Use of 2 dimensions efficient for signal detection
  – Alignment of multiple MS runs: consider local aberrations
  – Quantitation possible for pairs and for groups (statistics)
• Quantitation with protein identification tool only (Phenyx)
  – Quantitation methods limited to information in peaklists (isobaric labeling, emPAI, Multiplex)
• Quantitation with MSight and Phenyx
  – Get access to raw data information
  – Full panel of quantitation methods
  – Need tight integration (annotation, statistics, filters)
  – Thanks to import functionality, access to other search engines

Take-home messages

• Error to appreciate: biological variability, experimental variability, quantitation method tolerance
• Many tools available, make your choice according to:
  – biological question
  – capacity to analyse data from the chosen quantitation method
  – capacity to analyse data from your instruments
  – possibility to validate generated data (interactivity)
• Understand, evaluate


Acknowledgements

• Phenyx devel team
  – Alexandre Masselot
  – Nicolas Budin
  – Anne Niknejad
  – Olivier Evalet
• PIG group
  – Ron Appel
  – Daniel Walther
  – Gerard Bouchet
  – Sébastien Catherinet
  – Stéphane Pelhâtre
  – Patricia Palagi
• BPRG
  – Ali Vaezzadeh
• PAF
  – Manfredo Quadroni
• University Bern
  – Manfred Heller
• IPBS
  – David Bouyssié

Thank you for your attention!

MSight: http://www.expasy.org

Phenyx: http://phenyx.vital-it.ch/pwi