Novel MS discovery-to-targeted SRM workflows incorporating ROC curve analysis of putative biomarker candidates in bona fide clinical samples

Mary F Lopez, Director, BRIMS

Swedish Proteomics Society, Gothenburg, Sweden, 11-21-10


Biomarker Discovery-to-Targeted Workflow for Proteomics

Fishing for differentially expressed proteins: discovery of putative biomarkers

Targeting proteins in known pathways: verification of putative biomarkers

Quantitative Biomarker Discovery using SIEVE and Two-Pass Workflow on Orbitrap Velos

The SIEVE workflow can be described in 3 main steps:

Frame

•Global intensity-based features

•Reconstructed chromatograms

•Significance statistics and annotation filters

Align

•Chromatographic alignment

•Scalable Adaptive Tiled Algorithm

Identify

•SEQUEST or Mascot for protein/peptides

•ChemSpider for small molecules

Design and Optimization

• Robust, commercially available nanoflow LC

• Commercially available columns

• Focus on stable spray

• Focus on high reproducibility of peak intensities, CV<8%

Pass 1: Quantification

• Chromatographic alignment

• Uncompromised full scan measurements

• Each sample is measured once – no need for replicates

• Internal peptide standards (normalization)

• Triplicate runs of peptide standards every 12 runs (instrument QC)

• “Top10” data dependent acquisition

• Stringent precursor ion selection criteria

Pass 2: Identification

• Targeted fragmentation by inclusion list

• Relaxed precursor ion selection criteria

• Not all samples measured – subset as determined from SIEVE analysis

• Internal peptide standards

• Marker stratification using multi marker and single ROC AUC (SIEVE 1.3)

• Export to Ingenuity pathway analysis

In order to realize the quantitative power of SIEVE, data collection must be very robust

[Diagram labels: methods, inclusion list]

BRIMS Two-Pass Discovery Workflow using SIEVE and Orbitrap Velos

LC setup for Two-Pass Workflow

• Thermo Proxeon EasyNanoLC eliminates need for time consuming SPE and sample pump downs. Just acidify, add standards and load digested peptides.

• Controlled trapping flow rates ensure consistent sample retention and salt removal.

• Rapid column equilibration allows for enhanced duty cycle.

• Hydrophobicity differences from trapping column to resolving column allow for effective refocusing.

• Larger resolving column allows for higher capacity and rapid application of gradient to the column (flow rates to 1.0 uL/min).

[Diagram labels: from pump/autosampler; 5 cm trap column; waste tubing; HV in (from source); 25 cm resolving column; to Orbitrap Velos]

Data Quality – Spray Stability

[Figure labels: March 30, April 4]

Spray stability is the largest factor in reproducible measurements.

Data Quality – Peak Shape

Data Reproducibility - CV

Aligned data

Monitor Lab Environment

BRIMS Lab Weather

• Temperature, Humidity, Power

Method for Assessing Systematic Errors without Sample Technical Replicates

• Systematic errors are assessed from triplicate acquisition of standard sample.

• Internal standards are spiked in all samples.
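The two bullets above can be illustrated with a minimal sketch: percent CV computed from a triplicate acquisition of the standard sample, and sample peak areas rescaled to a spiked internal standard. The function names, peak areas, and reference value are hypothetical and are not taken from SIEVE or from the slides.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)


def normalize_to_internal_standard(peak_areas, standard_area, reference_area):
    """Scale every peak area so the spiked standard matches a reference area."""
    factor = reference_area / standard_area
    return {name: area * factor for name, area in peak_areas.items()}


# Hypothetical peak areas of one standard peptide in a triplicate QC block.
qc_triplicate = [1.00e6, 1.05e6, 0.95e6]
cv = percent_cv(qc_triplicate)
print(f"QC %CV = {cv:.1f}")

# Hypothetical sample in which the spiked standard read 20% low.
sample = {"peptide_A": 2.0e5, "peptide_B": 8.0e4}
normalized = normalize_to_internal_standard(
    sample, standard_area=0.8e6, reference_area=1.0e6
)
print(normalized)
```

A QC block whose %CV drifts above the target stated earlier (CV<8%) would be a signal to check spray stability before trusting the single-injection sample measurements.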

[Figure: Pass 1 acquisition cycle. Run order shown: blank run; standards calibration (three runs); column regeneration with Top 10 fragmentation; patient samples, full scan only (six runs shown)]
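As a rough sketch of how such an acquisition cycle could be laid out as a run queue, the snippet below interleaves a blank, triplicate standards, and a Top 10 column-regeneration run before each block of patient samples. The block size of 12 follows the "every 12 runs" bullet earlier in the deck; the function name, run labels, and sample IDs are hypothetical.

```python
def build_pass1_queue(sample_ids, block_size=12, n_standards=3):
    """Return an ordered list of (run_type, label) tuples for Pass 1."""
    queue = []
    for start in range(0, len(sample_ids), block_size):
        # QC preamble before every block of patient samples.
        queue.append(("blank", None))
        queue.extend(("standard_full_scan", rep + 1) for rep in range(n_standards))
        queue.append(("column_regeneration_top10", None))
        # Patient samples are measured once each, full scan only.
        block = sample_ids[start:start + block_size]
        queue.extend(("sample_full_scan", sid) for sid in block)
    return queue


samples = [f"S{i:02d}" for i in range(1, 14)]  # 13 hypothetical sample IDs
queue = build_pass1_queue(samples)
for run_type, label in queue:
    print(run_type, label)
```

With 13 samples the queue contains two QC preambles, because the thirteenth sample starts a second block.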

The Two- Pass workflow increases sensitivity by effectively fractionating samples in silico

• Typical MS acquisition parameters are not geared for quantification.

• Data dependent acquisition triggers MS2 based on intensity, so most low abundance biomarkers are not identified in complex mixtures with large dynamic range, e.g. blood.

• Classical “shotgun” approaches focus on physical sample fractionation strategies such as depletion and cation exchange coupled with data dependent acquisition.

• Physical fractionation such as depletion and cation exchange results in loss of albumin binding proteins and multiple runs for each sample.

• These approaches are very labor intensive, time consuming and typically do not allow for rigorous quantification and statistical power because fewer samples are analyzed due to time and instrument constraints.

• The Two-Pass Workflow using Inclusion Lists optimizes parameters for full scan quantification and MS2 triggering separately.

• This results in:

Higher sensitivity and getting deeper into the proteome, i.e. more IDs

Precise and reproducible quantification

Flexibility in creating the inclusion list based upon desired attributes such as differential expression, PTMs or other parameters.

Reducing the number of replicates needed since LC reproducibility is high and %CVs are low (ca. 8%)

Increases the biological sampling power (can run more samples in a shorter time).

Decreases the circular biomarker identification syndrome, i.e. “we identified albumin AGAIN”.
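A minimal sketch of how an inclusion list might be assembled from Pass 1 features selected by differential expression. The `Frame` fields, thresholds, retention-time window, and example values are all assumptions made for illustration; they do not reproduce the SIEVE export format or the instrument's inclusion-list format.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    mz: float        # precursor m/z of the feature
    rt_min: float    # apex retention time in minutes
    charge: int
    ratio: float     # expression ratio between sample groups
    p_value: float


def build_inclusion_list(frames, min_fold=1.5, max_p=0.05,
                         rt_half_window=2.0, max_entries=500):
    """Keep significant, differentially expressed frames, largest change first."""
    selected = [
        f for f in frames
        if f.ratio > 0
        and f.p_value <= max_p
        and (f.ratio >= min_fold or f.ratio <= 1.0 / min_fold)
    ]
    # Rank by magnitude of change regardless of direction.
    selected.sort(key=lambda f: max(f.ratio, 1.0 / f.ratio), reverse=True)
    return [
        {
            "mz": round(f.mz, 4),
            "z": f.charge,
            "rt_start": max(0.0, f.rt_min - rt_half_window),
            "rt_end": f.rt_min + rt_half_window,
        }
        for f in selected[:max_entries]
    ]


frames = [
    Frame(680.37, 42.1, 2, 0.45, 1e-4),  # down-regulated, significant
    Frame(512.28, 30.5, 2, 1.05, 0.60),  # unchanged
    Frame(433.91, 55.0, 3, 3.00, 0.01),  # up-regulated, significant
    Frame(700.00, 10.0, 2, 2.00, 0.20),  # changed but not significant
]
inclusion = build_inclusion_list(frames)
for entry in inclusion:
    print(entry)
```

The same filter could be swapped for other attributes (for example a PTM mass shift) without changing the rest of the sketch.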

Quantitative Statistics for the Two-Pass Workflow

[Figure: unique peptides identified by acquisition type: data dependent “Top 10”, inclusion list 1, inclusion list 2, inclusion list 3. Values shown: 2076, 498, 461, 540]

Data Dependent “Top 10” vs Inclusion list

Dataset from a recent collaboration on stroke (discussed in later slides)

Ion Score vs Concentration of Spiked Standard Peptide in Plasma

[Figure: Mascot ion score (y-axis, 0 to 300) vs concentration in amol (x-axis, log scale). Series: Top 10, Two-Pass]

Ongoing collaboration with Dr. MingMing Ning, Mass General Hospital and Harvard University

Discovery of Blood Biomarkers in PFO related Acute Stroke

Application of Discovery Two-Pass Workflow using SIEVE and Orbitrap Velos

Atrial septum

•The prevalence of PFOs in the general population is around 25%, but it is doubled in cryptogenic (unknown cause) stroke patients. These patients are often young and “healthy”.

•If there is a clot traveling into the right side of the heart, it can cross the PFO, enter the left atrium, and travel out of the heart and to the brain causing a stroke.

•This suggests a causal relationship between PFO and cryptogenic stroke.

•Supported by NIH/NINDS (Dr Tom Jacobs), MGH Cardio-Neurology Division evaluates patients with PFO related stroke and the therapeutic efficacy of surgical PFO closure and other stroke treatment.

•Venous blood samples from stroke patients are taken before PFO closure (upon admission) and at the 12-month follow-up after PFO closure.

•Biomarkers for PFO-related stroke could be clinically useful.

Number of patients | Sample type | Patient
5 | PFO pre OP | Stroke
8 | Patient matched PFO post OP | Stroke

Collaboration with Dr. M. Ning, Harvard, MGH, on PFO Stroke

When the atrial septum does not close properly, it is called a patent foramen ovale or PFO.

SIEVE experiment for the PFO stroke study

Sample groups were identified in SIEVE at the beginning of the analysis


SIEVE data demonstrated high reproducibility and robustness of measurements

Reconstructed ion chromatogram of an example frame (not differentially expressed)

Whisker plot of expression ratios for all 13 peptides identified for protein gi119372317. Gray area represents 90% confidence interval for expected protein ratio.

3575 unique peptides and 263 proteins were identified in the study with high confidence. 128 were differentially expressed (determined by ratio).

ROC* analysis: How can we quickly rank the potential “usefulness” of putative biomarkers for clinical research?

Why? Expression ratio and P value may not necessarily be specific to the pathology.

How can we query the data and test the classification power of the target analytes?

• Create ROC curves by plotting the true positive rate (sensitivity) against the false positive rate (1 - specificity) while adjusting the criteria threshold. The area under the curve (AUC) is a measurement of classification power.

• Use AUC to select optimal candidates and discard suboptimal candidates.

• AUC values range from 0.5 (no better than chance) to 1.0. An AUC of 1.0 indicates a specificity and sensitivity of 100%.

• Generate AUC values for individual markers and marker ratios.

*Receiver Operating Characteristic (a classification model)
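The ROC idea above can be sketched in a few lines of plain Python. This is not the SIEVE implementation; it assumes one intensity value per patient for a single marker, and the patient values are invented. The AUC is computed two ways (pairwise comparison and trapezoidal area under the swept-threshold curve) to show they agree, and the direction-free value `max(AUC, 1 - AUC)` is one way to land on the 0.5 to 1.0 scale used on the slide.

```python
def roc_auc(positives, negatives):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def roc_points(positives, negatives):
    """Sweep the decision threshold and collect (false positive rate, true positive rate)."""
    thresholds = sorted(set(positives) | set(negatives), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(p >= t for p in positives) / len(positives)
        fpr = sum(n >= t for n in negatives) / len(negatives)
        points.append((fpr, tpr))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical marker intensities for two patient groups.
pre_op = [0.9, 1.1, 1.4, 1.6, 2.0]
post_op = [2.1, 2.4, 1.5, 2.8, 3.0, 2.2, 2.6, 2.9]

auc_pairwise = roc_auc(post_op, pre_op)
auc_curve = trapezoid_area(roc_points(post_op, pre_op))
# Direction-free classification power on the 0.5 to 1.0 scale.
auc_reversed = roc_auc(pre_op, post_op)
classification_power = max(auc_reversed, 1.0 - auc_reversed)

print(auc_pairwise, auc_curve, classification_power)
```

For a marker ratio, the same functions apply once each patient's two marker values have been divided to give a single score per patient.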

[Figure: ROC curve; axes: sensitivity and specificity]

Description | Peptides | Ratio* (Pre OP vs Post OP) | % standard error | StdDev (Pre OP vs Post OP) | P value (Pre OP vs Post OP) | Avg ROC AUC
_gi_4503635_ref_NP_000497.1_ prothrombin preproprotein [Homo sapiens] | 4 | 0.55 | 16.57 | 0.09 | 9.9E-20 | 1.00
_gi_261878616_ref_NP_001159907.1_ inter_alpha_trypsin inhibitor heavy chain H1 isoform c [Homo sapiens] | 5 | 0.48 | 19.94 | 0.10 | 9.9E-20 | 1.00
_gi_283806712_ref_NP_001164609.1_ clusterin isoform 3 [Homo sapiens] | 6 | 0.53 | 16.32 | 0.09 | 9.9E-20 | 0.99
_gi_70778918_ref_NP_002207.2_ inter_alpha_trypsin inhibitor heavy chain H2 [Homo sapiens] | 16 | 0.51 | 9.27 | 0.05 | 9.9E-20 | 0.99
_gi_32483410_ref_NP_000574.2_ vitamin D_binding protein precursor [Homo sapiens] | 7 | 0.45 | 19.76 | 0.09 | 9.9E-20 | 0.99
_gi_41393602_ref_NP_958850.1_ complement C1s subcomponent precursor [Homo sapiens] | 3 | 0.56 | 18.32 | 0.10 | 9.9E-20 | 0.99
_gi_4502261_ref_NP_000479.1_ antithrombin_III precursor [Homo sapiens] | 12 | 0.31 | 13.57 | 0.04 | 9.9E-20 | 0.98
_gi_31542984_ref_NP_002209.2_ inter_alpha_trypsin inhibitor heavy chain H4 isoform 1 precursor [Homo sapiens] | 19 | 0.45 | 13.00 | 0.06 | 9.9E-20 | 0.97
_gi_50659080_ref_NP_001076.2_ alpha_1_antichymotrypsin precursor [Homo sapiens] | 11 | 0.49 | 12.94 | 0.06 | 9.9E-20 | 0.96
_gi_239752152_ref_XP_002348153.1_ PREDICTED: hypothetical protein XP_002348153 [Homo sapiens] | 3 | 0.56 | 16.58 | 0.09 | 9.9E-20 | 0.96
_gi_73858570_ref_NP_001027466.1_ plasma protease C1 inhibitor precursor [Homo sapiens] | 9 | 0.57 | 11.69 | 0.07 | 9.9E-20 | 0.96
_gi_38016947_ref_NP_001726.2_ complement C5 preproprotein [Homo sapiens] | 8 | 0.60 | 16.35 | 0.10 | 9.9E-20 | 0.96
_gi_4557321_ref_NP_000030.1_ apolipoprotein A_I preproprotein [Homo sapiens] | 13 | 0.54 | 10.15 | 0.05 | 9.9E-20 | 0.96
_gi_62739186_ref_NP_000177.2_ complement factor H isoform a precursor [Homo sapiens] | 4 | 0.60 | 19.42 | 0.12 | 9.9E-20 | 0.95
_gi_4557871_ref_NP_001054.1_ serotransferrin precursor [Homo sapiens] | 16 | 0.21 | 13.61 | 0.03 | 9.9E-20 | 0.95
_gi_4557485_ref_NP_000087.1_ ceruloplasmin precursor [Homo sapiens] | 22 | 0.37 | 13.15 | 0.05 | 9.9E-20 | 0.95
_gi_296080754_ref_NP_001171670.1_ fibrinogen beta chain isoform 2 preproprotein [Homo sapiens] | 18 | 0.21 | 14.15 | 0.03 | 9.9E-20 | 0.95
_gi_70906437_ref_NP_000500.2_ fibrinogen gamma chain isoform gamma_A precursor [Homo sapiens] | 16 | 0.54 | 9.85 | 0.05 | 9.9E-20 | 0.94
_gi_169214179_ref_XP_001724196.1_ PREDICTED: similar to complement component 3 [Homo sapiens] | 12 | 0.49 | 14.89 | 0.07 | 9.9E-20 | 0.94
_gi_4557325_ref_NP_000032.1_ apolipoprotein E precursor [Homo sapiens] | 9 | 0.45 | 18.63 | 0.08 | 9.9E-20 | 0.94

Top 21 single proteins with highest ROC AUC for PFO Stroke Study

* Ratio = PRE OP/POST OP

Biological context? Ingenuity Pathways Analysis (IPA)

Top network: Lipid Metabolism

Top disease: Neurological Disease

Top physiological system development and function: Hematological system

Top canonical pathways:

• Acute phase signaling

• Coagulation system

• Complement system

• Intrinsic Prothrombin Pathway

• Extrinsic Prothrombin Pathway

The entire PFO stroke dataset was uploaded and analyzed with IPA

Top 2 ROC AUC candidates, selected literature references

Clin Chim Acta. 2009 Apr;402(1-2):160-3. Inter-alpha-trypsin inhibitor heavy chain 4 is a novel marker of acute ischemic stroke. Kashyap RS, Nayak AR, Deshpande PS, Kabra D, Purohit HJ, Taori GM, Daginawala HF. Biochemistry Research Laboratory, Central India Institute of Medical Sciences, 88/2 Bajaj Nagar Nagpur-10, India.

Stroke. 2007 Jul;38(7):2070-3. Epub 2007 May 24. Prothrombotic mutations as risk factors for cryptogenic ischemic cerebrovascular events in young subjects with patent foramen ovale. Botto N, Spadoni I, Giusti S, Ait-Ali L, Sicari R, Andreassi MG. CNR Institute of Clinical Physiology, G. Pasquinucci Hospital, Massa, Italy.

Description | Avg ROC AUC
_gi_4503635_ref_NP_000497.1_ prothrombin preproprotein [Homo sapiens] | 1.00
_gi_261878616_ref_NP_001159907.1_ inter_alpha_trypsin inhibitor heavy chain H1 isoform c [Homo sapiens] | 1.00

Verification and translation of putative biomarkers into targeted assays using SRM and PinpointTM Software

Pinpoint software was developed (at BRIMS) to make SRM assays easy, automated and efficient

List of Targeted Proteins

Discovery data: Proteome Discoverer, SIEVE, PeptideAtlas, NIST, GPM

Recombinant protein

Heavy-labeled peptides

QC standards

Exhaustive list: peptides, transitions

Identify and verify: best peptides, best transitions

Refine transition list

Optimize LC gradient

Verify the LC-SRM Assay with Recombinant Digests

Analyze Biological Samples

Pinpoint

Pinpoint Algorithmic prediction

Pinpoint provides assay throughput options…

5-10 peptides: regular multiple SRM

50-100 peptides: scheduled SRM (tSRM)

500-1000 peptides: tSRM + iSRM

5000-10000 peptides: tSRM + iSRM + Split-n-stitch

Automated scoring schemes help prioritize large analyses into high, medium and low quality bins

And more…

• Single software to help iterative method building to go from protein list to absolute abundance

• Multi-threaded

• Extremely easy data and results sharing

• Customers can give video feedback

• Video help tutorials to get you started

iSRM – Quantifying and verifying low level biomarkers in biological matrices

[Figure: fragment ion map of peptide ELASGLFPVGFK with y3 to y10 ions labeled]

Primary SRM transition m/z 680.37 → 789.44, NL: 2.48E2

Primary SRM transition m/z 680.37 → 959.54, NL: 1.50E2

Data dependent SRM, primary and secondary SRM transitions, NL: 1.12E3

Ongoing collaboration with Dr. MingMing Ning, Mass General Hospital and Harvard University

Development of a multiplexed SRM assay for Apolipoproteins:

Application Cardiovascular disease and stroke

Targeted assay development for high abundance proteins

Ischemic vs hemorrhagic stroke

• About 80 percent of strokes are ischemic, caused by a blockage of the vessels that supply blood to the brain. More than 400,000 people in the United States every year are affected.

• About 20 percent of all strokes are hemorrhagic; this type of stroke involves the rupture of a blood vessel in or around the brain.

• TPA is the only treatment for ischemic stroke. It can only be given within 6 hrs of the event.

• If TPA is given to a hemorrhagic stroke patient, death can result. 

• An assay that could accurately differentiate ischemic from hemorrhagic stroke quickly would be clinically useful.

Diagnosis for acute stroke is currently by:

• Neurological exam

• CAT scan

• MRI

• Lumbar puncture

Number of patients | Blood collection time | Sample type
53 | Upon admission | Ischemic stroke
26 | Upon admission | Hemorrhagic stroke

Development of a multiplexed assay for a panel of apolipoproteins: application to stroke

The relative levels of various apolipoproteins can be important biomarkers for heart disease, stroke, Alzheimer’s, diabetes and metabolic syndrome.

Typically, these proteins are individually measured in blood by immunoassay.

The availability of a multiplexed assay that could simultaneously and quantitatively measure a panel of apolipoproteins would be an extremely useful clinical research tool.

We decided to interrogate clinical samples to see if apolipoproteins could be used to classify different types of strokes.

Clinical Samples

Single day development of a multiplexed assay for a panel of apolipoproteins using Pinpoint

1. Import protein sequences and prior LC-MS/MS discovery data library for 10 apolipoproteins.

2. Choose optimal “proteotypic” peptides, i.e. highest intensity and unique. Narrow list down to one peptide per protein.

3. Choose at least 5 fragment transitions per peptide. This ensures accurate identification of peptides. Create method and run sample triplicates.
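The selection steps above can be sketched as follows. This is not Pinpoint's algorithm: it simply assumes a discovery library with one intensity per peptide and per fragment ion, keeps the most intense peptide that maps to a single protein, and then keeps the five most intense fragments. The peptide strings are made-up placeholders, not real apolipoprotein sequences, and all intensities are invented.

```python
def pick_proteotypic(peptides):
    """Return {protein: best peptide record}, using only peptides unique to one protein."""
    proteins_per_sequence = {}
    for pep in peptides:
        proteins_per_sequence.setdefault(pep["sequence"], set()).add(pep["protein"])

    best = {}
    for pep in peptides:
        if len(proteins_per_sequence[pep["sequence"]]) != 1:
            continue  # shared between proteins, so not proteotypic
        current = best.get(pep["protein"])
        if current is None or pep["intensity"] > current["intensity"]:
            best[pep["protein"]] = pep
    return best


def pick_transitions(fragment_intensities, n=5):
    """Return the names of the n most intense fragment ions."""
    ranked = sorted(fragment_intensities.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Hypothetical discovery library entries (placeholder sequences).
library = [
    {"protein": "Apo AI", "sequence": "AAAK", "intensity": 5e5},
    {"protein": "Apo AI", "sequence": "SHAREDK", "intensity": 9e5},
    {"protein": "Apo AII", "sequence": "SHAREDK", "intensity": 7e5},
    {"protein": "Apo AII", "sequence": "CCCR", "intensity": 3e5},
    {"protein": "Apo AI", "sequence": "DDDK", "intensity": 6e5},
]
best = pick_proteotypic(library)

# Hypothetical fragment intensities for one chosen peptide.
fragments = {"y3": 120, "y4": 800, "y5": 450, "y6": 300, "y7": 950, "y8": 200, "y9": 90}
transitions = pick_transitions(fragments)

print({protein: pep["sequence"] for protein, pep in best.items()})
print(transitions)
```

Note that the most intense Apo AI entry is rejected here because its sequence is shared with Apo AII, which is the point of the uniqueness requirement.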

ROC analysis of apolipoprotein levels in hemorrhagic vs ischemic stroke patients: Single marker AUC

Top AUC for single marker

Apo CIII 0.80

Apo AI 0.76

Apo CII 0.70

Apo D 0.66

Apolipoprotein panel: Apo AI, Apo AII, Apo AIV, Apo B, Apo CI, Apo CII, Apo CIII, Apo D, Apo E, Apo H

ROC analysis of apolipoprotein levels in hemorrhagic vs ischemic stroke patients: Multi marker AUC

Top AUC for multi markers

Apo CIII and Apo AII 0.80

Apo CIII and Apo CI 0.87

Apo H and Apo AII 0.86

Apo AI and Apo CI 0.85
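One plausible way to obtain a multi-marker AUC, sketched here under the assumption that a marker pair is combined as a per-patient ratio (the deck mentions AUC values for "marker ratios" but does not spell out how pairs were combined). The patient values are invented and the function names are hypothetical.

```python
from itertools import combinations
from math import log


def roc_auc(positives, negatives):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def rank_marker_pairs(group_a, group_b):
    """Score every marker pair by the AUC of its log ratio; best pair first."""
    results = []
    for m1, m2 in combinations(sorted(group_a), 2):
        score_a = [log(x / y) for x, y in zip(group_a[m1], group_a[m2])]
        score_b = [log(x / y) for x, y in zip(group_b[m1], group_b[m2])]
        auc = roc_auc(score_a, score_b)
        # Direction-free value on the 0.5 to 1.0 scale.
        results.append((max(auc, 1.0 - auc), m1, m2))
    return sorted(results, reverse=True)


# Hypothetical per-patient levels (arbitrary units), same patient order per group.
ischemic = {
    "Apo CIII": [4.0, 5.0, 6.0, 5.5],
    "Apo CI": [2.0, 2.0, 2.0, 2.0],
    "Apo AI": [1.0, 1.0, 1.0, 1.0],
}
hemorrhagic = {
    "Apo CIII": [3.0, 4.5, 3.5],
    "Apo CI": [3.0, 3.0, 3.0],
    "Apo AI": [1.0, 1.0, 1.0],
}

ranking = rank_marker_pairs(ischemic, hemorrhagic)
for auc, m1, m2 in ranking:
    print(f"{m1} / {m2}: AUC = {auc:.2f}")
```

With a panel of 10 apolipoproteins this scan covers 45 pairs, so any top-ranked pair found this way would still need confirmation in an independent sample set.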


Development of an assay for PTH: Collaboration with Intrinsic BioProbes and Mayo Clinic

Clinical Chemistry, 2010

Targeted assay development for low abundance proteins

Dr. Ravinder Singh, Dr. Randall Nelson

The large dynamic range of proteins in blood presents a technical hurdle to the development of SRM assays for biomarkers present in low abundance.

PTH is secreted into the circulatory system to produce healthy concentrations of ca. 15 to 65 pg/mL; therefore, enrichment is required for mass spec detection.

PTH range

Intrinsic BioProbes/ThermoFisher PTH assay platform for enrichment of low abundance proteins using MSIA (Mass Spec Immuno Assay)

Clinical samples

Capture on AB activated tips

MSIA TIP Versette (ALH) TSQ Vantage (MS)

Affinity Capture Automated Processing Quantitative Analysis

• Conventional PTH assays typically rely on two-antibody recognition systems, e.g. ELISA.

• Immunoassays cannot accurately differentiate between full length (PTH aa1-84) and clinically important variants (aa7-84 and others).

• There is a need for more specific assays that can accurately quantify different clinical variants.

Not all immunocapture/immunoprecipitation methods can deliver the necessary recovery and signal

Antibody capture method | Analyte | Location tested | LOD in matrix (pg/mL) | Analyte MW | LOD in matrix (pmol/L)
SISCAPA | Troponin | Addona et al Clin Chem 2009, 55:1108-1117 | 600 | 20K | 50
SISCAPA | Thyroglobulin | Hoofnagle et al Clin Chem 2008, 54:1796-1804 | 2600 | 650K | 4
96 well ELISA plate | PTH | Thermo BRIMS unpublished | 250 | 10K | 30
Magnetic beads | PTH | Thermo BRIMS unpublished | 200 | 10K | 25
MSIA Tip | PTH | Lopez et al Clin Chem 2010, 56:281-290 | 8 | 10K | 1
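The two LOD columns are related by a unit conversion through the analyte molecular weight. A minimal sketch, using the nominal molecular weights listed in the table (the slide's own molar values may have been rounded or derived from more exact masses, so not every row needs to reproduce exactly); the function name is hypothetical.

```python
def pg_per_ml_to_pmol_per_l(conc_pg_per_ml, mw_da):
    """Convert a mass concentration (pg/mL) to a molar concentration (pmol/L).

    1 pg/mL equals 1 ng/L; dividing ng/L by g/mol gives nmol/L,
    and multiplying by 1000 gives pmol/L.
    """
    return conc_pg_per_ml * 1000.0 / mw_da


# Thyroglobulin row: 2600 pg/mL at a nominal MW of 650 kDa.
thyroglobulin_lod = pg_per_ml_to_pmol_per_l(2600, 650_000)

# MSIA tip PTH row: 8 pg/mL at a nominal MW of 10 kDa.
msia_pth_lod = pg_per_ml_to_pmol_per_l(8, 10_000)

print(thyroglobulin_lod, msia_pth_lod)
```

Expressing the limits in molar units is what makes assays for analytes of very different size, such as thyroglobulin and PTH, comparable.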

Development of a PTH assay: top-down analysis confirmed that PTH is heterogeneous and variants have clinical relevance

[Figure: top-down mass spectra of renal failure samples; y-axis: relative intensity. Panel 1: m/z 4200 to 6200 (spectra at 3X), peaks labeled 37-84, 38-84, 38-77, 34-84, 28-84, 48-84, 45-84, 34-77, 37-77. Panel 2: m/z 9325 to 9525, peak labeled 1-84]

We chose 4 tryptic monitoring and 2 variant specific peptides for SRM

PTH variant map (residue number axis: N, 20, 40, 60, 80)

Variants: [1-84], [7-84], [34-84], [37-84], [38-84], [45-84], [28-84], [48-84], [34-77], [37-77], [38-77]

SRM fragment | Peptide
[1-13] | SVSEIQLMHNLGK
[7-13] | LMHNLGK (variant specific)
[14-20] | HLNSMER
[28-44] | LQDVHNFVALGAPLAPR
[34-44] | FVALGAPLAPR (variant specific)
[73-80] | ADVNVLTK
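The residue ranges of the six SRM peptides can be checked by locating each peptide in the full-length sequence. The sketch below assumes the human PTH(1-84) sequence as commonly listed in public protein databases (verify it against the entry you use) and applies a simplified rule: a peptide whose N-terminus does not follow K or R cannot come from tryptic cleavage of intact PTH, so it is treated as specific to an N-terminally truncated variant.

```python
# Assumed human PTH(1-84) sequence; check against your reference database.
PTH_1_84 = (
    "SVSEIQLMHNLGKHLNSMERVEWLRKKLQDVHNFVALGAPLAPR"
    "DAGSQRPRKKEDNVLVESHEKSLGEADKADVNVLTKAKSQ"
)

SRM_PEPTIDES = [
    "SVSEIQLMHNLGK",
    "LMHNLGK",
    "HLNSMER",
    "LQDVHNFVALGAPLAPR",
    "FVALGAPLAPR",
    "ADVNVLTK",
]


def map_peptide(sequence, peptide):
    """Return (start, end, n_term_tryptic) using 1-based residue numbering."""
    idx = sequence.find(peptide)
    if idx < 0:
        raise ValueError(f"{peptide} not found in sequence")
    start, end = idx + 1, idx + len(peptide)
    # Simplified rule: tryptic if at the protein N-terminus or preceded by K/R.
    n_term_tryptic = start == 1 or sequence[idx - 1] in "KR"
    return start, end, n_term_tryptic


mapping = {pep: map_peptide(PTH_1_84, pep) for pep in SRM_PEPTIDES}
variant_specific = [pep for pep, (_, _, tryptic) in mapping.items() if not tryptic]

for pep, (start, end, tryptic) in mapping.items():
    label = "tryptic" if tryptic else "variant specific"
    print(f"[{start}-{end}] {pep} ({label})")
```

Under these assumptions the split comes out as four tryptic monitoring peptides and two variant-specific peptides, matching the counts stated on the slide.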

Standard curves for PTH peptide SRM assays demonstrate high precision

LQDVHNFVALGAPLAPR

SVSEIQLMHNLGK

LOD was estimated at ca 8pg/mL and LOQ was calculated to be ca 30 pg/mL.

R2 = 0.93, %CV < 10

R2 = 0.98, %CV < 10

Differential expression ratios of PTH peptides in renal failure vs normal samples; ratios ranged from 4.4 to 12.3

[Figure: five bar charts of raw signal intensity, renal vs control, one per peptide: LQDVHNFVALGAPLAPR (aa28-44), SVSEIQLMHNLGK (aa1-13), HLNSMER (aa14-20), FVALGAPLAPR (aa34-44), ADVNVLTK (aa73-80). Ratios shown: 7.6, 7.5, 12.3, 9.2, 4.4]

Summary

• An integrated workflow for quantitative, label-free proteomic analysis facilitates discovery

• Important components of a discovery platform include powerful instrumentation and software

• Results from discovery experiments can be translated into targeted assays for biomarker verification

Acknowledgements

BRIMS TEAM

Mary Lopez, Director

David Sarracino, Manager, Biomarker Workflows

Bryan Krastins, Biomarker Scientist

Amol Prakash, Bioinformatic Scientist

Michael Athanas, Software Consultant

Taha Rezai, Quantitative Proteomics Scientist

Jennifer Sutton, Manager, Biomarker Research

Thermo Fisher: Scott Peterman, Amy Zumwalt, Andreas Huhmer, Bernard Delanghe

IBI, ASU Biodesign Institute: Randall Nelson, Dobrin Nedelkov, Paul Oran, Chad Borges

Mass General Hospital, Harvard U.: MingMing Ning, Ferdinando S Buonanno, Eng H Lo

Mayo Clinic: Ravinder Singh, Dave Barnidge

BACKUP SLIDES

• Proxeon Easy-nLC

• Trap column: 100 um x 5 cm PS-DVB, 5 um particle (15-20 um for dirty samples), 300 A pore

• Loading flow rates: 5 uL/min for 5 um traps; 15-20 uL/min for 15-20 um particle traps

• Resolving column: 100 um x 25 cm C18AQ, 200 A

• Buffer A: 5% methanol, 0.2% formic acid in water

• Buffer B: 90% acetonitrile, 0.2% formic acid in water

• Thermo Nanospray source

• Instrument tuned on angiotensin I

• Lock masses used: common polysiloxanes and phthalates

Two-Pass workflow LC configuration