
[Bar graph residue: peptide identification counts of 2726, 2817, and 3686 on an axis from 0 to 4000.]

Investigation of Protein Signatures Associated with ALS Disease Progression Using Data Independent Acquisition Mass Spectrometry

Allyson L. Mellinger¹, Emily H. Griffith², and Michael S. Bereman¹,³

¹Department of Chemistry, ²Department of Statistics, and ³Department of Biological Sciences, North Carolina State University, Raleigh, NC

Clinical Care Hindered by Subjective “Measurements”

Salivation
4  Normal
3  Slight, but definite excess of saliva in mouth, may have nighttime drooling
2  Moderately excessive saliva, may have minimal drooling
1  Marked excess of saliva with some drooling
0  Marked drooling, requires constant tissue

Dyspnea (shortness of breath)
4  None
3  Occurs when walking
2  Occurs with one or more: eating, bathing, dressing
1  Occurs at rest, either sitting or lying
0  Significant difficulty, considering mechanical support

ALS Functional Rating Scale: Current gold standard for evaluating therapeutic efficacy in clinical trials

Data Dependent vs Data Independent Acquisition for CSF Analysis

Longitudinal Proteomic Analysis of ALS Patient CSF

Data Quality Control and Patient Variability

Linear Mixed-Effects Modeling of Disease Progression

LASSO and Forward Selection for Model Building

Conclusions and Future Work

References and Acknowledgements

Peptide                  Gene      Estimate  p-value
EHIVDLEMLTVSSIGTFR       EFEMP1        5.49    0.098
AGLAASLAGPH              SOD3          3.87    0.041
ESNPHCGSDGQTYGNK         SPINK6        2.23    0.228
EVDSGNDIYGNPIK           TIMP2         2.01    0.322
ISVTQGASK                FCGBP         1.72    0.044
VPTADLEDVLPLAEDITNILSK   GC            1.63    0.006
YVMLPVADQDQCIR           HP           -1.65    0.111
IGEWIGR                  C8orf34      -1.8     0.078
EVEELMEDTQHK             DKK3         -2.45    0.412
STTAVVTNPK               TTR          -3.97    0.160
FTFHADICTLSEK            ALB          -5.32    0.012
IISVELPGDAK              RELN         -7.87    0.003

[Plot residue: Model Observed Score vs. Model Predicted Score; Residuals vs. Fitted; AIC vs. # Variables in Model, with the minimum AIC marked; labels: Negative Control, Model.]

[Plot residue: Number of Peptide IDs for DDA, DIA-Pan Human Library, and DIA-CSF Library; Coefficient of Variation (%) for abundance quantiles Q1 to Q4, DDA and DIA; Venn diagram labels: DDA 53.9%, DIA 97.5%.]

[Plot residue: Log2(Peptide Abundance) per sample, with QC and patients by color and one sample marked *; Rate (Change in ALSFRS Score / Change in months) vs. Log(Percent Change from Baseline), titled "Targeted Global Inflammation Index"; annotated p = 0.038, p = 0.074, p = 0.031.]

Experimental workflow:
• Pooled disease cohort sample for digestion QC (Day 1, Day 2, Day 3)
• Filter-aided tryptic digestion of 50 µg protein per sample
• Analysis by nLC-HRAM-MS/MS
• 3 µL of each digested ALS sample pooled for the chromatogram library

Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease that destroys both upper and lower motor neurons. As a result, patients experience progressive loss of voluntary muscle action and fatal respiratory failure within a few years of onset. Only 10% of cases are considered hereditary. There is no known cure, and the discovery of any disease-reversing treatment is hampered by limited understanding of the disease’s pathogenesis and by the absence of an objective, linear measure of disease progression. The ALS Functional Rating Scale (ALS-FRS) (Figure 1) is currently the most widely used clinical measure. This 12-question, 48-point survey assesses a patient’s ability to perform everyday activities. A quantitative measurement is sorely needed. In this work, we investigate whether peptide measurements in the cerebrospinal fluid (CSF) of ALS patients can serve as accurate signatures for modeling disease progression.

Figure 1. Left: Example questions from the ALS-FRS. The survey yields subjective measurements that are not linearly related to disease severity. Above: Scatterplot of scores from ALS patients in this study. Some measured scores did not decrease linearly over time (circled).

Figure 2. Comparison of data-dependent acquisition (DDA) and data-independent acquisition (DIA) for CSF proteomics. A: Bar graph comparing the average number of peptides identified with each method. B: Venn diagram comparing average peptide identifications of four replicate analyses using each method. C: Boxplots comparing the precision of peptide measurements made in each quantile of abundances using each method.
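The precision comparison in panel C rests on a per-peptide coefficient of variation across replicate analyses, grouped by abundance. A minimal sketch of that calculation follows. It assumes that Q1 is the lowest-abundance group and that peptides are ranked by mean abundance; the peptide names and abundances are hypothetical placeholders, not data from the poster.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Percent CV = 100 * sample standard deviation / mean."""
    return 100.0 * stdev(values) / mean(values)

def cv_by_quartile(replicates):
    """Group peptides into abundance quartiles (Q1 = lowest, an assumption)
    and collect the CV of each peptide in its quartile.

    `replicates` maps a peptide to its abundances across replicate injections.
    """
    ranked = sorted(replicates, key=lambda p: mean(replicates[p]))
    n = len(ranked)
    groups = {"Q1": [], "Q2": [], "Q3": [], "Q4": []}
    for rank, peptide in enumerate(ranked):
        quartile = "Q" + str(min(4, 1 + (4 * rank) // n))
        groups[quartile].append(coefficient_of_variation(replicates[peptide]))
    return groups

# Hypothetical abundances from four replicate analyses (not poster data).
example = {
    "PEPTIDE_A": [100.0, 110.0, 90.0, 100.0],
    "PEPTIDE_B": [1000.0, 1000.0, 1000.0, 1000.0],
    "PEPTIDE_C": [10.0, 20.0, 10.0, 20.0],
    "PEPTIDE_D": [5000.0, 5500.0, 4500.0, 5000.0],
}
quartile_cvs = cv_by_quartile(example)
```

Each list in `quartile_cvs` could then be drawn as one box per method and quartile.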

Figure 4. A: Boxplots of logged peptide abundances measured in every sample to visualize data quality. One outlier sample is evident (marked *), and this patient was excluded from most subsequent analyses. B: Principal component analysis separating peptide abundances of samples. C: Boxplots of variation both within and among patient and QC samples.

Figure 3. Experimental design for analysis of longitudinal CSF. The 63 samples, plus one digestion quality control sample per day, were split across three days of digestion, and the analysis order was randomized within each digestion day. Gas-phase fractionated injections for building the chromatogram library (1) were collected in the middle of the experiment.
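The batching described in this design can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the batch sizes of 23, 23, and 17 are read from the figure labels, assignment of samples to days is assumed to be random (the poster does not say how samples were allocated), and the sample names and the `plan_batches` helper are hypothetical.

```python
import random

def plan_batches(sample_ids, batch_sizes, seed=0):
    """Split samples across digestion days, add one QC per day,
    and shuffle the analysis order within each day."""
    if sum(batch_sizes) != len(sample_ids):
        raise ValueError("batch sizes must account for every sample")
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)      # assumed: random allocation to digestion days
    plan, start = {}, 0
    for day, size in enumerate(batch_sizes, start=1):
        batch = shuffled[start:start + size] + [f"QC_day{day}"]
        rng.shuffle(batch)     # randomize analysis order within the day
        plan[f"Day {day}"] = batch
        start += size
    return plan

samples = [f"S{i:02d}" for i in range(1, 64)]  # 63 placeholder sample names
plan = plan_batches(samples, batch_sizes=(23, 23, 17))
```

In a longitudinal study one might instead keep all visits from a patient on the same digestion day; the sketch does not attempt that.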

[Figure 3 diagram labels: Longitudinal CSF collection from ALS patients; batches of × 23, × 23, and × 17 samples, each with (+) one digestion QC sample.]

Figure 5. Global inflammation changes were studied by taking the median peak of peptides from inflammatory proteins and calculating the percent change in those medians from each patient’s baseline collection visit. This was plotted against the rate of disease progression in each patient, defined as the ratio of the change in score over the change in time between collections.
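The two quantities in this caption can be written out as a short sketch. The function names and the numbers are hypothetical, and the log transform shown on the plot axis is left out because the poster does not state its base or how negative changes were handled.

```python
from statistics import median

def percent_change_from_baseline(baseline_peaks, followup_peaks):
    """Percent change in the median inflammatory-peptide peak relative to baseline."""
    base = median(baseline_peaks)
    return 100.0 * (median(followup_peaks) - base) / base

def progression_rate(score_start, score_end, months_between):
    """Change in ALSFRS score divided by change in time (points per month)."""
    return (score_end - score_start) / months_between

# Hypothetical values for one patient (not poster data).
change = percent_change_from_baseline([10.0, 20.0, 30.0], [12.0, 25.0, 40.0])
rate = progression_rate(score_start=40, score_end=34, months_between=6)
```

With these inputs the median rises from 20 to 25, and the score falls by 6 points over 6 months.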

• We have shown the efficacy of proteomics in longitudinal sampling to quantitatively model disease progression in ALS patients.

• A targeted study of the peptides used in the model, within an independent cohort of ALS patients, should be conducted to further validate this study.

Model-building workflow:
1. Chromatogram library and DIA data search
2. Peptide filtering: stringent signal filtering criteria for eligible peptides, then filtering of collinear peptides
3. LASSO for variable selection
4. Multivariate mixed-effects longitudinal model
5. Forward selection to minimize AIC

Final model: $\text{ALSFRS Score} = \beta X_i + Z_i b_i + \varepsilon_i$
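The final model has the general form of a linear mixed-effects model. The poster gives only the equation, so the reading below is an assumption based on the usual conventions: $i$ indexes patients, $X_i$ holds the fixed-effect predictors (the selected peptides), $\beta$ their coefficients, $Z_i$ the random-effect design, $b_i$ the patient-specific random effects, and $\varepsilon_i$ the residual error. The distributional assumptions and the AIC definition are the standard ones, with $k$ the number of estimated parameters and $\hat{L}$ the maximized likelihood.

```latex
\begin{aligned}
y_i &= \beta X_i + Z_i b_i + \varepsilon_i, \\
b_i &\sim \mathcal{N}(0, G), \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2 I), \\
\mathrm{AIC} &= 2k - 2 \ln \hat{L}
\end{aligned}
```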

[Figure 1 scatterplot residue: ALSFRS Score vs. Years from Symptom Onset; annotations: Subjective, Non-linear.]

Figure 6. A: Summary of the model-building workflow. B: Forward selection to build the model and minimize AIC. C: Diagnostic plots for model fit and the variance assumption.
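Forward selection by AIC can be illustrated with a small self-contained sketch. To stay dependency-free it substitutes ordinary least squares for the poster's mixed-effects model, and it uses the Gaussian AIC up to an additive constant, n ln(RSS/n) + 2k. The predictor names and data are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols_aic(columns, y):
    """AIC (up to an additive constant) of an OLS fit with an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve(xtx, xty)
    rss = sum((y[i] - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
              for i, row in enumerate(design))
    return n * math.log(rss / n) + 2 * k

def forward_select(candidates, y):
    """Greedily add the predictor that lowers AIC most; stop when none does."""
    selected, best = [], ols_aic([], y)
    while True:
        trials = {name: ols_aic([candidates[s] for s in selected + [name]], y)
                  for name in candidates if name not in selected}
        if not trials:
            break
        name = min(trials, key=trials.get)
        if trials[name] >= best:
            break
        selected.append(name)
        best = trials[name]
    return selected, best

# Hypothetical data: the response tracks peptide_1; peptide_2 is uninformative.
peptide_1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
peptide_2 = [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.1]
score = [2.0 * x + e for x, e in zip(peptide_1, noise)]
selected, best_aic = forward_select({"peptide_1": peptide_1, "peptide_2": peptide_2}, score)
```

The loop stops once adding any remaining predictor fails to lower the AIC, which corresponds to the minimum marked on an AIC versus number-of-variables curve.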

Figure 7. A: The 12 peptides with the largest coefficients in the model. B: Interaction network of proteins represented by peptides in the model.

[Boxplot residue: Peptide CV (%) on an axis from 0 to 100; group label: QC.]

We gratefully acknowledge the NEALS Biorepository for providing all the biofluids from ALS patients used in this study. We also acknowledge the ALS Association (grant #19-SI-458) for the funding of this project. We acknowledge the MacCoss laboratory, University of Washington, for guidance and provision of biofluids for quality control. All mass spectrometry measurements were made in the Molecular Education, Technology, and Research Innovation Center (METRIC) at NCSU.

(1) Searle, B., et al., Nat. Comm., 2018, 9, 5128.

[Plot residue: principal component analysis with axes PC1 (8.47%) and PC2 (4.91%), labels QC Samples and Patient 1; boxplot axis Log(Centered Abundance), label Patient 1; method labels DDA and DIA.]