METASPACE Training Course, OurCon'16


Page 1: METASPACE Training Course, OurCon'16

METASPACE training course
OurCon’16, 17.10.2016

Theodore Alexandrov (EMBL/UCSD/SCiLS)
Andy Palmer (EMBL)
Vitaly Kovalev (EMBL)
Artem Tarasov (EMBL)

@METASPACE2020

Page 2: METASPACE Training Course, OurCon'16

Welcome everyone!

Artem Tarasov: “hacker”
Andy Palmer: “scientist”
Vitaly Kovalev: “developer”
Theodore Alexandrov: “leader”

Page 3: METASPACE Training Course, OurCon'16

Agenda

Part 1: Theory
14:00-14:10 Welcome
14:10-14:15 Introduction to the METASPACE project
14:15-14:45 Metabolite annotation in HR imaging MS
14:45-15:00 Overview of the annotation engine

Coffee Break 15:00-15:30

Part 2: Tutorial
15:30-16:30 Step-by-step analysis of datasets provided in advance, questions
- Data requirements: 10 min
- Upload UI: 5 min
- Webapp UI: 15 min
- Interpretation: 15 min
- Split into 2 groups: 5 min
- Export to imzML, ideally parallel sessions: 15 min
  - SCiLS, FTICR
  - EMBL, Orbitrap

Coffee Break 16:35-17:00

Part 3: Hands-on training
17:00-18:00 Questions, data analysis

Page 4: METASPACE Training Course, OurCon'16

Introduction

Theodore Alexandrov (EMBL, UCSD, SCiLS)

Page 5: METASPACE Training Course, OurCon'16

What we hope you will learn today
● Ins and outs of metabolite annotation in HR imaging MS
● Bioinformatics we developed for this problem
  ○ Metabolite Signal Match (MSM) score
  ○ False Discovery Rate estimation
  ○ FDR-controlled annotation
● The online engine we implemented
  ○ How to prepare data for submission to our service
  ○ How to submit your data
  ○ How to view molecular annotations in our webapp

Page 6: METASPACE Training Course, OurCon'16

Project overview: slides on slideshare
Bioinformatics: slides on slideshare

Theodore Alexandrov (EMBL, UCSD, SCiLS)

Page 7: METASPACE Training Course, OurCon'16

Overview of the annotation engine

Vitaly Kovalev (EMBL)

Page 8: METASPACE Training Course, OurCon'16

Outline

● Inputs (data and metadata)
● Online Software
● Data Submission
● Annotation Browsing
● Use Cases
  a. mouse brain, MALDI-FTICR (UoR1)
  b. human colorectal tumor, DESI-Orbitrap (ICL)

Page 9: METASPACE Training Course, OurCon'16

Input

● Centroided imzML (http://ms-imaging.org/)

Page 10: METASPACE Training Course, OurCon'16

Input

● Centroided imzML (http://ms-imaging.org/)

● Dataset metadata

Page 11: METASPACE Training Course, OurCon'16

Online Software

annotations database

browse annotations

task scheduler

upload

Page 12: METASPACE Training Course, OurCon'16

Upload Web UI

● http://upload.metasp.eu
● Easy upload

Page 13: METASPACE Training Course, OurCon'16

Upload Web App

● http://upload.metasp.eu
● Easy upload
● Metadata collection

Page 14: METASPACE Training Course, OurCon'16

SM Web UI
● http://alpha.metasp.eu
● Annotation browsing
● Technical details
● Feedback form

Page 15: METASPACE Training Course, OurCon'16

Use Case 1
Mouse brain (MALDI-FTICR)

Data provided by Regis Lavigne, Charles Pineau, University of Rennes 1

Page 16: METASPACE Training Course, OurCon'16

Use Case 2
Human colorectal tumor (DESI-Orbitrap)

Data provided by James McKenzie, Zoltan Takats, Imperial College London

Page 17: METASPACE Training Course, OurCon'16

Tutorial

Andy Palmer (EMBL)
Artem Tarasov (EMBL)

Page 18: METASPACE Training Course, OurCon'16

Agenda

Part 1: Theory
14:00-14:10 Welcome, Outline, Learning objectives
14:10-14:15 Introduction to the METASPACE project
14:15-14:45 Metabolite annotation in HR imaging MS
14:45-15:00 Overview of the annotation engine

Coffee Break 15:00-15:30

Part 2: Tutorial
15:30-16:30 Step-by-step analysis of datasets provided in advance, questions
- Data requirements: 10 min
- Upload UI: 5 min
- Webapp UI: 15 min
- Interpretation: 15 min
- Split into 2 groups: 5 min
- Export to imzML, ideally parallel sessions: 15 min
  - SCiLS, FTICR
  - EMBL, Orbitrap

Coffee Break 16:35-17:00

Part 3: Hands-on training
17:00-18:00 Questions, data analysis

Page 19: METASPACE Training Course, OurCon'16

Learning Outcomes

1. Preparing data for submission
2. Submitting data
3. Browsing results

Page 20: METASPACE Training Course, OurCon'16

Data Requirements
Imaging mass spectrometry data
- Any ionisation source
- Any spatial resolution
- Any tissue
- One section per dataset

Page 21: METASPACE Training Course, OurCon'16

Data Requirements
High resolving power
- R_FWHM (@ m/z 400) > 90K
Well-calibrated
- ideally < 3 ppm
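
The two thresholds on this slide can be checked with a few lines of arithmetic. The sketch below assumes the usual definitions (resolving power as m/z divided by the peak width at half maximum, and mass error in parts per million relative to the theoretical m/z); the function names and the example numbers are hypothetical and not part of the METASPACE software.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def resolving_power(mz: float, fwhm: float) -> float:
    """Resolving power R = m/z divided by the peak width at half maximum."""
    return mz / fwhm


def meets_requirements(r_at_400: float, abs_ppm: float,
                       min_r: float = 90_000, max_ppm: float = 3.0) -> bool:
    """Check the two thresholds quoted on the slide (R > 90K, error < 3 ppm)."""
    return r_at_400 > min_r and abs_ppm < max_ppm


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    err = ppm_error(400.0010, 400.0000)
    r = resolving_power(400.0, 0.004)
    print(round(err, 2), round(r), meets_requirements(r, abs(err)))
```

Checking a handful of known peaks across the dataset in this way is one simple means of confirming calibration before upload.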

Page 22: METASPACE Training Course, OurCon'16

Data Requirements
Data Format
- imzML (http://imzml.org/wp/introduction/)
- Centroided
  - vendor preferred
  - http://metasp.eu/imzml

Page 23: METASPACE Training Course, OurCon'16

Customised Processing
Processing is tailored to your data!
- Technical metadata
  - Resolving power
    - isotope prediction
  - Polarity
    - adducts

[Figure: isotope pattern of [C41H78NO7P+K]+ at R200=70K and R200=280K]
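
To make the link between metadata and processing concrete, the sketch below computes the monoisotopic m/z of the ion shown on this slide and the expected peak width at two resolving powers. It is an illustration under stated assumptions, not the engine's code: the helper names are hypothetical, the mass table covers only the elements needed here, and the scaling of resolving power with m/z (inverse square root for Orbitrap, inverse for FT-ICR) is the textbook behaviour assumed for these analysers.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0, "H": 1.00782503207, "N": 14.0030740048,
    "O": 15.99491461956, "P": 30.97376163,
    "Na": 22.9897692809, "K": 38.96370668,
}
ELECTRON = 0.00054857990946


def parse_formula(formula: str) -> dict:
    """Turn a sum formula such as 'C41H78NO7P' into {'C': 41, 'H': 78, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def ion_mz(formula: str, adduct: str) -> float:
    """m/z of the singly charged positive ion [M + adduct]+."""
    counts = parse_formula(formula)
    counts[adduct] = counts.get(adduct, 0) + 1
    mass = sum(MONOISOTOPIC[el] * n for el, n in counts.items())
    return mass - ELECTRON  # one electron removed for the +1 charge


def fwhm_at(mz: float, r_at_200: float, analyzer: str) -> float:
    """Peak width at `mz`, given the resolving power at m/z 200.

    Assumes R ~ (m/z)^-1/2 for Orbitrap and R ~ (m/z)^-1 for FT-ICR.
    """
    exponent = {"orbitrap": 0.5, "fticr": 1.0}[analyzer]
    r = r_at_200 * (200.0 / mz) ** exponent
    return mz / r


if __name__ == "__main__":
    mz = ion_mz("C41H78NO7P", "K")
    print(round(mz, 4))
    for r200 in (70_000, 280_000):
        print(r200, round(fwhm_at(mz, r200, "orbitrap"), 5))
```

A wrongly reported resolving power changes the predicted peak width, and with it which isotope peaks are expected to be resolved, which is why accurate metadata matters.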

Page 24: METASPACE Training Course, OurCon'16

Data Requirements
Your responsibility:
- Data is processed ‘as is’
- Check metadata is correct
- Report resolving power accurately (check within data-set)
- Low numbers of annotations often correspond to poor quality mass spectra
  - Calibration inaccuracy
  - Lock-mass errors

Page 25: METASPACE Training Course, OurCon'16

Data Submission

Page 26: METASPACE Training Course, OurCon'16

Data upload

1. Follow conversion instructions for your instrument
2. Select the centroided files, imzML and ibd
3. Click the Upload button.

The dataset will be copied to the cloud storage (accessible only to our team)
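
An imzML dataset consists of two files, the .imzML metadata file and the .ibd binary file. The sketch below is a hypothetical pre-upload check that both are present; it assumes the two files sit in the same folder and share a base name, and the function name is made up for illustration.

```python
import tempfile
from pathlib import Path


def upload_pair(imzml_path: str) -> tuple:
    """Return (imzML, ibd) paths, raising if either file is missing."""
    imzml = Path(imzml_path)
    if imzml.suffix.lower() != ".imzml":
        raise ValueError(f"expected an .imzML file, got {imzml.name}")
    ibd = imzml.with_suffix(".ibd")  # assumed: same folder, same base name
    missing = [p.name for p in (imzml, ibd) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing: " + ", ".join(missing))
    return imzml, ibd


if __name__ == "__main__":
    # Demonstration with placeholder files in a temporary folder.
    with tempfile.TemporaryDirectory() as folder:
        (Path(folder) / "brain.imzML").write_text("<mzML/>")
        (Path(folder) / "brain.ibd").write_bytes(b"\x00")
        print([p.name for p in upload_pair(str(Path(folder) / "brain.imzML"))])
```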

Page 27: METASPACE Training Course, OurCon'16

Metadata form
● Appears once the upload is started
● Please fill it in truthfully
  ○ Most fields have an ‘Other…’ option
  ○ Don’t want to disclose → put ‘-’
● Click (at the very bottom)

Page 28: METASPACE Training Course, OurCon'16

Browsing Results

Page 29: METASPACE Training Course, OurCon'16

Main web interface

http://alpha.metasp.eu

Results are public

Datasets are not

Page 30: METASPACE Training Course, OurCon'16

Annotation table

Sign in with a Google ID to provide feedback

Currently selected molecule (click to select)

MSM score

principal peak m/z

Page 31: METASPACE Training Course, OurCon'16

Sorting/filtering annotations
- Click on column headers to sort
- Start typing a formula or a metabolite name
- Filter by database or dataset
- Select an adduct
- Set minimum MSM score
- Enter m/z of interest
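
The same filters can be applied to an exported annotation table. The sketch below assumes each annotation is a dictionary with the hypothetical keys `formula`, `name`, `adduct`, `msm` and `mz`; these names and the example rows are made up for illustration and are not the webapp's data model.

```python
def filter_annotations(rows, text="", adduct=None, min_msm=0.0,
                       mz=None, mz_tol=0.01):
    """Keep rows matching every active filter, sorted by MSM score (best first)."""
    text = text.lower()
    selected = []
    for row in rows:
        # Free-text filter on formula or metabolite name.
        if text and text not in (row["formula"] + " " + row["name"]).lower():
            continue
        if adduct is not None and row["adduct"] != adduct:
            continue
        if row["msm"] < min_msm:
            continue
        # m/z of interest, within an absolute tolerance.
        if mz is not None and abs(row["mz"] - mz) > mz_tol:
            continue
        selected.append(row)
    return sorted(selected, key=lambda r: r["msm"], reverse=True)


# Hypothetical rows; the scores are made up for illustration.
rows = [
    {"formula": "C42H82NO8P", "name": "PC(34:1)", "adduct": "+K", "msm": 0.91, "mz": 798.541},
    {"formula": "C42H82NO8P", "name": "PC(34:1)", "adduct": "+Na", "msm": 0.64, "mz": 782.567},
    {"formula": "C5H9NO4", "name": "Glutamic acid", "adduct": "+H", "msm": 0.12, "mz": 148.060},
]

if __name__ == "__main__":
    for row in filter_annotations(rows, text="PC", min_msm=0.5):
        print(row["name"], row["adduct"], row["msm"])
```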

Page 32: METASPACE Training Course, OurCon'16

FDR color-coding

Green = annotated @ chosen FDR level

Red = not annotated @ chosen FDR

Page 33: METASPACE Training Course, OurCon'16

Details for highlighted annotation

molecule distribution (sum of isotope images)

Putative metabolite IDs from the database

Feedback!
Thumbs up: reasonable
Thumbs down: dubious
- tell us why it could be wrong!

Feedback is not public

Page 34: METASPACE Training Course, OurCon'16

Visual insight into MSM score assignment
- Adduct
- Exact m/z of each ion image
- Zoom plot
- Ion images for each isotope peak
- Isotopic patterns
  Blue: theoretical abundance (at instrument resolving power)
  Red: measured image intensity
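
One simple way to put a number on how well the blue (theoretical) and red (measured) bars agree is a cosine similarity between the two intensity vectors. The sketch below is only an illustration of that comparison; it is not the MSM score formula, and the function names and example values are hypothetical.

```python
import math


def image_totals(isotope_images):
    """Sum the pixel intensities of each isotope ion image."""
    return [sum(sum(row) for row in image) for image in isotope_images]


def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors (1.0 = same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical example: three isotope peaks, each imaged on a 2x2 pixel grid.
theoretical = [1.00, 0.45, 0.12]
images = [
    [[10, 20], [30, 40]],
    [[5, 9], [13, 18]],
    [[1, 2], [4, 5]],
]
measured = image_totals(images)

if __name__ == "__main__":
    print(measured, round(cosine_similarity(theoretical, measured), 3))
```

A value close to 1 means the measured pattern has the same shape as the predicted one; the comparison ignores absolute intensity.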

Page 35: METASPACE Training Course, OurCon'16

Step-by-step search
Choose molecular formula database

Page 36: METASPACE Training Course, OurCon'16

Step-by-step search
Choose dataset

Page 37: METASPACE Training Course, OurCon'16

Step-by-step search
Type molecular class

Page 38: METASPACE Training Course, OurCon'16

Step-by-step search
Type single metabolite name
Potassium adduct

Page 39: METASPACE Training Course, OurCon'16

Step-by-step search
Type single metabolite name
Sodium adduct

Page 40: METASPACE Training Course, OurCon'16

Step-by-step search
Type single metabolite name
Hydrogen adduct

Page 41: METASPACE Training Course, OurCon'16

Results Browsing Summary

1. Choose database
2. Choose data-set
3. Type ‘PC’
   a. molecular class filter
4. Type ‘PC(16:0/18:0)’
   a. single metabolite filter
5. Select row of table
   a. single ion filter
6. Simple comparison of spatial distributions between adducts

Also possible
● Filter by m/z
● Formula search
● Comparison across datasets

Page 42: METASPACE Training Course, OurCon'16

Interpretation

Page 43: METASPACE Training Course, OurCon'16

FDR Controlled Annotation
False Discovery Rate: the fraction of incorrect annotations

Control: request a set of annotations at a fixed estimated FDR

Setting the level:
- Adjust the number of molecules for follow-up analysis
  - When only a limited number of molecules can be reviewed, adjust the FDR level so that fewer or more molecules are annotated
- Compare annotations between datasets
  - A principled way of selecting molecules to compare between datasets

[Figure: true annotations and false discoveries ranked by MSM score, with cut-offs marked at FDR = 0.1 and FDR = 0.2]

FDR = nFalse / (nFalse + nTrue)
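
As a worked illustration of FDR-controlled selection, the sketch below ranks annotations by MSM score and returns the largest top-ranked set whose FDR stays at or below the requested level. It assumes each annotation already carries a true/false label, as in the slide's illustration; in real data the labels are unknown and the FDR has to be estimated. The function names and the toy scores are hypothetical.

```python
def fdr(n_true: int, n_false: int) -> float:
    """Fraction of incorrect annotations among all reported annotations."""
    total = n_true + n_false
    return n_false / total if total else 0.0


def annotations_at_fdr(scored, level):
    """Largest top-ranked set of (msm, is_true) pairs with FDR <= level."""
    ranked = sorted(scored, key=lambda s: s[0], reverse=True)
    best = 0
    n_false = 0
    for k, (_, is_true) in enumerate(ranked, start=1):
        if not is_true:
            n_false += 1
        if fdr(k - n_false, n_false) <= level:
            best = k  # the top-k set still satisfies the requested level
    return ranked[:best]


# Toy data: (MSM score, is the annotation correct?)
scored = [
    (0.90, True), (0.80, True), (0.70, True), (0.60, True), (0.50, False),
    (0.40, True), (0.30, True), (0.20, True), (0.10, True), (0.05, False),
]

if __name__ == "__main__":
    for level in (0.1, 0.15):
        print(level, len(annotations_at_fdr(scored, level)))
```

Raising the FDR level admits lower-scoring annotations, so the level acts as the knob for how many molecules go forward to follow-up analysis.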

Page 44: METASPACE Training Course, OurCon'16

Choice of metabolite database

synthesized/recorded: 88M (CAS registry)
biologically occurring/active: 50M (PubChem compounds)
single biological system: 40K (HMDB)
sample specific: 1K (LC-MS)

Page 45: METASPACE Training Course, OurCon'16

Choice of metabolite database
Impacts search and False-Discovery-Rate estimation
● Use one that’s relevant
● Larger database
  ○ more false hits --> fewer annotations at a fixed FDR
● Different databases give different annotations
  ○ even for molecules in both databases, due to FDR control
  ○ for data-set comparison, use the same database

Page 46: METASPACE Training Course, OurCon'16

Annotating at level of molecular formula

● Possibility of multiple metabolites per sum formula
  ○ webapp shows all hits from the database search (learn the ambiguity!)
  ○ other databases can be searched (e.g. PubChem)
  ○ use enrichment analysis to get biological leads
● Use an orthogonal technique for reporting individual metabolites
  ○ not directly integrated (yet)
  ○ use web-app results to help target MS/MS studies (e.g. purchase of standards)

Page 47: METASPACE Training Course, OurCon'16

Guidelines for reporting
● We annotate a molecular formula along with several putative metabolites
  ■ MSI (Metabolomics Standards Initiative) levels of classification:
    1. identified metabolites
    2. putatively annotated compounds
    3. putatively characterised compound classes
    4. unknown compounds
● In preparation: formal guidelines for reporting imaging mass spectrometry annotations

The role of reporting standards for metabolite annotation and identification in metabolomic studies, Salek et al., 2013, GigaScience

Page 48: METASPACE Training Course, OurCon'16

Learning Summary
● Preparing data for submission
  ○ imzML export
  ○ metadata
● Submitting data
  ○ upload web-app: upload.metasp.eu
● Browsing results
  ○ results web-app: alpha.metasp.eu

How to get help?
● METASPACE team:
  ○ web: metaspace2020.eu
  ○ email: [email protected]
  ○ twitter: @metaspace2020
  ○ github: github.com/spatialmetabolomics
● FTICR data conversion
  ○ SCiLS: [email protected]
● Orbitrap data conversion
  ○ Thermo Fisher Scientific: [email protected]

Page 49: METASPACE Training Course, OurCon'16

Export to imzML

Page 50: METASPACE Training Course, OurCon'16

(Group 1) Export into imzML: FT-ICR data
Using SCiLS Lab’s METASPACE export

Page 51: METASPACE Training Course, OurCon'16

Export to METASPACE
● Export your centroided high-resolution spectra in the imzML format
● Only available for “FT-ICR type” SCiLS Lab files in SCiLS Lab 2016b
● Best results in METASPACE if a peak list is used for centroiding
● Two different Bruker data formats
  ○ SQLite peak list data: peak list provided during import
  ○ FT-ICR profile data: generate a peak list after import

Page 52: METASPACE Training Course, OurCon'16

Create imzML file for METASPACE
● In the objects tab, click the export symbol of the region to be exported and select “Export to METASPACE”
● The Export Spectra dialog opens
● Set your normalization of choice
● Select your peak list of choice, for example “Imported Peaks” in case of SQLite
● Provide your scan polarity
● Click OK to save imzML file

Page 53: METASPACE Training Course, OurCon'16

SQLite peak list data
● Data must have been acquired with on-the-fly centroid detection, i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the data-set
● In SCiLS Lab a peak list “Imported peaks” is available, selecting the most frequent peaks
  By default, all peaks appearing in more than 1% of spectra
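
The idea of keeping only peaks that occur in more than 1% of spectra can be sketched as follows. This is an illustration of the frequency criterion only, not SCiLS Lab's implementation: the binning of m/z values to a fixed width, the default bin width and the function name are all assumptions.

```python
from collections import Counter


def frequent_peaks(spectra, min_fraction=0.01, bin_width=0.001):
    """Return m/z bins that contain a peak in more than `min_fraction` of spectra.

    `spectra` is a list of centroided spectra, each a list of m/z values.
    """
    counts = Counter()
    for mzs in spectra:
        # Count each bin at most once per spectrum.
        counts.update({round(mz / bin_width) for mz in mzs})
    threshold = min_fraction * len(spectra)
    return sorted(b * bin_width for b, n in counts.items() if n > threshold)


if __name__ == "__main__":
    # Toy data: 200 spectra; m/z 200.0 occurs in 3 of them, m/z 300.0 in 2.
    spectra = [[100.0, 200.0]] * 3 + [[100.0, 300.0]] * 2 + [[100.0]] * 195
    print(frequent_peaks(spectra))
```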

Page 54: METASPACE Training Course, OurCon'16

FT-ICR profile data
● Older Solarix files do not directly contain a peak list to perform centroiding
● Create peak list with Data Analysis
  SCiLS Lab Help Section 7.4
● Use METASPACE tool for peak finding
  https://spatialmetabolomics.github.io/centroidize/
● Use other external tools (mMass, …)
● Import the external peak list into SCiLS Lab
  File > Import > m/z intervals from CSV or Clipboard

Page 55: METASPACE Training Course, OurCon'16

Use METASPACE tool for peak finding
● Select the overview spectrum CSV exported from SCiLS
● Upload CSV file to METASPACE tool
● Copy values to clipboard
● Use File > Import > m/z intervals from CSV
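
For readers who want to see what peak finding on an overview spectrum involves, here is a minimal local-maximum sketch. It is not the algorithm used by the METASPACE tool, and the CSV layout (one `m/z,intensity` pair per row, with an optional header) is an assumption rather than the documented SCiLS export format.

```python
import csv
import io


def read_spectrum(csv_text):
    """Parse 'm/z,intensity' rows, skipping lines that are not numeric."""
    points = []
    for row in csv.reader(io.StringIO(csv_text)):
        try:
            points.append((float(row[0]), float(row[1])))
        except (ValueError, IndexError):
            continue  # header or blank line
    return points


def pick_peaks(points, min_intensity=0.0):
    """Return the m/z values of local maxima above `min_intensity`."""
    peaks = []
    for (_, left), (mz, mid), (_, right) in zip(points, points[1:], points[2:]):
        if mid > left and mid >= right and mid > min_intensity:
            peaks.append(mz)
    return peaks


if __name__ == "__main__":
    # Toy profile spectrum with two peaks.
    text = "m/z,intensity\n100.0,1\n100.1,5\n100.2,2\n100.3,1\n100.4,4\n100.5,0\n"
    print(pick_peaks(read_spectrum(text)))
```

Raising `min_intensity` suppresses small maxima caused by noise, at the cost of missing low-abundance peaks.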

Page 56: METASPACE Training Course, OurCon'16

Upload imzML files to METASPACE
● Go to http://upload.metasp.eu/

Page 57: METASPACE Training Course, OurCon'16

SCiLS Cloud: Exchange within the Scientific Community
1. SCiLS Lab: computational analysis
2. SCiLS Cloud: data & results can be shared and viewed in web browser, e.g.,
   ○ MALDI imaging data,
   ○ Discriminative m/z markers,
   ○ Regions of interest, …

[Figures: comparison of mean spectra for ROIs; m/z images of co-localized ions]

Page 58: METASPACE Training Course, OurCon'16

Future Vision: SCiLS Cloud and METASPACE

[Diagram with labels: SCiLS Lab (statistical analysis); METASPACE; SCiLS Cloud; upload data and findings; export data to imzML and upload; prospect: direct interface]

Page 59: METASPACE Training Course, OurCon'16

(Group 2) Export into imzML: Orbitrap data (.raw)
Instructions: metaspace2020.eu/imzML

Software tools:
- imageQuest / raw-converter
  Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-)
- imzmlConverter
  Recommended for: DESI/flowProbe with separate files per row

Recommended for bioinformaticians: pyimzML (Python parser)

Page 60: METASPACE Training Course, OurCon'16

imageQuest
.raw -> imzML
● Commercial
  ○ Thermo Scientific

Page 61: METASPACE Training Course, OurCon'16

Raw-imzml converter
.raw -> imzML
● Free
● http://ms-imaging.org/wp/raw-to-imzml-converter/

Page 62: METASPACE Training Course, OurCon'16

imzmlConverter
.raw -> mzML -> imzML
● MSConvert
  ○ free (link)
● imzMLConverter
  ○ free
  ○ requires registration
  ○ http://www.cs.bham.ac.uk/~ibs/imzMLConverter/

Page 63: METASPACE Training Course, OurCon'16

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement № 634402.

Acknowledgments

Example data was provided by:
University of Rennes 1: Regis Lavigne, Charles Pineau
EMBL: Ksenija Radic, Alexandra Koumoutsi, Andrew Palmer

EMBL: Theodore Alexandrov, Vitaly Kovalev, Artem Tarasov, Andrew Palmer, Dominik Fay
SCiLS: Dennis Trede, Jan Hendrik Kobarg