METASPACE Training Course, OurCon'16
METASPACE training course
OurCon’16, 17.10.2016
Theodore Alexandrov (EMBL/UCSD/SCiLS)
Andy Palmer (EMBL)
Vitaly Kovalev (EMBL)
Artem Tarasov (EMBL)
@METASPACE2020
Welcome everyone!
Artem Tarasov: “hacker”
Andy Palmer: “scientist”
Vitaly Kovalev: “developer”
Theodore Alexandrov: “leader”
Agenda
Part 1: Theory
14:00-14:10 Welcome, outline, learning objectives
14:10-14:15 Introduction to the METASPACE project
14:15-14:45 Metabolite annotation in HR imaging MS
14:45-15:00 Overview of the annotation engine
Coffee Break 15:00-15:30
Part 2: Tutorial
15:30-16:30 Step-by-step analysis of datasets provided in advance, questions
- Data requirements: 10 min
- Upload UI: 5 min
- Webapp UI: 15 min
- Interpretation: 15 min
- Split into 2 groups: 5 min
- Export to imzML, ideally parallel sessions: 15 min
  - SCiLS, FTICR
  - EMBL, Orbitrap
Coffee Break 16:35-17:00
Part 3: Hands-on training
17:00-18:00 questions, data analysis
Introduction
Theodore Alexandrov (EMBL, UCSD, SCiLS)
What we hope you will learn today
● Ins and outs of metabolite annotation in HR imaging MS
● Bioinformatics we developed for this problem
  ○ Metabolite Signal Match (MSM) score
  ○ False Discovery Rate estimation
  ○ FDR-controlled annotation
● The online engine we implemented
  ○ How to prepare data for submission to our service
  ○ How to submit your data
  ○ How to view molecular annotations in our webapp
Project overview: slides on slideshare
Bioinformatics: slides on slideshare
Theodore Alexandrov (EMBL, UCSD, SCiLS)
Overview of the annotation engine
Vitaly Kovalev (EMBL)
Outline
● Inputs (data and metadata)
● Online Software
● Data Submission
● Annotation Browsing
● Use Cases
  a. mouse brain, MALDI-FTICR (UoR1)
  b. human colorectal tumor, DESI-Orbitrap (ICL)
Online Software
Diagram components: upload, task scheduler, annotations database, browse annotations
Upload Web UI
● http://upload.metasp.eu
● Easy upload
● Metadata collection
SM Web UI
● http://alpha.metasp.eu
● Annotation browsing
● Technical details
● Feedback form
Use Case 1
Mouse brain (MALDI-FTICR)
Data provided by Regis Lavigne, Charles Pineau, University of Rennes 1
Use Case 2
Human colorectal tumor (DESI-Orbitrap)
Data provided by James McKenzie, Zoltan Takats, Imperial College London
Tutorial
Andy Palmer (EMBL)
Artem Tarasov (EMBL)
Learning Outcomes
1. Preparing data for submission
2. Submitting data
3. Browsing results
Data Requirements
Imaging mass spectrometry data
- Any ionisation source
- Any spatial resolution
- Any tissue
- One section per dataset
Data Requirements
High resolving power
- R_FWHM (@ m/z 400) > 90K
Well-calibrated
- ideally < 3 ppm
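The calibration target can be checked with a short calculation. A minimal sketch, assuming you know the theoretical m/z of a peak you recognise in your data; the function name and the example values are illustrative and not part of METASPACE:

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Illustrative values only: a peak expected at m/z 400.0000, observed at 400.0010.
error = ppm_error(400.0010, 400.0000)
status = "within" if abs(error) < 3 else "outside"
print(f"{error:.2f} ppm, {status} the 3 ppm target")
```

Checking several known peaks across the m/z range gives a better picture than a single peak, since calibration error can vary with m/z.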
Data Requirements
Data Format
- imzML
- Centroided
  - vendor preferred
  - http://metasp.eu/imzml
http://imzml.org/wp/introduction/
Customised Processing
Processing is tailored to your data!
- Technical metadata
  - Resolving power
    - isotope prediction
  - Polarity
    - adducts
Figure labels: R200=70K, R200=280K; [C41H78NO7P+K]+
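Why resolving power and adduct both matter for isotope prediction can be illustrated with a small calculation. This is a sketch under stated assumptions, not the METASPACE implementation: the function names are hypothetical, and the peak-width model assumes Orbitrap-like scaling of resolving power with 1/sqrt(m/z) (FT-ICR-like scaling would be 1/(m/z)).

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass (u).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
    "K": 38.96370668,
}
ELECTRON = 0.00054857991


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C41H78NO7P'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def cation_mz(formula, adduct):
    """m/z of a singly charged [M+adduct]+ ion (one electron removed)."""
    return monoisotopic_mass(formula) + MONOISOTOPIC[adduct] - ELECTRON


def fwhm_at(mz, r_at_200, exponent=0.5):
    """Peak width (FWHM) at mz, given the resolving power at m/z 200.

    exponent=0.5 assumes Orbitrap-like scaling, R ~ 1/sqrt(m/z);
    exponent=1.0 would assume FT-ICR-like scaling, R ~ 1/(m/z).
    """
    resolving_power = r_at_200 * (200.0 / mz) ** exponent
    return mz / resolving_power


mz = cation_mz("C41H78NO7P", "K")
for r200 in (70_000, 280_000):
    print(f"R200={r200}: m/z {mz:.4f}, FWHM ~ {fwhm_at(mz, r200):.4f}")
```

A narrower peak at higher resolving power means that closely spaced isotope peaks are resolved rather than merged, which changes the theoretical pattern the data is compared against.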
Data Requirements
Your responsibility:
- Data is processed ‘as is’
- Check metadata is correct
- Report resolving power accurately (check within data-set)
- Low numbers of annotations often correspond to poor quality mass spectra
  - Calibration inaccuracy
  - Lock-mass errors
Data Submission
1. Follow the conversion instructions for your instrument.
2. Select the centroided files (imzML and ibd).
3. Click the Upload button.
The dataset will be copied to the cloud storage (accessible only to our team).
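Before uploading, the file pair can be sanity-checked locally. The sketch below is an assumption-laden helper, not part of the upload service: `check_upload_pair` is a hypothetical name, and the check relies on the PSI-MS controlled-vocabulary terms MS:1000127 (centroid spectrum) and MS:1000128 (profile spectrum) being declared in the imzML header.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

CENTROID = "MS:1000127"  # PSI-MS term "centroid spectrum"
PROFILE = "MS:1000128"   # PSI-MS term "profile spectrum"


def check_upload_pair(imzml_path):
    """Return a list of problems found; an empty list means the basic checks passed."""
    imzml_path = Path(imzml_path)
    problems = []
    if not imzml_path.with_suffix(".ibd").exists():
        problems.append("missing .ibd file next to the .imzML file")

    accessions = set()
    # Stream the XML so that large files are not held in memory at once.
    for _, elem in ET.iterparse(str(imzml_path), events=("end",)):
        if elem.tag.endswith("cvParam"):
            accessions.add(elem.get("accession"))
        elem.clear()

    if PROFILE in accessions:
        problems.append("file declares profile spectra")
    if CENTROID not in accessions:
        problems.append("file does not declare centroid spectra")
    return problems


# Usage, with a hypothetical file name:
# print(check_upload_pair("my_dataset.imzML") or "looks OK")
```

This only inspects what the file declares about itself; it does not verify that the spectra were actually centroided.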
Data upload
Metadata form
● Appears once the upload is started
● Please fill it in truthfully
  ○ Most fields have an ‘Other…’ option
  ○ Don’t want to disclose → put ‘-’
● Click the button (at the very bottom)
Browsing Results
Main web interface
http://alpha.metasp.eu
Results are public
Datasets are not
Annotation table
Sign in with a Google ID to provide feedback
Currently selected molecule (click to select)
MSM score
principal peak m/z
Sorting/filtering annotations
- Click on column headers to sort
- Start typing a formula or a metabolite name
- Filter by database or dataset
- Select an adduct
- Set minimum MSM score
- Enter m/z of interest
FDR color-coding
Green = annotated @ chosen FDR level
Red = not annotated @ chosen FDR
Details for highlighted annotation
molecule distribution (sum of isotope images)
Putative metabolite IDs from the database
Feedback!
Thumbs up: reasonable
Thumbs down: dubious
- tell us why it could be wrong!
Feedback is not public
Visual insight into MSM score assignment
Adduct
Exact m/z of each ion image
Zoom plot
Ion images for each isotope peak
Isotopic patterns
Blue: theoretical abundance (at instrument resolving power)
Red: measured image intensity
Step-by-step search: Choose molecular formula database
Step-by-step search: Choose dataset
Step-by-step search: Type molecular class
Step-by-step search: Type single metabolite name (Potassium adduct)
Step-by-step search: Type single metabolite name (Sodium adduct)
Step-by-step search: Type single metabolite name (Hydrogen adduct)
Results Browsing Summary
1. Choose database
2. Choose data-set
3. Type ‘PC’
   a. molecular class filter
4. Type ‘PC(16:0/18:0)’
   a. single metabolite filter
5. Select row of table
   a. single ion filter
6. Simple comparison of spatial distributions between adducts
Also possible
● Filter by m/z
● Formula search
● Comparison across datasets
Interpretation
FDR Controlled Annotation
False Discovery Rate: the fraction of incorrect annotations
Control: request a set of annotations at a fixed estimated FDR
Setting the level:
- Adjust the number of molecules for follow-up analysis
  - When only a limited number of molecules can be reviewed, adjust the FDR so that fewer or more molecules are annotated
- Compare annotations between datasets
  - A principled way of selecting molecules to compare between datasets
Figure labels: True annotation, False discovery; axis: MSM score; cut-offs: FDR = 0.1, FDR = 0.2
FDR = nFalse / (nFalse + nTrue)
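The relation between an MSM score cut-off and the FDR can be sketched in a few lines. In real data the true/false status of an annotation is unknown and the FDR has to be estimated; here the labels are hypothetical placeholders used only to illustrate the formula, and `annotations_at_fdr` is an illustrative name.

```python
def annotations_at_fdr(scored, fdr_level):
    """Return the largest top-ranked set of annotations with FDR <= fdr_level.

    scored: list of (msm_score, is_false) pairs, where is_false marks a
    false discovery (hypothetical labels for illustration).
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_false = 0
    best_cutoff = 0
    for rank, (_, is_false) in enumerate(ranked, start=1):
        n_false += is_false
        n_true = rank - n_false
        fdr = n_false / (n_false + n_true)
        if fdr <= fdr_level:
            best_cutoff = rank  # keep the deepest rank that still meets the level
    return ranked[:best_cutoff]


# Placeholder scores and labels, not measured results.
example = [(0.9, False), (0.8, False), (0.7, True),
           (0.6, False), (0.5, False), (0.4, True)]
for level in (0.1, 0.25):
    kept = annotations_at_fdr(example, level)
    print(f"FDR <= {level}: {len(kept)} annotations kept")
```

Raising the FDR level lets more annotations through at the cost of a larger expected share of false ones, which is the trade-off described above.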
Choice of metabolite database
- synthesized/recorded: 88M, CAS registry
- biologically occurring/active: 50M, PubChem compounds
- single biological system: 40K, HMDB
- sample specific: 1K, LC-MS
Choice of metabolite database
Impacts search and False-Discovery-Rate estimation
● Use one that’s relevant
● Larger database
  ○ more false hits → fewer annotations at a fixed FDR
● Different databases give different annotations
  ○ even for molecules in both databases, due to FDR control
  ○ for data-set comparison, use the same database
Annotating at level of molecular formula
● Possibility of multiple metabolites per sum formula
  ○ webapp shows all hits from the database search (learn the ambiguity!)
  ○ other databases can be searched (e.g. PubChem)
  ○ use enrichment analysis to get biological leads
● Use an orthogonal technique for reporting individual metabolites
  ○ not directly integrated (yet)
  ○ use web-app results to help target MS/MS studies (e.g. purchase of standards)
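The ambiguity at the level of sum formula is easy to see by grouping metabolite names by formula. A small sketch using a handful of well-known isomers; the dictionary is a toy stand-in for a metabolite database, not an extract from one:

```python
from collections import defaultdict

# A few well-known isomers; each group shares one sum formula.
METABOLITES = {
    "Leucine": "C6H13NO2",
    "Isoleucine": "C6H13NO2",
    "Glucose": "C6H12O6",
    "Fructose": "C6H12O6",
    "Galactose": "C6H12O6",
    "Alanine": "C3H7NO2",
}


def group_by_formula(metabolites):
    """Map each sum formula to the list of metabolite names that share it."""
    groups = defaultdict(list)
    for name, formula in metabolites.items():
        groups[formula].append(name)
    return dict(groups)


for formula, names in group_by_formula(METABOLITES).items():
    print(formula, "->", ", ".join(names))
```

An annotation of C6H12O6 therefore cannot by itself distinguish glucose from fructose or galactose, which is why an orthogonal technique is needed to report an individual metabolite.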
Guidelines for reporting
● We annotate a molecular formula along with several putative metabolites
  ■ MSI (Metabolomics Standards Initiative) levels of classification:
    1. identified metabolites
    2. putatively annotated compounds
    3. putatively characterised compound classes
    4. unknown compounds
● In preparation: formal guidelines for reporting imaging mass spectrometry annotations
The role of reporting standards for metabolite annotation and identification in metabolomic studies, Salek et al., 2013, GigaScience
Learning Summary
● Preparing data for submission
  ○ imzML export
  ○ metadata
● Submitting data
  ○ upload web-app: upload.metasp.eu
● Browsing results
  ○ results web-app: alpha.metasp.eu
How to get help?
● METASPACE team:
  ○ web: metaspace2020.eu
  ○ email: [email protected]
  ○ twitter: @metaspace2020
  ○ github: github.com/spatialmetabolomics
● FTICR data conversion
  ○ SCiLS: [email protected]
● Orbitrap data conversion
  ○ Thermo Fisher Scientific:
Export to imzML
(Group 1) Export into imzML: FT-ICR data
Using SCiLS Lab’s METASPACE export
Export to METASPACE
● Export your centroided high-resolution spectra in the imzML format
● Only available for “FT-ICR type” SCiLS Lab files in SCiLS Lab 2016b
● Best results in METASPACE if peak list is required for centroiding
● Two different Bruker data formats
  ○ SQLite peak list data: Peak list provided during import
  ○ FT-ICR profile data: Generate a peak list after import
Create imzML file for METASPACE
● In the objects tab, click the export symbol of the region to be exported and select “Export to METASPACE”
● The Export Spectra dialog opens
● Set your normalization of choice
● Select your peak list of choice, for example “Imported Peaks” in case of SQLite
● Provide your scan polarity
● Click OK to save the imzML file
SQLite peak list data
● Data must have been acquired with on-the-fly centroid detection, i.e. there is a file called ‘peaks.sqlite’ within the .d folder of the data-set
● In SCiLS Lab a peak list “Imported peaks” is available, selecting the most frequent peaks
  ○ By default, all peaks appearing in more than 1% of spectra
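The idea of keeping only peaks that occur in more than 1% of spectra can be sketched as follows. This is not how SCiLS Lab implements it: grouping peaks by rounding m/z to a fixed number of decimals is a simplifying assumption, and `frequent_peaks` is a hypothetical name.

```python
from collections import Counter


def frequent_peaks(spectra_peaks, min_fraction=0.01, decimals=3):
    """Keep m/z bins that occur in more than min_fraction of the spectra.

    spectra_peaks: one list of peak m/z values per spectrum.
    decimals: rounding used to group peaks across spectra (an assumption).
    """
    counts = Counter()
    for peaks in spectra_peaks:
        # Use a set so that each bin is counted at most once per spectrum.
        counts.update({round(mz, decimals) for mz in peaks})
    threshold = min_fraction * len(spectra_peaks)
    return sorted(mz for mz, n in counts.items() if n > threshold)
```

With 200 spectra, for example, a peak has to appear in at least 3 of them to pass the default 1% threshold.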
FT-ICR profile data
● Older Solarix files do not directly contain a peak list to perform centroiding
● Create peak list with Data Analysis
  ○ SCiLS Lab Help Section 7.4
● Use METASPACE tool for peak finding
  ○ https://spatialmetabolomics.github.io/centroidize/
● Use other external tools (mMass, …)
● Import the external peak list into SCiLS Lab
  ○ File > Import > m/z intervals from CSV or Clipboard
Use METASPACE tool for peak finding
● Select the overview spectrum CSV exported from SCiLS
● Upload CSV file to METASPACE tool
● Copy values to clipboard
● Use File > Import > m/z intervals from CSV
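What peak finding on a profile spectrum does can be illustrated with a deliberately simple local-maximum picker. This is a sketch only and is not the algorithm used by the METASPACE tool or by vendor software; the function name, the three-point centroid and the intensity threshold are all assumptions.

```python
def centroid_peaks(mzs, intensities, min_intensity=0.0):
    """Pick local maxima and refine each with a three-point weighted centroid.

    mzs, intensities: equally long sequences describing a profile spectrum,
    sorted by m/z. Returns a list of (centroid_mz, apex_intensity) pairs.
    """
    peaks = []
    for i in range(1, len(mzs) - 1):
        left, mid, right = intensities[i - 1], intensities[i], intensities[i + 1]
        if mid > min_intensity and mid > left and mid >= right:
            total = left + mid + right
            centroid = (mzs[i - 1] * left + mzs[i] * mid + mzs[i + 1] * right) / total
            peaks.append((centroid, mid))
    return peaks


# A tiny synthetic profile peak, symmetric around m/z 100.02.
profile_mz = [100.00, 100.01, 100.02, 100.03, 100.04]
profile_int = [0.0, 5.0, 10.0, 5.0, 0.0]
print(centroid_peaks(profile_mz, profile_int))
```

Real peak pickers also deal with noise, baseline and overlapping peaks, which is why the vendor's centroiding is preferred where available.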
Upload imzML files to METASPACE
● Go to http://upload.metasp.eu/
SCiLS Cloud: Exchange within the Scientific Community
1. SCiLS Lab: computational analysis
2. SCiLS Cloud: data & results can be shared and viewed in web browser, e.g.,
  ○ MALDI imaging data,
  ○ Discriminative m/z markers,
  ○ Regions of interest, …
Figure panels: Comparison of mean spectra for ROIs; m/z images of co-localized ions
Future Vision: SCiLS Cloud and METASPACE
Diagram labels: SCiLS Lab (statistical analysis); METASPACE; SCiLS Cloud; upload data and findings; export data to imzML and upload; prospect: direct interface
(Group 2) Export into imzML: Orbitrap data (.raw)
Instructions: metaspace2020.eu/imzML
Software tools:
- imageQuest / raw-converter
  - Recommended for: MALDI images (Thermo MALDI- / TransMIT AP-S-MALDI-)
- imzmlConverter
  - Recommended for: DESI/flowProbe with separate files per row
- Recommended for bioinformaticians: pyimzML (Python parser)
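For those taking the pyimzML route, a converted file can be inspected before upload. A minimal sketch, assuming pyimzML is installed and using its `ImzMLParser` class with the `coordinates` attribute and `getspectrum` method; the helper names and the file name are hypothetical.

```python
def open_imzml(path):
    """Open an imzML/ibd pair with pyimzML (third-party: pip install pyimzML)."""
    from pyimzml.ImzMLParser import ImzMLParser
    return ImzMLParser(path)


def summarize(parser):
    """Count spectra and report the m/z range over all pixels."""
    lo, hi = float("inf"), float("-inf")
    n_spectra = 0
    for index, _coords in enumerate(parser.coordinates):
        mzs, _intensities = parser.getspectrum(index)
        if len(mzs):
            lo, hi = min(lo, min(mzs)), max(hi, max(mzs))
        n_spectra += 1
    return {"spectra": n_spectra, "mz_min": lo, "mz_max": hi}


# Usage, with a hypothetical file name:
# print(summarize(open_imzml("my_dataset.imzML")))
```

The import is kept inside `open_imzml` so that `summarize` can be tried with any object that offers the same two members.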
imageQuest
.raw -> imzML
● Commercial
  ○ Thermo Scientific
Raw-imzml converter
.raw -> imzML
● Free
● http://ms-imaging.org/wp/raw-to-imzml-converter/
imzmlConverter
.raw -> mzML -> imzML
● MSConvert
  ○ free (link)
● imzMLConverter
  ○ free
  ○ requires registration
  ○ http://www.cs.bham.ac.uk/~ibs/imzMLConverter/
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement № 634402.
Acknowledgments
Example data was provided by:
University of Rennes 1: Regis Lavigne, Charles Pineau
EMBL: Ksenija Radic, Alexandra Koumoutsi, Andrew Palmer
EMBL: Theodore Alexandrov, Vitaly Kovalev, Artem Tarasov, Andrew Palmer, Dominik Fay
SCiLS: Dennis Trede, Jan Hendrik Kobarg