Knowledge Enabled Information and Services Science Glycomics project overview.

Page 1

Knowledge Enabled Information and Services Science

Glycomics project overview

Page 2

Life Science Ontologies

• ProPreO
  • An ontology for capturing process and lifecycle information related to proteomic experiments
  • 398 classes, 32 relationships
  • 3.1 million instances
  • Published through the National Center for Biomedical Ontology (NCBO) and Open Biomedical Ontologies (OBO)

• Glyco
  • An ontology for structure and function of glycopeptides
  • 573 classes, 113 relationships
  • Published through the National Center for Biomedical Ontology (NCBO)

Page 3

ProPreO ontology

Two aspects of glycoproteomics:

o What is it? → identification
o How much of it is there? → quantification

Heterogeneity in data generation process, instrumental parameters, formats

Need data and process provenance → ontology-mediated provenance

Hence, ProPreO models both the glycoproteomics experimental process and attendant data

Page 4

ProPreO population: transformation to RDF

[Figure: Scientific Data → Computational Methods → Ontology instances]

Page 5

ProPreO population: transformation to RDF

[Figure: Scientific Data → Computational Methods → RDF, with a key distinguishing the Protein Path from the Peptide Path]

Scientific Data:
• Protein Data: amino-acid sequence

Computational Methods:
• Calculate Chemical Mass
• Calculate Monoisotopic Mass
• Determine N-glycosylation Consensus
• Extract Peptide Amino-acid Sequence from Protein Amino-acid Sequence

RDF:
• ChemicalMass RDF, MonoisotopicMass RDF, Amino-acidSequence RDF
• “Protein RDF”: chemical mass, monoisotopic mass, amino-acid sequence, n-glycosylation consensus
• “Peptide RDF”: chemical mass, monoisotopic mass, amino-acid sequence, n-glycosylation consensus, parent protein
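
The computational methods named on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: it uses the standard monoisotopic residue masses, takes the "N-glycosylation consensus" to be the usual sequon Asn-X-Ser/Thr (X not Pro), and uses hypothetical `ex:` predicate names in place of the actual ProPreO terms. A chemical (average) mass calculation would follow the same pattern with a different mass table.

```python
import re

# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per peptide (free N- and C-termini)

# Assumption: "consensus" means the sequon N-X-S/T with X != P.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def monoisotopic_mass(seq):
    """Sum of residue masses plus one water."""
    return sum(MONO[aa] for aa in seq) + WATER


def sequon_positions(seq):
    """1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


def to_triples(subject, seq, parent=None):
    """Describe a protein or peptide as (subject, predicate, object) tuples.

    The ex: predicates are hypothetical placeholders, not ProPreO terms.
    """
    triples = [
        (subject, "ex:amino_acid_sequence", seq),
        (subject, "ex:monoisotopic_mass", round(monoisotopic_mass(seq), 4)),
        (subject, "ex:n_glycosylation_consensus", sequon_positions(seq)),
    ]
    if parent is not None:  # peptide path only
        triples.append((subject, "ex:parent_protein", parent))
    return triples


for triple in to_triples("ex:peptide_1", "GNGS", parent="ex:protein_1"):
    print(triple)
```

Passing `parent` corresponds to the peptide path in the figure; omitting it corresponds to the protein path.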

Page 6

Semantic annotation of scientific/experimental data

Page 7

ProPreO: Ontology-mediated provenance

Mass Spectrometry (MS) Data

ms/ms peaklist data:

830.9570 194.9604 2     ← parent ion m/z, parent ion abundance, parent ion charge
580.2985 0.3592         ← fragment ion m/z, fragment ion abundance
688.3214 0.2526
779.4759 38.4939
784.3607 21.7736
1543.7476 1.3822
1544.7595 2.9977
1562.8113 37.4790
1660.7776 476.5043
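
As a rough sketch of how such a peak list could be read into a structure before annotation, the Python below assumes the layout labelled on the slide: the first line holds the parent ion m/z, abundance, and charge, and every following line holds one fragment ion m/z and abundance. The `PeakList` class and `parse_peaklist` function are illustrative names, not part of any tool mentioned in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class PeakList:
    parent_mz: float
    parent_abundance: float
    parent_charge: int
    fragments: list = field(default_factory=list)  # (m/z, abundance) pairs


def parse_peaklist(text):
    """Parse a whitespace-separated ms/ms peak list (layout assumed above)."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    mz, abundance, charge = rows[0]
    peaks = PeakList(float(mz), float(abundance), int(charge))
    for frag_mz, frag_abundance in rows[1:]:
        peaks.fragments.append((float(frag_mz), float(frag_abundance)))
    return peaks


RAW = """\
830.9570 194.9604 2
580.2985 0.3592
688.3214 0.2526
779.4759 38.4939
784.3607 21.7736
1543.7476 1.3822
1544.7595 2.9977
1562.8113 37.4790
1660.7776 476.5043
"""

peaks = parse_peaklist(RAW)
# The most abundant fragment ion in the list.
base_peak = max(peaks.fragments, key=lambda peak: peak[1])
print(peaks.parent_mz, peaks.parent_charge, base_peak)
```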

Page 8

ProPreO: Ontology-mediated provenance

Semantically Annotated MS Data

<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer"
             mode="ms-ms"/>
  <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion m-z="580.2985" abundance="0.3592"/>
  <fragment_ion m-z="688.3214" abundance="0.2526"/>
  <fragment_ion m-z="779.4759" abundance="38.4939"/>
  <fragment_ion m-z="784.3607" abundance="21.7736"/>
  <fragment_ion m-z="1543.7476" abundance="1.3822"/>
  <fragment_ion m-z="1544.7595" abundance="2.9977"/>
  <fragment_ion m-z="1562.8113" abundance="37.4790"/>
  <fragment_ion m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>

Ontological Concepts
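
Once the peak list carries named elements and attributes, a consumer can read it by concept name instead of by column position. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened copy of the annotated peak list (two fragment ions instead of eight). Treating the element names as the "ontological concepts" is an assumption made for illustration; the slides do not show how names are bound to ProPreO classes.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the annotated peak list, with straight quotes.
ANNOTATED = """\
<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer"
             mode="ms-ms"/>
  <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion m-z="580.2985" abundance="0.3592"/>
  <fragment_ion m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
"""

root = ET.fromstring(ANNOTATED)

# Provenance: which instrument and mode produced the data.
parameter = root.find("parameter")
instrument = parameter.get("instrument")
mode = parameter.get("mode")

# Data: parent ion and fragment ions, addressed by name.
parent = root.find("parent_ion")
parent_charge = int(parent.get("z"))
fragments = [
    (float(ion.get("m-z")), float(ion.get("abundance")))
    for ion in root.findall("fragment_ion")
]

# Element names used in the document (assumed to stand for ontology concepts).
concepts = sorted({element.tag for element in root.iter()})

print(instrument, mode, parent_charge)
print(fragments)
print(concepts)
```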

Page 9

Semantic annotation of Scientific Data

Annotated ms/ms peaklist data

<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer"
             mode="ms/ms"/>
  <parent_ion_mass>830.9570</parent_ion_mass>
  <total_abundance>194.9604</total_abundance>
  <z>2</z>
  <mass_spec_peak m-z="580.2985" abundance="0.3592"/>
  <mass_spec_peak m-z="688.3214" abundance="0.2526"/>
  <mass_spec_peak m-z="779.4759" abundance="38.4939"/>
  <mass_spec_peak m-z="784.3607" abundance="21.7736"/>
  <mass_spec_peak m-z="1543.7476" abundance="1.3822"/>
  <mass_spec_peak m-z="1544.7595" abundance="2.9977"/>
  <mass_spec_peak m-z="1562.8113" abundance="37.4790"/>
  <mass_spec_peak m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>

Page 10

N-Glycosylation Process (NGP)

[Figure: N-Glycosylation Process (NGP) workflow diagram]

Materials and data: Cell Culture; Glycoprotein Fraction; Glycopeptides Fraction; Peptide Fraction; ms data, ms/ms data; ms peaklist, ms/ms peaklist; Peptide list; N-dimensional array; Glycopeptide identification and quantification

Process steps: extract; proteolysis; Separation technique I; PNGase; Separation technique II; Mass spectrometry; Data reduction; Peptide identification; binning; Signal integration; Data correlation

Fraction counts shown in the diagram: 1, n, n, n*m

Page 11

Semantic Web Process to incorporate provenance

[Figure: a pipeline of agents in which each step's output (O) is the next step's input (I)]

Biological Sample Analysis by MS/MS → Raw Data
Raw Data to Standard Format → Standard Format Data
Data Pre-process → Filtered Data
DB Search (Mascot/Sequest) → Search Results
Results Post-process (ProValt) → Final Output

Storage: Raw Data, Standard Format Data, Filtered Data, Search Results, Final Output

Biological Information

Semantic Annotation Applications
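
One way to picture a process that incorporates provenance is a pipeline runner that records, for every step, which agent ran it and which named data item went in and came out. The Python below is only a sketch of that idea: the step and data names come from the slide, while the agent identifiers, the `ProvenanceRecord` class, and the placeholder step functions (a simple abundance filter and a stubbed search) are invented for illustration and do not reproduce Mascot, Sequest, or ProValt.

```python
from dataclasses import dataclass


@dataclass
class ProvenanceRecord:
    step: str
    agent: str
    input_name: str
    output_name: str


def run_pipeline(data, steps, start_name):
    """Run steps in order, chaining each output (O) into the next input (I)."""
    records = []
    current_name = start_name
    for step_name, agent, output_name, func in steps:
        data = func(data)
        records.append(ProvenanceRecord(step_name, agent, current_name, output_name))
        current_name = output_name
    return data, records


# Placeholder step functions; real steps would call external tools.
STEPS = [
    ("Raw Data to Standard Format", "agent-1", "Standard Format Data",
     lambda peaks: list(peaks)),
    ("Data Pre-process", "agent-2", "Filtered Data",
     lambda peaks: [p for p in peaks if p[1] >= 1.0]),  # drop weak peaks
    ("DB Search", "agent-3", "Search Results",
     lambda peaks: {"candidate_count": len(peaks)}),     # stubbed search
    ("Results Post-process", "agent-4", "Final Output",
     lambda result: result),
]

raw_data = [(580.2985, 0.3592), (779.4759, 38.4939), (1660.7776, 476.5043)]
final_output, provenance = run_pipeline(raw_data, STEPS, "Raw Data")

for record in provenance:
    print(f"{record.input_name} -> [{record.step} by {record.agent}] -> {record.output_name}")
```

Keeping the provenance records separate from the data means the same trail can later be attached to the final output as metadata.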

Page 12

Integrated Semantic Information and knowledge System (Isis)

PROTEOMICS WORKFLOW

Services: Raw2mzXML → mzXML2Pkl → Pkl2pSplit → MASCOT Search → ProVault

Data: Raw → mzXML → Pkl → pSplit → MASCOT result → ProVault result

Components: EXPERIMENTAL DATA; Semantic Annotation Metadata File; Semantic Metadata Registry; SPARQL query-based User Interface; ProPreO ontology; PROTEOMECOMMONS

Example queries:

• Have I made an error? Give me all result files from a similar organism, cell, preparation, mass spectrometric conditions and compare results.

• Is the result erroneous? Give me all result files from a similar organism, cell, preparation, mass spectrometric conditions and compare results.
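
The "give me all result files from a similar ..." questions amount to pattern matching over the metadata registry. The slides describe a SPARQL-based interface; the Python below only imitates that style of lookup over an in-memory list of triples so it stays self-contained. All run identifiers, property names, and values are hypothetical, and "similar" is simplified to exact equality on the four properties named in the question.

```python
# Hypothetical registry content: (run, property, value) triples.
TRIPLES = [
    ("run1", "organism", "mouse"),
    ("run1", "cell", "embryonic stem cell"),
    ("run1", "preparation", "tryptic digest"),
    ("run1", "ms_conditions", "QTOF ms/ms"),
    ("run1", "result_file", "run1_result.xml"),
    ("run2", "organism", "mouse"),
    ("run2", "cell", "embryonic stem cell"),
    ("run2", "preparation", "tryptic digest"),
    ("run2", "ms_conditions", "QTOF ms/ms"),
    ("run2", "result_file", "run2_result.xml"),
    ("run3", "organism", "human"),
    ("run3", "cell", "fibroblast"),
    ("run3", "preparation", "tryptic digest"),
    ("run3", "ms_conditions", "QTOF ms/ms"),
    ("run3", "result_file", "run3_result.xml"),
]

SIMILARITY_KEYS = ("organism", "cell", "preparation", "ms_conditions")


def describe(run):
    """Collect all property/value pairs recorded for one run."""
    return {prop: value for subject, prop, value in TRIPLES if subject == run}


def similar_result_files(reference_run):
    """Result files of other runs that match the reference on every key."""
    reference = describe(reference_run)
    other_runs = sorted({subject for subject, _, _ in TRIPLES} - {reference_run})
    return [
        describe(run).get("result_file")
        for run in other_runs
        if all(describe(run).get(key) == reference.get(key) for key in SIMILARITY_KEYS)
    ]


print(similar_result_files("run1"))
```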

Page 13

Semantic Biological Web Service Registry

Semantic Web Service

Page 14

GLYDE-CT: GLYcan Data Exchange Based on a Connection Table Format

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE GlydeCT SYSTEM "http://glycomics.ccrc.uga.edu/GLYDE-CT/GLYDE-CT_v2.11.DTD">
<GlydeCT xmlns:GlydeCT="http://glycomics.ccrc.uga.edu/GLYDE-CT/GLYDE-CT_v2.11">
  <structure type="molecule" id="molecule_1" name="GP1">
    <part type="moiety" id="moiety_1" ref="some_file#GNGS" name="GNGS"/>
    <part type="moiety" id="moiety_2" ref="some_file#Man3" name="Man3GlcNAc2"/>
    <link from="moiety_2" to="moiety_1">
      <link from="residue_1" to="residue_2">
        <link from="C1" to="N4"/>
      </link>
    </link>
  </structure>
</GlydeCT>

[Figure: structure diagram of the glycopeptide, with moiety_1 (Gly|Asn|Gly|Ser) and moiety_2 labelled and residues and atoms numbered]
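
To show how a connection-table document of this kind might be consumed, the sketch below parses the example structure with Python's standard `xml.etree.ElementTree` and walks the nested `<link>` elements. It assumes a cleaned-up copy of the slide's XML (straight quotes, matching root tags, prolog omitted) and reads the three nesting levels as moiety, residue, and atom connections, which is an interpretation of the example and not a statement about the GLYDE-CT DTD.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the slide's example; XML declaration and DOCTYPE omitted.
GLYDE = """\
<GlydeCT>
  <structure type="molecule" id="molecule_1" name="GP1">
    <part type="moiety" id="moiety_1" ref="some_file#GNGS" name="GNGS"/>
    <part type="moiety" id="moiety_2" ref="some_file#Man3" name="Man3GlcNAc2"/>
    <link from="moiety_2" to="moiety_1">
      <link from="residue_1" to="residue_2">
        <link from="C1" to="N4"/>
      </link>
    </link>
  </structure>
</GlydeCT>
"""


def link_chain(link):
    """Follow nested <link> elements from the outermost to the innermost."""
    chain = []
    while link is not None:
        chain.append((link.get("from"), link.get("to")))
        link = link.find("link")  # direct child only; None at the innermost level
    return chain


structure = ET.fromstring(GLYDE).find("structure")
parts = {part.get("id"): part.get("name") for part in structure.findall("part")}
chain = link_chain(structure.find("link"))

print(parts)
for level, (source, target) in zip(("moiety", "residue", "atom"), chain):
    print(f"{level}: {source} -> {target}")
```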

Page 15

Data, ontologies, more publications at Biomedical Glycomics project web site:

http://knoesis.wright.edu/research/bioinformatics/index.html

Thank You