The Pathway Tools Schema

Post on 07-Jan-2016

33 views 0 download

Tags:

description

The Pathway Tools Schema. Motivations for Understanding Schema. Pathway Tools visualizations and analyses depend upon the software being able to find precise information in precise places within a Pathway/Genome DB - PowerPoint PPT Presentation

Transcript of The Pathway Tools Schema

The Pathway Tools Schema

SRI InternationalBioinformaticsMotivations for Understanding

Schema

Pathway Tools visualizations and analyses depend upon the software being able to find precise information in precise places within a Pathway/Genome DB

When writing complex queries to PGDBs, those queries must name classes and slots within the schema

A Pathway/Genome Database is a web of interconnected objects; each object represents a biological entity

SRI InternationalBioinformaticsReference

Pathway Tools User’s Guide, Volume I Appendix A: Guide to the Pathway Tools Schema

SRI InternationalBioinformaticsWeb of Relationships for One

Enzyme

Sdh-flavo Sdh-Fe-S Sdh-membrane-1 Sdh-membrane-2

sdhA sdhB sdhC sdhD

Succinate + FAD = fumarate + FADH2

Enzymatic-reaction

Succinate dehydrogenase

TCA Cycle

SRI InternationalBioinformaticsFrame Data Model

Frame Data Model -- organizational structure for a PGDB

Knowledge base (KB, Database, DB)

Frames

Slots

Facets

Annotations

SRI InternationalBioinformaticsKnowledge Base

Collection of frames and their associated slots, values, facets, and annotations

AKA: Database, PGDB

Can be stored within An Oracle or MySQL DB A disk file Pathway Tools binary program

SRI InternationalBioinformaticsFrames

Entities with which facts are associated

Kinds of frames: Classes: Genes, Pathways, Biosynthetic Pathways Instances (objects): trpA, TCA cycle

Classes: Superclass(es) Subclass(es) Instance(s)

A symbolic frame name (id, key) uniquely identifies each frame

SRI InternationalBioinformaticsSlots

Encode attributes/properties of a frame Integer, real number, string

Represent relationships between frames The value of a slot is the identifier of another frame

Every slot is described by a “slot frame” in a KB that defines meta information about that slot

SRI InternationalBioinformaticsSlot Links

Sdh-flavo Sdh-Fe-S Sdh-membrane-1 Sdh-membrane-2

sdhA sdhB sdhC sdhD

Succinate + FAD = fumarate + FADH2

Enzymatic-reaction

Succinate dehydrogenase

TCA Cycle

product

component-of

catalyzes

reaction

in-pathway

SRI InternationalBioinformaticsSlots

Number of values Single valued Multivalued: sets, bags

Slot values Any LISP object: Integer, real, string, symbol (frame name)

Slotunits define properties of slots: datatypes, classes, constraints

Two slots are inverses if they encode opposite relationships

Slot Product in class Genes Slot Gene in class Polypeptides

SRI InternationalBioinformaticsRepresentation of Function

Sdh-flavo Sdh-Fe-S Sdh-membrane-1 Sdh-membrane-2

sdhA sdhB sdhC sdhD

Succinate + FAD = fumarate + FADH2

Enzymatic-reaction

Succinate dehydrogenase

TCA Cycle

EC#Keq

CofactorsInhibitors

Molecular wtpI

Left-end-position

SRI InternationalBioinformaticsMonofunctional Monomer

Gene

Reaction

Enzymatic-reaction

Monomer

Pathway

SRI InternationalBioinformaticsBifunctional Monomer

Gene

Reaction

Enzymatic-reaction

Monomer

Pathway

Reaction

Enzymatic-reaction

SRI InternationalBioinformaticsMonofunctional Multimer

Monomer Monomer Monomer Monomer

Gene Gene Gene Gene

Reaction

Enzymatic-reaction

Multimer

Pathway

SRI InternationalBioinformaticsPathway and Substrates

Reactant-1

Reaction

Pathway

ReactionReactionReaction

Reactant-2

Product-2

Product-1

in-pathwayleft

right

SRI InternationalBioinformaticsTranscriptional Regulation

site001

pro001

trpE

trpD

trpC

trpB

trpA

trpL

Int003 RpoSig70

TrpR*trpInt001

trpLEDCBA

trp

apoTrpRInt005

SRI InternationalBioinformaticsPrinciple Classes

Class names are capitalized, plural, separated by dashes

Genetic-Elements, with subclasses: Chromosomes Plasmids

Genes Transcription-Units RNAs

rRNAs, snRNAs, tRNAs, Charged-tRNAs Proteins, with subclasses:

Polypeptides Protein-Complexes

SRI InternationalBioinformaticsPrinciple Classes

Reactions, with subclasses: Transport-Reactions

Enzymatic-Reactions

Pathways

Compounds-And-Elements

SRI InternationalBioinformaticsFrame IDs of Instances

Instance frame ID conventions have evolved over time

Examples: Pathways

TRPSYN-PWY, P23-PWY Genes

AG10045 Monomers

TRPA-MONOMER, AG10045-MONOMER

SRI InternationalBioinformaticsSlots in Multiple Classes

Common-NameSynonymsNames (computed as union of Common-Name,

Synonyms)

CommentCitations

DB-Links

SRI InternationalBioinformaticsGenes Slots

Component-Of (links to replicon, transcription unit)

Left-End-PositionRight-End-PositionCentisome-PositionTranscription-DirectionProduct

SRI InternationalBioinformaticsProteins Slots

Molecular-Weight-SeqMolecular-Weight-Exp

pILocations

Modified-FormUnmodified-Form

Component-Of

SRI InternationalBioinformaticsPolypeptides Slots

Gene

SRI InternationalBioinformaticsProtein-Complexes Slots

Components

SRI InternationalBioinformaticsReactions Slots

EC-Number

Left, RightSubstrates (computed as union of Left, Right)

DeltaG0Keq

Spontaneous?

SRI InternationalBioinformaticsEnzymatic-Reactions Slots

EnzymeReactionActivatorsInhibitorsPhysiologically-RelevantCofactorsProsthetic-GroupsAlternative-SubstratesAlternative-Cofactors

SRI InternationalBioinformaticsPathways Slots

Reaction-ListPredecessorsPrimaries