The Pathway Tools Schema

SRI International Bioinformatics

Motivations for Understanding Schema
Pathway Tools visualizations and analyses depend upon the software being able to find precise information in precise places within a Pathway/Genome DB
When writing complex queries to PGDBs, those queries must name classes and slots within the schema
A Pathway/Genome Database is a web of interconnected objects; each object represents a biological entity

Reference
Pathway Tools User’s Guide, Volume I Appendix A: Guide to the Pathway Tools Schema

Web of Relationships for One Enzyme
[Diagram: genes sdhA, sdhB, sdhC, sdhD; their polypeptide products Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2; the Succinate dehydrogenase complex; an Enzymatic-reaction frame; the reaction Succinate + FAD = fumarate + FADH2; the TCA Cycle pathway]

Frame Data Model
Frame Data Model -- organizational structure for a PGDB
Knowledge base (KB, Database, DB)
Frames
Slots
Facets
Annotations

Knowledge Base
Collection of frames and their associated slots, values, facets, and annotations
AKA: Database, PGDB
Can be stored within:
  An Oracle DB
  A disk file
  A Pathway Tools binary program

Frames
Entities with which facts are associated
Kinds of frames:
  Classes: Genes, Pathways, Biosynthetic Pathways
  Instances (objects): trpA, TCA cycle
Classes:
  Superclass(es)
  Subclass(es)
  Instance(s)
A symbolic frame name (id, key) uniquely identifies each frame

Frame IDs
Naming conventions for frame IDs
Uniqueness of frame IDs:
  Frame IDs must be unique within a PGDB
  Goal: Same frame ID within different PGDBs should refer to the same biological entity
  Because many frames are imported from MetaCyc, this helps ensure consistency of frame names
  Frame IDs for newly created frames (not imported) are generated by Pathway Tools
  Those frame IDs contain a PGDB-specific identifier
  Example: CPLXzz-nnnn, for instance CPLXB3-0035
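The generated-ID pattern can be illustrated with a small Python sketch. The layout it assumes (uppercase type prefix, two-character PGDB identifier, dash, four-digit counter) is inferred only from the single example CPLXzz-nnnn above; the function names are hypothetical and not part of Pathway Tools.

```python
import re

# Assumed layout, inferred only from the example "CPLXzz-nnnn":
# an uppercase type prefix, a two-character PGDB identifier, a dash, digits.
FRAME_ID = re.compile(r"^(?P<prefix>[A-Z]+)(?P<pgdb>[A-Z0-9]{2})-(?P<number>\d+)$")


def make_frame_id(prefix: str, pgdb: str, number: int) -> str:
    """Build an ID such as CPLXB3-0035 (four-digit padding is an assumption)."""
    return f"{prefix}{pgdb}-{number:04d}"


def parse_frame_id(frame_id: str):
    """Split a generated ID into (prefix, pgdb, number), or return None."""
    match = FRAME_ID.match(frame_id)
    if match is None:
        return None
    return match["prefix"], match["pgdb"], int(match["number"])
```

IDs imported from MetaCyc do not follow this pattern, so `parse_frame_id` simply returns `None` for them.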

Slots
Encode attributes/properties of a frame:
  Integer, real number, string, symbols
Represent relationships between frames:
  The value of a slot is the identifier of another frame
Every slot is described by a “slot frame” in a KB that defines meta information about that slot
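The two roles of a slot can be sketched in Python with a toy dictionary standing in for a knowledge base. This is only an illustration: the frame names, the coordinate value, and the `split_slots` helper are made up, and a real PGDB is accessed through the Pathway Tools APIs rather than a dict.

```python
# Toy knowledge base: frame ID -> {slot name -> list of values}.
# All names and values here are illustrative, not taken from a real PGDB.
kb = {
    "sdhA": {"Left-End-Position": [1000], "Product": ["Sdh-flavo"]},
    "Sdh-flavo": {"Common-Name": ["succinate dehydrogenase flavoprotein"]},
}


def split_slots(kb, frame_id):
    """Separate a frame's slots into plain attributes and links to other frames."""
    attributes, links = {}, {}
    for slot, values in kb[frame_id].items():
        # A slot whose values are all IDs of frames in the KB encodes a relationship.
        if all(isinstance(v, str) and v in kb for v in values):
            links[slot] = values
        else:
            attributes[slot] = values
    return attributes, links
```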

Slot Links
[Diagram: the succinate dehydrogenase web of relationships (genes sdhA, sdhB, sdhC, sdhD; polypeptides Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2; Succinate dehydrogenase; Enzymatic-reaction; Succinate + FAD = fumarate + FADH2; TCA Cycle), with the links labeled by slot: product (gene to polypeptide), component-of (polypeptide to complex), catalyzes (complex to enzymatic reaction), reaction (enzymatic reaction to reaction), in-pathway (reaction to pathway)]
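Following the slot links from a gene to its pathway might look like the Python sketch below. The dictionary is a toy stand-in for a PGDB; the frame IDs (for example `Enzrxn-1`, `Rxn-1`) and the `follow` helper are hypothetical, and only the slot names come from the diagram.

```python
# Toy web of frames linked by the slots named in the diagram.
kb = {
    "sdhA": {"product": ["Sdh-flavo"]},
    "Sdh-flavo": {"component-of": ["Succinate-dehydrogenase"]},
    "Succinate-dehydrogenase": {"catalyzes": ["Enzrxn-1"]},
    "Enzrxn-1": {"reaction": ["Rxn-1"]},
    "Rxn-1": {"in-pathway": ["TCA-Cycle"]},
    "TCA-Cycle": {},
}

GENE_TO_PATHWAY = ["product", "component-of", "catalyzes", "reaction", "in-pathway"]


def follow(kb, start, slots):
    """Follow a chain of slots from one frame; return the set of frames reached."""
    frontier = {start}
    for slot in slots:
        frontier = {v for f in frontier for v in kb.get(f, {}).get(slot, [])}
    return frontier
```

Calling `follow(kb, "sdhA", GENE_TO_PATHWAY)` walks all five links in order; a shorter slot list stops at an intermediate frame.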

Slots
Number of values:
  Single valued
  Multivalued: sets, bags
Slot values:
  Any LISP object: Integer, real, string, symbol (frame name)
Slotunits define properties of slots: datatypes, classes, constraints
Two slots are inverses if they encode opposite relationships:
  Slot Product in class Genes
  Slot Gene in class Polypeptides
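The inverse-slot idea can be sketched in Python as a helper that writes both directions of a link at once. This is an illustration under stated assumptions: the `INVERSES` table, the `add_link` helper, and the frame name `TrpA-polypeptide` are hypothetical, and the sketch says nothing about how Pathway Tools itself maintains inverses.

```python
from collections import defaultdict

# Hypothetical table of inverse slot pairs (Product in Genes, Gene in Polypeptides).
INVERSES = {"Product": "Gene", "Gene": "Product"}

# Toy knowledge base: frame -> slot -> list of values.
kb = defaultdict(lambda: defaultdict(list))


def add_link(kb, frame, slot, value):
    """Add value to frame.slot and mirror the link in the inverse slot of value."""
    if value not in kb[frame][slot]:
        kb[frame][slot].append(value)
    inverse = INVERSES.get(slot)
    if inverse is not None and frame not in kb[value][inverse]:
        kb[value][inverse].append(frame)


add_link(kb, "trpA", "Product", "TrpA-polypeptide")
```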

Representation of Function
[Diagram: the succinate dehydrogenase web of relationships (genes sdhA, sdhB, sdhC, sdhD; polypeptides Sdh-flavo, Sdh-Fe-S, Sdh-membrane-1, Sdh-membrane-2; Succinate dehydrogenase; Enzymatic-reaction; Succinate + FAD = fumarate + FADH2; TCA Cycle), annotated with example attributes: EC#, Keq, Cofactors, Inhibitors, Molecular wt, pI, Left-end-position]

Monofunctional Monomer
[Diagram: Gene, Monomer, Enzymatic-reaction, Reaction, Pathway]

Bifunctional Monomer
[Diagram: Gene, Monomer, two Enzymatic-reaction frames, two Reaction frames, Pathway]

Monofunctional Multimer
[Diagram: four Gene frames, four Monomer frames, Multimer, Enzymatic-reaction, Reaction, Pathway]

Pathway and Substrates
[Diagram: a Pathway linked to several Reaction frames by in-pathway; a Reaction linked to Reactant-1 and Reactant-2 by left, and to Product-1 and Product-2 by right]
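The left, right, and in-pathway slots are enough to collect the compounds of a pathway. A minimal Python sketch, assuming a toy dictionary of reaction frames (the reaction and pathway IDs and the `pathway_compounds` helper are made up for illustration):

```python
# Toy reaction frames using the slot names from the diagram.
reactions = {
    "Rxn-1": {
        "left": ["Reactant-1", "Reactant-2"],
        "right": ["Product-1", "Product-2"],
        "in-pathway": ["Pathway-1"],
    },
    "Rxn-2": {
        "left": ["Product-1"],
        "right": ["Product-3"],
        "in-pathway": ["Pathway-1"],
    },
}


def pathway_compounds(reactions, pathway):
    """Union of the left and right values of every reaction in the pathway."""
    compounds = set()
    for slots in reactions.values():
        if pathway in slots.get("in-pathway", []):
            compounds.update(slots.get("left", []))
            compounds.update(slots.get("right", []))
    return compounds
```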

Transcriptional Regulation
[Diagram: transcriptional regulation of the trp operon, showing the frames site001, pro001, trpL, trpE, trpD, trpC, trpB, trpA, trpLEDCBA, trp, TrpR*trp, apoTrpR, RpoSig70, and the interactions Int001, Int003, Int005]

Principal Classes
Class names are capitalized, plural, separated by dashes
Genetic-Elements, with subclasses:
  Chromosomes
  Plasmids
Genes
Transcription-Units
RNAs:
  rRNAs, snRNAs, tRNAs, Charged-tRNAs
Proteins, with subclasses:
  Polypeptides
  Protein-Complexes

Principal Classes
Reactions, with subclasses: Transport-Reactions
Enzymatic-Reactions
Pathways
Compounds-And-Elements

Slots in Multiple Classes
Common-Name
Synonyms
Comment
Citations
DB-Links

Genes Slots
Component-Of (links to replicon, transcription unit)
Left-End-Position
Right-End-Position
Centisome-Position
Transcription-Direction
Product

Proteins Slots
Molecular-Weight-Seq
Molecular-Weight-Exp
pI
Locations
Modified-Form
Unmodified-Form
Component-Of

Polypeptides Slots
Gene

Protein-Complexes Slots
Components

Reactions Slots
EC-Number
Left, Right
DeltaG0
Keq
Spontaneous?

Enzymatic-Reactions Slots
Enzyme
Reaction
Activators
Inhibitors
Physiologically-Relevant
Cofactors
Prosthetic-Groups
Alternative-Substrates
Alternative-Cofactors

Pathways Slots
Reaction-List
Predecessors
Primaries

GKB Editor
Browse class hierarchy and slot definitions
Tools -> Ontology Browser
GKB Editor described at http://www.ai.sri.com/~gkb/user-man.html
Pathway Tools Data Access Mechanisms

Introduction
MANY ways to access and update PGDBs
APIs in Java, Perl, and Lisp
Import/export of files in many formats
Registry of Pathway/Genome Databases
Import PGDB data into BioWarehouse
Updating a PGDB from an external genome DB

Pathway Tools APIs
Support programmatic queries and updates to PGDBs
APIs in Java, Perl, and Lisp all provide access to a common set of procedures:
  Generic Frame Protocol -- Ocelot object database API
  Additional Pathway Tools functions
For more information see http://bioinformatics.ai.sri.com/ptools/ptools-resources.html

Generic Frame Protocol (GFP)
A library of procedures for accessing Ocelot DBs
GFP specification: http://www.ai.sri.com/~gfp/spec/paper/paper.html
A small number of GFP functions are sufficient for most complex queries
Knowledge of Pathway Tools schema is critical for using the APIs:
Appendix I of Pathway Tools User’s Guide, Vol I

Generic Frame Protocol
get-class-all-instances (Class): Returns the instances of Class
Key Pathway Tools classes: Genetic-Elements, Genes, Proteins, Polypeptides (a subclass of Proteins), Protein-Complexes (a subclass of Proteins), Pathways, Reactions, Compounds-And-Elements, Enzymatic-Reactions, Transcription-Units, Promoters, DNA-Binding-Sites
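What get-class-all-instances does can be modeled with a short Python sketch that gathers the instances of a class together with those of its subclasses. The two lookup tables and the instance names are a toy stand-in, not the Pathway Tools API; the real call is made through the Lisp, Java, or Perl interfaces.

```python
# Toy class hierarchy and direct instances (illustrative names only).
SUBCLASSES = {
    "Proteins": ["Polypeptides", "Protein-Complexes"],
    "Polypeptides": [],
    "Protein-Complexes": [],
}
DIRECT_INSTANCES = {
    "Proteins": [],
    "Polypeptides": ["Sdh-flavo", "Sdh-Fe-S"],
    "Protein-Complexes": ["Succinate-dehydrogenase"],
}


def get_class_all_instances(cls):
    """Return instances of cls, including instances of all of its subclasses."""
    found = list(DIRECT_INSTANCES.get(cls, []))
    for sub in SUBCLASSES.get(cls, []):
        found.extend(get_class_all_instances(sub))
    return found
```

Asking for `Proteins` therefore also returns the polypeptides and the protein complexes, which is why the subclass notes in the class list matter when writing queries.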

Generic Frame Protocol
Notation: Frame.Slot means a specified slot of a specified frame
get-slot-value(Frame Slot): Returns first value of Frame.Slot
get-slot-values(Frame Slot): Returns all values of Frame.Slot as a list
slot-has-value-p(Frame Slot): Returns T if Frame.Slot has at least one value
member-slot-value-p(Frame Slot Value): Returns T if Value is one of the values of Frame.Slot
print-frame(Frame): Prints the contents of Frame
Note: Frame and Slot must be symbols!
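The behavior of the read operations can be mimicked in Python over a toy dictionary. This is a sketch of the semantics described above, not the GFP implementation: the function names are Python transliterations, the synonym values are placeholders, and returning `None` for a missing value is an assumption of this model.

```python
# Toy knowledge base: frame -> slot -> list of values (placeholder data).
KB = {"trpA": {"Common-Name": ["trpA"], "Synonyms": ["syn-1", "syn-2"]}}


def get_slot_values(frame, slot):
    """All values of Frame.Slot as a list (empty if there are none)."""
    return list(KB.get(frame, {}).get(slot, []))


def get_slot_value(frame, slot):
    """First value of Frame.Slot, or None when the slot is empty (assumption)."""
    values = get_slot_values(frame, slot)
    return values[0] if values else None


def slot_has_value_p(frame, slot):
    """True if Frame.Slot has at least one value."""
    return bool(get_slot_values(frame, slot))


def member_slot_value_p(frame, slot, value):
    """True if value is one of the values of Frame.Slot."""
    return value in get_slot_values(frame, slot)
```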

Generic Frame Protocol
coercible-to-frame-p (Thing): Returns T if Thing is the name of a frame, or a frame object
save-kb: Saves the current KB

Generic Frame Protocol: Update Operations
put-slot-value(Frame Slot Value): Replace the current value(s) of Frame.Slot with Value
put-slot-values(Frame Slot Value-List): Replace the current value(s) of Frame.Slot with Value-List, which must be a list of values
add-slot-value(Frame Slot Value): Add Value to the current value(s) of Frame.Slot, if any
remove-slot-value(Frame Slot Value): Remove Value from the current value(s) of Frame.Slot
replace-slot-value(Frame Slot Old-Value New-Value): In Frame.Slot, replace Old-Value with New-Value
remove-local-slot-values(Frame Slot): Remove all of the values of Frame.Slot
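The update operations can be modeled the same way. The Python sketch below reproduces only the semantics listed above on a toy dictionary; the names are transliterations, the data is placeholder, and details such as ordering or duplicate handling in the real GFP are not covered.

```python
# Toy knowledge base: frame -> slot -> list of values (placeholder data).
KB = {"trpA": {"Synonyms": ["syn-1"]}}


def _values(frame, slot):
    """Return the mutable value list of Frame.Slot, creating it if needed."""
    return KB.setdefault(frame, {}).setdefault(slot, [])


def put_slot_values(frame, slot, value_list):
    """Replace the current value(s) of Frame.Slot with value_list."""
    KB.setdefault(frame, {})[slot] = list(value_list)


def put_slot_value(frame, slot, value):
    """Replace the current value(s) of Frame.Slot with a single value."""
    put_slot_values(frame, slot, [value])


def add_slot_value(frame, slot, value):
    """Add value to the current value(s) of Frame.Slot, if any."""
    _values(frame, slot).append(value)


def remove_slot_value(frame, slot, value):
    """Remove value from Frame.Slot when it is present."""
    values = _values(frame, slot)
    if value in values:
        values.remove(value)


def replace_slot_value(frame, slot, old_value, new_value):
    """In Frame.Slot, replace old_value with new_value, keeping its position."""
    values = _values(frame, slot)
    if old_value in values:
        values[values.index(old_value)] = new_value


def remove_local_slot_values(frame, slot):
    """Remove all of the values of Frame.Slot."""
    KB.setdefault(frame, {})[slot] = []
```

In a real session the changes would still need to be persisted with save-kb; the toy model has no equivalent.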

Additional Pathway Tools Functions: Semantic Inference Layer
Semantic inference layer defines built-in functions to compute commonly required relationships in a PGDB
http://bioinformatics.ai.sri.com/ptools/ptools-fns.html

Internal note
Note: Refer to local copy of ptools-fns.html to go through the semantic inference layer fns

File Import/Export Capabilities
PGDBs can be exported in whole or part to:
SBML (Systems Biology Markup Language, sbml.org)
  Import supported by many simulation packages
  File -> Export -> Selected Reactions to SBML File
Pathway Tools Attribute-Value format and column-delimited format files
  http://brg.ai.sri.com/ptools/flatfile-format.shtml
  Dump entire PGDB to a suite of files: File -> Export -> Entire DB to Flat Files
  Dump selected frames to a single file: File -> Export -> Selected Frames to File

Import/Export
Import from attribute-value or column-delimited files
  File -> Import -> Frames From File
Import/Export to/from internal Pathway Tools format that allows pathways, reactions, enzymes, and compounds to be easily moved between Pathway Tools installations
  Edit -> Add Pathway to File Export List
  File -> Export -> Selected Pathways to File
  File -> Import -> Pathways from File
Import/Export to/from MDL molfile format
  Edit -> Import compound structure from molfile
  Edit -> Export compound structure to molfile

Miscellaneous Exports
Overview -> Highlight -> Save to File
Overview -> Highlight -> Load from File
Gene / Protein Sequence / Save to file
Chromosome -> Show Sequence of a Segment of Replicon

Napster Comes to Bioinformatics
Public sharing of Pathway/Genome Databases
PGDB registry maintained by SRI at URL http://biocyc.org/registry.html
Registry operations:
  List contents of registry
  Download PGDBs listed in the registry
  Register PGDBs you have created

Registry Details
Why register your PGDB?
  Declare existence of your PGDB in a central location
  Facilitate download by other scientists
Why download a PGDB?
  Desktop Navigator provides more functionality than Web
  Comparative operations
  Programmatic querying and processing of PGDB
Registration process:
  Registered PGDBs have open availability by default
  Authors can provide their own license agreements
  Registered PGDBs reside on authors’ FTP site

BioWarehouse
Biospice.org

New Import/Export Tools
Suggestions?
Volunteers?

Updating a PGDB From an External Genome DB
Example: AraCyc forms a pathway module to the TAIR DB
TAIR is authoritative source for gene and gene-product information
Update AraCyc to reflect updates in TAIR

Proposed Approach
Export TAIR to PathoLogic files
Build AraCyc2 from those PathoLogic files (automated PathoLogic only)
Compare AraCyc1 (A1) to AraCyc2 (A2):
  A. Import new genes/proteins from A2 to A1
  B. Delete from A1 genes/proteins not found in A2
  C. Rename genes/proteins whose names changed from A2 to A1
Run name matcher on A1’
Check for pathways with no enzymes and report them so the user can keep any that PathoLogic would otherwise delete
What about enzymes that were assigned to a pathway by the hole filler?
Re-run pathway predictor
Remember what pathways the user deletes so they are not re-predicted by PathoLogic
Consider movement of genes from contig to chromosome
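The comparison in steps A to C could be sketched in Python as three set operations. This is only an illustration of the proposal: it assumes each gene carries a stable accession that is the same in A1 and A2 (the slides do not say how genes are matched), and the accessions, names, and the `compare_gene_sets` helper are hypothetical.

```python
def compare_gene_sets(a1, a2):
    """Compare two {accession: name} maps for A1 (current) and A2 (rebuilt).

    Returns (to_import, to_delete, to_rename):
      to_import: accessions present only in A2 (step A)
      to_delete: accessions present only in A1 (step B)
      to_rename: {accession: (name_in_a1, name_in_a2)} where names differ (step C)
    """
    to_import = sorted(a2.keys() - a1.keys())
    to_delete = sorted(a1.keys() - a2.keys())
    to_rename = {
        acc: (a1[acc], a2[acc])
        for acc in a1.keys() & a2.keys()
        if a1[acc] != a2[acc]
    }
    return to_import, to_delete, to_rename


# Placeholder data: accession -> gene name.
a1 = {"G1": "abc1", "G2": "def1", "G3": "old-name"}
a2 = {"G1": "abc1", "G3": "new-name", "G4": "ghi1"}
to_import, to_delete, to_rename = compare_gene_sets(a1, a2)
```

The later steps (name matcher, pathways without enzymes, re-running the pathway predictor) depend on PathoLogic and are outside what this sketch models.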