BioUML


Page 1: BioUML

BioUML


Fedor Kolpakov

Institute of Systems Biology (spin-off of DevelopmentOnTheEdge.com)

Laboratory of Bioinformatics, Design Technological Institute of Digital Techniques

Novosibirsk, Russia

Page 2: BioUML

Agenda

• Part 1: overview of BioUML workbench

• Coffee break

• Part 2: new concepts and possibilities (versions 0.8.0 – 0.8.3)

• Further development

• Questions and discussion

Page 3: BioUML

Part 1: overview of BioUML workbench

Overview
• Main concepts
• Meta model
• Architecture overview
• Diagram types
• Database module concepts
• Full text search
• Graph search
• Simulation engine
• BioUML server
• BMOND/Biopath database

Live demonstration:
• Installation of BioUML workbench
• Creating and simulating a simple model
• SBML - Biomodels module
• BioPAX import
• BMOND database web interface
• JavaScript shell

Page 4: BioUML

Part 2: new concepts and possibilities

Overview
• Reconstruction as solitaire game
• Levels of biological information
• BioHub concept
• Composite database module
• Composite diagram
• Experiment concept
• Graphic notation editor
• Microarray data analysis

Live demonstration
• Loading database modules from server
• Text search
• Graph search
• Creating a composite database module
• Creating a composite diagram
• Experiment
• Graphic notation editor
• Microarray data analysis

Page 5: BioUML

Useful resources

http://www.biouml.org/demo
Flash movies that demonstrate how to work with BioUML workbench

User guide, >200 pages
- HTML version: http://www.biouml.org/user/help/index.html
- MS Word document: http://www.biouml.org/download/0.7.8/manual.doc

http://bmond.biouml.org
BMOND – Biological Models aNd Diagrams database. Examples of pathway annotation.

Page 6: BioUML

Part 1

Overview of BioUML workbench

Page 7: BioUML

Main BioUML concepts and ideas

• Visual modeling
o Meta model – problem domain neutral level of abstraction that describes a system as a compartmentalized graph.
o Diagram type concept – formally defines a graphical notation and provides for its incorporation into BioUML workbench.
o Automated code generation for model simulation.

• Database module concept – allows a developer to incorporate databases on biological pathways into BioUML workbench, taking into account database peculiarities.

• Plug-in based architecture (Eclipse platform runtime from IBM).

Page 8: BioUML

Biological databases: data search and retrieval

Visual modeling: formal description of the structure of a biological system

Automated code generation for simulation of model behavior: MATLAB code, Java code, … code

Simulating using MATLAB. JMatLink allows BioUML workbench to start MATLAB and retrieve simulation results.

Java simulation plug-in. Contains ODE solvers ported from odeToJava and methods for hybrid model support.

Page 9: BioUML

Meta model

Page 10: BioUML

Example: system of two chemical reactions

A -> B (reaction R1), B -> C (reaction R2)
Initial values: A = 100, B = 0, C = 0

Corresponding mathematical model:

$$\frac{dA}{dt} = -k_1[A]$$
$$\frac{dB}{dt} = k_1[A] - k_2[B]$$
$$\frac{dC}{dt} = k_2[B]$$

k1 – reaction rate constant for R1
k2 – reaction rate constant for R2
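As a worked illustration of this mathematical model, the system can be integrated numerically. The sketch below is not taken from BioUML: the rate constants k1 = 0.5 and k2 = 0.2 are assumed values (the slide gives none), the initial values 100, 0, 0 come from the diagram, and a hand-written fourth-order Runge-Kutta step stands in for the workbench's own solvers.

```python
def rhs(state, k1, k2):
    """Right-hand side of the two-reaction system A -> B -> C."""
    a, b, c = state
    return (-k1 * a, k1 * a - k2 * b, k2 * b)


def rk4_step(state, dt, k1, k2):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(x + h * s for x, s in zip(base, slope))

    s1 = rhs(state, k1, k2)
    s2 = rhs(shifted(state, s1, dt / 2), k1, k2)
    s3 = rhs(shifted(state, s2, dt / 2), k1, k2)
    s4 = rhs(shifted(state, s3, dt), k1, k2)
    return tuple(
        x + dt / 6 * (p + 2 * q + 2 * r + s)
        for x, p, q, r, s in zip(state, s1, s2, s3, s4)
    )


def simulate(initial=(100.0, 0.0, 0.0), k1=0.5, k2=0.2, t_end=10.0, dt=0.01):
    """Integrate from t = 0 to t_end and return the final (A, B, C)."""
    state = initial
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt, k1, k2)
    return state
```

Because the three right-hand sides sum to zero, the total A + B + C stays at its initial value, which is a convenient sanity check on any solver used for this model.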

Page 11: BioUML

Meta-model: example of formal description of a system of two chemical reactions

System structure is described as a graph:
A -> B (R1), B -> C (R2); rate terms -k1[A], k1[A] on R1 and -k2[B], k2[B] on R2; initial values 100, 0, 0

Mathematical model of the system

Description of system components in the database:

ID A
CC .....
//
ID R1
A->B
...
//
ID B
CC .....
//
ID R2
B->C
...
//
ID C
CC .....
//
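To make the link between the graph and the mathematical model concrete, here is a minimal sketch of deriving the right-hand sides from a graph description. The data layout, the function name and the rate constants are assumptions made for this illustration, and mass-action kinetics is assumed; BioUML's actual meta model classes are not shown in the slides.

```python
# Species nodes with initial values, as in the diagram.
species = {"A": 100.0, "B": 0.0, "C": 0.0}

# Reaction nodes: each one names its reactants and products.
# The rate constants are hypothetical; the slides give no values.
reactions = {
    "R1": {"reactants": ["A"], "products": ["B"], "k": 0.5},
    "R2": {"reactants": ["B"], "products": ["C"], "k": 0.2},
}


def derivatives(values):
    """Build d[species]/dt from the graph, assuming mass-action kinetics."""
    dydt = {name: 0.0 for name in values}
    for reaction in reactions.values():
        rate = reaction["k"]
        for name in reaction["reactants"]:
            rate *= values[name]
        for name in reaction["reactants"]:
            dydt[name] -= rate
        for name in reaction["products"]:
            dydt[name] += rate
    return dydt
```

Each reaction contributes its rate with a minus sign to its reactants and a plus sign to its products, which reproduces the -k1[A], k1[A], -k2[B], k2[B] terms written on the edges of the diagram.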

Page 12: BioUML

The suggested approach can be applied to modeling biological systems using:
– Systems of ordinary differential equations
– Systems of differential-algebraic equations
– State and transition diagrams
– Hybrid models
– Boolean and logical networks
– Petri nets
– Markov chains
– Stochastic models
– Cellular automata
– …

Some limitations:
– Spatial models
– PDE
– …

Page 13: BioUML

BioUML architecture

Page 14: BioUML

Plug-in based architecture

A plug-in is the smallest unit of BioUML workbench function that can be developed and delivered separately into BioUML workbench. A plug-in is described in an XML manifest file called plugin.xml. The parsed contents of plug-in manifest files are made available programmatically through a plug-in registry API provided by the Eclipse runtime.

- Extension points are well-defined function points in the system where other plug-ins can contribute functionality.

- An extension is a specific contribution to an extension point. Plug-ins can define their own extension points, so that other plug-ins can integrate tightly with them.

[Diagram: several plug-ins, each consisting of plugin.xml, Java jar files, etc., on top of the Eclipse platform runtime]
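The relationship between extension points and extensions can be sketched with a toy registry. This is only an illustration of the idea, not the Eclipse plug-in registry API; the class, method names and identifiers are all hypothetical.

```python
class ExtensionRegistry:
    """Toy registry: extension points are named slots, extensions fill them."""

    def __init__(self):
        self._points = {}

    def declare_point(self, point_id):
        """A plug-in declares an extension point others may contribute to."""
        self._points.setdefault(point_id, [])

    def contribute(self, point_id, plugin_id, extension):
        """A plug-in contributes an extension to an already declared point."""
        if point_id not in self._points:
            raise KeyError(f"unknown extension point: {point_id}")
        self._points[point_id].append((plugin_id, extension))

    def extensions(self, point_id):
        """Return the contributions registered for one extension point."""
        return list(self._points.get(point_id, []))


registry = ExtensionRegistry()
# Hypothetical identifiers, chosen only for this illustration.
registry.declare_point("example.diagramType")
registry.contribute("example.diagramType", "example.sbml", "pathway simulation")
registry.contribute("example.diagramType", "example.kegg", "metabolic pathway")
```

The host only knows the extension point identifier; which plug-ins contribute to it is discovered at run time, which is what lets plug-ins be developed and delivered separately.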

Page 15: BioUML

[Architecture diagram. Components shown:]

Eclipse platform runtime

Workbench UI: perspectives; views, editors; menus, toolbars, etc.; diagram view part; diagram editor part

Tools: diagram editor, analysis tools, simulation tools, other tools

Meta model: graph structure, executable model

Standard module: database, database adapter, Java objects (Gene, Protein, …)

Diagram types:
- Semantic map
- Pathway
- Pathway simulation

Database modules: GeneNet module, KEGG/pathways module, TRANSPATH module, SBML module

DiagramType:
- semantic controller
- diagram view builder
- diagram filter

ModuleType:
- diagram types
- data categories
- query engine

Page 16: BioUML

Formal description and modeling of biological systems require coordinated efforts of different groups of researchers:

• programmers – they should provide computer tools for this task.

• problem domain experts – they should specify what should be described and how.

• experimenters and annotators – they should describe the corresponding data following these rules.

• mathematicians – they should provide methods for model analysis and simulation.

BioUML architecture separates these tasks so that each can be solved effectively by the corresponding group of researchers, and provides a simple contract for how these groups and the corresponding software parts should communicate.

Page 17: BioUML

Diagram types

Page 18: BioUML

Diagram type concept

A diagram type defines:

· types of biological components and their interactions that can be shown on the diagram;

· diagram view builder – generates a view for each diagram element, taking into account problem domain peculiarities;

· semantic controller – maintains semantic integrity of the diagram during editing;

· filters – hide or highlight diagram elements according to some selection criteria.

Page 19: BioUML

Reconstruction and formal description of biological systems using different diagram types

1. Semantic network
2. Pathway diagram (semantic network + gene network or metabolic pathway)
3. Metabolic pathway
4. Gene network
5. Pathway simulation (mathematical model)

Formality and detail increase along this sequence: semi-structured data, then structured data (reactions and their components), then kinetic data (kinetic laws, constants, initial values).

Page 20: BioUML

Graphic notation

Page 21: BioUML

Stimulus activating NF-kappaB (semantic network, ontology)

Page 22: BioUML

NF-kappaB family (semantic network, ontology)

Page 23: BioUML

Function of human DNA methyltransferases (pathway diagram)

Page 24: BioUML

The biosynthesis of catecholamines (metabolic pathway)

Page 25: BioUML

Cell cycle model of mammalian G1/S transition control with E2F feedback loops (pathway simulation diagram)

Page 26: BioUML

DGR0356 “NF-kB model” (Hoffmann et al., 2002)

Page 27: BioUML

NF-kB dynamics in nucleus and cytoplasm before and after TNF-alpha stimulation (Hoffmann et al., 2002)

Page 28: BioUML

Regulation of caspase-3 activation and degradation (Stucki and Simon, 2005)

Page 29: BioUML
Page 30: BioUML
Page 31: BioUML
Page 32: BioUML
Page 33: BioUML

Database module concept

The database module concept allows a developer to define new diagram types and incorporate other databases on biological pathways into the BioUML framework.

The database module defines the mapping of database content into diagram elements, and the diagram types that can be used with the database.

The module also provides a query engine that BioUML workbench can use to find interacting components of the system.

Page 34: BioUML

BioUML database modules

BioUML standard module

Databases
• EBI databases: Ensembl, UniProt, ChEBI, GeneOntology
• Biopath/BMOND (http://biopath.biouml.org)
• KEGG/Ligand (http://www.kegg.com)
• TRANSPATH (http://www.biobase.de)
• GeneNet (http://wwwmgs.bionet.nsc.ru)

Formats
• SBML – Systems Biology Markup Language, level 1, 2 (http://www.sbml.org)
• CellML – Cell Markup Language (http://www.cellml.org)
• BioPax – Biological Pathways Exchange (http://www.biopax.org)
• PSI-MI
• OBO
• GXL – Graph eXchange Language (http://www.gupro.de/GXL)

Page 35: BioUML

KEGG pathway

Page 36: BioUML

CellML model

Page 37: BioUML

SBML model

Page 38: BioUML

Full text search

Page 39: BioUML

User interface for full text search: 1) pop-up menu; 2) menu buttons for selected entity; 3) full text search pane.

Page 40: BioUML

Full text search (uses Lucene engine)

Page 41: BioUML
Page 42: BioUML

Graph search

Page 43: BioUML

Graph search engine

Page 44: BioUML
Page 45: BioUML

Simulation engine

Page 46: BioUML

Biological databases: data search and retrieval

Visual modeling: formal description of the structure of a biological system

Automated code generation for simulation of model behavior: MATLAB code, Java code, … code

Simulating using MATLAB. JMatLink allows BioUML workbench to start MATLAB and retrieve simulation results.

Java simulation plug-in. Contains ODE solvers ported from odeToJava and methods for hybrid model support.

Page 47: BioUML
Page 48: BioUML

%script for 'CellCycle_1991Gol' model simulation

%constants declaration
global Reaction1_vi Reaction2_kd Reaction4_K1 Reaction4_Kc Reaction4_VM1 Reaction5_K3 Reaction5_VM3 Reaction6_K2 Reaction6_V2 Reaction7_K4 Reaction7_V4
Reaction1_vi = 0.023
Reaction2_kd = 0.00333
Reaction4_K1 = 0.1
Reaction4_Kc = 0.3
Reaction4_VM1 = 0.5
Reaction5_K3 = 0.1
Reaction5_VM3 = 0.2
Reaction6_K2 = 0.1
Reaction6_V2 = 0.167
Reaction7_K4 = 0.1
Reaction7_V4 = 0.1

%Model rate variables and their initial values
y = []
y(1) = 0.0 % y(1) - $cytoplasm.C
y(2) = 0.0 % y(2) - $cytoplasm.EmptySet
y(3) = 0.0 % y(3) - $cytoplasm.M
y(4) = 0.0 % y(4) - $cytoplasm.X

%numeric equation solving
[t,y] = ode23('CellCycle_1991Gol_dy',[0 100],y)

%plot the solver output
plot(t,y(:,1),'-',t,y(:,2),'-',t,y(:,3),'-',t,y(:,4),'-')
title ('Solving Goldbeter problem')
ylabel ('y(t)')
xlabel ('x(t)')
legend('$cytoplasm.C','$cytoplasm.EmptySet','$cytoplasm.M','$cytoplasm.X');

Page 49: BioUML

Function to calculate dy/dt for the model

function dy = CellCycle_1991Gol_dy(t, y)
% Calculates dy/dt for 'CellCycle_1991Gol' model.

%constants declaration
global Reaction1_vi Reaction2_kd Reaction4_K1 Reaction4_Kc Reaction4_VM1 Reaction5_K3 Reaction5_VM3 Reaction6_K2 Reaction6_V2 Reaction7_K4 Reaction7_V4

% write rules to calculate some equation parameters
rateOfReaction1 = Reaction1_vi;
rateOfReaction4 = ((1 - y(3))*Reaction4_VM1*y(1))/((1 + Reaction4_K1 - y(3))*(Reaction4_Kc + y(1)));
rateOfReaction5 = (Reaction5_VM3*(1 - y(4))*y(3))/(1 + Reaction5_K3 - y(4));
rateOfReaction6 = (y(3)*Reaction6_V2)/(Reaction6_K2 + y(3));
rateOfReaction7 = (Reaction7_V4*y(4))/(Reaction7_K4 + y(4));
rateOfReaction2 = y(1)*Reaction2_kd;

% calculates dy/dt for 'CellCycle-1991Gol.xml' model
dy = [ + rateOfReaction1 - rateOfReaction2
 - rateOfReaction1 - rateOfReaction4 - rateOfReaction5 + rateOfReaction6 + rateOfReaction7 + rateOfReaction2
 + rateOfReaction4 - rateOfReaction6
 + rateOfReaction5 - rateOfReaction7]
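For readers without MATLAB, the generated derivative function can be mirrored in Python. This is a sketch for illustration only, not BioUML output. The constants are copied from the script above. The row boundaries of the dy vector are not visible in the flattened slide text, so the split used here (one row per variable, in the order C, EmptySet, M, X) is an assumption.

```python
# Constants copied from the generated MATLAB script.
P = {
    "Reaction1_vi": 0.023, "Reaction2_kd": 0.00333,
    "Reaction4_K1": 0.1, "Reaction4_Kc": 0.3, "Reaction4_VM1": 0.5,
    "Reaction5_K3": 0.1, "Reaction5_VM3": 0.2,
    "Reaction6_K2": 0.1, "Reaction6_V2": 0.167,
    "Reaction7_K4": 0.1, "Reaction7_V4": 0.1,
}


def cell_cycle_dy(y):
    """dy/dt for y = [C, EmptySet, M, X], mirroring the MATLAB function."""
    c, _empty, m, x = y
    r1 = P["Reaction1_vi"]
    r2 = c * P["Reaction2_kd"]
    r4 = ((1 - m) * P["Reaction4_VM1"] * c) / (
        (1 + P["Reaction4_K1"] - m) * (P["Reaction4_Kc"] + c))
    r5 = (P["Reaction5_VM3"] * (1 - x) * m) / (1 + P["Reaction5_K3"] - x)
    r6 = (m * P["Reaction6_V2"]) / (P["Reaction6_K2"] + m)
    r7 = (P["Reaction7_V4"] * x) / (P["Reaction7_K4"] + x)
    return [
        r1 - r2,                       # C
        -r1 - r4 - r5 + r6 + r7 + r2,  # EmptySet (assumed row split)
        r4 - r6,                       # M
        r5 - r7,                       # X
    ]
```

With this split every rate appears once with each sign, so the four derivatives sum to zero, which is consistent with EmptySet acting as the source and sink for all reactions.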

Page 50: BioUML
Page 51: BioUML
Page 52: BioUML

Results of SBML semantic tests

Page 53: BioUML
Page 54: BioUML

BioModels: comparison of BioUML simulation results with other simulators

http://www.biouml.org/_biomodels/

Page 55: BioUML

Simulators comparison criteria

Passed – CSV file was generated by simulator

Interval criteria
- no difference: 0.999 * min < x < 1.001 * max, or x < ZERO and max < ZERO
- small difference: 0.5 * min < x < 1.5 * max
- significant difference: otherwise

Median criteria
- no difference: abs((x – median)/median) < 0.01, or x < ZERO and median < ZERO
- small difference: abs((x – median)/median) < 0.5
- significant difference: otherwise

x – variable value provided by the compared simulator
min, max, median – calculated from values provided by the other simulators with which the specified simulator is being compared.

Implementation note: if the result file was not generated by BioUML, the other simulators can still be compared with one another.
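The two criteria can be written down as small classification functions. This is a sketch, not the code used for the published comparison: the value of ZERO is not given on the slide, so the threshold below is an assumed placeholder, and the function names are invented for this example.

```python
from statistics import median

ZERO = 1e-20  # hypothetical threshold; the slide does not give its value


def interval_criterion(x, others):
    """Classify x against the [min, max] range of the other simulators."""
    lo, hi = min(others), max(others)
    if 0.999 * lo < x < 1.001 * hi or (x < ZERO and hi < ZERO):
        return "no difference"
    if 0.5 * lo < x < 1.5 * hi:
        return "small difference"
    return "significant difference"


def median_criterion(x, others):
    """Classify x by its relative deviation from the median of the others."""
    med = median(others)
    # The near-zero rule is checked first so that a zero median is never
    # used as a divisor.
    if x < ZERO and med < ZERO:
        return "no difference"
    if med == 0:
        return "significant difference"  # relative deviation is undefined
    deviation = abs((x - med) / med)
    if deviation < 0.01:
        return "no difference"
    if deviation < 0.5:
        return "small difference"
    return "significant difference"
```

Note that the interval bounds are applied exactly as stated on the slide; they are only meaningful for non-negative values, since scaling a negative minimum by 0.999 moves it in the opposite direction.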

Page 56: BioUML
Page 57: BioUML

BioUML Enterprise Edition: BioUML server

Page 58: BioUML

BioUML EE architecture

Client side:
- BioUML workbench with database module
- Web browser

Server side:
- Servlet container: Tomcat
- BioUML servlet
- JDBC DB module
- Lucene full text search engine
- BeanExplorer Enterprise Edition
- MySQL database (accessed via JDBC)

Page 59: BioUML

BMOND: Biological MOdels aNd Diagrams database (former name: Biopath)

Page 60: BioUML

BMOND system architecture

Client side:
- BioUML workbench with Biopath module
- Web browser

Server side:
- Servlet container: Tomcat
- BeanExplorer Enterprise Edition
- Biopath MySQL database (accessed via JDBC)

Page 61: BioUML

Figure 4. G1/S entry model (Kel et al., 2000) described using BioUML technology.

Page 62: BioUML
Page 63: BioUML

BMOND web interface: live demonstration

http://bmond.biouml.org

- Interface overview
- View diagrams
- View diagram components
- List of diagram components
- Categories (classification)
- Filter
- Dynamic columns
- Web forms for components editing

Page 64: BioUML

Part 2

New concepts and possibilities

Page 65: BioUML

Part 2: new concepts and possibilities

Overview
• Reconstruction as solitaire game
• Levels of biological information
• BioHub concept
• Composite database module
• Composite diagram
• Experiment concept
• Graphic notation editor
• Microarray data analysis

Live demonstration
• Loading database modules from server
• Text search
• Graph search
• Creating a composite database module
• Creating a composite diagram
• Experiment
• Graphic notation editor
• Microarray data analysis

Page 66: BioUML

Metaphor: biological systems reconstruction as solitaire (patience) game

Desk – BioUML editor

Solitaire – biological pathway

Cards – biological objects (genes, proteins, lipids, etc.)

Pack of cards – different biological databases

Page 67: BioUML

Levels of biological information

Level 1: Catalogs of biological objects
UniProt, Ensembl, ChEBI, GO

Level 2: Pathways, models (refers to level 1)
BMOND, GeneModels

Level 3: Problem specific (refers to the levels above; each database has a wiki)
- Cyclonet: leads, actions, targets
- LipidNet, classifications: lipids, genes
- UbiProt, classifications: E1, E2, E3, …

Main idea for data integration and pathway reconstruction:
- avoid information duplication
- classify components of biological pathways by levels
- each next level should refer to, but not duplicate, information from previous levels
- use free EBI databases whenever possible.

Page 68: BioUML

Add-on technology

This approach should help us to overcome difficulties with the use of external catalogs, either when the external catalog does not contain a needed entity (for example, a gene or substance) or when we would like to add some information to an existing entity description. Example for BMOND2, gene: a special table allows us to add a new entity to BMOND2 if that entity is missing in the corresponding external catalog.

[Diagram. Components shown:]
- Gene catalog (Ensembl) and Gene add-on table
- Add-on data: synonyms, description, DB references, literature references, classification
- SQL query
- BioUML: Java object
- BeanExplorer: web interface
- Lucene: document
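The add-on idea can be sketched as a lookup that merges a read-only external catalog with locally stored add-on rows. The record layout, identifiers and function name below are invented for this illustration; in BMOND2 the merge is described as being done through SQL tables, which this sketch replaces with plain dictionaries.

```python
# Hypothetical records: one gene from the external catalog, and add-on
# rows stored locally. Identifiers and values are invented for illustration.
external_catalog = {
    "GENE0001": {"description": "example gene", "synonyms": ["g1"]},
}
addon_table = {
    "GENE0001": {"synonyms": ["gene-one"], "classification": "kinase"},
    "GENE9999": {"description": "entity missing from the catalog"},
}


def lookup(gene_id):
    """Return the catalog entry merged with add-on data, or None if unknown."""
    base = external_catalog.get(gene_id)
    extra = addon_table.get(gene_id)
    if base is None and extra is None:
        return None
    # Copy list fields so the external catalog is never modified.
    merged = {key: list(v) if isinstance(v, list) else v
              for key, v in (base or {}).items()}
    for key, value in (extra or {}).items():
        if isinstance(value, list):
            merged[key] = merged.get(key, []) + value  # extend list fields
        else:
            merged[key] = value                        # add or override scalars
    return merged
```

Both cases from the text are covered: an entity that exists only in the add-on table is still returned, and an existing catalog entry gains extra fields without the catalog itself being changed.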

Page 69: BioUML

BioHub

Page 70: BioUML

BioHub concept

• BioHub – an approach to linking information from different databases. Main usage:
– binding microarray (omics) data to pathway diagrams
– graph search
– DBReferences editor
– microarray (omics) data analysis

• Follows the MIRIAM standard:
– References to database objects
– Relationships between biological objects

• Simple Java API

Page 71: BioUML

BioHub structure

Entities: DB_ID, version, ID, AC, species, description, key words

Relations: DB_ID_1, DB_version_1, ID_1, DB_ID_2, DB_version_2, ID_2, relation, evidence, comment

Databases: DB_ID, name, description, URL, url_patern_ID, url_patern_AC

RelationTypes: relation, description, backwardRelation, comment

RelationInfo: DB_ID_1, DB_ID_2, relation, comment
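A reduced version of this structure can be tried out with an in-memory SQLite database. The sketch keeps only a subset of the listed columns, assumes text types for all of them, and uses invented rows; it does not reproduce BioHub's actual schema or its Java API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Databases (DB_ID TEXT PRIMARY KEY, name TEXT, description TEXT, URL TEXT);
CREATE TABLE Entities (DB_ID TEXT, version TEXT, ID TEXT, AC TEXT,
                       species TEXT, description TEXT);
CREATE TABLE Relations (DB_ID_1 TEXT, ID_1 TEXT, DB_ID_2 TEXT, ID_2 TEXT,
                        relation TEXT, evidence TEXT, comment TEXT);
""")

# Hypothetical rows: database names, identifiers and the relation are invented.
conn.executemany("INSERT INTO Databases VALUES (?, ?, ?, ?)", [
    ("dbA", "Gene catalog A", "example catalog", "http://example.org/a"),
    ("dbB", "Protein catalog B", "example catalog", "http://example.org/b"),
])
conn.executemany("INSERT INTO Entities VALUES (?, ?, ?, ?, ?, ?)", [
    ("dbA", "1", "G1", "AC1", "human", "example gene"),
    ("dbB", "1", "P1", "AC2", "human", "example protein"),
])
conn.execute("INSERT INTO Relations VALUES (?, ?, ?, ?, ?, ?, ?)",
             ("dbA", "G1", "dbB", "P1", "encodes", "experimental", None))


def linked(db_id, entity_id):
    """Return (DB_ID_2, ID_2, relation) for entities linked from the given one."""
    rows = conn.execute(
        "SELECT DB_ID_2, ID_2, relation FROM Relations "
        "WHERE DB_ID_1 = ? AND ID_1 = ? ORDER BY DB_ID_2, ID_2",
        (db_id, entity_id),
    )
    return rows.fetchall()
```

A query of this kind is the basic step behind the listed uses: given an identifier from one database (for example a microarray probe annotation), find the corresponding objects in another.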

Page 72: BioUML

Linking with experimental data and results of analysis

Level 1: Catalogs of biological objects
UniProt, Ensembl, ChEBI, GO

Level 2: Pathways, models (refers to level 1)
BMOND, GeneModels

Level 3: Problem specific (refers to the levels above; each database has a wiki)
- Cyclonet: leads, actions, targets
- LipidNet, classifications: lipids, genes
- UbiProt, classifications: E1, E2, E3, …

Experimental data, results of analysis (linked through BioHUB)
- OMICS data
- Results of analysis
- MSigDB, GeneAtlas, NCI60

Page 73: BioUML

Linking with external databases

Level 1: Catalogs of biological objects
UniProt, Ensembl, ChEBI, GO

Level 2: Pathways, models (refers to level 1)
BMOND, GeneModels

Level 3: Problem specific (refers to the levels above; each database has a wiki)
- Cyclonet: leads, actions, targets
- LipidNet, classifications: lipids, genes
- UbiProt, classifications: E1, E2, E3, …

Experimental data, results of analysis (linked through BioHUB)
- OMICS data
- Results of analysis
- MSigDB, GeneAtlas, NCI60

External databases:
- KEGG
- LipidMap, LipidBank
- Reactome, …

Page 74: BioUML

Coloring diagram according to microarray data. Each bar corresponds to one value from the corresponding microarray series.

Page 75: BioUML

Coloring diagram according to omics data

Page 76: BioUML

BioHub usage: graph search engine

Page 77: BioUML

Composite database module

Flash movie: XML_module.exe

Page 78: BioUML

Composite database module

A composite database module is defined formally as an XML document. It allows one to:
• specify dependencies on other database modules
• specify data types that can be used from external database modules
• describe dynamic properties for add-on technology
• specify what dynamic properties can be added to data types from external modules. This information is stored in the local module and merged dynamically with information from external modules. In this way a user can add information to external catalogs like Ensembl, UniProt, etc.
• specify data types used by the local module
• specify diagram types used by the local module
• specify the QueryEngine

Page 79: BioUML

DTD

<!ELEMENT dbModule (jdbcConnection, properties?, dependencies?, types?)>

<!ATTLIST dbModule
    name            CDATA #REQUIRED
    title           CDATA #REQUIRED
    description     CDATA #IMPLIED
    version         CDATA "0.8.0"
    type            (text|SQL) #IMPLIED
    databaseType    CDATA #IMPLIED
    databaseVersion CDATA #IMPLIED
    databaseName    CDATA #IMPLIED >

<!ELEMENT jdbcConnection EMPTY>
<!ATTLIST jdbcConnection
    name            CDATA #REQUIRED
    jdbcDriverClass CDATA #REQUIRED
    jdbcURL         CDATA #REQUIRED
    jdbcUser        CDATA #IMPLIED
    jdbcPassword    CDATA #IMPLIED >

Page 80: BioUML

<!-- ================================================================ -->
<!-- Properties - definition of properties for all types of diagram   -->
<!-- elements used by the graphic notation.                           -->
<!--                                                                  -->
<!-- Possible property types:                                         -->
<!--   - simple types: boolean, int, double, String                   -->
<!--   - array                                                        -->
<!--   - composite                                                    -->
<!-- ================================================================ -->
<!ELEMENT properties (property*)>
<!ELEMENT property (tags?)>
<!ATTLIST property
    name              CDATA #REQUIRED
    type              CDATA #REQUIRED
    short-description CDATA #IMPLIED
    value             CDATA #IMPLIED >
<!ELEMENT tags (tag+)>
<!ELEMENT tag EMPTY>
<!ATTLIST tag
    name  CDATA #REQUIRED
    value CDATA #IMPLIED >
<!ELEMENT propertyRef EMPTY>
<!ATTLIST propertyRef
    name  CDATA #REQUIRED
    value CDATA #IMPLIED >

Page 81: BioUML

<!-- ================================================================ -->
<!-- Dependencies on other databases and modules                      -->
<!-- Graphic notations can be defined in the specialized module       -->
<!-- ================================================================ -->

<!ELEMENT dependencies (dbModule*, graphicNotation*)>
<!ELEMENT dbModule (externalType+)>
<!ATTLIST dbModule
    name CDATA #REQUIRED >
<!ELEMENT externalType (propertyRef*)>
<!ATTLIST externalType
    name     CDATA #REQUIRED
    readOnly (true|false) #IMPLIED >
<!ELEMENT graphicNotation EMPTY>
<!ATTLIST graphicNotation
    name  CDATA #REQUIRED
    type  (Java|XML) #IMPLIED
    class CDATA #IMPLIED
    path  CDATA #IMPLIED >

Page 82: BioUML

<!-- ================================================================ -->
<!-- Internal data types for this module                              -->
<!-- Description of internal type should provide all information to   -->
<!-- create corresponding DataCollection                              -->
<!-- ================================================================ -->

<!ELEMENT types (internalType*)>
<!ELEMENT internalType (querySystem, propertyRef*)>
<!ATTLIST internalType
    section     CDATA #REQUIRED
    name        CDATA #REQUIRED
    class       CDATA #REQUIRED
    transformer CDATA #REQUIRED >
<!ELEMENT querySystem (index*)>
<!ATTLIST querySystem
    class         CDATA #REQUIRED
    luceneIndexes CDATA #IMPLIED >
<!ELEMENT index EMPTY>
<!ATTLIST index
    class CDATA #REQUIRED
    table CDATA #IMPLIED >
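To show what a document following this DTD might look like, here is a small hypothetical module description and a few lines that read its dependencies. Element and attribute names follow the DTD; every value (module name, driver class, JDBC URL, type names) is a placeholder invented for this sketch, and the parsing uses Python's standard library rather than any BioUML code.

```python
import xml.etree.ElementTree as ET

# Hypothetical module description; all attribute values are placeholders.
MODULE_XML = """
<dbModule name="example" title="Example module" version="0.8.0" type="SQL">
  <jdbcConnection name="local" jdbcDriverClass="org.example.Driver"
                  jdbcURL="jdbc:example://localhost/example"/>
  <dependencies>
    <dbModule name="Ensembl">
      <externalType name="gene" readOnly="true">
        <propertyRef name="comment"/>
      </externalType>
    </dbModule>
  </dependencies>
</dbModule>
"""


def external_types(xml_text):
    """List (module, type, read-only flag) for every external data type."""
    root = ET.fromstring(xml_text)
    found = []
    for module in root.findall("./dependencies/dbModule"):
        for ext in module.findall("externalType"):
            found.append((module.get("name"), ext.get("name"),
                          ext.get("readOnly") == "true"))
    return found
```

The nested dbModule element under dependencies names an external module, and each externalType with its propertyRef children says which external data types are used and which dynamic properties the local module adds to them.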

Page 83: BioUML

Editor for composite database module

Page 84: BioUML

Editor for composite database module

Page 85: BioUML

Editor for composite database module

Page 86: BioUML

Editor for composite database module

Page 87: BioUML

Current status

Implemented:
• Database modules (initial version): Ensembl, UniProt, ChEBI, GO, IntAct, Reactome, BioModels
• Composite module (external references)
– Defined as XML
– Composite module editor
• Selecting and loading modules from server

In progress:
• BioHUB
• Protein state concept
• Add-on technology
• BMOND2 – redesigned version of BMOND.

Page 88: BioUML

From huge theory to practical output

Automated language translation
Practical output:
• electronic dictionaries
• spell checkers

Biological data integration
Practical output:
• catalogs (Ensembl, UniProt, ChEBI)
• controlled vocabularies, ontologies
• hubs

Page 89: BioUML

Model composition

Page 90: BioUML

Composite diagram: main concepts

[Diagram. Elements shown:]
- block (EModel): dx/dt = f1, dy/dt = f2, z = f3
- block 2 (EModel): dx/dt = f5, dy/dt = f6 + z, k + z + f4 = 0
- block 3 (EModel): dx/dt = f5, dy/dt = f6 + block2.k, k + z + f4 = 0
- subdiagram (EModel) with reactions R over species s1, s2, s3, s4 and element e
- connections between the x and y variables of the blocks, one of them with a transformation function f(x)
- labels on the connections: indirect link; forbidden
- direct participation of a subdiagram element in a reaction

Block types:
1) block – only mathematical equations. Used mainly for physiological models;
2) subdiagram – another diagram.

Connection types:
1) directed – from input to output. A transformation function can be used;
2) undirected – contact. Indicates that two nodes in the model are the same entity.

Semantic constraints:
There are semantic constraints, for example: a block can have only one input for each variable. Two inputs for the same variable are forbidden.

Flat model:
Before MATLAB or Java code generation, the composite model is transformed into a flat model and the usual generation routines are used.
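One part of the flattening step can be sketched as a renaming problem: every block-local variable needs a unique name in the flat model, and variables joined by an undirected connection must end up with the same name. The function below is an assumption-laden illustration (the prefixing scheme, the data layout and the use of a union-find structure are choices made here, not BioUML's algorithm), and it does not cover directed connections with transformation functions.

```python
def flatten(blocks, contacts):
    """Map every block-local variable to a name in the flat model.

    blocks   -- {block name: [local variable names]}
    contacts -- undirected connections, pairs of (block, variable) that
                denote the same entity and therefore share one flat name
    """
    parent = {(b, v): (b, v) for b, names in blocks.items() for v in names}

    def find(node):
        # Follow parent links up to the representative of the group.
        while parent[node] != node:
            node = parent[node]
        return node

    for left, right in contacts:
        root_l, root_r = find(left), find(right)
        if root_l != root_r:
            # The smaller tuple becomes the representative, so the
            # resulting names do not depend on the order of contacts.
            parent[max(root_l, root_r)] = min(root_l, root_r)

    return {node: "{}_{}".format(*find(node)) for node in parent}


# Hypothetical composite model with two blocks sharing the variable x.
blocks = {"block1": ["x", "y", "z"], "block2": ["x", "y", "k"]}
contacts = [(("block1", "x"), ("block2", "x"))]
flat_names = flatten(blocks, contacts)
```

Here both x variables map to block1_x, while the two y variables stay distinct as block1_y and block2_y; the flat equations can then be emitted with these names by the ordinary code generator.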

Page 91: BioUML

Experiment

Page 92: BioUML

Experiment

To perform a virtual experiment, it is frequently necessary to modify the initial model.

Typical modifications (changes) are:
• changing initial values
• changing model parameters to imitate different conditions or mutations
• deleting some model elements to imitate knock-out mutations
• adding events to imitate external influences on the model

To avoid duplicating the model for each virtual experiment, we introduce the “changes” concept.
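The "changes" idea can be illustrated as a list of modifications that is stored separately and applied to a copy of the base model only when an experiment is run. The model layout, the change kinds and all names below are hypothetical; the slides do not describe how BioUML represents changes internally.

```python
import copy

# Hypothetical base model; the structure is invented for this sketch.
base_model = {
    "initial_values": {"A": 100.0, "B": 0.0},
    "parameters": {"k1": 0.5},
    "elements": ["A", "B", "R1"],
    "events": [],
}


def apply_changes(model, changes):
    """Return a modified copy of the model; the base model stays untouched."""
    result = copy.deepcopy(model)
    for kind, target, value in changes:
        if kind == "set_initial":
            result["initial_values"][target] = value
        elif kind == "set_parameter":
            result["parameters"][target] = value
        elif kind == "delete":
            result["elements"].remove(target)
        elif kind == "add_event":
            result["events"].append((target, value))
        else:
            raise ValueError(f"unknown change: {kind}")
    return result


# One virtual experiment: a knock-out expressed only as a list of changes.
knockout = [("set_parameter", "k1", 0.0), ("delete", "R1", None)]
```

Each experiment then costs only its list of changes, and many experiments can share one base model.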

Page 93: BioUML
Page 94: BioUML
Page 95: BioUML

Graphic notation

formal definition as XML document

http://www.biouml.org/sbgn.shtml
Flash movie: Graphic_Notations_Editor.exe

Page 96: BioUML

Graphic notation versus graph layout

• allows editing a diagram
• allows creating a new diagram
• different graphic notations can be applied to the same SBML model
• allows SBGN to be defined formally and used in SBML models
• allows a graphic notation to be reused by many tools

Page 97: BioUML

A graphic notation can be defined formally as an XML document with the following sections:

• properties – formal definition of properties that can be used as properties of nodes and edges (for example, title, multimer, etc.). The definition of a property includes:
– name
– type
– short description
– controlled vocabulary (optional)

• node types – the definition of a node includes:
– name
– icon
– properties
– view function (JavaScript)
– short description

• edge types – the definition of an edge includes:
– name
– icon
– properties
– view function (JavaScript)
– short description

• semantic controller – defines rules for semantic control of diagram integrity. For this purpose it defines the following functions:
– canAccept (JavaScript)
– isResizable (JavaScript)
– move (JavaScript)

• examples – a set of diagrams that can be used as test cases, legend and examples for the graphic notation. DML (Diagram Markup Language) is used for this purpose.

Page 98: BioUML

Basic software architecture for rendering of biological models according to specified graphic notation and layout information

[Diagram. Components shown:]
- Initial data: SBML, BioPAX, …
- Diagram
- Model API; JavaScript API for data access
- Layout information (similar to the SBML layout extension); Layout API
- Graphic notation; Notation API; JavaScript functions: build node/edge view, semantic control
- Rendering engine
- Rendering API: JavaScript API for creating primitives

Page 99: BioUML

Formal definition of graphic notation as XML document and integration with SBML format

Graphic notation components     Defined as (SBML)
Object types                    XML <annotations>
Object properties               XML <annotations>
User defined properties         XML <annotations>
Rules for visualization         JavaScript
Rules for semantic control      JavaScript
Test cases                      XML model, module

Page 100: BioUML

Graphic notation editor: main concepts

• a graphic notation is defined formally as an XML document
• the graphic notation editor provides a user-friendly interface for editing the XML document
• an SBGN graphic notation (prototype) is implemented
• BioUML workbench allows creating and editing diagrams using a graphic notation defined as an XML document
• the graphic notation editor may be useful to the SBGN community for:
– improving the SBGN specification
– testing the SBGN specification by creating different diagrams

Details: http://www.biouml.org/sbgn.shtml

Page 101: BioUML

BioUML workbench

Select the ‘Data’ tab to see the list of available graphic notations

Page 102: BioUML

Right-click the selected graphic notation to open it in the Graphic Notation Editor

Page 103: BioUML

Graphic Notation Editor

Main sections of the formal definition of a graphic notation

Page 104: BioUML

List of specific properties that are used by the graphic notation

Properties editor

Page 105: BioUML

The user can right-click the Properties node to create a new property

Page 106: BioUML

Nodes – contains the list of all node types used by the graphic notation

Page 107: BioUML

For each node type the user can define:
- name
- properties
- icon
- view function (JavaScript)

Page 108: BioUML

By right-clicking “Nodes” the user can create a new node type

Page 109: BioUML

In the same way the user can define an edge type:
- name
- properties
- icon
- view function (JavaScript)

Page 110: BioUML

The “Examples” node contains a set of diagrams that demonstrate usage of the graphic notation.

Page 111: BioUML

The user can create and edit such diagrams.

Page 112: BioUML

When the user selects an element on the diagram, the following can be edited:
- object properties
- JavaScript that builds a view for the selected diagram element

Page 113: BioUML

The “Semantic controller” node contains a list of JavaScript functions that provide semantic constraints and semantic integrity of the diagram.

Page 114: BioUML

Graphic notation defined as XML document can be used by BioUML workbench to create corresponding diagram.

Page 115: BioUML
Page 116: BioUML

Graphic Notation Editor

SBGN examples created in BioUML

Page 117: BioUML
Page 118: BioUML
Page 119: BioUML
Page 120: BioUML
Page 121: BioUML
Page 122: BioUML

Skins

Page 123: BioUML
Page 124: BioUML
Page 125: BioUML

Microarray plug-in (alpha version)

Page 126: BioUML

Microarray plug-in

- Import microarray data in tab delimited format
- Show data as a table
- Filter data by different criteria
- Microarray data analysis
  - Revealing up/down regulated genes
  - Meta-analyses
- Binding with diagram nodes by ID
- Coloring diagrams
- JavaScript functions
  - Data manipulation (filter, join, intersect, trim, etc.)
  - Statistical analysis

Page 127: BioUML

Microarray plug-in

Current work:
- Powerful user interface for coloring diagrams
- Support of other formats for microarray data and results of analyses
- Sophisticated binding algorithm using different database references and IDs (gene hub)

Further work:
- Server module that will provide access to ArrayExpress data

Page 128: BioUML

BioUML workbench

The Data tab contains the section “Microarray”. The user can import microarray data in tab delimited format into this section.

Page 129: BioUML
Page 130: BioUML
Page 131: BioUML

Possibility to filter probe sets:
- by column values
- by selecting only those probe sets that can be linked to the specified diagram

Page 132: BioUML

Microarray analysis

Page 133: BioUML
Page 134: BioUML

Coloring diagram according to microarray data. Each bar corresponds to one value from the corresponding microarray series.

Page 135: BioUML

Coloring diagram according to omics data

Page 136: BioUML

Further development:

Protein state

Page 137: BioUML

BioUML workbench: further development

• Protein states
• Complexes
• Improving team work on annotation
– Login, single sign-on
– Editing history (what data were modified, by whom and when)
– Passing changes from server to client
• Sequence analysis and visualization
• Agent-based modeling

Page 138: BioUML

Protein state

Page 139: BioUML

Modification

• The functions of macromolecular entities (mainly proteins) are often determined not only by their primary sequences, but also by the chemical modifications they have undergone.

• In BMOND2, unmodified and modified forms of a protein refer to the same entity in the UniProt database.

• The list of possible modifications is extracted from the UniProt Feature Table.

• BMOND2 modifications table – allows describing modifications that are not described in UniProt. These modifications are automatically added to the protein referred to from BMOND2.

• Modification type – controlled vocabulary that describes possible modification types (for example, phosphorylation, acetylation, ubiquitination).

• To take protein modifications into account, the State concept is used.

Page 140: BioUML

UniProt Feature Table

FT CHAIN 1 561 Cytosolic purine 5'-nucleotidase.
FT /FTId=PRO_0000064389.
FT REGION 202 210 Substrate binding (Potential).
FT COMPBIAS 549 561 Asp/Glu-rich (acidic).
FT ACT_SITE 52 52 Nucleophile.
FT ACT_SITE 54 54 Proton donor.
FT METAL 52 52 Magnesium.
FT METAL 54 54 Magnesium (via carbonyl oxygen).
FT METAL 351 351 Magnesium.
FT BINDING 127 127 Allosteric activator 1.
FT BINDING 154 154 Allosteric activator 2.
FT BINDING 354 354 Allosteric activator 2.
FT BINDING 436 436 Allosteric activator 1; via carbonyl
FT oxygen.
FT BINDING 453 453 Allosteric activator 2.
FT MOD_RES 527 527 Phosphoserine (By similarity).
FT VARIANT 3 3 T -> A (in dbSNP:rs10883841).
FT /FTId=VAR_024244.
FT VARIANT 136 136 Q -> R (in dbSNP:rs12262171).
FT /FTId=VAR_030242.
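Extracting modifications from such a feature table amounts to picking out the MOD_RES lines. The sketch below works on whitespace-separated lines as they appear in this transcript, which is a simplification: the original UniProt flat file uses fixed column positions, and the function name and sample constant are invented for this example.

```python
# Three feature lines copied from the slide, one per line.
FEATURE_LINES = """\
FT MOD_RES 527 527 Phosphoserine (By similarity).
FT VARIANT 3 3 T -> A (in dbSNP:rs10883841).
FT ACT_SITE 52 52 Nucleophile.
"""


def modified_residues(text):
    """Return (position, description) for every single-residue MOD_RES line."""
    result = []
    for line in text.splitlines():
        # Split into: line code, feature key, start, end, description.
        fields = line.split(None, 4)
        if len(fields) == 5 and fields[0] == "FT" and fields[1] == "MOD_RES":
            start, end = int(fields[2]), int(fields[3])
            if start == end:
                result.append((start, fields[4]))
    return result
```

For the sample above this yields one modification at position 527; the "(By similarity)" suffix in the description corresponds to the evidence levels that a modifications table would need to record.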

Page 141: BioUML

Modification

• position
• amino acid
• modification type (controlled vocabulary)
• evidence: experimental, by similarity, predicted
• comment
• publication reference

Page 142: BioUML

State concept

• State – describes the states of all amino acids available for modification
• possible values:
– ? – unknown, not specified
– * – any
– - – unmodified
– p – phosphorylated
– ac – acetylated
– … – from controlled vocabulary

• Protein states are described in the BMOND2 states table
• Reaction – user should specify the protein state
• Diagram – user should specify the protein state

Page 143: BioUML

State table

• module (database)

• id

• state – short name (like TRANSPATH)

• position

• modification

Page 144: BioUML

SBGN

Mapping: BMOND2 -- SBGN
modification – state variable
state – state of macromolecule

Page 145: BioUML

Complex concept

Page 146: BioUML

Complex concept

• A complex is a biochemical entity composed of other biochemical entities, whether macromolecules, small molecules, multimers, or themselves complexes.

• A complex is specified as a set of units

• Complex modifications
– all possible modifications of its units (some of them cannot occur due to physical interactions between units; how can we take this into account?)

• Complex state
– var. 1 – list of modifications for its subunits
– var. 2 – list of states for its units

Page 147: BioUML

Complex tables

• Complex
– ID
– title (short name)
– complete name
– species
– synonyms
– comment

• References:
– States
– Synonyms
– Structure
– DBReferences
– Publications

• Complex Units
– complexDB
– complexID
– unitDB
– unitID
– multimer

Page 148: BioUML

SBGN

Page 149: BioUML

Reaction

• Reaction components
– component identification
• DB
• id
• [state]
• [compartment]

• Reaction
– [compartment]

• Reaction dialog
– species state
– species compartment
– reaction compartment

• Tables
– Reaction
• compartment
– Reaction components
• state
• compartment

Page 150: BioUML

Diagrams

• Macromolecule state
– “New diagram element” dialog

• Graphic notation
– BioUML
• states – right label, one modification
• complexes
– SBGN skin

Page 151: BioUML

Acknowledgements

Part of this work was supported by the following grants:
• European Committee grant №037590 “Net2Drug”
• Siberian Branch of Russian Academy of Sciences (interdisciplinary projects № 46)
• Volkswagen-Stiftung (I/75941)
• INTAS Nr. 03-51-5218
• RFBR Nr. 04-04-49826-а

The author is grateful to Alexander Kel and Sergey Zhatchenko for useful comments, discussions and technical support.

Software developers and annotators: Nikita Tolstyh, Mikhail Puzanov, Ruslan Sharipov, Sergey Lapukhov, Ilya Kiselev, Ivan Yevshin, Alexander Magdysyuk, Denis Ryumin, Elena Cheremushkina, Vlad Zhvaleev, Alexandr Koshukov, Ekaterina Kalashnikova, Vasiliy Hudyakov, Igor Tyazhev, Sergey Graschenko, Oleg Onegov