Affymetrix/BioCarta comparison & Java-based pathway analysis Michael Edmonson 2/26/2003.


Page 1:

Affymetrix/BioCarta comparison & Java-based pathway analysis

Michael Edmonson <[email protected]>

2/26/2003

Page 2:

Goals

• Create programmable models of BioCarta pathway gene interaction networks

• Encode “rules” of known gene interactions in software

• Create association between available experimental assays (microarrays) and pathway elements

• Populate model with experimental data and compare with expected states

Page 3:

BioCarta pathway example

Page 4:

Basic uses of model

• Static state diagram

• Dynamic system

Page 5:

Static/state-based modeling

• Load model with static “snapshot” or state data taken from microarray experiment

• With data from normal tissues, use resulting state to validate model (is the data consistent with the rules of the model?)

• With cancerous data, see if state of the model can be explained by “broken” logic: detect breakdowns in normal gene function and attempt to backtrace failures to first causes
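
As a rough illustration of the validation idea above, the following Java sketch checks a present/absent snapshot against a list of pairwise rules and reports every contradiction. The Rule and SnapshotValidator classes and the gene labels A, B, C are hypothetical and not taken from the original code; real pathway rules are likely richer than "source on implies target on/off".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical rule: when `source` is on, `target` is expected to be
// on (activates == true) or off (activates == false).
final class Rule {
    final String source;
    final String target;
    final boolean activates;

    Rule(String source, String target, boolean activates) {
        this.source = source;
        this.target = target;
        this.activates = activates;
    }
}

final class SnapshotValidator {
    // Returns one message per rule that the snapshot contradicts.
    // Genes missing from the snapshot are skipped rather than guessed.
    static List<String> violations(List<Rule> rules, Map<String, Boolean> snapshot) {
        List<String> out = new ArrayList<>();
        for (Rule r : rules) {
            Boolean src = snapshot.get(r.source);
            Boolean dst = snapshot.get(r.target);
            if (src == null || dst == null || !src) {
                continue; // a rule only constrains its target while the source is on
            }
            if (dst != r.activates) {
                out.add(r.source + (r.activates ? " activates " : " blocks ")
                        + r.target + " but " + r.target + " is " + (dst ? "on" : "off"));
            }
        }
        return out;
    }
}
```

An empty result on normal-tissue data would suggest the rules and data agree; a non-empty result on tumor data gives candidate starting points for backtracing.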

Page 6:

Dynamic modeling

• Integration of code with higher-level applications

• Model will be a working system whose state changes over a period of time

• Systematic/programmatic exploration of effects of arbitrary changes in the model’s state

• Explore interconnections between pathways

Page 7:

Source data

Page 8:

Fundamentals

• Functionality of model will be dictated by data used to populate it

• Need to connect BioCarta pathways with Affymetrix assays

– Desirable to automatically maintain mapping as new data becomes available

• Web-based chip/pathway browsing tools

Page 9:

Available BioCarta data

• List of pathway names and genes contained within them

• Graphic-only pathway diagrams (no annotations of relationships between pathway elements)

• Not computable

Page 10:

Available Affymetrix data

Existing database tables (Bob Clifford et al.):

Table                     Contents
rflp.affychip             Probe information by chip
rflp.affy_test            Experimental data (Leslie Derr et al.)
rflp.affy_seq             Probe sequence
rflp.affy2ug              UniGene cluster mapping (static)
clifforr.affy_tissue      Tissue code table
clifforr.affy_histology   Histology code table
clifforr.affy_sample      Sample information

Page 11:

New database tables: BioCarta

Table               Contents
biocarta_pathways   Name, description
biocarta_genes      Name, gene list
biocarta_keyword    Keywords from name, genes

• derived from CGAP flatfile

• RFLP database on LPG server

Page 12:

New tables: BioCarta to Affy

Table                 Contents
Affyacc2gene          Translates affy probe accessions to gene symbols via UniGene
Affy_pathway          For each pathway and chip, count and percentage of genes present
Affy_pathway_gene     Detail of present/absent genes for each pathway and chip
Affy_biocarta_basis   UniGene build used for mapping

• “pathway” bot keeps tables updated with each new UniGene build

• revisions needed: UniGene clustering issues, ambiguous probes, etc.
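
The count and percentage stored per pathway and chip can be pictured with a short Java sketch. This is only an assumption about how such a figure might be computed once probe accessions have been translated to gene symbols; the PathwayCoverage class and the gene labels used with it are hypothetical, and the real tables are maintained in the database by the "pathway" bot.

```java
import java.util.Set;

// Hypothetical helper mirroring the count/percentage columns
// described for the Affy_pathway table.
final class PathwayCoverage {
    final int present; // pathway genes with at least one probe on the chip
    final int total;   // all genes listed for the pathway

    PathwayCoverage(int present, int total) {
        this.present = present;
        this.total = total;
    }

    // Percentage of pathway genes present on the chip (0 for an empty pathway).
    double percent() {
        return total == 0 ? 0.0 : 100.0 * present / total;
    }

    // chipGenes: gene symbols reachable from the chip's probes via UniGene.
    static PathwayCoverage of(Set<String> pathwayGenes, Set<String> chipGenes) {
        int present = 0;
        for (String gene : pathwayGenes) {
            if (chipGenes.contains(gene)) {
                present++;
            }
        }
        return new PathwayCoverage(present, pathwayGenes.size());
    }
}
```

The genes counted as absent here correspond to the per-gene detail that Affy_pathway_gene records.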

Page 13:

Chip/pathway browser: affy2biocarta

Page 14:

affy2biocarta

• http://lpgfs.nci.nih.gov:82/perl/affy2biocarta

• Frontend to database; details how well pathways are covered by individual chips

• Searchable by gene, pathway or chip

• Master report for each pathway of best chip to use

• Ability to search for probes for missing genes on other chips

Page 15:

affy2biocarta: top-level

Page 16:

affy2biocarta: pathway selector

Page 17:

affy2biocarta: pathway/chip selector

Page 18:

affy2biocarta: gene detail

• Puzzlements: multiple sequences, missing entry

Page 19:

affy2biocarta: “missing” gene search

• Note probes were found on an earlier chip!

Page 20:

Omissions in chip revisions

• HG-U133A generally has the most complete pathway coverage

• However, for 45 genes in BioCarta pathways no matching probe accessions could be found

• Of these 45:

– 32 (71%) were found in Hs.127 (which predates 133 set)

– 36 (80%) were found on other chips

Page 21:

Multiple sequences/probes for same gene

• A single pathway element (gene) may have multiple probes/sequences representing it

• The states reported by these probes often do not all agree in the expression data

• Relationship between probes and BioCarta elements needs clarification
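
To make the disagreement problem concrete, here is a small Java sketch that groups present/absent calls by gene and lists the genes whose probes conflict. ProbeCall and ProbeAgreement are hypothetical names, and reducing each probe to one boolean call is an assumption; it says nothing about how the conflict should be resolved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical record of one probe's present/absent call for a gene.
final class ProbeCall {
    final String probeId;
    final String gene;
    final boolean present;

    ProbeCall(String probeId, String gene, boolean present) {
        this.probeId = probeId;
        this.gene = gene;
        this.present = present;
    }
}

final class ProbeAgreement {
    // Returns, sorted by symbol, the genes for which at least one probe
    // reports "present" and at least one reports "absent".
    static List<String> conflictingGenes(List<ProbeCall> calls) {
        // flags[0]: some probe absent; flags[1]: some probe present
        Map<String, boolean[]> seen = new TreeMap<>();
        for (ProbeCall c : calls) {
            boolean[] flags = seen.computeIfAbsent(c.gene, k -> new boolean[2]);
            flags[c.present ? 1 : 0] = true;
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, boolean[]> e : seen.entrySet()) {
            if (e.getValue()[0] && e.getValue()[1]) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```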

Page 22:

Expression data with disagreements

Often not a 1:1 relationship between Affymetrix probes and pathway entries...

Page 23:

Pathway interconnections

• Many genes appear in multiple pathways, a few appear in many

• Concept of “connectome”, a.k.a. “furball”

• Potential for indirect feedback from greater system (no pathway is an island)

• Difficult to explore in detail without database of connections
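
A first step toward a database of connections could be an inverted index from genes to the pathways that contain them. The Java sketch below assumes the pathway gene lists are already in memory as a map; PathwayIndex and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class PathwayIndex {
    // Inverts pathway -> genes into gene -> pathways.
    static Map<String, Set<String>> byGene(Map<String, List<String>> pathwayGenes) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : pathwayGenes.entrySet()) {
            for (String gene : e.getValue()) {
                index.computeIfAbsent(gene, k -> new TreeSet<>()).add(e.getKey());
            }
        }
        return index;
    }

    // Genes that appear in more than one pathway, sorted by symbol.
    static List<String> sharedGenes(Map<String, List<String>> pathwayGenes) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : byGene(pathwayGenes).entrySet()) {
            if (e.getValue().size() > 1) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```

Sorting the index by set size would surface the few genes that appear in many pathways.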

Page 24:

Genes in multiple pathways

Page 25:

Java modeling

Page 26:

Implementation: Java

• OOP

• Pathways are completely encapsulated in objects which can be embedded in higher-level programs

– Programmatic control of node and connection states

• Simple classes representing elements in pathway and connections between them

– Nodes, Connections, Complexes

– ability to propagate signals around the network

Page 27:

Node

• A discrete component in the network: usually a gene but can be any event which can affect the system (contact inhibition, etc.)

• Each node has a state, which is currently binary (on or off)

– Binary states resemble “present/absent” expression data, but this highlights contention/deadlocking problem

• Contains incoming and outgoing connections to other nodes in the network
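
One possible reading of the contention problem is that a binary node has no middle ground when one incoming connection asks for "on" and another asks for "off". The Java sketch below illustrates that reading only; SketchNode, request() and settle() are hypothetical, the connection lists are left out for brevity, and the original classes may handle this differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node: a name, a binary state, and the states most
// recently requested by its incoming connections.
final class SketchNode {
    final String name;
    private boolean on;
    private final List<Boolean> requested = new ArrayList<>();

    SketchNode(String name, boolean on) {
        this.name = name;
        this.on = on;
    }

    boolean isOn() {
        return on;
    }

    // An incoming connection asks for this node to be on or off.
    void request(boolean wantOn) {
        requested.add(wantOn);
    }

    // Contention: some connection wants "on" while another wants "off".
    boolean hasContention() {
        return requested.contains(true) && requested.contains(false);
    }

    // Applies the pending requests only when they all agree.
    // On contention the state and the pending requests are left untouched.
    boolean settle() {
        if (requested.isEmpty()) {
            return true;
        }
        if (hasContention()) {
            return false;
        }
        on = requested.get(0);
        requested.clear();
        return true;
    }
}
```

A model with more than two states per node would have somewhere to put the conflicting case.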

Page 28:

Connection

• Object describing a link between nodes and the relationship between them

• abstract execute_action() method implemented by different connection types

• examples:

– LogicalConnection: state of source node determines state of destination node

– SimpleActivator, SimpleBlocker

• Connections may be individually disabled to emulate non-functioning of upstream process
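
The class layout described above might look roughly like the following Java sketch. Only the names execute_action(), LogicalConnection, SimpleActivator and SimpleBlocker come from the slide; GeneNode, fire(), setEnabled() and the exact behavior given to the activator and blocker are assumptions made for illustration.

```java
// Hypothetical two-state node, kept minimal so the connection types stand out.
final class GeneNode {
    final String name;
    boolean on;

    GeneNode(String name, boolean on) {
        this.name = name;
        this.on = on;
    }
}

abstract class Connection {
    protected final GeneNode source;
    protected final GeneNode destination;
    private boolean enabled = true;

    Connection(GeneNode source, GeneNode destination) {
        this.source = source;
        this.destination = destination;
    }

    // Disabling a connection emulates a non-functioning upstream process.
    void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    // Runs the type-specific action unless the connection is disabled.
    final void fire() {
        if (enabled) {
            execute_action();
        }
    }

    // Method name follows the slide; each connection type supplies its behavior.
    protected abstract void execute_action();
}

// The destination mirrors the source state.
final class LogicalConnection extends Connection {
    LogicalConnection(GeneNode s, GeneNode d) { super(s, d); }
    @Override protected void execute_action() { destination.on = source.on; }
}

// Assumed semantics: an active source switches the destination on.
final class SimpleActivator extends Connection {
    SimpleActivator(GeneNode s, GeneNode d) { super(s, d); }
    @Override protected void execute_action() { if (source.on) destination.on = true; }
}

// Assumed semantics: an active source switches the destination off.
final class SimpleBlocker extends Connection {
    SimpleBlocker(GeneNode s, GeneNode d) { super(s, d); }
    @Override protected void execute_action() { if (source.on) destination.on = false; }
}
```

Keeping the enabled check in fire() means every connection type can be switched off without repeating the test in each subclass.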

Page 29:

Complex

• Container for multiple discrete subelements

• Provides higher-order logic based on evaluation of components’ state; e.g. performing some action only when all subcomponents are considered active

• additional functionality beyond component parts
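
The "all subcomponents active" example can be sketched in Java as follows. Subunit, the Runnable action and evaluate() are hypothetical; the slide names only the Complex container and the idea of acting when every component is active.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical component of a complex with its own on/off state.
final class Subunit {
    final String name;
    boolean active;

    Subunit(String name, boolean active) {
        this.name = name;
        this.active = active;
    }
}

// Container whose action runs only when every member is active.
final class Complex {
    private final List<Subunit> members = new ArrayList<>();
    private final Runnable action;

    Complex(Runnable action) {
        this.action = action;
    }

    void add(Subunit s) {
        members.add(s);
    }

    // An empty complex is treated as inactive (an assumption).
    boolean isActive() {
        if (members.isEmpty()) {
            return false;
        }
        for (Subunit s : members) {
            if (!s.active) {
                return false;
            }
        }
        return true;
    }

    // Runs the action if the complex is fully assembled; reports whether it ran.
    boolean evaluate() {
        if (!isActive()) {
            return false;
        }
        action.run();
        return true;
    }
}
```

Other rules, such as "any member active", would only need a different isActive().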

Page 30:

State change propagation

• Setting the state of a node propagates the effect of that change on downstream connections

• During propagation a list of initiating nodes is accumulated and passed along; propagation stops if an initiating node is encountered again (prevents infinite loops)
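
The loop guard described above can be sketched in Java as a set of initiating nodes handed down the recursion. PropNode is hypothetical, and for brevity every edge here simply copies the new state downstream, as a LogicalConnection would; the original code routes the change through Connection objects instead.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical node whose state changes ripple to its downstream nodes.
final class PropNode {
    final String name;
    boolean on;
    int updates; // how many times this node's state has been set
    final List<PropNode> downstream = new ArrayList<>();

    PropNode(String name) {
        this.name = name;
    }

    // Public entry point: starts a propagation pass with no initiators yet.
    void setState(boolean newState) {
        setState(newState, new HashSet<>());
    }

    private void setState(boolean newState, Set<PropNode> initiators) {
        // Stop if this node already initiated a change during this pass;
        // this is what keeps a feedback loop from recursing forever.
        if (!initiators.add(this)) {
            return;
        }
        on = newState;
        updates++;
        for (PropNode next : downstream) {
            next.setState(newState, initiators);
        }
    }
}
```

With a cycle A -> B -> C -> A, calling setState on A updates each node once and then stops when A is reached again.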

Page 31:

What’s Next

• State validation/sanity checking

• Diagnosis/backtracing of “broken” logic

• More subtle states and connection types (beyond a binary system)

• Improved probe/gene mappings

• Automated model instantiation from curated database

• Incorporation into higher-level programs