Biological storytelling: A Software Tool for Biological Information Organization Based upon...

Post on 11-Aug-2014


Allan Kuchinsky, Kathy Graham, David Moh, Annette Adler, Ketan Babaria, Michael L. Creech; Biological storytelling: a software tool for biological information organization based upon narrative structure; ACM Siggroup Bulletin 01/2002; DOI:10.1145/1556262.1556315 ISBN: 1-58113-537-8 Presented by Anjani K Dhrangadhariya Junior Student, M.S. Life Science Informatics Bonn-Aachen International Center for Information Technology B-IT


Biological Storytelling

A Software Tool for Biological Information Organization Based upon Narrative Structure

Presented by Anjani K Dhrangadhariya, Junior Student, M.S. Life Science Informatics, Bonn-Aachen International Center for Information Technology (B-IT)


The way research really is

Your preference ?

or...This

✔ Protein A, with 2 active sites, interacts with Protein B, which has 1 active site and 1 allosteric site, in the presence of co-enzyme X, under a high concentration of XYZ, in the ABC pathway, at BBC temperature, in XOX fish found in the XXX ocean.

INTRODUCTION

● Drug design/discovery and treatment for life-threatening diseases is a mind-mapping process: it involves piecing together the story of how genes and proteins behave in a pathway (in disease pathogenesis).

● The new paradigm is personalized medicine, which takes an individual's particular genomic information into account and tailors drugs to their genetic makeup.

● Microarrays made it possible for researchers to correlate gene expression with disease progression, screen for mutations, and treat patients according to their genetic profiles.

● Yu X, Schneiderhan-Marra N, Joos TO. Protein microarrays for personalized medicine. Clin Chem. 2010 Mar;56(3):376-87. doi: 10.1373/clinchem.2009.137158. Epub 2010 Jan 14.

DNA Microarray

DNA microarrays made the paradigm shift in medicine possible and produced a HUGE AMOUNT OF RAW DATA that researchers have yet to make sense of.

Analysis and Synthesis

● http://en.wikipedia.org/wiki/List_of_open-source_bioinformatics_software

Synthesis Task

● Synthesis comes from a Greek word meaning "to put together"

● A procedure for combining separate elements or components to form a coherent whole

● Synthesis and analysis tasks go hand in hand and complement each other.

● Every synthesis is built upon the results of a preceding analysis, and every analysis requires a subsequent synthesis in order to verify and correct its results.

● This does not seem to be the case in bioinformatics, where a huge number of analysis tools are available but synthesis tools are lacking.

● Tom Ritchey; Analysis and synthesis: On scientific method – based on a study by Bernhard Riemann; Systems Research, Volume 8, Issue 4, pages 21–41, December 1991; DOI: 10.1002/sres.3850080402

Synthesis task

(1) Keeping track of the diverse pieces of information,

(2) Organizing and using the diverse information,

(3) Formulating hypotheses and higher-level explanations,

(4) Sharing the information

Findings from User Research

● The facts gathered and hypotheses formed during the synthesis task were usually stored as printed web pages kept in binders, which did not help later when the information was needed. (Finding a needle in a haystack!)

● User research showed that researchers have particular difficulty keeping track of diverse pieces of information. (Hence the problem in the synthesis task was identified.)

● From user studies and other collaborative design sessions with researchers at a Cancer Genetics lab, a number of themes for navigating and interacting with this huge amount of data were identified, and...

● Vicki O’Day, Annette Adler, Allan Kuchinsky, Anna Bouch; When Worlds Collide: Molecular Biology as Interdisciplinary Collaboration; Proceedings of the European Conference on Computer Supported Collaborative Work (ECSCW2001), Bonn, Germany, 2001.

A picture emerged that explained the synthesis tasks involved in biological research

NARRATIVE STRUCTURE

Piecing together the story

MADE SENSE

Aspects of User Research

● Connecting the dots “mind mapping”

● Information is in free form (Structure-less)

● Many researchers, many fields and many hypotheses

● Group work, multiple perspectives

● Solving the puzzle together

● Reasoning over data

● Sharing of data

The Role of Narrative Structure

Thorndyke: Comprehensibility and recall were a function of the amount of inherent plot structure in the story, independent of the actual content.

Schank: When prior experience is indexed cleverly, we can call it to mind to understand the current situation. This process can lead to brand new insights.

● Thorndyke, P.W.; Cognitive Structures in Comprehension and Memory of Narrative Discourses; Cognitive Psychology, 9, 1977, pp. 77-110

● Schank, R.; Tell Me a Story: Narrative and Intelligence; Northwestern University Press, 1990.

Middleton and Edwards: Telling stories preserves potentially arcane and idiosyncratic pieces of information ...

Erickson: Depicts storytelling as an integral part of design. Stories have an informality that is well suited to the lack of certainty that characterises much design-related knowledge. Stories also provide concrete examples that people from vastly different backgrounds can relate to.

● Middleton, D., and Edwards, E.; Collective Remembering; SAGE Publications, 1990.

● Erickson, T.; Notes on Design Practice: Stories and Prototypes as Catalysts for Communication, in Scenario Based Design: Envisioning Work and Technology in Systems Development (ed. J. Carroll), Wiley, 1995, pp. 37-58

Birth of the Tool

Applying the above perspectives to the user findings and drawing analogies led to a prototype software tool built on the concept of STORYTELLING. It uses narrative structure as its framework and supports biologists in their synthesis tasks:

✔ Organize and use information

✔ Build hypotheses from data

✔ Construct alternative explanations of biological processes

✔ Share diverse information

(Labs are generally interdisciplinary, and each researcher uses terms from their own field, e.g. KB = kilobyte for a computer scientist and KB = kilobase for a biologist.)

OVERVIEW OF THE SOFTWARE

Features provided by BIOLOGICAL STORYTELLING

● Free form database model

● Narrative framework

● Concrete social aspect of information sharing

● Creating alternative hypotheses with reasoning

● Semi-automatic clustering of biological entities

● Extensive cross references to public and proprietary data and literature

Elements

There are three connected entities in the free form data model of Biological Storytelling.

Items

Biological Stories

Collections

● www.cs.umd.edu/hcil/about/events/open.../AgilentStoryTelling.ppt

Items

● Basic unit of information

● Represent biological entities such as genes, proteins or other gene products such as transcripts

● Items contain detailed information about a biological entity in the form of links to different public and proprietary databases and to literature

● Data can be automatically loaded into Items (tabular gene expression data)

● Researchers can also add information manually

● A viewer displays the dataset of Items loaded from raw microarray experiment data
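The Item described above can be sketched as a small record type. This is a minimal illustration, not the tool's actual data model: the `Item` fields, the `load_items` helper, the tab-separated layout with a `gene` column, and the expression values are all assumptions made for the example.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Item:
    """Basic unit of information: one biological entity (hypothetical layout)."""
    name: str
    links: List[str] = field(default_factory=list)              # database or literature URLs
    expression: Dict[str, float] = field(default_factory=dict)  # sample name -> value
    notes: List[str] = field(default_factory=list)              # manually added information


def load_items(table: str) -> Dict[str, Item]:
    """Create one Item per row of a tab-separated gene expression table.

    Assumes a header row whose first column is called "gene" and whose
    remaining columns each hold a numeric value for one sample.
    """
    reader = csv.DictReader(io.StringIO(table), delimiter="\t")
    items = {}
    for row in reader:
        name = row.pop("gene")
        items[name] = Item(name=name,
                           expression={k: float(v) for k, v in row.items()})
    return items


# Made-up values, for illustration only.
TABLE = "gene\tcontrol\ttreated\nMyoD\t1.0\t4.2\nMyogenin\t0.8\t3.9\n"

items = load_items(TABLE)                        # automatic loading
items["MyoD"].links.append("http://www.ncbi.nlm.nih.gov/pubmed")
items["MyoD"].notes.append("Follow up in the next experiment.")  # manual addition
```

Keeping links and notes as open-ended lists mirrors the free form character of the model: nothing forces an Item to carry a fixed set of attributes.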


Collections

● User created and free form set of items.

● Analogous to clustering

● Collections manager component provides tree view of collections (Analogous to tree view panel in Windows Explorer)

● Data can be manually or semi-automatically loaded into Collections.

● Collections are malleable (one can split, merge and add to Collections, and/or move Items from one Collection to another).

● Collections can have other Collections as well as other Items (Nested)

● Repositories for links to experimental data and literature
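The malleability of Collections (nesting, moving, merging, splitting) can be sketched as follows. This is an assumed design for illustration only: the `Collection` class, its method names, and the use of plain strings to stand in for Items are hypothetical, not taken from the tool.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple, Union

Member = Union[str, "Collection"]   # an Item name or a nested Collection


@dataclass
class Collection:
    """User-created, free form set of Items and other Collections."""
    name: str
    members: List[Member] = field(default_factory=list)

    def add(self, member: Member) -> None:
        self.members.append(member)

    def move(self, member: Member, target: "Collection") -> None:
        """Move one member from this Collection to another."""
        self.members.remove(member)
        target.add(member)

    def merge(self, other: "Collection", name: str) -> "Collection":
        """Return a new Collection holding the members of both."""
        return Collection(name, self.members + other.members)

    def split(self, keep: Callable[[Member], bool],
              left: str, right: str) -> Tuple["Collection", "Collection"]:
        """Return two new Collections, partitioned by a predicate."""
        return (Collection(left, [m for m in self.members if keep(m)]),
                Collection(right, [m for m in self.members if not keep(m)]))

    def flatten(self) -> List[str]:
        """List all Item names, descending into nested Collections."""
        names: List[str] = []
        for m in self.members:
            names.extend(m.flatten() if isinstance(m, Collection) else [m])
        return names


muscle = Collection("muscle genes", ["Myogenin", "MyoD"])
inbox = Collection("unsorted", ["PAX3", "My14"])
inbox.move("My14", muscle)                        # move an Item
project = Collection("project", [inbox, muscle])  # Collections can nest
merged = inbox.merge(muscle, "all players")
upstream, downstream = merged.split(lambda m: m == "PAX3",
                                    "upstream", "downstream")
```

`merge` and `split` return new Collections rather than mutating in place, so earlier groupings stay available while a researcher experiments with alternatives.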


Biological Stories

● Narrative structure to represent state of Biological hypothesis

● Textually and graphically building up Biological stories (Story editor)

● Allows a story to be represented in the form of themes, players and explanations

● Represents the paths explored and the alternative hypotheses formulated

● Repositories for links to experimental data and literature

Example...

● PAX3-FKHR oncogene

● activates Myogenin and MyoD

● Action of Myogenin and MyoD induces My14

● Failure of muscle cells to differentiate and exit the cell cycle = uncontrolled proliferation = Cancer

Biological Stories

Grammar

● The top level element in the Story Editor is a STORY.

Optional Theme

A set of Players

A set of Explanations

STORY

Theme

● Works as an abstract (as in a publication)

● Entered in free form text in Story Editor

● “PAX3 oncogene activates a myogenic transcription program, causing alveolar rhabdomyosarcoma”

Player

● Items and Collections which play a role in Biological Stories

● e.g. genes and proteins that interact in pathways; in other words, the characters playing a role in a story

● Myogenin and MyoD

Explanation

● This forms the main plot of the Story

● Contains descriptive elements about a Biological Story

● Explanation contains

– An optional theme

– A set of players

– A set of interactions

– Annotations which support or oppose the claims made in explanation

Story: [Theme]+Players+Explanations+[Comment]

Theme: Description+[Comment]

Players: [Player]*+[Comment]

Player: item|collection|[Player]*

Comment: Description // could be a URL for a citation

Explanations: {Explanation|Alternative|Interaction}*+[Comment]

Explanation: [Theme]+{Explanation|Alternative|Interaction|Player|Support|Oppose}*+[Comment]

Alternative: [Theme]+{Explanation|Alternative|Interaction|Player|Support|Oppose}*+[Comment]

Interaction: Description+[Comment]+{Support|Oppose|Interaction}*+[Comment]
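One way to read the grammar above is as a set of nested record types, where square brackets become optional fields and starred groups become lists. The sketch below is an interpretation, not the tool's implementation: the class layout, the `parts` field, the `count_interactions` helper and the placeholder alternative theme are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Interaction:
    description: str
    support: List[str] = field(default_factory=list)   # evidence for the claim
    oppose: List[str] = field(default_factory=list)    # evidence against it
    parts: List["Interaction"] = field(default_factory=list)  # nested interactions
    comment: Optional[str] = None


@dataclass
class Explanation:
    theme: Optional[str] = None
    players: List[str] = field(default_factory=list)
    parts: List[Union["Explanation", Interaction]] = field(default_factory=list)
    support: List[str] = field(default_factory=list)
    oppose: List[str] = field(default_factory=list)
    comment: Optional[str] = None


class Alternative(Explanation):
    """Same shape as Explanation; marks a competing hypothesis."""


@dataclass
class Story:
    players: List[str]
    explanations: List[Explanation]
    theme: Optional[str] = None
    comment: Optional[str] = None


def count_interactions(node) -> int:
    """Count every Interaction reachable from a Story, Explanation or Interaction."""
    own = 1 if isinstance(node, Interaction) else 0
    children = node.explanations if isinstance(node, Story) else node.parts
    return own + sum(count_interactions(child) for child in children)


story = Story(
    theme=("PAX3 oncogene activates a myogenic transcription program, "
           "causing alveolar rhabdomyosarcoma"),
    players=["PAX3", "Myogenin", "MyoD"],
    explanations=[
        Explanation(
            players=["Myogenin", "MyoD"],
            parts=[
                Interaction("PAX3 oncogene activates Myogenin and MyoD"),
                Interaction("Action of Myogenin and MyoD induces My14"),
                Alternative(theme="(placeholder for a competing hypothesis)"),
            ],
        )
    ],
)
```

Because Explanation, Alternative and Interaction can contain one another, a story forms a tree, and any traversal (such as the counting helper here) is naturally recursive.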

Putting the story together graphically

People think Graphically rather than Textually

Diagram Editor tool

Nodes = Players (Nouns)

Edges = Interactions (Verbs)

Diagram Editor Tool

● More general

● Predefined verbs for relationships between genes:

● Inhibits

● Binds

● Promotes
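The noun/verb view of the diagram can be sketched as a small labelled graph. This is an illustrative assumption about how such a diagram might be stored: the `Verb`, `Edge` and `Diagram` names are hypothetical, and `PROMOTES` is used below only as a stand-in for "activates", which is not among the three verbs listed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Set


class Verb(Enum):
    """The predefined relationship verbs named on the slide."""
    INHIBITS = "inhibits"
    BINDS = "binds"
    PROMOTES = "promotes"


@dataclass(frozen=True)
class Edge:
    source: str   # player (noun)
    verb: Verb    # interaction (verb)
    target: str   # player (noun)


class Diagram:
    """Nodes are players, edges are interactions between them."""

    def __init__(self) -> None:
        self.nodes: Set[str] = set()
        self.edges: List[Edge] = []

    def connect(self, source: str, verb: Verb, target: str) -> None:
        self.nodes.update((source, target))
        self.edges.append(Edge(source, verb, target))

    def sentences(self) -> List[str]:
        """Read each edge back as a noun-verb-noun phrase."""
        return [f"{e.source} {e.verb.value} {e.target}" for e in self.edges]


diagram = Diagram()
diagram.connect("PAX3", Verb.PROMOTES, "Myogenin")
diagram.connect("PAX3", Verb.PROMOTES, "MyoD")
```

Restricting edges to an enumeration keeps relationships comparable across stories, at the cost of forcing an interaction into the nearest predefined verb.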

Semantic Overlays

● A way of juxtaposing Biological Stories with detailed experimental data

● Validate higher level hypotheses

Annotation and Citations

● Textual notes

● Arbitrary list of citations (URL)

● Each citation can again have an arbitrary textual note attached to it

● For annotating biological elements an Object Editor is provided

Support for Group Work

● Annotation is tagged with user name and timestamp

● The Story Editor provides Support and Oppose elements, which can accommodate alternative hypotheses
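An annotation that carries a user name, a timestamp, citations with their own notes, and a support/oppose stance could look like the sketch below. The field names, the `stance` values and the `tally` helper are assumptions made for illustration, and the user names and notes are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional

STANCES = ("support", "oppose", "note")


@dataclass
class Citation:
    url: str
    note: Optional[str] = None   # each citation can carry its own textual note


@dataclass
class Annotation:
    text: str
    user: str                    # who wrote the annotation
    stance: str = "note"         # "support", "oppose" or plain "note"
    citations: List[Citation] = field(default_factory=list)
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self) -> None:
        if self.stance not in STANCES:
            raise ValueError(f"unknown stance: {self.stance!r}")


def tally(annotations: List[Annotation]) -> Dict[str, int]:
    """Count annotations per stance, e.g. to see how contested a claim is."""
    counts = {stance: 0 for stance in STANCES}
    for annotation in annotations:
        counts[annotation.stance] += 1
    return counts


# Hypothetical users and notes attached to one explanation.
annotations = [
    Annotation("Consistent with our array data.", user="alice", stance="support",
               citations=[Citation("http://www.ncbi.nlm.nih.gov/pubmed",
                                   note="search here for the source paper")]),
    Annotation("Not reproduced in the second cell line.", user="bob",
               stance="oppose"),
]
```

Stamping each annotation at creation time means a group can later reconstruct who argued what, and in which order, without any extra bookkeeping.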

Web repository

User Feedback

● Conflict about use of terms in Story Editor.

– Some researchers liked literary terms while others liked scientific terms

● Discussions on importance of free form data model. (which is rather malleable)

● User suggestion: be able to group experiments into Collections, not just Items

● Using Players globally vs. tagging them with alternatives

Related Work

● Storyspace (EastGate Systems)

● Lotus Agenda

● STKE, BIND, KEGG, EcoCyc, TransPath, SPAD

● The eLab Book

● Cell Space software

Cons !!

Better suited to microarray experiments than to generalized research

Data format for loading (tabular data for automatic loading into the software)

Future Work

✔ Provision for multiple data types

✔ Evolution to support “Systems approach”

✔ Support scientific publications

✔ Converting diagrammatic representations into textual stories via the use of parsers

✔ Utilization of data mining methods (semi-automatic population of Collections)

Image sources

● http://www.clipartheaven.com/clipart/kids_stuff/images_%28a_-_f%29/boy_-_happy_3.gif

● http://www.picgifs.com/clip-art/activities/sweating/clip-art-sweating-328953.jpg

● Snap shot of papers

➔ http://www.ncbi.nlm.nih.gov/pubmed/19937809

➔ http://www.sciencedirect.com/science/article/pii/S0197458011001783

● Snap shot of public databases:

➔ PubMed http://www.ncbi.nlm.nih.gov/pubmed

➔ Pubchem: https://pubchem.ncbi.nlm.nih.gov/

➔ Protein DataBank: http://www.rcsb.org/pdb/home/home.do

● https://genome.unc.edu/images/microarray.jpg

● http://www.bio.davidson.edu/people/macampbell/acs_magic/excel_sheet.jpg

● http://www.ub.edu/stat/docencia/bioinformatica/microarrays/ADM/imatges/SingleArray1.jpg

● http://upload.wikimedia.org/wikipedia/commons/thumb/a/a0/Caesar-dot-to-dot.svg/350px-Caesar-dot-to-dot.svg.png

Image sources

● http://codingnews.inhealthcare.com/files/2010/04/PuzzleBrain.jpg

● http://fc02.deviantart.net/fs70/i/2013/030/7/9/biology___chemistry___physics_by_rutulis-d3byt2s.png

● http://daley.med.harvard.edu/assets/Art_Science/GQD_board.jpg

● http://docs.openstack.org/trunk/openstack-ops/content/figures/1/figures/1-IMG_4895.JPG

● http://www.paperstone.co.uk/images/NewsImages/2011/messy-office.jpg