Analysis Environments For Scientific Communities
From Bases to Spaces
Bruce R. Schatz
Institute for Genomic Biology
University of Illinois at [email protected], www.beespace.uiuc.edu
Baker Center for Bioinformatics, Iowa State University
October 6, 2006
What are Analysis Environments
Functional Analysis: find the underlying mechanisms of genes, behaviors, diseases
Comparative Analysis: top-down data mining (vs bottom-up) over multiple sources, especially literature
Building Analysis Environments
Manual, by Humans: interaction (user navigation), classification (collection indexing)
Automatic, by Computers: federation (search bridges), integration (results links)
Trends in Analysis Environments
Central versus Distributed Viewpoints
The 90s, Pre-Genome: Entrez (NIH NCBI) versus WCS (NSF Arizona)
The 00s, Post-Genome: GO (NIH curators) versus BeeSpace (NSF Illinois)
Pre-Genome Environments
Focused on Syntax pre-Web
WCS (Worm Community System)
- Search words across sources
- Follow links across sources
- Words automatic, links manual
Towards Integrated Searching
Post-Genome Environments
Focused on Semantics post-Web
BeeSpace (Honey Bee Inter Space)
- Navigate concepts across sources
- Integrate data across sources
- Concepts automatic, links automatic
Towards Conceptual Navigation
Worm Community System
WCS Information:
- Literature: BIOSIS, MEDLINE, newsletters, meetings
- Data: genes, maps, sequences, strains, cells
WCS Functionality:
- Browsing: search, navigation
- Filtering: selection, analysis
- Sharing: linking, publishing
WCS: 250 users at 50 labs across Internet (1991)
WCS Molecular
WCS Cellular
WCS invokes gm
WCS vis-à-vis acedb
from Objects to Concepts
from Syntax to Semantics
Infrastructure is Interaction with Abstraction
Internet is packet transmission across computers
Interspace is concept navigation across repositories
Towards the Interspace
THE THIRD WAVE OF NET EVOLUTION
[Figure: packets, objects, concepts]
LEVELS OF INDEXES
[Figure: labels Technology, Engineering, Electrical, IEEE; formal (manual) versus informal (automatic); communities, groups, individuals]
Post-Genome Informatics I
Comparative Analysis within the Dry Lab of Biological Knowledge
Classical Organisms have Genetic Descriptions. There will be NO more classical organisms beyond Mice and Men, Worms and Flies, Yeasts and Weeds.
Must use comparative genomics on classical organisms, via sequence homologies and literature analysis.
Post-Genome Informatics II
Functional Analysis within the Dry Lab of Biological Knowledge
Automatic annotation of genes to standard classifications, e.g. Gene Ontology via homology on computed protein sequences.
Automatic analysis of functions to scientific literature, e.g. concept spaces via text extractions. Thus must use functions in literature descriptions.
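The homology-based annotation step can be illustrated with a minimal sketch, assuming homology hits have already been computed by some sequence-similarity search and are available as simple tuples. The gene identifiers, term labels, and the e-value cutoff are hypothetical placeholders, not BeeSpace data, and the slides do not say this is the project's actual procedure.

```python
from collections import defaultdict

# Hypothetical homology hits: (query gene, model-organism gene, e-value).
HITS = [
    ("bee_gene_1", "fly_gene_A", 1e-40),
    ("bee_gene_1", "worm_gene_B", 1e-3),
    ("bee_gene_2", "fly_gene_C", 1e-12),
]

# Hypothetical classification terms already assigned to model-organism genes.
ANNOTATIONS = {
    "fly_gene_A": {"GO:example_1", "GO:example_2"},
    "worm_gene_B": {"GO:example_3"},
    "fly_gene_C": {"GO:example_2"},
}


def transfer_annotations(hits, annotations, max_evalue=1e-10):
    """Copy terms from homologs whose e-value passes the (assumed) cutoff."""
    transferred = defaultdict(set)
    for query, subject, evalue in hits:
        if evalue <= max_evalue:
            transferred[query] |= annotations.get(subject, set())
    return dict(transferred)


predicted = transfer_annotations(HITS, ANNOTATIONS)
for gene in sorted(predicted):
    print(gene, sorted(predicted[gene]))
```

The weak worm hit is ignored because its e-value exceeds the cutoff, so only terms from strong homologs are carried over to the unannotated genes.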
Informatics: From Bases to Spaces
Data Bases support genome data
e.g. FlyBase has sequences and maps; genes annotated by Gene Ontology and linked to biological literature
Information Spaces support biological literature
e.g. BeeSpace uses automatically generated conceptual relationships to navigate functions
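One common way to generate conceptual relationships automatically is term co-occurrence within documents. The slides do not specify the algorithm BeeSpace uses, so the sketch below is an assumed stand-in with a hypothetical three-document collection.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-collection: each document reduced to its extracted terms.
DOCS = [
    {"foraging", "honey bee", "gene expression"},
    {"foraging", "honey bee", "behavioral maturation"},
    {"gene expression", "fruit fly"},
]


def build_concept_space(docs):
    """Map each term to its co-occurring terms with an asymmetric weight."""
    term_counts = Counter()
    pair_counts = Counter()
    for terms in docs:
        term_counts.update(terms)
        for a, b in combinations(sorted(terms), 2):
            pair_counts[(a, b)] += 1
    space = {}
    for (a, b), n in pair_counts.items():
        # Weight from a to b: share of a's documents that also mention b.
        space.setdefault(a, {})[b] = n / term_counts[a]
        space.setdefault(b, {})[a] = n / term_counts[b]
    return space


def related(space, term, k=3):
    """Return up to k neighbours, strongest first, ties broken by name."""
    neighbours = space.get(term, {})
    return sorted(neighbours, key=lambda t: (-neighbours[t], t))[:k]


space = build_concept_space(DOCS)
print(related(space, "foraging"))
```

Navigating from "foraging" surfaces "honey bee" first because the two terms always appear together in this toy collection.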
BeeSpace FIBR Project
BeeSpace project is an NSF FIBR flagship
Frontiers in Integrative Biological Research, $5M for 5 years at University of Illinois
Analyzing Nature and Nurture in Societal Roles using honey bee as model
(Functional Analysis of Social Behavior)
Genomic technologies in wet lab and dry lab
Bee [Biology]: gene expressions
Space [Informatics]: concept navigations
System Architecture
Concept Navigation in BeeSpace
[Figure: sources (Neuroscience Literature, Molecular Biology Literature, Bee Literature, FlyBase, WormBase, Bee Genome, Brain Region Localization, Brain Gene Expression Profiles) and users (Behavioral Biologist, Molecular Biologist, Neuroscientist)]
V1 BeeSpace Community Collections
Organism: Honey Bee / Fruit Fly, Song Bird / Soy Bean
Behavior: Social / Territorial, Foraging / Nesting
Development: Behavioral Maturation, Insect Development, Insect Communication
Structure: Fly Genetics / Fly Biochemistry, Fly Physiology / Insect Neurophysiology
CONCEPT SWITCHING
“Concept” versus “Term”: a concept is a set of “semantically” equivalent terms
Concept switching: region-to-region (set-to-set) match
[Figure: a term within its semantic region]
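The set-to-set match can be sketched as follows, treating each concept as a set of equivalent terms and switching to the region in another space with the greatest overlap. The two concept spaces, their labels, and the use of Jaccard overlap as the match score are all assumptions for illustration; the slides do not name a scoring method.

```python
def jaccard(a, b):
    """Overlap between two term sets (assumed match score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical concept spaces: concept label -> set of equivalent terms.
BEE_SPACE = {
    "foraging": {"foraging", "food collection", "nectar gathering"},
    "nursing": {"nursing", "brood care"},
}
FLY_SPACE = {
    "feeding behavior": {"foraging", "food search", "food collection"},
    "courtship": {"courtship", "mating song"},
}


def concept_for_term(space, term):
    """Find the concept (semantic region) that contains a term."""
    for label, terms in space.items():
        if term in terms:
            return label
    return None


def switch_concept(term, source, target):
    """Map a term's region in the source space to the best-matching target region."""
    label = concept_for_term(source, term)
    if label is None:
        return None
    region = source[label]
    best = max(target, key=lambda name: jaccard(region, target[name]))
    return best if jaccard(region, target[best]) > 0 else None


print(switch_concept("nectar gathering", BEE_SPACE, FLY_SPACE))
```

The match is made on whole regions rather than on the single query term, so "nectar gathering" can reach a target concept even though that exact term never appears in the target space.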
Concept Space
BeeSpace Analysis Environment
Build Concept Space of Biomedical Literature for Functional Analysis of Bee Genes
- Partition Literature into Community Collections
- Extract and Index Concepts within Collections
- Navigate Concepts within Documents
- Follow Links from Documents into Databases
Locate Candidate Genes in Related Literatures, then follow links into Genome Databases
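The four steps above can be sketched end to end on toy data. Everything here is hypothetical: the documents, the keyword-based partitioning, the fixed phrase list standing in for concept extraction, and the dictionary standing in for a genome database.

```python
from collections import defaultdict

# Hypothetical abstracts; "genes" lists database identifiers each one links to.
DOCUMENTS = [
    {"id": "doc1", "text": "foraging behavior in honey bee workers", "genes": ["gene_for"]},
    {"id": "doc2", "text": "fruit fly foraging gene and feeding", "genes": ["gene_for", "gene_x"]},
    {"id": "doc3", "text": "honey bee brain gene expression", "genes": ["gene_y"]},
]
COLLECTION_KEYWORDS = {"bee": "honey bee", "fly": "fruit fly"}
CONCEPT_PHRASES = ["foraging", "gene expression", "feeding"]
GENOME_DB = {"gene_for": "record A", "gene_x": "record B", "gene_y": "record C"}


def partition(documents, keywords):
    """Step 1: assign documents to community collections by keyword."""
    collections = defaultdict(list)
    for doc in documents:
        for name, phrase in keywords.items():
            if phrase in doc["text"]:
                collections[name].append(doc)
    return dict(collections)


def index_concepts(docs, phrases):
    """Step 2: index which documents mention each concept phrase."""
    index = defaultdict(set)
    for doc in docs:
        for phrase in phrases:
            if phrase in doc["text"]:
                index[phrase].add(doc["id"])
    return dict(index)


def candidate_genes(concept, collections, phrases, database):
    """Steps 3 and 4: navigate to documents for a concept, then follow gene links."""
    found = {}
    for docs in collections.values():
        hits = index_concepts(docs, phrases).get(concept, set())
        for doc in docs:
            if doc["id"] in hits:
                for gene in doc["genes"]:
                    found[gene] = database.get(gene)
    return found


collections = partition(DOCUMENTS, COLLECTION_KEYWORDS)
print(candidate_genes("foraging", collections, CONCEPT_PHRASES, GENOME_DB))
```

Searching the related fly collection alongside the bee collection is what surfaces the extra candidate gene, which mirrors the idea of locating candidates in related literatures before following links into the databases.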
Well Characterized Gene
Poorly Characterized Gene
Gene Summarization, BeeSpace V2
Collaboration across Users
Category Browse (Collection)
Category Browse (Search)
PlantSpace Examples
Interactive Functional Analysis
BeeSpace will enable users to navigate a uniform space of diverse databases and literature sources for hypothesis development and testing, with a software system beyond a searchable database, using literature analyses to discover functional relationships between genes and behavior.
- Genes to Behaviors
- Behaviors to Genes
- Concepts to Concepts
- Clusters to Clusters
- Navigation across Sources
BeeSpace Information Sources
General for All Spaces:
- Scientific Literature: Medline, Biosis, CAB Abstracts
- Genome Databases: GenBank, ProteinDataBank, ArrayExpress
Special for BeeSpace:
- Model Organisms (heredity): Gene Descriptions (FlyBase, WormBase)
- Natural Histories (environment): BeeKeeping Books (Cornell, Harvard)
XSpace Information Sources
- Organize Genome Databases (XBase)
- Compute Gene Descriptions from Model Organisms
- Partition Scientific Literature for Organism X
- Compute XSpace using Semantic Indexing
Boost the Functional Analysis from Special Sources
- Collecting Useful Data about Natural Histories
- e.g. CowSpace Leverage in AIPL Databases
Towards SoySpace
- Organize Genome Databases (SoyBase)
- Partition Scientific Literature for SoyBean
- Gene Descriptions from Models (TAIR)
- Natural Histories from Population Databases
Key to Functional Analysis is Special Sources
- Collecting Appropriate Text about Genes
- Extracting Adequate Data about Histories
- Leverage is National Archives of germplasm and Historical Records for soybean crops
Towards the Interspace
The Analysis Environment technology is GENERAL!
BirdSpace? BeeSpace? PigSpace? CowSpace? BehaviorSpace? BrainSpace? SoySpace? PlantSpace?
BioSpace… Interspace