Bio/Spice: Towards a Network Bioinformatics NIH, July 2001 Adam Arkin Howard Hughes Medical...
Bio/Spice: Towards a Network Bioinformatics
NIH, July 2001
Adam Arkin
Howard Hughes Medical Institute
Departments of Bioengineering and Chemistry
University of California
Physical Biosciences Division
Lawrence Berkeley National Laboratory
Berkeley, CA 94720
http://genomics.lbl.gov
Can Molecular Biology Become Cellular Engineering?
Prediction, Control and Design
Funding: ONR, DOE, DARPA, NIH
Adult: 1.5 mm long, ~1000 cells
Genome projects are providing parts lists for the genetic and protein components of the cellular circuitry. Bioinformatics analysis of this data provides protein function and sometimes structure by homology, partial identification of regulatory sites on the DNA and functional RNAs. Partial networks can be constructed by homology to known biochemical networks. Genetic defects that lead to disease can also be identified at this level. Evolutionary relationships among organisms can also be calculated from this data.
Structural biology provides experimental data on the 3-dimensional structure of biomolecules and computational approaches to predicting structure from sequence and for predicting biomolecular recognition. Both static and dynamic models of biomolecular interactions are the basis for rational drug design and automated biochemical reaction network prediction. Biochemical studies also provide much of this information as well as quantification of the kinetics and thermodynamics of the interactions.
Biochemical and genetic network analysis integrates data from all the steps above to provide a prediction of cellular system function. Such analyses provide insight into how cells process and act upon complex external and internal signals. These are the fundamental control mechanisms that: 1) lead to partial penetrance of genotype and maintenance of population heterogeneity, 2) determine reliability of cellular function and the propensity for disease given partial failure of a network component, 3) govern adaptation of pathogens to pharmaceutical attack, the stages of facultative infection and dynamical diseases, and 4) may provide the basis for reversal of development defects and early detection of cellular control failure.
Ultimately, integration of genomic data and genome-derived data (such as that from gene chips), structural and molecular dynamics data, and network functional analyses and data will lead to a quantitative understanding of differential developmental processes and, finally, a full tracing of the molecular basis of development from fertilized egg to adult organism.
Single cells in the wave
Human neutrophil tracking a Staphylococcus.
Drosophila melanogaster embryo developing
Myxococcus xanthus colony undergoing traveling wave self-organization on its way to sporulation.
Complex Behaviors of Cellular Systems
Photos from everyone but me
Figure labels: >25 signals; inhomogeneous environment; non-simple geometrical space; site of infection; primary chemoattractant; response cytokine; another cytokine; actin; PIPKg; PIPK; P10; PIP4,5; PIP3,4,5; Rac; SHiP; Plx
Goals of “Network Biology Approach”
1. From the elementary interactions among the participating models, explain the complex behavior of a cellular function.
• The Alliance for Cellular Signaling has identified over 600 molecules involved in G-protein coupled signal transduction.
2. By comparing networks from many organisms, deduce the engineering principles by which cells perform particular functions and deal with uncertainty in their environment.
These networks become quite large and complex
Tucker, Gera, and Uetz (2001)
Genetic Engineering and Measurement
Methods for manipulating DNA have become better and better (methods for designing proteins, etc., are still not so good)
Methods for measuring cellular components are exploding! (Still needs lots of improvement)
Goals
From Genome Sequence (and other data)
Reverse Engineer Cellular Network
Predict Cellular Function
Diagnose Failures (Disease)
Design Control (Disease Treatments)
Forward Engineer New Function
Use discovered control laws for biomimetic systems
What would success look like?
1. Very rapid deduction of new cellular function from well-controlled experiments
2. Rapid prediction of controllable aspects of cell function and design of control protocols
3. Robust forward design of novel function and systems
   1. Need for a rapid manufacture protocol
4. Identification of novel computational and control algorithms that can be abstracted into machinery.
Building a Rational Engineering Tool for Biosystems
SPICE for Cells?
Analysis and engineering of cellular circuitry
Courtesy of IBM From: Wasserman Lab, Loyola
Asynchronous Digital Telephone Switching Circuit
Full knowledge of parts list
Full knowledge of “device physics”
Full knowledge of interactions
No one fully understands how this circuit works!! It's just too complicated.
Designed and prototyped on a computer (SPICE analysis)
Experimental implementation fault tested on computer
Asynchronous Analog Biological Switching Circuit
Partial knowledge of parts list
Partial knowledge of “device physics”
Partial knowledge of interactions
No one fully understands how this circuit works!! It's just too complicated.
We need a SPICE-like analysis for biological systems
SPICE: Simulation Program with Integrated Circuit Emphasis
Parts database
From subcircuit database
Integrated circuit database
Automated fault diagnosis
Genome Sequence
Genes/Regulatory Sequence
Proteins/RNAs
Other Chemical Species
Biochemical Pathways/Dynamics
Cytomechanical/Spatial Processes
Cell Development/Signaling
Tissue Physiology/Development
Organism Behavior
Tools for “multilevel” analysis
Finding Parts
Physical properties
Cellular networks
Assembled Genomes Polymorphisms
ORF Identification DNA Regulatory ID RNA Gene ID
mRNA Regulation mRNA Splicing RNA 2° Struct
Protein Sequence ID Homology Modeling RNA 3° Struct
Protein 3° Struct Protein Function ID RNA Function ID
Molecular Interaction Prediction
Chromatin Structure
Macromolecular Dynamics
Biochemical and Genetic Network Prediction
Metabolic/Biosynthetic Analysis & Engineering
Signal Transduction Analysis
Gene Expression/Network Analysis
Cytomechanical Analysis
Morphogenesis & Development
Homeostasis
Cell-Cell Interactions
Tissue Mechanics
Cell Behavior & Engineering
Organismal Behavior
Epidemiological/Ecological Models
Cancer Dynamics
Multi-organism function: e.g., infectious disease
Design Philosophy and Goals
•Weakly-coupled architecture
•Provides application framework for extensibility
•Highly configurable to non-programmers
•Modular, object-oriented simulation and model analysis
•Multiple-layers of simulation, analogous to SPICE
•Full database and knowledge environment
•Realms of current development: GUI, middleware/kernels, and database
System Architecture
Local DB
GUI
Database access layer
Database
Reflection of remote DBs
Remote DBs
GUI component server
Analysis Kernels
Component manager
component 1
component 2
component 3
component n
BIO/SPICE: Databasing, simulation and analysis
Bio/Spice: A Web-Servable, Biologist-Friendly, database, analysis and simulation interface was developed into a true beta product.
Interfaces to ReactDB, MechDB, and ParamDB.
With Kernel, performs basic: flux-balance analysis, stochastic and deterministic kinetics, scientific visualization of results.
Notebook/Kernel design optimized for distributed computing.
GUI must represent biological models at different levels of abstraction.
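To make the "stochastic and deterministic kinetics" item concrete, here is a minimal sketch that is not taken from Bio/Spice itself. It assumes a hypothetical one-species birth-death model (a molecule made at a constant rate and degraded in proportion to its copy number). It compares an exact stochastic trajectory, using Gillespie's direct method, with the closed-form deterministic solution. The function names and rate parameters are illustrative only.

```python
import math
import random


def gillespie_birth_death(k_make, k_decay, x0, t_end, seed=0):
    """Stochastic trajectory for 0 -> X (rate k_make) and X -> 0 (rate k_decay * x)."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_make = k_make           # propensity of the production reaction
        a_decay = k_decay * x     # propensity of the degradation reaction
        a_total = a_make + a_decay
        if a_total == 0.0:
            break                 # nothing can happen any more
        t += rng.expovariate(a_total)   # exponentially distributed waiting time
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its propensity.
        if rng.random() * a_total < a_make:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


def deterministic_birth_death(k_make, k_decay, x0, t):
    """Closed-form solution of dx/dt = k_make - k_decay * x."""
    x_ss = k_make / k_decay
    return x_ss + (x0 - x_ss) * math.exp(-k_decay * t)


if __name__ == "__main__":
    times, counts = gillespie_birth_death(10.0, 1.0, 0, 5.0, seed=1)
    print("stochastic copy number at end:", counts[-1])
    print("deterministic value at t=5:", deterministic_birth_death(10.0, 1.0, 0, 5.0))
```

The stochastic run fluctuates around the deterministic curve. The relative size of those fluctuations grows as copy numbers get small, which is one reason a simulator might offer both descriptions.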
Database
Local DB
Remote DBs
Database access layer
•Relational, open source
•Local database: NCBI / BIND schemas + modifications
•Reflections of useful remote databases
•API allows common database use among lab tools
Also tracks:
Data provenance
Data type: hypothetical, computed, measured
Quality measures: Edited/community
Authorities: submission, revision
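The tracked attributes listed above could be attached to each stored value roughly as follows. This is a sketch only, not the Bio/Spice schema: the class names, field names and the example source string are hypothetical, chosen to mirror the slide's categories (provenance, data type, quality measure, submitting and revising authorities).

```python
from dataclasses import dataclass, field
from enum import Enum


class DataType(Enum):
    HYPOTHETICAL = "hypothetical"
    COMPUTED = "computed"
    MEASURED = "measured"


class Quality(Enum):
    EDITED = "edited"
    COMMUNITY = "community"


@dataclass
class Record:
    """One stored value plus the metadata the slide says should be tracked."""
    name: str
    value: float
    provenance: str          # where the value came from (hypothetical free-text field)
    data_type: DataType
    quality: Quality
    submitted_by: str
    revisions: list = field(default_factory=list)  # (old_value, revised_by) pairs

    def revise(self, new_value, revised_by):
        """Replace the value while keeping the previous one in the history."""
        self.revisions.append((self.value, revised_by))
        self.value = new_value


def measured_only(records):
    """Keep only records backed by a measurement."""
    return [r for r in records if r.data_type is DataType.MEASURED]


if __name__ == "__main__":
    rec = Record("k_decay", 0.1, "example lab notebook", DataType.MEASURED,
                 Quality.EDITED, submitted_by="alice")
    rec.revise(0.12, revised_by="bob")
    print(rec.value, rec.revisions)
```

Keeping the data type as an explicit field lets an analysis tool refuse to mix hypothetical parameters with measured ones unless the user asks for that.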
Reflection of remote DBs
Knowledge representation for data classification and analysis
Data Ontology
Analysis Ontology
Mathematical Ontology
Cellular Ontology
Aid to user in decision making. Allows for data fusion.
Motion, Shape Change, Transport, Transformation
Differential, Algebraic, Stochastic
Leaves of the ontologies: Cellular
Gene expression
Transcription Translation
Initiation RBS Binding
Forms a hierarchy for modeling and data
Elongation Termination
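The hierarchy in this slide can be pictured as a small tree. The sketch below is an illustration under assumptions: the slide does not say which parent each leaf belongs to, so the placement of Initiation, Elongation and Termination under Transcription and of RBS Binding under Translation is a guess. The nested-dictionary encoding is likewise only one possible representation.

```python
# Hypothetical encoding of the "Cellular" ontology fragment as nested dicts.
# An empty dict marks a leaf term. Parent/child placement is an assumption.
CELLULAR = {
    "Gene expression": {
        "Transcription": {
            "Initiation": {},
            "Elongation": {},
            "Termination": {},
        },
        "Translation": {
            "RBS Binding": {},
        },
    }
}


def leaf_paths(tree, prefix=()):
    """Yield every root-to-leaf path in the ontology as a tuple of terms."""
    for term, children in tree.items():
        path = prefix + (term,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


if __name__ == "__main__":
    for path in leaf_paths(CELLULAR):
        print(" > ".join(path))
```

A model or data set tagged with a leaf term can then be found from any of its ancestors, which is what lets one hierarchy serve both modeling and data.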
Levels of Abstraction
Physical: Molecular Mechanics; ab initio; Semiempirical (bioinformatic)
Mathematical: Time-scale separation; Ensemble averaging; Large system limits; Global/Local stability
Conceptual: Phenomenal Models; Boolean Approximations; Modularization
Molecular Dynamics
Chemical Master Equation
Langevin Equations
Deterministic Kinetics
Reaction-Diffusion
Discrete Mechanical
Continuum Mechanical
Statistical/Thermodynamic
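For reference, three of the descriptions named in this slide can be written for a well-mixed system in their standard textbook forms. The notation is an assumption and not from the slides: x is the vector of copy numbers, a_j(x) the propensity of reaction j, ν_j its stoichiometric change vector, and W_j independent Wiener processes.

```latex
% Chemical Master Equation: probability flow into and out of state x
\frac{\partial P(x,t)}{\partial t}
  = \sum_j \left[ a_j(x-\nu_j)\,P(x-\nu_j,t) - a_j(x)\,P(x,t) \right]

% Chemical Langevin equation: drift plus reaction-by-reaction noise
dX = \sum_j \nu_j\, a_j(X)\,dt + \sum_j \nu_j \sqrt{a_j(X)}\,dW_j

% Deterministic kinetics: the large-system limit in which the noise term vanishes
\frac{dx}{dt} = \sum_j \nu_j\, a_j(x)
```

Read from top to bottom, each equation drops detail from the one before it. This matches the "ensemble averaging" and "large system limits" entries in the mathematical column.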
Analysis kernel
Component manager
Mathematica dispatcher
MATLAB dispatcher
Bio/Spice simulator
component n
•Configuration XML
•Client/Server registry model
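One way to read "Configuration XML" together with a "Client/Server registry model" is sketched below. This is a guess at the idea, not the Bio/Spice implementation. The XML element and attribute names, the `ComponentManager` class and the echo handler are all hypothetical, and no real Mathematica or MATLAB process is contacted.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration listing the analysis components named in the slide.
CONFIG_XML = """
<components>
  <component name="mathematica" kind="dispatcher"/>
  <component name="matlab" kind="dispatcher"/>
  <component name="biospice-simulator" kind="simulator"/>
</components>
"""


class ComponentManager:
    """Keeps a registry of named components and routes requests to them."""

    def __init__(self):
        self._declared = {}   # name -> kind, as read from the XML configuration
        self._handlers = {}   # name -> callable registered by a running component

    def load_config(self, xml_text):
        """Read the declared components from an XML string; return their names."""
        root = ET.fromstring(xml_text.strip())
        for node in root.findall("component"):
            self._declared[node.get("name")] = node.get("kind")
        return list(self._declared)

    def register(self, name, handler):
        """Called by a component (the 'server' side) to announce itself."""
        if name not in self._declared:
            raise KeyError(f"component {name!r} is not declared in the configuration")
        self._handlers[name] = handler

    def dispatch(self, name, request):
        """Called by a client (for example the GUI) to send a request to a component."""
        if name not in self._handlers:
            raise LookupError(f"component {name!r} has not registered")
        return self._handlers[name](request)


if __name__ == "__main__":
    manager = ComponentManager()
    print(manager.load_config(CONFIG_XML))
    manager.register("biospice-simulator", lambda req: {"echo": req})
    print(manager.dispatch("biospice-simulator", {"model": "demo"}))
```

Separating declaration (the XML) from registration (at run time) is what keeps such an architecture weakly coupled. A component can be added or swapped by editing configuration rather than the manager's code.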
Automated Analysis/Target Hypothesis
Data Generation
Raw Data Storage
Data Filtering and Mining
Data Linkage to Knowledge Base
Knowledge Base Population
Gene Expression
Protein Expression
Metabolite Expression
Cellular Physiologic
Imaging
Literature Database
Annotation
Network Construction
Network Deduction
Statistical Data Modeling/QC
“Significant” Effect Detection
Phenotype Catalog
Biological Sub-model Production
Network Analytical Suite
Network Simulation Suite
Bioinformatic Tool Integration
Stage I Stage II Stage III Stage IV Stage V
Perturbation Sequence Design
Experimental Replication
Specific Hypothesis Testing
Conclusions
It is time to move cell biology into a true engineering discipline
To do this we will need to develop a “systems” theory of cell phenomena
Physical models of cellular processes
Precise measurements of many variables in single cells
Abstractions of processes derived from physical models
Theories of how subprocesses communicate
Theories of network decomposition
These circuits are not like electronic (or electrical) circuits, but they achieve pretty amazing engineering feats.
Knowledge representation is perhaps the central challenge
Open-source/freeware software development necessary.