May 23, 2002
Slide 1
Networks in Bioinformatics
Lenwood S. Heath
Virginia Tech
Blacksburg, VA, USA
[email protected]
I-SPAN’02
Manila, Philippines
May 23, 2002
Slide 2
Overview
I. Some Molecular Biology
II. Language of the New Biology
III. Networks in Molecular Biology
IV. Gene Expression and Expresso
V. Stress and Response
VI. Networks and Computation
VII. Challenges for Bioinformatics
Slide 3
I. Some Molecular Biology
• The instruction set for a cell is contained in its chromosomes.
• Each chromosome is a long molecule called DNA.
• Each DNA molecule contains 100s or 1000s of genes.
• Each gene encodes a protein.
• A gene is transcribed to mRNA in the nucleus.
• An mRNA is translated to a protein on ribosomes.
Slide 4
Transcription and Translation
DNA → (Transcription) → mRNA → (Translation) → Protein
Slide 5
Elaborating Cellular Function
DNA → (Transcription) → mRNA → (Translation, Genetic Code) → Protein
Other processes shown: Reverse Transcription, Degradation, Regulation
Protein functions:
• Structure
• Catalyze chemical reactions
• Regulate transcription
Thousands of Genes!
Slide 6
Chromosomes
• Long molecules of DNA: 10^4 to 10^8 base pairs
• 23 matched pairs in humans
• A gene is a subsequence of a chromosome that encodes a protein.
• Only a fraction of the genes are in use at any time.
• Every gene is present in every cell.
Slide 7
DNA Strand
[Figure: a single DNA strand drawn from its 5’ end to its 3’ end, showing the 2’-deoxyribose sugars and the bases]
Bases:
• A (adenine) complements T (thymine)
• C (cytosine) complements G (guanine)
Slide 8
Complementary DNA Strands
Double-Stranded DNA
[Figure: double-stranded DNA, two complementary strands paired base by base]
Slide 9
RNA Strand
[Figure: a single RNA strand drawn from its 5’ end to its 3’ end, with ribose as the sugar]
Bases: U (uracil) replaces T (thymine)
Slide 10
Transcription of DNA to mRNA
[Figure: double-stranded DNA with its Coding DNA Strand and Template DNA Strand, and the mRNA Strand transcribed from the Template DNA Strand]
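The base-pairing rules from the DNA and RNA strand slides are enough to sketch transcription in a few lines of Python. This is a minimal illustration, not part of the talk: the 12-base sequence and the function names `reverse_complement` and `transcribe` are made up for the example, and both strands are written 5’ to 3’.

```python
# Illustrative sketch only: the sequence is a hypothetical example.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIRING = {"A": "U", "T": "A", "C": "G", "G": "C"}  # U replaces T in RNA


def reverse_complement(strand: str) -> str:
    """Return the complementary DNA strand, also written 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(strand))


def transcribe(template: str) -> str:
    """Return the mRNA (5' to 3') that pairs with a template strand (5' to 3')."""
    return "".join(RNA_PAIRING[base] for base in reversed(template))


coding = "CATTTCAGAGCG"                 # hypothetical coding strand
template = reverse_complement(coding)   # the strand actually read during transcription
mrna = transcribe(template)

print(template)  # CGCTCTGAAATG
print(mrna)      # CAUUUCAGAGCG
```

The mRNA comes out identical to the coding strand with U in place of T, which is why that strand is called the coding strand.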
Slide 11
Proteins and Amino Acids
• A protein is a large molecule that is a chain of amino acids (100 to 5000).
• There are 20 common amino acids (Alanine, Cysteine, …, Tyrosine).
• Three bases, a codon, suffice to encode an amino acid: 4^3 = 64 codons cover the 20 amino acids.
• There are also START and STOP codons.
Slide 12
Genetic Code
Slide 13
Translation to a Protein
[Figure: an mRNA Strand read codon by codon; the amino acids shown are Phenylalanine, Arginine, Histidine, and Alanine]
Nascent Polypeptide: Amino Acids Bound Together by Peptide Bonds
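Translation can be sketched as a table lookup over successive codons. The sketch below builds the standard genetic code from its usual compact 64-letter listing (one-letter amino acid codes, `*` for STOP) and reads from the first AUG start codon. The input sequence and the function name `translate` are hypothetical, chosen only to illustrate the mechanism.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first STOP codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no START codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # STOP codon ends the polypeptide
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical mRNA: START, His, Phe, Arg, Ala, STOP
print(translate("AUGCAUUUCAGAGCGUAA"))  # MHFRA
```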
Slide 14
Cell’s Fetch-Execute Cycle
• Stored Program: DNA, chromosomes, genes
• Fetch/Decode: RNA, ribosomes
• Execute Functions: Proteins (oxygen transport, cell structures, enzymes, regulation)
• Inputs: Nutrients, environmental signals, external proteins
• Outputs: Waste, response proteins, enzymes
Slide 15
II. The Language of the New Biology
A new language has been created. Some of its words are useful for today’s talks:
• Genomics
• Functional Genomics
• Microarrays
• Patterns in Gene Expression
Slide 16
Genomics
• Discovery of genetic sequences and the ordering of those sequences into
- individual genes;
- gene families;
- chromosomes.
• Identification of
- sequences that code for gene products/proteins;
- sequences that act as regulatory elements.
Slide 17
Genome Sequencing Projects
• Drosophila
• Yeast
• Mouse
• Rat
• Arabidopsis
• Human
• Microbes
• …
Slide 18
Drosophila Genome
Slide 19
Functional Genomics
• The biological role of individual genes.
• Mechanisms underlying the regulation of their expression (transcription).
• Regulatory interactions among them.
Slide 20
III. Networks in Molecular Biology
• Metabolic Pathways: series of connected chemical reactions within a cell
• Signal Transduction Pathways: transfer signals from outside to inside the cell
• Transport Mechanisms: movement of substances across biological membranes and within the cell
Slide 21
Glycolytic Pathway, Citric Acid Cycle, and Related Metabolic Processes
Slide 22
Responses to Environmental Signals
Slide 23
Carbon Metabolism
Slide 24
IV. Gene Expression and Expresso
• Gene Transcription into Messenger RNA
• Gene Regulation
• Microarray Technology: Taking a Snapshot of Gene Expression
Slide 25
Production of Messenger RNA
DNA → (Transcription) → mRNA → (Translation, Genetic Code) → Protein
Other processes shown: Reverse Transcription, Degradation, Regulation
Slide 26
Gene Expression
• Only certain genes are “turned on” at any particular time.
• When a gene is transcribed (copied to mRNA), it is said to be expressed.
• The total mRNA population of a cell can be isolated. The relative proportions of individual mRNAs within the total RNA give a snapshot of the genes currently being expressed.
• Correlating gene expression patterns with experimental conditions gives insights into the dynamic functioning of the cell.
Slide 27
Microarray Technology
• In the past, gene expression and gene interactions were examined known gene by known gene, process by process.
• With microarray technology:
– Simultaneous examination of large groups of genes and associated interactions
– Possible discovery of new cellular mechanisms involving gene expression
Slide 28
Flow of a Microarray Experiment
Hypotheses
Select cDNAs
PCR
Test of Hypotheses
Extract RNA
Replication and Randomization
Reverse Transcription and Fluorescent Labeling
Robotic Printing
Hybridization
Identify Spots
Intensities
Statistics
Clustering
Data Mining, ILP
Slide 29
[Figure: Spots (sequences 1, 2, 3 affixed to the slide). Labeled Treatment and Control samples are mixed and hybridized to the spots. Detection by excitation and emission then gives the relative abundance of each sequence.]
Slide 30
Gene Expression Varies
Pseudocoloring of the combined images illustrates the Cy5 to Cy3 intensity ratios and differential gene expression.
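A Cy5 to Cy3 intensity ratio is commonly reported on a log2 scale, so that equal fold changes up and down are symmetric around zero. The sketch below shows that arithmetic only. The spot names and intensity values are invented, and it makes no assumption about which dye labels the treatment sample.

```python
import math


def log_ratio(cy5: float, cy3: float) -> float:
    """log2 of the Cy5/Cy3 intensity ratio for one spot."""
    return math.log2(cy5 / cy3)


# Hypothetical background-corrected intensities: spot -> (Cy5, Cy3)
spots = {
    "spot_1": (1200.0, 300.0),
    "spot_2": (250.0, 1000.0),
    "spot_3": (500.0, 500.0),
}

for name, (cy5, cy3) in spots.items():
    print(name, log_ratio(cy5, cy3))
# spot_1 2.0   (four-fold more Cy5 signal)
# spot_2 -2.0  (four-fold more Cy3 signal)
# spot_3 0.0   (no difference)
```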
Slide 31
Expresso System for Microarray Experiment Design, Management, and Data Analysis
Slide 32
Group labels on the slide: Senior Collaborators; Students: VT; PostDocs
Boris Chevone (VT-PPWS)
Ron Sederoff (NCSU)
Dawei Chen (CS)
Ruth Grene (VT-PPWS)
Lenny Heath (VT-CS)
Naren Ramakrishnan (VT-CS)
Keying Ye (VT-STAT)
Len van Zyl (NCSU)
Carol Loopstra (Texas A & M)
Jonathan Watkinson (VT-PPWS)
Margaret Ellis (CS)
Logan Hanks (CS)
Cecilia Vasquez (PPWS)
Allan Sioson (CS)
Layne Watson (VT-CS)
Jennifer Weller (VBI)
Maulik Shukla (CS)
Slide 33
Microarray Experiment Flow
Slide 34
V. Stress and Response
• Biotic and Abiotic Stress: pathogens, insects, drought, heat, cold, salt, toxins
• Stress Response and Defense
• Stress Signal Transduction and Downstream Events
Slide 35
Responses to Environmental Signals
Slide 36
ROS Response
Slide 37
Network of Munnik and Meijer
Slide 38
Network of Shinozaki and Yamaguchi-Shinozaki
Slide 39
VI. Networks and Computation
• Issues
• Approaches
Slide 40
Issues for Biological Networks
• Mathematical Model(s) for Biological Networks
• Representation: What biological entities and parameters to represent and at what level of granularity?
• Operations and Computations: What manipulations and transformations are supported?
• Presentation: How can biologists visualize and explore networks?
Slide 41
Mathematical Models for Biological Networks
• Partial differential equations
• Boolean networks
• Bayesian networks
• Logic programs
• Neural networks
• Petri nets
• Fuzzy cognitive maps
• Weak or none (ad hoc)
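To make one of these models concrete, here is a minimal sketch of a Boolean network with synchronous updates: each gene is on or off, and its next state is a Boolean function of the current states. The four node names and their update rules are invented for illustration and do not describe any network from the talk.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]

# Hypothetical update rules: next value of each node from the current state.
RULES: Dict[str, Callable[[State], bool]] = {
    "stress_signal": lambda s: s["stress_signal"],  # external input, held fixed
    "regulator": lambda s: s["stress_signal"],
    "response_gene": lambda s: s["regulator"] and not s["repressor"],
    "repressor": lambda s: s["response_gene"],      # negative feedback
}


def step(state: State) -> State:
    """Synchronous update: every node is recomputed from the same old state."""
    return {node: rule(state) for node, rule in RULES.items()}


def simulate(state: State, steps: int) -> List[State]:
    """Return the trajectory [state_0, state_1, ..., state_steps]."""
    trajectory = [state]
    for _ in range(steps):
        state = step(state)
        trajectory.append(state)
    return trajectory


initial = {"stress_signal": True, "regulator": False,
           "response_gene": False, "repressor": False}
for t, s in enumerate(simulate(initial, 6)):
    print(t, "".join("1" if s[n] else "0" for n in RULES))
```

With these rules the negative feedback makes `response_gene` oscillate: the state at step 5 repeats the state at step 1. Finding such cycles and fixed points is a typical question to ask of a Boolean model.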
Slide 42
Representation of Biological Networks
• Textual or diagrammatic from the biology literature
• A graph of some kind:
- Undirected, directed, or mixed
- Simple or multigraph or hypergraph
- Nodes and edges labeled with biological information
Slide 43
What a Node Might Represent
• Chemical Reaction
• Molecules: proteins (enzymes and others), DNA, RNA, organic molecules, water, etc.
• Cellular components: membranes, chromosomes, nucleus, ribosomes, etc.
• Processes: metabolism, environmental sensing
• Environmental Condition
• Time or Stage
Slide 44
What an Edge Might Represent
• Transformation in a Chemical Reaction: Substrate to product
• Catalytic Relationship: Enzyme to substrate or reaction
• Protein/Protein Interaction
• Signal Transduction
• Regulation of Transcription
• Regulation of Translation
• Activation and Deactivation
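One way to turn the node and edge meanings on these slides into a data structure is a directed multigraph whose nodes and edges carry a kind label. The sketch below is an assumption about how that could look, not the representation used in Expresso. The class and method names (`Network`, `add_edge`, `successors`) are hypothetical, and only some of the listed kinds are included.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class NodeKind(Enum):
    MOLECULE = "molecule"
    REACTION = "chemical reaction"
    COMPONENT = "cellular component"
    PROCESS = "process"
    CONDITION = "environmental condition"


class EdgeKind(Enum):
    TRANSFORMATION = "substrate to product"
    CATALYSIS = "enzyme to substrate or reaction"
    INTERACTION = "protein/protein interaction"
    SIGNAL = "signal transduction"
    REGULATION = "regulation of transcription"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: EdgeKind


@dataclass
class Network:
    nodes: Dict[str, NodeKind] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)  # a list permits parallel edges

    def add_node(self, name: str, kind: NodeKind) -> None:
        self.nodes[name] = kind

    def add_edge(self, source: str, target: str, kind: EdgeKind) -> None:
        for name in (source, target):
            if name not in self.nodes:
                raise KeyError(f"unknown node: {name}")
        self.edges.append(Edge(source, target, kind))

    def successors(self, name: str, kind: Optional[EdgeKind] = None) -> List[str]:
        """Targets of edges leaving `name`, optionally filtered by edge kind."""
        return [e.target for e in self.edges
                if e.source == name and (kind is None or e.kind is kind)]


# The first step of glycolysis as a tiny example network.
net = Network()
for molecule in ("glucose", "glucose-6-phosphate", "hexokinase"):
    net.add_node(molecule, NodeKind.MOLECULE)
net.add_edge("glucose", "glucose-6-phosphate", EdgeKind.TRANSFORMATION)
net.add_edge("hexokinase", "glucose", EdgeKind.CATALYSIS)

print(net.successors("glucose", EdgeKind.TRANSFORMATION))  # ['glucose-6-phosphate']
```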
Slide 45
VII. Challenges for Bioinformatics
• Representing uncertainty or missing data
• Reconciling multiple networks or introducing new biological data into an existing network
• Deriving conclusions and hypotheses from networks
• Network visualization and exploration
Slide 46
Uncertainty in Networks
• Missing biological data is a fact of life
• As a consequence, a network can be lacking in some details, biologically wrong, or even self-contradictory
• Ability to reason computationally with uncertainty and with probabilities is essential
• Uncertainty can suggest hypotheses that can be tested experimentally to refine a network
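A very simple way to reason with probabilities in a network is to attach a confidence to each edge and, assuming the edges are independent, multiply confidences along a path. The sketch below does that and also picks out the least certain edge as a candidate for experimental testing. The independence assumption, the node names, and the confidence values are all illustrative choices, not part of the talk.

```python
from math import prod
from typing import Dict, List, Tuple

EdgeKey = Tuple[str, str]

# Hypothetical confidence that each directed edge really exists.
edge_confidence: Dict[EdgeKey, float] = {
    ("drought", "sensor"): 0.9,
    ("sensor", "kinase"): 0.6,
    ("kinase", "response_gene"): 0.8,
}


def path_edges(path: List[str]) -> List[EdgeKey]:
    """Consecutive node pairs along a path."""
    return list(zip(path, path[1:]))


def path_confidence(path: List[str], confidence: Dict[EdgeKey, float]) -> float:
    """Probability that every edge on the path exists, assuming independence."""
    return prod(confidence.get(edge, 0.0) for edge in path_edges(path))


def weakest_edge(path: List[str], confidence: Dict[EdgeKey, float]) -> EdgeKey:
    """The least certain edge: a natural hypothesis to test experimentally."""
    return min(path_edges(path), key=lambda edge: confidence.get(edge, 0.0))


pathway = ["drought", "sensor", "kinase", "response_gene"]
print(path_confidence(pathway, edge_confidence))  # approximately 0.432
print(weakest_edge(pathway, edge_confidence))     # ('sensor', 'kinase')
```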
Slide 47
Reconciling Networks
Slide 48
Multimodal Networks
• Nodes and edges have flexible semantics to represent:
- Time
- Uncertainty
- Cellular decision making; process regulation
- Cell topology and compartmentalization
- Rate constants, etc.
• Hierarchical
Slide 49
Operations on a Database of Multimodal Networks
• Create
• Refine and extend
• Check for consistency
• Combine: union, intersection, incorporation
• Reconcile
• Evaluate
• Simulate
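Some of these operations can be sketched by treating a network as a set of labeled edges. Union and intersection then become plain set operations, and a basic consistency check looks for pairs of nodes joined by contradictory relations. The triple encoding, the relation names `activates` and `inhibits`, and the example networks are assumptions made for this illustration.

```python
from typing import Dict, Set, Tuple

LabeledEdge = Tuple[str, str, str]  # (source, relation, target)
Network = Set[LabeledEdge]


def union(a: Network, b: Network) -> Network:
    """Every edge asserted by either network."""
    return a | b


def intersection(a: Network, b: Network) -> Network:
    """Only the edges both networks agree on."""
    return a & b


def contradictions(network: Network) -> Set[Tuple[str, str]]:
    """Node pairs whose source is claimed both to activate and to inhibit the target."""
    relations: Dict[Tuple[str, str], Set[str]] = {}
    for source, relation, target in network:
        relations.setdefault((source, target), set()).add(relation)
    return {pair for pair, rels in relations.items()
            if {"activates", "inhibits"} <= rels}


# Two hypothetical networks for the same pathway.
from_literature: Network = {
    ("signal_s", "activates", "kinase_k"),
    ("kinase_k", "activates", "gene_x"),
}
from_experiment: Network = {
    ("signal_s", "activates", "kinase_k"),
    ("kinase_k", "inhibits", "gene_x"),
}

combined = union(from_literature, from_experiment)
print(intersection(from_literature, from_experiment))
print(contradictions(combined))  # {('kinase_k', 'gene_x')}
```

The contradiction found in the combined network marks exactly where reconciliation is needed: either one source is wrong or the relation depends on a condition the edge label does not capture.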
Slide 50
Using Multimodal Networks
• Help biologists find new biological knowledge
• Visualize and explore
• Generate hypotheses and experiments
• Predict regulatory phenomena
• Predict responses to stress
• Incorporate into Expresso as part of closing the loop