Slide 1

Networks in Bioinformatics

Lenwood S. Heath

Virginia Tech

Blacksburg, VA, USA

heath@vt.edu

I-SPAN’02

Manila, Philippines

May 23, 2002


Slide 2

Overview

I. Some Molecular Biology

II. The Language of the New Biology

III. Networks in Molecular Biology

IV. Gene Expression and Expresso

V. Stress and Response

VI. Networks and Computation

VII. Challenges for Bioinformatics


Slide 3

I. Some Molecular Biology

• The instruction set for a cell is contained in its chromosomes.

• Each chromosome is a long molecule of DNA.

• Each DNA molecule contains hundreds or thousands of genes.

• Each gene encodes a protein.

• A gene is transcribed to mRNA in the nucleus.

• An mRNA is translated to a protein on ribosomes.


Slide 4

Transcription and Translation

DNA → (Transcription) → mRNA → (Translation) → Protein


Slide 5

Elaborating Cellular Function

DNA → (Transcription) → mRNA → (Translation) → Protein

Also labeled in the diagram: Reverse Transcription, Degradation, Regulation, (Genetic Code)

Protein functions:

• Structure

• Catalyze chemical reactions

• Regulate transcription

Thousands of Genes!


Slide 6

Chromosomes

• Long molecules of DNA: 10^4 to 10^8 base pairs

• 23 matched pairs in humans

• A gene is a subsequence of a chromosome that encodes a protein.

• Only a fraction of the genes are in use at any time.

• Every gene is present in every cell.


Slide 7

DNA Strand

[Figure: a single DNA strand drawn from its 5’ end to its 3’ end, with bases attached to a 2’-deoxyribose (sugar) backbone.]

Bases:

• A (adenine) complements T (thymine)

• C (cytosine) complements G (guanine)
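The base-pairing rule on this slide can be written as a small lookup table. The following is a minimal sketch, not taken from the talk; the function names and the example sequence are hypothetical and chosen only for illustration.

```python
# Watson-Crick pairing as stated on the slide: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5'-to-3' direction."""
    return complement(strand)[::-1]


example = "ATGCCG"  # hypothetical sequence, not from the slides
print(complement(example))          # TACGGC
print(reverse_complement(example))  # CGGCAT
```

Reversing the complement reflects the fact that the two strands of a double helix run in opposite 5’-to-3’ directions.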


Slide 8

Complementary DNA Strands

[Figure: double-stranded DNA; each base on one strand is paired with its complement on the other strand.]


Slide 9

RNA Strand

[Figure: a single RNA strand drawn from its 5’ end to its 3’ end, with bases attached to a ribose (sugar) backbone.]

Bases: U (uracil) replaces T (thymine)


Slide 10

Transcription of DNA to mRNA

[Figure: double-stranded DNA made up of a Coding DNA Strand and a Template DNA Strand; the mRNA Strand is shown paired against the Template DNA Strand.]
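Transcription can be sketched in two equivalent ways: pair RNA bases against the template strand, or copy the coding strand with U in place of T. The code below is an illustrative sketch only; the function names and the example sequences are assumptions, not part of the talk.

```python
# RNA base that pairs with each DNA base on the template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_from_template(template_3_to_5: str) -> str:
    """Build the mRNA (5' to 3') by pairing against a template read 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def transcribe_from_coding(coding_5_to_3: str) -> str:
    """The mRNA matches the coding strand, with U substituted for T."""
    return coding_5_to_3.replace("T", "U")


coding = "ATGTTC"    # hypothetical coding strand, 5' to 3'
template = "TACAAG"  # its complement, written 3' to 5'
print(transcribe_from_template(template))  # AUGUUC
print(transcribe_from_coding(coding))      # AUGUUC
```

Both routes give the same mRNA, which is why the coding strand is often used as shorthand for the transcript.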


Slide 11

Proteins and Amino Acids

• A protein is a large molecule made of a chain of amino acids (100 to 5000).

• There are 20 common amino acids (Alanine, Cysteine, …, Tyrosine).

• Three bases --- a codon --- suffice to encode an amino acid.

• There are also START and STOP codons.
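Codon-by-codon translation can be sketched with a lookup table built from the standard genetic code. This is a simplified illustration: it starts at the first AUG, reads in that frame, and stops at the first STOP codon. The function name and the example mRNA are hypothetical.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G
# for the first, second, and third codon positions. "*" marks a STOP codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG (START) up to the first STOP codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no START codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


# Hypothetical mRNA: START, Phe, Arg, His, Ala, STOP.
print(translate("AUGUUCAGACAUGCGUAA"))  # MFRHA
```

The output uses one-letter amino acid codes (M for methionine, F for phenylalanine, R for arginine, H for histidine, A for alanine).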


Slide 12

Genetic Code


Slide 13

Translation to a Protein

[Figure: an mRNA Strand read codon by codon, with the codons mapped to amino acids (Phenylalanine, Arginine, Histidine, Alanine).]

Nascent Polypeptide: Amino Acids Bound Together by Peptide Bonds


Slide 14

Cell’s Fetch-Execute Cycle

• Stored Program: DNA, chromosomes, genes

• Fetch/Decode: RNA, ribosomes

• Execute Functions: Proteins --- oxygen transport, cell structures, enzymes, regulation

• Inputs: Nutrients, environmental signals, external proteins

• Outputs: Waste, response proteins, enzymes


Slide 15

II. The Language of the New Biology

A new language has been created. Words in that language that are useful for today’s talks:

• Genomics

• Functional Genomics

• Microarrays

• Patterns in Gene Expression


Slide 16

Genomics

• Discovery of genetic sequences and the ordering of those sequences into

- individual genes;

- gene families;

- chromosomes.

• Identification of

- sequences that code for gene products/proteins;

- sequences that act as regulatory elements.


Slide 17

Genome Sequencing Projects

• Drosophila

• Yeast

• Mouse

• Rat

• Arabidopsis

• Human

• Microbes

• …


Slide 18

Drosophila Genome


Slide 19

Functional Genomics

• The biological role of individual genes.

• Mechanisms underlying the regulation of their expression (transcription).

• Regulatory interactions among them.


Slide 20

III. Networks in Molecular Biology

• Metabolic Pathways: series of connected chemical reactions within a cell

• Signal Transduction Pathways: transfer signals from outside to inside the cell

• Transport Mechanisms: movement of substances across biological membranes and within the cell


Slide 21

Glycolytic Pathway, Citric Acid Cycle, and Related Metabolic Processes


Slide 22

Responses to Environmental Signals


Slide 23

Carbon Metabolism


Slide 24

IV. Gene Expression and Expresso

• Gene Transcription into Messenger RNA

• Gene Regulation

• Microarray Technology: Taking a Snapshot of Gene Expression


Slide 25

Production of Messenger RNA

DNA → (Transcription) → mRNA → (Translation) → Protein

Also labeled in the diagram: Reverse Transcription, Degradation, Regulation, (Genetic Code)


Slide 26

Gene Expression

• Only certain genes are “turned on” at any particular time.

• When a gene is transcribed (copied to mRNA), it is said to be expressed.

• The total mRNA population of a cell can be isolated. The relative proportions of individual mRNAs within the total RNA give a snapshot of the genes currently being expressed.

• Correlating gene expression patterns with experimental conditions gives insights into the dynamic functioning of the cell.


Slide 27

Microarray Technology

• In the past, gene expression and gene interactions were examined known gene by known gene, process by process.

• With microarray technology:

– Simultaneous examination of large groups of genes and associated interactions

– Possible discovery of new cellular mechanisms involving gene expression


Slide 28

Flow of a Microarray Experiment

Hypotheses

Select cDNAs

PCR

Test of Hypotheses

Extract RNA

Replication and Randomization

Reverse Transcription and Fluorescent Labeling

Robotic Printing

Hybridization

Identify Spots

Intensities

Statistics

Clustering

Data Mining, ILP


Slide 29

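[Figure: microarray hybridization. Spots (sequences affixed to the slide) are numbered 1, 2, 3. Labeled Treatment and Control samples are mixed and hybridized to the spots; excitation followed by emission detection then gives the relative abundance at each spot.]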


Slide 30

Gene Expression Varies

Pseudocoloring of the combined images illustrates the Cy5 to Cy3 intensity ratios and differential gene expression.
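A common way to summarize a two-channel spot is the base-2 logarithm of its Cy5/Cy3 intensity ratio. The sketch below is illustrative only. It assumes Cy5 labels the treatment sample and Cy3 the control, and it uses a two-fold threshold; neither assumption comes from the talk, and the intensities are invented.

```python
import math


def log2_ratio(cy5: float, cy3: float) -> float:
    """Base-2 log of the Cy5/Cy3 ratio; 0 means equal intensity in both channels."""
    return math.log2(cy5 / cy3)


def classify(cy5: float, cy3: float, threshold: float = 1.0) -> str:
    """Label a spot using a symmetric cutoff on the log ratio.

    The default threshold of 1.0 corresponds to a two-fold change
    and is an assumption made for this example.
    """
    ratio = log2_ratio(cy5, cy3)
    if ratio >= threshold:
        return "up"
    if ratio <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical background-corrected intensities for three spots: (Cy5, Cy3).
spots = {"spot 1": (800.0, 200.0), "spot 2": (100.0, 400.0), "spot 3": (300.0, 280.0)}
for name, (cy5, cy3) in spots.items():
    print(name, classify(cy5, cy3))
```

Working on the log scale makes a two-fold increase and a two-fold decrease symmetric around zero. A real analysis would also need normalization and replication, as the experiment flow in this talk indicates.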


Slide 31

Expresso System for Microarray Experiment Design, Management, and Data Analysis


Slide 32

Boris Chevone (VT-PPWS)

Ron Sederoff (NCSU)

Dawei Chen (CS)

Ruth Grene (VT-PPWS)

Lenny Heath (VT-CS)

Naren Ramakrishnan (VT-CS)

Keying Ye (VT-STAT)

Len van Zyl (NCSU)

Carol Loopstra (Texas A & M)

Jonathan Watkinson (VT-PPWS)

Margaret Ellis (CS)

Logan Hanks (CS)

Senior Collaborators

Students: VT

Cecilia Vasquez (PPWS)

PostDocs

Allan Sioson (CS)

Layne Watson (VT-CS)

Jennifer Weller (VBI)

Maulik Shukla (CS)


Slide 33

Microarray Experiment Flow


Slide 34

V. Stress and Response

• Biotic and Abiotic Stress: pathogens, insects, drought, heat, cold, salt, toxins

• Stress Response and Defense

• Stress Signal Transduction and Downstream Events


Slide 35

Responses to Environmental Signals


Slide 36

ROS Response


Slide 37

Network of Munnik and Meijer


Slide 38

Network of Shinozaki and Yamaguchi-Shinozaki


Slide 39

VI. Networks and Computation

• Issues

• Approaches


Slide 40

Issues for Biological Networks

• Mathematical Model(s) for Biological Networks

• Representation: What biological entities and parameters to represent and at what level of granularity?

• Operations and Computations: What manipulations and transformations are supported?

• Presentation: How can biologists visualize and explore networks?


Slide 41

Mathematical Models for Biological Networks

• Partial differential equations

• Boolean networks

• Bayesian networks

• Logic programs

• Neural networks

• Petri nets

• Fuzzy cognitive maps

• Weak or none (ad hoc)
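To make one of these models concrete, here is a minimal sketch of a Boolean network with synchronous updates. The three genes and their update rules are entirely hypothetical and were chosen only so the behavior is easy to follow; they do not come from the talk.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]

# Hypothetical regulatory rules: each gene's next value is a Boolean
# function of the current state of the whole network.
RULES: Dict[str, Callable[[State], bool]] = {
    "A": lambda s: not s["C"],        # C represses A
    "B": lambda s: s["A"],            # A activates B
    "C": lambda s: s["A"] and s["B"], # A and B together activate C
}


def step(state: State) -> State:
    """Apply every rule to the current state at once (synchronous update)."""
    return {gene: rule(state) for gene, rule in RULES.items()}


def trajectory(state: State, n_steps: int) -> List[State]:
    """Return the starting state followed by n_steps successive states."""
    states = [state]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


start = {"A": True, "B": False, "C": False}
for t, s in enumerate(trajectory(start, 5)):
    print(t, s)
```

With these rules the network returns to its starting state after five steps, so the trajectory is a cycle. Finding such cycles and fixed points is a typical question asked of a Boolean model.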


Slide 42

Representation of Biological Networks

• Textual or diagrammatic from the biology literature

• A graph of some kind:

- Undirected, directed, or mixed

- Simple or multigraph or hypergraph

- Nodes and edges labeled with biological information
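One way to hold such a graph in code is a mixed multigraph whose nodes and edges carry labels. The sketch below is an assumption-laden illustration, not the representation used in the talk: the class names, the label vocabulary, and the proteinX/proteinY nodes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str              # biological label, e.g. "catalysis"
    directed: bool = True  # False gives an undirected edge (mixed graph)


@dataclass
class BioNetwork:
    nodes: Dict[str, Dict[str, str]] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)  # a list allows parallel edges

    def add_node(self, name: str, **labels: str) -> None:
        self.nodes.setdefault(name, {}).update(labels)

    def add_edge(self, source: str, target: str, kind: str, directed: bool = True) -> None:
        for name in (source, target):
            self.nodes.setdefault(name, {})
        self.edges.append(Edge(source, target, kind, directed))

    def neighbors(self, name: str) -> Set[str]:
        """Nodes reachable in one step, respecting edge direction."""
        found = set()
        for edge in self.edges:
            if edge.source == name:
                found.add(edge.target)
            elif not edge.directed and edge.target == name:
                found.add(edge.source)
        return found


net = BioNetwork()
net.add_node("hexokinase", kind="enzyme")
net.add_edge("glucose", "glucose-6-phosphate", "transformation")
net.add_edge("hexokinase", "glucose", "catalysis")
net.add_edge("proteinX", "proteinY", "protein/protein interaction", directed=False)
print(net.neighbors("glucose"))
print(net.neighbors("proteinY"))
```

Storing edges in a list rather than a set keeps parallel edges between the same pair of nodes. A hypergraph, which the slide also mentions, would need edges that connect more than two nodes and is not covered by this sketch.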


Slide 43

What a Node Might Represent

• Chemical Reaction

• Molecules: proteins (enzymes and others), DNA, RNA, organic molecules, water, etc.

• Cellular components: membranes, chromosomes, nucleus, ribosomes, etc.

• Processes: metabolism, environmental sensing

• Environmental Condition

• Time or Stage


Slide 44

What an Edge Might Represent

• Transformation in a Chemical Reaction: Substrate to product

• Catalytic Relationship: Enzyme to substrate or reaction

• Protein/Protein Interaction

• Signal Transduction

• Regulation of Transcription

• Regulation of Translation

• Activation and Deactivation


Slide 45

VII. Challenges for Bioinformatics

• Representing uncertainty or missing data

• Reconciling multiple networks or introducing new biological data into an existing network

• Deriving conclusions and hypotheses from networks

• Network visualization and exploration


Slide 46

Uncertainty in Networks

• Missing biological data is a fact of life

• As a consequence, a network can be lacking in some details, biologically wrong, or even self-contradictory

• Ability to reason computationally with uncertainty and with probabilities is essential

• Uncertainty can suggest hypotheses that can be tested experimentally to refine a network
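One simple way to reason with uncertainty is to attach a confidence value to each edge. The sketch below is a deliberately naive illustration. It assumes the edges are independent, so the confidence of a path is the product of its edge confidences, and it treats low-confidence edges as candidates for experimental testing. The node names, the confidence values, and the threshold are all hypothetical.

```python
from math import prod
from typing import Dict, List, Sequence, Tuple

EdgeKey = Tuple[str, str]

# Hypothetical confidence (0 to 1) that each directed interaction is real.
CONFIDENCE: Dict[EdgeKey, float] = {
    ("stress", "kinase"): 0.9,
    ("kinase", "tf"): 0.5,
    ("tf", "geneX"): 0.8,
}


def path_confidence(path: Sequence[str], confidence: Dict[EdgeKey, float]) -> float:
    """Product of edge confidences along a path, assuming independent edges."""
    return prod(confidence[(a, b)] for a, b in zip(path, path[1:]))


def weakest_edges(confidence: Dict[EdgeKey, float], threshold: float = 0.6) -> List[EdgeKey]:
    """Edges below the threshold: candidate hypotheses to test experimentally."""
    return sorted(edge for edge, p in confidence.items() if p < threshold)


print(path_confidence(["stress", "kinase", "tf", "geneX"], CONFIDENCE))
print(weakest_edges(CONFIDENCE))
```

The independence assumption is rarely true of biological data. A Bayesian network, one of the models listed in this talk, is a more principled way to combine dependent evidence.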


Slide 47

Reconciling Networks


Slide 48

Multimodal Networks

• Nodes and edges have flexible semantics to represent:

- Time

- Uncertainty

- Cellular decision making; process regulation

- Cell topology and compartmentalization

- Rate constants, etc.

• Hierarchical


Slide 49

Operations on a Database of Multimodal Networks

• Create

• Refine and extend

• Check for consistency

• Combine: union, intersection, incorporation

• Reconcile

• Evaluate

• Simulate
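A few of these operations can be sketched by treating a network as a set of labeled edges. This is an illustration under simplifying assumptions, not the design of any system described in the talk. Each edge is a (source, target, effect) triple, and a network counts as inconsistent when the same pair of nodes carries two different effects. The node names and effect labels are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, target, effect)
Network = Set[Edge]


def union(a: Network, b: Network) -> Network:
    """Every edge asserted by either network."""
    return a | b


def intersection(a: Network, b: Network) -> Network:
    """Only the edges on which both networks agree."""
    return a & b


def conflicts(network: Network) -> List[Tuple[str, str]]:
    """Node pairs that carry more than one effect, e.g. activates and inhibits."""
    effects: Dict[Tuple[str, str], Set[str]] = {}
    for source, target, effect in network:
        effects.setdefault((source, target), set()).add(effect)
    return sorted(pair for pair, seen in effects.items() if len(seen) > 1)


net1: Network = {("X", "Y", "activates"), ("Y", "Z", "inhibits")}
net2: Network = {("X", "Y", "inhibits"), ("Y", "Z", "inhibits")}

print(intersection(net1, net2))      # {('Y', 'Z', 'inhibits')}
print(conflicts(union(net1, net2)))  # [('X', 'Y')]
```

In this toy example each network is consistent on its own, but their union is not. Conflicts like this are what a reconcile step would have to resolve.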


Slide 50

Using Multimodal Networks

• Help biologists find new biological knowledge

• Visualize and explore

• Generate hypotheses and experiments

• Predict regulatory phenomena

• Predict responses to stress

• Incorporate into Expresso as part of closing the loop